PEP: 227
Title: Statically Nested Scopes
Version: $Revision$
Author: jeremy@digicool.com (Jeremy Hylton)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Created: 01-Nov-2000
Post-History:

Abstract

This PEP proposes the addition of statically nested scoping
(lexical scoping) for Python 2.1. The current language definition
defines exactly three namespaces that are used to resolve names --
the local, global, and built-in namespaces. The addition of
nested scopes would allow resolution of unbound local names in
enclosing functions' namespaces.

One consequence of this change that will be most visible to Python
programmers is that lambda expressions could reference variables in
the namespaces where the lambda is defined. Currently, a lambda
expression uses default arguments to explicitly create bindings
in the lambda's namespace.

Introduction

This proposal changes the rules for resolving free variables in
Python functions. The Python 2.0 definition specifies exactly
three namespaces to check for each name -- the local namespace,
the global namespace, and the builtin namespace. According to
this definition, if a function A is defined within a function B,
the names bound in B are not visible in A. The proposal changes
the rules so that names bound in B are visible in A (unless A
contains a name binding that hides the binding in B).

The specification introduces rules for lexical scoping that are
common in Algol-like languages. The combination of lexical
scoping and existing support for first-class functions is
reminiscent of Scheme.

The changed scoping rules address two problems -- the limited
utility of lambda expressions and the frequent confusion of new
users familiar with other languages that support lexical scoping,
e.g. the inability to define recursive functions except at the
module level.

The lambda expression yields an unnamed function that evaluates
a single expression. It is often used for callback functions. In
the example below (written using the Python 2.0 rules), any name
used in the body of the lambda must be explicitly passed as a
default argument to the lambda.

    from Tkinter import *
    root = Tk()
    Button(root, text="Click here",
           command=lambda root=root: root.test.configure(text="..."))

This approach is cumbersome, particularly when there are several
names used in the body of the lambda. The long list of default
arguments obscures the purpose of the code. The proposed solution,
in crude terms, implements the default argument approach
automatically. The "root=root" argument can be omitted.

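As a rough illustration of the difference, the sketch below
contrasts the two styles without requiring Tkinter. It is written
for a current Python 3 interpreter, where nested scopes are in
effect; the names make_callbacks, label, and prefix are
hypothetical and not part of this proposal.

```python
def make_callbacks(label):
    """Return two equivalent callbacks that build a message from `label`."""
    prefix = "clicked: "

    # Python 2.0 style: each outer name is passed in as a default argument.
    old_style = lambda prefix=prefix, label=label: prefix + label

    # Nested-scope style: the lambda body refers to the enclosing names
    # directly, so the default arguments can be dropped.
    new_style = lambda: prefix + label

    return old_style, new_style


old_style, new_style = make_callbacks("OK")
results = (old_style(), new_style())
```

Both callbacks produce the same string; only the second one depends
on the scoping rules proposed here.
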
Specification

Python is a statically scoped language with block structure, in
the tradition of Algol. A code block or region, such as a
module, class definition, or function body, is the basic unit of a
program.

Names refer to objects. Names are introduced by name binding
operations. Each occurrence of a name in the program text refers
to the binding of that name established in the innermost function
block containing the use.

The name binding operations are assignment, class and function
definition, and import statements. Each assignment or import
statement occurs within a block defined by a class or function
definition or at the module level (the top-level code block).

If a name binding operation occurs anywhere within a code block,
all uses of the name within the block are treated as references to
the current block. (Note: This can lead to errors when a name is
used within a block before it is bound.)

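The note about using a name before it is bound can be sketched as
follows. This is an illustration for a current Python 3
interpreter, not text from the proposal; counter, read_only, and
read_then_bind are hypothetical names.

```python
counter = 0  # module-level binding


def read_only():
    # No binding operation for `counter` occurs in this block, so the
    # name resolves to the module-level binding.
    return counter


def read_then_bind():
    # The assignment further down makes `counter` local to the whole
    # block, so this read happens before the local name is bound.
    value = counter
    counter = value + 1
    return counter


try:
    read_then_bind()
    outcome = "no error"
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    outcome = type(exc).__name__
```

Here read_only() returns the module-level value, while
read_then_bind() fails because the later assignment turns every use
of counter in that block into a reference to the local name.
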
If the global statement occurs within a block, all uses of the
name specified in the statement refer to the binding of that name
in the top-level namespace. Names are resolved in the top-level
namespace by searching the global namespace (the namespace of the
module containing the code block) and the builtin namespace (the
namespace of the module __builtin__). The global namespace is
searched first. If the name is not found there, the builtin
namespace is searched.

If a name is used within a code block, but it is not bound there
and is not declared global, the use is treated as a reference to
the nearest enclosing function region. (Note: If a region is
contained within a class definition, the name bindings that occur
in the class block are not visible to enclosed functions.)

A class definition is an executable statement that may contain
uses and definitions of names. These references follow the normal
rules for name resolution. The namespace of the class definition
becomes the attribute dictionary of the class.

The following operations are name binding operations. If they
occur within a block, they introduce new local names in the
current block unless there is also a global declaration.

    Function definition: def name ...
    Class definition: class name ...
    Assignment statement: name = ...
    Import statement: import name, import module as name,
        from module import name
    Implicit assignment: names are bound by for statements and
        except clauses

The arguments of a function are also local.

There are several cases where Python statements are illegal when
used in conjunction with nested scopes that contain free
variables.

If a variable is referenced in an enclosing scope, it is an error
to delete the name. The compiler will raise a SyntaxError for
'del name'.

If the wildcard form of import (import *) is used in a function
and the function contains a nested block with free variables, the
compiler will raise a SyntaxError.

If exec is used in a function and the function contains a nested
block with free variables, the compiler will raise a SyntaxError
unless the exec explicitly specifies the local namespace for the
exec. (In other words, "exec obj" would be illegal, but
"exec obj in ns" would be legal.)

Discussion

The specified rules allow names defined in a function to be
referenced in any nested function defined within that function. The
name resolution rules are typical for statically scoped languages,
with three primary exceptions:

    - Names in class scope are not accessible.
    - The global statement short-circuits the normal rules.
    - Variables are not declared.

Names in class scope are not accessible. Names are resolved in
the innermost enclosing function scope. If a class definition
occurs in a chain of nested scopes, the resolution process skips
class definitions. This rule prevents odd interactions between
class attributes and local variable access. If a name binding
operation occurs in a class definition, it creates an attribute on
the resulting class object. To access this variable in a method,
or in a function nested within a method, an attribute reference
must be used, either via self or via the class name.

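A small sketch of the class-scope rule, written for a current
Python 3 interpreter. The names make_counter_class, Counter, and
step are hypothetical and chosen only for illustration.

```python
def make_counter_class():
    step = 10  # bound in the enclosing function

    class Counter:
        step = 1  # bound in the class block: becomes a class attribute

        def bare_name(self):
            # The class block is skipped during name resolution, so this
            # `step` is the binding in make_counter_class().
            return step

        def via_self(self):
            # Attribute reference through the instance.
            return self.step

        def via_class(self):
            # Attribute reference through the class name.
            return Counter.step

    return Counter


c = make_counter_class()()
seen = (c.bare_name(), c.via_self(), c.via_class())
```

The bare name finds the function-level binding (10), while the two
attribute references find the class attribute (1).
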
An alternative would have been to allow name binding in class
scope to behave exactly like name binding in function scope. This
rule would allow class attributes to be referenced either via
attribute reference or simple name. This option was ruled out
because it would have been inconsistent with all other forms of
class and instance attribute access, which always use attribute
references. Code that used simple names would have been obscure.

The global statement short-circuits the normal rules. Under the
proposal, the global statement has exactly the same effect that it
does for Python 2.0. Its behavior is preserved for backwards
compatibility. It is also noteworthy because it allows name
binding operations performed in one block to change bindings in
another block (the module).

Variables are not declared. If a name binding operation occurs
anywhere in a function, then that name is treated as local to the
function and all references refer to the local binding. If a
reference occurs before the name is bound, a NameError is raised.
The only kind of declaration is the global statement, which allows
programs to be written using mutable global variables. As a
consequence, it is not possible to rebind a name defined in an
enclosing scope. An assignment operation can only bind a name in
the current scope or in the global scope. The lack of
declarations and the inability to rebind names in enclosing scopes
are unusual for lexically scoped languages; there is typically a
mechanism to create name bindings (e.g. lambda and let in Scheme)
and a mechanism to change the bindings (set! in Scheme).

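The consequence for assignment can be sketched as follows, using
only the rules described in this section and run on a current
Python 3 interpreter. The names outer, try_to_rebind, and total are
hypothetical.

```python
def outer():
    total = 0

    def try_to_rebind():
        # This assignment binds a new local `total` in try_to_rebind();
        # it does not change the `total` that belongs to outer().
        total = 99
        return total

    inner_value = try_to_rebind()
    return inner_value, total


result = outer()
```

The nested function sees its own binding (99), and the binding in
outer() is left at 0.
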
XXX Alex Martelli suggests comparison with Java, which does not
allow name bindings to hide earlier bindings.

Examples

A few examples are included to illustrate the way the rules work.

XXX Explain the examples

    >>> def make_adder(base):
    ...     def adder(x):
    ...         return base + x
    ...     return adder
    >>> add5 = make_adder(5)
    >>> add5(6)
    11

    >>> def make_fact():
    ...     def fact(n):
    ...         if n == 1:
    ...             return 1L
    ...         else:
    ...             return n * fact(n - 1)
    ...     return fact
    >>> fact = make_fact()
    >>> fact(7)
    5040L

    >>> def make_wrapper(obj):
    ...     class Wrapper:
    ...         def __getattr__(self, attr):
    ...             if attr[0] != '_':
    ...                 return getattr(obj, attr)
    ...             else:
    ...                 raise AttributeError, attr
    ...     return Wrapper()
    >>> class Test:
    ...     public = 2
    ...     _private = 3
    >>> w = make_wrapper(Test())
    >>> w.public
    2
    >>> w._private
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: _private

An example from Tim Peters of the potential pitfalls of nested scopes
in the absence of declarations:

    i = 6
    def f(x):
        def g():
            print i
        # ...
        # skip to the next page
        # ...
        for i in x:  # ah, i *is* local to f, so this is what g sees
            pass
        g()

The call to g() will refer to the variable i bound in f() by the for
loop. If g() is called before the loop is executed, a NameError will
be raised.

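A runnable variant of this pitfall, adapted for a current Python 3
interpreter. The call_before_loop parameter is a hypothetical
addition that makes both outcomes observable; it is not part of the
original example.

```python
i = 6  # module-level binding that g() does *not* see


def f(x, call_before_loop=False):
    def g():
        return i  # free variable: resolved in f(), not in the module

    if call_before_loop:
        return g()  # i is not yet bound in f(), so this raises NameError
    for i in x:  # this binding makes i local to f()
        pass
    return g()


after_loop = f([1, 2, 3])

try:
    f([1, 2, 3], call_before_loop=True)
    before_loop = "no error"
except NameError:
    before_loop = "NameError"
```

After the loop, g() sees the last value bound by the for statement;
before the loop, the same call fails even though a module-level i
exists.
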
XXX need some counterexamples

Backwards compatibility

There are two kinds of compatibility problems caused by nested
scopes. In one case, code that behaved one way in earlier
versions behaves differently because of nested scopes. In the
other case, certain constructs interact badly with nested scopes
and will trigger SyntaxErrors at compile time.

The following example from Skip Montanaro illustrates the first
kind of problem:

    x = 1
    def f1():
        x = 2
        def inner():
            print x
        inner()

Under the Python 2.0 rules, the print statement inside inner()
refers to the global variable x and will print 1 if f1() is
called. Under the new rules, it refers to f1()'s namespace,
the nearest enclosing scope with a binding, and will print 2.

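The behavior under the new rules can be checked with the following
sketch, adapted for a current Python 3 interpreter. The function f2
is a hypothetical addition showing the case where no enclosing
function binds the name.

```python
x = 1


def f1():
    x = 2

    def inner():
        # Resolved in f1(), the nearest enclosing scope with a binding.
        return x

    return inner()


def f2():
    def inner():
        # f2() has no binding for x, so the module-level x is used.
        return x

    return inner()


values = (f1(), f2())
```

Only f1(), which binds a local x, changes what the nested function
sees; f2() behaves as it would under the Python 2.0 rules.
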
The problem occurs only when a global variable and a local
variable share the same name and a nested function uses that name
to refer to the global variable. This is poor programming
practice, because readers will easily confuse the two different
variables. One example of this problem was found in the Python
standard library during the implementation of nested scopes.

To address this problem, which is unlikely to occur often, a
static analysis tool that detects affected code will be written.
The detection problem is straightforward.

The other compatibility problem is caused by the use of 'import *'
and 'exec' in a function body, when that function contains a
nested scope and the contained scope has free variables. For
example:

    y = 1
    def f():
        exec "y = 'gotcha'"  # or from module import *
        def g():
            return y
        ...

At compile time, the compiler cannot tell whether an exec that
operates on the local namespace or an import * will introduce
name bindings that shadow the global y. Thus, it is not possible
to tell whether the reference to y in g() should refer to the
global or to a local name in f().

In discussion on the python-list, people argued for both possible
interpretations. On the one hand, some thought that the reference
in g() should be bound to a local y if one exists. One problem
with this interpretation is that it is impossible for a human
reader of the code to determine the binding of y by local
inspection. It seems likely to introduce subtle bugs. The other
interpretation is to treat exec and import * as dynamic features
that do not affect static scoping. Under this interpretation, the
exec and import * would introduce local names, but those names
would never be visible to nested scopes. In the specific example
above, the code would behave exactly as it did in earlier versions
of Python.

Since each interpretation is problematic and the exact meaning
ambiguous, the compiler raises an exception.

A brief review of three Python projects (the standard library,
Zope, and a beta version of PyXPCOM) found four backwards
compatibility issues in approximately 200,000 lines of code.
There was one example of case #1 (subtle behavior change) and two
examples of import * problems in the standard library.

(The interpretation of the import * and exec restriction that was
implemented in Python 2.1a2 was much more restrictive, based on
language in the reference manual that had never been
enforced. These restrictions were relaxed following the release.)

locals() / vars()

These functions return a dictionary containing the current scope's
local variables. Modifications to the dictionary do not affect
the values of variables. Under the current rules, the use of
locals() and globals() allows the program to gain access to all
the namespaces in which names are resolved.

An analogous function will not be provided for nested scopes.
Under this proposal, it will not be possible to gain
dictionary-style access to all visible scopes.

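The statement that modifying the dictionary does not affect
variables can be illustrated with a minimal sketch, assuming a
current Python 3 interpreter and a call made inside a function
body. The names show_locals and value are hypothetical.

```python
def show_locals():
    value = 1
    snapshot = locals()        # dictionary of the local namespace
    had_value = snapshot["value"]
    snapshot["value"] = 2      # modifies the dictionary only
    return had_value, value


observed = show_locals()
```

The dictionary reports the variable's value, but writing to the
dictionary leaves the variable itself unchanged.
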
Rebinding names in enclosing scopes

There are technical issues that make it difficult to support
rebinding of names in enclosing scopes, but the primary reason
that it is not allowed in the current proposal is that Guido is
opposed to it. It is difficult to support, because it would
require a new mechanism that would allow the programmer to specify
that an assignment in a block is supposed to rebind the name in an
enclosing block; presumably a keyword or special syntax (x := 3)
would make this possible.

The proposed rules allow programmers to achieve the effect of
rebinding, albeit awkwardly. The name that will be effectively
rebound by enclosed functions is bound to a container object. In
place of assignment, the program uses modification of the
container to achieve the desired effect:

    def bank_account(initial_balance):
        balance = [initial_balance]
        def deposit(amount):
            balance[0] = balance[0] + amount
            return balance
        def withdraw(amount):
            balance[0] = balance[0] - amount
            return balance
        return deposit, withdraw

Support for rebinding in nested scopes would make this code
clearer. A class that defines deposit() and withdraw() methods
and the balance as an instance variable would be clearer still.
Since classes seem to achieve the same effect in a more
straightforward manner, they are preferred.

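For comparison, the class-based alternative mentioned above might
look like the sketch below. The class name BankAccount and the
choice to return the new balance from each method are assumptions
made for illustration; the proposal does not spell out this code.

```python
class BankAccount:
    """Class-based counterpart of the bank_account() closure."""

    def __init__(self, initial_balance):
        # The balance is an ordinary instance variable, so methods can
        # rebind it directly instead of mutating a container.
        self.balance = initial_balance

    def deposit(self, amount):
        self.balance = self.balance + amount
        return self.balance

    def withdraw(self, amount):
        self.balance = self.balance - amount
        return self.balance


account = BankAccount(100)
after_deposit = account.deposit(50)
after_withdraw = account.withdraw(30)
```

No one-element list is needed here, because attribute assignment
through self plays the role that rebinding would play in the nested
functions.
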
Implementation

The implementation for C Python uses flat closures [1]. Each def
or lambda statement that is executed will create a closure if the
body of the function or any contained function has free
variables. Using flat closures, the creation of closures is
somewhat expensive but lookup is cheap.

The implementation adds several new opcodes and two new kinds of
names in code objects. A variable can be either a cell variable
or a free variable for a particular code object. A cell variable
is referenced by scopes nested within the function that defines
it; as a result, that function must allocate separate storage for
it on each invocation. A free variable is referenced via a
function's closure.

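The distinction between cell and free variables can be observed
from Python code. The sketch below assumes a current CPython 3
interpreter; the attribute spellings shown (__code__, co_cellvars,
co_freevars, __closure__, cell_contents) are those of Python 3 and
may differ from the names used by the 2.1 implementation.

```python
def make_adder(base):
    def adder(x):
        return base + x
    return adder


add5 = make_adder(5)

# `base` is a cell variable of make_adder: a nested function refers to it.
cell_names = make_adder.__code__.co_cellvars

# The same name is a free variable of adder: it is reached via the closure.
free_names = add5.__code__.co_freevars

# The closure holds one cell, which stores the value bound in make_adder().
captured = add5.__closure__[0].cell_contents
```

The same name appears once as a cell variable of the defining
function and once as a free variable of the nested function, which
matches the two kinds of names described above.
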
XXX Much more to say here

References

[1] Luca Cardelli. Compiling a functional language. In Proc. of
    the 1984 ACM Conference on Lisp and Functional Programming,
    pp. 208-217, Aug. 1984
    http://citeseer.nj.nec.com/cardelli84compiling.html

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: