new draft

Jeremy Hylton 2000-12-14 04:50:32 +00:00
parent 1fa85eead2
commit d1917acf5a
1 changed files with 293 additions and 85 deletions


@ -23,103 +23,311 @@ Abstract
statement uses default arguments to explicitly creating bindings statement uses default arguments to explicitly creating bindings
in the lambda's namespace. in the lambda's namespace.
Specification
Notes Python is a statically scoped language with block structure, in
the tradition of Algol. A code block or region, such as a
module, class definition, or function body, is the basic unit of a
program.
This section describes several issues that will be fleshed out and Names refer to objects. Names are introduced by name binding
addressed in the final draft of the PEP. Until that draft is operations. Each occurrence of a name in the program text refers
ready, please direct comments to the author. to the binding of that name established in the innermost function
block containing the use.
This change has been proposed many times in the past. It has The name binding operations are assignment, class and function
always been stymied by the possibility of creating cycles that definition, and import statements. Each assignment or import
could not be collected by Python's reference counting garbage statement occurs within a block defined by a class or function
collector. The addition of the cycle collector in Python 2.0 definition or at the module level (the top-level code block).
eliminates this concern.
Guido once explained that his original reservation about nested If a name binding operation occurs anywhere within a code block,
scopes was a reaction to their overuse in Pascal. In large Pascal all uses of the name within the block are treated as references to
programs he was familiar with, block structure was overused as an the current block. (Note: This can lead to errors when a name is
organizing principle for the program, leading to hard-to-read used within a block before it is bound.)
code.
Greg Ewing developed a proposal "Python Nested Lexical Scoping If the global statement occurs within a block, all uses of the
Enhancement" in Aug. 1999[1] name specified in the statement refer to the binding of that name
in the top-level namespace. Names are resolved in the top-level
namespace by searching the global namespace, the namespace of the
module containing the code block, and the builtin namespace, the
namespace of the module __builtin__. The global namespace is
searched first. If the name is not found there, the builtin
namespace is searched.
Michael Hudson's bytecodehacks projects[2] provides facilities to If a name is used within a code block, but it is not bound there
support nested scopes using the closure module. and is not declared global, the use is treated as a reference to
the nearest enclosing function region. A region is visible from a
block if all enclosing blocks are introduced by function
definitions. (Note: If a region is contained within a class
definition, the name bindings that occur in the class block are
not visible to enclosed functions.)
Examples: A class definition is an executable statement that may contain uses and
definitions of names. These references follow the normal rules
for name resolution. The namespace of the class definition
becomes the attribute dictionary of the class.
def make_adder(n): Discussion
def adder(x):
return x + n
return adder
add2 = make_adder(2)
add2(5) == 7
This proposal changes the rules for resolving free variables in
Python functions. The Python 2.0 definition specifies exactly
three namespaces to check for each name -- the local namespace,
the global namespace, and the builtin namespace. According to
this definition, if a function A is defined within a function B,
the names bound in B are not visible in A. The proposal changes
the rules so that names bound in B are visible in A (unless A
contains a name binding that hides the binding in B).
The specification introduces rules for lexical scoping that are
common in Algol-like languages. The combination of lexical
scoping and existing support for first-class functions is
reminiscent of Scheme.
The changed scoping rules address two problems -- the limited
utility of lambda expressions and the frequent confusion of new
users familiar with other languages that support lexical scoping,
e.g. the inability to define recursive functions except at the
module level.
A lambda expression introduces an unnamed function whose body is
a single expression. It is often used for callback functions. In
the example below (written using the Python 2.0 rules), any name
used in the body of the lambda must be explicitly passed as a
default argument to the lambda.
from Tkinter import * from Tkinter import *
root = Tk() root = Tk()
Button(root, text="Click here", Button(root, text="Click here",
command = lambda : root.test.configure(text="...")) command=lambda root=root: root.test.configure(text="..."))
This approach is cumbersome, particularly when there are several
names used in the body of the lambda. The long list of default
arguments obscures the purpose of the code. The proposed solution,
in crude terms, implements the default argument approach
automatically. The "root=root" argument can be omitted.
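The difference between the two styles can be sketched without Tkinter. This is a minimal illustration, not code from the proposal: make_callbacks and label are hypothetical names, and the sketch assumes an interpreter that already implements nested scopes (any current Python).

```python
def make_callbacks(label):
    # Python 2.0 style: the name is passed in explicitly as a default argument.
    explicit = lambda label=label: "clicked " + label
    # Nested-scope style: the lambda body refers directly to the local
    # name of the enclosing function, so no default argument is needed.
    implicit = lambda: "clicked " + label
    return explicit, implicit

explicit, implicit = make_callbacks("OK")
```

Both callbacks produce the same string; the second simply omits the "label=label" argument.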
One controversial issue is whether it should be possible to modify The specified rules allow names defined in a function to be
the value of variables defined in an enclosing scope. referenced in any nested function defined with that function. The
name resolution rules are typical for statically scoped languages,
with three primary exceptions:
One part of the issue is how to specify that an assignment in the - Class definitions hide names.
local scope should reference to the definition of the variable in - The global statement short-circuits the normal rules.
an enclosing scope. Assignment to a variable in the current scope - Variables are not declared.
creates a local variable in the scope. If the assignment is
supposed to refer to a global variable, the global statement must
be used to prevent a local name from being created. Presumably,
another keyword would be required to specify "nearest enclosing
scope."
Guido is opposed to allowing modifications (need to clarify Class definitions hide names. Names are resolved in the innermost
exactly why). If you are modifying variables bound in enclosing enclosing function scope. If a class definition occurs in a chain
scopes, you should be using a class, he says. of nested scopes, the resolution process skips class definitions.
This rule prevents odd interactions between class attributes and
local variable access. If a name binding operation occurs in a
class definition, it creates an attribute on the resulting class
object. To access this variable in a method, or in a function
nested within a method, an attribute reference must be used,
either via self or via the class name.
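The class-hiding rule can be illustrated with a short sketch. The names make_counter_class, Counter and step are hypothetical, and the sketch assumes an interpreter that implements the rule as specified here (current Python does).

```python
def make_counter_class():
    step = 10                # bound in the enclosing function

    class Counter:
        step = 1             # bound in the class block: becomes Counter.step

        def bare_name(self):
            # Name resolution skips the class block, so this finds the
            # binding in the enclosing function, not the class attribute.
            return step

        def via_attribute(self):
            # The class-level binding is reachable only as an attribute.
            return self.step

    return Counter

Counter = make_counter_class()
c = Counter()
```

Here c.bare_name() sees the function-level binding, while c.via_attribute() sees the class attribute.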
The problem occurs only when a program attempts to rebind the name An alternative would have been to allow name binding in class
in the enclosing scope. A mutable object, e.g. a list or scope to behave exactly like name binding in function scope. This
dictionary, can be modified by a reference in a nested scope; this rule would allow class attributes to be referenced either via
is an obvious consequence of Python's reference semantics. The attribute reference or simple name. This option was ruled out
ability to change mutable objects leads to an inelegant because it would have been inconsistent with all other forms of
workaround: If a program needs to rebind an immutable object, class and instance attribute access, which always use attribute
e.g. a number or tuple, store the object in a list and have all references. Code that used simple names would have been obscure.
references to the object use this list:
The global statement short-circuits the normal rules. Under the
proposal, the global statement has exactly the same effect that it
does for Python 2.0. Its behavior is preserved for backwards
compatibility. It is also noteworthy because it allows name
binding operations performed in one block to change bindings in
another block (the module).
Variables are not declared. If a name binding operation occurs
anywhere in a function, then that name is treated as local to the
function and all references refer to the local binding. If a
reference occurs before the name is bound, a NameError is raised.
The only kind of declaration is the global statement, which allows
programs to be written using mutable global variables. As a
consequence, it is not possible to rebind a name defined in an
enclosing scope. An assignment operation can only bind a name in
the current scope or in the global scope. The lack of
declarations and the inability to rebind names in enclosing scopes
are unusual for lexically scoped languages; there is typically a
mechanism to create name bindings (e.g. lambda and let in Scheme)
and a mechanism to change the bindings (set! in Scheme).
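A minimal sketch of the "no declarations" rule follows. The names read_before_bind, total and seen are hypothetical. The sketch assumes a current interpreter, where the error raised is UnboundLocalError, a subclass of NameError.

```python
total = 5  # module-level binding; not consulted inside the function below

def read_before_bind():
    try:
        # 'total' is local to this function because it is assigned
        # further down, so this use happens before the local is bound.
        seen = total
    except NameError as exc:
        seen = exc
    total = 0  # this binding is what makes 'total' local everywhere above
    return seen

result = read_before_bind()
```

The module-level value 5 is never seen inside the function, because the assignment makes every use of the name in that block local.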
Examples
A few examples are included to illustrate the way the rules work.
>>> def make_fact():
... def fact(n):
... if n == 1:
... return 1L
... else:
... return n * fact(n - 1)
... return fact
>>> fact = make_fact()
>>> fact(7)
5040L
>>> def make_adder(base):
... def adder(x):
... return base + x
... return adder
>>> add5 = make_adder(5)
>>> add5(6)
11
>>> def make_wrapper(obj):
... class Wrapper:
... def __getattr__(self, attr):
... if attr[0] != '_':
... return getattr(obj, attr)
... else:
... raise AttributeError, attr
... return Wrapper()
>>> class Test:
... public = 2
... _private = 3
>>> w = make_wrapper(Test())
>>> w.public
2
>>> w._private
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: _private
An example from Tim Peters of the potential pitfalls of nested scopes
in the absence of declarations:
i = 6
def f(x):
def g():
print i
# ...
# skip to the next page
# ...
for i in x: # ah, i *is* local to f, so this is what g sees
pass
g()
The call to g() will refer to the variable i bound in f() by the for
loop. If g() is called before the loop is executed, a NameError will
be raised.
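The behavior described above can be checked with a variant of the example that returns values instead of printing them. This is a sketch under the assumption that the interpreter implements nested scopes as proposed; the results list is a hypothetical addition used only to record what g() observes.

```python
i = 6

def f(x):
    def g():
        return i            # refers to the i bound in f, not the global i

    results = []
    try:
        results.append(g())  # called before the loop binds i in f
    except NameError as exc:
        results.append(exc)
    for i in x:              # this binding makes i local to f
        pass
    results.append(g())      # now g sees the last value bound by the loop
    return results

before, after = f([1, 2, 3])
```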
Other issues
Backwards compatibility
The proposed changes will break backwards compatibility for some
code. The following example from Skip Montanaro illustrates:
x = 1
def f1():
x = 2
def inner():
print x
inner()
Under the Python 2.0 rules, the print statement inside inner()
refers to the global variable x and will print 1 if f1() is
called. Under the new rules, it refers to f1()'s namespace,
the nearest enclosing scope with a binding.
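The new behavior can be observed with a variant of the example that returns the value rather than printing it. The change from print to return is an adaptation for illustration, and the sketch assumes an interpreter that implements the new rules.

```python
x = 1

def f1():
    x = 2
    def inner():
        # Under the new rules this is the x bound in f1(), not the global x.
        return x
    return inner()

value = f1()
```

Under the new rules value is 2; under the Python 2.0 rules the same lookup would have found the global and produced 1.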
The problem occurs only when a global variable and a local
variable share the same name and a nested function uses that name
to refer to the global variable. This is poor programming
practice, because readers will easily confuse the two different
variables.
To address this problem, which is unlikely to occur often, a
static analysis tool that detects affected code will be written.
The detection problem is straightforward.
locals() / vars()
These functions return a dictionary containing the current scope's
local variables. Modifications to the dictionary do not affect
the values of variables. Under the current rules, the use of
locals() and globals() allows the program to gain access to all
the namespaces in which names are resolved.
An analogous function will not be provided for nested scopes.
Under this proposal, it will not be possible to gain
dictionary-style access to all visible scopes.
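The statement that modifying the dictionary does not affect variables can be sketched as follows. The names try_to_rebind and snapshot are hypothetical, and the sketch assumes the behavior of locals() inside a function in current interpreters.

```python
def try_to_rebind():
    x = 1
    snapshot = locals()     # dictionary view of the local namespace
    snapshot["x"] = 99      # changes the dictionary only
    return x, snapshot["x"]

variable_value, dict_value = try_to_rebind()
```

The local variable keeps its value even though the dictionary entry changed.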
Rebinding names in enclosing scopes
There are technical issues that make it difficult to support
rebinding of names in enclosing scopes, but the primary reason
that it is not allowed in the current proposal is that Guido is
opposed to it. It is difficult to support, because it would
require a new mechanism that would allow the programmer to specify
that an assignment in a block is supposed to rebind the name in an
enclosing block; presumably a keyword or special syntax (x := 3)
would make this possible.
The proposed rules allow programmers to achieve the effect of
rebinding, albeit awkwardly. The name that will be effectively
rebound by enclosed functions is bound to a container object. In
place of assignment, the program uses modification of the
container to achieve the desired effect:
def bank_account(initial_balance): def bank_account(initial_balance):
balance = [initial_balance] balance = [initial_balance]
def deposit(amount): def deposit(amount):
balance[0] = balance[0] + amount balance[0] = balance[0] + amount
return balance
def withdraw(amount): def withdraw(amount):
balance[0] = balance[0] - amount balance[0] = balance[0] - amount
return balance
return deposit, withdraw return deposit, withdraw
I would prefer for the language to support this style of Support for rebinding in nested scopes would make this code
programming directly rather than encouraging programs to use this clearer. A class that defines deposit() and withdraw() methods
somewhat obfuscated style. Of course, an instance would probably and the balance as an instance variable would be clearer still.
be clearer in this case. Since classes seem to achieve the same effect in a more
straightforward manner, they are preferred.
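For comparison, a class-based version of the same example might look like the sketch below. The class name BankAccount and the account variable are hypothetical; only deposit(), withdraw() and the balance instance variable come from the text above.

```python
class BankAccount:
    def __init__(self, initial_balance):
        # The balance is an ordinary instance variable, so methods can
        # rebind it directly instead of mutating a one-element list.
        self.balance = initial_balance

    def deposit(self, amount):
        self.balance = self.balance + amount
        return self.balance

    def withdraw(self, amount):
        self.balance = self.balance - amount
        return self.balance

account = BankAccount(100)
after_deposit = account.deposit(50)
after_withdraw = account.withdraw(30)
```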
One implementation issue is how to represent the environment that Implementation
stores variables that are referenced by nested scopes. One
possibility is to add a pointer to each frame's statically
enclosing frame and walk the chain of links each time a non-local
variable is accessed. This implementation has some problems,
because access to nonlocal variables is slow and causes garbage to
accumulate unnecessarily. Another possibility is to construct an
environment for each function that provides access to only the
non-local variables. This environment would be explicitly passed
to nested functions.
An implementation effort is underway. The implementation requires
a way to create closures, objects that combine a function's
code and the environment in which to resolve free variables.
References There are a variety of implementation alternatives for closures.
One possibility is to use a static link from a nested function to
its enclosing environment. This implementation requires several
links to be followed if there is more than one level of nesting
and keeps many garbage objects alive longer than necessary.
[1] http://www.cosc.canterbury.ac.nz/~greg/python/lexscope.html One fairly simple implementation approach would be to implement
the default argument hack currently used for lambda support. Each
function object would have a func_env slot that holds a tuple of
free variable bindings. The code inside the function would use
LOAD_ENV to access these bindings rather than the typical
LOAD_FAST.
[2] http://sourceforge.net/projects/bytecodehacks/ The problem with this approach is that rebindings are not visible
to the nested function. Consider the following example:
import threading
import time
def outer():
x = 2
def inner():
while 1:
print x
time.sleep(1)
threading.Thread(target=inner).start()
while 1:
x = x + 1
time.sleep(0.8)
If the func_env slot is defined when MAKE_FUNCTION is called, then
x in inner() is bound to the value of x in outer() at function
definition time. This is the default argument hack, but not
actual name resolution based on statically nested scopes.
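The distinction can be shown without threads. In the sketch below, snapshot uses the default argument hack and live relies on statically nested scopes; both function names are hypothetical, and the behavior of live assumes an interpreter that implements true nested scopes rather than the func_env approach described above.

```python
def outer():
    x = 2

    def snapshot(x=x):
        # Default argument hack: x is copied when the function is defined.
        return x

    def live():
        # Nested scopes: x is looked up in outer()'s scope at call time.
        return x

    x = x + 1               # rebinding after both functions are defined
    return snapshot(), live()

captured, current = outer()
```

The snapshot function still sees the value from definition time, while the live function sees the rebinding.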
Local Variables: Local Variables: