Several revisions, primarily to clarify backwards compatibility issues.

Jeremy Hylton 2001-02-21 19:11:21 +00:00
parent 78f99f2ced
commit 0568294565
1 changed files with 158 additions and 94 deletions


@ -23,6 +23,45 @@ Abstract
statement uses default arguments to explicitly create bindings
in the lambda's namespace.
Introduction
This proposal changes the rules for resolving free variables in
Python functions. The Python 2.0 definition specifies exactly
three namespaces to check for each name -- the local namespace,
the global namespace, and the builtin namespace. According to
this definition, if a function A is defined within a function B,
the names bound in B are not visible in A. The proposal changes
the rules so that names bound in B are visible in A (unless A
contains a name binding that hides the binding in B).
The specification introduces rules for lexical scoping that are
common in Algol-like languages. The combination of lexical
scoping and existing support for first-class functions is
reminiscent of Scheme.
The changed scoping rules address two problems -- the limited
utility of lambda statements and the frequent confusion of new
users familiar with other languages that support lexical scoping,
e.g. the inability to define recursive functions except at the
module level.
The lambda statement introduces an unnamed function that contains
a single statement. It is often used for callback functions. In
the example below (written using the Python 2.0 rules), any name
used in the body of the lambda must be explicitly passed as a
default argument to the lambda.
    from Tkinter import *
    root = Tk()
    Button(root, text="Click here",
           command=lambda root=root: root.test.configure(text="..."))
This approach is cumbersome, particularly when there are several
names used in the body of the lambda. The long list of default
arguments obscures the purpose of the code. The proposed solution,
in crude terms, implements the default argument approach
automatically. The "root=root" argument can be omitted.
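As a rough illustration of the difference, the sketch below contrasts the
two styles without any Tkinter dependency. It assumes a current Python 3
interpreter, where nested scopes are standard behavior; the function names
are hypothetical and chosen only for this example.

```python
def make_callback_with_default(label):
    # Python 2.0 style: the name is passed in explicitly as a default argument.
    return lambda label=label: "clicked " + label


def make_callback_with_closure(label):
    # Nested-scope style: the lambda body resolves 'label' in the enclosing function.
    return lambda: "clicked " + label


old_style = make_callback_with_default("ok")
new_style = make_callback_with_closure("ok")
```

Both callbacks produce the same string when called with no arguments; the
second simply omits the "label=label" boilerplate.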
Specification
Python is a statically scoped language with block structure, in
@ -65,45 +104,40 @@ Specification
for name resolution. The namespace of the class definition
becomes the attribute dictionary of the class.
The following operations are name binding operations. If they
occur within a block, they introduce new local names in the
current block unless there is also a global declaration.
Function definition: def name ...
Class definition: class name ...
Assignment statement: name = ...
Import statement: import name, import module as name,
from module import name
Implicit assignment: names are bound by for statements and except
clauses
The arguments of a function are also local.
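One way to observe which names the compiler treats as local is to inspect
a function's code object. The sketch below assumes a current CPython 3
interpreter (where the except clause is spelled "except ... as name"); the
function binder and every name inside it are hypothetical.

```python
def binder(arg):
    def helper():                    # function definition binds 'helper'
        pass

    class Local:                     # class definition binds 'Local'
        pass

    value = 1                        # assignment binds 'value'
    import os as operating_system    # import binds 'operating_system'
    for item in ():                  # for statement binds 'item'
        pass
    try:
        pass
    except Exception as err:         # except clause binds 'err'
        pass


# co_varnames lists the arguments and the other local names of the function.
local_names = set(binder.__code__.co_varnames)
```

Every name bound by one of the listed operations, together with the
argument, shows up among the function's local names.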
There are several cases where Python statements are illegal when
used in conjunction with nested scopes that contain free
variables.
If a variable is referenced in an enclosing scope, it is an error
to delete the name. The compiler will raise a SyntaxError for
'del name'.
If the wildcard form of import (import *) is used in a function
and the function contains a nested block with free variables, the
compiler will raise a SyntaxError.
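The compile-time nature of this check can be seen by compiling source text
directly. The sketch below is only an approximation of the rule described
here: it assumes a Python 3 interpreter, which is stricter than this
proposal and rejects "import *" in any function body, with or without a
nested block. The source string is a made-up example.

```python
source = (
    "def f():\n"
    "    from os import *\n"
    "    def g():\n"
    "        return getcwd()\n"
)

try:
    # compile() only parses and compiles the text; f() is never called.
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
```

The error is reported before any of the code runs, which is what makes it
a compile-time restriction rather than a runtime one.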
If exec is used in a function and the function contains a nested
block with free variables, the compiler will raise a SyntaxError
unless the exec explicitly specifies the local namespace for the
exec. (In other words, "exec obj" would be illegal, but
"exec obj in ns" would be legal.)
Discussion
The specified rules allow names defined in a function to be
referenced in any nested function defined with that function. The
name resolution rules are typical for statically scoped languages,
@ -152,10 +186,23 @@ Discussion
mechanism to create name bindings (e.g. lambda and let in Scheme)
and a mechanism to change the bindings (set! in Scheme).
XXX Alex Martelli suggests comparison with Java, which does not
allow name bindings to hide earlier bindings.
Examples
A few examples are included to illustrate the way the rules work.
XXX Explain the examples
>>> def make_adder(base):
... def adder(x):
... return base + x
... return adder
>>> add5 = make_adder(5)
>>> add5(6)
11
>>> def make_fact():
... def fact(n):
... if n == 1:
@ -167,14 +214,6 @@ Examples
>>> fact(7)
5040L
>>> def make_wrapper(obj):
... class Wrapper:
... def __getattr__(self, attr):
@ -212,12 +251,18 @@ Examples
loop. If g() is called before the loop is executed, a NameError will
be raised.
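The timing issue described here can be checked with a small sketch. The
functions f and g below are a hypothetical reconstruction written for a
current Python 3 interpreter, not the elided example itself.

```python
def f():
    def g():
        return i              # 'i' is a free variable bound in f

    try:
        g()                   # called before the loop has bound 'i'
        early = "no error"
    except NameError:
        early = "NameError"

    for i in range(3):
        pass

    return early, g()         # after the loop, g() sees the last value of i


result = f()
```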
Other issues
XXX need some counterexamples
Backwards compatibility
The proposed changes will break backwards compatibility for some
code. There are two kinds of compatibility problems caused by
nested scopes. In one case, code that behaved one way in earlier
versions behaves differently because of nested scopes. In the
other case, certain constructs interact badly with nested scopes
and will trigger SyntaxErrors at compile time.
The following example from Skip Montanaro illustrates the first
kind of problem:
x = 1
def f1():
@ -235,12 +280,58 @@ Backwards compatibility
variable share the same name and a nested function uses that name
to refer to the global variable. This is poor programming
practice, because readers will easily confuse the two different
variables. One example of this problem was found in the Python
standard library during the implementation of nested scopes.
To address this problem, which is unlikely to occur often, a
static analysis tool that detects affected code will be written.
The detection problem is straightforward.
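A minimal sketch of this first kind of problem follows. It is not the
example quoted above; the names are hypothetical, and it assumes an
interpreter with nested scopes (any Python 3).

```python
x = 1                     # global binding


def enclosing():
    x = 2                 # local binding that shares the global's name

    def inner():
        # Under the Python 2.0 rules this referred to the global x (1).
        # With nested scopes it resolves to the x bound in enclosing().
        return x

    return inner()


result = enclosing()
```

Under nested scopes the result is 2, whereas the three-namespace rule
would have skipped the enclosing function and found the global.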
The other compatibility problem is caused by the use of 'import *'
and 'exec' in a function body, when that function contains a
nested scope and the contained scope has free variables. For
example:
    y = 1
    def f():
        exec "y = 'gotcha'" # or from module import *
        def g():
            return y
        ...
At compile-time, the compiler cannot tell whether an exec that
operates on the local namespace or an import * will introduce
name bindings that shadow the global y. Thus, it is not possible
to tell whether the reference to y in g() should refer to the
global or to a local name in f().
In discussion on the python-list, people argued for both possible
interpretations. On the one hand, some thought that the reference
in g() should be bound to a local y if one exists. One problem
with this interpretation is that it is impossible for a human
reader of the code to determine the binding of y by local
inspection. It seems likely to introduce subtle bugs. The other
interpretation is to treat exec and import * as dynamic features
that do not affect static scoping. Under this interpretation, the
exec and import * would introduce local names, but those names
would never be visible to nested scopes. In the specific example
above, the code would behave exactly as it did in earlier versions
of Python.
Since each interpretation is problematic and the exact meaning
ambiguous, the compiler raises an exception.
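For comparison only, the sketch below shows how the same shape of code
behaves in Python 3, where exec is an ordinary function rather than a
statement. This is an observation about a later interpreter, not part of
this proposal, and the names are the same illustrative ones used above.

```python
y = 1


def f():
    # exec() here cannot create a new local variable in f, so the compiler
    # still treats the 'y' referenced in g() as the global name.
    exec("y = 'gotcha'")

    def g():
        return y

    return g()


result = f()
```

In that setting the reference in g() resolves to the global, which
corresponds to the second interpretation discussed above.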
A brief review of three Python projects (the standard library,
Zope, and a beta version of PyXPCOM) found four backwards
compatibility issues in approximately 200,000 lines of code.
There was one example of case #1 (subtle behavior change) and two
examples of import * problems in the standard library.
(The interpretation of the import * and exec restriction that was
implemented in Python 2.1a2 was much more restrictive, based on
language in the reference manual that had never been
enforced. These restrictions were relaxed following the release.)
locals() / vars()
These functions return a dictionary containing the current scope's
@ -288,54 +379,27 @@ Rebinding names in enclosing scopes
Implementation
An implementation effort is underway. The implementation requires
a way to create closures, objects that combine a function's
code and the environment in which to resolve free variables.
The implementation for C Python uses flat closures [1]. Each def
or lambda statement that is executed will create a closure if the
body of the function or any contained function has free
variables. Using flat closures, the creation of closures is
somewhat expensive but lookup is cheap.
There are a variety of implementation alternatives for closures.
Two typical ones are nested closures and flat closures. Nested
closures use a static link from a nested function to its enclosing
environment. This implementation requires several links to be
followed if there is more than one level of nesting and keeps many
garbage objects alive longer than necessary.
The implementation adds several new opcodes and two new kinds of
names in code objects. A variable can be either a cell variable
or a free variable for a particular code object. A cell variable
is referenced by containing scopes; as a result, the function
where it is defined must allocate separate storage for it on each
invocation. A free variable is referenced via a function's closure.
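The two kinds of names can be inspected on code objects. The sketch below
assumes a current CPython 3 interpreter, where the relevant attributes are
spelled __code__ and __closure__; it reuses the make_adder example from
the Examples section.

```python
def make_adder(base):
    def adder(x):
        return base + x
    return adder


add5 = make_adder(5)

# 'base' is a cell variable of make_adder: a contained scope references it.
cell_names = make_adder.__code__.co_cellvars

# 'base' is a free variable of adder: it is reached through the closure.
free_names = add5.__code__.co_freevars

# The closure is a tuple of cells, one per free variable.
captured = add5.__closure__[0].cell_contents
```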
Flat closures are roughly similar to the default argument hack
currently used for lambda support. Each function object would
have a func_env slot that holds a tuple of free variable bindings.
The code inside the function would use LOAD_ENV to access these
bindings rather than the typical LOAD_FAST.
XXX Much more to say here
The problem with this approach is that rebindings are not visible
to the nested function. Consider the following example:
References
    import threading
    import time

    def outer():
        x = 2
        def inner():
            while 1:
                print x
                time.sleep(1)
        threading.Thread(target=inner).start()
        while 1:
            x = x + 1
            time.sleep(0.8)
If the func_env slot is defined when MAKE_FUNCTION is called, then
x in inner() is bound to the value of x in outer() at function
definition time. This is the default argument hack, but not
actual name resolution based on statically nested scopes.
To support shared visibility of updates, it will be necessary to
have a tuple of cells that contain references to variables. The
extra level of indirection should allow updates to be shared.
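The shared-update behavior that cells are meant to provide can be shown
without threads. The sketch below is a simplified, single-threaded variant
of the example above, written for a current Python 3 interpreter; the
variable names before and after are hypothetical.

```python
def outer():
    x = 2

    def inner():
        return x          # looked up through the shared cell at call time

    before = inner()
    x = x + 1             # rebinding in outer() after inner() was defined
    after = inner()
    return before, after


result = outer()
```

With the default argument hack, inner() would have kept the value of x
from definition time; with cells, the rebinding is visible on the next
call.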
It is not clear whether the current 1-pass Python compiler can
determine which references are to globals and which are references
to enclosing scopes. It may be possible to make minimal changes
that defer the optimize() call until a second pass, after scopes
have been determined.
[1] Luca Cardelli. Compiling a functional language. In Proc. of
the 1984 ACM Conference on Lisp and Functional Programming,
pp. 208-217, Aug. 1984
http://citeseer.nj.nec.com/cardelli84compiling.html
Local Variables: