diff --git a/pep-0227.txt b/pep-0227.txt
index f5bdc1d03..993fb8cdb 100644
--- a/pep-0227.txt
+++ b/pep-0227.txt
@@ -23,6 +23,45 @@ Abstract
     statement uses default arguments to explicitly creating bindings
     in the lambda's namespace.
 
+Introduction
+
+    This proposal changes the rules for resolving free variables in
+    Python functions. The Python 2.0 definition specifies exactly
+    three namespaces to check for each name -- the local namespace,
+    the global namespace, and the builtin namespace. According to
+    this definition, if a function A is defined within a function B,
+    the names bound in B are not visible in A. The proposal changes
+    the rules so that names bound in B are visible in A (unless A
+    contains a name binding that hides the binding in B).
+
+    The specification introduces rules for lexical scoping that are
+    common in Algol-like languages. The combination of lexical
+    scoping and existing support for first-class functions is
+    reminiscent of Scheme.
+
+    The changed scoping rules address two problems -- the limited
+    utility of lambda expressions and the frequent confusion of new
+    users familiar with other languages that support lexical scoping,
+    e.g. the inability to define recursive functions except at the
+    module level.
+
+    The lambda expression introduces an unnamed function that contains
+    a single expression. It is often used for callback functions. In
+    the example below (written using the Python 2.0 rules), any name
+    used in the body of the lambda must be explicitly passed as a
+    default argument to the lambda.
+
+        from Tkinter import *
+        root = Tk()
+        Button(root, text="Click here",
+               command=lambda root=root: root.test.configure(text="..."))
+
+    This approach is cumbersome, particularly when there are several
+    names used in the body of the lambda. The long list of default
+    arguments obscures the purpose of the code. The proposed solution,
+    in crude terms, implements the default argument approach
+    automatically. The "root=root" argument can be omitted.
+
 Specification
 
     Python is a statically scoped language with block structure, in
@@ -65,45 +104,40 @@ Specification
     for name resolution. The namespace of the class definition
     becomes the attribute dictionary of the class.
 
+    The following operations are name binding operations. If they
+    occur within a block, they introduce new local names in the
+    current block unless there is also a global declaration.
+
+    Function definition: def name ...
+    Class definition: class name ...
+    Assignment statement: name = ...
+    Import statement: import name, import module as name,
+        from module import name
+    Implicit assignment: names are bound by for statements and except
+        clauses
+
+    The arguments of a function are also local.
+
+    There are several cases where Python statements are illegal when
+    used in conjunction with nested scopes that contain free
+    variables.
+
+    If a variable is referenced in an enclosed scope, it is an error
+    to delete the name. The compiler will raise a SyntaxError for
+    'del name'.
+
+    If the wildcard form of import (import *) is used in a function
+    and the function contains a nested block with free variables, the
+    compiler will raise a SyntaxError.
+
+    If exec is used in a function and the function contains a nested
+    block with free variables, the compiler will raise a SyntaxError
+    unless the exec explicitly specifies the local namespace for the
+    exec. (In other words, "exec obj" would be illegal, but
+    "exec obj in ns" would be legal.)
+
 Discussion
 
-    This proposal changes the rules for resolving free variables in
-    Python functions. The Python 2.0 definition specifies exactly
-    three namespaces to check for each name -- the local namespace,
-    the global namespace, and the builtin namespace. According to
-    this defintion, if a function A is defined within a function B,
-    the names bound in B are not visible in A. The proposal changes
-    the rules so that names bound in B are visible in A (unless A
-    contains a name binding that hides the binding in B).
-
-    The specification introduces rules for lexical scoping that are
-    common in Algol-like languages. The combination of lexical
-    scoping and existing support for first-class functions is
-    reminiscent of Scheme.
-
-    The changed scoping rules address two problems -- the limited
-    utility of lambda statements and the frequent confusion of new
-    users familiar with other languages that support lexical scoping,
-    e.g. the inability to define recursive functions except at the
-    module level.
-
-    The lambda statement introduces an unnamed function that contains
-    a single statement. It is often used for callback functions. In
-    the example below (written using the Python 2.0 rules), any name
-    used in the body of the lambda must be explicitly passed as a
-    default argument to the lambda.
-
-        from Tkinter import *
-        root = Tk()
-        Button(root, text="Click here",
-               command=lambda root=root: root.test.configure(text="..."))
-
-    This approach is cumbersome, particularly when there are several
-    names used in the body of the lambda. The long list of default
-    arguments obscure the purpose of the code. The proposed solution,
-    in crude terms, implements the default argument approach
-    automatically. The "root=root" argument can be omitted.
-
     The specified rules allow names defined in a function to be
     referenced in any nested function defined with that function. The
     name resolution rules are typical for statically scoped languages,
@@ -152,10 +186,23 @@ Discussion
     mechanism to create name bindings (e.g. lambda and let in Scheme)
     and a mechanism to change the bindings (set! in Scheme).
 
+    XXX Alex Martelli suggests comparison with Java, which does not
+    allow name bindings to hide earlier bindings.
+
 Examples
 
     A few examples are included to illustrate the way the rules work.
 
+    XXX Explain the examples
+
+    >>> def make_adder(base):
+    ...     def adder(x):
+    ...         return base + x
+    ...     return adder
+    >>> add5 = make_adder(5)
+    >>> add5(6)
+    11
+
     >>> def make_fact():
     ...     def fact(n):
     ...         if n == 1:
@@ -167,14 +214,6 @@ Examples
     >>> fact(7)
     5040L
 
-    >>> def make_adder(base):
-    ...     def adder(x):
-    ...         return base + x
-    ...     return adder
-    >>> add5 = make_adder(5)
-    >>> add5(6)
-    11
-
     >>> def make_wrapper(obj):
     ...     class Wrapper:
     ...         def __getattr__(self, attr):
@@ -212,12 +251,18 @@ Examples
     loop. If g() is called before the loop is executed, a NameError will
     be raised.
 
-Other issues
+    XXX need some counterexamples
 
 Backwards compatibility
 
-    The proposed changes will break backwards compatibility for some
-    code. The following example from Skip Montanaro illustrates:
+    There are two kinds of compatibility problems caused by nested
+    scopes. In one case, code that behaved one way in earlier
+    versions behaves differently because of nested scopes. In the
+    other case, certain constructs interact badly with nested scopes
+    and will trigger SyntaxErrors at compile time.
+
+    The following example from Skip Montanaro illustrates the first
+    kind of problem:
 
         x = 1
         def f1():
@@ -230,17 +275,63 @@ Backwards compatibility
     refers to the global variable x and will print 1 if f1() is
     called. Under the new rules, it refers to the f1()'s namespace,
     the nearest enclosing scope with a binding.
-
+
     The problem occurs only when a global variable and a local
     variable share the same name and a nested function uses that name
     to refer to the global variable. This is poor programming
     practice, because readers will easily confuse the two different
-    variables.
+    variables. One example of this problem was found in the Python
+    standard library during the implementation of nested scopes.
 
     To address this problem, which is unlikely to occur often, a
     static analysis tool that detects affected code will be written.
     The detection problem is straightfoward.
 
+    The other compatibility problem is caused by the use of 'import *'
+    and 'exec' in a function body, when that function contains a
+    nested scope and the contained scope has free variables. For
+    example:
+
+        y = 1
+        def f():
+            exec "y = 'gotcha'" # or from module import *
+            def g():
+                return y
+            ...
+
+    At compile-time, the compiler cannot tell whether an exec that
+    operates on the local namespace or an import * will introduce
+    name bindings that shadow the global y. Thus, it is not possible
+    to tell whether the reference to y in g() should refer to the
+    global or to a local name in f().
+
+    In discussion on the python-list, people argued for both possible
+    interpretations. On the one hand, some thought that the reference
+    in g() should be bound to a local y if one exists. One problem
+    with this interpretation is that it is impossible for a human
+    reader of the code to determine the binding of y by local
+    inspection. It seems likely to introduce subtle bugs. The other
+    interpretation is to treat exec and import * as dynamic features
+    that do not affect static scoping. Under this interpretation, the
+    exec and import * would introduce local names, but those names
+    would never be visible to nested scopes. In the specific example
+    above, the code would behave exactly as it did in earlier versions
+    of Python.
+
+    Since each interpretation is problematic and the exact meaning
+    ambiguous, the compiler raises an exception.
+
+    A brief review of three Python projects (the standard library,
+    Zope, and a beta version of PyXPCOM) found four backwards
+    compatibility issues in approximately 200,000 lines of code.
+    There was one example of case #1 (subtle behavior change) and two
+    examples of import * problems in the standard library.
+
+    (The interpretation of the import * and exec restriction that was
+    implemented in Python 2.1a2 was much more restrictive, based on
+    language in the reference manual that had never been
+    enforced. These restrictions were relaxed following the release.)
+
 locals() / vars()
 
     These functions return a dictionary containing the current scope's
@@ -288,54 +379,27 @@ Rebinding names in enclosing scopes
 
 Implementation
 
-    An implementation effort is underway. The implementation requires
-    a way to create closures, an object that combines a function's
-    code and the environment in which to resolve free variables.
+    The implementation for C Python uses flat closures [1]. Each def
+    or lambda expression that is executed will create a closure if the
+    body of the function or any contained function has free
+    variables. Using flat closures, the creation of closures is
+    somewhat expensive but lookup is cheap.
 
-    There are a variety of implementation alternatives for closures.
-    Two typical ones are nested closures and flat closures. Nested
-    closures use a static link from a nested function to its enclosing
-    environment. This implementation requires several links to be
-    followed if there is more than one level of nesting and keeps many
-    garbage objects alive longer than necessary.
+    The implementation adds several new opcodes and two new kinds of
+    names in code objects. A variable can be either a cell variable
+    or a free variable for a particular code object. A cell variable
+    is referenced by containing scopes; as a result, the function
+    where it is defined must allocate separate storage for it on each
+    invocation. A free variable is referenced via a function's closure.
 
-    Flat closures are roughly similar to the default argument hack
-    currently used for lambda support. Each function object would
-    have a func_env slot that holds a tuple of free variable bindings.
-    The code inside the function would use LOAD_ENV to access these
-    bindings rather than the typical LOAD_FAST.
+    XXX Much more to say here
 
-    The problem with this approach is that rebindings are not visible
-    to the nested function. Consider the following example:
+References
 
-        import threading
-        import time
-
-        def outer():
-            x = 2
-            def inner():
-                while 1:
-                    print x
-                    time.sleep(1)
-            threading.Thread(target=inner).start()
-            while 1:
-                x = x + 1
-                time.sleep(0.8)
-
-    If the func_env slot is defined when MAKE_FUNCTION is called, then
-    x in innner() is bound to the value of x in outer() at function
-    definition time. This is the default argument hack, but not
-    actual name resolution based on statically nested scopes.
-
-    To support shared visibility of updates, it will be necessary to
-    have a tuple of cells that contain references to variables. The
-    extra level of indirection should allow updates to be shared.
-
-    It is not clear whether the current 1-pass Python compiler can
-    determine which references are to globals and which are references
-    to enclosing scopes. It may be possible to make minimal changes
-    that defers the optimize() call until a second pass, after scopes
-    have been determined.
+    [1] Luca Cardelli. Compiling a functional language. In Proc. of
+        the 1984 ACM Conference on Lisp and Functional Programming,
+        pp. 208-217, Aug. 1984
+        http://citeseer.nj.nec.com/cardelli84compiling.html
 
 Local Variables: