Add lots of detail and examples to the Details section. Free variables are now captured at definition time.
Guido van Rossum 2003-10-23 06:36:37 +00:00
parent 1a2ad824e4
commit f7361e1f66
1 changed files with 103 additions and 31 deletions


@@ -65,7 +65,7 @@ for itertools.ifilter() and itertools.imap(). In contrast, the
utility of other itertools will be enhanced by generator expressions::
dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))
Having a syntax similar to list comprehensions also makes it easy to
convert existing code into a generator expression when scaling up an
application.
@@ -85,16 +85,29 @@ proposal from Peter Norvig [3]_.
The Details
===========
1. The semantics of a generator expression are equivalent to creating
an anonymous generator function and calling it. There's still discussion
about whether that generator function should copy the current value of
all free variables into default arguments.
(None of this is exact enough in the eye of a reader from Mars, but I
hope the examples convey the intention well enough for a discussion in
c.l.py. The Python Reference Manual should contain a 100% exact
semantic and syntactic specification.)
2. The syntax requires that a generator expression always needs to be inside
a set of parentheses and cannot have a comma on either side. Unfortunately,
this is different from list comprehensions. While [1, x for x in R] is
illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)].
With reference to the file Grammar/Grammar in CVS, two rules change:
1. The semantics of a generator expression are equivalent to creating
an anonymous generator function and calling it. For example::
g = (x**2 for x in range(10))
print g.next()
is equivalent to::
def __gen():
for x in range(10):
yield x**2
g = __gen()
print g.next()
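The equivalence described in item 1 can be checked directly. The following is a minimal sketch, assuming a Python version that provides the built-in ``next()`` (the examples above use the older ``g.next()`` spelling). The name ``_gen`` is hypothetical and merely stands in for the anonymous generator function.

```python
# Generator expression form.
g = (x**2 for x in range(10))

# Hand-written equivalent; `_gen` is a hypothetical stand-in for the
# anonymous generator function described above.
def _gen():
    for x in range(10):
        yield x**2

h = _gen()

# Compare the first value produced by each form.
first_from_expression = next(g)
first_from_function = next(h)

# Compare whatever is left in each iterator.
rest_from_expression = list(g)
rest_from_function = list(h)

print(first_from_expression == first_from_function)
print(rest_from_expression == rest_from_function)
```

Both comparisons should agree, since the two forms yield the same sequence of squares.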
2. The syntax requires that a generator expression always needs to be
directly inside a set of parentheses and cannot have a comma on either
side. With reference to the file Grammar/Grammar in CVS, two rules
change:
a) The rule::
@@ -109,33 +122,92 @@ With reference to the file Grammar/Grammar in CVS, two rules change:
b) The rule for arglist needs similar changes.
This means that you can write::
2. The loop variable is not exposed to the surrounding function. This
facilitates the implementation and makes typical use cases more reliable.
In some future version of Python, list comprehensions will also hide the
sum(x**2 for x in range(10))
but you would have to write::
reduce(operator.add, (x**2 for x in range(10)))
and also::
g = (x**2 for x in range(10))
i.e. if a function call has a single positional argument, it can be a
generator expression without extra parentheses, but in all other cases
you have to parenthesize it.
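The three spellings in item 2 can be compared side by side. This is a sketch only, assuming an interpreter where ``reduce()`` is importable from ``functools`` (in the Python of this proposal it is a built-in); the variable names are illustrative.

```python
import operator
from functools import reduce  # assumption: reduce() is available here

# Sole positional argument: no extra parentheses are needed.
total = sum(x**2 for x in range(10))

# One of two arguments: the generator expression needs its own parentheses.
reduced = reduce(operator.add, (x**2 for x in range(10)))

# Bound to a name: the parentheses are required here as well.
g = (x**2 for x in range(10))
assigned = sum(g)

print(total == reduced == assigned)
```

All three compute the same sum of squares; only the required parenthesization differs.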
3. The loop variable (if it is a simple variable or a tuple of simple
variables) is not exposed to the surrounding function. This facilitates
the implementation and makes typical use cases more reliable. In some
future version of Python, list comprehensions will also hide the
induction variable from the surrounding code (and, in Py2.4, warnings
will be issued for code accessing the induction variable).
3. There is still discussion about whether variables referenced in generator
expressions will exhibit late binding just like other Python code. In the
following example, the iterator runs *after* the value of y is set to one::
def h():
y = 0
l = [1,2]
def gen(S):
for x in S:
yield x+y
it = gen(l)
y = 1
for v in it:
print v
For example::
4. List comprehensions will remain unchanged::
x = "hello"
y = list(x for x in "abc")
print x # prints "hello", not "c"
(Loop variables may also use constructs like x[i] or x.a; this form
may be deprecated.)
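The same hiding applies when the loop target is a tuple of simple variables. The sketch below assumes an interpreter that already implements generator expressions; the names ``left``, ``right``, ``sums`` and ``leaked`` are hypothetical and chosen only for the illustration.

```python
outer = "hello"

# Tuple loop target: `left` and `right` exist only inside the expression.
sums = list(left + right for left, right in [(1, 2), (3, 4)])

# Probe the surrounding scope for one of the loop variables.
try:
    left
except NameError:
    leaked = False   # the name was never bound out here
else:
    leaked = True

print(sums)
print(leaked)
print(outer)
```

The probe uses ``try``/``except NameError`` rather than inspecting a namespace dictionary so that it behaves the same at module level and inside a function.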
4. All free variable bindings are captured at the time this function
is defined, and passed into it using default argument values. For
example::
x = 0
g = (x for c in "abc") # x is not the loop variable!
x = 1
print g.next() # prints 0 (captured x), not 1 (current x)
This behavior of free variables is almost always what you want when
the generator expression is evaluated at a later point than its
definition. In fact, to date, no examples have been found of code
where it would be better to use the execution-time instead of the
definition-time value of a free variable.
Note that free variables aren't copied, only their binding is
captured. They may still change if they are mutable, for example::
x = []
g = (x for c in "abc")
x.append(1)
print g.next() # prints [1], not []
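The capture rule in item 4 can be modelled by writing the default arguments out by hand. This is a sketch of the described mechanism, not a statement about how any particular interpreter treats an actual generator expression; ``_gen``, ``_gen2`` and ``items`` are hypothetical names.

```python
x = 0

def _gen(x=x):            # the default value captures the current binding of x
    for c in "abc":
        yield x

g = _gen()
x = 1                     # rebinding the outer name does not affect g
captured = next(g)

items = []

def _gen2(items=items):   # only the binding is captured, not a copy
    for c in "abc":
        yield items

g2 = _gen2()
items.append(1)           # mutating the shared object is still visible
seen = next(g2)

print(captured)
print(seen)
```

Rebinding ``x`` leaves the captured value alone, whereas ``items.append(1)`` shows through because both names refer to the same list object.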
5. List comprehensions will remain unchanged. For example::
[x for x in S] # This is a list comprehension.
[(x for x in S)] # This is a list containing one generator expression.
Unfortunately, there is currently a slight syntactic difference. The
expression::
[x for x in 1, 2, 3]
is legal, meaning::
[x for x in (1, 2, 3)]
But generator expressions will not allow the former version::
(x for x in 1, 2, 3)
is illegal.
The former list comprehension syntax will become illegal in Python
3.0, and should be deprecated in Python 2.4 and beyond.
List comprehensions also "leak" their loop variable into the
surrounding scope. This will also change in Python 3.0, so that the
semantic definition of a list comprehension in Python 3.0 will be
equivalent to list(<generator expression>). Python 2.4 and beyond
should issue a deprecation warning if a list comprehension's loop
variable has the same name as a variable used in the immediately
surrounding scope.
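The distinctions in item 5 can be probed with ``compile()``. The following is a minimal sketch, assuming an interpreter that implements generator expressions; ``is_legal`` is a hypothetical helper, and the result for the bare ``[x for x in 1, 2, 3]`` form is deliberately left unasserted because it depends on the Python version.

```python
import types

S = [1, 2, 3]

as_list = [x for x in S]       # a list comprehension
wrapped = [(x for x in S)]     # a list containing one generator expression

is_generator = isinstance(wrapped[0], types.GeneratorType)

def is_legal(source):
    """Return True if `source` compiles as an expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

parenthesized_ok = is_legal("[x for x in (1, 2, 3)]")
bare_genexp_ok = is_legal("(x for x in 1, 2, 3)")
bare_listcomp_ok = is_legal("[x for x in 1, 2, 3]")  # version dependent

print(as_list, len(wrapped), is_generator)
print(parenthesized_ok, bare_genexp_ok)
```

Checking the syntax through ``compile()`` keeps the illegal forms out of the module source, so the example itself still parses.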
Reduction Functions
===================
@@ -152,9 +224,9 @@ Acknowledgements
* Raymond Hettinger first proposed the idea of "generator comprehensions"
in January 2002.
* Peter Norvig resurrected the discussion in his proposal for
Accumulation Displays [3]_.
Accumulation Displays.
* Alex Martelli provided critical measurements that proved the performance
benefits of generator expressions. He also provided strong arguments
@@ -179,7 +251,7 @@ References
.. [3] Peter Norvig's Accumulation Display Proposal
http://www.norvig.com/pyacc.html
.. [4] Jeff Epler had worked up a patch demonstrating
the previously proposed bracket and yield syntax
http://python.org/sf/795947