PEP: 289
Title: Generator Expressions
Version: $Revision$
Last-Modified: $Date$
Author: python@rcn.com (Raymond D. Hettinger)
Status: Active
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003


Abstract
========

This PEP introduces generator expressions as a high-performance,
memory-efficient generalization of list comprehensions [1]_ and
generators [2]_.


Rationale
=========

Experience with list comprehensions has shown their widespread
utility throughout Python. However, many of the use cases do
not need to have a full list created in memory. Instead, they
only need to iterate over the elements one at a time.

For instance, the following summation code will build a full list of
squares in memory, iterate over those values, and, when the reference
is no longer needed, delete the list::

    sum([x*x for x in range(10)])

Time, clarity, and memory are conserved by using a generator
expression instead::

    sum(x*x for x in range(10))

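As a rough illustration of the memory claim, the sketch below compares
the size of the list built by a comprehension with the size of the
corresponding generator object. It assumes a present-day Python 3
interpreter; ``n`` and the variable names are chosen here for
illustration only. Note that ``sys.getsizeof`` reports only the
container's own footprint, not the integers it refers to.

```python
import sys

n = 100_000  # illustrative size, not taken from the PEP

# The list comprehension materializes every square up front.
squares_list = [x * x for x in range(n)]

# The generator expression only stores its suspended state.
squares_gen = (x * x for x in range(n))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

# Both forms feed sum() the same values.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

print(list_size, gen_size, total_from_list == total_from_gen)
```

The generator's size stays constant as ``n`` grows, while the list's
size grows roughly linearly with it.
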
Similar benefits are conferred on constructors for container objects::

    s = Set(word for line in page for word in line.split())
    d = dict( (k, func(k)) for k in keylist)

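A self-contained version of the constructor examples might look like
the following sketch. It assumes Python 3, where the built-in ``set``
takes the place of ``sets.Set``; the contents of ``page`` and
``keylist`` and the body of ``func`` are hypothetical placeholders.

```python
# Hypothetical sample data; none of these values come from the PEP.
page = ["the quick brown fox", "jumps over the lazy dog", "the end"]
keylist = ["a", "bb", "ccc"]


def func(key):
    """Placeholder for whatever per-key computation is needed."""
    return len(key)


# The built-in set() stands in for sets.Set (assumption: Python 3).
s = set(word for line in page for word in line.split())

# dict() accepts any iterable of (key, value) pairs.
d = dict((k, func(k)) for k in keylist)

print(sorted(s))
print(d)
```

In both calls the container is filled directly from the generator, so
no intermediate list of words or pairs is built.
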
Generator expressions are especially useful with functions like sum(),
min(), and max() that reduce an iterable input to a single value::

    max(len(line) for line in file if line.strip())

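To try the longest-line example without a file on disk, one option is
to substitute an in-memory text stream, as in this sketch. The use of
``io.StringIO`` and the sample text are assumptions made for
illustration.

```python
import io

# Stand-in for an open text file (hypothetical contents).
file = io.StringIO("short\n\n   \na much longer line\nmid length\n")

# Empty and whitespace-only lines are filtered out before len() is
# applied; each remaining length includes the trailing newline.
longest = max(len(line) for line in file if line.strip())

print(longest)
```

Lines are read and measured one at a time, so the whole file never
needs to be held in memory.
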
Generator expressions also address some examples of functionals coded
with lambda::

    reduce(lambda s, a: s + a.myattr, data, 0)
    reduce(lambda s, a: s + a[3], data, 0)

These simplify to::

    sum(a.myattr for a in data)
    sum(a[3] for a in data)

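The equivalence can be checked with a small sketch. It assumes
Python 3, where ``reduce`` lives in ``functools``; the ``Record`` type
and the sample values are hypothetical.

```python
from collections import namedtuple
from functools import reduce  # reduce() is a builtin in Python 2

# Hypothetical records with a "myattr" field, plus plain 4-tuples.
Record = namedtuple("Record", ["name", "myattr"])
data = [Record("a", 3), Record("b", 4), Record("c", 5)]
rows = [(0, 0, 0, 1.5), (0, 0, 0, 2.5)]

# Attribute access: lambda-based fold versus generator expression.
attr_total_reduce = reduce(lambda s, a: s + a.myattr, data, 0)
attr_total_genexp = sum(a.myattr for a in data)

# Index access: the same comparison for a[3].
col_total_reduce = reduce(lambda s, a: s + a[3], rows, 0)
col_total_genexp = sum(a[3] for a in rows)

print(attr_total_reduce, attr_total_genexp)
print(col_total_reduce, col_total_genexp)
```
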
List comprehensions greatly reduced the need for filter() and map().
Likewise, generator expressions are expected to minimize the need
for itertools.ifilter() and itertools.imap(). In contrast, the
utility of other itertools will be enhanced by generator expressions::

    dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

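The sketch below shows both halves of that claim under the assumption
of a Python 3 interpreter, where the lazy built-ins ``filter``, ``map``
and ``zip`` play the roles of ``itertools.ifilter``, ``itertools.imap``
and ``itertools.izip``. The sample data is hypothetical.

```python
# Hypothetical sample data.
words = ["alpha", "", "beta", "gamma", ""]
x_vector = [1, 2, 3]
y_vector = [4, 5, 6]

# Functional style: a lazy filter feeding a lazy map.
lengths_functional = list(map(len, filter(None, words)))

# The same pipeline as a single generator expression.
lengths_genexp = list(len(w) for w in words if w)

# A pairing tool such as zip() combines naturally with a generator
# expression instead of being replaced by it.
dotproduct = sum(x * y for x, y in zip(x_vector, y_vector))

print(lengths_functional, lengths_genexp, dotproduct)
```
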
Having a syntax similar to list comprehensions also makes it easy to
convert existing code into a generator expression when scaling up
an application.


BDFL Pronouncements
===================

The previous version of this PEP was REJECTED. The bracketed yield
syntax left something to be desired; the performance gains had not been
demonstrated; and the range of use cases had not been shown. After
much discussion on the python-dev list, the PEP has been resurrected
in its present form. The impetus for the discussion was an innovative
proposal from Peter Norvig [3]_.


The Details
===========

1. The semantics of a generator expression are equivalent to creating
   an anonymous generator function and calling it. There's still discussion
   about whether that generator function should copy the current value of
   all free variables into default arguments.

2. The syntax requires that a generator expression always be enclosed in
   a set of parentheses and not have a comma on either side. Unfortunately,
   this is different from list comprehensions. While [1, x for x in R] is
   illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)].
   With reference to the file Grammar/Grammar in CVS, two rules change:

   a) The rule::

         atom: '(' [testlist] ')'

      changes to::

         atom: '(' [listmaker1] ')'

      where listmaker1 is almost the same as listmaker, but only allows
      a single test after 'for' ... 'in'.

   b) The rule for arglist needs similar changes.

3. The loop variable is not exposed to the surrounding function. This
   facilitates the implementation and makes typical use cases more reliable.
   In some future version of Python, list comprehensions will also hide the
   induction variable from the surrounding code (and, in Py2.4, warnings
   will be issued for code accessing the induction variable).

4. There is still discussion about whether variables referenced in generator
   expressions will exhibit late binding just like other Python code. In the
   following example, the iterator runs *after* the value of y is set to one::

        def h():
            y = 0
            l = [1,2]
            def gen(S):
                for x in S:
                    yield x+y      # y is looked up when the loop runs
            it = gen(l)            # no body code has executed yet
            y = 1
            for v in it:
                print v            # prints 2 and 3, not 1 and 2

5. List comprehensions will remain unchanged::

        [x for x in S]    # This is a list comprehension.
        [(x for x in S)]  # This is a list containing one generator expression.


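The points in the list above can be explored with the following
sketch. It is written for a present-day Python 3 interpreter, so it
shows how the rules eventually behave there rather than anything this
draft has settled; in particular, the late-binding result reflects one
side of the open discussion. The helper names ``_squares``,
``compiles``, ``loop_variable_is_hidden`` and ``late_binding`` are
hypothetical.

```python
# Anonymous generator function: a generator expression behaves much
# like a generator function that is defined and called on the spot.
def _squares(iterable):
    for x in iterable:
        yield x * x


data = [1, 2, 3]
from_genexp = list(x * x for x in data)
from_function = list(_squares(data))


# Parenthesization: a generator expression next to a comma needs its
# own parentheses.  compile() only checks syntax, so f need not exist.
def compiles(source):
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


sole_argument_ok = compiles("sum(x for x in data)")
bare_with_comma_ok = compiles("f(x for x in data, 1)")
parenthesized_ok = compiles("f((x for x in data), 1)")


# Loop variable scope: i never becomes a local of the enclosing function.
def loop_variable_is_hidden():
    total = sum(i for i in range(3))
    return total, "i" in locals()


# Late binding: y is looked up when the generator runs, after rebinding.
def late_binding():
    y = 0
    it = (x + y for x in [1, 2])
    y = 1
    return list(it)


print(from_genexp == from_function)
print(sole_argument_ok, bare_with_comma_ok, parenthesized_ok)
print(loop_variable_is_hidden())
print(late_binding())
```

Copying free variables into default arguments, the alternative
mentioned in the list, would instead capture ``y`` at the point where
the expression is created.
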
Reduction Functions
===================

The utility of generator expressions is greatly enhanced when combined
with reduction functions like sum(), min(), and max(). Separate
proposals are forthcoming that recommend several new accumulation
functions, possibly including product(), average(), alltrue(),
anytrue(), nlargest(), and nsmallest().


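For a sense of how such accumulation functions pair with generator
expressions, here is a sketch using functions that exist in later
Python releases: the built-ins ``any()`` and ``all()``, and
``heapq.nlargest()`` and ``heapq.nsmallest()``. These are not
necessarily the functions the forthcoming proposals describe, and the
sample data is hypothetical.

```python
import heapq

values = [3, 8, 1, 9, 4]  # hypothetical sample data

# Short-circuiting truth tests over a lazily produced stream.
any_above_eight = any(v > 8 for v in values)
all_positive = all(v > 0 for v in values)

# Selecting extremes of a derived sequence without building it first.
two_largest_squares = heapq.nlargest(2, (v * v for v in values))
two_smallest_squares = heapq.nsmallest(2, (v * v for v in values))

print(any_above_eight, all_positive)
print(two_largest_squares, two_smallest_squares)
```
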
Acknowledgements
================

* Raymond Hettinger first proposed the idea of "generator comprehensions"
  in January 2002.

* Peter Norvig resurrected the discussion in his proposal for
  Accumulation Displays [3]_.

* Alex Martelli provided critical measurements that proved the performance
  benefits of generator expressions. He also provided strong arguments
  that they were a desirable thing to have.

* Samuele Pedroni provided the example of late binding.
  Various contributors have made arguments for and against late binding.

* Phillip Eby suggested "iterator expressions" as the name.

* Subsequently, Tim Peters suggested the name "generator expressions".


References
==========

.. [1] PEP 202 List Comprehensions
   http://python.sourceforge.net/peps/pep-0202.html

.. [2] PEP 255 Simple Generators
   http://python.sourceforge.net/peps/pep-0255.html

.. [3] Peter Norvig's Accumulation Display Proposal
   http://www.norvig.com/pyacc.html

.. [4] Jeff Epler had worked up a patch demonstrating
   the previously proposed bracket and yield syntax
   http://python.org/sf/795947

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: