PEP: 289
Title: Generator Comprehensions
Version: $Revision$
Last-Modified: $Date$
Author: python@rcn.com (Raymond D. Hettinger)
Status: Rejected
Type: Standards Track
Created: 30-Jan-2002
Python-Version: 2.3
Post-History:


Abstract

This PEP introduces generator comprehensions as an idea for
enhancing the generators introduced in Python version 2.2 [1].
The goal is to increase the convenience, utility, and power of
generators by making it easy to convert a list comprehension into
a generator.


Rationale

Python 2.2 introduced the concept of an iterable interface as
proposed in PEP 234 [4]. The iter() factory function was provided
as a common calling convention and deep changes were made to use
iterators as a unifying theme throughout Python. The unification
came in the form of establishing a common iterable interface for
mappings, sequences, and file objects.

Generators, as proposed in PEP 255 [1], were introduced as a means
for making it easier to create iterators, especially ones with
complex internal execution or variable states. When I created new
programs, generators were often the tool of choice for creating an
iterator.

However, when updating existing programs, I found that the tool
had another use, one that improved program function as well as
structure. Some programs exhibited a pattern of creating large
lists and then looping over them. As data sizes increased, the
programs encountered scalability limitations owing to excessive
memory consumption (and malloc time) for the intermediate lists.
Generators were found to be directly substitutable for the lists
while eliminating the memory issues through lazy evaluation
(a.k.a. just-in-time manufacturing).

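As a minimal sketch of the substitution described above, the
following contrasts a function that builds an intermediate list with
a generator that produces the same values lazily. The function names
are hypothetical and are not taken from the programs that were
updated. The code is written for a current Python interpreter; under
Python 2.2, generators additionally required
"from __future__ import generators".

```python
def squares_list(n):
    # Eager: the whole intermediate list exists before the caller
    # sees the first value.
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_gen(n):
    # Lazy: one value is produced per iteration, so no intermediate
    # list is ever stored.
    for i in range(n):
        yield i * i


def total(values):
    # The consuming loop is identical for a list and for a generator.
    acc = 0
    for v in values:
        acc += v
    return acc
```

Because total() only iterates over its argument, squares_gen(n) can
stand in for squares_list(n) without changing the consuming code.
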
Python itself encountered similar issues. As a result, xrange()
and xreadlines() were introduced. And, in the case of file
objects and mappings, just-in-time evaluation became the norm.
Generators provide a tool to program memory-conserving for-loops
whenever complete evaluation is not desired because of memory
restrictions or availability of data.

The next step in the evolution of generators is to establish a
generator alternative to list comprehensions [3]. This
alternative provides a simple way to convert a list comprehension
into a generator whenever memory issues arise.

This suggestion is designed to take advantage of the existing
implementation and require little additional effort to
incorporate. It is backward compatible and requires no new
keywords.


BDFL Pronouncements

Generator comprehensions are REJECTED. The rationale is that the
benefits are marginal since generators can already be coded
directly and the costs are high because implementation and
maintenance require major efforts with the parser.


Reference Implementation

There is not currently a CPython implementation; however, a
simulation module written in pure Python is available on
SourceForge [5]. The simulation is meant to allow direct
experimentation with the proposal.

There is also a module [6] with working source code for all of the
examples used in this PEP. It serves as a test suite for the
simulator and it documents how the new feature works in
practice.

The authors and implementers of PEP 255 [1] were contacted to
provide their assessment of whether these enhancements were going
to be straightforward to implement and require only minor
modification of the existing generator code. Neil felt the
assertion was correct. Ka-Ping thought so also. GvR said he
could believe that it was true. Later GvR re-assessed and thought
that it would be difficult to tweak the code generator to produce
a separate object. Tim did not have an opportunity to give an
assessment.


Specification for Generator Comprehensions

If a list comprehension starts with a 'yield' keyword, then
express the comprehension with a generator. For example:

    g = [yield (len(line), line) for line in file if len(line) > 5]

This would be implemented as if it had been written:

    def __temp():
        for line in file:
            if len(line) > 5:
                yield (len(line), line)
    g = __temp()

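The proposed bracketed form was never accepted, so it cannot be run
directly. The sketch below exercises only the hand-written
equivalent, using a hypothetical helper name (long_lines) and a
small list of strings standing in for the file object. It is meant
to illustrate the lazy behavior the translation implies, not to
reproduce the simulation module [5].

```python
def long_lines(lines):
    # Hand-written equivalent of the proposed
    #     [yield (len(line), line) for line in lines if len(line) > 5]
    for line in lines:
        if len(line) > 5:
            yield (len(line), line)


sample = ["short", "a longer line", "tiny", "another long one"]

g = long_lines(sample)   # nothing has been evaluated yet
first = next(g)          # runs only as far as the first match
rest = list(g)           # drains the remaining matches
```

Here first is (13, "a longer line") and rest holds the single
remaining pair (16, "another long one"); the five-character "short"
fails the len(line) > 5 test.
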
Note A: There is some discussion about whether the enclosing
brackets should be part of the syntax for generator
comprehensions. On the plus side, it neatly parallels list
comprehensions and would be immediately recognizable as a similar
form with similar internal syntax (taking maximum advantage of
what people already know). More importantly, it sets off the
generator comprehension from the rest of the function so as not to
suggest that the enclosing function is a generator (currently the
only cue that a function is really a generator is the presence of
the yield keyword). On the minus side, the brackets may falsely
suggest that the whole expression returns a list. Most of the
feedback received to date indicates that brackets are helpful and
not misleading. Unfortunately, the one dissent is from GvR.

A key advantage of the generator comprehension syntax is that it
makes it trivially easy to transform existing list comprehension
code to a generator by adding yield. Likewise, it can be
converted back to a list by deleting yield. This makes it easy to
scale up programs from small datasets to ones large enough to
warrant just-in-time evaluation.

Note B: List comprehensions expose their looping variable and
leave that variable in the enclosing scope. The code
[str(i) for i in range(8)] leaves 'i' set to 7 in the scope where
the comprehension appears. This behavior is by design and reflects
an intent to duplicate the result of coding a for-loop instead of a
list comprehension. Further, the variable 'i' is in a defined and
potentially useful state on the line immediately following the
list comprehension.

In contrast, generator comprehensions do not expose the looping
variable to the enclosing scope. The code
[yield str(i) for i in range(8)] leaves 'i' untouched in the scope
where the comprehension appears. This is also by design and
reflects an intent to duplicate the result of coding a generator
directly instead of a generator comprehension. Further, the
variable 'i' is not in a defined state on the line immediately
following the generator comprehension. It does not come into
existence until iteration starts (possibly never).

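The scoping contrast in Note B can be sketched with the two
constructs the comprehensions are meant to mirror: an explicit
for-loop and a directly coded generator. An explicit loop is used
for the first case because later Python versions gave list
comprehensions their own scope, so the leak described above is no
longer observable with a comprehension itself. The name str_gen is
hypothetical.

```python
i = "untouched"


def str_gen():
    # The loop variable here lives in the generator's own frame.
    for i in range(8):
        yield str(i)


g = str_gen()
before = i             # the generator body has not started running
strings = list(g)      # run the generator to completion
after = i              # the outer 'i' is still unchanged

# An explicit for-loop, by contrast, rebinds the name in the
# enclosing scope and leaves it at the last value produced.
for i in range(8):
    pass
loop_result = i
```

After this runs, before and after both still hold "untouched", while
loop_result is 7.
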
Comments from GvR: Cute hack, but I think the use of the [] syntax
    strongly suggests that it would return a list, not an
    iterator. I also think that this is trying to turn Python into
    a functional language, where most algorithms use lazy infinite
    sequences, and I just don't think that's where its future
    lies.

    I don't think it's worth the trouble. I expect it will take a
    lot of work to hack it into the code generator: it has to
    create a separate code object in order to be a generator.
    List comprehensions are inlined, so I expect that the
    generator comprehension code generator can't share much with
    the list comprehension code generator. And this for something
    that's not that common and easily done by writing a 2-line
    helper function. IOW the ROI isn't high enough.

Comments from Ka-Ping Yee: I am very happy with the things you have
    proposed in this PEP. I feel quite positive about generator
    comprehensions and have no reservations. So a +1 on that.

Comments from Neil Schemenauer: I'm -0 on the generator list
    comprehensions. They don't seem to add much. You could
    easily use a nested generator to do the same thing. They
    smell like lambda.

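One reading of the nested-generator alternative mentioned above is
a generator defined inside the function that consumes it. The
sketch below assumes that reading; the names report and long_lines
are hypothetical. Only the inner function contains yield, so the
enclosing function remains an ordinary function.

```python
def report(lines):
    # 'report' is an ordinary function; only the nested helper
    # below is a generator.
    def long_lines():
        for line in lines:
            if len(line) > 5:
                yield line.upper()

    # The helper is consumed lazily by join(), one item at a time.
    return ", ".join(long_lines())
```

For example, report(["short", "longer", "tiny", "lengthy"]) returns
the string "LONGER, LENGTHY".
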
Comments from Magnus Lie Hetland: Generator comprehensions seem mildly
    useful, but I vote +0. Defining a separate, named generator
    would probably be my preference. On the other hand, I do see
    the advantage of "scaling up" from list comprehensions.

Comments from the Community: The response to the generator comprehension
    proposal has been mostly favorable. There were some 0 votes
    from people who didn't see a real need or who were not
    energized by the idea. Some of the 0 votes were tempered by
    comments that the reviewer did not even like list
    comprehensions or did not have any use for generators in any
    form. The +1 votes outnumbered the 0 votes by about two to
    one.

Author response: I've studied several syntactical variations and
    concluded that the brackets are essential for:
    - teachability (it's like a list comprehension)
    - set-off (yield applies to the comprehension, not the enclosing
      function)
    - substitutability (list comprehensions can be made lazy just by
      adding yield)

    What I like best about generator comprehensions is that I can
    design using list comprehensions and then easily switch to a
    generator (by adding yield) in response to scalability
    requirements (when the list comprehension produces too large
    an intermediate result). Had generators already been
    in place when list comprehensions were accepted, the yield
    option might have been incorporated from the start. For
    certain, the mathematical style notation is explicit and
    readable as compared to a separate function definition with an
    embedded yield.


References

[1] PEP 255 Simple Generators
    http://python.sourceforge.net/peps/pep-0255.html

[2] PEP 212 Loop Counter Iteration
    http://python.sourceforge.net/peps/pep-0212.html

[3] PEP 202 List Comprehensions
    http://python.sourceforge.net/peps/pep-0202.html

[4] PEP 234 Iterators
    http://python.sourceforge.net/peps/pep-0234.html

[5] A pure Python simulation of every feature in this PEP is at:
    http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17348&aid=513752

[6] The full, working source code for each of the examples in this PEP
    along with other examples and tests is at:
    http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17412&aid=513756


Copyright

This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: