PEP: 279
Title: Enhanced Generators
Version: $Revision$
Last-Modified: $Date$
Author: python@rcn.com (Raymond D. Hettinger)
Status: Draft
Type: Standards Track
Created: 30-Jan-2002
Python-Version: 2.3
Post-History:

Abstract

This PEP introduces four orthogonal (not mutually exclusive) ideas
for enhancing the generators introduced in Python version 2.2 [1].
The goal is to increase the convenience, utility, and power
of generators.

Rationale

Starting with xrange() and xreadlines(), Python has been evolving
toward a model that provides lazy evaluation as an alternative
when complete evaluation is not desired because of memory
restrictions or availability of data.

Starting with Python 2.2, a second evolutionary direction came in
the form of iterators and generators. The iter() factory function
and generators were provided as convenient means of creating
iterators. Deep changes were made to use iterators as a unifying
theme throughout Python. The unification came in the form of
establishing a common iterable interface for mappings, sequences,
and file objects. In the case of mappings and file objects, lazy
evaluation became the norm.

The next steps in the evolution of generators are:

1. Add built-in functions which provide lazy alternatives to their
   complete evaluation counterparts, plus one other convenience
   function which was made possible once iterators and generators
   became available. The new functions are xzip, xmap, xfilter,
   and indexed.

2. Provide a generator alternative to list comprehensions [3],
   making generator creation as convenient as list creation.

3. Extend the syntax of the 'yield' keyword to enable generator
   parameter passing. The resulting increase in power simplifies
   the creation of consumer streams which have a complex execution
   state and/or variable state.

4. Add a generator method to enable exceptions to be passed to a
   generator. Currently, there is no clean method for triggering
   exceptions from outside the generator. Also, generator exception
   passing helps mitigate the try/finally prohibition for generators.

All of the suggestions are designed to take advantage of the
existing implementation and require little additional effort to
incorporate. Each is backward compatible and requires no new
keywords. These generator tools are intended for Python 2.3, when
generators become final and no longer need to be imported from
__future__.

Reference Implementation

There is not currently a CPython implementation; however, a simulation
module written in pure Python is available on SourceForge [8]. The
simulation covers every feature proposed in this PEP and is meant
to allow direct experimentation with the proposals.

There is also a module [9] with working source code for all of the
examples used in this PEP. It serves as a test suite for the simulator
and it documents how each of the new features works in practice.

Specification for new built-ins:

    def xfilter(pred, gen):
        '''
        xfilter(...)
        xfilter(function, sequence) -> iterator

        Return an iterator containing those items of sequence for
        which function is true. If function is None, return an
        iterator of the items that are true.
        '''
        if pred is None:
            for i in gen:
                if i:
                    yield i
        else:
            for i in gen:
                if pred(i):
                    yield i

    def xmap(fun, *collections):    ### Code from Python Cookbook [6]
        '''
        xmap(...)
        xmap(function, sequence[, sequence, ...]) -> iterator

        Return an iterator applying the function to the items of the
        argument collection(s). If more than one collection is given,
        the function is called with an argument list consisting of the
        corresponding item of each collection, substituting None for
        missing values when not all collections have the same length.
        If the function is None, return an iterator of the items of the
        collection (or an iterator of tuples if more than one collection).
        '''
        gens = map(iter, collections)
        values_left = [1]
        def values():
            # Emulate map behavior by padding sequences with None
            # when they run out of values.
            values_left[0] = 0
            for i in range(len(gens)):
                iterator = gens[i]
                if iterator is None:
                    yield None
                else:
                    try:
                        yield iterator.next()
                        values_left[0] = 1
                    except StopIteration:
                        gens[i] = None
                        yield None
        while 1:
            args = tuple(values())
            if not values_left[0]:
                raise StopIteration
            if fun is None:
                # Emulate map(None, ...): pass the items through unchanged,
                # as the docstring promises.
                if len(args) == 1:
                    yield args[0]
                else:
                    yield args
            else:
                yield fun(*args)

    def xzip(*collections):
        '''
        xzip(...)
        xzip(seq1 [, seq2 [...]]) -> iterator over (seq1[0], seq2[0] ...), (...)

        Return an iterator of tuples, where each tuple contains the
        i-th element from each of the argument sequences or iterables.
        The returned iterator is truncated in length to the length of
        the shortest argument collection.
        '''
        gens = map(iter, collections)
        while 1:
            yield tuple([g.next() for g in gens])

    def indexed(collection, cnt=0, limit=None):
        'Generates an indexed series: (0,seqn[0]), (1,seqn[1]) ...'
        gen = iter(collection)
        while limit is None or cnt<limit:
            yield (cnt, gen.next())
            cnt += 1

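The behaviour of indexed() can be tried out with a minimal sketch rewritten for present-day Python 3. This is a translation made under stated assumptions, not the PEP's reference code: next(gen) replaces gen.next(), and the explicit try/except is there because Python 3 no longer lets a StopIteration escape from inside a generator body. The squares() helper is a hypothetical input invented for the illustration.

```python
def indexed(collection, cnt=0, limit=None):
    """Yield (index, item) pairs lazily, starting the count at cnt."""
    gen = iter(collection)
    while limit is None or cnt < limit:
        try:
            item = next(gen)
        except StopIteration:
            return              # input exhausted: end the generator cleanly
        yield (cnt, item)
        cnt += 1


def squares():
    # Hypothetical unbounded generator, used to show that no list is built.
    n = 0
    while True:
        yield n * n
        n += 1


pairs = list(indexed("abc"))                 # [(0, 'a'), (1, 'b'), (2, 'c')]
bounded = list(indexed(squares(), 5, 8))     # [(5, 0), (6, 1), (7, 4)]
```

Because the count is checked before the next item is requested, the limit stops the loop without pulling an extra value from the underlying iterator.
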
Note A: PEP 212 Loop Counter Iteration [2] discussed several
proposals for achieving indexing. Some of the proposals only work
for lists, unlike the above function, which works for any generator,
xrange, sequence, or iterable object. Also, those proposals were
presented and evaluated in the world prior to Python 2.2, which did
not include generators. As a result, the generator-less version in
PEP 212 had the disadvantage of consuming memory with a giant list
of tuples. The generator version presented here is fast and light,
works with all iterables, and allows users to abandon the sequence
in mid-stream.

Note B: An alternate, simplified definition of indexed is:

    import sys

    def indexed(collection, cnt=0, limit=sys.maxint):
        'Generates an indexed series: (0,seqn[0]), (1,seqn[1]) ...'
        return xzip(xrange(cnt, limit), collection)

Note C: As it stands, the Python code for xmap is slow. The actual
implementation of the functions should be written in C for speed.
The pure Python code listed above is meant only to specify how the
functions would behave, in particular that they should as closely as
possible emulate their non-lazy counterparts.

Note D: Almost all of the PEP reviewers welcomed these functions but were
divided as to whether they should be built-ins or in a separate module.
The main argument for a separate module was to slow the rate of language
inflation. The main argument for built-ins was that these functions are
destined to be part of a core programming style, applicable to any object
with an iterable interface. Just as zip() solves the problem of looping
over multiple sequences, the indexed() function solves the loop counter
problem. Likewise, the x-functions solve the problem of applying
functional constructs without forcing the evaluation of an entire sequence.

If only one built-in were allowed, then indexed() would be the most
important general purpose tool, solving the broadest class of problems
while improving program brevity, clarity and reliability.

Specification for Generator Comprehensions:

If a list comprehension starts with a 'yield' keyword, then the
comprehension is expressed as a generator. For example:

    g = [yield (len(line),line) for line in file if len(line)>5]
    print g.next()

This would be implemented as if it had been written:

    def __temp():
        for line in file:
            if len(line) > 5:
                yield (len(line), line)
    g = __temp()
    print g.next()

Note A: There is some discussion about whether the enclosing brackets
should be part of the syntax for generator comprehensions. On the
plus side, it neatly parallels list comprehensions and would be
immediately recognizable as a similar form with similar internal
syntax (taking maximum advantage of what people already know).
More importantly, it sets off the generator comprehension from the
rest of the function so as to not suggest that the enclosing
function is a generator (currently the only cue that a function is
really a generator is the presence of the yield keyword). On the
minus side, the brackets may falsely suggest that the whole
expression returns a list. Most of the feedback received to date
indicates that brackets are helpful and not misleading.

Note B: List comprehensions expose their looping variable and
leave that variable in the enclosing scope. The code, [str(i) for
i in range(8)] leaves 'i' set to 7 in the scope where the
comprehension appears. This behavior is by design and reflects an
intent to duplicate the result of coding a for-loop instead of a
list comprehension. Further, the variable 'i' is in a defined and
potentially useful state on the line immediately following the
list comprehension.

In contrast, generator comprehensions do not expose the looping
variable to the enclosing scope. The code, [yield str(i) for i in
range(8)] leaves 'i' untouched in the scope where the
comprehension appears. This is also by design and reflects an
intent to duplicate the result of coding a generator directly
instead of a generator comprehension. Further, the variable 'i'
is not in a defined state on the line immediately following the
generator comprehension. It does not come into existence until
iteration starts (possibly never).

Specification for Generator Parameter Passing:

1. Allow 'yield' to assign a value as in:

    def mygen():
        while 1:
            x = yield None
            print x

2. Let the .next() method take a value to pass to the generator as in:

    g = mygen()
    g.next()       # runs the generator until the first 'yield'
    g.next(1)      # '1' is bound to 'x' in mygen(), then printed
    g.next(2)      # '2' is bound to 'x' in mygen(), then printed

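The two steps can be tried out in present-day Python 3 with a minimal sketch. It rests on one assumption that differs from this proposal: in Python 3 the value-passing call is spelled g.send(value), not g.next(value). The log list is a hypothetical addition that replaces print so the result can be inspected.

```python
def mygen(log):
    # 'log' is a hypothetical list that records values instead of printing them.
    while True:
        x = yield None        # suspends here; resumes with the value sent in
        log.append(x)


log = []
g = mygen(log)
next(g)          # runs the generator until the first 'yield'
g.send(1)        # 1 is bound to 'x' in mygen(), then recorded
g.send(2)        # 2 is bound to 'x' in mygen(), then recorded
# log is now [1, 2]
```

The initial next(g) call is required: a freshly created generator has not yet reached a 'yield' that could receive a value.
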
The control flow of 'yield' and 'next' is unchanged by this proposal.
The only change is that a value can be sent into the generator.
By analogy, consider the quality improvement from GOSUB (which had
no argument passing mechanism) to modern procedure calls (which can
pass in arguments and return values).

Most of the underlying machinery is already in place; only the
communication needs to be added, by modifying the parse syntax to
accept the new 'x = yield expr' syntax and by allowing the .next()
method to accept an optional argument.

Yield is more than just a simple iterator creator. It does
something else truly wonderful -- it suspends execution and saves
state. It is good for a lot more than writing iterators. This
proposal further expands its capability by making it easier to
share data with the generator.

The .next(arg) mechanism is especially useful for:

    1. Sending data to any generator
    2. Writing lazy consumers with complex execution states
    3. Writing co-routines (as demonstrated in Dr. Mertz's article [5])

The proposal is a clear improvement over the existing alternative
of passing data via global variables. It is also much simpler,
more readable and easier to debug than an approach involving the
threading module with its attendant mutexes, semaphores, and data
queues. A class-based approach competes well when there are no
complex execution states or variable states. However, when the
complexity increases, generators with parameter passing are much simpler
because they automatically save state (unlike classes, which must
explicitly save the variable and execution state in instance variables).

Note A: This proposal changes 'yield' from a statement to an
expression with binding and precedence similar to lambda.

Example of a Complex Consumer

The encoder for arithmetic compression sends a series of
fractional values to a complex, lazy consumer. That consumer
makes computations based on previous inputs and only writes out
when certain conditions have been met. After the last fraction is
received, it has a procedure for flushing any unwritten data.

Example of a Consumer Stream

    def filelike(packagename, appendOrOverwrite):
        cum = []
        if appendOrOverwrite == 'w+':
            cum.extend(packages[packagename])
        try:
            while 1:
                dat = yield None
                cum.append(dat)
        except FlushStream:
            packages[packagename] = cum

    ostream = filelike('mydest','w')   # Analogous to open(name, flag)
    ostream.next()                     # Advance to the first yield
    ostream.next(firstdat)             # Analogous to file.write(dat)
    ostream.next(seconddat)
    ostream.throw(FlushStream)         # This feature proposed below

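A runnable approximation of the consumer stream, sketched for present-day Python 3. Several pieces are assumptions added for the illustration: the FlushStream exception class and the packages dictionary are defined here because the example leaves them implicit, send() replaces the proposed next(arg), and the data values are invented.

```python
class FlushStream(Exception):
    """Hypothetical signal telling the consumer to write out its data."""


packages = {}     # hypothetical store that plays the role of the file system


def filelike(packagename, append_or_overwrite):
    cum = []
    if append_or_overwrite == 'w+':
        cum.extend(packages[packagename])
    try:
        while True:
            dat = yield None
            cum.append(dat)
    except FlushStream:
        packages[packagename] = cum     # flush the accumulated data


ostream = filelike('mydest', 'w')
next(ostream)                  # advance to the first yield
ostream.send('first')          # analogous to file.write(dat)
ostream.send('second')
try:
    ostream.throw(FlushStream())
except StopIteration:
    pass    # the generator finished after handling the flush
# packages is now {'mydest': ['first', 'second']}
```

Once the except clause has run, the generator body ends, so in Python 3 the throw() call itself raises StopIteration in the caller; the sketch catches it explicitly.
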
Example of a Complex Consumer

Loop over the picture files in a directory, shrink them
one at a time to thumbnail size using PIL [7], and send them to a
lazy consumer. That consumer is responsible for creating a large
blank image, accepting thumbnails one at a time and placing them
in a 5 by 3 grid format onto the blank image. Whenever the grid is
full, it writes out the large image as an index print. A
FlushStream exception indicates that no more thumbnails are
available and that the partial index print should be written out
if there are one or more thumbnails on it.

Example of a Producer and Consumer Used Together in a Pipe-like Fashion

    'Analogy to Linux style pipes: source | upper | sink'
    sink = sinkgen()
    sink.next()
    for word in source():
        sink.next(word.upper())

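The pipe analogy leaves source() and sinkgen() undefined. A minimal sketch for present-day Python 3 follows, with both functions filled in as hypothetical examples: the word list is invented, the sink records into a list passed in by the caller, and send() replaces the proposed next(arg).

```python
def source():
    # Hypothetical producer: yields a few words one at a time.
    for word in ["alpha", "beta", "gamma"]:
        yield word


def sinkgen(received):
    # Hypothetical consumer: stores whatever is sent to it.
    while True:
        word = yield None
        received.append(word)


received = []
sink = sinkgen(received)
next(sink)                       # advance the consumer to its first yield
for word in source():
    sink.send(word.upper())      # the 'upper' stage of the pipe
# received is now ['ALPHA', 'BETA', 'GAMMA']
```

Each word flows through all three stages before the next one is produced, so no intermediate list of words is ever built.
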
Specification for Generator Exception Passing:

Add a .throw(exception) method to the generator interface:

    def mygen():
        try:
            while 1:
                x = yield None
                print x
        except FlushStream:
            print 'Done'

    g = mygen()
    g.next()                # advance to the first 'yield'
    g.next(5)               # prints 5
    g.throw(FlushStream)    # prints 'Done'

There is no existing work-around for triggering an exception
inside a generator. This is a true deficiency. It is the only
case in Python where an exception cannot be raised into or through
active code. Even if the .next(arg) proposal is not adopted, we
should add the .throw() method.

Generator exception passing also helps address an intrinsic limitation
on generators: the prohibition against their using try/finally to
trigger clean-up code [1]. Without .throw(), the current work-around
forces the resolution or clean-up code to be moved outside the generator.

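Both directions, raising an exception into a generator and letting one pass through it, can be observed with a minimal sketch in present-day Python 3, where generators do have a throw() method. The FlushStream class, the cleanup list and the counter() generator are hypothetical names introduced for the illustration.

```python
class FlushStream(Exception):
    """Hypothetical signal asking the generator to clean up."""


cleanup = []      # hypothetical record of clean-up work done inside the generator


def counter():
    n = 0
    try:
        while True:
            yield n
            n += 1
    except FlushStream:
        cleanup.append(n)     # clean-up code stays inside the generator


g = counter()
next(g)                       # yields 0
next(g)                       # yields 1
try:
    g.throw(FlushStream())    # raised at the suspended 'yield', then handled
except StopIteration:
    handled = True            # the generator ended after cleaning up

h = counter()
next(h)
try:
    h.throw(KeyError("not handled"))   # no matching except clause inside
except KeyError:
    passed_through = True     # the exception travelled through the generator
```

In the first case the exception is raised at the point where the generator is suspended, so its own except clause runs. In the second it propagates back out to the caller of throw().
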
Note A: The name of the throw method was selected for several
reasons. Raise is a keyword and so cannot be used as a method
name. Unlike raise which immediately raises an exception from the
current execution point, throw will first return to the generator
and then raise the exception. The word throw is suggestive of
putting the exception in another location. The word throw is
already associated with exceptions in other languages.

Alternative method names were considered: resolve(), signal(),
genraise(), raiseinto(), and flush(). None of these seem to fit
as well as throw().

Note B: The throw syntax should exactly match raise's syntax:

    throw([expression, [expression, [expression]]])

Accordingly, it should be implemented to handle all of the following:

    raise string                    g.throw(string)
    raise string, data              g.throw(string,data)
    raise class, instance           g.throw(class,instance)
    raise instance                  g.throw(instance)
    raise                           g.throw()

Discussion of Restartability:

Inside for-loops, generators are not substitutable for lists unless they
are accessed only once. A second access only works for restartable
objects like lists, dicts, objects defined with __getitem__, and
xrange objects. Generators are not the only objects which are not
restartable. Other examples of non-restartable sequences include file
objects, xreadlines objects, and the result of iter(callable,sentinel).

Since the proposed built-in functions return generators, they are also
non-restartable. As a result, 'xmap' is not substitutable for 'map' in
the following example:

    alphabet = map(chr, xrange(ord('a'), ord('z')+1))
    twoletterwords = [a+b for a in alphabet for b in alphabet]

Since generator comprehensions also return generators, they are not
restartable. Consequently, they are not substitutable for list
comprehensions in the following example:

    digits = [str(i) for i in xrange(10)]
    alphadig = [a+d for a in 'abcdefg' for d in digits]

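The failure mode can be observed in present-day Python 3, where the built-in map() is itself lazy and non-restartable and so can stand in for the proposed xmap. The sketch below is only an illustration under that assumption; lazy_alphabet is a hypothetical helper name.

```python
def lazy_alphabet():
    # Python 3's map returns a one-shot iterator, standing in for xmap here.
    return map(chr, range(ord('a'), ord('z') + 1))


# Restartable input: a list can be traversed by both loops independently.
eager = list(lazy_alphabet())
full = [a + b for a in eager for b in eager]

# Non-restartable input: the inner loop drains the same iterator that the
# outer loop is reading from.
lazy = lazy_alphabet()
partial = [a + b for a in lazy for b in lazy]

# len(full) is 676, but len(partial) is only 25: 'ab' through 'az'
```

The outer loop takes 'a', the inner loop then consumes 'b' through 'z', and nothing is left when the outer loop asks for its second item.
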
To achieve substitutability, generator comprehensions and x-functions
can be implemented in a way that supports restarts. PEP 234 [4]
explicitly states that restarts are to be supported through repeated
calls to iter(). With that guidance, it is easy to add restartability
to generator comprehensions using a simple wrapper class around the
generator function and modifying the implementation above to return:

    g = Restartable(__temp)   # instead of g = __temp()

Restartable is a simple (12 line) class which calls the generator function
to create a new, re-wound generator whenever iter() requests a restart.
Calls to .next() are simply forwarded to the generator. The Python source
code for the Restartable class can be found in the PEP 279 simulator [8].
An actual implementation in C can achieve re-startability directly and
would not need the slow class wrapper used in the pure Python simulation.

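For readers without access to the simulator, the idea can be sketched in present-day Python 3. This is not the simulator's code: it is an independent guess at the shape of such a wrapper, written from the description above, with __next__ playing the role of .next(). The digit_gen function is a hypothetical generator function used to exercise it.

```python
class Restartable:
    """Wrap a generator function so that every iter() call restarts it."""

    def __init__(self, genfunc):
        self._genfunc = genfunc
        self._current = genfunc()

    def __iter__(self):
        # A restart request: build a new, re-wound generator.
        self._current = self._genfunc()
        return self._current

    def __next__(self):
        # Direct next() calls are forwarded to the current generator.
        return next(self._current)


def digit_gen():
    # Hypothetical generator function standing in for __temp.
    for i in range(10):
        yield str(i)


digits = Restartable(digit_gen)
alphadig = [a + d for a in 'abcdefg' for d in digits]
# len(alphadig) is 70: the inner loop restarts for every letter
```

Returning a fresh generator from __iter__ is a deliberate choice in this sketch: each loop gets its own independent iteration state, so nested loops over the same wrapper do not interfere with one another.
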
The XLazy library [10] shows how restarts can be implemented for xmap,
xfilter, and xzip.

The upside of adding restart capability is that more list comprehensions
can be made lazy and save memory by adding 'yield'. Likewise,
more expressions that use map, filter, and zip can be made lazy just by
adding 'x'.

A possible downside is that x-functions have no control over whether their
inputs are themselves restartable. With non-restartable inputs like
generators or files, an x-function restart will not produce a meaningful
result.

References

[1] PEP 255 Simple Generators
    http://python.sourceforge.net/peps/pep-0255.html

[2] PEP 212 Loop Counter Iteration
    http://python.sourceforge.net/peps/pep-0212.html

[3] PEP 202 List Comprehensions
    http://python.sourceforge.net/peps/pep-0202.html

[4] PEP 234 Iterators
    http://python.sourceforge.net/peps/pep-0234.html

[5] Dr. David Mertz's draft column for Charming Python.
    http://gnosis.cx/publish/programming/charming_python_b5.txt

[6] The code fragment for xmap() was found at:
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/66448

[7] PIL, the Python Imaging Library can be found at:
    http://www.pythonware.com/products/pil/

[8] A pure Python simulation of every feature in this PEP is at:
    http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17348&aid=513752

[9] The full, working source code for each of the examples in this PEP
    along with other examples and tests is at:
    http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17412&aid=513756

[10] Oren Tirosh's XLazy library with re-startable x-functions is at:
    http://www.tothink.com/python/dataflow/

Copyright

This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: