Update for Py2.5.

Raymond Hettinger 2005-01-02 21:41:54 +00:00
parent edd8847113
commit 9198b9cf32
1 changed files with 55 additions and 206 deletions

@ -6,186 +6,77 @@ Author: python@rcn.com (Raymond D. Hettinger)
Status: Draft
Type: Standards Track
Created: 21-Mar-2002
Python-Version: 2.4
Python-Version: 2.5
Post-History:
Abstract
This PEP introduces ideas for enhancing the generators introduced
in Python version 2.2 [1]. The goal is to increase the
convenience, utility, and power of generators by providing a
mechanism for passing data into a generator and for triggering
exceptions inside a generator.
This PEP proposes to enhance generators by providing mechanisms for
raising exceptions and sharing data with running generators.
These mechanisms were first proposed along with two other
generator tools in PEP 279 [7]. They were split off to this
separate PEP to allow the ideas more time to mature and for
alternatives to be considered. Subsequently, the argument
passing idea gave way to Detlef Lannert's idea of using attributes.
Rationale
Python 2.2 introduced the concept of an iterable interface as
proposed in PEP 234 [2]. The iter() factory function was provided
as a common calling convention and deep changes were made to use
iterators as a unifying theme throughout Python. The unification
came in the form of establishing a common iterable interface for
mappings, sequences, and file objects.
Currently, only class-based iterators can provide attributes and
exception handling. However, class-based iterators are harder to
write, less compact, less readable, and slower. A better solution
is to enable these capabilities for generators.
Generators, as proposed in PEP 255 [1], were introduced as a means for
making it easier to create iterators, especially ones with complex
internal states.
Enabling attribute assignments allows data to be passed to and from
running generators. The approach of sharing data using attributes
pervades Python. Other approaches exist but are somewhat hackish
in comparison.
The next step in the evolution of generators is to allow generators to
accept attribute assignments. This allows data to be passed in a
standard Python fashion.
A related evolutionary step is to add a generator method to enable
Another evolutionary step is to add a generator method to allow
exceptions to be passed to a generator. Currently, there is no
clean method for triggering exceptions from outside the generator.
Also, generator exception passing helps mitigate the try/finally
prohibition for generators.
prohibition for generators. The need is especially acute for
generators needing to flush buffers or close resources upon termination.
The two proposals are backwards compatible and require no new
keywords. They are being recommended for Python version 2.5.
These suggestions are designed to take advantage of the existing
implementation and require little additional effort to
incorporate. They are backwards compatible and require no new
keywords. They are being recommended for Python version 2.4.
Specification for Generator Attributes
Essentially, the proposal is to emulate attribute writing for classes.
The only wrinkle is that generators lack a way to refer to instances of
themselves. So, generators need an automatic instance variable, __self__.
themselves. So, the proposal is to provide a function for discovering
the reference. For example:
Here is a minimal example:
def mygen(filename):
    self = mygen.get_instance()
    myfile = open(filename)
    for line in myfile:
        if len(line) < 10:
            continue
        self.pos = myfile.tell()
        yield line.upper()
def mygen():
    while True:
        print __self__.data
        yield None
g = mygen('sample.txt')
line1 = g.next()
print 'Position', g.pos
g = mygen()
g.data = 1
g.next() # prints 1
g.data = 2
g.next() # prints 2
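The attribute idea can be approximated today without any language
change. The following is only a sketch under stated assumptions: the
AttrGen wrapper class is hypothetical (it is not part of this
proposal or of any library), it hands the generator function an
explicit reference to the wrapper in place of the proposed automatic
reference, and it is written in current Python 3 syntax (next(g)
rather than g.next()).

```python
class AttrGen:
    """Hypothetical wrapper: an iterator object that passes itself
    to the generator function so the generator body can read and
    write attributes on it."""

    def __init__(self, genfunc, *args, **kwargs):
        # The generator body does not start running until the first
        # next() call, so attributes may be set before iteration.
        self._gen = genfunc(self, *args, **kwargs)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._gen)


def mygen(self):
    # 'self' is the AttrGen instance, standing in for the proposed
    # automatic reference to the running generator.
    while True:
        yield self.data


g = AttrGen(mygen)
g.data = 1
first = next(g)    # the generator sees data == 1
g.data = 2
second = next(g)   # the generator sees data == 2
```

The wrapper adds a layer of indirection on every next() call, which
is the kind of overhead the proposal aims to avoid by building the
reference into generators themselves.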
Uses for generator attributes include:
1. Providing generator clients with extra information (as shown
above).
2. Externally setting control flags governing generator operation
(possibly telling a generator when to step in or step over
data groups).
3. Writing lazy consumers with complex execution states
(an arithmetic encoder output stream for example).
4. Writing co-routines (as demonstrated in Dr. Mertz's articles [1]).
The control flow of 'yield' and 'next' is unchanged by this
proposal. The only change is that data can be sent into the
generator. By analogy, consider the quality improvement from
GOSUB (which had no argument passing mechanism) to modern
procedure calls (which can pass in arguments and return values).
Most of the underlying machinery is already in place; only the
__self__ variable needs to be added.
proposal. The only change is that data can be passed to and from the
generator. Most of the underlying machinery is already in place;
only the access function needs to be added.
Yield is more than just a simple iterator creator. It does
something else truly wonderful -- it suspends execution and saves
state. It is good for a lot more than writing iterators. This
proposal further taps its capabilities by making it easier to
share data with the generator.
The attribute mechanism is especially useful for:
1. Sending data to any generator
2. Writing lazy consumers with complex execution states
3. Writing co-routines (as demonstrated in Dr. Mertz's articles [3])
The proposal is a clear improvement over the existing alternative
of passing data via global variables. It is also much simpler,
more readable and easier to debug than an approach involving the
threading module with its attendant mutexes, semaphores, and data
queues. A class-based approach competes well when there are no
complex execution states or variable states. However, when the
complexity increases, generators with writable attributes are much
simpler because they automatically save state (unlike classes
which must explicitly save the variable and execution state in
instance variables).
Examples
Example of a Complex Consumer
The encoder for arithmetic compression sends a series of
fractional values to a complex, lazy consumer. That consumer
makes computations based on previous inputs and only writes out
when certain conditions have been met. After the last fraction is
received, it has a procedure for flushing any unwritten data.
Example of a Consumer Stream
def filelike(packagename, appendOrOverwrite):
    data = []
    if appendOrOverwrite == 'w+':
        data.extend(packages[packagename])
    try:
        while True:
            data.append(__self__.dat)
            yield None
    except FlushStream:
        packages[packagename] = data
ostream = filelike('mydest','w')
ostream.dat = firstdat; ostream.next()
ostream.dat = seconddat; ostream.next()
ostream.throw(FlushStream) # Throw is proposed below
Example of a Complex Consumer
Loop over the picture files in a directory, shrink them one at a
time to thumbnail size using PIL [4], and send them to a lazy
consumer. That consumer is responsible for creating a large blank
image, accepting thumbnails one at a time and placing them in a 5
by 3 grid format onto the blank image. Whenever the grid is full,
it writes out the large image as an index print. A FlushStream
exception indicates that no more thumbnails are available and that
the partial index print should be written out if there are one or
more thumbnails on it.
Example of a Producer and Consumer Used Together in a Pipe-like Fashion
'Analogy to Linux style pipes: source | upper | sink'
sink = sinkgen()
for word in source():
    sink.data = word.upper()
    sink.next()
Initialization Mechanism
If the attribute passing idea is accepted, Detlef Lannert further
proposed that generator instances have attributes initialized to
values in the generator's func_dict. This makes it easy to set
default values. For example:
def mygen():
    while True:
        print __self__.data
        yield None
mygen.data = 0
g = mygen() # g initialized with .data set to 0
g.next() # prints 0
g.data = 1
g.next() # prints 1
Rejected Alternative
One idea for passing data into a generator was to pass an argument
through next() and make an assignment using the yield keyword:
datain = yield dataout
. . .
dataout = gen.next(datain)
The intractable problem is that the argument to the first next() call
has to be thrown away, because it doesn't correspond to a yield keyword.
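The first-call problem can be observed directly. As an illustration
only: Python releases later than this draft provide a send() method
on generators, which is not part of this proposal. The sketch below
assumes such an interpreter (Python 3 syntax); running_total is a
hypothetical example function.

```python
def running_total():
    """Hypothetical consumer: receives numbers through yield and
    yields the running sum back."""
    total = 0
    while True:
        datain = yield total
        total += datain


# A freshly created generator has not reached any yield yet, so
# there is nothing to receive a value; send() refuses it.
fresh = running_total()
try:
    fresh.send(5)
    rejected = False
except TypeError:
    rejected = True

# The generator must first be advanced to its first yield.
gen = running_total()
primed = next(gen)     # runs up to the first yield
after = gen.send(5)    # datain = 5
after2 = gen.send(7)   # datain = 7
```

Here the first value is rejected outright rather than silently
discarded, but the underlying mismatch is the same: the first
resumption has no yield expression to deliver a value to.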
Specification for Generator Exception Passing:
@ -197,7 +88,7 @@ Specification for Generator Exception Passing:
log = []
try:
    while True:
        log.append( time.time() - start )
        log.append(time.time() - start)
        yield log[-1]
except WriteLog:
    writelog(log)
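A self-contained version of the logging pattern above can be
sketched as follows. Assumptions: current Python generators have a
throw() method that behaves as described here, so the sketch uses
Python 3 syntax; the WriteLog class definition and the written list
(standing in for a real writelog() destination) are hypothetical
names added for illustration.

```python
import time


class WriteLog(Exception):
    """Signal asking the logger to write out what it has collected."""


written = []  # hypothetical stand-in for a real writelog() target


def logger():
    start = time.time()
    log = []
    try:
        while True:
            log.append(time.time() - start)
            yield log[-1]
    except WriteLog:
        # The exception is raised at the suspended yield, so the
        # clean-up code runs inside the generator with full access
        # to its local state.
        written.extend(log)


g = logger()
next(g)
next(g)
try:
    g.throw(WriteLog)
except StopIteration:
    # The generator handled WriteLog and then finished, which the
    # caller sees as StopIteration.
    finished = True
```

Because the generator returns after handling the exception, the
caller of throw() should expect StopIteration, just as it would from
a next() call on an exhausted generator.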
@ -214,9 +105,7 @@ Specification for Generator Exception Passing:
Generator exception passing also helps address an intrinsic
limitation on generators, the prohibition against their using
try/finally to trigger clean-up code [1]. Without .throw(), the
current work-around forces the resolution or clean-up code to be
moved outside the generator.
try/finally to trigger clean-up code [2].
Note A: The name of the throw method was selected for several
reasons. Raise is a keyword and so cannot be used as a method
@ -227,10 +116,10 @@ Specification for Generator Exception Passing:
already associated with exceptions in other languages.
Alternative method names were considered: resolve(), signal(),
genraise(), raiseinto(), and flush(). None of these seem to fit
as well as throw().
genraise(), raiseinto(), and flush(). None of these fit as well
as throw().
Note B: The throw syntax should exactly match raise's syntax:
Note B: The full throw() syntax should exactly match raise's syntax:
throw([expression, [expression, [expression]]])
@ -243,59 +132,19 @@ Specification for Generator Exception Passing:
raise g.throw()
Comments from GvR: I'm not convinced that the cleanup problem that
this is trying to solve exists in practice. I've never felt
the need to put yield inside a try/except. I think the PEP
doesn't make enough of a case that this is useful.
This one gets a big fat -1 until there's a good motivational
section.
Comments from Ka-Ping Yee: I agree that the exception issue needs to
be resolved and [that] you have suggested a fine solution.
Comments from Neil Schemenauer: The exception passing idea is one I
hadn't thought of before and looks interesting. If we enable
the passing of values back, then we should add this feature
too.
Comments from Magnus Lie Hetland: Even though I cannot speak for the
ease of implementation, I vote +1 for the exception passing
mechanism.
Comments from the Community: The response has been mostly favorable. One
negative comment from GvR is shown above. The other was from
Martin von Loewis who was concerned that it could be difficult
to implement and is withholding his support until a working
patch is available. To probe Martin's comment, I checked with
the implementers of the original generator PEP for an opinion
on the ease of implementation. They felt that implementation
would be straightforward and could be grafted onto the
existing implementation without disturbing its internals.
Author response: When the sole use of generators is to simplify writing
iterators for lazy producers, then the odds of needing
generator exception passing are slim. If, on the other hand,
generators are used to write lazy consumers, create
coroutines, generate output streams, or simply for their
marvelous capability for restarting a previously frozen state,
THEN the need to raise exceptions will come up frequently.
References
[1] PEP 255 Simple Generators
http://www.python.org/peps/pep-0255.html
[2] PEP 234 Iterators
http://www.python.org/peps/pep-0234.html
[3] Dr. David Mertz's draft column for Charming Python.
[1] Dr. David Mertz's draft columns for Charming Python:
http://gnosis.cx/publish/programming/charming_python_b5.txt
http://gnosis.cx/publish/programming/charming_python_b7.txt
[4] PIL, the Python Imaging Library can be found at:
http://www.pythonware.com/products/pil/
[2] PEP 255 Simple Generators:
http://www.python.org/peps/pep-0255.html
[3] Proof-of-concept recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/164044
Copyright