Integrated Raymond Hettinger's latest update
parent 080582a8fd
commit f2115cc852
pep-0279.txt | 196
@@ -14,8 +14,8 @@ Abstract
 
 This PEP introduces four orthogonal (not mutually exclusive) ideas
 for enhancing the generators as introduced in Python version 2.2
-[1]. The goal is increase the convenience, utility, and power of
-generators.
+[1]. The goal is to increase the convenience, utility, and power
+of generators.
 
 
 Rationale
@@ -115,7 +115,7 @@ Specification for new built-ins:
         args = tuple(values())
         if not values_left[0]:
             raise StopIteration
-        yield func(*args)
+        yield fun(*args)
 
 def xzip( *collections ):
     '''
@@ -135,7 +135,7 @@ Specification for new built-ins:
     'Generates an indexed series: (0,seqn[0]), (1,seqn[1]) ...'
     gen = iter(collection)
     while limit is None or cnt<limit:
-        yield (cnt, collection.next())
+        yield (cnt, gen.next())
         cnt += 1
 
 Note A: PEP 212 Loop Counter Iteration [2] discussed several
||||
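The one-line fix in the hunk above matters because a plain sequence has no .next() method; only the iterator returned by iter() does. A minimal sketch in present-day Python follows, assuming a signature of xindexed(collection, cnt=0, limit=None) (the diff shows only the function body, so the signature is an assumption) and using the built-in next(gen) in place of the Python 2.2 spelling gen.next():

```python
def xindexed(collection, cnt=0, limit=None):
    """Generate (index, item) pairs; signature assumed for illustration."""
    gen = iter(collection)              # works for lists, strings, and iterators
    while limit is None or cnt < limit:
        try:
            item = next(gen)            # modern spelling of gen.next()
        except StopIteration:
            return                      # end the generator cleanly when input runs out
        yield (cnt, item)
        cnt += 1


pairs = list(xindexed(["a", "b", "c"]))
limited = list(xindexed("abcdef", limit=2))

# A list itself exposes neither .next() nor .__next__(), which is why the
# original 'collection.next()' could not work for ordinary sequences.
list_has_next = hasattr(["a"], "next") or hasattr(["a"], "__next__")
```

The explicit try/except is only needed in current Python, where a StopIteration escaping a generator body is treated as an error.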
@@ -185,7 +185,7 @@ Specification for Generator Comprehensions:
 function is a generator (currently the only cue that a function is
 really a generator is the presence of the yield keyword). On the
 minus side, the brackets may falsely suggest that the whole
-expression returns a list. All of the feedback received to date
+expression returns a list. Most of the feedback received to date
 indicates that brackets do not make a false suggestion and are
 in fact helpful.
 
@@ -202,30 +202,54 @@ Specification for two-way Generator Parameter Passing:
 2. Let the .next() method take a value to pass to generator as in:
 
     g = mygen()
-    g.next() # runs the generators until the first 'yield'
-    g.next(1) # the '1' gets bound to 'x' in mygen()
-    g.next(2) # the '2' gets bound to 'x' in mygen()
+    g.next() # runs the generator until the first 'yield'
+    g.next(1) # '1' is bound to 'x' in mygen(), then printed
+    g.next(2) # '2' is bound to 'x' in mygen(), then printed
 
-Note A: An early question arose, when would you need this? The
-answer is that existing generators make it easy to write lazy
-producers which may have a complex execution state and/or complex
-variable state. This proposal makes it equally easy to write lazy
-consumers which may also have a complex execution or variable
-state.
+The control flow is unchanged by this proposal. The only change
+is that a value can be sent into the generator. By analogy,
+consider the quality improvement from GOSUB (which had no argument
+passing mechanism) to modern procedure calls (which pass in
+arguments and return values).
 
-For instance, when writing an encoder for arithmetic compression,
-a series of fractional values are sent to a function which has
-periodic output and a complex state which depends on previous
-inputs. Also, that encoder requires a flush() function when no
-additional fractions are to be output. It is helpful to think of
-the following parallel with file output streams:
+Most of the underlying machinery is already in place, only the
+communication needs to be added by modifying the parse syntax to
+accept the new 'x = yield expr' syntax and by allowing the .next()
+method to accept an optional argument.
 
-    ostream = file('mydest.txt','w')
-    ostream.write(firstdat)
-    ostream.write(seconddat)
-    ostream.flush()
+Yield is more than just a simple iterator creator. It does
+something else truly wonderful -- it suspends execution and saves
+state. It is good for a lot more than writing iterators. This
+proposal further expands its capability by making it easier to
+share data with the generator.
 
-With the proposed extensions, it could be written like this:
+The .next(arg) mechanism is especially useful for:
+1. Sending data to any generator
+2. Writing lazy consumers with complex execution states
+3. Writing co-routines (as demonstrated in Dr. Mertz's article [5])
+
+The proposal is a clear improvement over the existing alternative
+of passing data via global variables. It is also much simpler,
+more readable and easier to debug than an approach involving the
+threading module with its attendant mutexes, semaphores, and data
+queues. A class-based approach competes well when there are no
+complex execution states or variable states. When the complexity
+increases, generators with two-way communication are much simpler
+because they automatically save state (unlike classes which must
+explicitly save the variable and execution state in instance
+variables).
+
+
+Example of a Complex Consumer
+
+The encoder for arithmetic compression sends a series of
+fractional values to a complex, lazy consumer. That consumer
+makes computations based on previous inputs and only writes out
+when certain conditions have been met. After the last fraction is
+received, it has a procedure for flushing any unwritten data.
+
+
+Example of a Consumer Stream
 
     def filelike(packagename, appendOrOverwrite):
         cum = []
||||
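The 'x = yield expr' form and the .next(arg) call described in the hunk above can be tried in present-day Python, which later adopted the same idea under a different spelling: values are passed in with generator.send(value) (PEP 342) rather than with an argument to .next(). The sketch below is only an approximation of the proposal under that substitution; mygen and the received list are illustrative names, and recording values in a list stands in for the PEP's printing.

```python
def mygen(received):
    """Record every value sent in; 'received' is a hypothetical helper list."""
    while True:
        x = yield              # plays the role of the proposed 'x = yield None'
        received.append(x)     # stands in for printing x


received = []
g = mygen(received)
next(g)        # runs the generator until the first 'yield'
g.send(1)      # 1 is bound to x in mygen(), then recorded
g.send(2)      # 2 is bound to x in mygen(), then recorded

# The first call cannot carry a value: execution starts at the top of the
# generator, so there is no pending yield to bind the value to.
fresh = mygen([])
try:
    fresh.send(1)
    first_send_rejected = False
except TypeError:
    first_send_rejected = True
```

Note that the control flow is the same as with plain next(): each call runs the generator to its next yield and then suspends it again.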
@@ -237,110 +261,24 @@ Specification for two-way Generator Parameter Passing:
                 cum.append(dat)
         except FlushStream:
             packages[packagename] = cum
-    ostream = filelike('mydest','w')
-    ostream.next()
-    ostream.next(firstdat)
+    ostream = filelike('mydest','w') # Analogous to file.open(name,flag)
+    ostream.next() # Advance to the first yield
+    ostream.next(firstdat) # Analogous to file.write(dat)
     ostream.next(seconddat)
-    ostream.throw( FlushStream ) # this feature discussed below
-
-Note C: Almost all of the machinery necessary to implement this
-extension is already in place. The parse syntax needs to be
-modified to accept the new x = yield None syntax and the .next()
-method needs to allow an argument.
-
-Note D: Some care must be used when writing a values to the
-generator because execution starts at the top of the generator not
-at the first yield.
-
-Consider the usual flow using .next() without an argument.
-
-g = mygen(p1) will bind p1 to a local variable and then return a
-generator to be bound to g and NOT run any code in mygen().
-y = g.next() runs the generator from the first line until it
-encounters a yield when it suspends execution and a returns
-a value to be bound to y
-
-Since the same flow applies when you are submitting values, the
-first call to .next() should have no argument since there is no
-place to put it.
-
-g = mygen(p1) will bind p1 to a local variable and then return a
-generator to be bound to g and NOT run any code in mygen()
-g.next() will START execution in mygen() from the first line. Note,
-that there is nowhere to bind any potential arguments that
-might have been supplied to next(). Execution continues
-until the first yield is encountered and control is returned
-to the caller.
-g.next(val) resumes execution at the yield and binds val to the
-left hand side of the yield assignment and continues running
-until another yield is encountered. This makes sense because
-you submit values expecting them to be processed right away.
+    ostream.throw( FlushStream ) # This feature proposed below
 
 
-Q. Two-way generator parameter passing seems awfully bold. To
-my mind, one of the great things about generators is that they
-meet the (very simple) definition of an iterator. With this,
-they no longer do. I like lazy consumers -- really I do --
-but I'd rather be conservative about putting something like
-this in the language.
+Example of a Complex Consumer
 
-A. If you don't use x = yield expr, then nothing changes and you
-haven't lost anything. So, it isn't really bold. It simply
-adds an option to pass in data as well as take it out. Other
-generator implementations (like the thread based generator.py)
-already have provisions for two-way parameter passing so that
-consumers are put on an equal footing with producers. Two-way
-is the norm, not the exception.
-
-Yield is not just a simple iterator creator. It does
-something else truly wonderful -- it suspends execution and
-saves state. It is good for a lot more than its original
-purpose. Dr. Mertz's article [5] shows how they can be used
-to create general purpose co-routines.
-
-Besides, 98% of the mechanism is already in place. Only the
-communication needs to be added. Remember GOSUB which neither
-took nor returned data. Routines which accepted parameters
-and returned values were a major step forward.
-
-When you first need to pass information into a generator, the
-existing alternative is clumsy. It involves setting a global
-variable, calling .next(), and assigning the local from the
-global.
-
-
-Q. Why not introduce another keyword 'accept' for lazy consumers?
-
-A. To avoid conflicts with 'yield', to avoid creating a new
-keyword, and to take advantage of the explicit clarity of the
-'=' operator.
-
-
-Q. How often does one need to write a lazy consumer or a co-routine?
-
-A. Not often. But, when you DO have to write one, this approach
-is the easiest to implement, read, and debug.
-
-It clearly beats using existing generators and passing data
-through global variables. It is much clearer and easier to
-debug than an equivalent approach using threading, mutexes,
-semaphores, and data queues. A class based approach competes
-well when there are no complex execution states or variable
-states. When the complexity increases, generators with
-two-way communication are much simpler because they
-automatically save state unlike classes which must explicitly
-store variable and execution state in instance variables.
-
-
-Q. Why does yield require an argument? Isn't yield None too wordy?
-
-A. It doesn't matter for the purposes of this PEP. For
-information purposes, here is the reasoning as I understand
-it. Though return allows an implicit None, some now consider
-this to be weak design. There is some spirit of "Explicit is
-better than Implicit". More importantly, in most uses of
-yield, a missing argument is more likely to be a bug than an
-intended yield None.
+Loop over the picture files in a directory, shrink them
+one-at-a-time to thumbnail size using PIL, and send them to a lazy
+consumer. That consumer is responsible for creating a large blank
+image, accepting thumbnails one-at-a-time and placing them in a
+5x3 grid format onto the blank image. Whenever the grid is full,
+it writes-out the large image as an index print. A FlushStream
+exception indicates that no more thumbnails are available and that
+the partial index print should be written out if there are one or
+more thumbnails on it.
 
 
 Specification for Generator Exception Passing:
||||
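The consumer stream in the hunk above can be approximated in present-day Python as a rough, runnable sketch. Several pieces are assumptions, since the diff does not show them: the FlushStream class definition, the in-memory packages dictionary, the "a" (append) flag handling, and the string payloads are all hypothetical. The sketch also uses generator.send() where the PEP proposes .next(arg).

```python
class FlushStream(Exception):
    """Hypothetical signal that no more data will be sent."""


packages = {}   # assumed in-memory store standing in for real output


def filelike(packagename, append_or_overwrite):
    """Lazy consumer: accumulate data until FlushStream is thrown in."""
    cum = []
    if append_or_overwrite == "a":          # assumed meaning of the flag
        cum.extend(packages.get(packagename, []))
    try:
        while True:
            dat = yield                     # wait for the next piece of data
            cum.append(dat)
    except FlushStream:
        packages[packagename] = cum         # write out everything received


ostream = filelike("mydest", "w")   # analogous to opening a file
next(ostream)                       # advance to the first yield
ostream.send("first")               # analogous to file.write(dat)
ostream.send("second")
try:
    ostream.throw(FlushStream)      # tell the consumer to flush
except StopIteration:
    pass                            # the generator finished after flushing
```

In current Python the generator returns after handling FlushStream, so the throw() call itself raises StopIteration; the caller catches it as the signal that the flush completed.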
@@ -362,8 +300,8 @@ Specification for Generator Exception Passing:
 There is no existing work around for triggering an exception
 inside a generator. This is a true deficiency. It is the only
 case in Python where active code cannot be excepted to or through.
-Even if .next(arg) is not adopted, we should add the .throw()
-method.
+Even if the .next(arg) proposal is not adopted, we should add the
+.throw() method.
 
 Note A: The name of the throw method was selected for several
 reasons. Raise is a keyword and so cannot be used as a method
||||
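The .throw() method argued for in the hunk above exists in present-day Python generators, so its behaviour can be illustrated directly. In this small sketch the Cancel exception and the worker and plain generators are hypothetical names chosen for the example. It shows two cases: a generator that catches the thrown exception at its suspended yield and answers with another value, and one that does not catch it, so the exception passes through to the caller.

```python
class Cancel(Exception):
    """Hypothetical exception used to interrupt a generator."""


def worker(log):
    try:
        while True:
            yield "working"
    except Cancel:
        # The exception is raised at the yield where the generator was suspended.
        log.append("caught Cancel")
        yield "stopped"


log = []
w = worker(log)
first = next(w)            # run to the first yield
reply = w.throw(Cancel)    # raise Cancel inside; returns the next yielded value


def plain():
    yield 1


p = plain()
next(p)
try:
    p.throw(ValueError("boom"))    # not handled inside, so it propagates out
    propagated = None
except ValueError as exc:
    propagated = str(exc)
```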
|
@ -400,7 +338,7 @@ References
|
|||
Generator Comprehensions
|
||||
http://groups.google.com/groups?hl=en&th=215e6e5a7bfd526&rnum=2
|
||||
|
||||
|
||||
Discussion Draft of this PEP
|
||||
http://groups.google.com/groups?hl=en&th=df8b5e7709957eb7
|
||||
|
||||
[5] Dr. David Mertz's draft column for Charming Python.
|
||||
@@ -418,5 +356,3 @@ mode: indented-text
 indent-tabs-mode: nil
 fill-column: 70
 End:
-
-