Add motivation (supplied by Shane Hathaway).
Explain what happens when a block contains a yield. Add comparison to thunks. Add examples.
This commit is contained in:
parent
81ccba3725
commit
959fdd2fdb
250
pep-0340.txt
|
@ -48,9 +48,53 @@ Proposal Evolution
|
|||
__next__() methods; there is no user-friendly API to call
|
||||
__error__().
|
||||
|
||||
Motivation and Use Cases
|
||||
Perhaps __error__() should be named __exit__().
|
||||
|
||||
TBD.
|
||||
Motivation and Summary
|
||||
|
||||
(Thanks to Shane Hathaway -- Hi Shane!)
|
||||
|
||||
Good programmers move commonly used code into reusable functions.
|
||||
Sometimes, however, patterns arise in the structure of the
|
||||
functions rather than the actual sequence of statements. For
|
||||
example, many functions acquire a lock, execute some code specific
|
||||
to that function, and unconditionally release the lock. Repeating
|
||||
the locking code in every function that uses it is error prone and
|
||||
makes refactoring difficult.
|
||||
|
||||
Block statements provide a mechanism for encapsulating patterns of
|
||||
structure. Code inside the block statement runs under the control
|
||||
of an object called a block iterator. Simple block iterators
|
||||
execute code before and after the code inside the block statement.
|
||||
Block iterators also have the opportunity to execute the
|
||||
controlled code more than once (or not at all), catch exceptions,
|
||||
or receive data from the body of the block statement.
|
||||
|
||||
A convenient way to write block iterators is to write a generator
|
||||
(PEP 255). A generator looks a lot like a Python function, but
|
||||
instead of returning a value immediately, generators pause their
|
||||
execution at "yield" statements. When a generator is used as a
|
||||
block iterator, the yield statement tells the Python interpreter
|
||||
to suspend the block iterator, execute the block statement body,
|
||||
and resume the block iterator when the body has executed.
|
||||
|
||||
The Python interpreter behaves as follows when it encounters a
|
||||
block statement based on a generator. First, the interpreter
|
||||
instantiates the generator and begins executing it. The generator
|
||||
does setup work appropriate to the pattern it encapsulates, such
|
||||
as acquiring a lock, opening a file, starting a database
|
||||
transaction, or starting a loop. Then the generator yields
|
||||
execution to the body of the block statement using a yield
|
||||
statement. When the block statement body completes, raises an
|
||||
uncaught exception, or sends data back to the generator using a
|
||||
continue statement, the generator resumes. At this point, the
|
||||
generator can either clean up and stop or yield again, causing the
|
||||
block statement body to execute again. When the generator
|
||||
finishes, the interpreter leaves the block statement.
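
The sequence described above can be approximated in ordinary Python by
driving a generator by hand. This is only a sketch under stated
assumptions: run_block(), bracketed() and log are hypothetical names, the
code uses Python 3 syntax, and it models only the simplest path (no
exceptions, and no data sent back with a continue statement).

```python
log = []

def bracketed(name):
    # Setup work runs before the first yield.
    log.append("enter " + name)
    yield name                       # hand control to the block body
    # Cleanup runs when the generator is resumed after the body.
    log.append("leave " + name)

def run_block(iterator, body):
    """Hypothetical stand-in for the interpreter's handling of a block."""
    while True:
        try:
            value = next(iterator)   # run the generator up to its next yield
        except StopIteration:
            return                   # generator finished: leave the block
        body(value)                  # execute the block body once per yield

run_block(bracketed("demo"), lambda value: log.append("body " + value))
print(log)
```

A generator that yields more than once would make run_block() execute the
body once per yield, which corresponds to the looping behaviour mentioned
above.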
|
||||
|
||||
Use Cases
|
||||
|
||||
TBD. For now, see the Examples section near the end.
|
||||
|
||||
Specification: the Iteration Exception Hierarchy
|
||||
|
||||
|
@ -230,6 +274,18 @@ Specification: the Anonymous Block Statement
|
|||
block-statement is left. The iterator also gets a chance if the
|
||||
block-statement is left through raising an exception.
|
||||
|
||||
Note that a yield-statement (or a yield-expression, see below) in
|
||||
a block-statement is not treated differently. It suspends the
|
||||
function containing the block *without* notifying the block's
|
||||
iterator. The block's iterator is entirely unaware of this
|
||||
yield, since the local control flow doesn't actually leave the
|
||||
block. In other words, it is *not* like a break, continue or
|
||||
return statement. When the loop that was resumed by the yield
|
||||
calls next(), the block is resumed right after the yield. The
|
||||
generator finalization semantics described below guarantee (within
|
||||
the limitations of all finalization semantics) that the block will
|
||||
be resumed eventually.
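
The block statement is only proposed syntax, so the following sketch uses
the with statement and contextlib.contextmanager from later Python
versions as a rough stand-in. That substitution, and the names tracing(),
producer() and events, are assumptions made for illustration. The sketch
shows that a yield inside the body suspends the containing function while
the controlling generator's cleanup code has not yet run.

```python
from contextlib import contextmanager

events = []

@contextmanager
def tracing(name):
    events.append("setup " + name)
    try:
        yield
    finally:
        events.append("cleanup " + name)

def producer():
    with tracing("resource"):   # stand-in for: block tracing("resource"):
        yield 1                 # suspends producer(); tracing() is not notified
        yield 2

gen = producer()
first = next(gen)
# producer() is suspended inside the body, so cleanup has not run yet.
snapshot = list(events)
rest = list(gen)                # resuming to completion leaves the body normally
print(snapshot, events)
```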
|
||||
|
||||
Specification: Generator Exception Handling
|
||||
|
||||
Generators will implement the new __next__() method API, as well
|
||||
|
@ -370,13 +426,190 @@ Specification: Alternative __next__() and Generator Exception Handling
|
|||
EXPR2" is changed; break and return translate to themselves in
|
||||
that case).
|
||||
|
||||
Comparison to Thunks
|
||||
|
||||
Alternative semantics proposed for the block-statement turn the
|
||||
block into a thunk (an anonymous function that blends into the
|
||||
containing scope).
|
||||
|
||||
The main advantage of thunks that I can see is that you can save
|
||||
the thunk for later, like a callback for a button widget (the
|
||||
thunk then becomes a closure). You can't use a yield-based block
|
||||
for that (except in Ruby, which uses yield syntax with a
|
||||
thunk-based implementation). But I have to say that I almost see
|
||||
this as an advantage: I think I'd be slightly uncomfortable seeing
|
||||
a block and not knowing whether it will be executed in the normal
|
||||
control flow or later. Defining an explicit nested function for
|
||||
that purpose doesn't have this problem for me, because I already
|
||||
know that the 'def' keyword means its body is executed later.
|
||||
|
||||
The other problem with thunks is that once we think of them as the
|
||||
anonymous functions they are, we're pretty much forced to say that
|
||||
a return statement in a thunk returns from the thunk rather than
|
||||
from the containing function. Doing it any other way would cause
|
||||
major weirdness if the thunk were to survive its containing
|
||||
function as a closure (perhaps continuations would help, but I'm
|
||||
not about to go there :-).
|
||||
|
||||
But then an IMO important use case for the resource cleanup
|
||||
template pattern is lost. I routinely write code like this:
|
||||
|
||||
    def findSomething(self, key, default=None):
        self.lock.acquire()
        try:
            for item in self.elements:
                if item.matches(key):
                    return item
            return default
        finally:
            self.lock.release()
|
||||
|
||||
and I'd be bummed if I couldn't write this as:
|
||||
|
||||
    def findSomething(self, key, default=None):
        block synchronized(self.lock):
            for item in self.elements:
                if item.matches(key):
                    return item
            return default
|
||||
|
||||
This particular example can be rewritten using a break:
|
||||
|
||||
    def findSomething(self, key, default=None):
        block synchronized(self.lock):
            for item in self.elements:
                if item.matches(key):
                    break
            else:
                item = default
        return item
|
||||
|
||||
but it looks forced and the transformation isn't always that easy;
|
||||
you'd be forced to rewrite your code in a single-return style
|
||||
which feels too restrictive.
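
To make the return problem with thunks concrete, here is a sketch of what
a thunk-style findSomething() looks like when written with today's nested
functions. The helper synchronized(lock, thunk), the Registry class and
the use of == in place of item.matches(key) are assumptions made for this
illustration (Python 3 syntax); they are not part of the proposal.

```python
import threading

def synchronized(lock, thunk):
    """Hypothetical thunk-style helper: run thunk() with the lock held."""
    lock.acquire()
    try:
        return thunk()      # the helper must pass the thunk's result along
    finally:
        lock.release()

class Registry:
    def __init__(self, elements):
        self.lock = threading.Lock()
        self.elements = elements

    def find_something(self, key, default=None):
        def thunk():
            for item in self.elements:
                if item == key:
                    return item    # returns from thunk(), not find_something()
            return default
        # Without this explicit return, the thunk's result would be discarded.
        return synchronized(self.lock, thunk)

registry = Registry(["a", "b"])
print(registry.find_something("b"))
```

The result has to be relayed through two extra return statements, which is
the kind of bookkeeping a return inside a block statement avoids.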
|
||||
|
||||
Also note the semantic conundrum of a yield in a thunk -- the only
|
||||
reasonable interpretation is that this turns the thunk into a
|
||||
generator!
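
The same effect can already be observed with nested functions. In this
small sketch (Python 3 syntax; outer(), thunk() and ran are hypothetical
names) calling a nested function that contains a yield does not run its
body at all; it merely creates a generator object.

```python
import inspect

def outer():
    ran = []
    def thunk():
        ran.append("started")
        yield "value"
    result = thunk()     # does not run the body; it builds a generator object
    return inspect.isgenerator(result), ran, result

is_gen, ran, result = outer()
print(is_gen, ran)
```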
|
||||
|
||||
Greg Ewing believes that thunks "would be a lot simpler, doing
|
||||
just what is required without any jiggery pokery with exceptions
|
||||
and break/continue/return statements. It would be easy to explain
|
||||
what it does and why it's useful."
|
||||
|
||||
But in order to obtain the required local variable sharing between
|
||||
the thunk and the containing function, every local variable used
|
||||
or set in the thunk would have to become a 'cell' (our mechanism
|
||||
for sharing variables between nested scopes). Cells slow down
|
||||
access compared to regular local variables: access involves an
|
||||
extra C function call (PyCell_Get() or PyCell_Set()).
|
||||
|
||||
Perhaps not entirely coincidentally, the last example above
|
||||
(findSomething() rewritten to avoid a return inside the block)
|
||||
shows that, unlike for regular nested functions, we'll want
|
||||
variables *assigned to* by the thunk also to be shared with the
|
||||
containing function, even if they are not assigned to outside the
|
||||
thunk.
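
For reference, the cell mechanism can be observed from Python itself. The
sketch below uses Python 3 syntax (the nonlocal statement did not exist
when this text was written) and CPython introspection attributes; outer(),
inner() and count are hypothetical names. It shows that a variable shared
with a nested function is stored in a cell, and that an assignment in the
nested function is shared only when explicitly requested.

```python
def outer():
    count = 0              # shared with inner(), so it is stored in a cell
    def inner():
        nonlocal count     # without this, the assignment would create a new local
        count += 1
    inner()
    inner()
    return count, inner

total, inner = outer()
cell_names = outer.__code__.co_cellvars          # names outer() keeps in cells
free_names = inner.__code__.co_freevars          # names inner() reads from cells
cell_value = inner.__closure__[0].cell_contents  # current value held by the cell
print(total, cell_names, free_names, cell_value)
```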
|
||||
|
||||
Greg Ewing again: "generators have turned out to be more powerful,
|
||||
because you can have more than one of them on the go at once. Is
|
||||
there a use for that capability here?"
|
||||
|
||||
I believe there are definitely uses for this; several people have
|
||||
already shown how to do asynchronous light-weight threads using
|
||||
generators (e.g. David Mertz quoted in PEP 288, and Fredrik
|
||||
Lundh[3]).
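
As a rough illustration of having several generators "on the go at once",
the following is a minimal round-robin scheduler. It is not taken from
either of the cited sources; worker() and run_round_robin() are
hypothetical names and the code uses Python 3 syntax.

```python
from collections import deque

def worker(name, steps, output):
    for i in range(steps):
        output.append((name, i))
        yield                      # give the other tasks a turn

def run_round_robin(tasks):
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)             # run the task up to its next yield
        except StopIteration:
            continue               # task finished; drop it
        queue.append(task)         # otherwise reschedule it at the back

output = []
run_round_robin([worker("a", 2, output), worker("b", 3, output)])
print(output)
```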
|
||||
|
||||
And finally, Greg says: "a thunk implementation has the potential
|
||||
to easily handle multiple block arguments, if a suitable syntax
|
||||
could ever be devised. It's hard to see how that could be done in
|
||||
a general way with the generator implementation."
|
||||
|
||||
However, the use cases for multiple blocks seem elusive.
|
||||
|
||||
Alternatives Considered
|
||||
|
||||
TBD.
|
||||
|
||||
Examples
|
||||
|
||||
TBD.
|
||||
1. A template for ensuring that a lock, acquired at the start of a
|
||||
block, is released when the block is left:
|
||||
|
||||
    def synchronized(lock):
        lock.acquire()
        try:
            yield
        finally:
            lock.release()
|
||||
|
||||
Used as follows:
|
||||
|
||||
    block synchronized(myLock):
        # Code here executes with myLock held.  The lock is
        # guaranteed to be released when the block is left (even
        # if by an uncaught exception).
|
||||
|
||||
2. A template for opening a file that ensures the file is closed
|
||||
when the block is left:
|
||||
|
||||
    def opening(filename, mode="r"):
        f = open(filename, mode)
        try:
            yield f
        finally:
            f.close()
|
||||
|
||||
Used as follows:
|
||||
|
||||
    block opening("/etc/passwd") as f:
        for line in f:
            print line.rstrip()
|
||||
|
||||
3. A template for committing or rolling back a database
|
||||
transaction:
|
||||
|
||||
    def transactional(db):
        try:
            yield
        except:
            db.rollback()
            raise
        else:
            db.commit()
|
||||
|
||||
4. A template that tries something up to n times:
|
||||
|
||||
    def auto_retry(n=3, exc=Exception):
        for i in range(n):
            try:
                yield
                return
            except exc, err:
                # perhaps log exception here
                continue
        raise # re-raise the exception we caught earlier
|
||||
|
||||
Used as follows:
|
||||
|
||||
    block auto_retry(3, IOError):
        f = urllib.urlopen("http://python.org/peps/pep-0340.html")
        print f.read()
|
||||
|
||||
5. It is possible to nest blocks and combine templates:
|
||||
|
||||
    def synchronized_opening(lock, filename, mode="r"):
        block synchronized(lock):
            block opening(filename, mode) as f:
                yield f
|
||||
|
||||
Used as follows:
|
||||
|
||||
    block synchronized_opening(myLock, "/etc/passwd") as f:
        for line in f:
            print line.rstrip()
|
||||
|
||||
6. Coroutine example TBD.
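
Example 4 above can be approximated in runnable form by driving the
generator by hand. This is a sketch under several assumptions: it is a
Python 3 port, run_block() and flaky() are hypothetical names, exceptions
from the body are passed in with the generator's throw() method, and the
last error is stored explicitly rather than relying on a bare raise after
the loop.

```python
def auto_retry(n=3, exc=Exception):
    last_err = None
    for _ in range(n):
        try:
            yield
            return                  # body succeeded
        except exc as err:
            last_err = err          # remember the failure and try again
    raise last_err                  # all n attempts failed

def run_block(iterator, body):
    """Hypothetical driver: rerun body each time the iterator yields."""
    try:
        next(iterator)              # run the iterator up to its first yield
    except StopIteration:
        return
    while True:
        try:
            body()
            failure = None
        except Exception as err:
            failure = err
        try:
            if failure is None:
                next(iterator)            # body completed normally
            else:
                iterator.throw(failure)   # let the iterator see the exception
        except StopIteration:
            return                        # iterator finished: leave the block

attempts = []

def flaky():
    attempts.append(len(attempts) + 1)
    if len(attempts) < 3:
        raise OSError("temporary failure")

run_block(auto_retry(3, OSError), flaky)
print(attempts)
```

An exception that does not match exc, or a failure on the final attempt,
propagates out of run_block() to the caller.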
|
||||
|
||||
Acknowledgements
|
||||
|
||||
|
@ -384,10 +617,10 @@ Acknowledgements
|
|||
Brett Cannon, Brian Sabbey, Doug Landauer, Duncan Booth, Fredrik
|
||||
Lundh, Greg Ewing, Holger Krekel, Jason Diamond, Jim Jewett,
|
||||
Josiah Carlson, Ka-Ping Yee, Michael Chermside, Michael Hudson,
|
||||
Nick Coghlan, Paul Moore, Phillip Eby, Raymond Hettinger, Samuele
|
||||
Pedroni, Shannon Behrens, Steven Bethard, Terry Reedy, Tim
|
||||
Delaney, Aahz, and others. Thanks all for a valuable discussion
|
||||
and ideas.
|
||||
Neil Schemenauer, Nick Coghlan, Paul Moore, Phillip Eby, Raymond
|
||||
Hettinger, Samuele Pedroni, Shannon Behrens, Steven Bethard, Terry
|
||||
Reedy, Tim Delaney, Aahz, and others. Thanks all for the valuable
|
||||
discussion and ideas!
|
||||
|
||||
References
|
||||
|
||||
|
@ -395,6 +628,9 @@ References
|
|||
|
||||
[2] http://msdn.microsoft.com/vcsharp/programming/language/ask/withstatement/
|
||||
|
||||
[3] http://effbot.org/zone/asyncore-generators.htm
|
||||
|
||||
|
||||
Copyright
|
||||
|
||||
This document has been placed in the public domain.
|
||||
|
|