Add motivation (supplied by Shane Hathaway).

Explain what happens when a block contains a yield.

Add comparison to thunks.

Add examples.
Guido van Rossum 2005-04-29 05:12:38 +00:00
parent 81ccba3725
commit 959fdd2fdb
1 changed files with 243 additions and 7 deletions


@ -48,9 +48,53 @@ Proposal Evolution
__next__() methods; there is no user-friendly API to call
__error__().
Motivation and Use Cases
Perhaps __error__() should be named __exit__().
TBD.
Motivation and Summary
(Thanks to Shane Hathaway -- Hi Shane!)
Good programmers move commonly used code into reusable functions.
Sometimes, however, patterns arise in the structure of the
functions rather than the actual sequence of statements. For
example, many functions acquire a lock, execute some code specific
to that function, and unconditionally release the lock. Repeating
the locking code in every function that uses it is error prone and
makes refactoring difficult.
Block statements provide a mechanism for encapsulating patterns of
structure. Code inside the block statement runs under the control
of an object called a block iterator. Simple block iterators
execute code before and after the code inside the block statement.
Block iterators also have the opportunity to execute the
controlled code more than once (or not at all), catch exceptions,
or receive data from the body of the block statement.
A convenient way to write block iterators is to write a generator
(PEP 255). A generator looks a lot like a Python function, but
instead of returning a value immediately, generators pause their
execution at "yield" statements. When a generator is used as a
block iterator, the yield statement tells the Python interpreter
to suspend the block iterator, execute the block statement body,
and resume the block iterator when the body has executed.
The Python interpreter behaves as follows when it encounters a
block statement based on a generator. First, the interpreter
instantiates the generator and begins executing it. The generator
does setup work appropriate to the pattern it encapsulates, such
as acquiring a lock, opening a file, starting a database
transaction, or starting a loop. Then the generator yields
execution to the body of the block statement using a yield
statement. When the block statement body completes, raises an
uncaught exception, or sends data back to the generator using a
continue statement, the generator resumes. At this point, the
generator can either clean up and stop or yield again, causing the
block statement body to execute again. When the generator
finishes, the interpreter leaves the block statement.
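The sequence just described can be approximated in present-day Python. This is only a sketch under stated assumptions: run_block() is a hypothetical helper (not part of the proposal), the block body is passed in as a callable, and data sent back via continue is not modelled. FakeLock is likewise an illustrative stand-in for a real lock.

```python
def run_block(block_iter, body):
    """Hypothetical driver: run `body` under the control of `block_iter`."""
    try:
        value = next(block_iter)      # setup code runs up to the first yield
    except StopIteration:
        return                        # the generator chose not to run the body
    while True:
        try:
            body(value)
        except Exception as exc:
            try:
                # Give the generator a chance to handle the exception.
                # If it re-raises, the exception propagates to our caller.
                value = block_iter.throw(exc)
            except StopIteration:
                return
        else:
            try:
                value = next(block_iter)   # body completed; resume generator
            except StopIteration:
                return                     # generator finished: leave block


class FakeLock:
    """Illustrative stand-in for a lock; it only records calls."""
    def __init__(self, events):
        self.events = events

    def acquire(self):
        self.events.append("acquire")

    def release(self):
        self.events.append("release")


def synchronized(lock):
    lock.acquire()
    try:
        yield
    finally:
        lock.release()


events = []
run_block(synchronized(FakeLock(events)), lambda _: events.append("body"))
```

If the generator yields a second time, run_block() executes the body again, which is how a retry or loop template would behave in this sketch.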
Use Cases
TBD. For now, see the Examples section near the end.
Specification: the Iteration Exception Hierarchy
@ -230,6 +274,18 @@ Specification: the Anonymous Block Statement
block-statement is left. The iterator also gets a chance if the
block-statement is left through raising an exception.
Note that a yield-statement (or a yield-expression, see below) in
a block-statement is not treated differently. It suspends the
function containing the block *without* notifying the block's
iterator. The block's iterator is entirely unaware of this
yield, since the local control flow doesn't actually leave the
block. In other words, it is *not* like a break, continue or
return statement. When the loop that was resumed by the yield
calls next(), the block is resumed right after the yield. The
generator finalization semantics described below guarantee (within
the limitations of all finalization semantics) that the block will
be resumed eventually.
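A rough illustration of this behaviour, assuming that the later with-statement and contextlib.contextmanager may stand in for the proposed block-statement (they are not the syntax specified here). The names tracing(), numbers() and log are invented for the example.

```python
from contextlib import contextmanager

log = []


@contextmanager
def tracing(name):
    # Plays the role of the block's iterator in this sketch.
    log.append("enter " + name)
    try:
        yield
    finally:
        log.append("exit " + name)


def numbers():
    with tracing("block"):
        yield 1     # suspends numbers(); tracing() is not notified
        yield 2


gen = numbers()
first = next(gen)
log_while_suspended = list(log)   # only the "enter" entry so far
rest = list(gen)                  # resumes after the first yield, then exits
```

While numbers() is suspended at its first yield, the log holds only the "enter" entry; the "exit" entry appears once the surrounding loop has exhausted the generator.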
Specification: Generator Exception Handling
Generators will implement the new __next__() method API, as well
@ -370,13 +426,190 @@ Specification: Alternative __next__() and Generator Exception Handling
EXPR2" is changed; break and return translate to themselves in
that case).
Comparison to Thunks
Alternative semantics proposed for the block-statement turn the
block into a thunk (an anonymous function that blends into the
containing scope).
The main advantage of thunks that I can see is that you can save
the thunk for later, like a callback for a button widget (the
thunk then becomes a closure). You can't use a yield-based block
for that (except in Ruby, which uses yield syntax with a
thunk-based implementation). But I have to say that I almost see
this as an advantage: I think I'd be slightly uncomfortable seeing
a block and not knowing whether it will be executed in the normal
control flow or later. Defining an explicit nested function for
that purpose doesn't have this problem for me, because I already
know that the 'def' keyword means its body is executed later.
The other problem with thunks is that once we think of them as the
anonymous functions they are, we're pretty much forced to say that
a return statement in a thunk returns from the thunk rather than
from the containing function. Doing it any other way would cause
major weirdness if the thunk were to survive its containing
function as a closure (perhaps continuations would help, but I'm
not about to go there :-).
But then an IMO important use case for the resource cleanup
template pattern is lost. I routinely write code like this:
    def findSomething(self, key, default=None):
        self.lock.acquire()
        try:
            for item in self.elements:
                if item.matches(key):
                    return item
            return default
        finally:
            self.lock.release()
and I'd be bummed if I couldn't write this as:
    def findSomething(self, key, default=None):
        block synchronized(self.lock):
            for item in self.elements:
                if item.matches(key):
                    return item
            return default
This particular example can be rewritten using a break:
    def findSomething(self, key, default=None):
        block synchronized(self.lock):
            for item in self.elements:
                if item.matches(key):
                    break
            else:
                item = default
        return item
but it looks forced and the transformation isn't always that easy;
you'd be forced to rewrite your code in a single-return style
which feels too restrictive.
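For contrast, here is a sketch of what the thunk-style alternative looks like when spelled with an explicit nested function. The helper synchronized(lock, thunk) and the function find_something() are hypothetical, written in present-day Python purely for illustration.

```python
import threading


def synchronized(lock, thunk):
    """Hypothetical thunk-based template: call `thunk` with `lock` held."""
    lock.acquire()
    try:
        return thunk()
    finally:
        lock.release()


def find_something(elements, key, lock, default=None):
    def body():
        for item in elements:
            if item == key:
                return item    # returns from body(), not find_something()
        return default
    # The enclosing function has to forward the thunk's result explicitly.
    return synchronized(lock, body)


lock = threading.Lock()
found = find_something(["a", "b", "c"], "b", lock)
missing = find_something(["a", "b", "c"], "z", lock, default="none")
```

The return statements inside body() end only the thunk, so the result reaches the caller of find_something() solely because both the template and the enclosing function pass it along.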
Also note the semantic conundrum of a yield in a thunk -- the only
reasonable interpretation is that this turns the thunk into a
generator!
Greg Ewing believes that thunks "would be a lot simpler, doing
just what is required without any jiggery pokery with exceptions
and break/continue/return statements. It would be easy to explain
what it does and why it's useful."
But in order to obtain the required local variable sharing between
the thunk and the containing function, every local variable used
or set in the thunk would have to become a 'cell' (our mechanism
for sharing variables between nested scopes). Cells slow down
access compared to regular local variables: access involves an
extra C function call (PyCell_Get() or PyCell_Set()).
Perhaps not entirely coincidentally, the last example above
(findSomething() rewritten to avoid a return inside the block)
shows that, unlike for regular nested functions, we'll want
variables *assigned to* by the thunk also to be shared with the
containing function, even if they are not assigned to outside the
thunk.
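The cell mechanism can be observed with an ordinary nested function. This sketch assumes present-day Python, where the nonlocal statement (which did not exist when this was written) lets a nested function rebind a variable of the containing function; make_counter() and increment() are invented names.

```python
def make_counter():
    count = 0                 # shared with increment(), so stored in a cell

    def increment():
        nonlocal count        # rebinding requires the shared cell
        count += 1
        return count

    return increment


increment = make_counter()

# The compiler records which names live in cells rather than plain locals.
cell_names = make_counter.__code__.co_cellvars
free_names = increment.__code__.co_freevars

first = increment()
second = increment()
shared_value = increment.__closure__[0].cell_contents
```

A thunk that blends into its containing scope would need this treatment for every local variable it reads or assigns, which is the cost discussed above.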
Greg Ewing again: "generators have turned out to be more powerful,
because you can have more than one of them on the go at once. Is
there a use for that capability here?"
I believe there are definitely uses for this; several people have
already shown how to do asynchronous light-weight threads using
generators (e.g. David Mertz quoted in PEP 288, and Fredrik
Lundh[3]).
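A minimal sketch of the idea, not taken from either of the cited sources: several generators are kept "on the go at once" by a round-robin scheduler. The names worker() and run_round_robin() are hypothetical, and the code is present-day Python.

```python
from collections import deque


def worker(name, steps, out):
    """A light-weight 'thread': does one unit of work, then yields."""
    for i in range(steps):
        out.append((name, i))
        yield


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue            # task finished; do not reschedule it
        queue.append(task)      # task yielded; give the others a turn


out = []
run_round_robin([worker("a", 2, out), worker("b", 3, out)])
```

Each yield hands control back to the scheduler, so the two workers interleave their steps without any operating-system threads.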
And finally, Greg says: "a thunk implementation has the potential
to easily handle multiple block arguments, if a suitable syntax
could ever be devised. It's hard to see how that could be done in
a general way with the generator implementation."
However, the use cases for multiple blocks seem elusive.
Alternatives Considered
TBD.
Examples
TBD.
1. A template for ensuring that a lock, acquired at the start of a
block, is released when the block is left:
    def synchronized(lock):
        lock.acquire()
        try:
            yield
        finally:
            lock.release()
Used as follows:
    block synchronized(myLock):
        # Code here executes with myLock held. The lock is
        # guaranteed to be released when the block is left (even
        # if by an uncaught exception).
2. A template for opening a file that ensures the file is closed
when the block is left:
    def opening(filename, mode="r"):
        f = open(filename, mode)
        try:
            yield f
        finally:
            f.close()
Used as follows:
    block opening("/etc/passwd") as f:
        for line in f:
            print line.rstrip()
3. A template for committing or rolling back a database
transaction:
    def transactional(db):
        try:
            yield
        except:
            db.rollback()
            raise
        else:
            db.commit()
4. A template that tries something up to n times:
    def auto_retry(n=3, exc=Exception):
        for i in range(n):
            try:
                yield
                return
            except exc, err:
                # perhaps log exception here
                continue
        raise # re-raise the exception we caught earlier
Used as follows:
    block auto_retry(3, IOError):
        f = urllib.urlopen("http://python.org/peps/pep-0340.html")
        print f.read()
5. It is possible to nest blocks and combine templates:
    def synchronized_opening(lock, filename, mode="r"):
        block synchronized(lock):
            block opening(filename, mode) as f:
                yield f
Used as follows:
    block synchronized_opening(myLock, "/etc/passwd") as f:
        for line in f:
            print line.rstrip()
6. Coroutine example TBD.
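The transactional template (example 3) is listed without a usage example. The following sketch exercises it in present-day Python, assuming that the later with-statement and contextlib.contextmanager may stand in for the proposed block-statement. FakeDB is a hypothetical stand-in that only records which method was called.

```python
from contextlib import contextmanager


class FakeDB:
    """Illustrative database stand-in; records commit/rollback calls."""
    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


@contextmanager
def transactional(db):
    try:
        yield
    except:
        db.rollback()
        raise
    else:
        db.commit()


db = FakeDB()

with transactional(db):
    pass                          # body succeeds: the transaction commits

try:
    with transactional(db):
        raise ValueError("boom")  # body fails: the transaction rolls back
except ValueError:
    pass                          # the original exception still propagates
```

Note that this stand-in cannot express templates that run the body more than once, such as auto_retry above; that is a capability of the block-statement as proposed, not of the with-statement used here.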
Acknowledgements
@ -384,10 +617,10 @@ Acknowledgements
Brett Cannon, Brian Sabbey, Doug Landauer, Duncan Booth, Fredrik
Lundh, Greg Ewing, Holger Krekel, Jason Diamond, Jim Jewett,
Josiah Carlson, Ka-Ping Yee, Michael Chermside, Michael Hudson,
Nick Coghlan, Paul Moore, Phillip Eby, Raymond Hettinger, Samuele
Pedroni, Shannon Behrens, Steven Bethard, Terry Reedy, Tim
Delaney, Aahz, and others. Thanks all for a valuable discussion
and ideas.
Neil Schemenauer, Nick Coghlan, Paul Moore, Phillip Eby, Raymond
Hettinger, Samuele Pedroni, Shannon Behrens, Steven Bethard, Terry
Reedy, Tim Delaney, Aahz, and others. Thanks all for the valuable
discussion and ideas!
References
@ -395,6 +628,9 @@ References
[2] http://msdn.microsoft.com/vcsharp/programming/language/ask/withstatement/
[3] http://effbot.org/zone/asyncore-generators.htm
Copyright
This document has been placed in the public domain.