From 959fdd2fdb2e1157a3cc0a0c7b26d605cb378d95 Mon Sep 17 00:00:00 2001 From: Guido van Rossum Date: Fri, 29 Apr 2005 05:12:38 +0000 Subject: [PATCH] Add motivation (supplied by Shane Hathaway). Explain what happens when a block contains a yield. Add comparison to thunks. Add examples. --- pep-0340.txt | 250 +++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 243 insertions(+), 7 deletions(-) diff --git a/pep-0340.txt b/pep-0340.txt index 32f23d4bf..55d5cd15d 100644 --- a/pep-0340.txt +++ b/pep-0340.txt @@ -48,9 +48,53 @@ Proposal Evolution __next__() methods; there is no user-friendly API to call __error__(). -Motivation and Use Cases + Perhaps __error__() should be named __exit__(). - TBD. +Motivation and Summary + + (Thanks to Shane Hathaway -- Hi Shane!) + + Good programmers move commonly used code into reusable functions. + Sometimes, however, patterns arise in the structure of the + functions rather than the actual sequence of statements. For + example, many functions acquire a lock, execute some code specific + to that function, and unconditionally release the lock. Repeating + the locking code in every function that uses it is error prone and + makes refactoring difficult. + + Block statements provide a mechanism for encapsulating patterns of + structure. Code inside the block statement runs under the control + of an object called a block iterator. Simple block iterators + execute code before and after the code inside the block statement. + Block iterators also have the opportunity to execute the + controlled code more than once (or not at all), catch exceptions, + or receive data from the body of the block statement. + + A convenient way to write block iterators is to write a generator + (PEP 255). A generator looks a lot like a Python function, but + instead of returning a value immediately, generators pause their + execution at "yield" statements. 
When a generator is used as a + block iterator, the yield statement tells the Python interpreter + to suspend the block iterator, execute the block statement body, + and resume the block iterator when the body has executed. + + The Python interpreter behaves as follows when it encounters a + block statement based on a generator. First, the interpreter + instantiates the generator and begins executing it. The generator + does setup work appropriate to the pattern it encapsulates, such + as acquiring a lock, opening a file, starting a database + transaction, or starting a loop. Then the generator yields + execution to the body of the block statement using a yield + statement. When the block statement body completes, raises an + uncaught exception, or sends data back to the generator using a + continue statement, the generator resumes. At this point, the + generator can either clean up and stop or yield again, causing the + block statement body to execute again. When the generator + finishes, the interpreter leaves the block statement. + +Use Cases + + TBD. For now, see the Examples section near the end. Specification: the Iteration Exception Hierarchy @@ -230,6 +274,18 @@ Specification: the Anonymous Block Statement block-statement is left. The iterator also gets a chance if the block-statement is left through raising an exception. + Note that a yield-statement (or a yield-expression, see below) in + a block-statement is not treated differently. It suspends the + function containing the block *without* notifying the block's + iterator. The block's iterator is entirely unaware of this + yield, since the local control flow doesn't actually leave the + block. In other words, it is *not* like a break, continue or + return statement. When the loop that was resumed by the yield + calls next(), the block is resumed right after the yield. 
The + generator finalization semantics described below guarantee (within + the limitations of all finalization semantics) that the block will + be resumed eventually. + Specification: Generator Exception Handling Generators will implement the new __next__() method API, as well @@ -370,13 +426,190 @@ Specification: Alternative __next__() and Generator Exception Handling EXPR2" is changed; break and return translate to themselves in that case). +Comparison to Thunks + + Alternative semantics proposed for the block-statement turn the + block into a thunk (an anonymous function that blends into the + containing scope). + + The main advantage of thunks that I can see is that you can save + the thunk for later, like a callback for a button widget (the + thunk then becomes a closure). You can't use a yield-based block + for that (except in Ruby, which uses yield syntax with a + thunk-based implementation). But I have to say that I almost see + this as an advantage: I think I'd be slightly uncomfortable seeing + a block and not knowing whether it will be executed in the normal + control flow or later. Defining an explicit nested function for + that purpose doesn't have this problem for me, because I already + know that the 'def' keyword means its body is executed later. + + The other problem with thunks is that once we think of them as the + anonymous functions they are, we're pretty much forced to say that + a return statement in a thunk returns from the thunk rather than + from the containing function. Doing it any other way would cause + major weirdness if the thunk were to survive its containing + function as a closure (perhaps continuations would help, but I'm + not about to go there :-). + + But then an IMO important use case for the resource cleanup + template pattern is lost. 
I routinely write code like this: + + def findSomething(self, key, default=None): + self.lock.acquire() + try: + for item in self.elements: + if item.matches(key): + return item + return default + finally: + self.lock.release() + + and I'd be bummed if I couldn't write this as: + + def findSomething(self, key, default=None): + block synchronized(self.lock): + for item in self.elements: + if item.matches(key): + return item + return default + + This particular example can be rewritten using a break: + + def findSomething(self, key, default=None): + block synchronized(self.lock): + for item in self.elements: + if item.matches(key): + break + else: + item = default + return item + + but it looks forced and the transformation isn't always that easy; + you'd be forced to rewrite your code in a single-return style + which feels too restrictive. + + Also note the semantic conundrum of a yield in a thunk -- the only + reasonable interpretation is that this turns the thunk into a + generator! + + Greg Ewing believes that thunks "would be a lot simpler, doing + just what is required without any jiggery pokery with exceptions + and break/continue/return statements. It would be easy to explain + what it does and why it's useful." + + But in order to obtain the required local variable sharing between + the thunk and the containing function, every local variable used + or set in the thunk would have to become a 'cell' (our mechanism + for sharing variables between nested scopes). Cells slow down + access compared to regular local variables: access involves an + extra C function call (PyCell_Get() or PyCell_Set()). + + Perhaps not entirely coincidentally, the last example above + (findSomething() rewritten to avoid a return inside the block) + shows that, unlike for regular nested functions, we'll want + variables *assigned to* by the thunk also to be shared with the + containing function, even if they are not assigned to outside the + thunk. 
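
The variable sharing described above can be observed with an ordinary nested function. The following is only a sketch, assuming Python 3: the `nonlocal` keyword did not exist when this text was written, and the names `find_something` and `body` are hypothetical stand-ins for the containing function and the thunk. It mirrors the break-based rewrite of findSomething(): the nested function assigns to `item`, so `item` has to live in a cell that both scopes share.

```python
def find_something(elements, key, default=None):
    # 'item' is rebound inside the nested function below, so the compiler
    # stores it in a cell instead of a plain local variable slot.
    item = default

    def body():  # hypothetical stand-in for the thunk
        nonlocal item  # share the binding with the containing function
        for candidate in elements:
            if candidate == key:
                item = candidate
                break

    body()
    # Also return the names of the variables that 'body' reaches through cells.
    return item, body.__code__.co_freevars


found, shared_names = find_something([1, 2, 3], 2)
missing, _ = find_something([1, 2, 3], 9, default="nothing")
```

Without the `nonlocal` declaration the assignment would create a new local inside `body`, which is the behaviour of regular nested functions that the paragraph above contrasts with thunks.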
+ + Greg Ewing again: "generators have turned out to be more powerful, + because you can have more than one of them on the go at once. Is + there a use for that capability here?" + + I believe there are definitely uses for this; several people have + already shown how to do asynchronous light-weight threads using + generators (e.g. David Mertz quoted in PEP 288, and Fredrik + Lundh[3]). + + And finally, Greg says: "a thunk implementation has the potential + to easily handle multiple block arguments, if a suitable syntax + could ever be devised. It's hard to see how that could be done in + a general way with the generator implementation." + + However, the use cases for multiple blocks seem elusive. + Alternatives Considered TBD. Examples - TBD. + 1. A template for ensuring that a lock, acquired at the start of a + block, is released when the block is left: + + def synchronized(lock): + lock.acquire() + try: + yield + finally: + lock.release() + + Used as follows: + + block synchronized(myLock): + # Code here executes with myLock held. The lock is + # guaranteed to be released when the block is left (even + # if by an uncaught exception). + + 2. A template for opening a file that ensures the file is closed + when the block is left: + + def opening(filename, mode="r"): + f = open(filename, mode) + try: + yield f + finally: + f.close() + + Used as follows: + + block opening("/etc/passwd") as f: + for line in f: + print line.rstrip() + + 3. A template for committing or rolling back a database + transaction: + + def transactional(db): + try: + yield + except: + db.rollback() + raise + else: + db.commit() + + 4. 
A template that tries something up to n times: + + def auto_retry(n=3, exc=Exception): + for i in range(n): + try: + yield + return + except exc, err: + # perhaps log exception here + continue + raise # re-raise the exception we caught earlier + + Used as follows: + + block auto_retry(3, IOError): + f = urllib.urlopen("http://python.org/peps/pep-0340.html") + print f.read() + + 5. It is possible to nest blocks and combine templates: + + def synchronized_opening(lock, filename, mode="r"): + block synchronized(lock): + block opening(filename, mode) as f: + yield f + + Used as follows: + + block synchronized_opening(myLock, "/etc/passwd") as f: + for line in f: + print line.rstrip() + + 6. Coroutine example TBD. Acknowledgements @@ -384,10 +617,10 @@ Acknowledgements Brett Cannon, Brian Sabbey, Doug Landauer, Duncan Booth, Fredrik Lundh, Greg Ewing, Holger Krekel, Jason Diamond, Jim Jewett, Josiah Carlson, Ka-Ping Yee, Michael Chermside, Michael Hudson, - Nick Coghlan, Paul Moore, Phillip Eby, Raymond Hettinger, Samuele - Pedroni, Shannon Behrens, Steven Bethard, Terry Reedy, Tim - Delaney, Aahz, and others. Thanks all for a valuable discussion - and ideas. + Neil Schemenauer, Nick Coghlan, Paul Moore, Phillip Eby, Raymond + Hettinger, Samuele Pedroni, Shannon Behrens, Steven Bethard, Terry + Reedy, Tim Delaney, Aahz, and others. Thanks all for the valuable + discussion and ideas! References @@ -395,6 +628,9 @@ References [2] http://msdn.microsoft.com/vcsharp/programming/language/ask/withstatement/ + [3] http://effbot.org/zone/asyncore-generators.htm + + Copyright This document has been placed in the public domain.
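
Note on trying out the examples: the block statement in this patch is proposed syntax and does not run on any released interpreter. As a rough approximation only, the lock template from Example 1 and the findSomething() use case can be exercised in current Python with the `with` statement and `contextlib.contextmanager`. This is a different construct from the one specified here (it cannot re-run the body, so looping templates such as auto_retry are not covered). `RecordingLock` and `find_something` are hypothetical names introduced for this sketch.

```python
from contextlib import contextmanager


class RecordingLock:
    """Hypothetical stand-in for a real lock; it only records calls."""

    def __init__(self):
        self.events = []

    def acquire(self):
        self.events.append("acquire")

    def release(self):
        self.events.append("release")


@contextmanager
def synchronized(lock):
    # Same generator body as Example 1: setup, yield to the body, cleanup.
    lock.acquire()
    try:
        yield
    finally:
        lock.release()


def find_something(lock, elements, key, default=None):
    # A return inside the body still runs the template's cleanup code.
    with synchronized(lock):
        for item in elements:
            if item == key:
                return item
        return default


lock = RecordingLock()
hit = find_something(lock, ["a", "b", "c"], "b")
miss = find_something(lock, ["a", "b", "c"], "z", default="none")
```

The point of the sketch is the one made in the thunk comparison: returning from inside the controlled body leaves the containing function, and the template's finally clause still releases the lock.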