PEP: 334 Title: Simple Coroutines via SuspendIteration Version: $Revision$ Last-Modified: $Date$ Author: Clark C. Evans <cce@clarkevans.com> Status: Withdrawn Type: Standards Track Content-Type: text/x-rst Created: 26-Aug-2004 Python-Version: 3.0 Post-History: Abstract ======== Asynchronous application frameworks such as Twisted [1]_ and Peak [2]_, are based on a cooperative multitasking via event queues or deferred execution. While this approach to application development does not involve threads and thus avoids a whole class of problems [3]_, it creates a different sort of programming challenge. When an I/O operation would block, a user request must suspend so that other requests can proceed. The concept of a coroutine [4]_ promises to help the application developer grapple with this state management difficulty. This PEP proposes a limited approach to coroutines based on an extension to the iterator protocol [5]_. Currently, an iterator may raise a StopIteration exception to indicate that it is done producing values. This proposal adds another exception to this protocol, SuspendIteration, which indicates that the given iterator may have more values to produce, but is unable to do so at this time. Rationale ========= There are two current approaches to bringing co-routines to Python. Christian Tismer's Stackless [6]_ involves a ground-up restructuring of Python's execution model by hacking the 'C' stack. While this approach works, its operation is hard to describe and keep portable. A related approach is to compile Python code to Parrot [7]_, a register-based virtual machine, which has coroutines. Unfortunately, neither of these solutions is portable with IronPython (CLR) or Jython (JavaVM). It is thought that a more limited approach, based on iterators, could provide a coroutine facility to application programmers and still be portable across runtimes. * Iterators keep their state in local variables that are not on the "C" stack. Iterators can be viewed as classes, with state stored in member variables that are persistent across calls to its next() method. * While an uncaught exception may terminate a function's execution, an uncaught exception need not invalidate an iterator. The proposed exception, SuspendIteration, uses this feature. In other words, just because one call to next() results in an exception does not necessarily need to imply that the iterator itself is no longer capable of producing values. There are four places where this new exception impacts: * The simple generator [8]_ mechanism could be extended to safely 'catch' this SuspendIteration exception, stuff away its current state, and pass the exception on to the caller. * Various iterator filters [9]_ in the standard library, such as itertools.izip should be made aware of this exception so that it can transparently propagate SuspendIteration. * Iterators generated from I/O operations, such as a file or socket reader, could be modified to have a non-blocking variety. This option would raise a subclass of SuspendIteration if the requested operation would block. * The asyncore library could be updated to provide a basic 'runner' that pulls from an iterator; if the SuspendIteration exception is caught, then it moves on to the next iterator in its runlist [10]_. External frameworks like Twisted would provide alternative implementations, perhaps based on FreeBSD's kqueue or Linux's epoll. While these may seem dramatic changes, it is a very small amount of work compared with the utility provided by continuations. Semantics ========= This section will explain, at a high level, how the introduction of this new SuspendIteration exception would behave. Simple Iterators ---------------- The current functionality of iterators is best seen with a simple example which produces two values 'one' and 'two'. :: class States: def __iter__(self): self._next = self.state_one return self def next(self): return self._next() def state_one(self): self._next = self.state_two return "one" def state_two(self): self._next = self.state_stop return "two" def state_stop(self): raise StopIteration print list(States()) An equivalent iteration could, of course, be created by the following generator:: def States(): yield 'one' yield 'two' print list(States()) Introducing SuspendIteration ---------------------------- Suppose that between producing 'one' and 'two', the generator above could block on a socket read. In this case, we would want to raise SuspendIteration to signal that the iterator is not done producing, but is unable to provide a value at the current moment. :: from random import randint from time import sleep class SuspendIteration(Exception): pass class NonBlockingResource: """Randomly unable to produce the second value""" def __iter__(self): self._next = self.state_one return self def next(self): return self._next() def state_one(self): self._next = self.state_suspend return "one" def state_suspend(self): rand = randint(1,10) if 2 == rand: self._next = self.state_two return self.state_two() raise SuspendIteration() def state_two(self): self._next = self.state_stop return "two" def state_stop(self): raise StopIteration def sleeplist(iterator, timeout = .1): """ Do other things (e.g. sleep) while resource is unable to provide the next value """ it = iter(iterator) retval = [] while True: try: retval.append(it.next()) except SuspendIteration: sleep(timeout) continue except StopIteration: break return retval print sleeplist(NonBlockingResource()) In a real-world situation, the NonBlockingResource would be a file iterator, socket handle, or other I/O based producer. The sleeplist would instead be an async reactor, such as those found in asyncore or Twisted. The non-blocking resource could, of course, be written as a generator:: def NonBlockingResource(): yield "one" while True: rand = randint(1,10) if 2 == rand: break raise SuspendIteration() yield "two" It is not necessary to add a keyword, 'suspend', since most real content generators will not be in application code, they will be in low-level I/O based operations. Since most programmers need not be exposed to the SuspendIteration() mechanism, a keyword is not needed. Application Iterators --------------------- The previous example is rather contrived, a more 'real-world' example would be a web page generator which yields HTML content, and pulls from a database. Note that this is an example of neither the 'producer' nor the 'consumer', but rather of a filter. :: def ListAlbums(cursor): cursor.execute("SELECT title, artist FROM album") yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>' for (title, artist) in cursor: yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist) yield '</table></body></html>' The problem, of course, is that the database may block for some time before any rows are returned, and that during execution, rows may be returned in blocks of 10 or 100 at a time. Ideally, if the database blocks for the next set of rows, another user connection could be serviced. Note the complete absence of SuspendIterator in the above code. If done correctly, application developers would be able to focus on functionality rather than concurrency issues. The iterator created by the above generator should do the magic necessary to maintain state, yet pass the exception through to a lower-level async framework. Here is an example of what the corresponding iterator would look like if coded up as a class:: class ListAlbums: def __init__(self, cursor): self.cursor = cursor def __iter__(self): self.cursor.execute("SELECT title, artist FROM album") self._iter = iter(self._cursor) self._next = self.state_head return self def next(self): return self._next() def state_head(self): self._next = self.state_cursor return "<html><body><table><tr><td>\ Title</td><td>Artist</td></tr>" def state_tail(self): self._next = self.state_stop return "</table></body></html>" def state_cursor(self): try: (title,artist) = self._iter.next() return '<tr><td>%s</td><td>%s</td></tr>' % (title, artist) except StopIteration: self._next = self.state_tail return self.next() except SuspendIteration: # just pass-through raise def state_stop(self): raise StopIteration Complicating Factors -------------------- While the above example is straight-forward, things are a bit more complicated if the intermediate generator 'condenses' values, that is, it pulls in two or more values for each value it produces. For example, :: def pair(iterLeft,iterRight): rhs = iter(iterRight) lhs = iter(iterLeft) while True: yield (rhs.next(), lhs.next()) In this case, the corresponding iterator behavior has to be a bit more subtle to handle the case of either the right or left iterator raising SuspendIteration. It seems to be a matter of decomposing the generator to recognize intermediate states where a SuspendIterator exception from the producing context could happen. :: class pair: def __init__(self, iterLeft, iterRight): self.iterLeft = iterLeft self.iterRight = iterRight def __iter__(self): self.rhs = iter(iterRight) self.lhs = iter(iterLeft) self._temp_rhs = None self._temp_lhs = None self._next = self.state_rhs return self def next(self): return self._next() def state_rhs(self): self._temp_rhs = self.rhs.next() self._next = self.state_lhs return self.next() def state_lhs(self): self._temp_lhs = self.lhs.next() self._next = self.state_pair return self.next() def state_pair(self): self._next = self.state_rhs return (self._temp_rhs, self._temp_lhs) This proposal assumes that a corresponding iterator written using this class-based method is possible for existing generators. The challenge seems to be the identification of distinct states within the generator where suspension could occur. Resource Cleanup ---------------- The current generator mechanism has a strange interaction with exceptions where a 'yield' statement is not allowed within a try/finally block. The SuspendIterator exception provides another similar issue. The impacts of this issue are not clear. However it may be that re-writing the generator into a state machine, as the previous section did, could resolve this issue allowing for the situation to be no-worse than, and perhaps even removing the yield/finally situation. More investigation is needed in this area. API and Limitations ------------------- This proposal only covers 'suspending' a chain of iterators, and does not cover (of course) suspending general functions, methods, or "C" extension function. While there could be no direct support for creating generators in "C" code, native "C" iterators which comply with the SuspendIterator semantics are certainly possible. Low-Level Implementation ======================== The author of the PEP is not yet familiar with the Python execution model to comment in this area. References ========== .. [1] Twisted (http://twistedmatrix.com) .. [2] Peak (http://peak.telecommunity.com) .. [3] C10K (http://www.kegel.com/c10k.html) .. [4] Coroutines (http://c2.com/cgi/wiki?CallWithCurrentContinuation) .. [5] PEP 234, Iterators (http://www.python.org/dev/peps/pep-0234/) .. [6] Stackless Python (http://stackless.com) .. [7] Parrot /w coroutines (http://www.sidhe.org/~dan/blog/archives/000178.html) .. [8] PEP 255, Simple Generators (http://www.python.org/dev/peps/pep-0255/) .. [9] itertools - Functions creating iterators (http://docs.python.org/library/itertools.html) .. [10] Microthreads in Python, David Mertz (http://www-106.ibm.com/developerworks/linux/library/l-pythrd.html) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: