410 lines
13 KiB
ReStructuredText
410 lines
13 KiB
ReStructuredText
PEP: 334
|
||
Title: Simple Coroutines via SuspendIteration
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Clark C. Evans <cce@clarkevans.com>
|
||
Status: Withdrawn
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 26-Aug-2004
|
||
Python-Version: 3.0
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
Asynchronous application frameworks such as Twisted [1]_ and Peak
|
||
[2]_, are based on a cooperative multitasking via event queues or
|
||
deferred execution. While this approach to application development
|
||
does not involve threads and thus avoids a whole class of problems
|
||
[3]_, it creates a different sort of programming challenge. When an
|
||
I/O operation would block, a user request must suspend so that other
|
||
requests can proceed. The concept of a coroutine [4]_ promises to
|
||
help the application developer grapple with this state management
|
||
difficulty.
|
||
|
||
This PEP proposes a limited approach to coroutines based on an
|
||
extension to the :pep:`iterator protocol <234>`. Currently, an iterator may
|
||
raise a StopIteration exception to indicate that it is done producing
|
||
values. This proposal adds another exception to this protocol,
|
||
SuspendIteration, which indicates that the given iterator may have
|
||
more values to produce, but is unable to do so at this time.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
There are two current approaches to bringing co-routines to Python.
|
||
Christian Tismer's Stackless [6]_ involves a ground-up restructuring
|
||
of Python's execution model by hacking the 'C' stack. While this
|
||
approach works, its operation is hard to describe and keep portable. A
|
||
related approach is to compile Python code to Parrot [7]_, a
|
||
register-based virtual machine, which has coroutines. Unfortunately,
|
||
neither of these solutions is portable with IronPython (CLR) or Jython
|
||
(JavaVM).
|
||
|
||
It is thought that a more limited approach, based on iterators, could
|
||
provide a coroutine facility to application programmers and still be
|
||
portable across runtimes.
|
||
|
||
* Iterators keep their state in local variables that are not on the
|
||
"C" stack. Iterators can be viewed as classes, with state stored in
|
||
member variables that are persistent across calls to its next()
|
||
method.
|
||
|
||
* While an uncaught exception may terminate a function's execution, an
|
||
uncaught exception need not invalidate an iterator. The proposed
|
||
exception, SuspendIteration, uses this feature. In other words,
|
||
just because one call to next() results in an exception does not
|
||
necessarily need to imply that the iterator itself is no longer
|
||
capable of producing values.
|
||
|
||
There are four places where this new exception impacts:
|
||
|
||
* The :pep:`255` simple generator mechanism could be extended to safely
|
||
'catch' this SuspendIteration exception, stuff away its current
|
||
state, and pass the exception on to the caller.
|
||
|
||
* Various iterator filters [9]_ in the standard library, such as
|
||
itertools.izip should be made aware of this exception so that it can
|
||
transparently propagate SuspendIteration.
|
||
|
||
* Iterators generated from I/O operations, such as a file or socket
|
||
reader, could be modified to have a non-blocking variety. This
|
||
option would raise a subclass of SuspendIteration if the requested
|
||
operation would block.
|
||
|
||
* The asyncore library could be updated to provide a basic 'runner'
|
||
that pulls from an iterator; if the SuspendIteration exception is
|
||
caught, then it moves on to the next iterator in its runlist [10]_.
|
||
External frameworks like Twisted would provide alternative
|
||
implementations, perhaps based on FreeBSD's kqueue or Linux's epoll.
|
||
|
||
While these may seem dramatic changes, it is a very small amount of
|
||
work compared with the utility provided by continuations.
|
||
|
||
|
||
Semantics
|
||
=========
|
||
|
||
This section will explain, at a high level, how the introduction of
|
||
this new SuspendIteration exception would behave.
|
||
|
||
|
||
Simple Iterators
|
||
----------------
|
||
|
||
The current functionality of iterators is best seen with a simple
|
||
example which produces two values 'one' and 'two'. ::
|
||
|
||
class States:
|
||
|
||
def __iter__(self):
|
||
self._next = self.state_one
|
||
return self
|
||
|
||
def next(self):
|
||
return self._next()
|
||
|
||
def state_one(self):
|
||
self._next = self.state_two
|
||
return "one"
|
||
|
||
def state_two(self):
|
||
self._next = self.state_stop
|
||
return "two"
|
||
|
||
def state_stop(self):
|
||
raise StopIteration
|
||
|
||
print list(States())
|
||
|
||
An equivalent iteration could, of course, be created by the
|
||
following generator::
|
||
|
||
def States():
|
||
yield 'one'
|
||
yield 'two'
|
||
|
||
print list(States())
|
||
|
||
|
||
Introducing SuspendIteration
|
||
----------------------------
|
||
|
||
Suppose that between producing 'one' and 'two', the generator above
|
||
could block on a socket read. In this case, we would want to raise
|
||
SuspendIteration to signal that the iterator is not done producing,
|
||
but is unable to provide a value at the current moment. ::
|
||
|
||
from random import randint
|
||
from time import sleep
|
||
|
||
class SuspendIteration(Exception):
|
||
pass
|
||
|
||
class NonBlockingResource:
|
||
|
||
"""Randomly unable to produce the second value"""
|
||
|
||
def __iter__(self):
|
||
self._next = self.state_one
|
||
return self
|
||
|
||
def next(self):
|
||
return self._next()
|
||
|
||
def state_one(self):
|
||
self._next = self.state_suspend
|
||
return "one"
|
||
|
||
def state_suspend(self):
|
||
rand = randint(1,10)
|
||
if 2 == rand:
|
||
self._next = self.state_two
|
||
return self.state_two()
|
||
raise SuspendIteration()
|
||
|
||
def state_two(self):
|
||
self._next = self.state_stop
|
||
return "two"
|
||
|
||
def state_stop(self):
|
||
raise StopIteration
|
||
|
||
def sleeplist(iterator, timeout = .1):
|
||
"""
|
||
Do other things (e.g. sleep) while resource is
|
||
unable to provide the next value
|
||
"""
|
||
it = iter(iterator)
|
||
retval = []
|
||
while True:
|
||
try:
|
||
retval.append(it.next())
|
||
except SuspendIteration:
|
||
sleep(timeout)
|
||
continue
|
||
except StopIteration:
|
||
break
|
||
return retval
|
||
|
||
print sleeplist(NonBlockingResource())
|
||
|
||
In a real-world situation, the NonBlockingResource would be a file
|
||
iterator, socket handle, or other I/O based producer. The sleeplist
|
||
would instead be an async reactor, such as those found in asyncore or
|
||
Twisted. The non-blocking resource could, of course, be written as a
|
||
generator::
|
||
|
||
def NonBlockingResource():
|
||
yield "one"
|
||
while True:
|
||
rand = randint(1,10)
|
||
if 2 == rand:
|
||
break
|
||
raise SuspendIteration()
|
||
yield "two"
|
||
|
||
It is not necessary to add a keyword, 'suspend', since most real
|
||
content generators will not be in application code, they will be in
|
||
low-level I/O based operations. Since most programmers need not be
|
||
exposed to the SuspendIteration() mechanism, a keyword is not needed.
|
||
|
||
|
||
Application Iterators
|
||
---------------------
|
||
|
||
The previous example is rather contrived, a more 'real-world' example
|
||
would be a web page generator which yields HTML content, and pulls
|
||
from a database. Note that this is an example of neither the
|
||
'producer' nor the 'consumer', but rather of a filter. ::
|
||
|
||
def ListAlbums(cursor):
|
||
cursor.execute("SELECT title, artist FROM album")
|
||
yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
|
||
for (title, artist) in cursor:
|
||
yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
|
||
yield '</table></body></html>'
|
||
|
||
The problem, of course, is that the database may block for some time
|
||
before any rows are returned, and that during execution, rows may be
|
||
returned in blocks of 10 or 100 at a time. Ideally, if the database
|
||
blocks for the next set of rows, another user connection could be
|
||
serviced. Note the complete absence of SuspendIterator in the above
|
||
code. If done correctly, application developers would be able to
|
||
focus on functionality rather than concurrency issues.
|
||
|
||
The iterator created by the above generator should do the magic
|
||
necessary to maintain state, yet pass the exception through to a
|
||
lower-level async framework. Here is an example of what the
|
||
corresponding iterator would look like if coded up as a class::
|
||
|
||
class ListAlbums:
|
||
|
||
def __init__(self, cursor):
|
||
self.cursor = cursor
|
||
|
||
def __iter__(self):
|
||
self.cursor.execute("SELECT title, artist FROM album")
|
||
self._iter = iter(self._cursor)
|
||
self._next = self.state_head
|
||
return self
|
||
|
||
def next(self):
|
||
return self._next()
|
||
|
||
def state_head(self):
|
||
self._next = self.state_cursor
|
||
return "<html><body><table><tr><td>\
|
||
Title</td><td>Artist</td></tr>"
|
||
|
||
def state_tail(self):
|
||
self._next = self.state_stop
|
||
return "</table></body></html>"
|
||
|
||
def state_cursor(self):
|
||
try:
|
||
(title,artist) = self._iter.next()
|
||
return '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
|
||
except StopIteration:
|
||
self._next = self.state_tail
|
||
return self.next()
|
||
except SuspendIteration:
|
||
# just pass-through
|
||
raise
|
||
|
||
def state_stop(self):
|
||
raise StopIteration
|
||
|
||
|
||
Complicating Factors
|
||
--------------------
|
||
|
||
While the above example is straightforward, things are a bit more
|
||
complicated if the intermediate generator 'condenses' values, that is,
|
||
it pulls in two or more values for each value it produces. For
|
||
example, ::
|
||
|
||
def pair(iterLeft,iterRight):
|
||
rhs = iter(iterRight)
|
||
lhs = iter(iterLeft)
|
||
while True:
|
||
yield (rhs.next(), lhs.next())
|
||
|
||
In this case, the corresponding iterator behavior has to be a bit more
|
||
subtle to handle the case of either the right or left iterator raising
|
||
SuspendIteration. It seems to be a matter of decomposing the
|
||
generator to recognize intermediate states where a SuspendIterator
|
||
exception from the producing context could happen. ::
|
||
|
||
class pair:
|
||
|
||
def __init__(self, iterLeft, iterRight):
|
||
self.iterLeft = iterLeft
|
||
self.iterRight = iterRight
|
||
|
||
def __iter__(self):
|
||
self.rhs = iter(iterRight)
|
||
self.lhs = iter(iterLeft)
|
||
self._temp_rhs = None
|
||
self._temp_lhs = None
|
||
self._next = self.state_rhs
|
||
return self
|
||
|
||
def next(self):
|
||
return self._next()
|
||
|
||
def state_rhs(self):
|
||
self._temp_rhs = self.rhs.next()
|
||
self._next = self.state_lhs
|
||
return self.next()
|
||
|
||
def state_lhs(self):
|
||
self._temp_lhs = self.lhs.next()
|
||
self._next = self.state_pair
|
||
return self.next()
|
||
|
||
def state_pair(self):
|
||
self._next = self.state_rhs
|
||
return (self._temp_rhs, self._temp_lhs)
|
||
|
||
This proposal assumes that a corresponding iterator written using
|
||
this class-based method is possible for existing generators. The
|
||
challenge seems to be the identification of distinct states within
|
||
the generator where suspension could occur.
|
||
|
||
|
||
Resource Cleanup
|
||
----------------
|
||
|
||
The current generator mechanism has a strange interaction with
|
||
exceptions where a 'yield' statement is not allowed within a
|
||
try/finally block. The SuspendIterator exception provides another
|
||
similar issue. The impacts of this issue are not clear. However it
|
||
may be that re-writing the generator into a state machine, as the
|
||
previous section did, could resolve this issue allowing for the
|
||
situation to be no-worse than, and perhaps even removing the
|
||
yield/finally situation. More investigation is needed in this area.
|
||
|
||
|
||
API and Limitations
|
||
-------------------
|
||
|
||
This proposal only covers 'suspending' a chain of iterators, and does
|
||
not cover (of course) suspending general functions, methods, or "C"
|
||
extension function. While there could be no direct support for
|
||
creating generators in "C" code, native "C" iterators which comply
|
||
with the SuspendIterator semantics are certainly possible.
|
||
|
||
|
||
Low-Level Implementation
|
||
========================
|
||
|
||
The author of the PEP is not yet familiar with the Python execution
|
||
model to comment in this area.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Twisted
|
||
(http://twistedmatrix.com)
|
||
|
||
.. [2] Peak
|
||
(http://peak.telecommunity.com)
|
||
|
||
.. [3] C10K
|
||
(http://www.kegel.com/c10k.html)
|
||
|
||
.. [4] Coroutines
|
||
(http://c2.com/cgi/wiki?CallWithCurrentContinuation)
|
||
|
||
.. [6] Stackless Python
|
||
(http://stackless.com)
|
||
|
||
.. [7] Parrot /w coroutines
|
||
(http://www.sidhe.org/~dan/blog/archives/000178.html)
|
||
|
||
.. [9] itertools - Functions creating iterators
|
||
(http://docs.python.org/library/itertools.html)
|
||
|
||
.. [10] Microthreads in Python, David Mertz
|
||
(http://www-106.ibm.com/developerworks/linux/library/l-pythrd.html)
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
End:
|