Greg's latest version.

This commit is contained in:
Guido van Rossum 2009-03-27 03:33:12 +00:00
parent 7cd0217332
commit eb61d88000
1 changed files with 180 additions and 100 deletions

@ -24,6 +24,37 @@ The new syntax also opens up some opportunities for optimisation when
one generator re-yields values produced by another.
Motivation
==========
A Python generator is a form of coroutine, but has the limitation that
it can only yield to its immediate caller. This means that a piece of
code containing a ``yield`` cannot be factored out and put into a
separate function in the same way as other code. Performing such a
factoring causes the called function to itself become a generator, and
it is necessary to explicitly iterate over this second generator and
re-yield any values that it produces.
If yielding of values is the only concern, this can be performed without
much difficulty using a loop such as
::
    for v in g:
        yield v
However, if the subgenerator is to interact properly with the caller
in the case of calls to ``send()``, ``throw()`` and ``close()``, things
become considerably more difficult. As will be seen later, the necessary
code is very complicated, and it is tricky to handle all the corner cases
correctly.
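The limitation of the plain loop can be seen in a small sketch. It assumes a Python 3 interpreter, and the names ``echo`` and ``naive_delegate`` are hypothetical, introduced only for illustration. The loop passes values outwards, but a value sent to the outer generator never reaches the inner one.

```python
def echo():
    """Subgenerator that yields back whatever was last sent to it."""
    received = None
    while True:
        received = yield received


def naive_delegate(g):
    # The simple re-yield loop: the result of each ``yield v`` is discarded,
    # and the for-loop always advances g with a plain next().
    for v in g:
        yield v


# Talking to the subgenerator directly: the sent value comes back.
direct = echo()
next(direct)
direct_reply = direct.send("hello")

# Talking through the loop: echo() only ever sees None.
wrapped = naive_delegate(echo())
next(wrapped)
wrapped_reply = wrapped.send("hello")
```

Here ``direct_reply`` is ``"hello"`` while ``wrapped_reply`` is ``None``, because the sent value is dropped inside ``naive_delegate``.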
A new syntax will be proposed to address this issue. In the simplest
use cases, it will be equivalent to the above for-loop, but it will also
handle the full range of generator behaviour, and allow generator code
to be refactored in a simple and straightforward way.
Proposal
========
@ -32,7 +63,7 @@ generator:
::
yield from <expr>
yield from <expr>
where <expr> is an expression evaluating to an iterable, from which an
iterator is extracted. The iterator is run to exhaustion, during which
@ -40,40 +71,55 @@ time it yields and receives values directly to or from the caller of
the generator containing the ``yield from`` expression (the
"delegating generator").
When the iterator is another generator, the effect is the same as if
the body of the subgenerator were inlined at the point of the ``yield
from`` expression. Furthermore, the subgenerator is allowed to execute
a ``return`` statement with a value, and that value becomes the value of
the ``yield from`` expression.
Furthermore, when the iterator is another generator, the subgenerator is
allowed to execute a ``return`` statement with a value, and that value
becomes the value of the ``yield from`` expression.
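As a rough illustration of a subgenerator returning a value, the following sketch assumes a Python 3 interpreter in which this syntax is available. The generators ``read_header`` and ``parser`` are hypothetical names chosen for the example.

```python
def read_header():
    """Subgenerator: asks two questions, then returns the answers."""
    name = yield "name?"
    size = yield "size?"
    return (name, size)  # becomes the value of the yield from expression


def parser():
    """Delegating generator."""
    header = yield from read_header()
    yield "got %s/%s" % header


p = parser()
outputs = [next(p), p.send("spam"), p.send(3)]
```

The caller of ``parser()`` talks to ``read_header()`` as if its body were written inline, and ``header`` receives the returned tuple.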
In general, the semantics can be understood in terms of the iterator
In general, the semantics can be described in terms of the iterator
protocol as follows:
* Any values that the iterator yields are passed directly to the
caller.
* Any values that the iterator yields are passed directly to the
caller.
* Any values sent to the delegating generator using ``send()``
are passed directly to the iterator. If the sent value is None,
the iterator's ``next()`` method is called. If the sent value is
not None, the iterator's ``send()`` method is called. Any exception
resulting from attempting to call ``next`` or ``send`` is raised
in the delegating generator.
* Any values sent to the delegating generator using ``send()``
are passed directly to the iterator. If the sent value is None,
the iterator's ``next()`` method is called. If the sent value is
not None, the iterator's ``send()`` method is called. Any exception
resulting from attempting to call ``next`` or ``send`` is raised
in the delegating generator.
* Calls to the ``throw()`` method of the delegating generator are
forwarded to the iterator. If the iterator does not have a
``throw()`` method, the thrown-in exception is raised in the
delegating generator.
* Exceptions passed to the ``throw()`` method of the delegating
generator are forwarded to the ``throw()`` method of the iterator.
If the iterator does not have a ``throw()`` method, its ``close()``
method is called if it has one, then the thrown-in exception is
raised in the delegating generator. Any exception resulting from
attempting to call these methods (apart from one case noted below)
is raised in the delegating generator.
* If the delegating generator's ``close()`` method is called, the
``close()`` method of the iterator is called first if it has one,
then the delegating generator is finalised.
* The value of the ``yield from`` expression is the first argument
to the ``StopIteration`` exception raised by the iterator when it
terminates.
* The value of the ``yield from`` expression is the first argument
to the ``StopIteration`` exception raised by the iterator when it
terminates.
* ``return expr`` in a generator causes ``StopIteration(expr)`` to
be raised.
* ``return expr`` in a generator causes ``StopIteration(expr)`` to
be raised.
Fine Details
------------
The implicit GeneratorExit resulting from closing the delegating
generator is treated as though it were passed in using ``throw()``.
An iterator having a ``throw()`` method is expected to recognize
this as a request to finalize itself.
If a call to the iterator's ``throw()`` method raises a StopIteration
exception, and it is *not* the same exception object that was thrown
in, its value is returned as the value of the ``yield from`` expression
and the delegating generator is resumed.
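The ``throw()`` case can be sketched as follows, assuming a Python 3 interpreter that implements the proposed syntax. ``Cancel``, ``worker`` and ``supervisor`` are hypothetical names. The subgenerator catches the thrown-in exception and returns, which raises a new StopIteration, so the delegating generator resumes with the returned value.

```python
class Cancel(Exception):
    """Hypothetical exception used to ask the subgenerator to stop early."""


def worker():
    try:
        while True:
            yield "working"
    except Cancel:
        # Returning raises StopIteration("cancelled cleanly"), which is not
        # the exception object that was thrown in.
        return "cancelled cleanly"


def supervisor():
    status = yield from worker()
    yield "worker finished: " + status


s = supervisor()
first = next(s)
# The exception is forwarded to worker(); supervisor() then resumes.
second = s.throw(Cancel())
```

In this sketch ``first`` is ``"working"`` and ``second`` is ``"worker finished: cancelled cleanly"``.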
Enhancements to StopIteration
-----------------------------
For convenience, the ``StopIteration`` exception will be given a
``value`` attribute that holds its first argument, or None if there
@ -87,50 +133,66 @@ Formal Semantics
::
result = yield from expr
RESULT = yield from EXPR
is semantically equivalent to
::
    _i = iter(expr)
    try:
        _u = _i.next()
        while 1:
            try:
                _v = yield _u
            except Exception, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
    except StopIteration, _e:
        result = _e.value
    finally:
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
    _i = iter(EXPR)
    try:
        try:
            _y = _i.next()
        except StopIteration, _e:
            _r = _e.value
        else:
            while 1:
                try:
                    _s = yield _y
                except:
                    _m = getattr(_i, 'throw', None)
                    if _m is not None:
                        _x = sys.exc_info()
                        try:
                            _y = _m(*_x)
                        except StopIteration, _e:
                            if _e is _x[1]:
                                raise
                            else:
                                _r = _e.value
                                break
                    else:
                        _m = getattr(_i, 'close', None)
                        if _m is not None:
                            _m()
                        raise
                else:
                    try:
                        if _s is None:
                            _y = _i.next()
                        else:
                            _y = _i.send(_s)
                    except StopIteration, _e:
                        _r = _e.value
                        break
    finally:
        del _i
    RESULT = _r
except that implementations are free to cache bound methods for the 'next',
'send', 'throw' and 'close' methods of the iterator.
'send' and 'throw' methods of the iterator upon first use.
2. In a generator, the statement
::
return value
return value
is semantically equivalent to
::
raise StopIteration(value)
raise StopIteration(value)
except that, as currently, the exception cannot be caught by ``except``
clauses within the returning generator.
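A small sketch of this equivalence, assuming a Python 3 interpreter (the generator name ``g`` is only illustrative): the returned value reaches the caller as the ``value`` of a StopIteration, while an ``except StopIteration`` clause inside the generator does not intercept the ``return``.

```python
def g():
    try:
        yield 1
        return "done"
    except StopIteration:
        # Not reached: the return is not catchable inside the generator.
        yield "caught inside"


it = g()
first = next(it)
try:
    next(it)
    returned = None
except StopIteration as e:
    # The caller sees the return value on the exception.
    returned = e.value
```

Here ``first`` is ``1`` and ``returned`` is ``"done"``.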
@ -139,66 +201,82 @@ clauses within the returning generator.
::
class StopIteration(Exception):
class StopIteration(Exception):
    def __init__(self, *args):
        if len(args) > 0:
            self.value = args[0]
        else:
            self.value = None
        Exception.__init__(self, *args)
    def __init__(self, *args):
        if len(args) > 0:
            self.value = args[0]
        else:
            self.value = None
        Exception.__init__(self, *args)
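The behaviour of the ``value`` attribute can be checked with a short sketch. It uses the built-in StopIteration of a Python 3 interpreter, which is assumed here to follow the definition above; no custom class is defined.

```python
# Built-in StopIteration; the variable names are illustrative only.
with_value = StopIteration("payload")
without_value = StopIteration()
several_args = StopIteration("first", "second")

# value holds the first argument, or None when there are no arguments.
values = (with_value.value, without_value.value, several_args.value)
```

The resulting tuple is ``("payload", None, "first")``.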
Rationale
=========
A Python generator is a form of coroutine, but has the limitation that
it can only yield to its immediate caller. This means that a piece of
code containing a ``yield`` cannot be factored out and put into a
separate function in the same way as other code. Performing such a
factoring causes the called function to itself become a generator, and
it is necessary to explicitly iterate over this second generator and
re-yield any values that it produces.
The Refactoring Principle
-------------------------
If yielding of values is the only concern, this is not very arduous
and can be performed with a loop such as
The rationale behind most of the semantics presented above stems from
the desire to be able to refactor generator code. It should be possible
to take a section of code containing one or more ``yield`` expressions,
move it into a separate function (using the usual techniques to deal
with references to variables in the surrounding scope, etc.), and
call the new function using a ``yield from`` expression.
::
The behaviour of the resulting compound generator should be, as far as
possible, exactly the same as the original unfactored generator in all
situations, including calls to ``next()``, ``send()``, ``throw()`` and
``close()``.
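The refactoring principle can be illustrated with a sketch, assuming a Python 3 interpreter. All names (``unfactored``, ``accumulate``, ``factored``, ``drive``) are hypothetical. The accumulation loop is moved into its own generator, and the two versions are driven with the same ``send()`` calls.

```python
def unfactored():
    total = 0
    for _ in range(3):
        x = yield total
        total += x
    yield "total=%d" % total


def accumulate(n):
    """The loop from unfactored(), moved into a separate generator."""
    total = 0
    for _ in range(n):
        x = yield total
        total += x
    return total


def factored():
    total = yield from accumulate(3)
    yield "total=%d" % total


def drive(gen, inputs):
    """Prime the generator, then send each input and collect the replies."""
    out = [next(gen)]
    for x in inputs:
        out.append(gen.send(x))
    return out


before = drive(unfactored(), [1, 2, 3])
after = drive(factored(), [1, 2, 3])
```

Both runs produce ``[0, 1, 3, "total=6"]``, so the caller cannot tell the factored version from the original.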
    for v in g:
        yield v
The semantics in cases of subiterators other than generators has been
chosen as a reasonable generalization of the generator case.
However, if the subgenerator is to interact properly with the caller
in the case of calls to ``send()``, ``throw()`` and ``close()``, things
become considerably more complicated. As the formal expansion presented
above illustrates, the necessary code is very longwinded, and it is tricky
to handle all the corner cases correctly. In this situation, the advantages
of a specialised syntax should be clear.
Finalization
------------
There was some debate as to whether explicitly finalizing the delegating
generator by calling its ``close()`` method while it is suspended at a
``yield from`` should also finalize the subiterator. An argument against
doing so is that it would result in premature finalization of the
subiterator if references to it exist elsewhere.
Consideration of non-refcounting Python implementations led to the
decision that this explicit finalization should be performed, so that
explicitly closing a factored generator has the same effect as doing
so to an unfactored one in all Python implementations.
The assumption made is that, in the majority of use cases, the subiterator
will not be shared. The rare case of a shared subiterator can be
accommodated by means of a wrapper that blocks ``throw()`` and ``close()``
calls, or by using a means other than ``yield from`` to call the
subiterator.
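The following sketch shows both situations as they behave in a Python 3 interpreter, which may differ in detail from this draft. ``counter``, ``delegating`` and ``IterOnly`` are hypothetical names. ``IterOnly`` is one possible form of wrapper: it exposes only the plain iteration methods, so a request to finalize never reaches the shared subiterator.

```python
log = []


def counter():
    n = 0
    try:
        while True:
            yield n
            n += 1
    finally:
        log.append("counter finalized")


def delegating(it):
    yield from it


class IterOnly:
    """Wrapper exposing only __iter__/__next__ (no send, throw or close)."""

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


# Unshared subiterator: closing the delegating generator finalizes it too.
outer = delegating(counter())
next(outer)
outer.close()
log_after_close = list(log)

# Shared subiterator behind the wrapper: it survives the close().
shared = counter()
outer = delegating(IterOnly(shared))
next(outer)
outer.close()
still_running = next(shared)
```

After the first ``close()`` the log contains one ``"counter finalized"`` entry. The shielded ``shared`` generator is not finalized by the second ``close()`` and continues with ``1``.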
Generators as Threads
---------------------
A motivating use case for generators being able to return values
concerns the use of generators to implement lightweight threads. When
using generators in that way, it is reasonable to want to spread the
A motivation for generators being able to return values concerns the
use of generators to implement lightweight threads. When using
generators in that way, it is reasonable to want to spread the
computation performed by the lightweight thread over many functions.
One would like to be able to call a subgenerator as though it were
an ordinary function, passing it parameters and receiving a returned
One would like to be able to call a subgenerator as though it were an
ordinary function, passing it parameters and receiving a returned
value.
Using the proposed syntax, a statement such as
::
y = f(x)
y = f(x)
where f is an ordinary function, can be transformed into a delegation
call
::
y = yield from g(x)
y = yield from g(x)
where g is a generator. One can reason about the behaviour of the
resulting code by thinking of g as an ordinary function that can be
@ -236,7 +314,8 @@ Using a specialised syntax opens up possibilities for optimisation
when there is a long chain of generators. Such chains can arise, for
instance, when recursively traversing a tree structure. The overhead
of passing ``next()`` calls and yielded values down and up the chain
can cause what ought to be an O(n) operation to become O(n\*\*2).
can cause what ought to be an O(n) operation to become, in the worst
case, O(n\*\*2).
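A sketch of such a chain, assuming a Python 3 interpreter; ``Node``, ``inorder`` and ``degenerate_tree`` are hypothetical names. In a chain-shaped tree, a value produced at depth d is passed up through d suspended generators unless the implementation short-circuits the delegation.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Recursive in-order traversal; each level delegates to the next."""
    if node is not None:
        yield from inorder(node.left)
        yield node.value
        yield from inorder(node.right)


def degenerate_tree(n):
    """Build a right-leaning chain of n nodes, the worst case for depth."""
    root = None
    for v in reversed(range(n)):
        root = Node(v, right=root)
    return root


visited = list(inorder(degenerate_tree(5)))
```

The traversal visits ``[0, 1, 2, 3, 4]``. The last value is produced five generators deep, which is where the quadratic cost of a naive implementation comes from.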
A possible strategy is to add a slot to generator objects to hold a
generator being delegated to. When a ``next()`` or ``send()`` call is
@ -262,13 +341,13 @@ value of the ``close()`` call to the subgenerator. However, the proposed
mechanism is attractive for a couple of reasons:
* Using the StopIteration exception makes it easy for other kinds
of iterators to participate in the protocol without having to
grow extra attributes or a close() method.
of iterators to participate in the protocol without having to
grow an extra attribute or a close() method.
* It simplifies the implementation, because the point at which the
return value from the subgenerator becomes available is the same
point at which StopIteration is raised. Delaying until any later
time would require storing the return value somewhere.
return value from the subgenerator becomes available is the same
point at which StopIteration is raised. Delaying until any later
time would require storing the return value somewhere.
Criticisms
@ -316,18 +395,19 @@ code.
To the author's knowledge, previous proposals have focused only on
yielding values, and thereby suffered from the criticism that the
two-line for-loop they replace is not sufficiently tiresome to write
to justify a new syntax. By also dealing with calls to ``send()``,
``throw()`` and ``close()``, this proposal provides considerably more
benefit.
to justify a new syntax. By dealing with the full generator
protocol, this proposal provides considerably more benefit.
Implementation
==============
Additional Material
===================
A `prototype implementation`_ is available, based on the first
optimisation outlined above.
Some examples of the use of the proposed syntax are available, and also a
prototype implementation based on the first optimisation outlined above.
.. _prototype implementation: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
`Examples and Implementation`_
.. _Examples and Implementation: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
Copyright