PEP: 380
Title: Syntax for Delegating to a Subgenerator
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Feb-2009
Python-Version: 2.7
Post-History:


Abstract
========

A syntax is proposed for a generator to delegate part of its
operations to another generator. This allows a section of code
containing ``yield`` to be factored out and placed in another
generator. Additionally, the subgenerator is allowed to return with a
value, and the value is made available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when
one generator re-yields values produced by another.


Motivation
==========

A Python generator is a form of coroutine, but has the limitation that
it can only yield to its immediate caller. This means that a piece of
code containing a ``yield`` cannot be factored out and put into a
separate function in the same way as other code. Performing such a
factoring causes the called function to itself become a generator, and
it is necessary to explicitly iterate over this second generator and
re-yield any values that it produces.

If yielding of values is the only concern, this can be performed without
much difficulty using a loop such as

::

    for v in g:
        yield v

However, if the subgenerator is to interact properly with the caller
in the case of calls to ``send()``, ``throw()`` and ``close()``, things
become considerably more difficult. As will be seen later, the necessary
code is very complicated, and it is tricky to handle all the corner cases
correctly.
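
As a minimal sketch of the difficulty, the following shows the simple
for-loop silently dropping values passed in with ``send()``. The names
``echo`` and ``naive_delegate`` are hypothetical and used only for
illustration, and the sketch uses the Python 3 spelling ``next(d)``.

```python
def echo(log):
    """Subgenerator that records every value it receives."""
    while True:
        received = yield
        log.append(received)


def naive_delegate(g):
    """Re-yield values from g using the plain for-loop shown above."""
    for v in g:
        # Whatever the caller passes to send() arrives here and is discarded;
        # the for-loop then advances g with a plain next() call.
        yield v


log = []
d = naive_delegate(echo(log))
next(d)            # run to the first yield
d.send("hello")    # never reaches echo()
d.send("world")    # never reaches echo()
```

After these calls ``log`` holds ``[None, None]``: the subgenerator was
resumed twice, but never saw the sent strings.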

A new syntax will be proposed to address this issue. In the simplest
use cases, it will be equivalent to the above for-loop, but it will also
handle the full range of generator behaviour, and allow generator code
to be refactored in a simple and straightforward way.


Proposal
========

The following new expression syntax will be allowed in the body of a
generator:

::

    yield from <expr>

where <expr> is an expression evaluating to an iterable, from which an
iterator is extracted. The iterator is run to exhaustion, during which
time it yields and receives values directly to or from the caller of
the generator containing the ``yield from`` expression (the
"delegating generator").

Furthermore, when the iterator is another generator, the subgenerator is
allowed to execute a ``return`` statement with a value, and that value
becomes the value of the ``yield from`` expression.

In general, the semantics can be described in terms of the iterator
protocol as follows:

* Any values that the iterator yields are passed directly to the
  caller.

* Any values sent to the delegating generator using ``send()``
  are passed directly to the iterator. If the sent value is None,
  the iterator's ``next()`` method is called. If the sent value is
  not None, the iterator's ``send()`` method is called. Any exception
  resulting from attempting to call ``next`` or ``send`` is raised
  in the delegating generator.

* Exceptions passed to the ``throw()`` method of the delegating
  generator are forwarded to the ``throw()`` method of the iterator.
  If the iterator does not have a ``throw()`` method, its ``close()``
  method is called if it has one, then the thrown-in exception is
  raised in the delegating generator. Any exception resulting from
  attempting to call these methods (apart from one case noted below)
  is raised in the delegating generator.

* The value of the ``yield from`` expression is the first argument
  to the ``StopIteration`` exception raised by the iterator when it
  terminates.

* ``return expr`` in a generator causes ``StopIteration(expr)`` to
  be raised.
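
The rules above can be sketched with a small example. It assumes an
interpreter that implements the proposed syntax and uses Python 3
spellings such as ``next(d)``. The names ``accumulate`` and
``delegating`` are hypothetical.

```python
def accumulate():
    """Subgenerator: sum the values sent in; return the total on None."""
    total = 0
    while True:
        item = yield total
        if item is None:
            return total          # becomes the value of ``yield from``
        total += item


def delegating(results):
    # Yielded values and send() calls pass straight through to accumulate().
    total = yield from accumulate()
    results.append(total)
    yield "done"


results = []
d = delegating(results)
first = next(d)         # yielded by accumulate()
second = d.send(5)      # forwarded to accumulate().send(5)
third = d.send(7)       # forwarded to accumulate().send(7)
final = d.send(None)    # accumulate() returns; delegating() resumes
```

Here ``first``, ``second`` and ``third`` are the running totals 0, 5 and
12, ``final`` is ``"done"``, and ``results`` ends up as ``[12]``.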


Fine Details
------------

The implicit GeneratorExit resulting from closing the delegating
generator is treated as though it were passed in using ``throw()``.
An iterator having a ``throw()`` method is expected to recognize
this as a request to finalize itself.

If a call to the iterator's ``throw()`` method raises a StopIteration
exception, and it is *not* the same exception object that was thrown in,
and the original exception was not GeneratorExit, then the value of the
new exception is returned as the value of the ``yield from`` expression
and the delegating generator is resumed.
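
The second rule can be illustrated as follows, assuming an interpreter
that implements the proposed syntax. The exception class ``Cancel`` and
the generators ``worker`` and ``supervisor`` are hypothetical names.

```python
class Cancel(Exception):
    """Hypothetical signal asking the subgenerator to stop early."""


def worker():
    try:
        while True:
            yield "working"
    except Cancel:
        # Returning here raises a *new* StopIteration carrying this value.
        return "cancelled cleanly"


def supervisor(log):
    outcome = yield from worker()   # receives the value returned above
    log.append(outcome)
    yield "supervisor resumed"


log = []
s = supervisor(log)
status = next(s)              # "working", yielded by worker()
reply = s.throw(Cancel())     # forwarded to worker().throw()
```

The thrown-in ``Cancel`` is handled by the subgenerator, so the
delegating generator is resumed rather than terminated: ``reply`` is
``"supervisor resumed"`` and ``log`` contains ``"cancelled cleanly"``.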


Enhancements to StopIteration
-----------------------------

For convenience, the ``StopIteration`` exception will be given a
``value`` attribute that holds its first argument, or None if there
are no arguments.
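
A brief sketch of the attribute, assuming an interpreter in which
``StopIteration`` has been extended as described:

```python
# The first positional argument is exposed as .value.
with_value = StopIteration("result")

# With no arguments, .value is None.
without_value = StopIteration()

first_arg = with_value.value
missing = without_value.value
```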


Formal Semantics
----------------

1. The statement

::

    RESULT = yield from EXPR

is semantically equivalent to

::

    _i = iter(EXPR)
    try:
        _y = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _s = yield _y
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _x = sys.exc_info()
                    try:
                        _y = _m(*_x)
                    except StopIteration, _e:
                        if _e is _x[1] or isinstance(_x[1], GeneratorExit):
                            raise
                        else:
                            _r = _e.value
                            break
                else:
                    _m = getattr(_i, 'close', None)
                    if _m is not None:
                        _m()
                    raise
            else:
                try:
                    if _s is None:
                        _y = _i.next()
                    else:
                        _y = _i.send(_s)
                except StopIteration, _e:
                    _r = _e.value
                    break
    RESULT = _r

except that implementations are free to cache bound methods for the ``next``,
``send`` and ``throw`` methods of the iterator upon first use.

2. In a generator, the statement

::

    return value

is semantically equivalent to

::

    raise StopIteration(value)

except that, as currently, the exception cannot be caught by ``except``
clauses within the returning generator.

3. The StopIteration exception behaves as though defined as follows:

::

    class StopIteration(Exception):

        def __init__(self, *args):
            if len(args) > 0:
                self.value = args[0]
            else:
                self.value = None
            Exception.__init__(self, *args)
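
Item 2 can be observed from the caller's side with a short sketch. It
assumes an interpreter that implements the proposal; the generator name
``compute`` is hypothetical.

```python
def compute():
    try:
        yield 1
        return 42
    except StopIteration:
        # Not reached: the return cannot be intercepted by an except
        # clause inside the returning generator.
        return -1


g = compute()
first = next(g)
try:
    next(g)
except StopIteration as exc:
    returned = exc.value    # the value given to ``return``
```

The caller sees the returned value as the ``value`` attribute of the
``StopIteration`` that ends the iteration, so ``returned`` is 42.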


Rationale
=========

The Refactoring Principle
-------------------------

The rationale behind most of the semantics presented above stems from
the desire to be able to refactor generator code. It should be possible
to take a section of code containing one or more ``yield`` expressions,
move it into a separate function (using the usual techniques to deal
with references to variables in the surrounding scope, etc.), and
call the new function using a ``yield from`` expression.

The behaviour of the resulting compound generator should be, as far as
possible, exactly the same as the original unfactored generator in all
situations, including calls to ``next()``, ``send()``, ``throw()`` and
``close()``.

The semantics in cases of subiterators other than generators have been
chosen as a reasonable generalization of the generator case.


Finalization
------------

There was some debate as to whether explicitly finalizing the delegating
generator by calling its ``close()`` method while it is suspended at a
``yield from`` should also finalize the subiterator. An argument against
doing so is that it would result in premature finalization of the
subiterator if references to it exist elsewhere.

Consideration of non-refcounting Python implementations led to the
decision that this explicit finalization should be performed, so that
explicitly closing a factored generator has the same effect as doing
so to an unfactored one in all Python implementations.

The assumption made is that, in the majority of use cases, the subiterator
will not be shared. The rare case of a shared subiterator can be
accommodated by means of a wrapper that blocks ``throw()`` and ``close()``
calls, or by using a means other than ``yield from`` to call the
subiterator.
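
One possible shape for such a wrapper is sketched below. The class
``Shielded`` and the helper generators are hypothetical, and the sketch
assumes an interpreter that implements the proposed syntax, using the
Python 3 method name ``__next__``. The wrapper exposes only the basic
iterator protocol, so there is no ``throw()`` or ``close()`` for the
delegating generator to call.

```python
class Shielded:
    """Hypothetical wrapper that hides throw() and close()."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


def countdown(n):
    while n > 0:
        yield n
        n -= 1


def delegate(it):
    yield from it


# Shared subiterator, protected by the wrapper.
shared = countdown(3)
d = delegate(Shielded(shared))
first = next(d)
d.close()                          # does not reach ``shared``
still_alive = next(shared)         # the shared generator continues

# Same scenario without the wrapper: closing finalizes the subgenerator.
unshielded = countdown(3)
d2 = delegate(unshielded)
next(d2)
d2.close()
after_close = next(unshielded, "closed")
```

With the wrapper, ``still_alive`` is the next countdown value, 2. Without
it, the subgenerator has been finalized and ``after_close`` is the
default ``"closed"``.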


Generators as Threads
---------------------

A motivation for generators being able to return values concerns the
use of generators to implement lightweight threads. When using
generators in that way, it is reasonable to want to spread the
computation performed by the lightweight thread over many functions.
One would like to be able to call a subgenerator as though it were an
ordinary function, passing it parameters and receiving a returned
value.

Using the proposed syntax, a statement such as

::

    y = f(x)

where ``f`` is an ordinary function, can be transformed into a delegation
call

::

    y = yield from g(x)

where ``g`` is a generator. One can reason about the behaviour of the
resulting code by thinking of ``g`` as an ordinary function that can be
suspended using a ``yield`` statement.
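
The transformation can be sketched with a toy round-robin scheduler. It
assumes an interpreter that implements the proposed syntax. All names
here (``fetch``, ``thread``, ``run``) are hypothetical and stand in for
whatever a real lightweight-thread library would provide.

```python
from collections import deque


def fetch(key):
    """Suspendable function: yields once, then returns a result."""
    yield                    # suspension point; other threads run here
    return key.upper()


def thread(name, key, out):
    value = yield from fetch(key)    # reads like ``value = fetch(key)``
    out.append((name, value))


def run(threads):
    """Resume each thread in turn until all have finished."""
    ready = deque(threads)
    while ready:
        t = ready.popleft()
        try:
            next(t)
        except StopIteration:
            continue         # this thread has finished
        ready.append(t)


out = []
run([thread("a", "x", out), thread("b", "y", out)])
```

Both threads suspend inside ``fetch`` and are later resumed there, yet
the body of ``thread`` reads like straight-line code.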

When using generators as threads in this way, typically one is not
interested in the values being passed in or out of the yields.
However, there are use cases for this as well, where the thread is
seen as a producer or consumer of items. The ``yield from``
expression allows the logic of the thread to be spread over as
many functions as desired, with the production or consumption of
items occurring in any subfunction, and with the items automatically
routed to or from their ultimate source or destination.

Concerning ``throw()`` and ``close()``, it is reasonable to expect
that if an exception is thrown into the thread from outside, it should
first be raised in the innermost generator where the thread is suspended,
and propagate outwards from there; and that if the thread is terminated
from outside by calling ``close()``, the chain of active generators
should be finalised from the innermost outwards.
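
The expected ordering can be checked with a small sketch, assuming an
interpreter that implements the proposed syntax. The three generator
names are hypothetical.

```python
def inner(log):
    try:
        yield "suspended"
    finally:
        log.append("inner finalised")


def middle(log):
    try:
        yield from inner(log)
    finally:
        log.append("middle finalised")


def outer(log):
    try:
        yield from middle(log)
    finally:
        log.append("outer finalised")


# Terminating the thread from outside with close().
closed_log = []
t = outer(closed_log)
next(t)
t.close()

# Throwing an exception into the thread from outside.
thrown_log = []
t2 = outer(thrown_log)
next(t2)
try:
    t2.throw(ValueError("boom"))
except ValueError:
    pass    # propagated out of all three generators
```

In both cases the log records the innermost generator first and the
outermost last.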


Syntax
------

The particular syntax proposed has been chosen as suggestive of its
meaning, while not introducing any new keywords and clearly standing
out as being different from a plain ``yield``.


Optimisations
-------------

Using a specialised syntax opens up possibilities for optimisation
when there is a long chain of generators. Such chains can arise, for
instance, when recursively traversing a tree structure. The overhead
of passing ``next()`` calls and yielded values down and up the chain
can cause what ought to be an O(n) operation to become, in the worst
case, O(n\*\*2).
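
A recursive traversal of this kind might look like the following sketch,
which assumes an interpreter that implements the proposed syntax. The
``Node`` class and ``inorder`` generator are hypothetical. Each value
produced at depth d passes through d suspended generators on its way
out, which is where the worst-case quadratic cost comes from when the
tree degenerates into a long chain.

```python
class Node:
    """Minimal binary tree node used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Yield the values of the tree rooted at node in sorted order."""
    if node is not None:
        yield from inorder(node.left)     # one delegation level per depth
        yield node.value
        yield from inorder(node.right)


tree = Node(2, Node(1), Node(3))
values = list(inorder(tree))

# A degenerate tree: a chain of right children, depth equal to size.
chain = None
for v in range(5, 0, -1):
    chain = Node(v, right=chain)
chain_values = list(inorder(chain))
```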

A possible strategy is to add a slot to generator objects to hold a
generator being delegated to. When a ``next()`` or ``send()`` call is
made on the generator, this slot is checked first, and if it is
nonempty, the generator that it references is resumed instead. If it
raises StopIteration, the slot is cleared and the main generator is
resumed.

This would reduce the delegation overhead to a chain of C function
calls involving no Python code execution. A possible enhancement would
be to traverse the whole chain of generators in a loop and directly
resume the one at the end, although the handling of StopIteration is
more complicated then.


Use of StopIteration to return values
-------------------------------------

There are a variety of ways that the return value from the generator
could be passed back. Some alternatives include storing it as an
attribute of the generator-iterator object, or returning it as the
value of the ``close()`` call to the subgenerator. However, the proposed
mechanism is attractive for a couple of reasons:

* Using the StopIteration exception makes it easy for other kinds
  of iterators to participate in the protocol without having to
  grow an extra attribute or a ``close()`` method.

* It simplifies the implementation, because the point at which the
  return value from the subgenerator becomes available is the same
  point at which StopIteration is raised. Delaying until any later
  time would require storing the return value somewhere.
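
The first point can be illustrated with a plain iterator class that is
not a generator. The sketch assumes an interpreter that implements the
proposed syntax and uses the Python 3 method name ``__next__``. The
names ``Countdown`` and ``launch`` are hypothetical.

```python
class Countdown:
    """Plain iterator that reports a result through StopIteration."""

    def __init__(self, start):
        self.n = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.n == 0:
            # The argument plays the role of a generator's return value.
            raise StopIteration("liftoff")
        self.n -= 1
        return self.n + 1


def launch(log):
    result = yield from Countdown(3)
    log.append(result)


log = []
ticks = list(launch(log))
```

The iterator needs neither an extra attribute nor a ``close()`` method:
``ticks`` is ``[3, 2, 1]`` and ``log`` receives ``"liftoff"``.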


Criticisms
==========

Under this proposal, the value of a ``yield from`` expression would
be derived in a very different way from that of an ordinary ``yield``
expression. This suggests that some other syntax not containing the
word ``yield`` might be more appropriate, but no acceptable alternative
has so far been proposed.

It has been suggested that some mechanism other than ``return`` in
the subgenerator should be used to establish the value returned by
the ``yield from`` expression. However, this would interfere with
the goal of being able to think of the subgenerator as a suspendable
function, since it would not be able to return values in the same way
as other functions.

The use of an argument to StopIteration to pass the return value
has been criticised as an "abuse of exceptions", without any
concrete justification of this claim. In any case, this is only
one suggested implementation; another mechanism could be used
without losing any essential features of the proposal.

It has been suggested that a different exception, such as
GeneratorReturn, should be used instead of StopIteration to return a
value. However, no convincing practical reason for this has been put
forward, and the addition of a ``value`` attribute to StopIteration
mitigates any difficulties in extracting a return value from a
StopIteration exception that may or may not have one. Also, using a
different exception would mean that, unlike ordinary functions,
``return`` without a value in a generator would not be equivalent to
``return None``.


Alternative Proposals
=====================

Proposals along similar lines have been made before, some using the
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
more concise, it could be argued that it looks too similar to an
ordinary ``yield`` and the difference might be overlooked when reading
code.

To the author's knowledge, previous proposals have focused only on
yielding values, and thereby suffered from the criticism that the
two-line for-loop they replace is not sufficiently tiresome to write
to justify a new syntax. By dealing with the full generator
protocol, this proposal provides considerably more benefit.


Additional Material
===================

Some examples of the use of the proposed syntax are available, and also a
prototype implementation based on the first optimisation outlined above.

`Examples and Implementation`_

.. _Examples and Implementation: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: