481 lines
17 KiB
ReStructuredText
481 lines
17 KiB
ReStructuredText
PEP: 380
|
||
Title: Syntax for Delegating to a Subgenerator
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
|
||
Status: Final
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 13-Feb-2009
|
||
Python-Version: 3.3
|
||
Post-History:
|
||
Resolution: https://mail.python.org/pipermail/python-dev/2011-June/112010.html
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
A syntax is proposed for a generator to delegate part of its
|
||
operations to another generator. This allows a section of code
|
||
containing 'yield' to be factored out and placed in another generator.
|
||
Additionally, the subgenerator is allowed to return with a value, and
|
||
the value is made available to the delegating generator.
|
||
|
||
The new syntax also opens up some opportunities for optimisation when
|
||
one generator re-yields values produced by another.
|
||
|
||
PEP Acceptance
|
||
==============
|
||
|
||
Guido officially `accepted the PEP`_ on 26th June, 2011.
|
||
|
||
.. _accepted the PEP: https://mail.python.org/pipermail/python-dev/2011-June/112010.html
|
||
|
||
Motivation
|
||
==========
|
||
|
||
A Python generator is a form of coroutine, but has the limitation that
|
||
it can only yield to its immediate caller. This means that a piece of
|
||
code containing a ``yield`` cannot be factored out and put into a
|
||
separate function in the same way as other code. Performing such a
|
||
factoring causes the called function to itself become a generator, and
|
||
it is necessary to explicitly iterate over this second generator and
|
||
re-yield any values that it produces.
|
||
|
||
If yielding of values is the only concern, this can be performed
|
||
without much difficulty using a loop such as
|
||
|
||
::
|
||
|
||
for v in g:
|
||
yield v
|
||
|
||
However, if the subgenerator is to interact properly with the caller
|
||
in the case of calls to ``send()``, ``throw()`` and ``close()``,
|
||
things become considerably more difficult. As will be seen later, the
|
||
necessary code is very complicated, and it is tricky to handle all the
|
||
corner cases correctly.
|
||
|
||
A new syntax will be proposed to address this issue. In the simplest
|
||
use cases, it will be equivalent to the above for-loop, but it will
|
||
also handle the full range of generator behaviour, and allow generator
|
||
code to be refactored in a simple and straightforward way.
|
||
|
||
|
||
Proposal
|
||
========
|
||
|
||
The following new expression syntax will be allowed in the body of a
|
||
generator:
|
||
|
||
::
|
||
|
||
yield from <expr>
|
||
|
||
where <expr> is an expression evaluating to an iterable, from which an
|
||
iterator is extracted. The iterator is run to exhaustion, during which
|
||
time it yields and receives values directly to or from the caller of
|
||
the generator containing the ``yield from`` expression (the
|
||
"delegating generator").
|
||
|
||
Furthermore, when the iterator is another generator, the subgenerator
|
||
is allowed to execute a ``return`` statement with a value, and that
|
||
value becomes the value of the ``yield from`` expression.
|
||
|
||
The full semantics of the ``yield from`` expression can be described
|
||
in terms of the generator protocol as follows:
|
||
|
||
* Any values that the iterator yields are passed directly to the
|
||
caller.
|
||
|
||
* Any values sent to the delegating generator using ``send()`` are
|
||
passed directly to the iterator. If the sent value is None, the
|
||
iterator's ``__next__()`` method is called. If the sent value
|
||
is not None, the iterator's ``send()`` method is called. If the
|
||
call raises StopIteration, the delegating generator is resumed.
|
||
Any other exception is propagated to the delegating generator.
|
||
|
||
* Exceptions other than GeneratorExit thrown into the delegating
|
||
generator are passed to the ``throw()`` method of the iterator.
|
||
If the call raises StopIteration, the delegating generator is
|
||
resumed. Any other exception is propagated to the delegating
|
||
generator.
|
||
|
||
* If a GeneratorExit exception is thrown into the delegating
|
||
generator, or the ``close()`` method of the delegating generator
|
||
is called, then the ``close()`` method of the iterator is called
|
||
if it has one. If this call results in an exception, it is
|
||
propagated to the delegating generator. Otherwise,
|
||
GeneratorExit is raised in the delegating generator.
|
||
|
||
* The value of the ``yield from`` expression is the first argument
|
||
to the ``StopIteration`` exception raised by the iterator when
|
||
it terminates.
|
||
|
||
* ``return expr`` in a generator causes ``StopIteration(expr)`` to
|
||
be raised upon exit from the generator.
|
||
|
||
|
||
Enhancements to StopIteration
|
||
-----------------------------
|
||
|
||
For convenience, the ``StopIteration`` exception will be given a
|
||
``value`` attribute that holds its first argument, or None if there
|
||
are no arguments.
|
||
|
||
|
||
Formal Semantics
|
||
----------------
|
||
|
||
Python 3 syntax is used in this section.
|
||
|
||
1. The statement ::
|
||
|
||
RESULT = yield from EXPR
|
||
|
||
is semantically equivalent to ::
|
||
|
||
_i = iter(EXPR)
|
||
try:
|
||
_y = next(_i)
|
||
except StopIteration as _e:
|
||
_r = _e.value
|
||
else:
|
||
while 1:
|
||
try:
|
||
_s = yield _y
|
||
except GeneratorExit as _e:
|
||
try:
|
||
_m = _i.close
|
||
except AttributeError:
|
||
pass
|
||
else:
|
||
_m()
|
||
raise _e
|
||
except BaseException as _e:
|
||
_x = sys.exc_info()
|
||
try:
|
||
_m = _i.throw
|
||
except AttributeError:
|
||
raise _e
|
||
else:
|
||
try:
|
||
_y = _m(*_x)
|
||
except StopIteration as _e:
|
||
_r = _e.value
|
||
break
|
||
else:
|
||
try:
|
||
if _s is None:
|
||
_y = next(_i)
|
||
else:
|
||
_y = _i.send(_s)
|
||
except StopIteration as _e:
|
||
_r = _e.value
|
||
break
|
||
RESULT = _r
|
||
|
||
|
||
2. In a generator, the statement ::
|
||
|
||
return value
|
||
|
||
is semantically equivalent to ::
|
||
|
||
raise StopIteration(value)
|
||
|
||
except that, as currently, the exception cannot be caught by
|
||
``except`` clauses within the returning generator.
|
||
|
||
3. The StopIteration exception behaves as though defined thusly::
|
||
|
||
class StopIteration(Exception):
|
||
|
||
def __init__(self, *args):
|
||
if len(args) > 0:
|
||
self.value = args[0]
|
||
else:
|
||
self.value = None
|
||
Exception.__init__(self, *args)
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
The Refactoring Principle
|
||
-------------------------
|
||
|
||
The rationale behind most of the semantics presented above stems from
|
||
the desire to be able to refactor generator code. It should be
|
||
possible to take a section of code containing one or more ``yield``
|
||
expressions, move it into a separate function (using the usual
|
||
techniques to deal with references to variables in the surrounding
|
||
scope, etc.), and call the new function using a ``yield from``
|
||
expression.
|
||
|
||
The behaviour of the resulting compound generator should be, as far as
|
||
reasonably practicable, the same as the original unfactored generator
|
||
in all situations, including calls to ``__next__()``, ``send()``,
|
||
``throw()`` and ``close()``.
|
||
|
||
The semantics in cases of subiterators other than generators has been
|
||
chosen as a reasonable generalization of the generator case.
|
||
|
||
The proposed semantics have the following limitations with regard to
|
||
refactoring:
|
||
|
||
* A block of code that catches GeneratorExit without subsequently
|
||
re-raising it cannot be factored out while retaining exactly the
|
||
same behaviour.
|
||
|
||
* Factored code may not behave the same way as unfactored code if a
|
||
StopIteration exception is thrown into the delegating generator.
|
||
|
||
With use cases for these being rare to non-existent, it was not
|
||
considered worth the extra complexity required to support them.
|
||
|
||
|
||
Finalization
|
||
------------
|
||
|
||
There was some debate as to whether explicitly finalizing the
|
||
delegating generator by calling its ``close()`` method while it is
|
||
suspended at a ``yield from`` should also finalize the subiterator.
|
||
An argument against doing so is that it would result in premature
|
||
finalization of the subiterator if references to it exist elsewhere.
|
||
|
||
Consideration of non-refcounting Python implementations led to the
|
||
decision that this explicit finalization should be performed, so that
|
||
explicitly closing a factored generator has the same effect as doing
|
||
so to an unfactored one in all Python implementations.
|
||
|
||
The assumption made is that, in the majority of use cases, the
|
||
subiterator will not be shared. The rare case of a shared subiterator
|
||
can be accommodated by means of a wrapper that blocks ``throw()`` and
|
||
``close()`` calls, or by using a means other than ``yield from`` to
|
||
call the subiterator.
|
||
|
||
|
||
Generators as Threads
|
||
---------------------
|
||
|
||
A motivation for generators being able to return values concerns the
|
||
use of generators to implement lightweight threads. When using
|
||
generators in that way, it is reasonable to want to spread the
|
||
computation performed by the lightweight thread over many functions.
|
||
One would like to be able to call a subgenerator as though it were an
|
||
ordinary function, passing it parameters and receiving a returned
|
||
value.
|
||
|
||
Using the proposed syntax, a statement such as ::
|
||
|
||
y = f(x)
|
||
|
||
where f is an ordinary function, can be transformed into a delegation
|
||
call ::
|
||
|
||
y = yield from g(x)
|
||
|
||
where g is a generator. One can reason about the behaviour of the
|
||
resulting code by thinking of g as an ordinary function that can be
|
||
suspended using a ``yield`` statement.
|
||
|
||
When using generators as threads in this way, typically one is not
|
||
interested in the values being passed in or out of the yields.
|
||
However, there are use cases for this as well, where the thread is
|
||
seen as a producer or consumer of items. The ``yield from``
|
||
expression allows the logic of the thread to be spread over as many
|
||
functions as desired, with the production or consumption of items
|
||
occurring in any subfunction, and the items are automatically routed to
|
||
or from their ultimate source or destination.
|
||
|
||
Concerning ``throw()`` and ``close()``, it is reasonable to expect
|
||
that if an exception is thrown into the thread from outside, it should
|
||
first be raised in the innermost generator where the thread is
|
||
suspended, and propagate outwards from there; and that if the thread
|
||
is terminated from outside by calling ``close()``, the chain of active
|
||
generators should be finalised from the innermost outwards.
|
||
|
||
|
||
Syntax
|
||
------
|
||
|
||
The particular syntax proposed has been chosen as suggestive of its
|
||
meaning, while not introducing any new keywords and clearly standing
|
||
out as being different from a plain ``yield``.
|
||
|
||
|
||
Optimisations
|
||
-------------
|
||
|
||
Using a specialised syntax opens up possibilities for optimisation
|
||
when there is a long chain of generators. Such chains can arise, for
|
||
instance, when recursively traversing a tree structure. The overhead
|
||
of passing ``__next__()`` calls and yielded values down and up the
|
||
chain can cause what ought to be an O(n) operation to become, in the
|
||
worst case, O(n\*\*2).
|
||
|
||
A possible strategy is to add a slot to generator objects to hold a
|
||
generator being delegated to. When a ``__next__()`` or ``send()``
|
||
call is made on the generator, this slot is checked first, and if it
|
||
is nonempty, the generator that it references is resumed instead. If
|
||
it raises StopIteration, the slot is cleared and the main generator is
|
||
resumed.
|
||
|
||
This would reduce the delegation overhead to a chain of C function
|
||
calls involving no Python code execution. A possible enhancement
|
||
would be to traverse the whole chain of generators in a loop and
|
||
directly resume the one at the end, although the handling of
|
||
StopIteration is more complicated then.
|
||
|
||
|
||
Use of StopIteration to return values
|
||
-------------------------------------
|
||
|
||
There are a variety of ways that the return value from the generator
|
||
could be passed back. Some alternatives include storing it as an
|
||
attribute of the generator-iterator object, or returning it as the
|
||
value of the ``close()`` call to the subgenerator. However, the
|
||
proposed mechanism is attractive for a couple of reasons:
|
||
|
||
* Using a generalization of the StopIteration exception makes it easy
|
||
for other kinds of iterators to participate in the protocol without
|
||
having to grow an extra attribute or a close() method.
|
||
|
||
* It simplifies the implementation, because the point at which the
|
||
return value from the subgenerator becomes available is the same
|
||
point at which the exception is raised. Delaying until any later
|
||
time would require storing the return value somewhere.
|
||
|
||
|
||
Rejected Ideas
|
||
--------------
|
||
|
||
Some ideas were discussed but rejected.
|
||
|
||
Suggestion: There should be some way to prevent the initial call to
|
||
__next__(), or substitute it with a send() call with a specified
|
||
value, the intention being to support the use of generators wrapped so
|
||
that the initial __next__() is performed automatically.
|
||
|
||
Resolution: Outside the scope of the proposal. Such generators should
|
||
not be used with ``yield from``.
|
||
|
||
Suggestion: If closing a subiterator raises StopIteration with a
|
||
value, return that value from the ``close()`` call to the delegating
|
||
generator.
|
||
|
||
The motivation for this feature is so that the end of a stream of
|
||
values being sent to a generator can be signalled by closing the
|
||
generator. The generator would catch GeneratorExit, finish its
|
||
computation and return a result, which would then become the return
|
||
value of the close() call.
|
||
|
||
Resolution: This usage of close() and GeneratorExit would be
|
||
incompatible with their current role as a bail-out and clean-up
|
||
mechanism. It would require that when closing a delegating generator,
|
||
after the subgenerator is closed, the delegating generator be resumed
|
||
instead of re-raising GeneratorExit. But this is not acceptable,
|
||
because it would fail to ensure that the delegating generator is
|
||
finalised properly in the case where close() is being called for
|
||
cleanup purposes.
|
||
|
||
Signalling the end of values to a consumer is better addressed by
|
||
other means, such as sending in a sentinel value or throwing in an
|
||
exception agreed upon by the producer and consumer. The consumer can
|
||
then detect the sentinel or exception and respond by finishing its
|
||
computation and returning normally. Such a scheme behaves correctly
|
||
in the presence of delegation.
|
||
|
||
Suggestion: If ``close()`` is not to return a value, then raise an
|
||
exception if StopIteration with a non-None value occurs.
|
||
|
||
Resolution: No clear reason to do so. Ignoring a return value is not
|
||
considered an error anywhere else in Python.
|
||
|
||
|
||
Criticisms
|
||
==========
|
||
|
||
Under this proposal, the value of a ``yield from`` expression would be
|
||
derived in a very different way from that of an ordinary ``yield``
|
||
expression. This suggests that some other syntax not containing the
|
||
word ``yield`` might be more appropriate, but no acceptable
|
||
alternative has so far been proposed. Rejected alternatives include
|
||
``call``, ``delegate`` and ``gcall``.
|
||
|
||
It has been suggested that some mechanism other than ``return`` in the
|
||
subgenerator should be used to establish the value returned by the
|
||
``yield from`` expression. However, this would interfere with the
|
||
goal of being able to think of the subgenerator as a suspendable
|
||
function, since it would not be able to return values in the same way
|
||
as other functions.
|
||
|
||
The use of an exception to pass the return value has been criticised
|
||
as an "abuse of exceptions", without any concrete justification of
|
||
this claim. In any case, this is only one suggested implementation;
|
||
another mechanism could be used without losing any essential features
|
||
of the proposal.
|
||
|
||
It has been suggested that a different exception, such as
|
||
GeneratorReturn, should be used instead of StopIteration to return a
|
||
value. However, no convincing practical reason for this has been put
|
||
forward, and the addition of a ``value`` attribute to StopIteration
|
||
mitigates any difficulties in extracting a return value from a
|
||
StopIteration exception that may or may not have one. Also, using a
|
||
different exception would mean that, unlike ordinary functions,
|
||
'return' without a value in a generator would not be equivalent to
|
||
'return None'.
|
||
|
||
|
||
Alternative Proposals
|
||
=====================
|
||
|
||
Proposals along similar lines have been made before, some using the
|
||
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
|
||
more concise, it could be argued that it looks too similar to an
|
||
ordinary ``yield`` and the difference might be overlooked when reading
|
||
code.
|
||
|
||
To the author's knowledge, previous proposals have focused only on
|
||
yielding values, and thereby suffered from the criticism that the
|
||
two-line for-loop they replace is not sufficiently tiresome to write
|
||
to justify a new syntax. By dealing with the full generator protocol,
|
||
this proposal provides considerably more benefit.
|
||
|
||
|
||
Additional Material
|
||
===================
|
||
|
||
Some examples of the use of the proposed syntax are available, and
|
||
also a prototype implementation based on the first optimisation
|
||
outlined above.
|
||
|
||
`Examples and Implementation`_
|
||
|
||
.. _Examples and Implementation:
|
||
http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
|
||
|
||
A version of the implementation updated for Python 3.3 is available from
|
||
tracker `issue #11682`_
|
||
|
||
.. _issue #11682:
|
||
http://bugs.python.org/issue11682
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|