348 lines
11 KiB
Plaintext
348 lines
11 KiB
Plaintext
|
PEP: 380
|
||
|
Title: Syntax for Delegating to a Subgenerator
|
||
|
Version: $Revision$
|
||
|
Last-Modified: $Date$
|
||
|
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
|
||
|
Status: Draft
|
||
|
Type: Standards Track
|
||
|
Content-Type: text/x-rst
|
||
|
Created: 13-Feb-2009
|
||
|
Python-Version: 2.7
|
||
|
Post-History:
|
||
|
|
||
|
|
||
|
Abstract
|
||
|
========
|
||
|
|
||
|
A syntax is proposed for a generator to delegate part of its
|
||
|
operations to another generator. This allows a section of code
|
||
|
containing 'yield' to be factored out and placed in another
|
||
|
generator. Additionally, the subgenerator is allowed to return with a
|
||
|
value, and the value is made available to the delegating generator.
|
||
|
|
||
|
The new syntax also opens up some opportunities for optimisation when
|
||
|
one generator re-yields values produced by another.
|
||
|
|
||
|
|
||
|
Proposal
|
||
|
========
|
||
|
|
||
|
The following new expression syntax will be allowed in the body of a
|
||
|
generator:
|
||
|
|
||
|
::
|
||
|
|
||
|
yield from <expr>
|
||
|
|
||
|
where <expr> is an expression evaluating to an iterable, from which an
|
||
|
iterator is extracted. The iterator is run to exhaustion, during which
|
||
|
time it yields and receives values directly to or from the caller of
|
||
|
the generator containing the ``yield from`` expression (the
|
||
|
"delegating generator").
|
||
|
|
||
|
When the iterator is another generator, the effect is the same as if
|
||
|
the body of the subgenerator were inlined at the point of the ``yield
|
||
|
from`` expression. Furthermore, the subgenerator is allowed to execute
|
||
|
a ``return`` statement with a value, and that value becomes the value of
|
||
|
the ``yield from`` expression.
|
||
|
|
||
|
In general, the semantics can be understood in terms of the iterator
|
||
|
protocol as follows:
|
||
|
|
||
|
* Any values that the iterator yields are passed directly to the
|
||
|
caller.
|
||
|
|
||
|
* Any values sent to the delegating generator using ``send()``
|
||
|
are passed directly to the iterator. If the sent value is None,
|
||
|
the iterator's ``next()`` method is called. If the sent value is
|
||
|
not None, the iterator's ``send()`` method is called. Any exception
|
||
|
resulting from attempting to call ``next`` or ``send`` is raised
|
||
|
in the delegating generator.
|
||
|
|
||
|
* Calls to the ``throw()`` method of the delegating generator are
|
||
|
forwarded to the iterator. If the iterator does not have a
|
||
|
``throw()`` method, the thrown-in exception is raised in the
|
||
|
delegating generator.
|
||
|
|
||
|
* If the delegating generator's ``close()`` method is called, the
|
||
|
``close() method of the iterator is called first if it has one,
|
||
|
then the delegating generator is finalised.
|
||
|
|
||
|
* The value of the ``yield from`` expression is the first argument
|
||
|
to the ``StopIteration`` exception raised by the iterator when it
|
||
|
terminates.
|
||
|
|
||
|
* ``return expr`` in a generator causes ``StopIteration(expr)`` to
|
||
|
be raised.
|
||
|
|
||
|
For convenience, the ``StopIteration`` exception will be given a
|
||
|
``value`` attribute that holds its first argument, or None if there
|
||
|
are no arguments.
|
||
|
|
||
|
|
||
|
Formal Semantics
|
||
|
----------------
|
||
|
|
||
|
1. The statement
|
||
|
|
||
|
::
|
||
|
|
||
|
result = yield from expr
|
||
|
|
||
|
is semantically equivalent to
|
||
|
|
||
|
::
|
||
|
|
||
|
_i = iter(expr)
|
||
|
try:
|
||
|
_u = _i.next()
|
||
|
while 1:
|
||
|
try:
|
||
|
_v = yield _u
|
||
|
except Exception, _e:
|
||
|
_m = getattr(_i, 'throw', None)
|
||
|
if _m is not None:
|
||
|
_u = _m(_e)
|
||
|
else:
|
||
|
raise
|
||
|
else:
|
||
|
if _v is None:
|
||
|
_u = _i.next()
|
||
|
else:
|
||
|
_u = _i.send(_v)
|
||
|
except StopIteration, _e:
|
||
|
result = _e.value
|
||
|
finally:
|
||
|
_m = getattr(_i, 'close', None)
|
||
|
if _m is not None:
|
||
|
_m()
|
||
|
|
||
|
except that implementations are free to cache bound methods for the 'next',
|
||
|
'send', 'throw' and 'close' methods of the iterator.
|
||
|
|
||
|
2. In a generator, the statement
|
||
|
|
||
|
::
|
||
|
|
||
|
return value
|
||
|
|
||
|
is semantically equivalent to
|
||
|
|
||
|
::
|
||
|
|
||
|
raise StopIteration(value)
|
||
|
|
||
|
except that, as currently, the exception cannot be caught by ``except``
|
||
|
clauses within the returning generator.
|
||
|
|
||
|
3. The StopIteration exception behaves as though defined thusly:
|
||
|
|
||
|
::
|
||
|
|
||
|
class StopIteration(Exception):
|
||
|
|
||
|
def __init__(self, *args):
|
||
|
if len(args) > 0:
|
||
|
self.value = args[0]
|
||
|
else:
|
||
|
self.value = None
|
||
|
Exception.__init__(self, *args)
|
||
|
|
||
|
|
||
|
Rationale
|
||
|
=========
|
||
|
|
||
|
A Python generator is a form of coroutine, but has the limitation that
|
||
|
it can only yield to its immediate caller. This means that a piece of
|
||
|
code containing a ``yield`` cannot be factored out and put into a
|
||
|
separate function in the same way as other code. Performing such a
|
||
|
factoring causes the called function to itself become a generator, and
|
||
|
it is necessary to explicitly iterate over this second generator and
|
||
|
re-yield any values that it produces.
|
||
|
|
||
|
If yielding of values is the only concern, this is not very arduous
|
||
|
and can be performed with a loop such as
|
||
|
|
||
|
::
|
||
|
|
||
|
for v in g:
|
||
|
yield v
|
||
|
|
||
|
However, if the subgenerator is to interact properly with the caller
|
||
|
in the case of calls to ``send()``, ``throw()`` and ``close()``, things
|
||
|
become considerably more complicated. As the formal expansion presented
|
||
|
above illustrates, the necessary code is very longwinded, and it is tricky
|
||
|
to handle all the corner cases correctly. In this situation, the advantages
|
||
|
of a specialised syntax should be clear.
|
||
|
|
||
|
|
||
|
Generators as Threads
|
||
|
---------------------
|
||
|
|
||
|
A motivating use case for generators being able to return values
|
||
|
concerns the use of generators to implement lightweight threads. When
|
||
|
using generators in that way, it is reasonable to want to spread the
|
||
|
computation performed by the lightweight thread over many functions.
|
||
|
One would like to be able to call a subgenerator as though it were
|
||
|
an ordinary function, passing it parameters and receiving a returned
|
||
|
value.
|
||
|
|
||
|
Using the proposed syntax, a statement such as
|
||
|
|
||
|
::
|
||
|
|
||
|
y = f(x)
|
||
|
|
||
|
where f is an ordinary function, can be transformed into a delegation
|
||
|
call
|
||
|
|
||
|
::
|
||
|
|
||
|
y = yield from g(x)
|
||
|
|
||
|
where g is a generator. One can reason about the behaviour of the
|
||
|
resulting code by thinking of g as an ordinary function that can be
|
||
|
suspended using a ``yield`` statement.
|
||
|
|
||
|
When using generators as threads in this way, typically one is not
|
||
|
interested in the values being passed in or out of the yields.
|
||
|
However, there are use cases for this as well, where the thread is
|
||
|
seen as a producer or consumer of items. The ``yield from``
|
||
|
expression allows the logic of the thread to be spread over as
|
||
|
many functions as desired, with the production or consumption of
|
||
|
items occuring in any subfunction, and the items are automatically
|
||
|
routed to or from their ultimate source or destination.
|
||
|
|
||
|
Concerning ``throw()`` and ``close()``, it is reasonable to expect
|
||
|
that if an exception is thrown into the thread from outside, it should
|
||
|
first be raised in the innermost generator where the thread is suspended,
|
||
|
and propagate outwards from there; and that if the thread is terminated
|
||
|
from outside by calling ``close()``, the chain of active generators
|
||
|
should be finalised from the innermost outwards.
|
||
|
|
||
|
|
||
|
Syntax
|
||
|
------
|
||
|
|
||
|
The particular syntax proposed has been chosen as suggestive of its
|
||
|
meaning, while not introducing any new keywords and clearly standing
|
||
|
out as being different from a plain ``yield``.
|
||
|
|
||
|
|
||
|
Optimisations
|
||
|
-------------
|
||
|
|
||
|
Using a specialised syntax opens up possibilities for optimisation
|
||
|
when there is a long chain of generators. Such chains can arise, for
|
||
|
instance, when recursively traversing a tree structure. The overhead
|
||
|
of passing ``next()`` calls and yielded values down and up the chain
|
||
|
can cause what ought to be an O(n) operation to become O(n\*\*2).
|
||
|
|
||
|
A possible strategy is to add a slot to generator objects to hold a
|
||
|
generator being delegated to. When a ``next()`` or ``send()`` call is
|
||
|
made on the generator, this slot is checked first, and if it is
|
||
|
nonempty, the generator that it references is resumed instead. If it
|
||
|
raises StopIteration, the slot is cleared and the main generator is
|
||
|
resumed.
|
||
|
|
||
|
This would reduce the delegation overhead to a chain of C function
|
||
|
calls involving no Python code execution. A possible enhancement would
|
||
|
be to traverse the whole chain of generators in a loop and directly
|
||
|
resume the one at the end, although the handling of StopIteration is
|
||
|
more complicated then.
|
||
|
|
||
|
|
||
|
Use of StopIteration to return values
|
||
|
-------------------------------------
|
||
|
|
||
|
There are a variety of ways that the return value from the generator
|
||
|
could be passed back. Some alternatives include storing it as an
|
||
|
attribute of the generator-iterator object, or returning it as the
|
||
|
value of the ``close()`` call to the subgenerator. However, the proposed
|
||
|
mechanism is attractive for a couple of reasons:
|
||
|
|
||
|
* Using the StopIteration exception makes it easy for other kinds
|
||
|
of iterators to participate in the protocol without having to
|
||
|
grow extra attributes or a close() method.
|
||
|
|
||
|
* It simplifies the implementation, because the point at which the
|
||
|
return value from the subgenerator becomes available is the same
|
||
|
point at which StopIteration is raised. Delaying until any later
|
||
|
time would require storing the return value somewhere.
|
||
|
|
||
|
|
||
|
Criticisms
|
||
|
==========
|
||
|
|
||
|
Under this proposal, the value of a ``yield from`` expression would
|
||
|
be derived in a very different way from that of an ordinary ``yield``
|
||
|
expression. This suggests that some other syntax not containing the
|
||
|
word ``yield`` might be more appropriate, but no acceptable alternative
|
||
|
has so far been proposed.
|
||
|
|
||
|
It has been suggested that some mechanism other than ``return`` in
|
||
|
the subgenerator should be used to establish the value returned by
|
||
|
the ``yield from`` expression. However, this would interfere with
|
||
|
the goal of being able to think of the subgenerator as a suspendable
|
||
|
function, since it would not be able to return values in the same way
|
||
|
as other functions.
|
||
|
|
||
|
The use of an argument to StopIteration to pass the return value
|
||
|
has been criticised as an "abuse of exceptions", without any
|
||
|
concrete justification of this claim. In any case, this is only
|
||
|
one suggested implementation; another mechanism could be used
|
||
|
without losing any essential features of the proposal.
|
||
|
|
||
|
It has been suggested that a different exception, such as
|
||
|
GeneratorReturn, should be used instead of StopIteration to return a
|
||
|
value. However, no convincing practical reason for this has been put
|
||
|
forward, and the addition of a ``value`` attribute to StopIteration
|
||
|
mitigates any difficulties in extracting a return value from a
|
||
|
StopIteration exception that may or may not have one. Also, using a
|
||
|
different exception would mean that, unlike ordinary functions,
|
||
|
'return' without a value in a generator would not be equivalent to
|
||
|
'return None'.
|
||
|
|
||
|
|
||
|
Alternative Proposals
|
||
|
=====================
|
||
|
|
||
|
Proposals along similar lines have been made before, some using the
|
||
|
syntax ``yield *`` instead of ``yield from``. While ``yield *`` is
|
||
|
more concise, it could be argued that it looks too similar to an
|
||
|
ordinary ``yield`` and the difference might be overlooked when reading
|
||
|
code.
|
||
|
|
||
|
To the author's knowledge, previous proposals have focused only on
|
||
|
yielding values, and thereby suffered from the criticism that the
|
||
|
two-line for-loop they replace is not sufficiently tiresome to write
|
||
|
to justify a new syntax. By also dealing with calls to ``send()``,
|
||
|
``throw()`` and ``close()``, this proposal provides considerably more
|
||
|
benefit.
|
||
|
|
||
|
|
||
|
Implementation
|
||
|
==============
|
||
|
|
||
|
A `prototype implementation`_ is available, based on the first
|
||
|
optimisation outlined above.
|
||
|
|
||
|
.. _prototype implementation: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/
|
||
|
|
||
|
|
||
|
Copyright
|
||
|
=========
|
||
|
|
||
|
This document has been placed in the public domain.
|
||
|
|
||
|
|
||
|
|
||
|
..
|
||
|
Local Variables:
|
||
|
mode: indented-text
|
||
|
indent-tabs-mode: nil
|
||
|
sentence-end-double-space: t
|
||
|
fill-column: 70
|
||
|
coding: utf-8
|
||
|
End:
|