PEP: 479
Title: Change StopIteration handling inside generators
Version: $Revision$
Last-Modified: $Date$
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Nov-2014
Python-Version: 3.5
Post-History: 15-Nov-2014, 19-Nov-2014


Abstract
========

This PEP proposes a semantic change to ``StopIteration`` when raised
inside a generator. This would unify the behaviour of list
comprehensions and generator expressions, reducing surprises such as
the one that started this discussion [1]_. This is also the main
backwards incompatibility of the proposal -- any generator that
depends on raising ``StopIteration`` to terminate it will
have to be rewritten to either catch that exception or use a for-loop.


Rationale
=========

The interaction of generators and ``StopIteration`` is currently
somewhat surprising, and can conceal obscure bugs. An unexpected
exception should not result in subtly altered behaviour, but should
cause a noisy and easily-debugged traceback. Currently,
``StopIteration`` can be absorbed by the generator construct.


Background information
======================

When a generator frame is (re)started as a result of a ``__next__()``
(or ``send()`` or ``throw()``) call, one of three outcomes can occur:

* A yield point is reached, and the yielded value is returned.
* The frame is returned from; ``StopIteration`` is raised.
* An exception is raised, which bubbles out.

In the latter two cases the frame is abandoned (and the generator
object's ``gi_frame`` attribute is set to ``None``).
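The three outcomes can be observed from the calling side. The
following is a minimal sketch; ``demo`` and ``outcome`` are
hypothetical names introduced only for this illustration.

```python
def demo(mode):
    """Hypothetical generator: yields once, then returns or raises."""
    yield "value"
    if mode == "raise":
        raise ValueError("boom")
    return "done"  # carried to the caller as StopIteration.value


def outcome(gen):
    """Resume gen once and report which of the three outcomes occurred."""
    try:
        return ("yielded", next(gen))
    except StopIteration as exc:
        return ("returned", exc.value)
    except Exception as exc:
        return ("raised", type(exc).__name__)


g = demo("return")
first = outcome(g)               # a yield point is reached
second = outcome(g)              # the frame is returned from
frame_after_return = g.gi_frame  # frame has been abandoned

h = demo("raise")
outcome(h)                       # consume the single yield
third = outcome(h)               # an exception bubbles out
frame_after_raise = h.gi_frame   # frame has been abandoned
```

Note that the caller cannot tell, from the exception type alone,
whether a ``StopIteration`` came from a ``return`` or was raised by
code inside the frame; that ambiguity is what this PEP addresses.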


Proposal
========

If a ``StopIteration`` is about to bubble out of a generator frame, it
is replaced with ``RuntimeError``, which causes the ``next()`` call
(which invoked the generator) to fail, passing that exception out.
From then on it's just like any old exception. [3]_

This affects the third outcome listed above, without altering any
other effects. Furthermore, it only affects this outcome when the
exception raised is ``StopIteration`` (or a subclass thereof).

Note that the proposed replacement happens at the point where the
exception is about to bubble out of the frame, i.e. after any
``except`` or ``finally`` blocks that could affect it have been
exited. The ``StopIteration`` raised by returning from the frame is
not affected (the point being that ``StopIteration`` means that the
generator terminated "normally", i.e. it did not raise an exception).
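A minimal sketch of the proposed behaviour follows. It assumes an
interpreter on which this proposal is in effect (without it, the
second ``next()`` call would simply end the iteration and ``replaced``
would never be bound). ``leaky`` and ``clean`` are hypothetical names.

```python
def leaky():
    """Hypothetical generator that lets a StopIteration escape its frame."""
    yield 1
    next(iter([]))  # empty iterator: raises StopIteration inside the frame
    yield 2         # never reached


def clean():
    """Returning from the frame is not affected by the proposal."""
    yield 1
    return


gen = leaky()
first = next(gen)
try:
    next(gen)
except RuntimeError as exc:
    # The escaping StopIteration was replaced on its way out of the frame;
    # the original exception remains reachable through the chain.
    replaced = exc
    cause = exc.__cause__

normal = list(clean())  # the return still ends iteration quietly
```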


Consequences for existing code
==============================

This change will affect existing code that depends on
``StopIteration`` bubbling up. The pure Python reference
implementation of ``groupby`` [2]_ currently has comments "Exit on
``StopIteration``" where it is expected that the exception will
propagate and then be handled. Such reliance is unusual, but not
unknown, and these constructs will fail under this proposal. Other
examples abound, e.g. [5]_, [6]_.

(Nick Coghlan comments: """If you wanted to factor out a helper
function that terminated the generator you'd have to do "return
yield from helper()" rather than just "helper()".""")
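The kind of rewrite this implies can be sketched as follows, assuming
an interpreter on which the proposal is in effect. Both functions are
hypothetical; the first relies on an escaping ``StopIteration`` to end
the generator, the second catches it and returns explicitly.

```python
def pairs_old_style(iterable):
    """Relies on next() raising StopIteration to terminate the generator."""
    it = iter(iterable)
    while True:
        a = next(it)  # an escaping StopIteration becomes RuntimeError
        b = next(it)
        yield a, b


def pairs(iterable):
    """Same logic, rewritten to catch the exception and return."""
    it = iter(iterable)
    while True:
        try:
            a = next(it)
            b = next(it)
        except StopIteration:
            return  # explicit, normal termination
        yield a, b


rewritten = list(pairs([1, 2, 3, 4, 5]))

try:
    list(pairs_old_style([1, 2, 3]))
    old_style_failed = False
except RuntimeError:
    old_style_failed = True
```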

There are also examples of generator expressions floating around that
rely on a ``StopIteration`` raised by the expression, the target or the
predicate (rather than by the ``__next__()`` call implied in the ``for``
loop proper).
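As an illustration, consider a predicate that raises
``StopIteration`` to cut a generator expression short. The sketch
below assumes an interpreter on which the proposal is in effect;
``until_three`` is a hypothetical helper. Without the proposal the
generator expression would silently produce ``[1, 2]``, while the list
comprehension would propagate the exception.

```python
from itertools import takewhile

data = [1, 2, 3, 4]


def until_three(x):
    """Hypothetical predicate that abuses StopIteration as a 'break'."""
    if x == 3:
        raise StopIteration
    return True


try:
    list(x for x in data if until_three(x))
    genexp_result = "completed"
except RuntimeError:
    genexp_result = "RuntimeError"

try:
    [x for x in data if until_three(x)]
    listcomp_result = "completed"
except StopIteration:
    listcomp_result = "StopIteration"

# An explicit spelling of the same intent that does not depend on
# exception handling inside the generator machinery.
explicit = list(takewhile(lambda x: x != 3, data))
```

With the proposal in effect, neither form terminates silently, which
is the unification described in the Abstract.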

As this can break code, it is proposed to utilize the ``__future__``
mechanism to introduce this in Python 3.5, finally making it standard
in Python 3.6 or 3.7. The proposed syntax is::

    from __future__ import generator_stop

Any generator function constructed under the influence of this
directive will have the REPLACE_STOPITERATION flag set on its code
object, and generators with the flag set will behave according to this
proposal. Once the feature becomes standard, the flag may be dropped;
code should not inspect generators for it.


Alternate proposals
===================

Raising something other than RuntimeError
-----------------------------------------

Rather than the generic ``RuntimeError``, it might make sense to raise
a new exception type ``UnexpectedStopIteration``. This has the
downside of implicitly encouraging that it be caught; the correct
action is to catch the original ``StopIteration``, not the chained
exception.


Supplying a specific exception to raise on return
-------------------------------------------------

Nick Coghlan suggested a means of providing a specific
``StopIteration`` instance to the generator; if any other instance of
``StopIteration`` is raised, it is an error, but if that particular
one is raised, the generator has properly completed. This subproposal
has been withdrawn in favour of better options, but is retained for
reference.


Making return-triggered StopIterations obvious
----------------------------------------------

For certain situations, a simpler and fully backward-compatible
solution may be sufficient: when a generator returns, instead of
raising ``StopIteration``, it raises a specific subclass of
``StopIteration`` (``GeneratorReturn``) which can then be detected.
If it is not that subclass, it is an escaping exception rather than a
return statement.

The inspiration for this alternative proposal was Nick's observation
[7]_ that if an ``asyncio`` coroutine [8]_ accidentally raises
``StopIteration``, it currently terminates silently, which may present
a hard-to-debug mystery to the developer. The main proposal turns
such accidents into clearly distinguishable ``RuntimeError`` exceptions,
but if that is rejected, this alternate proposal would enable
``asyncio`` to distinguish between a ``return`` statement and an
accidentally-raised ``StopIteration`` exception.

Of the three outcomes listed above:

* A yielded value, obviously, would still be returned.
* If the frame is returned from, ``GeneratorReturn`` is raised.
* If an instance of ``GeneratorReturn`` would be raised, instead an
  instance of ``StopIteration`` would be raised.

In the third case, the ``StopIteration`` would have the ``value`` of
the original ``GeneratorReturn``, and would reference the original
exception in its ``__cause__``. If uncaught, this would clearly show
the chaining of exceptions.

This alternative does *not* affect the discrepancy between generator
expressions and list comprehensions, but allows generator-aware code
(such as the ``contextlib`` and ``asyncio`` modules) to reliably
differentiate between the second and third outcomes listed above.

However, once code exists that depends on this distinction between
``GeneratorReturn`` and ``StopIteration``, a generator that invokes
another generator and relies on the latter's ``StopIteration`` to
bubble out would still be potentially wrong, depending on the use made
of the distinction between the two exception types.
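How generator-aware code might use such a distinction can be sketched
as below. This is only an emulation: ``GeneratorReturn`` is defined
here as an ordinary class because no such builtin exists, and the two
step functions raise the exceptions by hand to stand in for a
generator that returns and one that leaks a ``StopIteration``.

```python
class GeneratorReturn(StopIteration):
    """Hypothetical subclass from this alternative; not a real builtin."""


def classify(step):
    """Run one step and report how it terminated."""
    try:
        step()
    except GeneratorReturn as exc:
        # The subclass must be tested first, since it is also a StopIteration.
        return ("return", exc.value)
    except StopIteration as exc:
        return ("escaped", exc.value)
    return ("no exception", None)


def returning_step():
    # Stands in for a frame that executed "return 42".
    raise GeneratorReturn(42)


def leaking_step():
    # Stands in for a frame that let a plain StopIteration escape.
    raise StopIteration("oops")


returned = classify(returning_step)
leaked = classify(leaking_step)
```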


Criticism
=========

Unofficial and apocryphal statistics suggest that this is seldom, if
ever, a problem. [4]_ Code does exist which relies on the current
behaviour (e.g. [2]_, [5]_, [6]_), and there is the concern that this
would be unnecessary code churn to achieve little or no gain.

Steven D'Aprano started an informal survey on comp.lang.python [9]_;
at the time of writing only two responses have been received: one was
in favour of changing list comprehensions to match generator
expressions (!), the other was in favour of this PEP's main proposal.

The existing model has been compared to the perfectly-acceptable
issues inherent to every other case where an exception has special
meaning. For instance, an unexpected ``KeyError`` inside a
``__getitem__`` method will be interpreted as failure, rather than
permitted to bubble up. However, there is a difference. Dunder
methods use ``return`` to indicate normality, and ``raise`` to signal
abnormality; generators ``yield`` to indicate data, and ``return`` to
signal the abnormal state (the end of the data). This makes explicitly
raising ``StopIteration`` entirely redundant, and potentially
surprising. If other dunder methods had dedicated keywords to
distinguish between their return paths, they too could turn unexpected
exceptions into ``RuntimeError``; the fact that they cannot should not
preclude generators from doing so.
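The ``KeyError`` comparison can be made concrete with a small sketch.
``Config`` is a hypothetical class with a deliberate bug; it uses the
``collections.abc.Mapping`` mixin methods, which treat any ``KeyError``
from ``__getitem__`` as "key not present".

```python
from collections.abc import Mapping


class Config(Mapping):
    """Hypothetical mapping whose __getitem__ contains an internal bug."""

    def __init__(self, data):
        self._data = dict(data)
        self._aliases = {}  # bug: never populated

    def __getitem__(self, key):
        real = self._aliases[key]  # unexpected KeyError raised here
        return self._data[real]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


cfg = Config({"port": 8080})
value = cfg.get("port", "missing")  # the bug is reported as a missing key
present = "port" in cfg             # likewise absorbed by __contains__
```

The internal bug is absorbed in the same way that a stray
``StopIteration`` is absorbed by a generator today; the difference
argued above is that generators have ``return`` available as a
dedicated way to signal the end of the data.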


References
==========

.. [1] Initial mailing list comment
   (https://mail.python.org/pipermail/python-ideas/2014-November/029906.html)

.. [2] Pure Python implementation of groupby
   (https://docs.python.org/3/library/itertools.html#itertools.groupby)

.. [3] Proposal by GvR
   (https://mail.python.org/pipermail/python-ideas/2014-November/029953.html)

.. [4] Response by Steven D'Aprano
   (https://mail.python.org/pipermail/python-ideas/2014-November/029994.html)

.. [5] Split a sequence or generator using a predicate
   (http://code.activestate.com/recipes/578416-split-a-sequence-or-generator-using-a-predicate/)

.. [6] wrap unbounded generator to restrict its output
   (http://code.activestate.com/recipes/66427-wrap-unbounded-generator-to-restrict-its-output/)

.. [7] Post from Nick Coghlan mentioning asyncio
   (https://mail.python.org/pipermail/python-ideas/2014-November/029961.html)

.. [8] Coroutines in asyncio
   (https://docs.python.org/3/library/asyncio-task.html#coroutines)

.. [9] Thread on comp.lang.python started by Steven D'Aprano
   (https://mail.python.org/pipermail/python-list/2014-November/680757.html)

.. [10] Tracker issue with Proof-of-Concept patch
   (http://bugs.python.org/issue22906)

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: