PEP: 479
Title: Change StopIteration handling inside generators
Version: $Revision$
Last-Modified: $Date$
Author: Chris Angelico
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Nov-2014
Python-Version: 3.5
Post-History: 15-Nov-2014, 19-Nov-2014


Abstract
========

This PEP proposes a semantic change to ``StopIteration`` when raised
inside a generator.  This would unify the behaviour of list
comprehensions and generator expressions, reducing surprises such as
the one that started this discussion [1]_.  This is also the main
backwards incompatibility of the proposal -- any generator that
depends on raising ``StopIteration`` to terminate it will have to be
rewritten to either catch that exception or use a for-loop.


Rationale
=========

The interaction of generators and ``StopIteration`` is currently
somewhat surprising, and can conceal obscure bugs.  An unexpected
exception should not result in subtly altered behaviour, but should
cause a noisy and easily-debugged traceback.  Currently,
``StopIteration`` can be absorbed by the generator construct.


Background information
======================

When a generator frame is (re)started as a result of a ``__next__()``
(or ``send()`` or ``throw()``) call, one of three outcomes can occur:

* A yield point is reached, and the yielded value is returned.
* The frame is returned from; ``StopIteration`` is raised.
* An exception is raised, which bubbles out.

In the latter two cases the frame is abandoned (and the generator
object's ``gi_frame`` attribute is set to None).


Proposal
========

If a ``StopIteration`` is about to bubble out of a generator frame, it
is replaced with ``RuntimeError``, which causes the ``next()`` call
(which invoked the generator) to fail, passing that exception out.
From then on it's just like any old exception. [3]_

This affects the third outcome listed above, without altering any
other effects.
Furthermore, it only affects this outcome when the exception raised
is ``StopIteration`` (or a subclass thereof).

Note that the proposed replacement happens at the point where the
exception is about to bubble out of the frame, i.e. after any
``except`` or ``finally`` blocks that could affect it have been
exited.  The ``StopIteration`` raised by returning from the frame is
not affected (the point being that ``StopIteration`` means that the
generator terminated "normally", i.e. it did not raise an exception).


Consequences for existing code
==============================

This change will affect existing code that depends on
``StopIteration`` bubbling up.  The pure Python reference
implementation of ``groupby`` [2]_ currently has comments "Exit on
``StopIteration``" where it is expected that the exception will
propagate and then be handled.  This will be unusual, but not unknown,
and such constructs will fail.  Other examples abound, e.g. [5]_, [6]_.

(Nick Coghlan comments: """If you wanted to factor out a helper
function that terminated the generator you'd have to do "return
yield from helper()" rather than just "helper()".""")

There are also examples of generator expressions floating around that
rely on a ``StopIteration`` raised by the expression, the target or the
predicate (rather than by the ``__next__()`` call implied in the
``for`` loop proper).

As this can break code, it is proposed to utilize the ``__future__``
mechanism to introduce this in Python 3.5, finally making it standard
in Python 3.6 or 3.7.  The proposed syntax is::

    from __future__ import generator_stop

Any generator function constructed under the influence of this
directive will have the ``REPLACE_STOPITERATION`` flag set on its code
object, and generators with the flag set will behave according to this
proposal.  Once the feature becomes standard, the flag may be dropped;
code should not inspect generators for it.

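
As a rough illustration of the kind of rewrite this implies, the
sketch below shows a generator that calls ``next()`` directly and
catches ``StopIteration`` itself, returning explicitly instead of
letting the exception bubble out of the frame.  The name ``pairs`` and
its behaviour (dropping a trailing unpaired item) are hypothetical,
chosen only for this example; they are not taken from any of the code
cited above.

```python
def pairs(iterable):
    """Yield consecutive (a, b) pairs from iterable.

    Hypothetical example, not taken from the code cited in this PEP.
    A trailing unpaired item is silently dropped.
    """
    it = iter(iterable)
    while True:
        try:
            # Either call may raise StopIteration when the input runs out.
            first = next(it)
            second = next(it)
        except StopIteration:
            # Catch it here and return, so that termination is explicit
            # rather than relying on the exception escaping the frame.
            return
        yield first, second


print(list(pairs([1, 2, 3, 4, 5])))  # prints [(1, 2), (3, 4)]
```

Written this way, the generator should behave the same with or without
the ``generator_stop`` directive, since the only ``StopIteration`` that
leaves the frame is the one produced by ``return``.
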

Alternate proposals
===================

Raising something other than RuntimeError
-----------------------------------------

Rather than the generic ``RuntimeError``, it might make sense to raise
a new exception type ``UnexpectedStopIteration``.  This has the
downside of implicitly encouraging that it be caught; the correct
action is to catch the original ``StopIteration``, not the chained
exception.


Supplying a specific exception to raise on return
-------------------------------------------------

Nick Coghlan suggested a means of providing a specific
``StopIteration`` instance to the generator; if any other instance of
``StopIteration`` is raised, it is an error, but if that particular
one is raised, the generator has properly completed.  This subproposal
has been withdrawn in favour of better options, but is retained for
reference.


Making return-triggered StopIterations obvious
----------------------------------------------

For certain situations, a simpler and fully backward-compatible
solution may be sufficient: when a generator returns, instead of
raising ``StopIteration``, it raises a specific subclass of
``StopIteration`` (``GeneratorReturn``) which can then be detected.
If it is not that subclass, it is an escaping exception rather than a
return statement.

The inspiration for this alternative proposal was Nick's observation
[7]_ that if an ``asyncio`` coroutine [8]_ accidentally raises
``StopIteration``, it currently terminates silently, which may present
a hard-to-debug mystery to the developer.  The main proposal turns
such accidents into clearly distinguishable ``RuntimeError``
exceptions, but if that is rejected, this alternate proposal would
enable ``asyncio`` to distinguish between a ``return`` statement and
an accidentally-raised ``StopIteration`` exception.

Of the three outcomes listed above:

* A yielded value, obviously, would still be returned.
* If the frame is returned from, ``GeneratorReturn`` is raised.
* If an instance of ``GeneratorReturn`` would be raised, instead an
  instance of ``StopIteration`` would be raised.

In the third case, the ``StopIteration`` would have the ``value`` of
the original ``GeneratorReturn``, and would reference the original
exception in its ``__cause__``.  If uncaught, this would clearly show
the chaining of exceptions.

This alternative does *not* affect the discrepancy between generator
expressions and list comprehensions, but allows generator-aware code
(such as the ``contextlib`` and ``asyncio`` modules) to reliably
differentiate between the second and third outcomes listed above.

However, once code exists that depends on this distinction between
``GeneratorReturn`` and ``StopIteration``, a generator that invokes
another generator and relies on the latter's ``StopIteration`` to
bubble out would still be potentially wrong, depending on the use made
of the distinction between the two exception types.


Criticism
=========

Unofficial and apocryphal statistics suggest that this is seldom, if
ever, a problem. [4]_  Code does exist which relies on the current
behaviour (e.g. [2]_, [5]_, [6]_), and there is the concern that this
would be unnecessary code churn to achieve little or no gain.

Steven D'Aprano started an informal survey on comp.lang.python [9]_;
at the time of writing only two responses have been received: one was
in favor of changing list comprehensions to match generator
expressions (!), the other was in favor of this PEP's main proposal.

The existing model has been compared to the perfectly-acceptable
issues inherent to every other case where an exception has special
meaning.  For instance, an unexpected ``KeyError`` inside a
``__getitem__`` method will be interpreted as failure, rather than
permitted to bubble up.  However, there is a difference.  Dunder
methods use ``return`` to indicate normality, and ``raise`` to signal
abnormality; generators ``yield`` to indicate data, and ``return`` to
signal the abnormal state.
This makes explicitly raising ``StopIteration`` entirely redundant,
and potentially surprising.  If other dunder methods had dedicated
keywords to distinguish between their return paths, they too could
turn unexpected exceptions into ``RuntimeError``; the fact that they
cannot should not preclude generators from doing so.


References
==========

.. [1] Initial mailing list comment
   (https://mail.python.org/pipermail/python-ideas/2014-November/029906.html)

.. [2] Pure Python implementation of groupby
   (https://docs.python.org/3/library/itertools.html#itertools.groupby)

.. [3] Proposal by GvR
   (https://mail.python.org/pipermail/python-ideas/2014-November/029953.html)

.. [4] Response by Steven D'Aprano
   (https://mail.python.org/pipermail/python-ideas/2014-November/029994.html)

.. [5] Split a sequence or generator using a predicate
   (http://code.activestate.com/recipes/578416-split-a-sequence-or-generator-using-a-predicate/)

.. [6] wrap unbounded generator to restrict its output
   (http://code.activestate.com/recipes/66427-wrap-unbounded-generator-to-restrict-its-output/)

.. [7] Post from Nick Coghlan mentioning asyncio
   (https://mail.python.org/pipermail/python-ideas/2014-November/029961.html)

.. [8] Coroutines in asyncio
   (https://docs.python.org/3/library/asyncio-task.html#coroutines)

.. [9] Thread on comp.lang.python started by Steven D'Aprano
   (https://mail.python.org/pipermail/python-list/2014-November/680757.html)

.. [10] Tracker issue with Proof-of-Concept patch
   (http://bugs.python.org/issue22906)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: