Oops - remove accidentally-committed files
parent
ef8239fef0
commit
837ab51351
|
@ -1,603 +0,0 @@
|
||||||
PEP: 479
|
|
||||||
Title: Change StopIteration handling inside generators
|
|
||||||
Version: $Revision$
|
|
||||||
Last-Modified: $Date$
|
|
||||||
Author: Chris Angelico <rosuav@gmail.com>, Guido van Rossum <guido@python.org>
|
|
||||||
Status: Accepted
|
|
||||||
Type: Standards Track
|
|
||||||
Content-Type: text/x-rst
|
|
||||||
Created: 15-Nov-2014
|
|
||||||
Python-Version: 3.5
|
|
||||||
Post-History: 15-Nov-2014, 19-Nov-2014, 5-Dec-2014
|
|
||||||
|
|
||||||
|
|
||||||
Abstract
|
|
||||||
========
|
|
||||||
|
|
||||||
This PEP proposes a change to generators: when ``StopIteration`` is
|
|
||||||
raised inside a generator, it is replaced with ``RuntimeError``.
|
|
||||||
(More precisely, this happens when the exception is about to bubble
|
|
||||||
out of the generator's stack frame.) Because the change is backwards
|
|
||||||
incompatible, the feature is initially introduced using a
|
|
||||||
``__future__`` statement.
|
|
||||||
|
|
||||||
|
|
||||||
Acceptance
|
|
||||||
==========
|
|
||||||
|
|
||||||
This PEP was accepted by the BDFL on November 22. Because of the
|
|
||||||
exceptionally short period from first draft to acceptance, the main
|
|
||||||
objections brought up after acceptance were carefully considered and
|
|
||||||
have been reflected in the "Alternate proposals" section below.
|
|
||||||
However, none of the discussion changed the BDFL's mind and the PEP's
|
|
||||||
acceptance is now final. (Suggestions for clarifying edits are still
|
|
||||||
welcome -- unlike IETF RFCs, the text of a PEP is not cast in stone
|
|
||||||
after its acceptance, although the core design/plan/specification
|
|
||||||
should not change after acceptance.)
|
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
|
||||||
=========
|
|
||||||
|
|
||||||
The interaction of generators and ``StopIteration`` is currently
|
|
||||||
somewhat surprising, and can conceal obscure bugs. An unexpected
|
|
||||||
exception should not result in subtly altered behaviour, but should
|
|
||||||
cause a noisy and easily-debugged traceback. Currently,
|
|
||||||
``StopIteration`` can be absorbed by the generator construct.
|
|
||||||
|
|
||||||
The main goal of the proposal is to ease debugging in the situation
|
|
||||||
where an unguarded ``next()`` call (perhaps several stack frames deep)
|
|
||||||
raises ``StopIteration`` and causes the iteration controlled by the
|
|
||||||
generator to terminate silently. (When another exception is raised, a
|
|
||||||
traceback is printed pinpointing the cause of the problem.)
|
|
||||||
|
|
||||||
This is particularly pernicious in combination with the ``yield from``
|
|
||||||
construct of PEP 380 [1]_, as it breaks the abstraction that a
|
|
||||||
subgenerator may be factored out of a generator. That PEP notes this
|
|
||||||
limitation, but notes that "use cases for these [are] rare to
non-existent".  Unfortunately, while intentional use is rare, it is easy to
|
|
||||||
stumble on these cases by accident::
|
|
||||||
|
|
||||||
    import contextlib

    @contextlib.contextmanager
    def transaction():
        print('begin')
        try:
            yield from do_it()
        except:
            print('rollback')
            raise
        else:
            print('commit')

    def do_it():
        print('Refactored initial setup')
        yield # Body of with-statement is executed here
        print('Refactored finalization of successful transaction')


    import pathlib

    with transaction():
        print('commit file {}'.format(
            # I can never remember what the README extension is
            next(pathlib.Path('/some/dir').glob('README*'))))
|
|
||||||
|
|
||||||
Here factoring out ``do_it`` into a subgenerator has introduced a subtle
|
|
||||||
bug: if a bug in the wrapped block allows ``StopIteration`` to escape
|
|
||||||
(here because the README doesn't exist), under the current behavior
|
|
||||||
``do_it`` will properly abort, but the exception will be swallowed by
|
|
||||||
the ``yield from`` and the original context manager will commit the
|
|
||||||
unfinished transaction! Similarly problematic behavior occurs when an
|
|
||||||
``asyncio`` coroutine raises ``StopIteration``, causing it to terminate
|
|
||||||
silently. In both cases, the refactoring abstraction of ``yield from``
|
|
||||||
breaks in the presence of bugs in client code.
|
|
||||||
|
|
||||||
Additionally, the proposal reduces the difference between list
|
|
||||||
comprehensions and generator expressions, preventing surprises such as
|
|
||||||
the one that started this discussion [2]_. Henceforth, the following
|
|
||||||
statements will produce the same result if either produces a result at
|
|
||||||
all::
|
|
||||||
|
|
||||||
    a = list(F(x) for x in xs if P(x))
    a = [F(x) for x in xs if P(x)]
|
|
||||||
|
|
||||||
With the current state of affairs, it is possible to write a function
|
|
||||||
``F(x)`` or a predicate ``P(x)`` that causes the first form to produce
|
|
||||||
a (truncated) result, while the second form raises an exception
|
|
||||||
(namely, ``StopIteration``). With the proposed change, both forms
|
|
||||||
will raise an exception at this point (albeit ``RuntimeError`` in the
|
|
||||||
first case and ``StopIteration`` in the second).
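The difference between the two forms can be observed directly. The following is a minimal sketch, assuming an interpreter where the new semantics are already the default (Python 3.7 or later); ``F``, ``xs`` and ``outcome`` are made-up stand-ins rather than anything defined by this PEP.

```python
def F(x):
    # Hypothetical function that leaks StopIteration for one input.
    if x == 2:
        raise StopIteration
    return x * 10

xs = [0, 1, 2, 3]

def outcome(thunk):
    """Return thunk()'s result, or the name of the exception it raised."""
    try:
        return thunk()
    except (RuntimeError, StopIteration) as exc:
        return type(exc).__name__

# Generator expression: the leaked StopIteration is replaced on its way out.
genexp_outcome = outcome(lambda: list(F(x) for x in xs))

# List comprehension: the StopIteration propagates unchanged.
listcomp_outcome = outcome(lambda: [F(x) for x in xs])
```

Neither form silently produces a truncated list here; both fail, differing only in the exception type.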
|
|
||||||
|
|
||||||
Finally, the proposal also clears up the confusion about how to
|
|
||||||
terminate a generator: the proper way is ``return``, not
|
|
||||||
``raise StopIteration``.
|
|
||||||
|
|
||||||
As an added bonus, the above changes bring generator functions much
|
|
||||||
more in line with regular functions. If you wish to take a piece of
|
|
||||||
code presented as a generator and turn it into something else, you
|
|
||||||
can usually do this fairly simply, by replacing every ``yield`` with
|
|
||||||
a call to ``print()`` or ``list.append()``; however, if there are any
|
|
||||||
bare ``next()`` calls in the code, you have to be aware of them. If
|
|
||||||
the code was originally written without relying on ``StopIteration``
|
|
||||||
terminating the function, the transformation would be that much
|
|
||||||
easier.
|
|
||||||
|
|
||||||
|
|
||||||
Background information
|
|
||||||
======================
|
|
||||||
|
|
||||||
When a generator frame is (re)started as a result of a ``__next__()``
|
|
||||||
(or ``send()`` or ``throw()``) call, one of three outcomes can occur:
|
|
||||||
|
|
||||||
* A yield point is reached, and the yielded value is returned.
|
|
||||||
* The frame is returned from; ``StopIteration`` is raised.
|
|
||||||
* An exception is raised, which bubbles out.
|
|
||||||
|
|
||||||
In the latter two cases the frame is abandoned (and the generator
|
|
||||||
object's ``gi_frame`` attribute is set to None).
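The first two outcomes, and the abandoned frame, can be seen in a short session. This is only an illustrative sketch; the generator ``numbers`` is a hypothetical example.

```python
def numbers():
    yield 1

g = numbers()
frame_before = g.gi_frame      # a frame object while the generator is alive

first = next(g)                # outcome 1: a yield point is reached

try:
    next(g)                    # outcome 2: the frame is returned from
    finished = False
except StopIteration:
    finished = True

frame_after = g.gi_frame       # the frame has been abandoned
```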
|
|
||||||
|
|
||||||
|
|
||||||
Proposal
|
|
||||||
========
|
|
||||||
|
|
||||||
If a ``StopIteration`` is about to bubble out of a generator frame, it
|
|
||||||
is replaced with ``RuntimeError``, which causes the ``next()`` call
|
|
||||||
(which invoked the generator) to fail, passing that exception out.
|
|
||||||
From then on it's just like any old exception. [4]_
|
|
||||||
|
|
||||||
This affects the third outcome listed above, without altering any
|
|
||||||
other effects. Furthermore, it only affects this outcome when the
|
|
||||||
exception raised is ``StopIteration`` (or a subclass thereof).
|
|
||||||
|
|
||||||
Note that the proposed replacement happens at the point where the
|
|
||||||
exception is about to bubble out of the frame, i.e. after any
|
|
||||||
``except`` or ``finally`` blocks that could affect it have been
|
|
||||||
exited. The ``StopIteration`` raised by returning from the frame is
|
|
||||||
not affected (the point being that ``StopIteration`` means that the
|
|
||||||
generator terminated "normally", i.e. it did not raise an exception).
|
|
||||||
|
|
||||||
A subtle issue is what will happen if the caller, having caught the
|
|
||||||
``RuntimeError``, calls the generator object's ``__next__()`` method
|
|
||||||
again. The answer is that from this point on it will raise
|
|
||||||
``StopIteration`` -- the behavior is the same as when any other
|
|
||||||
exception was raised by the generator.
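A small sketch of this sequence, assuming Python 3.7 or later (where the proposal is the default behaviour); the generator ``buggy`` is a hypothetical example containing a deliberate mistake.

```python
def buggy():
    yield 1
    raise StopIteration  # deliberate bug: this should be a plain return

g = buggy()
events = [next(g)]

# Call next() twice more and record what each call does.
for _ in range(2):
    try:
        events.append(next(g))
    except RuntimeError as exc:
        # The original StopIteration is chained as the cause.
        events.append("RuntimeError caused by " + type(exc.__cause__).__name__)
    except StopIteration:
        events.append("StopIteration")
```

The first failing call reports the replaced exception; the call after that sees an ordinary, exhausted generator.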
|
|
||||||
|
|
||||||
Another logical consequence of the proposal: if someone uses
|
|
||||||
``g.throw(StopIteration)`` to throw a ``StopIteration`` exception into
|
|
||||||
a generator, if the generator doesn't catch it (which it could do
|
|
||||||
using a ``try/except`` around the ``yield``), it will be transformed
|
|
||||||
into ``RuntimeError``.
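Both cases can be sketched as follows, again assuming Python 3.7 or later; ``unguarded`` and ``guarded`` are hypothetical generators written for this illustration.

```python
def unguarded():
    yield "ready"

def guarded():
    try:
        yield "ready"
    except StopIteration:
        # The thrown exception is caught inside the generator.
        yield "caught"

g1 = unguarded()
next(g1)
try:
    g1.throw(StopIteration)
    unguarded_result = "no error"
except RuntimeError:
    unguarded_result = "RuntimeError"

g2 = guarded()
next(g2)
# throw() returns the next value yielded by the generator.
guarded_result = g2.throw(StopIteration)
```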
|
|
||||||
|
|
||||||
During the transition phase, the new feature must be enabled
|
|
||||||
per-module using::
|
|
||||||
|
|
||||||
    from __future__ import generator_stop
|
|
||||||
|
|
||||||
Any generator function constructed under the influence of this
|
|
||||||
directive will have the ``REPLACE_STOPITERATION`` flag set on its code
|
|
||||||
object, and generators with the flag set will behave according to this
|
|
||||||
proposal. Once the feature becomes standard, the flag may be dropped;
|
|
||||||
code should not inspect generators for it.
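On interpreters that ship this feature, the standard ``__future__`` module records when it became optional and when it became mandatory. The sketch below only reads that metadata; it assumes an interpreter of version 3.5 or later and does not touch code-object flags.

```python
import __future__

feature = __future__.generator_stop

# (major, minor) of the release where the import was first accepted.
optional_in = feature.getOptionalRelease()[:2]

# (major, minor) of the release where the behaviour became the default.
mandatory_in = feature.getMandatoryRelease()[:2]
```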
|
|
||||||
|
|
||||||
|
|
||||||
Consequences for existing code
|
|
||||||
==============================
|
|
||||||
|
|
||||||
This change will affect existing code that depends on
|
|
||||||
``StopIteration`` bubbling up. The pure Python reference
|
|
||||||
implementation of ``groupby`` [3]_ currently has comments "Exit on
|
|
||||||
``StopIteration``" where it is expected that the exception will
|
|
||||||
propagate and then be handled. This will be unusual, but not unknown,
|
|
||||||
and such constructs will fail. Other examples abound, e.g. [6]_, [7]_.
|
|
||||||
|
|
||||||
(Nick Coghlan comments: """If you wanted to factor out a helper
|
|
||||||
function that terminated the generator you'd have to do "return
|
|
||||||
yield from helper()" rather than just "helper()".""")
|
|
||||||
|
|
||||||
There are also examples of generator expressions floating around that
|
|
||||||
rely on a ``StopIteration`` raised by the expression, the target or the
|
|
||||||
predicate (rather than by the ``__next__()`` call implied in the ``for``
|
|
||||||
loop proper).
|
|
||||||
|
|
||||||
Writing backwards and forwards compatible code
|
|
||||||
----------------------------------------------
|
|
||||||
|
|
||||||
With the exception of hacks that raise ``StopIteration`` to exit a
|
|
||||||
generator expression, it is easy to write code that works equally well
|
|
||||||
under older Python versions as under the new semantics.
|
|
||||||
|
|
||||||
This is done by enclosing those places in the generator body where a
|
|
||||||
``StopIteration`` is expected (e.g. bare ``next()`` calls or in some
|
|
||||||
cases helper functions that are expected to raise ``StopIteration``)
|
|
||||||
in a ``try/except`` construct that returns when ``StopIteration`` is
|
|
||||||
raised. The ``try/except`` construct should appear directly in the
|
|
||||||
generator function; doing this in a helper function that is not itself
|
|
||||||
a generator does not work. If ``raise StopIteration`` occurs directly
|
|
||||||
in a generator, simply replace it with ``return``.
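As a minimal sketch of this pattern, the hypothetical generator ``pairs`` below guards its bare ``next()`` calls directly in the generator body, so it behaves the same under the old and the new semantics.

```python
def pairs(iterable):
    """Yield consecutive pairs, dropping a trailing unpaired item."""
    it = iter(iterable)
    while True:
        try:
            a = next(it)
            b = next(it)
        except StopIteration:
            # Expected end of input: finish the generator normally.
            return
        yield a, b

result = list(pairs([1, 2, 3, 4, 5]))
```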
|
|
||||||
|
|
||||||
|
|
||||||
Examples of breakage
|
|
||||||
--------------------
|
|
||||||
|
|
||||||
Generators which explicitly raise ``StopIteration`` can generally be
|
|
||||||
changed to simply return instead. This will be compatible with all
|
|
||||||
existing Python versions, and will not be affected by ``__future__``.
|
|
||||||
Here are some illustrations from the standard library.
|
|
||||||
|
|
||||||
Lib/ipaddress.py::
|
|
||||||
|
|
||||||
    if other == self:
        raise StopIteration
|
|
||||||
|
|
||||||
Becomes::
|
|
||||||
|
|
||||||
    if other == self:
        return
|
|
||||||
|
|
||||||
In some cases, this can be combined with ``yield from`` to simplify
|
|
||||||
the code, such as Lib/difflib.py::
|
|
||||||
|
|
||||||
    if context is None:
        while True:
            yield next(line_pair_iterator)
|
|
||||||
|
|
||||||
Becomes::
|
|
||||||
|
|
||||||
    if context is None:
        yield from line_pair_iterator
        return
|
|
||||||
|
|
||||||
(The ``return`` is necessary for a strictly-equivalent translation,
|
|
||||||
though in this particular file, there is no further code, and the
|
|
||||||
``return`` can be omitted.) For compatibility with pre-3.3 versions
|
|
||||||
of Python, this could be written with an explicit ``for`` loop::
|
|
||||||
|
|
||||||
    if context is None:
        for line in line_pair_iterator:
            yield line
        return
|
|
||||||
|
|
||||||
More complicated iteration patterns will need explicit ``try/except``
|
|
||||||
constructs. For example, a hypothetical parser like this::
|
|
||||||
|
|
||||||
    def parser(f):
        while True:
            data = next(f)
            while True:
                line = next(f)
                if line == "- end -": break
                data += line
            yield data
|
|
||||||
|
|
||||||
would need to be rewritten as::
|
|
||||||
|
|
||||||
    def parser(f):
        while True:
            try:
                data = next(f)
                while True:
                    line = next(f)
                    if line == "- end -": break
                    data += line
                yield data
            except StopIteration:
                return
|
|
||||||
|
|
||||||
or possibly::
|
|
||||||
|
|
||||||
    def parser(f):
        for data in f:
            while True:
                line = next(f)
                if line == "- end -": break
                data += line
            yield data
|
|
||||||
|
|
||||||
The latter form obscures the iteration by purporting to iterate over
|
|
||||||
the file with a ``for`` loop, but then also fetches more data from
|
|
||||||
the same iterator during the loop body. It does, however, clearly
|
|
||||||
differentiate between a "normal" termination (``StopIteration``
|
|
||||||
instead of the initial line) and an "abnormal" termination (failing
|
|
||||||
to find the end marker in the inner loop, which will now raise
|
|
||||||
``RuntimeError``).
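The two kinds of termination can be exercised with the ``for``-loop form of the parser. This is a sketch assuming Python 3.7 or later; the input lists are invented for the illustration.

```python
def parser(f):
    for data in f:
        while True:
            line = next(f)          # unguarded: a missing end marker is a bug
            if line == "- end -":
                break
            data += line
        yield data

# Well-formed input: every record is closed by an end marker.
good = iter(["a", "b", "- end -", "c", "- end -"])
records = list(parser(good))

# Malformed input: the end marker is missing.
bad = iter(["a", "b"])
try:
    list(parser(bad))
    abnormal = "no error"
except RuntimeError:
    abnormal = "RuntimeError"
```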
|
|
||||||
|
|
||||||
This effect of ``StopIteration`` has been used to cut a generator
|
|
||||||
expression short, creating a form of ``takewhile``::
|
|
||||||
|
|
||||||
    def stop():
        raise StopIteration
    print(list(x for x in range(10) if x < 5 or stop()))
    # prints [0, 1, 2, 3, 4]
|
|
||||||
|
|
||||||
Under the current proposal, this form of non-local flow control is
|
|
||||||
not supported, and would have to be rewritten in statement form::
|
|
||||||
|
|
||||||
    def gen():
        for x in range(10):
            if x >= 5: return
            yield x
    print(list(gen()))
    # prints [0, 1, 2, 3, 4]
|
|
||||||
|
|
||||||
While this is a small loss of functionality, it is functionality that
|
|
||||||
often comes at the cost of readability, and just as ``lambda`` has
|
|
||||||
restrictions compared to ``def``, so does a generator expression have
|
|
||||||
restrictions compared to a generator function. In many cases, the
|
|
||||||
transformation to full generator function will be trivially easy, and
|
|
||||||
may improve structural clarity.
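For this particular example, the standard library's ``itertools.takewhile`` may already be enough, which avoids both the ``stop()`` hack and a hand-written generator function. A short sketch:

```python
from itertools import takewhile

# Keep items for as long as the predicate holds, then stop.
result = list(takewhile(lambda x: x < 5, range(10)))
```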
|
|
||||||
|
|
||||||
|
|
||||||
Explanation of generators, iterators, and StopIteration
|
|
||||||
=======================================================
|
|
||||||
|
|
||||||
Under this proposal, generators and iterators would be distinct, but
|
|
||||||
related, concepts. Like the mixing of text and bytes in Python 2,
|
|
||||||
the mixing of generators and iterators has resulted in certain
|
|
||||||
perceived conveniences, but proper separation will make bugs more
|
|
||||||
visible.
|
|
||||||
|
|
||||||
An iterator is an object with a ``__next__`` method. Like many other
|
|
||||||
special methods, it may either return a value, or raise a specific
|
|
||||||
exception - in this case, ``StopIteration`` - to signal that it has
|
|
||||||
no value to return. In this, it is similar to ``__getattr__`` (can
|
|
||||||
raise ``AttributeError``), ``__getitem__`` (can raise ``KeyError``),
|
|
||||||
and so on. A helper function for an iterator can be written to
|
|
||||||
follow the same protocol; for example::
|
|
||||||
|
|
||||||
    def helper(x, y):
        if x > y: return 1 / (x - y)
        raise StopIteration

    def __next__(self):
        if self.a: return helper(self.b, self.c)
        return helper(self.d, self.e)
|
|
||||||
|
|
||||||
Both forms of signalling are carried through: a returned value is
|
|
||||||
returned, an exception bubbles up. The helper is written to match
|
|
||||||
the protocol of the calling function.
|
|
||||||
|
|
||||||
A generator function is one which contains a ``yield`` expression.
|
|
||||||
Each time it is (re)started, it may either yield a value, or return
|
|
||||||
(including "falling off the end"). A helper function for a generator
|
|
||||||
can also be written, but it must also follow generator protocol::
|
|
||||||
|
|
||||||
    def helper(x, y):
        if x > y: yield 1 / (x - y)

    def gen(self):
        if self.a: return (yield from helper(self.b, self.c))
        return (yield from helper(self.d, self.e))
|
|
||||||
|
|
||||||
In both cases, any unexpected exception will bubble up. Due to the
|
|
||||||
nature of generators and iterators, an unexpected ``StopIteration``
|
|
||||||
inside a generator will be converted into ``RuntimeError``, but
|
|
||||||
beyond that, all exceptions will propagate normally.
|
|
||||||
|
|
||||||
|
|
||||||
Transition plan
|
|
||||||
===============
|
|
||||||
|
|
||||||
- Python 3.5: Enable new semantics under ``__future__`` import; silent
|
|
||||||
deprecation warning if ``StopIteration`` bubbles out of a generator
|
|
||||||
not under ``__future__`` import.
|
|
||||||
|
|
||||||
- Python 3.6: Non-silent deprecation warning.
|
|
||||||
|
|
||||||
- Python 3.7: Enable new semantics everywhere.
|
|
||||||
|
|
||||||
|
|
||||||
Alternate proposals
|
|
||||||
===================
|
|
||||||
|
|
||||||
Raising something other than RuntimeError
|
|
||||||
-----------------------------------------
|
|
||||||
|
|
||||||
Rather than the generic ``RuntimeError``, it might make sense to raise
|
|
||||||
a new exception type ``UnexpectedStopIteration``. This has the
|
|
||||||
downside of implicitly encouraging that it be caught; the correct
|
|
||||||
action is to catch the original ``StopIteration``, not the chained
|
|
||||||
exception.
|
|
||||||
|
|
||||||
|
|
||||||
Supplying a specific exception to raise on return
|
|
||||||
-------------------------------------------------
|
|
||||||
|
|
||||||
Nick Coghlan suggested a means of providing a specific
|
|
||||||
``StopIteration`` instance to the generator; if any other instance of
|
|
||||||
``StopIteration`` is raised, it is an error, but if that particular
|
|
||||||
one is raised, the generator has properly completed. This subproposal
|
|
||||||
has been withdrawn in favour of better options, but is retained for
|
|
||||||
reference.
|
|
||||||
|
|
||||||
|
|
||||||
Making return-triggered StopIterations obvious
|
|
||||||
----------------------------------------------
|
|
||||||
|
|
||||||
For certain situations, a simpler and fully backward-compatible
|
|
||||||
solution may be sufficient: when a generator returns, instead of
|
|
||||||
raising ``StopIteration``, it raises a specific subclass of
|
|
||||||
``StopIteration`` (``GeneratorReturn``) which can then be detected.
|
|
||||||
If it is not that subclass, it is an escaping exception rather than a
|
|
||||||
return statement.
|
|
||||||
|
|
||||||
The inspiration for this alternative proposal was Nick's observation
|
|
||||||
[8]_ that if an ``asyncio`` coroutine [9]_ accidentally raises
|
|
||||||
``StopIteration``, it currently terminates silently, which may present
|
|
||||||
a hard-to-debug mystery to the developer. The main proposal turns
|
|
||||||
such accidents into clearly distinguishable ``RuntimeError`` exceptions,
|
|
||||||
but if that is rejected, this alternate proposal would enable
|
|
||||||
``asyncio`` to distinguish between a ``return`` statement and an
|
|
||||||
accidentally-raised ``StopIteration`` exception.
|
|
||||||
|
|
||||||
Of the three outcomes listed above, two change:
|
|
||||||
|
|
||||||
* If a yield point is reached, the value, obviously, would still be
|
|
||||||
returned.
|
|
||||||
* If the frame is returned from, ``GeneratorReturn`` (rather than
|
|
||||||
``StopIteration``) is raised.
|
|
||||||
* If an instance of ``GeneratorReturn`` would be raised, instead an
|
|
||||||
instance of ``StopIteration`` would be raised. Any other exception
|
|
||||||
bubbles up normally.
|
|
||||||
|
|
||||||
In the third case, the ``StopIteration`` would have the ``value`` of
|
|
||||||
the original ``GeneratorReturn``, and would reference the original
|
|
||||||
exception in its ``__cause__``. If uncaught, this would clearly show
|
|
||||||
the chaining of exceptions.
|
|
||||||
|
|
||||||
This alternative does *not* affect the discrepancy between generator
|
|
||||||
expressions and list comprehensions, but allows generator-aware code
|
|
||||||
(such as the ``contextlib`` and ``asyncio`` modules) to reliably
|
|
||||||
differentiate between the second and third outcomes listed above.
|
|
||||||
|
|
||||||
However, once code exists that depends on this distinction between
|
|
||||||
``GeneratorReturn`` and ``StopIteration``, a generator that invokes
|
|
||||||
another generator and relies on the latter's ``StopIteration`` to
|
|
||||||
bubble out would still be potentially wrong, depending on the use made
|
|
||||||
of the distinction between the two exception types.
|
|
||||||
|
|
||||||
|
|
||||||
Converting the exception inside next()
|
|
||||||
--------------------------------------
|
|
||||||
|
|
||||||
Mark Shannon suggested [12]_ that the problem could be solved in
|
|
||||||
``next()`` rather than at the boundary of generator functions. By
|
|
||||||
having ``next()`` catch ``StopIteration`` and raise instead
|
|
||||||
``ValueError``, all unexpected ``StopIteration`` bubbling would be
|
|
||||||
prevented; however, the backward-incompatibility concerns are far
|
|
||||||
more serious than for the current proposal, as every ``next()`` call
|
|
||||||
now needs to be rewritten to guard against ``ValueError`` instead of
|
|
||||||
``StopIteration`` - not to mention that there is no way to write one
|
|
||||||
block of code which reliably works on multiple versions of Python.
|
|
||||||
(Using a dedicated exception type, perhaps subclassing ``ValueError``,
|
|
||||||
would help this; however, all code would still need to be rewritten.)
|
|
||||||
|
|
||||||
|
|
||||||
Sub-proposal: decorator to explicitly request current behaviour
|
|
||||||
---------------------------------------------------------------
|
|
||||||
|
|
||||||
Nick Coghlan suggested [13]_ that the situations where the current
|
|
||||||
behaviour is desired could be supported by means of a decorator::
|
|
||||||
|
|
||||||
    from itertools import allow_implicit_stop

    @allow_implicit_stop
    def my_generator():
        ...
        yield next(it)
        ...
|
|
||||||
|
|
||||||
Which would be semantically equivalent to::
|
|
||||||
|
|
||||||
    def my_generator():
        try:
            ...
            yield next(it)
            ...
        except StopIteration:
            return
|
|
||||||
|
|
||||||
but be faster, as it could be implemented by simply permitting the
|
|
||||||
``StopIteration`` to bubble up directly.
|
|
||||||
|
|
||||||
Single-source Python 2/3 code would also benefit in a 3.7+ world,
|
|
||||||
since libraries like six and python-future could just define their own
|
|
||||||
version of "allow_implicit_stop" that referred to the new builtin in
|
|
||||||
3.5+, and was implemented as an identity function in other versions.
|
|
||||||
|
|
||||||
However, due to the implementation complexities required, the ongoing
|
|
||||||
compatibility issues created, the subtlety of the decorator's effect,
|
|
||||||
and the fact that it would encourage the "quick-fix" solution of just
|
|
||||||
slapping the decorator onto all generators instead of properly fixing
|
|
||||||
the code in question, this sub-proposal has been rejected. [14]_
|
|
||||||
|
|
||||||
|
|
||||||
Criticism
|
|
||||||
=========
|
|
||||||
|
|
||||||
Unofficial and apocryphal statistics suggest that this is seldom, if
|
|
||||||
ever, a problem. [5]_ Code does exist which relies on the current
|
|
||||||
behaviour (e.g. [3]_, [6]_, [7]_), and there is the concern that this
|
|
||||||
would be unnecessary code churn to achieve little or no gain.
|
|
||||||
|
|
||||||
Steven D'Aprano started an informal survey on comp.lang.python [10]_;
|
|
||||||
at the time of writing only two responses have been received: one was
|
|
||||||
in favor of changing list comprehensions to match generator
|
|
||||||
expressions (!), the other was in favor of this PEP's main proposal.
|
|
||||||
|
|
||||||
The existing model has been compared to the perfectly-acceptable
|
|
||||||
issues inherent to every other case where an exception has special
|
|
||||||
meaning. For instance, an unexpected ``KeyError`` inside a
|
|
||||||
``__getitem__`` method will be interpreted as failure, rather than
|
|
||||||
permitted to bubble up. However, there is a difference. Special
|
|
||||||
methods use ``return`` to indicate normality, and ``raise`` to signal
|
|
||||||
abnormality; generators ``yield`` to indicate data, and ``return`` to
|
|
||||||
signal the abnormal state. This makes explicitly raising
|
|
||||||
``StopIteration`` entirely redundant, and potentially surprising. If
|
|
||||||
other special methods had dedicated keywords to distinguish between
|
|
||||||
their return paths, they too could turn unexpected exceptions into
|
|
||||||
``RuntimeError``; the fact that they cannot should not preclude
|
|
||||||
generators from doing so.
|
|
||||||
|
|
||||||
|
|
||||||
References
|
|
||||||
==========
|
|
||||||
|
|
||||||
.. [1] PEP 380 - Syntax for Delegating to a Subgenerator
|
|
||||||
(https://www.python.org/dev/peps/pep-0380)
|
|
||||||
|
|
||||||
.. [2] Initial mailing list comment
|
|
||||||
(https://mail.python.org/pipermail/python-ideas/2014-November/029906.html)
|
|
||||||
|
|
||||||
.. [3] Pure Python implementation of groupby
|
|
||||||
(https://docs.python.org/3/library/itertools.html#itertools.groupby)
|
|
||||||
|
|
||||||
.. [4] Proposal by GvR
|
|
||||||
(https://mail.python.org/pipermail/python-ideas/2014-November/029953.html)
|
|
||||||
|
|
||||||
.. [5] Response by Steven D'Aprano
|
|
||||||
(https://mail.python.org/pipermail/python-ideas/2014-November/029994.html)
|
|
||||||
|
|
||||||
.. [6] Split a sequence or generator using a predicate
|
|
||||||
(http://code.activestate.com/recipes/578416-split-a-sequence-or-generator-using-a-predicate/)
|
|
||||||
|
|
||||||
.. [7] wrap unbounded generator to restrict its output
|
|
||||||
(http://code.activestate.com/recipes/66427-wrap-unbounded-generator-to-restrict-its-output/)
|
|
||||||
|
|
||||||
.. [8] Post from Nick Coghlan mentioning asyncio
|
|
||||||
(https://mail.python.org/pipermail/python-ideas/2014-November/029961.html)
|
|
||||||
|
|
||||||
.. [9] Coroutines in asyncio
|
|
||||||
(https://docs.python.org/3/library/asyncio-task.html#coroutines)
|
|
||||||
|
|
||||||
.. [10] Thread on comp.lang.python started by Steven D'Aprano
|
|
||||||
(https://mail.python.org/pipermail/python-list/2014-November/680757.html)
|
|
||||||
|
|
||||||
.. [11] Tracker issue with Proof-of-Concept patch
|
|
||||||
(http://bugs.python.org/issue22906)
|
|
||||||
|
|
||||||
.. [12] Post from Mark Shannon with alternate proposal
|
|
||||||
(https://mail.python.org/pipermail/python-dev/2014-November/137129.html)
|
|
||||||
|
|
||||||
.. [13] Idea from Nick Coghlan
|
|
||||||
(https://mail.python.org/pipermail/python-dev/2014-November/137201.html)
|
|
||||||
|
|
||||||
.. [14] Rejection by GvR
|
|
||||||
(https://mail.python.org/pipermail/python-dev/2014-November/137243.html)
|
|
||||||
|
|
||||||
Copyright
|
|
||||||
=========
|
|
||||||
|
|
||||||
This document has been placed in the public domain.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
..
|
|
||||||
Local Variables:
|
|
||||||
mode: indented-text
|
|
||||||
indent-tabs-mode: nil
|
|
||||||
sentence-end-double-space: t
|
|
||||||
fill-column: 70
|
|
||||||
coding: utf-8
|
|
||||||
End:
|
|
|
@ -1,328 +0,0 @@
|
||||||
PEP: 483
|
|
||||||
Title: The Theory of Type Hinting
|
|
||||||
Version: $Revision$
|
|
||||||
Last-Modified: $Date$
|
|
||||||
Author: Guido van Rossum <guido@python.org>
|
|
||||||
Discussions-To: Python-Ideas <python-ideas@python.org>
|
|
||||||
Status: Draft
|
|
||||||
Type: Informational
|
|
||||||
Content-Type: text/x-rst
|
|
||||||
Created: 08-Jan-2015
|
|
||||||
Post-History:
|
|
||||||
Resolution:
|
|
||||||
|
|
||||||
The Theory of Type Hinting
|
|
||||||
==========================
|
|
||||||
|
|
||||||
Guido van Rossum, Dec 19, 2014 (with a few more recent updates)
|
|
||||||
|
|
||||||
This work is licensed under a `Creative Commons
|
|
||||||
Attribution-NonCommercial-ShareAlike 4.0 International
|
|
||||||
License <http://creativecommons.org/licenses/by-nc-sa/4.0/>`_.
|
|
||||||
|
|
||||||
|
|
||||||
Introduction
|
|
||||||
------------
|
|
||||||
|
|
||||||
This document lays out the theory of the new type hinting proposal for
|
|
||||||
Python 3.5. It's not quite a full proposal or specification because
|
|
||||||
there are many details that need to be worked out, but it lays out the
|
|
||||||
theory without which it is hard to discuss more detailed specifications.
|
|
||||||
We start by explaining gradual typing; then we state some conventions
|
|
||||||
and general rules; then we define the new special types (such as Union)
|
|
||||||
that can be used in annotations; and finally we define the approach to
|
|
||||||
generic types. (The latter section needs more fleshing out; sorry!)
|
|
||||||
|
|
||||||
|
|
||||||
Summary of gradual typing
|
|
||||||
-------------------------
|
|
||||||
|
|
||||||
We define a new relationship, is-consistent-with, which is similar to
|
|
||||||
is-subclass-of, except it is not transitive when the new type **Any** is
|
|
||||||
involved. (Neither relationship is symmetric.) Assigning x to y is OK if
|
|
||||||
the type of x is consistent with the type of y. (Compare this to “... if
|
|
||||||
the type of x is a subclass of the type of y,” which states one of the
|
|
||||||
fundamentals of OO programming.) The is-consistent-with relationship is
|
|
||||||
defined by three rules:
|
|
||||||
|
|
||||||
- A type t1 is consistent with a type t2 if t1 is a subclass of t2.
|
|
||||||
(But not the other way around.)
|
|
||||||
- **Any** is consistent with every type. (But **Any** is not a subclass
|
|
||||||
of every type.)
|
|
||||||
- Every type is a subclass of **Any**. (Which also makes every type
|
|
||||||
consistent with **Any**, via rule 1.)
|
|
||||||
|
|
||||||
That's all! See Jeremy Siek's blog post `What is Gradual
Typing <http://wphomes.soic.indiana.edu/jsiek/what-is-gradual-typing/>`_
for a longer explanation and motivation. Note that rule 3 places **Any**
at the root of the class graph. This makes it very similar to
**object**. The difference is that **object** is not consistent with
most types (e.g. you can't use an object() instance where an int is
expected). In other words, both **Any** and **object** mean “any type is
allowed” when used to annotate an argument, but only **Any** can be
passed no matter what type is expected (in essence, **Any** shuts up
complaints from the static checker).

Here's an example showing how these rules work out in practice:

Say we have an Employee class, and a subclass Manager::

    class Employee: ...
    class Manager(Employee): ...

Let's say variable e is declared with type Employee::

    e = Employee()  # type: Employee

Now it's okay to assign a Manager instance to e (rule 1)::

    e = Manager()

It's not okay to assign an Employee instance to a variable declared with
type Manager::

    m = Manager()  # type: Manager
    m = Employee()  # Fails static check

However, suppose we have a variable whose type is **Any**::

    a = some_func()  # type: Any

Now it's okay to assign a to e (rule 2)::

    e = a  # OK

Of course it's also okay to assign e to a (rule 3), but we didn't need
the concept of consistency for that::

    a = e  # OK

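The three rules are small enough to model directly. The following is only a sketch for experimentation, not how any static checker is implemented: the helper names ``is_subclass_of`` and ``is_consistent_with`` are made up for this illustration, and ``Any`` is borrowed from the ``typing`` module of later Python versions as a stand-in for the **Any** described above.

```python
from typing import Any


class Employee: ...
class Manager(Employee): ...


def is_subclass_of(t1, t2) -> bool:
    """Hypothetical model of is-subclass-of with Any at the root."""
    if t2 is Any:
        return True              # rule 3: every type is a subclass of Any
    if t1 is Any:
        return False             # but Any is not a subclass of other types
    return issubclass(t1, t2)


def is_consistent_with(t1, t2) -> bool:
    """Hypothetical model of is-consistent-with."""
    if t1 is Any:
        return True              # rule 2: Any is consistent with every type
    return is_subclass_of(t1, t2)    # rule 1 (rule 3 reaches here through it)


# The assignments from the example above, written as (source, target) pairs.
manager_to_employee = is_consistent_with(Manager, Employee)   # e = Manager()
employee_to_manager = is_consistent_with(Employee, Manager)   # m = Employee()
any_to_employee = is_consistent_with(Any, Employee)           # e = a
employee_to_any = is_consistent_with(Employee, Any)           # a = e
```

In this model Employee is consistent with Any, and Any is consistent with Manager, yet Employee is not consistent with Manager. That is the loss of transitivity mentioned at the start of this section.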
Notational conventions
~~~~~~~~~~~~~~~~~~~~~~

- t1, t2 etc. and u1, u2 etc. are types or classes. Sometimes we write
  ti or tj to refer to “any of t1, t2, etc.”
- X, Y etc. are type variables (defined with Var(), see below).
- C, D etc. are classes defined with a class statement.
- x, y etc. are objects or instances.
- We use the terms type and class interchangeably, and we assume
  type(x) is x.\_\_class\_\_.

General rules
~~~~~~~~~~~~~

- Instance-ness is derived from class-ness, e.g. x is an instance of
  t1 if type(x) is a subclass of t1.
- No types defined below (i.e. Any, Union etc.) can be instantiated.
  (But non-abstract subclasses of Generic can be.)
- No types defined below can be subclassed, except for Generic and
  classes derived from it.
- Where a type is expected, None can be substituted for type(None);
  e.g. Union[t1, None] == Union[t1, type(None)].

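Two of these rules can be observed at runtime. The sketch below assumes the ``typing`` module as it later shipped in the standard library, which may differ in detail from the design discussed here; the ``Employee`` and ``Manager`` classes and the ``reset`` function exist only for the illustration.

```python
from typing import Union, get_type_hints


class Employee: ...
class Manager(Employee): ...


m = Manager()

# Instance-ness follows class-ness: both expressions answer the same question.
instance_matches_class = isinstance(m, Employee) == issubclass(type(m), Employee)

# None is accepted wherever type(None) is expected.
none_type = type(None)
same_union = Union[int, None] == Union[int, none_type]


def reset(flag: None) -> None:
    """Illustrative function annotated with None instead of type(None)."""


# get_type_hints() reports the annotations with None replaced by type(None).
hints = get_type_hints(reset)
```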
Types
~~~~~

- **Any**. Every class is a subclass of Any; however, to the static
  type checker it is also consistent with every class (see above).
- **Union[t1, t2, ...]**. Classes that are subclasses of at least one of
  t1 etc. are subclasses of this. So are unions whose components are
  all subclasses of t1 etc. (Example: Union[int, str] is a subclass of
  Union[int, float, str].) The order of the arguments doesn't matter.
  (Example: Union[int, str] == Union[str, int].) If ti is itself a
  Union the result is flattened. (Example: Union[int, Union[float,
  str]] == Union[int, float, str].) If ti and tj have a subclass
  relationship, the less specific type survives. (Example:
  Union[Employee, Manager] == Union[Employee].) Union[t1] returns just
  t1. Union[] is illegal, so is Union[()]. Corollary: Union[..., Any,
  ...] returns Any; Union[..., object, ...] returns object; to break a
  tie, Union[Any, object] == Union[object, Any] == Any.
- **Optional[t1]**. Alias for Union[t1, None], i.e. Union[t1,
  type(None)].
- **Tuple[t1, t2, ..., tn]**. A tuple whose items are instances of t1
  etc. Example: Tuple[int, float] means a tuple of two items, the
  first is an int, the second a float; e.g., (42, 3.14). Tuple[u1, u2,
  ..., um] is a subclass of Tuple[t1, t2, ..., tn] if they have the
  same length (n==m) and each ui is a subclass of ti. To spell the type
  of the empty tuple, use Tuple[()]. There is no way to define a
  variadic tuple type. (TODO: Maybe Tuple[t1, ...] with literal
  ellipsis?)
- **Callable[[t1, t2, ..., tn], tr]**. A function with positional
  argument types t1 etc., and return type tr. The argument list may be
  empty (n==0). There is no way to indicate optional or keyword
  arguments, nor varargs (we don't need to spell those often enough to
  complicate the syntax; however, Reticulated Python has a useful idea
  here). This is covariant in the return type, but contravariant in the
  arguments. “Covariant” here means that for two callable types that
  differ only in the return type, the subclass relationship for the
  callable types follows that of the return types. (Example:
  Callable[[], Manager] is a subclass of Callable[[], Employee].)
  “Contravariant” here means that for two callable types that differ
  only in the type of one argument, the subclass relationship for the
  callable types goes in the opposite direction to that of the argument
  types. (Example: Callable[[Employee], None] is a subclass of
  Callable[[Manager], None]. Yes, you read that right.)

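Some of the entries above can be tried out. The sketch below assumes the ``typing`` module as it later shipped, which honours several of the Union and Optional identities at runtime; the collapsing of subclasses inside a Union is deliberately left out, since I am not assuming the runtime module performs it. The helper ``callable_is_subclass`` is hypothetical and only models the variance rule for Callable with plain classes as arguments.

```python
from typing import Callable, Optional, Union, get_args


class Employee: ...
class Manager(Employee): ...


# Identities from the Union and Optional entries, checked at runtime.
union_checks = [
    Union[int, str] == Union[str, int],                        # order is irrelevant
    Union[int, Union[float, str]] == Union[int, float, str],   # nesting is flattened
    Union[int] is int,                                         # one argument: just t1
    Optional[int] == Union[int, None],                         # Optional is an alias
]


def _as_class(t):
    """Map None to type(None) so issubclass() can be applied uniformly."""
    return type(None) if t is None else t


def callable_is_subclass(sub, sup) -> bool:
    """Hypothetical check: is the Callable type `sub` a subclass of `sup`?"""
    sub_args, sub_ret = get_args(sub)
    sup_args, sup_ret = get_args(sup)
    if len(sub_args) != len(sup_args):
        return False
    # Contravariant in the arguments: every argument type of `sup` must be
    # a subclass of the corresponding argument type of `sub`.
    args_ok = all(
        issubclass(_as_class(b), _as_class(a)) for a, b in zip(sub_args, sup_args)
    )
    # Covariant in the return type.
    return args_ok and issubclass(_as_class(sub_ret), _as_class(sup_ret))


covariant_example = callable_is_subclass(
    Callable[[], Manager], Callable[[], Employee])
contravariant_example = callable_is_subclass(
    Callable[[Employee], None], Callable[[Manager], None])
```

The contravariant case reads naturally once you think of substitution: a function that accepts any Employee can safely be used where a function accepting only Managers is expected, but not the other way around.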
We might add:

- **Intersection[t1, t2, ...]**. Classes that are subclasses of *each* of
  t1 etc. are subclasses of this. (Compare to Union, which has *at
  least one* instead of *each* in its definition.) The order of the
  arguments doesn't matter. Nested intersections are flattened, e.g.
  Intersection[int, Intersection[float, str]] == Intersection[int,
  float, str]. An intersection of fewer types is a superclass of an
  intersection of more types, e.g. Intersection[int, str] is a
  superclass of Intersection[int, float, str], because anything that
  satisfies all three constraints also satisfies the first two. An
  intersection of one argument is just that argument, e.g.
  Intersection[int] is int. When arguments have a subclass relationship,
  the more specific class survives, e.g. Intersection[str, Employee,
  Manager] is Intersection[str, Manager]. Intersection[] is illegal, so
  is Intersection[()]. Corollary: Any disappears from the argument list,
  e.g. Intersection[int, str, Any] == Intersection[int, str].
  Intersection[Any, object] is object. The interaction between
  Intersection and Union is complex but should be no surprise if you
  understand the interaction between intersections and unions in set
  theory (note that sets of types can be infinite in size, since there
  is no limit on the number of new subclasses).

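Since Intersection is only a possible addition, there is nothing to run it against. The simplification rules can still be modelled with a toy function. Everything here is hypothetical: ``simplify_intersection`` is a name invented for this sketch, it represents an intersection as a frozenset of classes, and it does not model flattening of nested intersections.

```python
from typing import Any


class Employee: ...
class Manager(Employee): ...


def simplify_intersection(*types):
    """Hypothetical normal form of Intersection[...] as a frozenset of classes."""
    if not types:
        raise TypeError("Intersection[] is illegal")
    # Any disappears from the argument list.
    concrete = {t for t in types if t is not Any}
    if not concrete:
        return frozenset({Any})
    # When two arguments have a subclass relationship, keep the more
    # specific one: drop t if some other argument is a subclass of it.
    return frozenset(
        t for t in concrete
        if not any(other is not t and issubclass(other, t) for other in concrete)
    )


staff = simplify_intersection(str, Employee, Manager)
without_any = simplify_intersection(int, str, Any)
any_and_object = simplify_intersection(Any, object)
```

Using a set for the result also gives the order-independence for free.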
Pragmatics
~~~~~~~~~~

Some things are irrelevant to the theory but make practical use more
convenient. (This is not a full list; I probably missed a few and some
are still controversial or not fully specified.)

Type aliases, e.g. ::

    point = Tuple[float, float]
    def distance(p: point) -> float: ...

Forward references via strings, e.g. ::

    class C:
        def compare(self, other: "C") -> int: ...

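A string annotation is just a string until something resolves it. As a sketch, assuming the ``typing`` module as it later shipped, ``get_type_hints()`` can do that resolution; the ``localns`` argument is passed only so the example works regardless of the scope it is pasted into.

```python
from typing import get_type_hints


class C:
    def compare(self, other: "C") -> int:
        # Placeholder body; only the annotations matter for this sketch.
        return 0


# The raw annotation is still the string "C" ...
raw_annotation = C.compare.__annotations__["other"]

# ... and get_type_hints() resolves it to the class object.
hints = get_type_hints(C.compare, localns={"C": C})
```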
If a default of None is specified, the type is implicitly optional, e.g. ::

    def get(key: KT, default: VT = None) -> VT: ...

Don't use dynamic type expressions; use builtins and imported types
only. No 'if'. ::

    def display(message: str if WINDOWS else bytes): ...  # NOT OK

Type declaration in comments, e.g. ::

    x = []  # type: Sequence[int]

Type declarations using Undefined, e.g. ::

    x = Undefined(str)

Other things, e.g. casts, overloading and stub modules; best left to an
actual PEP.

Generic types
~~~~~~~~~~~~~

(TODO: Explain more. See also the `mypy docs on
generics <http://mypy.readthedocs.org/en/latest/generics.html>`_.)

**X = Var('X')**. Declares a unique type variable. The name must match
the variable name.

**Y = Var('Y', t1, t2, ...)**. Ditto, constrained to t1 etc. Behaves
like Union[t1, t2, ...] for most purposes, but when used as a type
variable, subclasses of t1 etc. are replaced by the most-derived base
class among t1 etc.

Example of constrained type variables::

    AnyStr = Var('AnyStr', str, bytes)

    def longest(a: AnyStr, b: AnyStr) -> AnyStr:
        return a if len(a) >= len(b) else b

    x = longest('a', 'abc')  # The inferred type for x is str

    y = longest('a', b'abc')  # Fails static type check

In this example, both arguments to longest() must have the same type
(str or bytes), and moreover, even if the arguments are instances of a
common str subclass, the return type is still str, not that subclass
(see next example).

For comparison, if the type variable were unconstrained, the common
subclass would be chosen as the return type, e.g.::

    S = Var('S')

    def longest(a: S, b: S) -> S:
        return a if len(a) >= len(b) else b

    class MyStr(str): ...

    x = longest(MyStr('a'), MyStr('abc'))

The inferred type of x is MyStr (whereas in the AnyStr example it would
be str).

Also for comparison, if a Union is used, the return type also has to be
a Union::

    U = Union[str, bytes]

    def longest(a: U, b: U) -> U:
        return a if len(a) >= len(b) else b

    x = longest('a', 'abc')

The inferred type of x is still Union[str, bytes], even though both
arguments are str.

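The examples above talk about what a static checker infers. For contrast, here is a runnable sketch of what happens at runtime, assuming the ``typing`` module as it later shipped, where ``Var`` is spelled ``TypeVar``. The annotations are not enforced when the code runs, so the inferred types described above exist only in the checker.

```python
from typing import TypeVar

AnyStr = TypeVar("AnyStr", str, bytes)   # constrained to str or bytes
S = TypeVar("S")                         # unconstrained


def longest(a: AnyStr, b: AnyStr) -> AnyStr:
    return a if len(a) >= len(b) else b


class MyStr(str): ...


# The constraints are recorded on the type variable itself.
anystr_constraints = AnyStr.__constraints__
s_constraints = S.__constraints__

# A checker would infer str here, but the object returned at runtime is
# simply whichever argument was longer, so its class is still MyStr.
result = longest(MyStr("a"), MyStr("abc"))

# A checker would reject this call; at runtime nothing stops it.
mixed = longest("a", b"abc")
```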
**class C(Generic[X, Y, ...]):** ... Define a generic class C over type
variables X etc. C itself becomes parameterizable, e.g. C[int, str, ...]
is a specific class with substitutions X→int etc.

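A minimal sketch of such a class, again assuming the ``typing`` module as it later shipped (with ``TypeVar`` in place of ``Var``). The class ``Pair`` and the alias ``IntStrPair`` are names made up for the illustration. One difference worth noting: in the shipped module the parameterized form is an alias object that remembers its origin and arguments, rather than a distinct class.

```python
from typing import Generic, TypeVar, get_args, get_origin

X = TypeVar("X")
Y = TypeVar("Y")


class Pair(Generic[X, Y]):
    """Illustrative generic class over the type variables X and Y."""

    def __init__(self, first: X, second: Y) -> None:
        self.first = first
        self.second = second


# Parameterize the class: substitutions X -> int, Y -> str.
IntStrPair = Pair[int, str]

# The parameterized form can be called to create a plain Pair instance.
pair = IntStrPair(1, "one")

origin = get_origin(IntStrPair)
arguments = get_args(IntStrPair)
```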
TODO: Explain use of generic types in function signatures. E.g.
Sequence[X], Sequence[int], Sequence[Tuple[X, Y, Z]], and mixtures.
Think about co\*variance. No gimmicks like deriving from
Sequence[Union[int, str]] or Sequence[Union[int, X]].

**Protocol**. Similar to Generic but uses structural equivalence. (TODO:
Explain, and think about co\*variance.)

Predefined generic types and Protocols in typing.py
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(See also the `mypy typing.py
module <https://github.com/JukkaL/typing/blob/master/typing.py>`_.)

- Everything from collections.abc (but Set renamed to AbstractSet).
- Dict, List, Set, a few more. (FrozenSet?)
- Pattern, Match. (Why?)
- IO, TextIO, BinaryIO. (Why?)

Another reference
~~~~~~~~~~~~~~~~~

Lest mypy get all the attention, I should mention `Reticulated
Python <https://github.com/mvitousek/reticulated>`_ by Michael Vitousek
as an example of a slightly different approach to gradual typing for
Python. It is described in an actual `academic
paper <http://wphomes.soic.indiana.edu/jsiek/files/2014/03/retic-python.pdf>`_
written by Vitousek with Jeremy Siek and Jim Baker (the latter of Jython
fame).

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: