PEP: 525
Title: Asynchronous Generators
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury@magic.io>
Discussions-To: <python-dev@python.org>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jul-2016
Python-Version: 3.6
Post-History: 02-Aug-2016, 23-Aug-2016, 01-Sep-2016, 06-Sep-2016


Abstract
========

PEP 492 introduced support for native coroutines and ``async``/``await``
syntax to Python 3.5. It is proposed here to extend Python's
asynchronous capabilities by adding support for
*asynchronous generators*.


Rationale and Goals
===================

Regular generators (introduced in PEP 255) enabled an elegant way of
writing complex *data producers* and having them behave like iterators.

However, currently there is no equivalent concept for the *asynchronous
iteration protocol* (``async for``). This makes writing asynchronous
data producers unnecessarily complex, as one must define a class that
implements ``__aiter__`` and ``__anext__`` to be able to use it in
an ``async for`` statement.

Essentially, the goals and rationale for PEP 255, applied to the
asynchronous execution case, hold true for this proposal as well.

Performance is an additional point for this proposal: in our testing of
the reference implementation, asynchronous generators are about **2x**
faster than an equivalent implemented as an asynchronous iterator.

As an illustration of the code quality improvement, consider the
following class that yields numbers with a given delay once iterated::

    class Ticker:
        """Yield numbers from 0 to `to` every `delay` seconds."""

        def __init__(self, delay, to):
            self.delay = delay
            self.i = 0
            self.to = to

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= self.to:
                raise StopAsyncIteration
            self.i += 1
            if i:
                await asyncio.sleep(self.delay)
            return i


The same can be implemented as a much simpler asynchronous generator::

    async def ticker(delay, to):
        """Yield numbers from 0 to `to` every `delay` seconds."""
        for i in range(to):
            yield i
            await asyncio.sleep(delay)


Specification
=============

This proposal introduces the concept of *asynchronous generators* to
Python.

This specification presumes knowledge of the implementation of
generators and coroutines in Python (PEP 342, PEP 380 and PEP 492).


Asynchronous Generators
-----------------------

A Python *generator* is any function containing one or more ``yield``
expressions::

    def func():            # a function
        return

    def genfunc():         # a generator function
        yield

We propose to use the same approach to define
*asynchronous generators*::

    async def coro():      # a coroutine function
        await smth()

    async def asyncgen():  # an asynchronous generator function
        await smth()
        yield 42

The result of calling an *asynchronous generator function* is
an *asynchronous generator object*, which implements the asynchronous
iteration protocol defined in PEP 492.

It is a ``SyntaxError`` to have a non-empty ``return`` statement in an
asynchronous generator.


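The ``return`` restriction can be checked without running any
asynchronous code. The following is a minimal sketch, not part of the
proposal: it compiles two source strings with the built-in ``compile()``
and records whether a ``SyntaxError`` is raised. The helper name
``compiles_ok`` and the sample sources are illustrative assumptions.

```python
def compiles_ok(source):
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# An empty ``return`` simply ends the asynchronous generator.
EMPTY_RETURN = """
async def agen():
    yield 1
    return
"""

# A ``return`` with a value is rejected at compile time.
VALUE_RETURN = """
async def agen():
    yield 1
    return 2
"""

empty_return_ok = compiles_ok(EMPTY_RETURN)
value_return_ok = compiles_ok(VALUE_RETURN)
print(empty_return_ok, value_return_ok)
```

Only the second source is expected to be rejected, since an empty
``return`` carries no value.

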
Support for Asynchronous Iteration Protocol
-------------------------------------------

The protocol requires two special methods to be implemented:

1. An ``__aiter__`` method returning an *asynchronous iterator*.
2. An ``__anext__`` method returning an *awaitable* object, which uses
   a ``StopIteration`` exception to "yield" values, and a
   ``StopAsyncIteration`` exception to signal the end of the iteration.

Asynchronous generators define both of these methods. Let's manually
iterate over a simple asynchronous generator::

    async def genfunc():
        yield 1
        yield 2

    gen = genfunc()

    assert gen.__aiter__() is gen

    assert await gen.__anext__() == 1
    assert await gen.__anext__() == 2

    await gen.__anext__()  # This line will raise StopAsyncIteration.


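The manual iteration above uses ``await`` at the top level, which only
works inside a coroutine. A runnable variant might look like the sketch
below. It assumes ``asyncio.run()`` (available from Python 3.7) as the
driver; the ``main`` coroutine and the ``seen`` and ``exhausted`` names
are illustrative.

```python
import asyncio


async def genfunc():
    yield 1
    yield 2


async def main():
    gen = genfunc()
    assert gen.__aiter__() is gen

    seen = [await gen.__anext__(), await gen.__anext__()]

    # The generator body has finished, so the next step signals exhaustion.
    try:
        await gen.__anext__()
    except StopAsyncIteration:
        exhausted = True
    else:
        exhausted = False
    return seen, exhausted


seen, exhausted = asyncio.run(main())
print(seen, exhausted)
```

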
Finalization
------------

PEP 492 requires an event loop or a scheduler to run coroutines.
Because asynchronous generators are meant to be used from coroutines,
they also require an event loop to run and finalize them.

Asynchronous generators can have ``try..finally`` blocks, as well as
``async with``. It is important to provide a guarantee that generators
can be safely finalized even when they are partially iterated and then
garbage collected. For example::

    async def square_series(con, to):
        async with con.transaction():
            cursor = con.cursor(
                'SELECT generate_series(0, $1) AS i', to)
            async for row in cursor:
                yield row['i'] ** 2

    async for i in square_series(con, 1000):
        if i == 100:
            break

The above code defines an asynchronous generator that uses
``async with`` to iterate over a database cursor in a transaction.
The generator is then iterated over with ``async for``, which interrupts
the iteration at some point.

The ``square_series()`` generator will then be garbage collected,
and without a mechanism to asynchronously close the generator, the
Python interpreter would not be able to do anything.

To solve this problem we propose to do the following:

1. Implement an ``aclose`` method on asynchronous generators
   returning a special *awaitable*. When awaited it
   throws a ``GeneratorExit`` into the suspended generator and
   iterates over it until either a ``GeneratorExit`` or
   a ``StopAsyncIteration`` occurs.

   This is very similar to what the ``close()`` method does to regular
   Python generators, except that an event loop is required to execute
   ``aclose()``.

2. Raise a ``RuntimeError`` when an asynchronous generator executes
   a ``yield`` expression in its ``finally`` block (using ``await``
   is fine, though)::

       async def gen():
           try:
               yield
           finally:
               await asyncio.sleep(1)   # Can use 'await'.

               yield                    # Cannot use 'yield',
                                        # this line will trigger a
                                        # RuntimeError.

3. Add two new functions to the ``sys`` module:
   ``set_asyncgen_hooks()`` and ``get_asyncgen_hooks()``.

The idea behind ``sys.set_asyncgen_hooks()`` is to allow event
loops to intercept asynchronous generator iteration and finalization,
so that the end user does not need to care about the finalization
problem, and everything just works.

``sys.set_asyncgen_hooks()`` accepts two arguments:

* ``firstiter``: a callable which will be called when an asynchronous
  generator is iterated for the first time.

* ``finalizer``: a callable which will be called when an asynchronous
  generator is about to be GCed.

When an asynchronous generator is iterated for the first time,
it stores a reference to the current finalizer. If there is none,
a ``RuntimeError`` is raised. This provides a strong guarantee that
every asynchronous generator object will always have a finalizer
installed by the correct event loop.

When an asynchronous generator is about to be garbage collected,
it calls its cached finalizer. The assumption is that the finalizer
will schedule an ``aclose()`` call with the loop that was active
when the iteration started.

For instance, here is how asyncio is modified to allow safe
finalization of asynchronous generators::

    # asyncio/base_events.py

    class BaseEventLoop:

        def run_forever(self):
            ...
            old_hooks = sys.get_asyncgen_hooks()
            sys.set_asyncgen_hooks(finalizer=self._finalize_asyncgen)
            try:
                ...
            finally:
                sys.set_asyncgen_hooks(*old_hooks)
                ...

        def _finalize_asyncgen(self, gen):
            self.create_task(gen.aclose())

The other argument, ``firstiter``, allows event loops to maintain
a weak set of asynchronous generators instantiated under their control.
This makes it possible to implement "shutdown" mechanisms to safely
finalize all open generators and close the event loop.

``sys.set_asyncgen_hooks()`` is thread-specific, so several event
loops running in parallel threads can use it safely.

``sys.get_asyncgen_hooks()`` returns a namedtuple-like structure
with ``firstiter`` and ``finalizer`` fields.


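To make the two hooks concrete, the sketch below installs a pair of
recording callbacks, iterates a generator once, and then drops the last
reference to it. It is an illustration under stated assumptions, not how
an event loop is implemented: there is no event loop at all, the
generator never awaits, and the hypothetical ``step`` helper drives the
awaitable by hand with ``send(None)``. The hook functions only record
that they were called, whereas a real finalizer would schedule
``aclose()``. The immediate finalization after ``del`` relies on
CPython's reference counting.

```python
import gc
import sys

events = []


def on_first_iteration(agen):
    # Called once, when the generator is iterated for the first time.
    events.append(("firstiter", agen.__name__))


def on_finalize(agen):
    # Called when a partially iterated generator is about to be collected.
    # A real event loop would schedule agen.aclose() here.
    events.append(("finalizer", agen.__name__))


async def numbers():
    yield 1
    yield 2


def step(awaitable):
    """Drive an awaitable that never suspends and return its result."""
    try:
        awaitable.send(None)
    except StopIteration as exc:
        return exc.value
    raise RuntimeError("awaitable suspended; this sketch has no event loop")


old_hooks = sys.get_asyncgen_hooks()
sys.set_asyncgen_hooks(firstiter=on_first_iteration, finalizer=on_finalize)
try:
    agen = numbers()
    first = step(agen.__anext__())   # triggers firstiter, yields 1
    del agen                         # drop the last reference
    gc.collect()                     # make sure the generator is collected
finally:
    # Restore whatever hooks were installed before, as asyncio does.
    sys.set_asyncgen_hooks(*old_hooks)

print(first, events)
```

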
asyncio
-------

The asyncio event loop will use the ``sys.set_asyncgen_hooks()`` API to
maintain a weak set of all scheduled asynchronous generators, and to
schedule their ``aclose()`` coroutine methods when it is time for
generators to be GCed.

To make sure that asyncio programs can finalize all scheduled
asynchronous generators reliably, we propose to add a new event loop
coroutine method ``loop.shutdown_asyncgens()``. The method will
schedule all currently open asynchronous generators to close with an
``aclose()`` call.

After calling the ``loop.shutdown_asyncgens()`` method, the event loop
will issue a warning whenever a new asynchronous generator is iterated
for the first time. The idea is that after requesting all asynchronous
generators to be shut down, the program should not execute code that
iterates over new asynchronous generators.

An example of how the ``shutdown_asyncgens`` coroutine should be used::

    try:
        loop.run_forever()
        loop.run_until_complete(loop.shutdown_asyncgens())
    finally:
        loop.close()


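The effect of ``loop.shutdown_asyncgens()`` can be observed with a
generator that is left suspended when the main coroutine finishes. The
sketch below is illustrative only: the ``numbers`` generator, the
``holder`` list that keeps the generator alive, and the ``closed`` log
are assumptions made for the example, and ``run_until_complete()`` stands
in for ``run_forever()`` so that the script terminates.

```python
import asyncio

closed = []


async def numbers():
    try:
        for i in range(10):
            yield i
    finally:
        # Runs only when the generator is finished or closed.
        closed.append("numbers")


async def main(holder):
    agen = numbers()
    holder.append(agen)       # keep the generator alive after main() returns
    async for i in agen:
        if i == 2:
            break             # leave the generator suspended


loop = asyncio.new_event_loop()
holder = []
try:
    loop.run_until_complete(main(holder))
    closed_before = list(closed)
    # Close every generator the loop has seen and that is still open.
    loop.run_until_complete(loop.shutdown_asyncgens())
finally:
    loop.close()

print(closed_before, closed)
```

The ``finally`` block of ``numbers()`` has not run when ``main()``
returns; it runs once ``shutdown_asyncgens()`` awaits the generator's
``aclose()``.

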
Asynchronous Generator Object
-----------------------------

The object is modeled after the standard Python generator object.
Essentially, the behaviour of asynchronous generators is designed
to replicate the behaviour of synchronous generators, with the only
difference being that the API is asynchronous.

The following methods and properties are defined:

1. ``agen.__aiter__()``: Returns ``agen``.

2. ``agen.__anext__()``: Returns an *awaitable* that performs one
   asynchronous generator iteration when awaited.

3. ``agen.asend(val)``: Returns an *awaitable* that pushes the
   ``val`` object into the ``agen`` generator. When the ``agen`` has
   not yet been iterated, ``val`` must be ``None``.

   Example::

       async def gen():
           await asyncio.sleep(0.1)
           v = yield 42
           print(v)
           await asyncio.sleep(0.2)

       g = gen()

       await g.asend(None)      # Will return 42 after sleeping
                                # for 0.1 seconds.

       await g.asend('hello')   # Will print 'hello' and
                                # raise StopAsyncIteration
                                # (after sleeping for 0.2 seconds.)

4. ``agen.athrow(typ, [val, [tb]])``: Returns an *awaitable* that
   throws an exception into the ``agen`` generator.

   Example::

       async def gen():
           try:
               await asyncio.sleep(0.1)
               yield 'hello'
           except ZeroDivisionError:
               await asyncio.sleep(0.2)
               yield 'world'

       g = gen()
       v = await g.asend(None)
       print(v)                # Will print 'hello' after
                               # sleeping for 0.1 seconds.

       v = await g.athrow(ZeroDivisionError)
       print(v)                # Will print 'world' after
                               # sleeping for 0.2 seconds.

5. ``agen.aclose()``: Returns an *awaitable* that throws a
   ``GeneratorExit`` exception into the generator. The *awaitable* can
   either return a yielded value, if ``agen`` handled the exception,
   or ``agen`` will be closed and the exception will propagate back
   to the caller.

6. ``agen.__name__`` and ``agen.__qualname__``: readable and writable
   name and qualified name attributes.

7. ``agen.ag_await``: The object that ``agen`` is currently *awaiting*
   on, or ``None``. This is similar to the currently available
   ``gi_yieldfrom`` for generators and ``cr_await`` for coroutines.

8. ``agen.ag_frame``, ``agen.ag_running``, and ``agen.ag_code``:
   defined in the same way as similar attributes of standard generators.

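As a companion to the ``asend()`` and ``athrow()`` examples, the sketch
below exercises ``aclose()`` on two generators, as observed in CPython.
The first one cleans up with ``await`` in its ``finally`` block; the
second one yields while it is being closed, which is the case the
Finalization section describes as a ``RuntimeError``. The generator
names and the ``log`` list are illustrative assumptions.

```python
import asyncio

log = []


async def well_behaved():
    try:
        yield 1
        yield 2
    finally:
        await asyncio.sleep(0)      # awaiting during cleanup is allowed
        log.append("cleaned up")


async def misbehaving():
    try:
        yield 1
    finally:
        yield "late"                # yields while being closed


async def main():
    good = well_behaved()
    await good.__anext__()          # suspend the generator at ``yield 1``
    await good.aclose()             # throws GeneratorExit, runs ``finally``

    bad = misbehaving()
    await bad.__anext__()
    try:
        await bad.aclose()
    except RuntimeError as exc:
        log.append(type(exc).__name__)


asyncio.run(main())
print(log)
```
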
``StopIteration`` and ``StopAsyncIteration`` are not propagated out of
asynchronous generators, and are replaced with a ``RuntimeError``.


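The replacement of both exceptions can be observed with a generator that
raises them explicitly. This is a small sketch, with the ``raises``
generator and the ``outcomes`` mapping introduced only for illustration.

```python
import asyncio


async def raises(exc_type):
    yield 1
    raise exc_type      # escapes the generator body


async def main():
    outcomes = {}
    for exc_type in (StopIteration, StopAsyncIteration):
        try:
            async for _ in raises(exc_type):
                pass
        except RuntimeError as exc:
            # Record which exception type reached the caller.
            outcomes[exc_type.__name__] = type(exc).__name__
    return outcomes


outcomes = asyncio.run(main())
print(outcomes)
```

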
Implementation Details
----------------------

Asynchronous generator object (``PyAsyncGenObject``) shares the
struct layout with ``PyGenObject``. In addition to that, the
reference implementation introduces three new objects:

1. ``PyAsyncGenASend``: the awaitable object that implements
   ``__anext__`` and ``asend()`` methods.

2. ``PyAsyncGenAThrow``: the awaitable object that implements
   ``athrow()`` and ``aclose()`` methods.

3. ``_PyAsyncGenWrappedValue``: every directly yielded object from an
   asynchronous generator is implicitly boxed into this structure. This
   is how the generator implementation can separate objects that are
   yielded using regular iteration protocol from objects that are
   yielded using asynchronous iteration protocol.

``PyAsyncGenASend`` and ``PyAsyncGenAThrow`` are awaitables (they have
``__await__`` methods returning ``self``) and are coroutine-like objects
(implementing ``__iter__``, ``__next__``, ``send()`` and ``throw()``
methods). Essentially, they control how asynchronous generators are
iterated:

.. image:: pep-0525-1.png
   :align: center
   :width: 80%


PyAsyncGenASend and PyAsyncGenAThrow
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``PyAsyncGenASend`` is a coroutine-like object that drives ``__anext__``
and ``asend()`` methods and implements the asynchronous iteration
protocol.

``agen.asend(val)`` and ``agen.__anext__()`` return instances of
``PyAsyncGenASend`` (which hold references back to the parent
``agen`` object.)

The data flow is defined as follows:

1. When ``PyAsyncGenASend.send(val)`` is called for the first time,
   ``val`` is pushed to the parent ``agen`` object (using existing
   facilities of ``PyGenObject``.)

   Subsequent iterations over the ``PyAsyncGenASend`` objects push
   ``None`` to ``agen``.

   When a ``_PyAsyncGenWrappedValue`` object is yielded, it
   is unboxed, and a ``StopIteration`` exception is raised with the
   unwrapped value as an argument.

2. When ``PyAsyncGenASend.throw(*exc)`` is called for the first time,
   ``*exc`` is thrown into the parent ``agen`` object.

   Subsequent iterations over the ``PyAsyncGenASend`` objects push
   ``None`` to ``agen``.

   When a ``_PyAsyncGenWrappedValue`` object is yielded, it
   is unboxed, and a ``StopIteration`` exception is raised with the
   unwrapped value as an argument.

3. ``return`` statements in asynchronous generators raise a
   ``StopAsyncIteration`` exception, which is propagated through
   ``PyAsyncGenASend.send()`` and ``PyAsyncGenASend.throw()`` methods.

``PyAsyncGenAThrow`` is very similar to ``PyAsyncGenASend``. The only
difference is that ``PyAsyncGenAThrow.send()``, when called for the
first time, throws an exception into the parent ``agen`` object (instead
of pushing a value into it.)


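The data flow can be observed from Python by calling ``send()`` on the
awaitables directly instead of awaiting them. The sketch below is
illustrative and relies on the generator never awaiting, so every step
completes in a single ``send()`` call; the ``drive`` helper is an
assumption made for the example. It runs inside a coroutine so that the
event loop's asynchronous generator hooks are installed.

```python
import asyncio


async def agen():
    received = yield "first"        # no awaits: never suspends on the loop
    yield "got " + received


def drive(awaitable):
    """Call send(None) once; a yielded value arrives as StopIteration.value."""
    try:
        awaitable.send(None)
    except StopIteration as exc:
        return exc.value
    raise RuntimeError("the awaitable suspended instead of yielding")


async def main():
    gen = agen()
    first = drive(gen.__anext__())
    # The value given to asend() is pushed on the first send() call.
    second = drive(gen.asend("hello"))
    # Exhaustion surfaces as StopAsyncIteration from send().
    try:
        gen.__anext__().send(None)
    except StopAsyncIteration:
        finished = True
    else:
        finished = False
    return first, second, finished


first, second, finished = asyncio.run(main())
print(first, second, finished)
```

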
New Standard Library Functions and Types
----------------------------------------

1. ``types.AsyncGeneratorType`` -- type of asynchronous generator
   object.

2. ``sys.set_asyncgen_hooks()`` and ``sys.get_asyncgen_hooks()``
   functions to set up asynchronous generator finalizers and iteration
   interceptors in event loops.

3. ``inspect.isasyncgen()`` and ``inspect.isasyncgenfunction()``
   introspection functions.

4. New method for asyncio event loop:
   ``loop.shutdown_asyncgens()``.


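A short sketch of the introspection helpers follows. The ``agen`` and
``coro`` functions and the ``checks`` mapping are illustrative names; the
generator object is created but never iterated, so no event loop is
needed.

```python
import inspect
import types


async def agen():
    yield 1


async def coro():
    return 1


gen_obj = agen()

checks = {
    "isasyncgenfunction(agen)": inspect.isasyncgenfunction(agen),
    "isasyncgenfunction(coro)": inspect.isasyncgenfunction(coro),
    "isasyncgen(gen_obj)": inspect.isasyncgen(gen_obj),
    "isinstance(gen_obj, AsyncGeneratorType)": isinstance(
        gen_obj, types.AsyncGeneratorType),
}
print(checks)
```

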
Backwards Compatibility
-----------------------

The proposal is fully backwards compatible.

In Python 3.5 it is a ``SyntaxError`` to define an ``async def``
function with a ``yield`` expression inside, therefore it's safe to
introduce asynchronous generators in 3.6.


Performance
===========

Regular Generators
------------------

There is no performance degradation for regular generators.
The following micro-benchmark runs at the same speed on CPython with
and without asynchronous generators::

    def gen():
        i = 0
        while i < 100000000:
            yield i
            i += 1

    list(gen())


Improvements over asynchronous iterators
----------------------------------------

The following micro-benchmark shows that asynchronous generators
are about **2.3x faster** than asynchronous iterators implemented in
pure Python::

    N = 10 ** 7

    async def agen():
        for i in range(N):
            yield i

    class AIter:
        def __init__(self):
            self.i = 0

        def __aiter__(self):
            return self

        async def __anext__(self):
            i = self.i
            if i >= N:
                raise StopAsyncIteration
            self.i += 1
            return i


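The listing above defines the two producers but not the code that times
them. One possible harness is sketched below; it is an assumption, not
the benchmark used to obtain the figure quoted above. It uses a much
smaller ``N`` so that it finishes quickly, and the ``consume`` coroutine
and the timing with ``time.perf_counter()`` are illustrative choices.
Measured ratios will vary by machine and Python version.

```python
import asyncio
import time

N = 10 ** 4          # much smaller than the 10 ** 7 used above


async def agen():
    for i in range(N):
        yield i


class AIter:
    def __init__(self):
        self.i = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        i = self.i
        if i >= N:
            raise StopAsyncIteration
        self.i += 1
        return i


async def consume(aiterable):
    """Sum every item and report the time spent iterating."""
    start = time.perf_counter()
    total = 0
    async for value in aiterable:
        total += value
    return total, time.perf_counter() - start


gen_total, gen_seconds = asyncio.run(consume(agen()))
iter_total, iter_seconds = asyncio.run(consume(AIter()))
print(f"generator: {gen_seconds:.4f}s, iterator: {iter_seconds:.4f}s")
```

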
Design Considerations
=====================


``aiter()`` and ``anext()`` builtins
------------------------------------

Originally, PEP 492 defined ``__aiter__`` as a method that should
return an *awaitable* object, resulting in an asynchronous iterator.

However, in CPython 3.5.2, ``__aiter__`` was redefined to return
asynchronous iterators directly. To avoid breaking backwards
compatibility, it was decided that Python 3.6 will support both
ways: ``__aiter__`` can still return an *awaitable* with
a ``DeprecationWarning`` being issued.

Because of this dual nature of ``__aiter__`` in Python 3.6, we cannot
add a synchronous implementation of the ``aiter()`` built-in.
Therefore, it is proposed to wait until Python 3.7.


Asynchronous list/dict/set comprehensions
-----------------------------------------

Syntax for asynchronous comprehensions is unrelated to the asynchronous
generators machinery, and should be considered in a separate PEP.


Asynchronous ``yield from``
---------------------------

While it is theoretically possible to implement ``yield from`` support
for asynchronous generators, it would require a serious redesign of the
generators implementation.

``yield from`` is also less critical for asynchronous generators, since
there is no need to provide a mechanism for implementing another
coroutine protocol on top of coroutines. To compose asynchronous
generators, a simple ``async for`` loop can be used::

    async def g1():
        yield 1
        yield 2

    async def g2():
        async for v in g1():
            yield v


Why the ``asend()`` and ``athrow()`` methods are necessary
----------------------------------------------------------

They make it possible to implement concepts similar to
``contextlib.contextmanager`` using asynchronous generators.
For instance, with the proposed design, it is possible to implement
the following pattern::

    @async_context_manager
    async def ctx():
        await open()
        try:
            yield
        finally:
            await close()

    async with ctx():
        await ...

Another reason is that it is possible to push data and throw exceptions
into asynchronous generators using the object returned from
``__anext__``, but it is hard to do that correctly. Adding
explicit ``asend()`` and ``athrow()`` will pave a safe way to
accomplish that.

In terms of implementation, ``asend()`` is a slightly more generic
version of ``__anext__``, and ``athrow()`` is very similar to
``aclose()``. Therefore having these methods defined for asynchronous
generators does not add any extra complexity.


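The ``async_context_manager`` decorator in the pattern above is not
defined in this document. One way it could be written on top of
``__anext__()`` and ``athrow()`` is sketched below. The
``_AsyncGenContext`` class, the ``events`` log and the simplified error
handling are assumptions made for illustration, and the sketch is not
the implementation of any standard library helper.

```python
import asyncio


class _AsyncGenContext:
    """Adapt a one-yield asynchronous generator to ``async with``."""

    def __init__(self, agen):
        self._agen = agen

    async def __aenter__(self):
        # Run the generator up to its single ``yield``.
        return await self._agen.__anext__()

    async def __aexit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Resume the generator; it is expected to finish.
            try:
                await self._agen.__anext__()
            except StopAsyncIteration:
                return False
            raise RuntimeError("generator didn't stop")
        # Re-raise the body's exception at the ``yield``.
        try:
            await self._agen.athrow(exc)
        except StopAsyncIteration:
            return True             # the generator swallowed the exception
        except BaseException as raised:
            if raised is exc:
                return False        # let the original exception propagate
            raise
        raise RuntimeError("generator didn't stop after athrow()")


def async_context_manager(func):
    def factory(*args, **kwargs):
        return _AsyncGenContext(func(*args, **kwargs))
    return factory


events = []


@async_context_manager
async def ctx():
    events.append("open")
    try:
        yield "resource"
    finally:
        await asyncio.sleep(0)
        events.append("close")


async def main():
    async with ctx() as value:
        events.append(value)
    try:
        async with ctx():
            raise ValueError("boom")
    except ValueError:
        events.append("propagated")


asyncio.run(main())
print(events)
```

Without ``athrow()``, the second ``async with`` block would have no way
to deliver the ``ValueError`` to the generator so that its ``finally``
block runs before the exception propagates.

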
Example
=======

A working example with the current reference implementation (will
print numbers from 0 to 9 with a one-second delay)::

    import asyncio


    async def ticker(delay, to):
        for i in range(to):
            yield i
            await asyncio.sleep(delay)


    async def run():
        async for i in ticker(1, 10):
            print(i)


    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(run())
    finally:
        loop.close()


Acceptance
==========

PEP 525 was accepted by Guido, September 6, 2016 [2]_.


Implementation
==============

The implementation is tracked in issue 28003 [3]_. The reference
implementation git repository is available at [1]_.


References
==========

.. [1] https://github.com/1st1/cpython/tree/async_gen

.. [2] https://mail.python.org/pipermail/python-dev/2016-September/146267.html

.. [3] http://bugs.python.org/issue28003


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: