PEP: 492 Title: Coroutines with async and await syntax Version: $Revision$ Last-Modified: $Date$ Author: Yury Selivanov Discussions-To: Status: Final Type: Standards Track Content-Type: text/x-rst Created: 09-Apr-2015 Python-Version: 3.5 Post-History: 17-Apr-2015, 21-Apr-2015, 27-Apr-2015, 29-Apr-2015, 05-May-2015 Abstract ======== The growth of Internet and general connectivity has triggered the proportionate need for responsive and scalable code. This proposal aims to answer that need by making writing explicitly asynchronous, concurrent Python code easier and more Pythonic. It is proposed to make *coroutines* a proper standalone concept in Python, and introduce new supporting syntax. The ultimate goal is to help establish a common, easily approachable, mental model of asynchronous programming in Python and make it as close to synchronous programming as possible. This PEP assumes that the asynchronous tasks are scheduled and coordinated by an Event Loop similar to that of stdlib module ``asyncio.events.AbstractEventLoop``. While the PEP is not tied to any specific Event Loop implementation, it is relevant only to the kind of coroutine that uses "yield" as a signal to the scheduler, indicating that the coroutine will be waiting until an event (such as IO) is completed. We believe that the changes proposed here will help keep Python relevant and competitive in a quickly growing area of asynchronous programming, as many other languages have adopted, or are planning to adopt, similar features: [2]_, [5]_, [6]_, [7]_, [8]_, [10]_. Rationale and Goals =================== Current Python supports implementing coroutines via generators (PEP 342), further enhanced by the ``yield from`` syntax introduced in PEP 380. This approach has a number of shortcomings: * It is easy to confuse coroutines with regular generators, since they share the same syntax; this is especially true for new developers. * Whether or not a function is a coroutine is determined by a presence of ``yield`` or ``yield from`` statements in its *body*, which can lead to unobvious errors when such statements appear in or disappear from function body during refactoring. * Support for asynchronous calls is limited to expressions where ``yield`` is allowed syntactically, limiting the usefulness of syntactic features, such as ``with`` and ``for`` statements. This proposal makes coroutines a native Python language feature, and clearly separates them from generators. This removes generator/coroutine ambiguity, and makes it possible to reliably define coroutines without reliance on a specific library. This also enables linters and IDEs to improve static code analysis and refactoring. Native coroutines and the associated new syntax features make it possible to define context manager and iteration protocols in asynchronous terms. As shown later in this proposal, the new ``async with`` statement lets Python programs perform asynchronous calls when entering and exiting a runtime context, and the new ``async for`` statement makes it possible to perform asynchronous calls in iterators. Specification ============= This proposal introduces new syntax and semantics to enhance coroutine support in Python. This specification presumes knowledge of the implementation of coroutines in Python (PEP 342 and PEP 380). Motivation for the syntax changes proposed here comes from the asyncio framework (PEP 3156) and the "Cofunctions" proposal (PEP 3152, now rejected in favor of this specification). From this point in this document we use the word *native coroutine* to refer to functions declared using the new syntax. *generator-based coroutine* is used where necessary to refer to coroutines that are based on generator syntax. *coroutine* is used in contexts where both definitions are applicable. New Coroutine Declaration Syntax -------------------------------- The following new syntax is used to declare a *native coroutine*:: async def read_data(db): pass Key properties of *coroutines*: * ``async def`` functions are always coroutines, even if they do not contain ``await`` expressions. * It is a ``SyntaxError`` to have ``yield`` or ``yield from`` expressions in an ``async`` function. * Internally, two new code object flags were introduced: - ``CO_COROUTINE`` is used to mark *native coroutines* (defined with new syntax.) - ``CO_ITERABLE_COROUTINE`` is used to make *generator-based coroutines* compatible with *native coroutines* (set by `types.coroutine()`_ function). All coroutines have ``CO_GENERATOR`` flag set. * Regular generators, when called, return a *generator object*; similarly, coroutines return a *coroutine* object. * ``StopIteration`` exceptions are not propagated out of coroutines, and are replaced with a ``RuntimeError``. For regular generators such behavior requires a future import (see PEP 479). * When a *coroutine* is garbage collected, a ``RuntimeWarning`` is raised if it was never awaited on (see also `Debugging Features`_.) * See also `Coroutine objects`_ section. types.coroutine() ----------------- A new function ``coroutine(gen)`` is added to the ``types`` module. It allows interoperability between existing *generator-based coroutines* in asyncio and *native coroutines* introduced by this PEP. The function applies ``CO_ITERABLE_COROUTINE`` flag to generator- function's code object, making it return a *coroutine* object. The function can be used as a decorator, since it modifies generator- functions in-place and returns them. Note, that the ``CO_COROUTINE`` flag is not applied by ``types.coroutine()`` to make it possible to separate *native coroutines* defined with new syntax, from *generator-based coroutines*. Await Expression ---------------- The following new ``await`` expression is used to obtain a result of coroutine execution:: async def read_data(db): data = await db.fetch('SELECT ...') ... ``await``, similarly to ``yield from``, suspends execution of ``read_data`` coroutine until ``db.fetch`` *awaitable* completes and returns the result data. It uses the ``yield from`` implementation with an extra step of validating its argument. ``await`` only accepts an *awaitable*, which can be one of: * A *native coroutine* object returned from a *native coroutine function*. * A *generator-based coroutine* object returned from a *generator function* decorated with ``types.coroutine()``. * An object with an ``__await__`` method returning an iterator. Any ``yield from`` chain of calls ends with a ``yield``. This is a fundamental mechanism of how *Futures* are implemented. Since, internally, coroutines are a special kind of generators, every ``await`` is suspended by a ``yield`` somewhere down the chain of ``await`` calls (please refer to PEP 3156 for a detailed explanation.) To enable this behavior for coroutines, a new magic method called ``__await__`` is added. In asyncio, for instance, to enable *Future* objects in ``await`` statements, the only change is to add ``__await__ = __iter__`` line to ``asyncio.Future`` class. Objects with ``__await__`` method are called *Future-like* objects in the rest of this PEP. Also, please note that ``__aiter__`` method (see its definition below) cannot be used for this purpose. It is a different protocol, and would be like using ``__iter__`` instead of ``__call__`` for regular callables. It is a ``TypeError`` if ``__await__`` returns anything but an iterator. * Objects defined with CPython C API with a ``tp_as_async->am_await`` function, returning an *iterator* (similar to ``__await__`` method). It is a ``SyntaxError`` to use ``await`` outside of an ``async def`` function (like it is a ``SyntaxError`` to use ``yield`` outside of ``def`` function.) It is a ``TypeError`` to pass anything other than an *awaitable* object to an ``await`` expression. Updated operator precedence table ''''''''''''''''''''''''''''''''' ``await`` keyword is defined as follows:: power ::= await ["**" u_expr] await ::= ["await"] primary where "primary" represents the most tightly bound operations of the language. Its syntax is:: primary ::= atom | attributeref | subscription | slicing | call See Python Documentation [12]_ and `Grammar Updates`_ section of this proposal for details. The key ``await`` difference from ``yield`` and ``yield from`` operators is that *await expressions* do not require parentheses around them most of the times. Also, ``yield from`` allows any expression as its argument, including expressions like ``yield from a() + b()``, that would be parsed as ``yield from (a() + b())``, which is almost always a bug. In general, the result of any arithmetic operation is not an *awaitable* object. To avoid this kind of mistakes, it was decided to make ``await`` precedence lower than ``[]``, ``()``, and ``.``, but higher than ``**`` operators. +------------------------------+-----------------------------------+ | Operator | Description | +==============================+===================================+ | ``yield`` ``x``, | Yield expression | | ``yield from`` ``x`` | | +------------------------------+-----------------------------------+ | ``lambda`` | Lambda expression | +------------------------------+-----------------------------------+ | ``if`` -- ``else`` | Conditional expression | +------------------------------+-----------------------------------+ | ``or`` | Boolean OR | +------------------------------+-----------------------------------+ | ``and`` | Boolean AND | +------------------------------+-----------------------------------+ | ``not`` ``x`` | Boolean NOT | +------------------------------+-----------------------------------+ | ``in``, ``not in``, | Comparisons, including membership | | ``is``, ``is not``, ``<``, | tests and identity tests | | ``<=``, ``>``, ``>=``, | | | ``!=``, ``==`` | | +------------------------------+-----------------------------------+ | ``|`` | Bitwise OR | +------------------------------+-----------------------------------+ | ``^`` | Bitwise XOR | +------------------------------+-----------------------------------+ | ``&`` | Bitwise AND | +------------------------------+-----------------------------------+ | ``<<``, ``>>`` | Shifts | +------------------------------+-----------------------------------+ | ``+``, ``-`` | Addition and subtraction | +------------------------------+-----------------------------------+ | ``*``, ``@``, ``/``, ``//``, | Multiplication, matrix | | ``%`` | multiplication, division, | | | remainder | +------------------------------+-----------------------------------+ | ``+x``, ``-x``, ``~x`` | Positive, negative, bitwise NOT | +------------------------------+-----------------------------------+ | ``**`` | Exponentiation | +------------------------------+-----------------------------------+ | ``await`` ``x`` | Await expression | +------------------------------+-----------------------------------+ | ``x[index]``, | Subscription, slicing, | | ``x[index:index]``, | call, attribute reference | | ``x(arguments...)``, | | | ``x.attribute`` | | +------------------------------+-----------------------------------+ | ``(expressions...)``, | Binding or tuple display, | | ``[expressions...]``, | list display, | | ``{key: value...}``, | dictionary display, | | ``{expressions...}`` | set display | +------------------------------+-----------------------------------+ Examples of "await" expressions ''''''''''''''''''''''''''''''' Valid syntax examples: ================================== ================================== Expression Will be parsed as ================================== ================================== ``if await fut: pass`` ``if (await fut): pass`` ``if await fut + 1: pass`` ``if (await fut) + 1: pass`` ``pair = await fut, 'spam'`` ``pair = (await fut), 'spam'`` ``with await fut, open(): pass`` ``with (await fut), open(): pass`` ``await foo()['spam'].baz()()`` ``await ( foo()['spam'].baz()() )`` ``return await coro()`` ``return ( await coro() )`` ``res = await coro() ** 2`` ``res = (await coro()) ** 2`` ``func(a1=await coro(), a2=0)`` ``func(a1=(await coro()), a2=0)`` ``await foo() + await bar()`` ``(await foo()) + (await bar())`` ``-await foo()`` ``-(await foo())`` ================================== ================================== Invalid syntax examples: ================================== ================================== Expression Should be written as ================================== ================================== ``await await coro()`` ``await (await coro())`` ``await -coro()`` ``await (-coro())`` ================================== ================================== Asynchronous Context Managers and "async with" ---------------------------------------------- An *asynchronous context manager* is a context manager that is able to suspend execution in its *enter* and *exit* methods. To make this possible, a new protocol for asynchronous context managers is proposed. Two new magic methods are added: ``__aenter__`` and ``__aexit__``. Both must return an *awaitable*. An example of an asynchronous context manager:: class AsyncContextManager: async def __aenter__(self): await log('entering context') async def __aexit__(self, exc_type, exc, tb): await log('exiting context') New Syntax '''''''''' A new statement for asynchronous context managers is proposed:: async with EXPR as VAR: BLOCK which is semantically equivalent to:: mgr = (EXPR) aexit = type(mgr).__aexit__ aenter = type(mgr).__aenter__(mgr) exc = True VAR = await aenter try: BLOCK except: if not await aexit(mgr, *sys.exc_info()): raise else: await aexit(mgr, None, None, None) As with regular ``with`` statements, it is possible to specify multiple context managers in a single ``async with`` statement. It is an error to pass a regular context manager without ``__aenter__`` and ``__aexit__`` methods to ``async with``. It is a ``SyntaxError`` to use ``async with`` outside of an ``async def`` function. Example ''''''' With *asynchronous context managers* it is easy to implement proper database transaction managers for coroutines:: async def commit(session, data): ... async with session.transaction(): ... await session.update(data) ... Code that needs locking also looks lighter:: async with lock: ... instead of:: with (yield from lock): ... Asynchronous Iterators and "async for" -------------------------------------- An *asynchronous iterable* is able to call asynchronous code in its *iter* implementation, and *asynchronous iterator* can call asynchronous code in its *next* method. To support asynchronous iteration: 1. An object must implement an ``__aiter__`` method returning an *awaitable* resulting in an *asynchronous iterator object*. 2. An *asynchronous iterator object* must implement an ``__anext__`` method returning an *awaitable*. 3. To stop iteration ``__anext__`` must raise a ``StopAsyncIteration`` exception. An example of asynchronous iterable:: class AsyncIterable: async def __aiter__(self): return self async def __anext__(self): data = await self.fetch_data() if data: return data else: raise StopAsyncIteration async def fetch_data(self): ... New Syntax '''''''''' A new statement for iterating through asynchronous iterators is proposed:: async for TARGET in ITER: BLOCK else: BLOCK2 which is semantically equivalent to:: iter = (ITER) iter = await type(iter).__aiter__(iter) running = True while running: try: TARGET = await type(iter).__anext__(iter) except StopAsyncIteration: running = False else: BLOCK else: BLOCK2 It is a ``TypeError`` to pass a regular iterable without ``__aiter__`` method to ``async for``. It is a ``SyntaxError`` to use ``async for`` outside of an ``async def`` function. As for with regular ``for`` statement, ``async for`` has an optional ``else`` clause. Example 1 ''''''''' With asynchronous iteration protocol it is possible to asynchronously buffer data during iteration:: async for data in cursor: ... Where ``cursor`` is an asynchronous iterator that prefetches ``N`` rows of data from a database after every ``N`` iterations. The following code illustrates new asynchronous iteration protocol:: class Cursor: def __init__(self): self.buffer = collections.deque() def _prefetch(self): ... async def __aiter__(self): return self async def __anext__(self): if not self.buffer: self.buffer = await self._prefetch() if not self.buffer: raise StopAsyncIteration return self.buffer.popleft() then the ``Cursor`` class can be used as follows:: async for row in Cursor(): print(row) which would be equivalent to the following code:: i = await Cursor().__aiter__() while True: try: row = await i.__anext__() except StopAsyncIteration: break else: print(row) Example 2 ''''''''' The following is a utility class that transforms a regular iterable to an asynchronous one. While this is not a very useful thing to do, the code illustrates the relationship between regular and asynchronous iterators. :: class AsyncIteratorWrapper: def __init__(self, obj): self._it = iter(obj) async def __aiter__(self): return self async def __anext__(self): try: value = next(self._it) except StopIteration: raise StopAsyncIteration return value async for letter in AsyncIteratorWrapper("abc"): print(letter) Why StopAsyncIteration? ''''''''''''''''''''''' Coroutines are still based on generators internally. So, before PEP 479, there was no fundamental difference between :: def g1(): yield from fut return 'spam' and :: def g2(): yield from fut raise StopIteration('spam') And since PEP 479 is accepted and enabled by default for coroutines, the following example will have its ``StopIteration`` wrapped into a ``RuntimeError`` :: async def a1(): await fut raise StopIteration('spam') The only way to tell the outside code that the iteration has ended is to raise something other than ``StopIteration``. Therefore, a new built-in exception class ``StopAsyncIteration`` was added. Moreover, with semantics from PEP 479, all ``StopIteration`` exceptions raised in coroutines are wrapped in ``RuntimeError``. Coroutine objects ----------------- Differences from generators ''''''''''''''''''''''''''' This section applies only to *native coroutines* with ``CO_COROUTINE`` flag, i.e. defined with the new ``async def`` syntax. **The behavior of existing *generator-based coroutines* in asyncio remains unchanged.** Great effort has been made to make sure that coroutines and generators are treated as distinct concepts: 1. *Native coroutine* objects do not implement ``__iter__`` and ``__next__`` methods. Therefore, they cannot be iterated over or passed to ``iter()``, ``list()``, ``tuple()`` and other built-ins. They also cannot be used in a ``for..in`` loop. An attempt to use ``__iter__`` or ``__next__`` on a *native coroutine* object will result in a ``TypeError``. 2. *Plain generators* cannot ``yield from`` *native coroutines*: doing so will result in a ``TypeError``. 3. *generator-based coroutines* (for asyncio code must be decorated with ``@asyncio.coroutine``) can ``yield from`` *native coroutine objects*. 4. ``inspect.isgenerator()`` and ``inspect.isgeneratorfunction()`` return ``False`` for *native coroutine* objects and *native coroutine functions*. Coroutine object methods '''''''''''''''''''''''' Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, *coroutines* have ``throw()``, ``send()`` and ``close()`` methods. ``StopIteration`` and ``GeneratorExit`` play the same role for coroutines (although PEP 479 is enabled by default for coroutines). See PEP 342, PEP 380, and Python Documentation [11]_ for details. ``throw()``, ``send()`` methods for *coroutines* are used to push values and raise errors into *Future-like* objects. Debugging Features ------------------ A common beginner mistake is forgetting to use ``yield from`` on coroutines:: @asyncio.coroutine def useful(): asyncio.sleep(1) # this will do noting without 'yield from' For debugging this kind of mistakes there is a special debug mode in asyncio, in which ``@coroutine`` decorator wraps all functions with a special object with a destructor logging a warning. Whenever a wrapped generator gets garbage collected, a detailed logging message is generated with information about where exactly the decorator function was defined, stack trace of where it was collected, etc. Wrapper object also provides a convenient ``__repr__`` function with detailed information about the generator. The only problem is how to enable these debug capabilities. Since debug facilities should be a no-op in production mode, ``@coroutine`` decorator makes the decision of whether to wrap or not to wrap based on an OS environment variable ``PYTHONASYNCIODEBUG``. This way it is possible to run asyncio programs with asyncio's own functions instrumented. ``EventLoop.set_debug``, a different debug facility, has no impact on ``@coroutine`` decorator's behavior. With this proposal, coroutines is a native, distinct from generators, concept. *In addition* to a ``RuntimeWarning`` being raised on coroutines that were never awaited, it is proposed to add two new functions to the ``sys`` module: ``set_coroutine_wrapper`` and ``get_coroutine_wrapper``. This is to enable advanced debugging facilities in asyncio and other frameworks (such as displaying where exactly coroutine was created, and a more detailed stack trace of where it was garbage collected). New Standard Library Functions ------------------------------ * ``types.coroutine(gen)``. See `types.coroutine()`_ section for details. * ``inspect.iscoroutine(obj)`` returns ``True`` if ``obj`` is a *coroutine* object. * ``inspect.iscoroutinefunction(obj)`` returns ``True`` if ``obj`` is a *coroutine function*. * ``inspect.isawaitable(obj)`` returns ``True`` if ``obj`` can be used in ``await`` expression. See `Await Expression`_ for details. * ``sys.set_coroutine_wrapper(wrapper)`` allows to intercept creation of *coroutine* objects. ``wrapper`` must be a callable that accepts one argument: a *coroutine* object or ``None``. ``None`` resets the wrapper. If called twice, the new wrapper replaces the previous one. The function is thread-specific. See `Debugging Features`_ for more details. * ``sys.get_coroutine_wrapper()`` returns the current wrapper object. Returns ``None`` if no wrapper was set. The function is thread-specific. See `Debugging Features`_ for more details. New Abstract Base Classes ------------------------- In order to allow better integration with existing frameworks (such as Tornado, see [13]_) and compilers (such as Cython, see [16]_), two new Abstract Base Classes (ABC) are added: * ``collections.abc.Awaitable`` ABC for *Future-like* classes, that implement ``__await__`` method. * ``collections.abc.Coroutine`` ABC for *coroutine* objects, that implement ``send(value)``, ``throw(type, exc, tb)``, and ``close()`` methods. To allow easy testing if objects support asynchronous iteration, two more ABCs are added: * ``collections.abc.AsyncIterable`` -- tests for ``__aiter__`` method. * ``collections.abc.AsyncIterator`` -- tests for ``__aiter__`` and ``__anext__`` methods. Glossary ======== :Native coroutine function: A coroutine function is declared with ``async def``. It uses ``await`` and ``return value``; see `New Coroutine Declaration Syntax`_ for details. :Native coroutine: Returned from a native coroutine function. See `Await Expression`_ for details. :Generator-based coroutine function: Coroutines based on generator syntax. Most common example are functions decorated with ``@asyncio.coroutine``. :Generator-based coroutine: Returned from a generator-based coroutine function. :Coroutine: Either *native coroutine* or *generator-based coroutine*. :Coroutine object: Either *native coroutine* object or *generator-based coroutine* object. :Future-like object: An object with an ``__await__`` method, or a C object with ``tp_as_async->am_await`` function, returning an *iterator*. Can be consumed by an ``await`` expression in a coroutine. A coroutine waiting for a Future-like object is suspended until the Future-like object's ``__await__`` completes, and returns the result. See `Await Expression`_ for details. :Awaitable: A *Future-like* object or a *coroutine* object. See `Await Expression`_ for details. :Asynchronous context manager: An asynchronous context manager has ``__aenter__`` and ``__aexit__`` methods and can be used with ``async with``. See `Asynchronous Context Managers and "async with"`_ for details. :Asynchronous iterable: An object with an ``__aiter__`` method, which must return an *asynchronous iterator* object. Can be used with ``async for``. See `Asynchronous Iterators and "async for"`_ for details. :Asynchronous iterator: An asynchronous iterator has an ``__anext__`` method. See `Asynchronous Iterators and "async for"`_ for details. List of functions and methods ============================= ================= =================================== ================= Method Can contain Can't contain ================= =================================== ================= async def func await, return value yield, yield from async def __a*__ await, return value yield, yield from def __a*__ return awaitable await def __await__ yield, yield from, return iterable await generator yield, yield from, return value await ================= =================================== ================= Where: * "async def func": native coroutine; * "async def __a*__": ``__aiter__``, ``__anext__``, ``__aenter__``, ``__aexit__`` defined with the ``async`` keyword; * "def __a*__": ``__aiter__``, ``__anext__``, ``__aenter__``, ``__aexit__`` defined without the ``async`` keyword, must return an *awaitable*; * "def __await__": ``__await__`` method to implement *Future-like* objects; * generator: a "regular" generator, function defined with ``def`` and which contains a least one ``yield`` or ``yield from`` expression. Transition Plan =============== To avoid backwards compatibility issues with ``async`` and ``await`` keywords, it was decided to modify ``tokenizer.c`` in such a way, that it: * recognizes ``async def`` ``NAME`` tokens combination; * keeps track of regular ``def`` and ``async def`` indented blocks; * while tokenizing ``async def`` block, it replaces ``'async'`` ``NAME`` token with ``ASYNC``, and ``'await'`` ``NAME`` token with ``AWAIT``; * while tokenizing ``def`` block, it yields ``'async'`` and ``'await'`` ``NAME`` tokens as is. This approach allows for seamless combination of new syntax features (all of them available only in ``async`` functions) with any existing code. An example of having "async def" and "async" attribute in one piece of code:: class Spam: async = 42 async def ham(): print(getattr(Spam, 'async')) # The coroutine can be executed and will print '42' Backwards Compatibility ----------------------- This proposal preserves 100% backwards compatibility. asyncio ''''''' ``asyncio`` module was adapted and tested to work with coroutines and new statements. Backwards compatibility is 100% preserved, i.e. all existing code will work as-is. The required changes are mainly: 1. Modify ``@asyncio.coroutine`` decorator to use new ``types.coroutine()`` function. 2. Add ``__await__ = __iter__`` line to ``asyncio.Future`` class. 3. Add ``ensure_future()`` as an alias for ``async()`` function. Deprecate ``async()`` function. asyncio migration strategy '''''''''''''''''''''''''' Because *plain generators* cannot ``yield from`` *native coroutine objects* (see `Differences from generators`_ section for more details), it is advised to make sure that all generator-based coroutines are decorated with ``@asyncio.coroutine`` *before* starting to use the new syntax. async/await in CPython code base '''''''''''''''''''''''''''''''' There is no use of ``await`` names in CPython. ``async`` is mostly used by asyncio. We are addressing this by renaming ``async()`` function to ``ensure_future()`` (see `asyncio`_ section for details.) Another use of ``async`` keyword is in ``Lib/xml/dom/xmlbuilder.py``, to define an ``async = False`` attribute for ``DocumentLS`` class. There is no documentation or tests for it, it is not used anywhere else in CPython. It is replaced with a getter, that raises a ``DeprecationWarning``, advising to use ``async_`` attribute instead. 'async' attribute is not documented and is not used in CPython code base. Grammar Updates --------------- Grammar changes are fairly minimal:: decorated: decorators (classdef | funcdef | async_funcdef) async_funcdef: ASYNC funcdef compound_stmt: (if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt) async_stmt: ASYNC (funcdef | with_stmt | for_stmt) power: atom_expr ['**' factor] atom_expr: [AWAIT] atom trailer* Transition Period Shortcomings ------------------------------ There is just one. Until ``async`` and ``await`` are not proper keywords, it is not possible (or at least very hard) to fix ``tokenizer.c`` to recognize them on the **same line** with ``def`` keyword:: # async and await will always be parsed as variables async def outer(): # 1 def nested(a=(await fut)): pass async def foo(): return (await fut) # 2 Since ``await`` and ``async`` in such cases are parsed as ``NAME`` tokens, a ``SyntaxError`` will be raised. To workaround these issues, the above examples can be easily rewritten to a more readable form:: async def outer(): # 1 a_default = await fut def nested(a=a_default): pass async def foo(): # 2 return (await fut) This limitation will go away as soon as ``async`` and ``await`` are proper keywords. Deprecation Plans ----------------- ``async`` and ``await`` names will be softly deprecated in CPython 3.5 and 3.6. In 3.7 we will transform them to proper keywords. Making ``async`` and ``await`` proper keywords before 3.7 might make it harder for people to port their code to Python 3. Design Considerations ===================== PEP 3152 -------- PEP 3152 by Gregory Ewing proposes a different mechanism for coroutines (called "cofunctions"). Some key points: 1. A new keyword ``codef`` to declare a *cofunction*. *Cofunction* is always a generator, even if there is no ``cocall`` expressions inside it. Maps to ``async def`` in this proposal. 2. A new keyword ``cocall`` to call a *cofunction*. Can only be used inside a *cofunction*. Maps to ``await`` in this proposal (with some differences, see below.) 3. It is not possible to call a *cofunction* without a ``cocall`` keyword. 4. ``cocall`` grammatically requires parentheses after it:: atom: cocall | cocall: 'cocall' atom cotrailer* '(' [arglist] ')' cotrailer: '[' subscriptlist ']' | '.' NAME 5. ``cocall f(*args, **kwds)`` is semantically equivalent to ``yield from f.__cocall__(*args, **kwds)``. Differences from this proposal: 1. There is no equivalent of ``__cocall__`` in this PEP, which is called and its result is passed to ``yield from`` in the ``cocall`` expression. ``await`` keyword expects an *awaitable* object, validates the type, and executes ``yield from`` on it. Although, ``__await__`` method is similar to ``__cocall__``, but is only used to define *Future-like* objects. 2. ``await`` is defined in almost the same way as ``yield from`` in the grammar (it is later enforced that ``await`` can only be inside ``async def``). It is possible to simply write ``await future``, whereas ``cocall`` always requires parentheses. 3. To make asyncio work with PEP 3152 it would be required to modify ``@asyncio.coroutine`` decorator to wrap all functions in an object with a ``__cocall__`` method, or to implement ``__cocall__`` on generators. To call *cofunctions* from existing generator-based coroutines it would be required to use ``costart(cofunc, *args, **kwargs)`` built-in. 4. Since it is impossible to call a *cofunction* without a ``cocall`` keyword, it automatically prevents the common mistake of forgetting to use ``yield from`` on generator-based coroutines. This proposal addresses this problem with a different approach, see `Debugging Features`_. 5. A shortcoming of requiring a ``cocall`` keyword to call a coroutine is that if is decided to implement coroutine-generators -- coroutines with ``yield`` or ``async yield`` expressions -- we wouldn't need a ``cocall`` keyword to call them. So we'll end up having ``__cocall__`` and no ``__call__`` for regular coroutines, and having ``__call__`` and no ``__cocall__`` for coroutine- generators. 6. Requiring parentheses grammatically also introduces a whole lot of new problems. The following code:: await fut await function_returning_future() await asyncio.gather(coro1(arg1, arg2), coro2(arg1, arg2)) would look like:: cocall fut() # or cocall costart(fut) cocall (function_returning_future())() cocall asyncio.gather(costart(coro1, arg1, arg2), costart(coro2, arg1, arg2)) 7. There are no equivalents of ``async for`` and ``async with`` in PEP 3152. Coroutine-generators -------------------- With ``async for`` keyword it is desirable to have a concept of a *coroutine-generator* -- a coroutine with ``yield`` and ``yield from`` expressions. To avoid any ambiguity with regular generators, we would likely require to have an ``async`` keyword before ``yield``, and ``async yield from`` would raise a ``StopAsyncIteration`` exception. While it is possible to implement coroutine-generators, we believe that they are out of scope of this proposal. It is an advanced concept that should be carefully considered and balanced, with a non-trivial changes in the implementation of current generator objects. This is a matter for a separate PEP. Why "async" and "await" keywords -------------------------------- async/await is not a new concept in programming languages: * C# has it since long time ago [5]_; * proposal to add async/await in ECMAScript 7 [2]_; see also Traceur project [9]_; * Facebook's Hack/HHVM [6]_; * Google's Dart language [7]_; * Scala [8]_; * proposal to add async/await to C++ [10]_; * and many other less popular languages. This is a huge benefit, as some users already have experience with async/await, and because it makes working with many languages in one project easier (Python with ECMAScript 7 for instance). Why "__aiter__" returns awaitable --------------------------------- In principle, ``__aiter__`` could be a regular function. There are several good reasons to make it a coroutine: * as most of the ``__anext__``, ``__aenter__``, and ``__aexit__`` methods are coroutines, users would often make a mistake defining it as ``async`` anyways; * there might be a need to run some asynchronous operations in ``__aiter__``, for instance to prepare DB queries or do some file operation. Importance of "async" keyword ----------------------------- While it is possible to just implement ``await`` expression and treat all functions with at least one ``await`` as coroutines, this approach makes APIs design, code refactoring and its long time support harder. Let's pretend that Python only has ``await`` keyword:: def useful(): ... await log(...) ... def important(): await useful() If ``useful()`` function is refactored and someone removes all ``await`` expressions from it, it would become a regular python function, and all code that depends on it, including ``important()`` would be broken. To mitigate this issue a decorator similar to ``@asyncio.coroutine`` has to be introduced. Why "async def" --------------- For some people bare ``async name(): pass`` syntax might look more appealing than ``async def name(): pass``. It is certainly easier to type. But on the other hand, it breaks the symmetry between ``async def``, ``async with`` and ``async for``, where ``async`` is a modifier, stating that the statement is asynchronous. It is also more consistent with the existing grammar. Why not "await for" and "await with" ------------------------------------ ``async`` is an adjective, and hence it is a better choice for a *statement qualifier* keyword. ``await for/with`` would imply that something is awaiting for a completion of a ``for`` or ``with`` statement. Why "async def" and not "def async" ----------------------------------- ``async`` keyword is a *statement qualifier*. A good analogy to it are "static", "public", "unsafe" keywords from other languages. "async for" is an asynchronous "for" statement, "async with" is an asynchronous "with" statement, "async def" is an asynchronous function. Having "async" after the main statement keyword might introduce some confusion, like "for async item in iterator" can be read as "for each asynchronous item in iterator". Having ``async`` keyword before ``def``, ``with`` and ``for`` also makes the language grammar simpler. And "async def" better separates coroutines from regular functions visually. Why not a __future__ import --------------------------- `Transition Plan`_ section explains how tokenizer is modified to treat ``async`` and ``await`` as keywords *only* in ``async def`` blocks. Hence ``async def`` fills the role that a module level compiler declaration like ``from __future__ import async_await`` would otherwise fill. Why magic methods start with "a" -------------------------------- New asynchronous magic methods ``__aiter__``, ``__anext__``, ``__aenter__``, and ``__aexit__`` all start with the same prefix "a". An alternative proposal is to use "async" prefix, so that ``__aiter__`` becomes ``__async_iter__``. However, to align new magic methods with the existing ones, such as ``__radd__`` and ``__iadd__`` it was decided to use a shorter version. Why not reuse existing magic names ---------------------------------- An alternative idea about new asynchronous iterators and context managers was to reuse existing magic methods, by adding an ``async`` keyword to their declarations:: class CM: async def __enter__(self): # instead of __aenter__ ... This approach has the following downsides: * it would not be possible to create an object that works in both ``with`` and ``async with`` statements; * it would break backwards compatibility, as nothing prohibits from returning a Future-like objects from ``__enter__`` and/or ``__exit__`` in Python <= 3.4; * one of the main points of this proposal is to make native coroutines as simple and foolproof as possible, hence the clear separation of the protocols. Why not reuse existing "for" and "with" statements -------------------------------------------------- The vision behind existing generator-based coroutines and this proposal is to make it easy for users to see where the code might be suspended. Making existing "for" and "with" statements to recognize asynchronous iterators and context managers will inevitably create implicit suspend points, making it harder to reason about the code. Comprehensions -------------- Syntax for asynchronous comprehensions could be provided, but this construct is outside of the scope of this PEP. Async lambda functions ---------------------- Syntax for asynchronous lambda functions could be provided, but this construct is outside of the scope of this PEP. Performance =========== Overall Impact -------------- This proposal introduces no observable performance impact. Here is an output of python's official set of benchmarks [4]_: :: python perf.py -r -b default ../cpython/python.exe ../cpython-aw/python.exe [skipped] Report on Darwin ysmac 14.3.0 Darwin Kernel Version 14.3.0: Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64 x86_64 i386 Total CPU cores: 8 ### etree_iterparse ### Min: 0.365359 -> 0.349168: 1.05x faster Avg: 0.396924 -> 0.379735: 1.05x faster Significant (t=9.71) Stddev: 0.01225 -> 0.01277: 1.0423x larger The following not significant results are hidden, use -v to show them: django_v2, 2to3, etree_generate, etree_parse, etree_process, fastpickle, fastunpickle, json_dump_v2, json_load, nbody, regex_v8, tornado_http. Tokenizer modifications ----------------------- There is no observable slowdown of parsing python files with the modified tokenizer: parsing of one 12Mb file (``Lib/test/test_binop.py`` repeated 1000 times) takes the same amount of time. async/await ----------- The following micro-benchmark was used to determine performance difference between "async" functions and generators:: import sys import time def binary(n): if n <= 0: return 1 l = yield from binary(n - 1) r = yield from binary(n - 1) return l + 1 + r async def abinary(n): if n <= 0: return 1 l = await abinary(n - 1) r = await abinary(n - 1) return l + 1 + r def timeit(gen, depth, repeat): t0 = time.time() for _ in range(repeat): list(gen(depth)) t1 = time.time() print('{}({}) * {}: total {:.3f}s'.format( gen.__name__, depth, repeat, t1-t0)) The result is that there is no observable performance difference. Minimum timing of 3 runs :: abinary(19) * 30: total 12.985s binary(19) * 30: total 12.953s Note that depth of 19 means 1,048,575 calls. Reference Implementation ======================== The reference implementation can be found here: [3]_. List of high-level changes and new protocols -------------------------------------------- 1. New syntax for defining coroutines: ``async def`` and new ``await`` keyword. 2. New ``__await__`` method for Future-like objects, and new ``tp_as_async->am_await`` slot in ``PyTypeObject``. 3. New syntax for asynchronous context managers: ``async with``. And associated protocol with ``__aenter__`` and ``__aexit__`` methods. 4. New syntax for asynchronous iteration: ``async for``. And associated protocol with ``__aiter__``, ``__aexit__`` and new built- in exception ``StopAsyncIteration``. New ``tp_as_async->am_aiter`` and ``tp_as_async->am_anext`` slots in ``PyTypeObject``. 5. New AST nodes: ``AsyncFunctionDef``, ``AsyncFor``, ``AsyncWith``, ``Await``. 6. New functions: ``sys.set_coroutine_wrapper(callback)``, ``sys.get_coroutine_wrapper()``, ``types.coroutine(gen)``, ``inspect.iscoroutinefunction()``, ``inspect.iscoroutine()``, and ``inspect.isawaitable()``. 7. New ``CO_COROUTINE`` and ``CO_ITERABLE_COROUTINE`` bit flags for code objects. 8. New ABCs: ``collections.abc.Awaitable``, ``collections.abc.Coroutine``, ``collections.abc.AsyncIterable``, and ``collections.abc.AsyncIterator``. While the list of changes and new things is not short, it is important to understand, that most users will not use these features directly. It is intended to be used in frameworks and libraries to provide users with convenient to use and unambiguous APIs with ``async def``, ``await``, ``async for`` and ``async with`` syntax. Working example --------------- All concepts proposed in this PEP are implemented [3]_ and can be tested. :: import asyncio async def echo_server(): print('Serving on localhost:8000') await asyncio.start_server(handle_connection, 'localhost', 8000) async def handle_connection(reader, writer): print('New connection...') while True: data = await reader.read(8192) if not data: break print('Sending {:.10}... back'.format(repr(data))) writer.write(data) loop = asyncio.get_event_loop() loop.run_until_complete(echo_server()) try: loop.run_forever() finally: loop.close() Acceptance ========== PEP 492 was accepted by Guido, Tuesday, May 5, 2015 [14]_. Implementation ============== The implementation is tracked in issue 24017 [15]_. It was committed on May 11, 2015. References ========== .. [1] https://docs.python.org/3/library/asyncio-task.html#asyncio.coroutine .. [2] http://wiki.ecmascript.org/doku.php?id=strawman:async_functions .. [3] https://github.com/1st1/cpython/tree/await .. [4] https://hg.python.org/benchmarks .. [5] https://msdn.microsoft.com/en-us/library/hh191443.aspx .. [6] http://docs.hhvm.com/manual/en/hack.async.php .. [7] https://www.dartlang.org/articles/await-async/ .. [8] http://docs.scala-lang.org/sips/pending/async.html .. [9] https://github.com/google/traceur-compiler/wiki/LanguageFeatures#async-functions-experimental .. [10] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3722.pdf (PDF) .. [11] https://docs.python.org/3/reference/expressions.html#generator-iterator-methods .. [12] https://docs.python.org/3/reference/expressions.html#primaries .. [13] https://mail.python.org/pipermail/python-dev/2015-May/139851.html .. [14] https://mail.python.org/pipermail/python-dev/2015-May/139844.html .. [15] http://bugs.python.org/issue24017 .. [16] https://github.com/python/asyncio/issues/233 Acknowledgments =============== I thank Guido van Rossum, Victor Stinner, Elvis Pranskevichus, Andrew Svetlov, Ɓukasz Langa, Greg Ewing, Stephen J. Turnbull, Jim J. Jewett, Brett Cannon, Nick Coghlan, Steven D'Aprano, Paul Moore, Nathaniel Smith, Ethan Furman, Stefan Behnel, Paul Sokolovsky, Victor Petrovykh, and many others for their feedback, ideas, edits, criticism, code reviews, and discussions around this PEP. Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: