PEP: 3156
Title: Asynchronous IO Support Rebooted
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum <guido@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2012
Post-History: TBD

Abstract
========

This is a proposal for asynchronous I/O in Python 3, starting with
Python 3.3. Consider this the concrete proposal that is missing from
PEP 3153. The proposal includes a pluggable event loop API, transport
and protocol abstractions similar to those in Twisted, and a
higher-level scheduler based on yield-from (PEP 380). A reference
implementation is in the works under the code name tulip.

Introduction
============

The event loop is the place where most interoperability occurs. It
should be easy for (Python 3.3 ports of) frameworks like Twisted,
Tornado, or ZeroMQ to either adapt the default event loop
implementation to their needs using a lightweight wrapper or proxy, or
to replace the default event loop implementation with an adaptation of
their own event loop implementation. (Some frameworks, like Twisted,
have multiple event loop implementations. This should not be a
problem since these all have the same interface.)

It should even be possible for two different third-party frameworks to
interoperate, either by sharing the default event loop implementation
(each using its own adapter), or by sharing the event loop
implementation of either framework. In the latter case two levels of
adaptation would occur (from framework A's event loop to the standard
event loop interface, and from there to framework B's event loop).
Which event loop implementation is used should be under control of the
main program (though a default policy for event loop selection is
provided).

Thus, two separate APIs are defined:

- getting and setting the current event loop object
- the interface of a conforming event loop and its minimum guarantees

An event loop implementation may provide additional methods and
guarantees.

The event loop interface does not depend on yield-from. Rather, it
uses a combination of callbacks, additional interfaces (transports and
protocols), and Futures. The latter are similar to those defined in
PEP 3148, but have a different implementation and are not tied to
threads. In particular, they have no wait() method; the user is
expected to use callbacks.

For users (like myself) who don't like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 yield-from expressions. The scheduler is not pluggable;
pluggability occurs at the event loop level, and the scheduler should
work with any conforming event loop implementation.

For interoperability between code written using coroutines and other
async frameworks, the scheduler has a Task class that behaves like a
Future. A framework that interoperates at the event loop level can
wait for a Future to complete by adding a callback to the Future.
Likewise, the scheduler offers an operation to suspend a coroutine
until a callback is called.

Limited interoperability with threads is provided by the event loop
interface; there is an API to submit a function to an executor (see
PEP 3148) which returns a Future that is compatible with the event
loop.

Non-goals
=========

Interoperability with systems like Stackless Python or
greenlets/gevent is not a goal of this PEP.

Specification
=============

Dependencies
------------

Python 3.3 is required. No new language or standard library features
beyond Python 3.3 are required. No third-party modules or packages
are required.

Module Namespace
----------------

The specification here will live in a new toplevel package. Different
components will live in separate submodules of that package. The
package will import common APIs from their respective submodules and
make them available as package attributes (similar to the way the
email package works).

The name of the toplevel package is currently unspecified. The
reference implementation uses the name 'tulip', but the name will
change to something more boring if and when the implementation is
moved into the standard library (hopefully for Python 3.4).

Until the boring name is chosen, this PEP will use 'tulip' as the
toplevel package name. Classes and functions given without a module
name are assumed to be accessed via the toplevel package.

Event Loop Policy: Getting and Setting the Event Loop
-----------------------------------------------------

To get the current event loop, use ``get_event_loop()``. This returns
an instance of the ``EventLoop`` class defined below or an equivalent
object. It is possible that ``get_event_loop()`` returns a different
object depending on the current thread, or depending on some other
notion of context.

To set the current event loop, use ``set_event_loop(eventloop)``,
where ``eventloop`` is an instance of the ``EventLoop`` class or
equivalent. This uses the same notion of context as
``get_event_loop()``.

To change the way ``get_event_loop()`` and ``set_event_loop()`` work
(including their notion of context), call
``set_event_loop_policy(policy)``, where ``policy`` is an event loop
policy object. The policy object can be any object that has methods
``get_event_loop()`` and ``set_event_loop(eventloop)`` behaving like
the functions described above. The default event loop policy is an
instance of the class ``DefaultEventLoopPolicy``. The current event loop
policy object can be retrieved by calling ``get_event_loop_policy()``.

An event loop policy may, but does not have to, enforce that there is
only one event loop in existence. The default event loop policy does
not enforce this, but it does enforce that there is only one event
loop per thread.

Event Loop Interface
--------------------

A conforming event loop object has the following methods:

..
  Look for a better way to format method docs. PEP 12 doesn't
  seem to have one. PEP 418 uses ^^^, which makes sub-headings.

- ``run()``. Runs the event loop until there is nothing left to do.
  This means, in particular:

  - No more calls scheduled with ``call_later()`` (except for canceled
    calls).

  - No more registered file descriptors. It is up to the registering
    party to unregister a file descriptor when it is closed.

- TBD: Do we need an API for stopping the event loop, given that we
  have the termination condition? Is the termination condition
  compatible with other frameworks?

- TBD: Do we need an API to run the event loop for a little while
  (e.g. a single iteration)? If so, exactly what should it do?

- ``call_later(when, callback, *args)``. Arrange for
  ``callback(*args)`` to be called approximately ``when`` seconds in
  the future, once, unless canceled. As usual in Python, ``when`` may
  be a floating point number to represent smaller intervals. Returns
  a ``DelayedCall`` object representing the callback, whose
  ``cancel()`` method can be used to cancel the callback.

- ``call_soon(callback, *args)``. Equivalent to
  ``call_later(0, callback, *args)``.

- ``call_soon_threadsafe(callback, *args)``. Like
  ``call_soon(callback, *args)``, but when called from another thread
  while the event loop is blocked waiting for I/O, unblocks the event
  loop. This is the *only* method that is safe to call from another
  thread or from a signal handler. (To schedule a callback for a
  later time in a threadsafe manner, you can use
  ``ev.call_soon_threadsafe(ev.call_later, when, callback, *args)``.)

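To make the scheduling methods above concrete, here is a toy,
timer-only loop. It is a sketch under stated assumptions, not the tulip
implementation: ``ToyEventLoop`` and the minimal ``DelayedCall`` shown
here are hypothetical, there is no I/O and no
``call_soon_threadsafe()``, and waiting is done with a plain
``time.sleep()``.

```python
import heapq
import itertools
import time


class DelayedCall:
    """Minimal cancellation handle (hypothetical; the PEP leaves details TBD)."""

    def __init__(self, callback, args):
        self.callback = callback
        self.args = args
        self.canceled = False

    def cancel(self):
        self.canceled = True


class ToyEventLoop:
    """Illustrative timer-only loop: no I/O, no thread safety."""

    def __init__(self):
        self._heap = []                    # entries: (deadline, seq, DelayedCall)
        self._counter = itertools.count()  # tie-breaker for equal deadlines

    def call_later(self, when, callback, *args):
        call = DelayedCall(callback, args)
        deadline = time.monotonic() + when
        heapq.heappush(self._heap, (deadline, next(self._counter), call))
        return call

    def call_soon(self, callback, *args):
        return self.call_later(0, callback, *args)

    def run(self):
        # Run until nothing is left to do; canceled calls are skipped.
        while self._heap:
            deadline, _, call = heapq.heappop(self._heap)
            if call.canceled:
                continue
            delay = deadline - time.monotonic()
            if delay > 0:
                time.sleep(delay)
            call.callback(*call.args)


loop = ToyEventLoop()
order = []
loop.call_soon(order.append, "foo")
handle = loop.call_later(0.01, order.append, "never")
handle.cancel()                      # this call is dropped by run()
loop.call_later(0.02, order.append, "late")
loop.call_soon(order.append, "bar")
loop.run()                           # returns once the heap is empty
```

In this sketch ``run()`` terminates on its own because the only pending
work is the timer heap; a real loop would also have to account for
registered file descriptors.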
The following methods for registering callbacks for file descriptors
are optional. If they are not implemented, accessing the method
(without calling it) raises ``AttributeError``. The default
implementation provides them but the user normally doesn't use these
directly -- they are used by the transport implementations
exclusively. Also, on Windows these may be present or not depending
on whether a select-based or IOCP-based event loop is used. These
take integer file descriptors only, not objects with a ``fileno()``
method. The file descriptor should represent something pollable --
i.e. no disk files.

- ``add_reader(fd, callback, *args)``. Arrange for
  ``callback(*args)`` to be called whenever file descriptor ``fd`` is
  ready for reading. Returns a ``DelayedCall`` object which can be
  used to cancel the callback. Note that, unlike ``call_later()``,
  the callback may be called many times. Calling ``add_reader()``
  again for the same file descriptor implicitly cancels the previous
  callback for that file descriptor.

- ``add_writer(fd, callback, *args)``. Like ``add_reader()``,
  but registers the callback for writing instead of for reading.

- ``remove_reader(fd)``. Cancels the current read callback for file
  descriptor ``fd``, if one is set. A no-op if no callback is
  currently set for the file descriptor. (The reason for providing
  this alternate interface is that it is often more convenient to
  remember the file descriptor than to remember the ``DelayedCall``
  object.)

- ``remove_writer(fd)``. This is to ``add_writer()`` as
  ``remove_reader()`` is to ``add_reader()``.

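The reader half of this interface can be sketched with
``select.select()``. The class name ``ToyReaderLoop`` is hypothetical,
the sketch omits the ``DelayedCall`` return value and the writer
methods for brevity, and it assumes a platform where
``socket.socketpair()`` is available.

```python
import select
import socket


class ToyReaderLoop:
    """Illustrative only: read callbacks keyed by integer file descriptor."""

    def __init__(self):
        self._readers = {}  # fd -> (callback, args)

    def add_reader(self, fd, callback, *args):
        # Registering the same fd again replaces (implicitly cancels)
        # the previous callback.
        self._readers[fd] = (callback, args)

    def remove_reader(self, fd):
        # A no-op if nothing is registered for fd.
        self._readers.pop(fd, None)

    def run(self):
        # Stop when no file descriptors remain registered.
        while self._readers:
            ready, _, _ = select.select(list(self._readers), [], [])
            for fd in ready:
                entry = self._readers.get(fd)
                if entry is not None:  # may have been removed by an earlier callback
                    callback, args = entry
                    callback(*args)


a, b = socket.socketpair()
loop = ToyReaderLoop()
received = []


def on_readable(sock):
    received.append(sock.recv(100))
    # The registering party unregisters the fd when it is done with it.
    loop.remove_reader(sock.fileno())


loop.add_reader(b.fileno(), on_readable, b)
a.sendall(b"ping")
loop.run()
a.close()
b.close()
```

Note that the callback itself calls ``remove_reader()``; without that,
``run()`` in this sketch would never see an empty registry and would not
return.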
The following methods for doing async I/O on sockets are optional.
They are an alternative to the previous set of optional methods,
intended for transport implementations on Windows using IOCP (if the
event loop supports it). The socket argument has to be a non-blocking
socket.

- ``sock_recv(sock, n)``. Receive up to ``n`` bytes from socket
  ``sock``. Returns a ``Future`` whose result on success will be a
  bytes object.

- ``sock_sendall(sock, data)``. Send bytes ``data`` to the socket
  ``sock``. Returns a ``Future`` whose result on success will be
  ``None``. (TBD: Is it better to emulate ``sendall()`` or ``send()``
  semantics?)

- ``sock_connect(sock, address)``. Connect to the given address.
  Returns a ``Future`` whose result on success will be ``None``.

- ``sock_accept(sock)``. Accept a connection from a socket. The
  socket must be in listening mode and bound to an address. Returns a
  ``Future`` whose result on success will be a tuple ``(conn, peer)``
  where ``conn`` is a connected non-blocking socket and ``peer`` is
  the peer address.

Other TBD:

- TBD: A method to submit a call to a PEP 3148 executor. Or a method
  to wait for a PEP 3148 Future. Or both.

- TBD: Methods that return ``Futures``, in particular to make a
  connection and to set up a listener.

- TBD: Do we need introspection APIs? E.g. asking for the read
  callback given a file descriptor. Or when the next scheduled call
  is. Or the list of file descriptors registered with callbacks.

Callback Sequencing
-------------------

When two callbacks are scheduled for the same time, they are run
in the order in which they are registered. For example::

    ev.call_soon(foo)
    ev.call_soon(bar)

guarantees that ``foo()`` is called before ``bar()``.

If ``call_soon()`` is used, this guarantee is true even if the system
clock were to run backwards. This is also the case for
``call_later(0, callback, *args)``. However, if ``call_later()`` is
used with a nonzero ``when`` argument, all bets are off if the system
clock were to run backwards. (A good event loop implementation
should use ``time.monotonic()`` to avoid problems when the clock runs
backward. See PEP 418.)

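One way an implementation might provide this ordering guarantee is
sketched below. The ``CallQueue`` class and its ``add()`` and
``pop_due()`` methods are hypothetical, and the current time is passed
in explicitly so that the behavior does not depend on a real clock.
Calls with a zero delay go into a FIFO queue that never consults the
clock; timed calls go into a heap with a registration counter as the
tie-breaker.

```python
import collections
import heapq
import itertools


class CallQueue:
    """Hypothetical ordering structure for an event loop's pending calls."""

    def __init__(self):
        self._ready = collections.deque()   # FIFO; never consults a clock
        self._timers = []                   # heap of (deadline, seq, callback)
        self._counter = itertools.count()   # registration-order tie-breaker

    def add(self, when, callback, now):
        if when <= 0:
            # call_soon() and call_later(0, ...) land here.
            self._ready.append(callback)
        else:
            entry = (now + when, next(self._counter), callback)
            heapq.heappush(self._timers, entry)

    def pop_due(self, now):
        """Return the callbacks that should run at time ``now``, in order."""
        # Assumption: expired timers are queued behind already-ready calls.
        while self._timers and self._timers[0][0] <= now:
            self._ready.append(heapq.heappop(self._timers)[2])
        due = list(self._ready)
        self._ready.clear()
        return due
```

Because the counter is compared before the callback, two timers with
the same deadline come out in registration order and the callbacks
themselves are never compared.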
Context
-------

All event loops have a notion of context. For the default event loop
implementation, the context is a thread. An event loop implementation
should run all callbacks in the same context. An event loop
implementation should run only one callback at a time, so callbacks
can assume automatic mutual exclusion with other callbacks scheduled
in the same event loop.

The DelayedCall Class
---------------------

TBD. (Only one method, ``cancel()``, and a read-only property,
``canceled``. Perhaps also ``callback`` and ``args`` properties.)

TBD: Find a better name?

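Since this class is still TBD, the following is only one possible shape
for it, based on the members listed above. The constructor signature
and the choice to make ``cancel()`` idempotent are assumptions.

```python
class DelayedCall:
    """Hypothetical sketch of the handle described above."""

    def __init__(self, callback, args):
        self._callback = callback
        self._args = args
        self._canceled = False

    @property
    def callback(self):
        return self._callback

    @property
    def args(self):
        return self._args

    @property
    def canceled(self):
        # Read-only: there is no setter, so assignment raises AttributeError.
        return self._canceled

    def cancel(self):
        # Assumption: canceling more than once is harmless.
        self._canceled = True
```

An event loop would check ``canceled`` just before invoking
``callback(*args)`` and skip the call if it is set.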
Futures
-------

TBD.

Transports
----------

TBD.

Protocols
---------

TBD.

Coroutines and the Scheduler
----------------------------

TBD.

Callback Style
--------------

Most interfaces taking a callback also take positional arguments. For
instance, to arrange for ``foo("abc", 42)`` to be called soon, you
call ``ev.call_soon(foo, "abc", 42)``. To schedule the call
``foo()``, use ``ev.call_soon(foo)``. This convention greatly reduces
the number of small lambdas required in typical callback programming.

This convention specifically does *not* support keyword arguments.
Keyword arguments are used to pass optional extra information about
the callback. This allows graceful evolution of the API without
having to worry about whether a keyword might be significant to a
callee somewhere. If you have a callback that *must* be called with a
keyword argument, you can use a lambda or ``functools.partial``. For
example::

    ev.call_soon(functools.partial(foo, "abc", repeat=42))

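The convention can be tried out without an event loop. In the sketch
below, ``call_soon()`` and ``run_pending()`` are hypothetical
stand-ins that only queue and invoke callbacks, and ``foo()`` is an
example callee; none of them is part of the proposed API.

```python
import functools

pending = []  # hypothetical stand-in for an event loop's ready queue


def call_soon(callback, *args):
    """Queue callback(*args); keyword arguments are deliberately not accepted."""
    pending.append((callback, args))


def run_pending():
    """Invoke the queued callbacks in order and collect their return values."""
    results = []
    while pending:
        callback, args = pending.pop(0)
        results.append(callback(*args))
    return results


def foo(text, repeat=1):
    return text * repeat


call_soon(foo, "abc", 2)                             # positional: no lambda needed
call_soon(functools.partial(foo, "abc", repeat=3))   # keyword via partial
call_soon(lambda: foo("abc", repeat=1))              # keyword via lambda
results = run_pending()
```

Passing a keyword directly, as in ``call_soon(foo, repeat=2)``, fails
with ``TypeError`` in this sketch, which keeps keywords free for the
scheduling function's own future options.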
Choosing an Event Loop Implementation
-------------------------------------

TBD. (This is about the choice to use e.g. select vs. poll vs. epoll,
and how to override the choice. Probably belongs in the event loop
policy.)

Open Issues
===========

- Should we have ``future.add_callback(callback, *args)``, using the
  convention from the section "Callback Style" above, or should we
  stick with the PEP 3148 specification of
  ``future.add_done_callback(callback)`` which calls
  ``callback(future)``? (Glyph suggested using a different method
  name since ``add_done_callback()`` does not guarantee that the
  callback will be called in the right context.)

- Returning a Future is relatively expensive, and it is quite possible
  that some types of calls *usually* complete immediately
  (e.g. writing small amounts of data to a socket). A trick used by
  Richard Oudkerk in the tulip project's proactor branch makes calls
  like ``recv()`` either return a regular result or *raise* a Future.
  The caller (likely a transport) must then write code like this::

      try:
          # Fast path: the data was available immediately.
          res = ev.sock_recv(sock, 8192)
      except Future as f:
          # Slow path: wait for the raised Future, then fetch its result.
          yield from sch.block_future(f)
          res = f.result()

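The "return a result or raise a Future" trick can be illustrated in
isolation. Everything in this sketch is a hypothetical stand-in: the
``Future`` class is a bare exception subclass with just enough behavior
for the example, and ``recv()`` reads from an in-memory ``bytearray``
instead of a socket.

```python
class Future(Exception):
    """Hypothetical Future that can also be raised; not the PEP's final design."""

    def __init__(self):
        super().__init__()
        self._result = None
        self._done = False

    def set_result(self, result):
        self._result = result
        self._done = True

    def result(self):
        if not self._done:
            raise RuntimeError("result is not ready")
        return self._result


def recv(buffered, n):
    """Return up to n bytes immediately if any are buffered, else raise a Future."""
    if buffered:
        data = bytes(buffered[:n])
        del buffered[:n]
        return data
    # Nothing available yet: hand the caller a Future to wait on.
    raise Future()
```

The common case pays only for a normal return; the cost of creating a
Future (and of handling an exception) is incurred only when the call
would otherwise block.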
Acknowledgments
===============

Apart from PEP 3153, influences include PEP 380 and Greg Ewing's
tutorial for yield-from, Twisted, Tornado, ZeroMQ, pyftpdlib, tulip
(the author's attempts at synthesis of all these), wattle (Steve
Dower's counter-proposal), numerous discussions on python-ideas from
September through December 2012, a Skype session with Steve Dower and
Dino Viehland, email exchanges with Ben Darnell, an audience with
Niels Provos (original author of libevent), and two in-person meetings
with several Twisted developers, including Glyph, Brian Warner, David
Reid, and Duncan McGreggor.

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: