Update a lot of the intro text in PEP 3156.

Guido van Rossum 2013-10-23 08:17:05 -07:00
parent 7dcf9d7527
commit 2d1e124438
1 changed files with 133 additions and 74 deletions

@ -13,25 +13,68 @@ Post-History: 21-Dec-2012
Abstract
========
This is a proposal for asynchronous I/O in Python 3, starting with
This is a proposal for asynchronous I/O in Python 3, starting at
Python 3.3. Consider this the concrete proposal that is missing from
PEP 3153. The proposal includes a pluggable event loop API, transport
and protocol abstractions similar to those in Twisted, and a
higher-level scheduler based on ``yield from`` (PEP 380). A reference
implementation is in the works under the code name Tulip. The Tulip
repo is linked from the References section at the end.
The proposed standard library module name is ``asyncio``, although the
rest of this PEP has not yet been updated to reflect this.
PEP 3153. The proposal includes a pluggable event loop, transport and
protocol abstractions similar to those in Twisted, and a higher-level
scheduler based on ``yield from`` (PEP 380). The proposed package
name is ``asyncio``.
Introduction
============
Status
------
A reference implementation exists under the code name Tulip. The
Tulip repo is linked from the References section at the end. Packages
based on this repo will be provided on PyPI (see References) to enable
using the ``asyncio`` package with Python 3.3 installations.
As of October 20th 2013, the ``asyncio`` package has been checked into
the Python 3.4 repository and released with Python 3.4-alpha-4, with
"provisional" API status. This is an expression of confidence and
intended to increase early feedback on the API, and not intended to
force acceptance of the PEP. The expectation is that the package will
keep provisional status in Python 3.4 and progress to final status in
Python 3.5. Development continues to occur primarily in the Tulip
repo.
Dependencies
------------
Python 3.3 is required for many of the proposed features. The
reference implementation (Tulip) requires no new language or standard
library features beyond Python 3.3, no third-party modules or
packages, and no C code, except for the proactor-based event loop on
Windows.
Module Namespace
----------------
The specification here lives in a new top-level package, ``asyncio``.
Different components live in separate submodules of the package. The
package will import common APIs from their respective submodules and
make them available as package attributes (similar to the way the
email package works). For such common APIs, the name of the submodule
that actually defines them is not part of the specification. Less
common APIs may have to explicitly be imported from their respective
submodule, and in this case the submodule name is part of the
specification.
Classes and functions defined without a submodule name are assumed to
live in the namespace of the top-level package. (But do not confuse
these with methods of various classes, which for brevity are also used
without a namespace prefix in certain contexts.)
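A minimal sketch of the namespace rule, assuming the ``asyncio`` package as it later shipped in the standard library: common names are reached as attributes of the top-level package, and the code never names the submodule that defines them.

```python
import asyncio

# Common APIs are used as attributes of the top-level package; the
# submodule that actually defines them is deliberately not spelled out.
loop = asyncio.new_event_loop()
fut = asyncio.Future(loop=loop)

is_loop = isinstance(loop, asyncio.AbstractEventLoop)
is_pending = not fut.done()

loop.close()
```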
Interoperability
----------------
The event loop is the place where most interoperability occurs. It
should be easy for (Python 3.3 ports of) frameworks like Twisted,
Tornado, or even gevent to either adapt the default event loop
implementation to their needs using a lightweight wrapper or proxy, or
implementation to their needs using a lightweight adapter or proxy, or
to replace the default event loop implementation with an adaptation of
their own event loop implementation. (Some frameworks, like Twisted,
have multiple event loop implementations. This should not be a
@ -62,66 +105,103 @@ are defined:
- the interface of a conforming event loop and its minimum guarantees
An event loop implementation may provide additional methods and
guarantees.
guarantees, as long as these are called out in the documentation as
non-standard. An event loop implementation may also leave certain
methods unimplemented if they cannot be implemented in the given
environment; however, such deviations from the standard API should be
considered only as a last resort, and only if the platform or
environment forces the issue. (An example could be a platform where
there is a system event loop that cannot be started or stopped.)
The event loop interface does not depend on ``yield from``. Rather, it
uses a combination of callbacks, additional interfaces (transports and
The event loop API does not depend on ``yield from``. Rather, it uses
a combination of callbacks, additional interfaces (transports and
protocols), and Futures. The latter are similar to those defined in
PEP 3148, but have a different implementation and are not tied to
threads. In particular, they have no wait() method; the user is
expected to use callbacks.
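As a rough illustration of the callback style described above (a sketch against the ``asyncio`` module as it later shipped, not text from the PEP), a Future can be completed and observed purely through callbacks. The names ``on_done`` and ``results`` are made up for this example.

```python
import asyncio

loop = asyncio.new_event_loop()
fut = asyncio.Future(loop=loop)
results = []

def on_done(f):
    # Called by the event loop once the Future has a result.
    results.append(f.result())
    loop.stop()

fut.add_done_callback(on_done)

# Unlike the PEP 3148 Futures, there is no blocking wait() method.
has_wait = hasattr(fut, "wait")

# Complete the Future from a callback scheduled on the loop itself.
loop.call_soon(fut.set_result, 42)
loop.run_forever()
loop.close()
```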
All event loop methods documented as returning a coroutine are allowed
to return either a Future or a coroutine, at the implementation's
choice (the standard implementation always returns coroutines). All
event loop methods documented as accepting coroutine arguments *must*
accept both Futures and coroutines for such arguments. (A convenience
function, ``async()``, exists to convert an argument that is either a
coroutine or a Future into a Future.)
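A small sketch of that conversion, with two loud assumptions: it uses ``asyncio.ensure_future()``, the name ``async()`` was later given once ``async`` became a reserved word, and it uses ``async def``/``await``, the later spelling of the ``yield from`` coroutines this PEP describes. The function ``compute`` is hypothetical.

```python
import asyncio

async def compute():
    return 6 * 7

async def main():
    loop = asyncio.get_running_loop()

    # A coroutine is wrapped in a Task, which is itself a Future.
    task = asyncio.ensure_future(compute())

    # A Future is passed through unchanged.
    fut = loop.create_future()
    same = asyncio.ensure_future(fut) is fut
    fut.set_result("done")

    return await task, await fut, same, isinstance(task, asyncio.Future)

result = asyncio.run(main())
```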
For users (like myself) who don't like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 ``yield from`` expressions. The scheduler is not pluggable;
pluggability occurs at the event loop level, and the scheduler should
work with any conforming event loop implementation.
pluggability occurs at the event loop level, and the standard
scheduler implementation should work with any conforming event loop
implementation. (In fact this is an important litmus test for
conforming implementations.)
For interoperability between code written using coroutines and other
async frameworks, the scheduler has a Task class that behaves like a
async frameworks, the scheduler defines a Task class that behaves like a
Future. A framework that interoperates at the event loop level can
wait for a Future to complete by adding a callback to the Future.
Likewise, the scheduler offers an operation to suspend a coroutine
until a callback is called.
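Both directions of this interoperability can be sketched as follows, assuming the later ``async def``/``await`` syntax in place of ``yield from``. Callback-style code watches a Task through ``add_done_callback()``, while the coroutine is suspended on a Future that a plain callback resolves. The names ``worker``, ``gate`` and ``events`` are invented for the example.

```python
import asyncio

events = []

async def worker(gate):
    # Suspend this coroutine until some callback resolves the Future.
    value = await gate
    events.append(("worker resumed", value))
    return value.upper()

async def main():
    loop = asyncio.get_running_loop()
    gate = loop.create_future()
    task = loop.create_task(worker(gate))

    # Callback-based code can observe the Task like any other Future.
    task.add_done_callback(lambda t: events.append(("task done", t.result())))

    # A plain callback, scheduled on the loop, wakes the coroutine up.
    loop.call_later(0.01, gate.set_result, "ready")

    await task
    await asyncio.sleep(0)  # give any pending callbacks a chance to run
    return events

asyncio.run(main())
```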
Limited interoperability with threads is provided by the event loop
interface; there is an API to submit a function to an executor (see
PEP 3148) which returns a Future that is compatible with the event
loop.
The event loop API provides limited interoperability with threads:
there is an API to submit a function to an executor (see PEP 3148)
which returns a Future that is compatible with the event loop, and
there is a method to schedule a callback with an event loop from
another thread in a thread-safe manner.
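The two thread-related facilities mentioned here might be used roughly as below. This is a sketch under the assumption that they correspond to ``run_in_executor()`` and ``call_soon_threadsafe()`` in the ``asyncio`` module as it later shipped; ``blocking_square`` is a hypothetical stand-in for blocking work.

```python
import asyncio
import threading

def blocking_square(x):
    # Stand-in for blocking work that should not run on the event loop.
    return x * x

async def main():
    loop = asyncio.get_running_loop()

    # Submit a function to an executor; None selects the loop's default
    # executor. The returned Future can be awaited on the loop.
    squared = await loop.run_in_executor(None, blocking_square, 7)

    # Schedule a callback on the loop from another thread, thread-safely.
    done = loop.create_future()
    worker = threading.Thread(
        target=lambda: loop.call_soon_threadsafe(done.set_result, "from thread")
    )
    worker.start()
    message = await done
    worker.join()

    return squared, message

result = asyncio.run(main())
```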
A Note About Transports and Protocols
-------------------------------------
Transports and Protocols
------------------------
For those not familiar with Twisted, a quick explanation of the
difference between transports and protocols is in order. At the
relationship between transports and protocols is in order. At the
highest level, the transport is concerned with *how* bytes are
transmitted, while the protocol determines *which* bytes to transmit
(and to some extent when).
A different way of saying the same thing: a transport is an
abstraction for a socket (or similar I/O endpoint) while a protocol is
an abstraction for an application, from the transport's point of view.
Yet another view is simply that the transport and protocol interfaces
*together* define an abstract interface for using network I/O and
interprocess I/O.
There is almost always a 1:1 relationship between transport and
protocol objects: the protocol calls transport methods to send data,
while the transport calls protocol methods to pass it data that has
been received. Neither transport nor protocol methods "block" -- they
set events into motion and then return.
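To make the 1:1 relationship concrete, here is a minimal sketch in which the transport is driven by hand. ``RecordingTransport`` is a hypothetical in-memory stand-in, not a real transport; ``EchoProtocol`` builds on ``asyncio.Protocol`` from the module as it later shipped.

```python
import asyncio

class RecordingTransport:
    """Hypothetical in-memory stand-in for a real transport."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        # A real transport would buffer and send; this one just records.
        self.sent.append(data)

class EchoProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        # The transport hands itself to the protocol exactly once.
        self.transport = transport

    def data_received(self, data):
        # The transport calls this with incoming bytes; the protocol
        # replies by calling a transport method. Neither call blocks.
        self.transport.write(data)

transport = RecordingTransport()
protocol = EchoProtocol()
protocol.connection_made(transport)
protocol.data_received(b"ping")
```

In real use the event loop creates the transport and invokes these protocol methods; the manual calls above only imitate that sequence.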
The most common type of transport is a bidirectional stream transport.
It represents a pair of streams (one in each direction) that each
transmit a sequence of bytes. The most common example of a
It represents a pair of buffered streams (one in each direction) that
each transmit a sequence of bytes. The most common example of a
bidirectional stream transport is probably a TCP connection. Another
common example is an SSL connection. But there are some other things
that can be viewed this way, for example an SSH session or a pair of
UNIX pipes. Typically there aren't many different transport
implementations, and most of them come with the event loop
implementation. Note that transports don't need to use sockets, not
even if they use TCP -- sockets are a platform-specific implementation
detail.
implementation. (But there is no requirement that all transports must
be created by calling an event loop method -- a third party module may
well implement a new transport and provide a constructor or factory
function for it that simply takes an event loop as an argument.)
A bidirectional stream transport has two "sides": one side talks to
Note that transports don't need to use sockets, not even if they use
TCP -- sockets are a platform-specific implementation detail.
A bidirectional stream transport has two "ends": one end talks to
the network (or another process, or whatever low-level interface it
wraps), and the other side talks to the protocol. The former uses
wraps), and the other end talks to the protocol. The former uses
whatever API is necessary to implement the transport; but the
interface between transport and protocol is standardized by this PEP.
A protocol represents some kind of "application-level" protocol such
as HTTP or SMTP. Its primary interface is with the transport. While
some popular protocols will probably have a standard implementation,
often applications implement custom protocols. It also makes sense to
have libraries of useful 3rd party protocol implementations that can
be downloaded and installed from pypi.python.org.
A protocol can represent some kind of "application-level" protocol
such as HTTP or SMTP; it can also implement an abstraction shared by
multiple protocols, or a whole application. A protocol's primary
interface is with the transport. While some popular protocols (and
other abstractions) may have standard implementations, often
applications implement custom protocols. It also makes sense to have
libraries of useful third party protocol implementations that can be
downloaded and installed from PyPI.
The general notion of transport and protocol includes other
interfaces, where the transport wraps some other communication
@ -134,37 +214,10 @@ is different in each case.
Details of the interfaces defined by the various standard types of
transports and protocols are given later.
Dependencies
------------
Python 3.3 is required for many of the proposed features. The
reference implementation (Tulip) requires no new language or standard
library features beyond Python 3.3, no third-party modules or
packages, and no C code, except for the proactor-based event loop on
Windows.
Event Loop Interface Specification
==================================
Module Namespace
----------------
The specification here will live in a new toplevel package. Different
components will live in separate submodules of that package. The
package will import common APIs from their respective submodules and
make them available as package attributes (similar to the way the
email package works).
The name of the toplevel package is currently unspecified. The
reference implementation uses the name 'tulip', but the name will
change to something more boring if and when the implementation is
moved into the standard library (hopefully for Python 3.4).
Until the boring name is chosen, this PEP will use 'tulip' as the
toplevel package name. Classes and functions given without a module
name are assumed to be accessed via the toplevel package.
Event Loop Policy: Getting and Setting the Current Event Loop
-------------------------------------------------------------
@ -189,7 +242,8 @@ one. It should never return ``None``.
To set the event loop for the current context, use
``set_event_loop(event_loop)``, where ``event_loop`` is an event loop
object. It is okay to set the current event loop to ``None``, in
object, i.e. an instance of ``AbstractEventLoop``, or ``None``.
It is okay to set the current event loop to ``None``, in
which case subsequent calls to ``get_event_loop()`` will raise an
exception. This is useful for testing code that should not depend on
the existence of a default event loop.
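A short sketch of this behaviour, assuming the ``asyncio`` module as it later shipped, where the exception raised is ``RuntimeError``. The PEP text only promises "an exception", so the exact type is an assumption of this example.

```python
import asyncio

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
current_is_ours = asyncio.get_event_loop() is loop

# Clearing the current loop makes later lookups fail loudly, which is
# handy for tests that must not rely on a default event loop.
asyncio.set_event_loop(None)
try:
    asyncio.get_event_loop()
    raised = False
except RuntimeError:
    raised = True
finally:
    loop.close()
```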
@ -213,17 +267,18 @@ event loop object according to the policy's default rules. To make
this the current event loop, you must call ``set_event_loop()`` with
it.
To change the event loop policy,
call ``set_event_loop_policy(policy)``, where ``policy`` is an event
loop policy object or ``None``. The policy object must be an object
that has methods ``get_event_loop()``, ``set_event_loop(loop)`` and
To change the event loop policy, call
``set_event_loop_policy(policy)``, where ``policy`` is an event loop
policy object or ``None``. If not ``None``, the policy object must be
an instance of ``AbstractEventLoopPolicy`` that defines methods
``get_event_loop()``, ``set_event_loop(loop)`` and
``new_event_loop()``, all behaving like the functions described above.
Passing a policy value of ``None`` restores the default event loop
policy (overriding the alternate default set by the platform or
framework). The default event loop policy is an instance of the class
``DefaultEventLoopPolicy``. The current event loop policy object can
be retrieved by calling ``get_event_loop_policy()``. (TBD: Require
inheriting from ``AbstractEventLoopPolicy``?)
be retrieved by calling ``get_event_loop_policy()``.
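The policy hooks can be sketched as below. ``CountingPolicy`` is a hypothetical subclass written for this illustration. Newer CPython releases deprecate the policy API, so the sketch silences ``DeprecationWarning`` and should be read as an illustration of the text above, not as current best practice.

```python
import asyncio
import warnings

with warnings.catch_warnings():
    # Newer CPython releases warn that the policy API is deprecated.
    warnings.simplefilter("ignore", DeprecationWarning)

    class CountingPolicy(asyncio.DefaultEventLoopPolicy):
        """Hypothetical policy that counts the loops it creates."""

        def __init__(self):
            super().__init__()
            self.created = 0

        def new_event_loop(self):
            self.created += 1
            return super().new_event_loop()

    policy = CountingPolicy()
    asyncio.set_event_loop_policy(policy)

    # The module-level function delegates to the installed policy.
    loop = asyncio.new_event_loop()
    loop.close()
    using_custom = asyncio.get_event_loop_policy() is policy

    # Passing None restores the default policy.
    asyncio.set_event_loop_policy(None)
    restored = isinstance(
        asyncio.get_event_loop_policy(), asyncio.DefaultEventLoopPolicy
    )
```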
Event Loop Classes
------------------
@ -544,8 +599,8 @@ use a different transport and protocol interface.
specific address. This is how you would do that. The host and
port are looked up using ``getaddrinfo()``.
- ``create_server_serving(protocol_factory, host, port,
<options>)``. Enters a serving loop that accepts connections.
- ``create_server(protocol_factory, host, port, <options>)``.
Enters a serving loop that accepts connections.
This is a coroutine that completes once the serving loop is set up
to serve. The return value is a ``Server`` object which can be used
to stop the serving loop in a controlled fashion by calling its
@ -735,8 +790,10 @@ I/O. This section of the API is clearly not yet ready for review.
write half of the bidirectional stream interface.
- TBD: A way to run a subprocess with stdin, stdout and stderr
connected to pipe transports. (This is being designed but not yet
ready.)
connected to pipe transports. (This is implemented now but not yet
documented.) (TBD: Document that the subprocess's
connection_closed() won't be called until the process has exited
*and* all pipes are closed, and why -- it's a race condition.)
TBD: offer the same interface on Windows for e.g. named pipes. (This
should be possible given that the standard library ``subprocess``
@ -1488,6 +1545,8 @@ References
- Tulip repo: http://code.google.com/p/tulip/
- PyPI: the Python Package Index at http://pypi.python.org/
- Nick Coghlan wrote a nice blog post with some background, thoughts
about different approaches to async I/O, gevent, and how to use
futures with constructs like ``while``, ``for`` and ``with``: