PEP: 3156
Title: Asynchronous IO Support Rebooted
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum <guido@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2012
Post-History: 21-Dec-2012
Abstract
========
This is a proposal for asynchronous I/O in Python 3, starting with
Python 3.3. Consider this the concrete proposal that is missing from
PEP 3153. The proposal includes a pluggable event loop API, transport
and protocol abstractions similar to those in Twisted, and a
higher-level scheduler based on ``yield from`` (PEP 380). A reference
implementation is in the works under the code name Tulip (the Tulip
repo is linked from the References section at the end).
Introduction
============
The event loop is the place where most interoperability occurs. It
should be easy for (Python 3.3 ports of) frameworks like Twisted,
Tornado, or ZeroMQ to either adapt the default event loop
implementation to their needs using a lightweight wrapper or proxy, or
to replace the default event loop implementation with an adaptation of
their own event loop implementation. (Some frameworks, like Twisted,
have multiple event loop implementations. This should not be a
problem since these all have the same interface.)
It should even be possible for two different third-party frameworks to
interoperate, either by sharing the default event loop implementation
(each using its own adapter), or by sharing the event loop
implementation of either framework. In the latter case two levels of
adaptation would occur (from framework A's event loop to the standard
event loop interface, and from there to framework B's event loop).
Which event loop implementation is used should be under control of the
main program (though a default policy for event loop selection is
provided).
Thus, two separate APIs are defined:
- getting and setting the current event loop object
- the interface of a conforming event loop and its minimum guarantees
An event loop implementation may provide additional methods and
guarantees.
The event loop interface does not depend on ``yield from``. Rather, it
uses a combination of callbacks, additional interfaces (transports and
protocols), and Futures. The latter are similar to those defined in
PEP 3148, but have a different implementation and are not tied to
threads. In particular, they have no wait() method; the user is
expected to use callbacks.
For users (like myself) who don't like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 ``yield from`` expressions. The scheduler is not pluggable;
pluggability occurs at the event loop level, and the scheduler should
work with any conforming event loop implementation.
For interoperability between code written using coroutines and other
async frameworks, the scheduler has a Task class that behaves like a
Future. A framework that interoperates at the event loop level can
wait for a Future to complete by adding a callback to the Future.
Likewise, the scheduler offers an operation to suspend a coroutine
until a callback is called.
Limited interoperability with threads is provided by the event loop
interface; there is an API to submit a function to an executor (see
PEP 3148) which returns a Future that is compatible with the event
loop.
A Note About Transports and Protocols
-------------------------------------
For those not familiar with Twisted, a quick explanation of the
difference between transports and protocols is in order. At the
highest level, the transport is concerned with *how* bytes are
transmitted, while the protocol determines *which* bytes to transmit
(and when).
A transport represents a pair of streams (one in each direction) that
each transmit a sequence of bytes. The most common transport is
probably the TCP connection. Another common transport is SSL. But
there are some other things that can be viewed as transports, for
example an SSH session or a pair of UNIX pipes. Typically there
aren't many different transport implementations, and most of them
come with the event loop implementation.
A transport has two "sides": one side talks to the network (or the
subprocess, or whatever low-level interface it wraps), and the other
side talks to the protocol. The former uses whatever API is necessary
to implement the transport; but the interface between transport and
protocol is standardized by this PEP.
A protocol represents some kind of "application-level" protocol such
as HTTP or SMTP. Its primary interface is with the transport. While
some popular protocols will probably have a standard implementation,
often applications implement custom protocols. It also makes sense to
have libraries of useful 3rd party protocol implementations that can
be downloaded and installed from pypi.python.org.
There is also a somewhat more general notion of transport and
protocol, where the transport wraps some other communication
abstraction.  Examples include an interface for sending and receiving
datagrams, or a subprocess manager. The separation of concerns is the
same as for stream transports and protocols, but the specific
interface between transport and protocol can be different in each
case.
Details of the interfaces between transports and protocols are given
later.
Non-goals
=========
Interoperability with systems like Stackless Python or
greenlets/gevent is not a goal of this PEP.
Specification
=============
Dependencies
------------
Python 3.3 is required. No new language or standard library features
beyond Python 3.3 are required. No third-party modules or packages
are required.
Module Namespace
----------------
The specification here will live in a new toplevel package. Different
components will live in separate submodules of that package. The
package will import common APIs from their respective submodules and
make them available as package attributes (similar to the way the
email package works).
The name of the toplevel package is currently unspecified. The
reference implementation uses the name 'tulip', but the name will
change to something more boring if and when the implementation is
moved into the standard library (hopefully for Python 3.4).
Until the boring name is chosen, this PEP will use 'tulip' as the
toplevel package name. Classes and functions given without a module
name are assumed to be accessed via the toplevel package.
Event Loop Policy: Getting and Setting the Event Loop
-----------------------------------------------------
To get the current event loop, use ``get_event_loop()``. This returns
an instance of the ``EventLoop`` class defined below or an equivalent
object. It is possible that ``get_event_loop()`` returns a different
object depending on the current thread, or depending on some other
notion of context.
To set the current event loop, use ``set_event_loop(event_loop)``,
where ``event_loop`` is an instance of the ``EventLoop`` class or
equivalent. This uses the same notion of context as
``get_event_loop()``.
For the benefit of unit tests and other special cases there's a third
policy function: ``new_event_loop()``, which creates and returns a new
EventLoop instance according to the policy's default rules. To make
this the current event loop, you must call ``set_event_loop()``.
To change the way the above three functions work
(including their notion of context), call
``set_event_loop_policy(policy)``, where ``policy`` is an event loop
policy object. The policy object can be any object that has methods
``get_event_loop()``, ``set_event_loop(event_loop)``
and ``new_event_loop()`` behaving like
the functions described above. The default event loop policy is an
instance of the class ``DefaultEventLoopPolicy``. The current event loop
policy object can be retrieved by calling ``get_event_loop_policy()``.
An event loop policy may but does not have to enforce that there is
only one event loop in existence. The default event loop policy does
not enforce this, but it does enforce that there is only one event
loop per thread.
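For instance, a unit test might use these policy functions to give each
test its own loop (a minimal sketch; it assumes the toplevel package is
importable as ``tulip``, which is only the provisional name)::

    import unittest
    import tulip

    class EventLoopTestCase(unittest.TestCase):

        def setUp(self):
            # Create a fresh loop and make it the current one for this
            # context, so tests don't share event loop state.
            self.loop = tulip.new_event_loop()
            tulip.set_event_loop(self.loop)

        def tearDown(self):
            # Release the loop's resources (e.g. its epoll/kqueue fd).
            self.loop.close()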
Event Loop Interface
--------------------
A note about times: as usual in Python, all timeouts, intervals and
delays are measured in seconds, and may be ints or floats. The
accuracy and precision of the clock are up to the implementation; the
default implementation uses ``time.monotonic()``.
A note about callbacks and Handlers: any function that takes a
callback and a variable number of arguments for it can also be given a
Handler object instead of the callback. Then no arguments should be
given, and the Handler should represent an immediate callback (as
returned from ``call_soon()``), not a delayed callback (as returned
from ``call_later()``). If the Handler is already cancelled, the
call is a no-op.
A conforming event loop object has the following methods:
- ``run()``. Runs the event loop until there is nothing left to do.
This means, in particular:
- No more calls scheduled with ``call_later()``,
``call_repeatedly()``, ``call_soon()``, or
``call_soon_threadsafe()``, except for cancelled calls.
- No more registered file descriptors. It is up to the registering
party to unregister a file descriptor when it is closed.
Note: ``run()`` blocks until the termination condition is met,
or until ``stop()`` is called.
Note: if you schedule a call with ``call_repeatedly()``, ``run()``
will not exit until you cancel it.
TBD: How many variants of this do we really need?
- ``run_forever()``. Runs the event loop until ``stop()`` is called.
- ``run_until_complete(future, timeout=None)``. Runs the event loop
until the Future is done. If a timeout is given, it waits at most
that long. If the Future is done, its result is returned, or its
exception is raised; if the timeout expires before the Future is
done, or if ``stop()`` is called, ``TimeoutError`` is raised (but
the Future is not cancelled). This cannot be called when the event
loop is already running.
Note: This API is most useful for tests and the like. It should not
be used as a substitute for ``yield from future`` or other ways to
wait for a Future (e.g. registering a done callback).
- ``run_once(timeout=None)``. Run the event loop for a little while.
If a timeout is given, the I/O poll will block at most that
long; otherwise, the I/O poll is not constrained in time.
Note: Exactly how much work this does is up to the implementation.
One constraint: if a callback immediately schedules itself using
``call_soon()``, causing an infinite loop, ``run_once()`` should
still return.
- ``stop()``. Stops the event loop as soon as it is convenient. It
is fine to restart the loop with ``run()`` (or one of its variants)
subsequently.
Note: How soon exactly is up to the implementation. All immediate
callbacks that were already scheduled to run before ``stop()`` is
called must still be run, but callbacks scheduled after it is called
(or scheduled to be run later) will not be run.
- ``close()``. Closes the event loop, releasing any resources it may
hold, such as the file descriptor used by ``epoll()`` or
``kqueue()``. This should not be called while the event loop is
running. It may be called multiple times.
- ``call_later(delay, callback, *args)``. Arrange for
``callback(*args)`` to be called approximately ``delay`` seconds in
the future, once, unless cancelled. Returns
a ``Handler`` object representing the callback, whose
``cancel()`` method can be used to cancel the callback.
- ``call_repeatedly(interval, callback, *args)``.  Like ``call_later()``
but calls the callback repeatedly, every ``interval`` seconds,
until the ``Handler`` returned is cancelled. The first call is in
``interval`` seconds.
- ``call_soon(callback, *args)``. Equivalent to ``call_later(0,
callback, *args)``.
- ``call_soon_threadsafe(callback, *args)``. Like
``call_soon(callback, *args)``, but when called from another thread
while the event loop is blocked waiting for I/O, unblocks the event
loop. This is the *only* method that is safe to call from another
thread. (To schedule a callback for a
later time in a threadsafe manner, you can use
``ev.call_soon_threadsafe(ev.call_later, when, callback, *args)``.)
This is not safe to call from a signal handler (since it may use
locks).
- ``add_signal_handler(sig, callback, *args)``.  Whenever signal ``sig``
is received, arrange for ``callback(*args)`` to be called. Returns
a ``Handler`` which can be used to cancel the signal callback.
(Cancelling the handler causes ``remove_signal_handler()`` to be
called the next time the signal arrives. Explicitly calling
``remove_signal_handler()`` is preferred.)
Specifying another callback for the same signal replaces the
previous handler (only one handler can be active per signal). The
``sig`` must be a valid signal number defined in the ``signal``
module.  If the signal cannot be handled, this raises an exception:
``ValueError`` if it is not a valid signal or if it is an uncatchable
signal (e.g. ``SIGKILL``), ``RuntimeError`` if this particular event
loop instance cannot handle signals (since signals are global per
process, only an event loop associated with the main thread can
handle signals).
- ``remove_signal_handler(sig)``. Removes the handler for signal
``sig``, if one is set. Raises the same exceptions as
``add_signal_handler()`` (except that it may return ``False``
instead of raising ``RuntimeError`` for uncatchable signals).  Returns
``True`` if a handler was removed successfully, ``False`` if no
handler was set.
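As an informal illustration of the scheduling methods above, here is a
minimal sketch; the ``hello`` callback and the specific delays are
invented for the example::

    import tulip

    ev = tulip.get_event_loop()

    def hello(name):
        print("Hello,", name)

    ev.call_soon(hello, "world")                  # next loop iteration
    handler = ev.call_later(5.0, hello, "later")  # about 5 seconds from now
    handler.cancel()                              # ...unless cancelled first
    repeater = ev.call_repeatedly(1.0, hello, "tick")  # about once a second

    # A repeated call keeps run() from returning, so arrange to cancel it.
    ev.call_later(3.5, repeater.cancel)

    ev.run()  # returns once no calls remain and no fds are registered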
Some methods in the standard conforming interface return Futures:
- ``wrap_future(future)``. This takes a PEP 3148 Future (i.e., an
instance of ``concurrent.futures.Future``) and returns a Future
compatible with the event loop (i.e., a ``tulip.Future`` instance).
- ``run_in_executor(executor, callback, *args)``. Arrange to call
``callback(*args)`` in an executor (see PEP 3148). Returns a Future
whose result on success is the return value of that call.  This is
equivalent to ``wrap_future(executor.submit(callback, *args))``. If
``executor`` is ``None``, a default ``ThreadPoolExecutor`` with 5
threads is used.
- ``set_default_executor(executor)``. Set the default executor used
by ``run_in_executor()``.
- ``getaddrinfo(host, port, family=0, type=0, proto=0, flags=0)``.
Similar to the ``socket.getaddrinfo()`` function but returns a
Future. The Future's result on success will be a list of the same
format as returned by ``socket.getaddrinfo()``. The default
implementation calls ``socket.getaddrinfo()`` using
``run_in_executor()``, but other implementations may choose to
implement their own DNS lookup. The optional arguments *must* be
specified as keyword arguments.
- ``getnameinfo(sockaddr, flags=0)``. Similar to
``socket.getnameinfo()`` but returns a Future. The Future's result
on success will be a tuple ``(host, port)``. Same implementation
remarks as for ``getaddrinfo()``.
- ``create_connection(protocol_factory, host, port, **kwargs)``.
Creates a stream connection to a given host and port. This creates
an implementation-dependent Transport to represent the connection,
then calls ``protocol_factory()`` to instantiate (or retrieve) the
user's Protocol implementation, and finally ties the two together.
(See below for the definitions of Transport and Protocol.) The
user's Protocol implementation is created or retrieved by calling
``protocol_factory()`` without arguments(*). The return value is a
Future whose result on success is the ``(transport, protocol)``
pair; if a failure prevents the creation of a successful connection,
the Future will have an appropriate exception set. Note that when
the Future completes, the protocol's ``connection_made()`` method
has not yet been called; that will happen when the connection
handshake is complete.
(*) There is no requirement that ``protocol_factory`` is a class.
If your protocol class needs to have specific arguments passed to
its constructor, you can use ``lambda`` or ``functools.partial()``.
You can also pass a trivial ``lambda`` that returns a previously
constructed Protocol instance.
Optional keyword arguments:
- ``family``, ``proto``, ``flags``: Address family,
protocol, and miscellaneous flags to be passed through
to ``getaddrinfo()``. These all default to ``0``.
(The socket type is always ``SOCK_STREAM``.)
- ``ssl``: Pass ``True`` to create an SSL transport (by default a
plain TCP transport is created).  Or pass an ``ssl.SSLContext`` object to
override the default SSL context object to be used.
- ``start_serving(protocol_factory, host, port, **kwds)``. Enters a
loop that accepts connections. Returns a Future that completes once
the loop is set up to serve; its return value is None. Each time a
connection is accepted, ``protocol_factory`` is called without
arguments(*) to create a Protocol, a Transport is created to represent
the network side of the connection, and the two are tied together by
calling ``protocol.connection_made(transport)``.
(*) See footnote above for ``create_connection()``. However, since
``protocol_factory()`` is called once for each new incoming
connection, it is recommended that it return a new Protocol object
each time it is called.
Optional keyword arguments:
- ``family``, ``proto``, ``flags``: Address family,
protocol, and miscellaneous flags to be passed through
to ``getaddrinfo()``. These all default to ``0``.
(The socket type is always ``SOCK_STREAM``.)
TBD: Support SSL? I don't even know how to do that synchronously,
and I suppose it needs a certificate.
TBD: Maybe make the Future's result an object that can be used to
control the serving loop, e.g. to stop serving, abort all active
connections, and (if supported) adjust the backlog or other
parameters? It could also have an API to inquire about active
connections. Alternatively, return a Future (subclass?) that only
completes if the loop stops serving due to an error, or if it cannot
be started? Cancelling it might stop the loop.
TBD: Some platforms may not be interested in implementing all of
these, e.g. start_serving() may be of no interest to mobile apps.
(Although, there's a Minecraft server on my iPad...)
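As an illustration of ``create_connection()``, here is a hedged sketch of
a client that sends a fixed greeting and disconnects.  The ``Greeter``
class, host and port are invented for the example; the Protocol interface
it implements is specified in the "Protocols" section below::

    import functools
    import tulip

    class Greeter:

        def __init__(self, message):
            self.message = message

        def connection_made(self, transport):
            transport.write(self.message)
            transport.close()

        def data_received(self, data):
            pass

        def eof_received(self):
            pass

        def connection_lost(self, exc):
            pass

    ev = tulip.get_event_loop()

    # protocol_factory must be callable without arguments, so bind the
    # constructor argument with functools.partial().
    factory = functools.partial(Greeter, b"hello\r\n")
    fut = ev.create_connection(factory, "example.com", 7)
    transport, protocol = ev.run_until_complete(fut)

    # connection_made() has not necessarily been called yet at this point;
    # running the loop lets the remaining callbacks (and the close) happen.
    ev.run()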
The following methods for registering callbacks for file descriptors
are optional. If they are not implemented, accessing the method
(without calling it) raises ``AttributeError``.  The default
implementation provides them but the user normally doesn't use these
directly -- they are used by the transport implementations
exclusively. Also, on Windows these may be present or not depending
on whether a select-based or IOCP-based event loop is used. These
take integer file descriptors only, not objects with a fileno()
method. The file descriptor should represent something pollable --
i.e. no disk files.
- ``add_reader(fd, callback, *args)``. Arrange for
``callback(*args)`` to be called whenever file descriptor ``fd`` is
ready for reading. Returns a ``Handler`` object which can be
used to cancel the callback. Note that, unlike ``call_later()``,
the callback may be called many times. Calling ``add_reader()``
again for the same file descriptor implicitly cancels the previous
callback for that file descriptor. Note: cancelling the handler
may be delayed until the handler would be called. If you plan to
close ``fd``, you should use ``remove_reader(fd)`` instead.
(TBD: Change this to raise an exception if a handler
is already set.)
- ``add_writer(fd, callback, *args)``. Like ``add_reader()``,
but registers the callback for writing instead of for reading.
- ``remove_reader(fd)``. Cancels the current read callback for file
descriptor ``fd``, if one is set. A no-op if no callback is
currently set for the file descriptor. (The reason for providing
this alternate interface is that it is often more convenient to
remember the file descriptor than to remember the ``Handler``
object.) Returns ``True`` if a handler was removed, ``False``
if not.
- ``remove_writer(fd)``. This is to ``add_writer()`` as
``remove_reader()`` is to ``add_reader()``.
TBD: What about multiple callbacks per fd?  The current semantics are
that ``add_reader()``/``add_writer()`` replace a previously registered
callback.  Change this to raise an exception if a callback is already
registered.
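A hedged sketch of how these registration methods can be used directly;
the pipe is only a stand-in for any pollable file descriptor::

    import os
    import tulip

    ev = tulip.get_event_loop()
    rfd, wfd = os.pipe()

    def on_readable():
        data = os.read(rfd, 4096)
        if data:
            print("read", data)
        else:
            # EOF: unregister the descriptor before closing it.
            ev.remove_reader(rfd)
            os.close(rfd)

    ev.add_reader(rfd, on_readable)
    os.write(wfd, b"ping")
    os.close(wfd)
    ev.run()  # exits once the reader has been removed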
The following methods for doing async I/O on sockets are optional.
They are an alternative to the previous set of optional methods, intended
for transport implementations on Windows using IOCP (if the event loop
supports it). The socket argument has to be a non-blocking socket.
- ``sock_recv(sock, n)``. Receive up to ``n`` bytes from socket
``sock``. Returns a Future whose result on success will be a
bytes object.
- ``sock_sendall(sock, data)``. Send bytes ``data`` to the socket
``sock``. Returns a Future whose result on success will be
``None``. (TBD: Is it better to emulate ``sendall()`` or ``send()``
semantics? I think ``sendall()`` -- but perhaps it should still
be *named* ``send()``?)
- ``sock_connect(sock, address)``. Connect to the given address.
Returns a Future whose result on success will be ``None``.
- ``sock_accept(sock)``. Accept a connection from a socket. The
socket must be in listening mode and bound to an address. Returns a
Future whose result on success will be a tuple ``(conn, peer)``
where ``conn`` is a connected non-blocking socket and ``peer`` is
the peer address. (TBD: People tell me that this style of API is
too slow for high-volume servers. So there's also
``start_serving()`` above. Then do we still need this?)
TBD: Optional methods are not so good. Perhaps these should be
required? It may still depend on the platform which set is more
efficient. Another possibility: document these as "for transports
only" and the rest as "for anyone".
Callback Sequencing
-------------------
When two callbacks are scheduled for the same time, they are run
in the order in which they are registered.  For example::

    ev.call_soon(foo)
    ev.call_soon(bar)

guarantees that ``foo()`` is called before ``bar()``.
If ``call_soon()`` is used, this guarantee is true even if the system
clock were to run backwards. This is also the case for
``call_later(0, callback, *args)``. However, if ``call_later()`` is
used with a nonzero delay, all bets are off if the system
clock were to run backwards.  (A good event loop implementation
should use ``time.monotonic()`` to avoid problems when the clock runs
backward. See PEP 418.)
Context
-------
All event loops have a notion of context. For the default event loop
implementation, the context is a thread. An event loop implementation
should run all callbacks in the same context. An event loop
implementation should run only one callback at a time, so callbacks
can assume automatic mutual exclusion with other callbacks scheduled
in the same event loop.
Exceptions
----------
There are two categories of exceptions in Python: those that derive
from the ``Exception`` class and those that derive from
``BaseException``. Exceptions deriving from ``Exception`` will
generally be caught and handled appropriately; for example, they will
be passed through by Futures, and they will be logged and ignored when
they occur in a callback.
However, exceptions deriving only from ``BaseException`` are never
caught, and will usually cause the program to terminate with a
traceback. (Examples of this category include ``KeyboardInterrupt``
and ``SystemExit``; it is usually unwise to treat these the same as
most other exceptions.)
The Handler Class
-----------------
The various methods for registering callbacks (e.g. ``call_later()``)
all return an object representing the registration that can be used to
cancel the callback. For want of a better name this object is called
a ``Handler``, although the user never needs to instantiate
this class directly.  There is one public method:
- ``cancel()``. Attempt to cancel the callback.
TBD: Exact specification.
Read-only public attributes:
- ``callback``. The callback function to be called.
- ``args``. The argument tuple with which to call the callback function.
- ``cancelled``. True if ``cancel()`` has been called.
Note that some callbacks (e.g. those registered with ``call_later()``)
are meant to be called only once. Others (e.g. those registered with
``add_reader()``) are meant to be called multiple times.
TBD: An API to call the callback (encapsulating the exception handling
necessary)? Should it record how many times it has been called?
Maybe this API should just be ``__call__()``? (But it should suppress
exceptions.)
TBD: Public attribute recording the realtime value when the callback
is scheduled? (Since this is needed anyway for storing it in a heap.)
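To illustrate typical ``Handler`` usage, here is a minimal sketch of a
timeout callback that is cancelled once it is no longer needed
(``on_timeout`` is an invented name)::

    import tulip

    ev = tulip.get_event_loop()

    def on_timeout():
        print("request timed out")

    timeout_handler = ev.call_later(30.0, on_timeout)

    # Later, when the operation completes in time:
    if not timeout_handler.cancelled:
        timeout_handler.cancel()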
Futures
-------
The ``tulip.Future`` class here is intentionally similar to the
``concurrent.futures.Future`` class specified by PEP 3148, but there
are slight differences. Whenever this PEP talks about Futures or
futures this should be understood to refer to ``tulip.Future`` unless
``concurrent.futures.Future`` is explicitly mentioned. The supported
public API is as follows, indicating the differences with PEP 3148:
- ``cancel()``. If the Future is already done (or cancelled), return
``False``. Otherwise, change the Future's state to cancelled (this
implies done), schedule the callbacks, and return ``True``.
- ``cancelled()``. Returns ``True`` if the Future was cancelled.
- ``running()``. Always returns ``False``. Difference with PEP 3148:
there is no "running" state.
- ``done()``. Returns ``True`` if the Future is done. Note that a
cancelled Future is considered done too (here and everywhere).
- ``result()``. Returns the result set with ``set_result()``, or
raises the exception set with ``set_exception()``. Raises
``CancelledError`` if cancelled. Difference with PEP 3148: This has
no timeout argument and does *not* wait; if the future is not yet
done, it raises an exception.
- ``exception()``. Returns the exception if set with
``set_exception()``, or ``None`` if a result was set with
``set_result()``. Raises ``CancelledError`` if cancelled.
Difference with PEP 3148: This has no timeout argument and does
*not* wait; if the future is not yet done, it raises an exception.
- ``add_done_callback(fn)``. Add a callback to be run when the Future
becomes done (or is cancelled). If the Future is already done (or
cancelled), schedules the callback using ``call_soon()``.
Difference with PEP 3148: The callback is never called immediately,
and always in the context of the caller. (Typically, a context is a
thread.) You can think of this as calling the callback through
``call_soon()``. Note that the callback (unlike all other callbacks
defined in this PEP, and ignoring the convention from the section
"Callback Style" below) is always called with a single argument, the
Future object, and should not be a Handler object.
- ``set_result(result)``. The Future must not be done (nor cancelled)
already. This makes the Future done and schedules the callbacks.
Difference with PEP 3148: This is a public API.
- ``set_exception(exception)``. The Future must not be done (nor
cancelled) already. This makes the Future done and schedules the
callbacks. Difference with PEP 3148: This is a public API.
The internal method ``set_running_or_notify_cancel()`` is not
supported; there is no way to set the running state.
The following exceptions are defined:
- ``InvalidStateError``. Raised whenever the Future is not in a state
acceptable to the method being called (e.g. calling ``set_result()``
on a Future that is already done, or calling ``result()`` on a Future
that is not yet done).
- ``InvalidTimeoutError``. Raised by ``result()`` and ``exception()``
when a nonzero ``timeout`` argument is given.
- ``CancelledError``. An alias for
``concurrent.futures.CancelledError``. Raised when ``result()`` or
``exception()`` is called on a Future that is cancelled.
- ``TimeoutError``. An alias for ``concurrent.futures.TimeoutError``.
May be raised by ``EventLoop.run_until_complete()``.
A Future is associated with the default event loop when it is created.
(TBD: Optionally pass in an alternative event loop instance?)
A ``tulip.Future`` object is not acceptable to the ``wait()`` and
``as_completed()`` functions in the ``concurrent.futures`` package.
However, there are similar APIs ``tulip.wait()`` and
``tulip.as_completed()``, described below.
A ``tulip.Future`` object is acceptable to a ``yield from`` expression
when used in a coroutine. This is implemented through the
``__iter__()`` interface on the Future. See the section "Coroutines
and the Scheduler" below.
When a Future is garbage-collected, if it has an associated exception
but neither ``result()`` nor ``exception()`` nor ``__iter__()`` has
ever been called (or the latter hasn't raised the exception yet --
details TBD), the exception should be logged. TBD: At what level?
In the future (pun intended) we may unify ``tulip.Future`` and
``concurrent.futures.Future``, e.g. by adding an ``__iter__()`` method
to the latter that works with ``yield from``. To prevent accidentally
blocking the event loop by calling e.g. ``result()`` on a Future
that's not done yet, the blocking operation may detect that an event
loop is active in the current thread and raise an exception instead.
However the current PEP strives to have no dependencies beyond Python
3.3, so changes to ``concurrent.futures.Future`` are off the table for
now.
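As a brief illustration of the callback-oriented Future API described
above (a minimal sketch; it assumes, as the Tulip reference implementation
allows, that ``tulip.Future()`` can be instantiated directly)::

    import tulip

    ev = tulip.get_event_loop()
    fut = tulip.Future()

    def report(fut):
        # Done callbacks receive the Future itself as their only argument.
        print("result:", fut.result())

    fut.add_done_callback(report)          # scheduled via call_soon() when done
    ev.call_later(1.0, fut.set_result, 42)
    result = ev.run_until_complete(fut)    # returns the Future's result, 42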
Transports
----------
A transport is an abstraction on top of a socket or something similar
(for example, a UNIX pipe or an SSL connection). Transports are
strongly influenced by Twisted and PEP 3153. Users rarely implement
or instantiate transports -- rather, event loops offer utility methods
to set up transports.
Transports work in conjunction with protocols. Protocols are
typically written without knowing or caring about the exact type of
transport used, and transports can be used with a wide variety of
protocols. For example, an HTTP client protocol implementation may be
used with either a plain socket transport or an SSL transport. The
plain socket transport can be used with many different protocols
besides HTTP (e.g. SMTP, IMAP, POP, FTP, IRC, SPDY).
Most connections have an asymmetric nature: the client and server
usually have very different roles and behaviors. Hence, the interface
between transport and protocol is also asymmetric. From the
protocol's point of view, *writing* data is done by calling the
``write()`` method on the transport object; this buffers the data and
returns immediately. However, the transport takes a more active role
in *reading* data: whenever some data is read from the socket (or
other data source), the transport calls the protocol's
``data_received()`` method.
Transports have the following public methods:
- ``write(data)``. Write some bytes. The argument must be a bytes
object. Returns ``None``. The transport is free to buffer the
bytes, but it must eventually cause the bytes to be transferred to
the entity at the other end, and it must maintain stream behavior.
That is, ``t.write(b'abc'); t.write(b'def')`` is equivalent to
``t.write(b'abcdef')``, as well as to::

    t.write(b'a')
    t.write(b'b')
    t.write(b'c')
    t.write(b'd')
    t.write(b'e')
    t.write(b'f')

- ``writelines(iterable)``.  Equivalent to::

      for data in iterable:
          self.write(data)

- ``write_eof()``. Close the writing end of the connection.
Subsequent calls to ``write()`` are not allowed. Once all buffered
data is transferred, the transport signals to the other end that no
more data will be received. Some protocols don't support this
operation; in that case, calling ``write_eof()`` will raise an
exception. (Note: This used to be called ``half_close()``, but
unless you already know what it is for, that name doesn't indicate
*which* end is closed.)
- ``can_write_eof()``. Return ``True`` if the protocol supports
``write_eof()``, ``False`` if it does not. (This method is needed
because some protocols need to change their behavior when
``write_eof()`` is unavailable. For example, in HTTP, to send data
whose size is not known ahead of time, the end of the data is
typically indicated using ``write_eof()``; however, SSL does not
support this, and an HTTP protocol implementation would have to use
the "chunked" transfer encoding in this case. But if the data size
is known ahead of time, the best approach in both cases is to use
the Content-Length header.)
- ``pause()``. Suspend delivery of data to the protocol until a
subsequent ``resume()`` call. Between ``pause()`` and ``resume()``,
the protocol's ``data_received()`` method will not be called. This
has no effect on ``write()``.
- ``resume()``. Restart delivery of data to the protocol via
``data_received()``.
- ``close()``. Sever the connection with the entity at the other end.
Any data buffered by ``write()`` will (eventually) be transferred
before the connection is actually closed. The protocol's
``data_received()`` method will not be called again. Once all
buffered data has been flushed, the protocol's ``connection_lost()``
method will be called with ``None`` as the argument. Note that
this method does not wait for all that to happen.
- ``abort()``. Immediately sever the connection. Any data still
buffered by the transport is thrown away. Soon, the protocol's
``connection_lost()`` method will be called with ``None`` as
argument. (TBD: Distinguish in the ``connection_lost()`` argument
between ``close()``, ``abort()`` or a close initiated by the other
end? Or add a transport method to inquire about this? Glyph's
proposal was to pass different exceptions for this purpose.)
TBD: Provide flow control the other way -- the transport may need to
suspend the protocol if the amount of data buffered becomes a burden.
Proposal: let the transport call ``protocol.pause()`` and
``protocol.resume()`` if they exist; if they don't exist, the
protocol doesn't support flow control. (Perhaps different names
to avoid confusion between protocols and transports?)
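As an illustration of the ``pause()``/``resume()`` methods specified
above, a protocol might pause the transport while it digests data it has
already received.  This is only a sketch; the threshold and the simulated
draining are invented for the example::

    import tulip

    class ThrottledProtocol:

        def connection_made(self, transport):
            self.transport = transport
            self.buffer = b''

        def data_received(self, data):
            self.buffer += data
            if len(self.buffer) > 64 * 1024:
                # No more data_received() calls until resume().
                self.transport.pause()
                tulip.get_event_loop().call_later(1.0, self.drain)

        def drain(self):
            self.buffer = b''            # pretend the data was processed
            self.transport.resume()      # deliveries start again

        def eof_received(self):
            pass

        def connection_lost(self, exc):
            pass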
Protocols
---------
Protocols are always used in conjunction with transports. While a few
common protocols are provided (e.g. decent though not necessarily
excellent HTTP client and server implementations), most protocols will
be implemented by user code or third-party libraries.
A protocol must implement the following methods, which will be called
by the transport.  Consider these to be callbacks that are always called by
the event loop in the right context. (See the "Context" section
above.)
- ``connection_made(transport)``. Indicates that the transport is
ready and connected to the entity at the other end. The protocol
should probably save the transport reference as an instance variable
(so it can call its ``write()`` and other methods later), and may
write an initial greeting or request at this point.
- ``data_received(data)``. The transport has read some bytes from the
connection. The argument is always a non-empty bytes object. There
are no guarantees about the minimum or maximum size of the data
passed along this way.  ``p.data_received(b'abcdef')`` should be
treated as exactly equivalent to::

    p.data_received(b'abc')
    p.data_received(b'def')

- ``eof_received()``. This is called when the other end called
``write_eof()`` (or something equivalent). The default
implementation calls ``close()`` on the transport, which causes
``connection_lost()`` to be called (eventually) on the protocol.
- ``connection_lost(exc)``. The transport has been closed or aborted,
has detected that the other end has closed the connection cleanly,
or has encountered an unexpected error. In the first three cases
the argument is ``None``; for an unexpected error, the argument is
the exception that caused the transport to give up. (TBD: Do we
need to distinguish between the first three cases?)
Here is a chart indicating the order and multiplicity of calls:
1. ``connection_made()`` -- exactly once
2. ``data_received()`` -- zero or more times
3. ``eof_received()`` -- at most once
4. ``connection_lost()`` -- exactly once
TBD: Discuss whether user code needs to do anything to make sure that
protocol and transport aren't garbage-collected prematurely.
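To make the callback sequence concrete, here is a hedged sketch of a
small echo protocol, served with ``start_serving()`` as described in the
event loop section above (the address and port are arbitrary)::

    import tulip

    class EchoProtocol:

        def connection_made(self, transport):
            # Keep the transport so data can be written back later.
            self.transport = transport

        def data_received(self, data):
            # Echo each chunk back; chunk boundaries are arbitrary.
            self.transport.write(data)

        def eof_received(self):
            # The peer sent EOF; close our side as well.
            self.transport.close()

        def connection_lost(self, exc):
            if exc is not None:
                print("connection lost due to error:", exc)

    ev = tulip.get_event_loop()
    ev.run_until_complete(ev.start_serving(EchoProtocol, "127.0.0.1", 8888))
    ev.run_forever()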
Callback Style
--------------
Most interfaces taking a callback also take positional arguments. For
instance, to arrange for ``foo("abc", 42)`` to be called soon, you
call ``ev.call_soon(foo, "abc", 42)``. To schedule the call
``foo()``, use ``ev.call_soon(foo)``. This convention greatly reduces
the number of small lambdas required in typical callback programming.
This convention specifically does *not* support keyword arguments.
Keyword arguments are used to pass optional extra information about
the callback. This allows graceful evolution of the API without
having to worry about whether a keyword might be significant to a
callee somewhere. If you have a callback that *must* be called with a
keyword argument, you can use a lambda or ``functools.partial``. For
example::

    ev.call_soon(functools.partial(foo, "abc", repeat=42))

Choosing an Event Loop Implementation
-------------------------------------
TBD. (This is about the choice to use e.g. select vs. poll vs. epoll,
and how to override the choice. Probably belongs in the event loop
policy.)
Coroutines and the Scheduler
============================
This is a separate toplevel section because its status is different
from the event loop interface. Usage of coroutines is optional, and
it is perfectly fine to write code using callbacks only. On the other
hand, there is only one implementation of the scheduler/coroutine API,
and if you're using coroutines, that's the one you're using.
Coroutines
----------
A coroutine is a generator that follows certain conventions. For
documentation purposes, all coroutines should be decorated with
``@tulip.coroutine``, but this cannot be strictly enforced.
Coroutines use the ``yield from`` syntax introduced in PEP 380,
instead of the original ``yield`` syntax.
The word "coroutine", like the word "generator", is used for two
different (though related) concepts:
- The function that defines a coroutine (a function definition
decorated with ``tulip.coroutine``). If disambiguation is needed,
we call this a *coroutine function*.
- The object obtained by calling a coroutine function. This object
represents a computation or an I/O operation (usually a combination)
that will complete eventually. For disambiguation we call it a
*coroutine object*.
Things a coroutine can do:
- ``result = yield from future`` -- suspends the coroutine until the
future is done, then returns the future's result, or raises its
exception, which will be propagated.
- ``result = yield from coroutine`` -- wait for another coroutine to
produce a result (or raise an exception, which will be propagated).
The ``coroutine`` expression must be a *call* to another coroutine.
- ``return result`` -- produce a result to the coroutine that is
waiting for this one using ``yield from``.
- ``raise exception`` -- raise an exception in the coroutine that is
waiting for this one using ``yield from``.
Calling a coroutine does not start its code running -- it is just a
generator, and the coroutine object returned by the call is really a
generator object, which doesn't do anything until you iterate over it.
In the case of a coroutine object, there are two basic ways to start
it running: call ``yield from coroutine`` from another coroutine
(assuming the other coroutine is already running!), or convert it to a
Task (see below).
Coroutines can only run when the event loop is running.
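A minimal sketch of these conventions (it assumes ``tulip.Future()`` can
be instantiated directly; see the Tasks section below for how to actually
start ``main()``)::

    import tulip

    @tulip.coroutine
    def add_later(a, b):
        # Create a Future and arrange for its result to be set later;
        # ``yield from`` suspends this coroutine until the Future is done.
        fut = tulip.Future()
        tulip.get_event_loop().call_later(1.0, fut.set_result, a + b)
        result = yield from fut
        return result    # becomes the value of ``yield from add_later(...)``

    @tulip.coroutine
    def main():
        # Waiting for another coroutine: the operand must be a *call* to it.
        total = yield from add_later(2, 3)
        print("total:", total)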
Waiting for Multiple Coroutines
-------------------------------
To wait for multiple coroutines or Futures, two APIs similar to the
``wait()`` and ``as_completed()`` APIs in the ``concurrent.futures``
package are provided:
- ``tulip.wait(fs, timeout=None, return_when=ALL_COMPLETED)``. This
is a coroutine that waits for the Futures or coroutines given by
``fs`` to complete. Coroutine arguments will be wrapped in Tasks
(see below). This returns a Future whose result on success is a
tuple of two sets of Futures, ``(done, pending)``, where ``done`` is
the set of original Futures (or wrapped coroutines) that are done
(or cancelled), and ``pending`` is the rest, i.e. those that are
still not done (nor cancelled). Optional arguments ``timeout`` and
``return_when`` have the same meaning and defaults as for
``concurrent.futures.wait()``: ``timeout``, if not ``None``,
specifies a timeout for the overall operation; ``return_when``
specifies when to stop. The constants ``FIRST_COMPLETED``,
``FIRST_EXCEPTION``, ``ALL_COMPLETED`` are defined with the same
values and the same meanings as in PEP 3148:
- ``ALL_COMPLETED`` (default): Wait until all Futures are done or
cancelled (or until the timeout occurs).
- ``FIRST_COMPLETED``: Wait until at least one Future is done or
cancelled (or until the timeout occurs).
- ``FIRST_EXCEPTION``: Wait until at least one Future is done (not
cancelled) with an exception set. (The exclusion of cancelled
Futures from the filter is surprising, but PEP 3148 does it this
way.)
- ``tulip.as_completed(fs, timeout=None)``. Returns an iterator whose
values are Futures; waiting for successive values waits until the
next Future or coroutine from the set ``fs`` completes, and returns
its result (or raises its exception). The optional argument
``timeout`` has the same meaning and default as it does for
``concurrent.futures.wait()``: when the timeout occurs, the next
Future returned by the iterator will raise ``TimeoutError`` when
waited for.  Example of use::

    for f in as_completed(fs):
        result = yield from f  # May raise an exception.
        # Use result.

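Similarly, ``tulip.wait()`` can be used to give a group of coroutines an
overall deadline and see which of them finished in time (a minimal
sketch)::

    import tulip

    @tulip.coroutine
    def wait_for_all(coroutines):
        # Each coroutine argument is wrapped in a Task; after 10 seconds
        # we get back whatever is done and whatever is still pending.
        done, pending = yield from tulip.wait(coroutines, timeout=10.0)
        for fut in done:
            if fut.exception() is not None:
                print("failed:", fut.exception())
            else:
                print("finished:", fut.result())
        print(len(pending), "operations still pending")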
Tasks
-----
A Task is an object that manages an independently running coroutine.
The Task interface is the same as the Future interface. The task
becomes done when its coroutine returns or raises an exception; if it
returns a result, that becomes the task's result; if it raises an
exception, that becomes the task's exception.
Cancelling a task that's not done yet prevents its coroutine from
completing; in this case an exception is thrown into the coroutine
that it may catch to further handle cancellation, but it doesn't have
to (this is done using the standard ``close()`` method on generators,
described in PEP 342).
Tasks are also useful for interoperating between coroutines and
callback-based frameworks like Twisted. After converting a coroutine
into a Task, callbacks can be added to the Task.
You may ask, why not convert all coroutines to Tasks? The
``@tulip.coroutine`` decorator could do this. This would slow things
down considerably in the case where one coroutine calls another (and
so on), as switching to a "bare" coroutine has much less overhead than
switching to a Task.
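A hedged sketch of the interoperation described above (converting a
coroutine into a Task and adding a callback), assuming, as the Tulip
reference implementation allows, that a Task can be constructed directly
from a coroutine object::

    import tulip

    @tulip.coroutine
    def compute():
        fut = tulip.Future()
        tulip.get_event_loop().call_soon(fut.set_result, 42)
        return (yield from fut)

    # Wrapping the coroutine object in a Task starts it running; callback
    # style code can then treat the Task like any other Future.
    task = tulip.Task(compute())
    task.add_done_callback(lambda t: print("computed:", t.result()))
    tulip.get_event_loop().run_until_complete(task)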
The Scheduler
-------------
The scheduler has no public interface. You interact with it by using
``yield from future`` and ``yield from task``. In fact, there is no
single object representing the scheduler -- its behavior is
implemented by the ``Task`` and ``Future`` classes using only the
public interface of the event loop, so it will work with third-party
event loop implementations, too.
Sleeping
--------
TBD: ``yield sleep(seconds)``.  ``sleep(0)`` can be used to suspend
just long enough to poll for I/O.
Coroutines and Protocols
------------------------
The best way to use coroutines to implement protocols is probably to
use a streaming buffer that gets filled by ``data_received()`` and can
be read asynchronously using methods like ``read(n)`` and
``readline()`` that return a Future. When the connection is closed,
``read()`` should return a Future whose result is ``b''``, or raise an
exception if ``connection_lost()`` is called with an exception.
To write, the ``write()`` method (and friends) on the transport can be
used -- these do not return Futures. A standard protocol
implementation should be provided that sets this up and kicks off the
coroutine when ``connection_made()`` is called.
TBD: Be more specific.
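One possible shape for such a buffer, purely as a sketch of the idea (the
class and method names here are invented and not part of this proposal)::

    import tulip

    class StreamBuffer:

        def __init__(self):
            self.data = b''
            self.eof = False
            self.waiter = None          # (Future, n) of a pending read()

        def feed_data(self, data):      # call this from data_received()
            self.data += data
            self._wake_up()

        def feed_eof(self):             # call this from connection_lost()
            self.eof = True
            self._wake_up()

        def read(self, n):
            fut = tulip.Future()
            self.waiter = (fut, n)
            self._wake_up()
            return fut                  # a coroutine does ``yield from`` this

        def _wake_up(self):
            if self.waiter is None or not (self.data or self.eof):
                return
            fut, n = self.waiter
            self.waiter = None
            chunk, self.data = self.data[:n], self.data[n:]
            fut.set_result(chunk)       # b'' signals end of stream

A protocol's ``data_received()`` would call ``feed_data()``, and a
coroutine would read with ``data = yield from buffer.read(4096)``.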
Cancellation
------------
TBD. When a Task is cancelled its coroutine may see an exception at
any point where it is yielding to the scheduler (i.e., potentially at
any ``yield from`` operation). We need to spell out which exception
is raised.
Also TBD: timeouts.
Open Issues
===========
- A debugging API? E.g. something that logs a lot of stuff, or logs
unusual conditions (like queues filling up faster than they drain)
or even callbacks taking too much time...
- Do we need introspection APIs? E.g. asking for the read callback
given a file descriptor. Or when the next scheduled call is. Or
the list of file descriptors registered with callbacks.
- Transports may need a method that tries to return the address of the
socket (and another for the peer address). Although these depend on
the socket type and there may not always be a socket; then it should
return None. (Alternatively, there could be a method to return the
socket itself -- but it is conceivable that a transport implements
IP connections without using sockets, and what should it do then?)
- Need to handle os.fork(). (This may be up to the selector classes
in Tulip's case.)
- Perhaps start_serving() needs a way to pass in an existing socket
(e.g. gunicorn would need this). Ditto for create_connection().
- We might introduce explicit locks, though these will be a bit of a
pain to use, as we can't use the ``with lock: block`` syntax
(because to wait for a lock we'd have to use ``yield from``, which
the ``with`` statement can't do).
- Support for datagram protocols, "connected" or otherwise. Probably
need more socket I/O methods, e.g. ``sock_sendto()`` and
``sock_recvfrom()``. Or users can write their own (it's not rocket
science). Is it reasonable to map ``write()``, ``writelines()``,
``data_received()`` to single datagrams? Or should we have a
different ``datagram_received()`` method on datagram protocols?
(Glyph recommends the latter.) And then what instead of ``write()``?
Finally, do we need support for unconnected datagram protocols?
(That would mean wrappers for ``sendto()`` and ``recvfrom()``.)
- We may need APIs to control various timeouts. E.g. we may want to
limit the time spent in DNS resolution, connecting, ssl handshake,
idle connection, close/shutdown, even per session. Possibly it's
sufficient to add ``timeout`` keyword parameters to some methods,
and other timeouts can probably be implemented by clever use of
``call_later()`` and ``Task.cancel()``. But it's possible that some
operations need default timeouts, and we may want to change the
default for a specific operation globally (i.e., per event loop).
- An EventEmitter in the style of NodeJS? Or make this a separate
PEP? It's easy enough to do in user space, though it may benefit
from standardization. (See
https://github.com/mnot/thor/blob/master/thor/events.py and
https://github.com/mnot/thor/blob/master/doc/events.md for examples.)
References
==========
- PEP 380 describes the semantics of ``yield from``. TBD: Greg
Ewing's tutorial.
- PEP 3148 describes ``concurrent.futures.Future``.
- PEP 3153, while rejected, has a good write-up explaining the need
to separate transports and protocols.
- Tulip repo: http://code.google.com/p/tulip/
- Nick Coghlan wrote a nice blog post with some background, thoughts
about different approaches to async I/O, gevent, and how to use
futures with constructs like ``while``, ``for`` and ``with``:
http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html
- TBD: references to the relevant parts of Twisted, Tornado, ZeroMQ,
pyftpdlib, libevent, libev, pyev, libuv, wattle, and so on.
Acknowledgments
===============
Apart from PEP 3153, influences include PEP 380 and Greg Ewing's
tutorial for ``yield from``, Twisted, Tornado, ZeroMQ, pyftpdlib, tulip
(the author's attempts at synthesis of all these), wattle (Steve
Dower's counter-proposal), numerous discussions on python-ideas from
September through December 2012, a Skype session with Steve Dower and
Dino Viehland, email exchanges with Ben Darnell, an audience with
Niels Provos (original author of libevent), and two in-person meetings
with several Twisted developers, including Glyph, Brian Warner, David
Reid, and Duncan McGreggor. Also, the author's previous work on async
support in the NDB library for Google App Engine was an important
influence.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: