Some more clarifications and edits. Describe datagram protocol.

This commit is contained in:
Guido van Rossum 2013-04-29 21:24:46 -07:00
parent f07821f801
commit 40dc92be19
1 changed files with 124 additions and 86 deletions


@ -130,10 +130,6 @@ is different in each case.
Details of the interfaces defined by the various standard types of
transports and protocols are given later.
Specification
=============
Dependencies
------------
@ -143,6 +139,10 @@ library features beyond Python 3.3, no third-party modules or
packages, and no C code, except for the proactor-based event loop on
Windows.
Event Loop Interface Specification
==================================
Module Namespace
----------------
@ -217,14 +217,6 @@ framework). The default event loop policy is an instance of the class
be retrieved by calling ``get_event_loop_policy()``. (TBD: Require
inheriting from ``AbstractEventLoopPolicy``?)
Notes for the Event Loop Interface
----------------------------------
A note about times: as usual in Python, all timeouts, intervals and
delays are measured in seconds, and may be ints or floats. The
accuracy and precision of the clock are up to the implementation; the
default implementation uses ``time.monotonic()``.
Event Loop Classes
------------------
@ -298,6 +290,16 @@ well, using the ``subprocess`` module in the standard library.)
- Signal callbacks: ``add_signal_handler()``,
``remove_signal_handler()``.
Specifying Times
----------------
As usual in Python, all timeouts, intervals and delays are measured in
seconds, and may be ints or floats. The accuracy and precision of the
clock are up to the implementation; the default implementation uses
``time.monotonic()``. Books could be written about the implications
of this choice. Better read the docs for the standard library
``time`` module.
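As a rough illustration of working with such a clock, the sketch below turns a relative delay in seconds into an absolute deadline on ``time.monotonic()``. The helper names ``deadline_after`` and ``remaining`` are hypothetical and not part of the specification.

```python
import time

def deadline_after(delay):
    """Return an absolute deadline `delay` seconds from now (int or float)."""
    return time.monotonic() + delay

def remaining(deadline):
    """Return the seconds left until `deadline`, never a negative number."""
    return max(0.0, deadline - time.monotonic())

deadline = deadline_after(0.5)   # relative delay -> absolute monotonic time
left = remaining(deadline)       # slightly less than 0.5 by now
```

Because the monotonic clock never jumps backwards, a deadline computed this way is not disturbed by adjustments to the system wall clock.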
Required Event Loop Methods
---------------------------
@ -1033,7 +1035,9 @@ Datagram transports have these methods:
The optional second argument is the destination address. If
omitted, ``remote_addr`` must have been specified in the
``create_datagram_endpoint()`` call that created this transport. If
present, and ``remote_addr`` was specified, they must match.
present, and ``remote_addr`` was specified, they must match. The
(data, addr) pair may be sent immediately or buffered. The return
value is None.
- ``abort()``. Immediately close the transport. Buffered data will
be discarded.
@ -1056,20 +1060,26 @@ implement this in the protocol if it needs to make the distinction.)
Protocols
---------
XXX This is about where I left off.
TBD Describe different kinds of protocols (bidirectional stream,
unidirectional stream, datagram).
Protocols are always used in conjunction with transports. While a few
common protocols are provided (e.g. decent though not necessarily
excellent HTTP client and server implementations), most protocols will
be implemented by user code or third-party libraries.
A protocol must implement the following methods, which will be called
by the transport. Consider these callbacks that are always called by
the event loop in the right context. (See the "Context" section
above.)
As with transports, we distinguish between stream protocols, datagram
protocols, and perhaps other custom protocols. The most common type
of protocol is a bidirectional stream protocol. (There are no
unidirectional protocols.)
(TBD: should protocol callbacks be allowed to be coroutines?)
Stream Protocols
''''''''''''''''
A (bidirectional) stream protocol must implement the following
methods, which will be called by the transport. Think of these as
callbacks that are always called by the event loop in the right
context. (See the "Context" section way above.)
- ``connection_made(transport)``. Indicates that the transport is
ready and connected to the entity at the other end. The protocol
@ -1108,6 +1118,36 @@ Here is a chart indicating the order and multiplicity of calls:
TBD: Discuss whether user code needs to do anything to make sure that
protocol and transport aren't garbage-collected prematurely.
Datagram Protocols
''''''''''''''''''
Datagram protocols have ``connection_made()`` and
``connection_lost()`` methods with the same signatures as stream
protocols. (As explained in the section about datagram transports, we
prefer the slightly odd nomenclature over defining different method
names to indicate the opening and closing of the socket.)
In addition, they have the following methods:
- ``datagram_received(data, addr)``. Indicates that a datagram
``data`` (a bytes object) was received from remote address ``addr``
(an IPv4 2-tuple or an IPv6 4-tuple).
- ``connection_refused(exc)``. Indicates that a send or receive
operation raised a ``ConnectionRefusedError`` exception. This typically
indicates that a negative acknowledgment was received for a
previously sent datagram (not for the datagram that was being sent,
if the exception was raised by a send operation). Immediately after
this the socket will be closed and ``connection_lost()`` will be
called with the same exception argument.
Here is a chart indicating the order and multiplicity of calls:
1. ``connection_made()`` -- exactly once
2. ``datagram_received()`` -- zero or more times
3. ``connection_refused()`` -- at most once
4. ``connection_lost()`` -- exactly once
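The call order in the chart can be sketched with a protocol that simply records each callback. The class name ``RecordingDatagramProtocol`` is hypothetical, the callbacks are driven by hand rather than by a real transport, and the built-in ``ConnectionRefusedError`` stands in for the refusal exception.

```python
class RecordingDatagramProtocol:
    """Hypothetical datagram protocol that records every callback it gets."""

    def __init__(self):
        self.transport = None
        self.events = []

    def connection_made(self, transport):
        # Called exactly once, when the socket has been opened.
        self.transport = transport
        self.events.append("made")

    def datagram_received(self, data, addr):
        # Called zero or more times, once per incoming datagram.
        self.events.append(("datagram", data, addr))

    def connection_refused(self, exc):
        # Called at most once, just before the socket is closed.
        self.events.append(("refused", type(exc).__name__))

    def connection_lost(self, exc):
        # Called exactly once; exc is None for a clean close.
        name = type(exc).__name__ if exc is not None else None
        self.events.append(("lost", name))


# Drive the callbacks by hand, in the documented order.
proto = RecordingDatagramProtocol()
proto.connection_made(transport=None)
proto.datagram_received(b"ping", ("127.0.0.1", 9999))
err = ConnectionRefusedError("port unreachable")
proto.connection_refused(err)
proto.connection_lost(err)
```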
Callback Style
--------------
@ -1128,13 +1168,6 @@ example::
loop.call_soon(functools.partial(foo, "abc", repeat=42))
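A minimal sketch of why the ``functools.partial`` wrapper helps, assuming a scheduler that forwards only positional arguments. The ``call_soon`` function here is a hypothetical stand-in that invokes the callback immediately; it is not the event loop method.

```python
import functools

calls = []

def foo(text, repeat=1):
    """Example callback that takes one keyword argument."""
    calls.append(text * repeat)

def call_soon(callback, *args):
    # Hypothetical stand-in: forwards positional arguments only and
    # runs the callback at once instead of scheduling it.
    callback(*args)

call_soon(foo, "abc")                               # positional only
call_soon(functools.partial(foo, "abc", repeat=3))  # keyword bound up front
```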
Choosing an Event Loop Implementation
-------------------------------------
TBD. (This is about the choice to use e.g. select vs. poll vs. epoll,
and how to override the choice. Probably belongs in the event loop
policy.)
Coroutines and the Scheduler
============================
@ -1159,25 +1192,27 @@ The word "coroutine", like the word "generator", is used for two
different (though related) concepts:
- The function that defines a coroutine (a function definition
decorated with ``tulip.coroutine``). If disambiguation is needed,
we call this a *coroutine function*.
decorated with ``tulip.coroutine``). If disambiguation is needed
we will call this a *coroutine function*.
- The object obtained by calling a coroutine function. This object
represents a computation or an I/O operation (usually a combination)
that will complete eventually. For disambiguation we call it a
*coroutine object*.
that will complete eventually. If disambiguation is needed we will
call it a *coroutine object*.
Things a coroutine can do:
- ``result = yield from future`` -- suspends the coroutine until the
future is done, then returns the future's result, or raises its
exception, which will be propagated.
future is done, then returns the future's result, or raises an
exception, which will be propagated. (If the future is cancelled,
it will raise a ``CancelledError`` exception.) Note that tasks are
futures, and everything said about futures also applies to tasks.
- ``result = yield from coroutine`` -- wait for another coroutine to
produce a result (or raise an exception, which will be propagated).
The ``coroutine`` expression must be a *call* to another coroutine.
- ``return result`` -- produce a result to the coroutine that is
- ``return expression`` -- produce a result to the coroutine that is
waiting for this one using ``yield from``.
- ``raise exception`` -- raise an exception in the coroutine that is
@ -1191,7 +1226,7 @@ it running: call ``yield from coroutine`` from another coroutine
(assuming the other coroutine is already running!), or convert it to a
Task (see below).
Coroutines can only run when the event loop is running.
Coroutines (and tasks) can only run when the event loop is running.
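The distinction between a coroutine function and a coroutine object, and the way ``yield from`` delivers a ``return`` value, can be sketched with plain generators. This sketch assumes no Tulip at all: the ``run()`` helper is a toy driver standing in for the event loop, and a bare ``yield`` stands in for waiting on a future.

```python
def fetch_length(text):
    # A "coroutine function"; calling it only creates a "coroutine object".
    yield                 # stands in for suspending on a future
    return len(text)      # becomes the value of the caller's `yield from`

def total_length(words):
    total = 0
    for word in words:
        # Wait for another coroutine to produce its result.
        total += yield from fetch_length(word)
    return total

def run(coro):
    """Toy driver: resume the coroutine until it returns."""
    while True:
        try:
            next(coro)
        except StopIteration as stop:
            return stop.value

coro = total_length(["ab", "cde"])  # nothing has run yet
result = run(coro)                  # the driver makes it progress
```

Without a driver the coroutine object never advances, which mirrors the statement that coroutines only run while the event loop is running.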
Waiting for Multiple Coroutines
-------------------------------
@ -1207,13 +1242,14 @@ package are provided:
tuple of two sets of Futures, ``(done, pending)``, where ``done`` is
the set of original Futures (or wrapped coroutines) that are done
(or cancelled), and ``pending`` is the rest, i.e. those that are
still not done (nor cancelled). Optional arguments ``timeout`` and
``return_when`` have the same meaning and defaults as for
``concurrent.futures.wait()``: ``timeout``, if not ``None``,
specifies a timeout for the overall operation; ``return_when``
specifies when to stop. The constants ``FIRST_COMPLETED``,
``FIRST_EXCEPTION``, ``ALL_COMPLETED`` are defined with the same
values and the same meanings as in PEP 3148:
still not done (nor cancelled). Note that with the defaults for
``timeout`` and ``return_when``, ``pending`` will always be an empty
set. Optional arguments ``timeout`` and ``return_when`` have the
same meaning and defaults as for ``concurrent.futures.wait()``:
``timeout``, if not ``None``, specifies a timeout for the overall
operation; ``return_when`` specifies when to stop. The constants
``FIRST_COMPLETED``, ``FIRST_EXCEPTION``, ``ALL_COMPLETED`` are
defined with the same values and the same meanings as in PEP 3148:
- ``ALL_COMPLETED`` (default): Wait until all Futures are done or
completed (or until the timeout occurs).
@ -1239,30 +1275,49 @@ package are provided:
result = yield from f # May raise an exception.
# Use result.
Note: if you do not wait for the futures as they are produced by the
iterator, your ``for`` loop may not make progress (since you are not
allowing other tasks to run).
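The behaviour of ``wait()`` and ``as_completed()`` can be tried out with the sketch below. It assumes the modern ``asyncio`` module (the standard-library successor of Tulip) and uses ``async``/``await`` in place of ``@tulip.coroutine`` and ``yield from``; the ``work()`` coroutine and its delays are made up for the example.

```python
import asyncio

async def work(delay, value):
    await asyncio.sleep(delay)
    return value

async def main():
    # wait(): with the default timeout and return_when, it returns only
    # when every future is done.
    tasks = [asyncio.ensure_future(work(0.01, "fast")),
             asyncio.ensure_future(work(0.02, "slow"))]
    done, pending = await asyncio.wait(tasks)

    # as_completed(): results arrive in completion order, and each item
    # must be waited for inside the loop body.
    ordered = []
    later = [asyncio.ensure_future(work(0.05, "b")),
             asyncio.ensure_future(work(0.01, "a"))]
    for f in asyncio.as_completed(later):
        ordered.append(await f)
    return done, pending, ordered

done, pending, ordered = asyncio.run(main())
```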
Sleeping
--------
The coroutine ``sleep(delay)`` returns after a given time delay.
(TBD: Should the optional second argument, ``result``, be part of the
spec?)
Tasks
-----
A Task is an object that manages an independently running coroutine.
The Task interface is the same as the Future interface. The task
becomes done when its coroutine returns or raises an exception; if it
returns a result, that becomes the task's result, if it raises an
exception, that becomes the task's exception.
The Task interface is the same as the Future interface, and in fact
``Task`` is a subclass of ``Future``. The task becomes done when its
coroutine returns or raises an exception; if it returns a result, that
becomes the task's result; if it raises an exception, that becomes the
task's exception.
Cancelling a task that's not done yet prevents its coroutine from
completing; in this case an exception is thrown into the coroutine
that it may catch to further handle cancellation, but it doesn't have
to (this is done using the standard ``close()`` method on generators,
described in PEP 342).
completing. In this case a ``CancelledError`` exception is thrown
into the coroutine that it may catch to further handle cancellation.
If the exception is not caught, the generator will be properly
finalized anyway, as described in PEP 342.
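A sketch of cancellation as seen from inside the coroutine, again assuming the modern ``asyncio`` module and ``async``/``await`` syntax rather than Tulip itself. The ``worker()`` coroutine and the ``log`` list are hypothetical.

```python
import asyncio

log = []

async def worker():
    try:
        await asyncio.sleep(10)          # suspended here when cancelled
    except asyncio.CancelledError:
        log.append("cleanup")            # optional handling of cancellation
        raise                            # let the cancellation proceed

async def main():
    task = asyncio.ensure_future(worker())
    await asyncio.sleep(0)               # let the worker start running
    task.cancel()                        # CancelledError is thrown into it
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return task.cancelled()

was_cancelled = asyncio.run(main())
```

Catching the exception is optional; a coroutine that does not catch it is still unwound normally.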
Tasks are also useful for interoperating between coroutines and
callback-based frameworks like Twisted. After converting a coroutine
into a Task, callbacks can be added to the Task.
You may ask, why not convert all coroutines to Tasks? The
``@tulip.coroutine`` decorator could do this. This would slow things
down considerably in the case where one coroutine calls another (and
so on), as switching to a "bare" coroutine has much less overhead than
switching to a Task.
There are two ways to convert a coroutine into a task: explicitly, by
calling the coroutine function and then passing the resulting
coroutine object to the ``tulip.Task()`` constructor; or implicitly,
by decorating the coroutine with ``@tulip.task`` (instead of
``@tulip.coroutine``).
You may ask, why not automatically convert all coroutines to Tasks?
The ``@tulip.coroutine`` decorator could do this. However, this would
slow things down considerably in the case where one coroutine calls
another (and so on), as switching to a "bare" coroutine has much less
overhead than switching to a Task.
The Scheduler
-------------
@ -1274,38 +1329,21 @@ implemented by the ``Task`` and ``Future`` classes using only the
public interface of the event loop, so it will work with third-party
event loop implementations, too.
Sleeping
--------
TBD: ``yield sleep(seconds)``. Can use ``sleep(0)`` to suspend to
poll for I/O.
Coroutines and Protocols
------------------------
The best way to use coroutines to implement protocols is probably to
use a streaming buffer that gets filled by ``data_received()`` and can
be read asynchronously using methods like ``read(n)`` and
``readline()`` that return a Future. When the connection is closed,
``read()`` should return a Future whose result is ``b''``, or raise an
exception if ``connection_lost()`` is called with an exception.
The best way to use coroutines to implement an Internet protocol such
as FTP is probably to use a streaming buffer that gets filled by
``data_received()`` and can be read asynchronously using methods like
``read(n)`` and ``readline()`` that are coroutines or return a Future.
When the connection is closed, ``read()`` should eventually produce
``b''``, or raise an exception if ``connection_lost()`` is called
with an exception.
To write, the ``write()`` method (and friends) on the transport can be
used -- these do not return Futures. A standard protocol
implementation should be provided that sets this up and kicks off the
coroutine when ``connection_made()`` is called.
TBD: Be more specific.
Cancellation
------------
TBD. When a Task is cancelled its coroutine may see an exception at
any point where it is yielding to the scheduler (i.e., potentially at
any ``yield from`` operation). We need to spell out which exception
is raised.
Also TBD: timeouts.
To write a response, the ``write()`` method (and friends) on the
transport can be used -- these do not return Futures. A standard
protocol implementation should be provided that sets this up and kicks
off the coroutine when ``connection_made()`` is called.
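One possible shape for such a streaming buffer is sketched here, assuming the modern ``asyncio`` module and ``async``/``await`` syntax. The ``LineBuffer`` class is hypothetical, it implements only ``readline()``, it names the connection-closing callback ``connection_lost()`` as in the protocol sections, and it ignores the exception argument for brevity.

```python
import asyncio

class LineBuffer:
    """Hypothetical buffer: filled by callbacks, drained by readline()."""

    def __init__(self):
        self._data = bytearray()
        self._eof = False
        self._waiter = None   # Future the reader is blocked on, if any

    def _wake(self):
        # Resume a reader that is waiting for more data or for EOF.
        if self._waiter is not None and not self._waiter.done():
            self._waiter.set_result(None)
        self._waiter = None

    def data_received(self, data):
        self._data.extend(data)
        self._wake()

    def connection_lost(self, exc):
        self._eof = True
        self._wake()

    async def readline(self):
        while True:
            end = self._data.find(b"\n")
            if end >= 0:
                line = bytes(self._data[:end + 1])
                del self._data[:end + 1]
                return line
            if self._eof:
                rest = bytes(self._data)   # b'' once everything is consumed
                self._data.clear()
                return rest
            self._waiter = asyncio.get_running_loop().create_future()
            await self._waiter

async def main():
    buf = LineBuffer()
    loop = asyncio.get_running_loop()
    # Simulate a transport delivering data in arbitrary chunks.
    loop.call_soon(buf.data_received, b"USER anon")
    loop.call_soon(buf.data_received, b"ymous\r\nQUIT\r\n")
    loop.call_soon(buf.connection_lost, None)
    return [await buf.readline() for _ in range(3)]

lines = asyncio.run(main())
```

The reader sees whole lines regardless of how the transport split the data, and gets ``b''`` after the connection is closed.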
Open Issues
@ -1336,7 +1374,7 @@ Open Issues
these would all require using Tulip internals.
- Locks and queues? The Tulip implementation contains implementations
of most types of locks and queues modeled after the stdlib
of most types of locks and queues modeled after the standard library
``threading`` and ``queue`` modules. Should we incorporate these in
the PEP?