Rewrap according to PEP standards; indent documentation for API elements for better clarity.

Georg Brandl 2010-05-01 09:58:11 +00:00
parent ebeb712eea
commit 45d710b6d5
1 changed file with 196 additions and 182 deletions

@ -14,20 +14,20 @@ Post-History:
Abstract Abstract
======== ========
This PEP proposes a design for a package that facilitates the evaluation of This PEP proposes a design for a package that facilitates the
callables using threads and processes. evaluation of callables using threads and processes.
========== ==========
Motivation Motivation
========== ==========
Python currently has powerful primitives to construct multi-threaded and Python currently has powerful primitives to construct multi-threaded
multi-process applications but parallelizing simple operations requires a lot of and multi-process applications but parallelizing simple operations
work i.e. explicitly launching processes/threads, constructing a work/results requires a lot of work i.e. explicitly launching processes/threads,
queue, and waiting for completion or some other termination condition (e.g. constructing a work/results queue, and waiting for completion or some
failure, timeout). It is also difficult to design an application with a global other termination condition (e.g. failure, timeout). It is also
process/thread limit when each component invents its own parallel execution difficult to design an application with a global process/thread limit
strategy. when each component invents its own parallel execution strategy.
============= =============
Specification Specification
@ -95,10 +95,10 @@ Web Crawl Example
Interface Interface
--------- ---------
The proposed package provides two core classes: `Executor` and `Future`. The proposed package provides two core classes: `Executor` and
An `Executor` receives asynchronous work requests (in terms of a callable and `Future`. An `Executor` receives asynchronous work requests (in terms
its arguments) and returns a `Future` to represent the execution of that of a callable and its arguments) and returns a `Future` to represent
work request. the execution of that work request.
Executor Executor
'''''''' ''''''''
@ -106,68 +106,70 @@ Executor
`Executor` is an abstract class that provides methods to execute calls `Executor` is an abstract class that provides methods to execute calls
asynchronously. asynchronously.
`submit(fn, *args, **kwargs)` ``submit(fn, *args, **kwargs)``
Schedules the callable to be executed as fn(*\*args*, *\*\*kwargs*) and returns Schedules the callable to be executed as ``fn(*args, **kwargs)``
a `Future` instance representing the execution of the function. and returns a `Future` instance representing the execution of the
function.
This is an abstract method and must be implemented by Executor subclasses. This is an abstract method and must be implemented by Executor
subclasses.
`map(func, *iterables, timeout=None)` ``map(func, *iterables, timeout=None)``
Equivalent to map(*func*, *\*iterables*) but executed asynchronously and Equivalent to ``map(func, *iterables)`` but executed
possibly out-of-order. The returned iterator raises a `TimeoutError` if asynchronously and possibly out-of-order. The returned iterator
`__next__()` is called and the result isn't available after *timeout* seconds raises a `TimeoutError` if `__next__()` is called and the result
from the original call to `map()`. If *timeout* is not specified or isn't available after *timeout* seconds from the original call to
``None`` then there is no limit to the wait time. If a call raises an exception `map()`. If *timeout* is not specified or `None` then there is no
then that exception will be raised when its value is retrieved from the limit to the wait time. If a call raises an exception then that
exception will be raised when its value is retrieved from the
iterator. iterator.
`shutdown(wait=True)` ``shutdown(wait=True)``
Signal the executor that it should free any resources that it is using when Signal the executor that it should free any resources that it is
the currently pending futures are done executing. Calls to using when the currently pending futures are done executing.
`Executor.submit` and `Executor.map` made after shutdown will raise Calls to `Executor.submit` and `Executor.map` made after
`RuntimeError`. shutdown will raise `RuntimeError`.
If wait is `True` then the executor will not return until all the pending If wait is `True` then the executor will not return until all the
futures are done executing and the resources associated with the executor pending futures are done executing and the resources associated
have been freed. with the executor have been freed.
`__enter__()` | ``__enter__()``
`__exit__(exc_type, exc_val, exc_tb)` | ``__exit__(exc_type, exc_val, exc_tb)``
When using an executor as a context manager, `__exit__` will call When using an executor as a context manager, `__exit__` will call
`Executor.shutdown(wait=True)`. ``Executor.shutdown(wait=True)``.
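The `Executor` methods described above can be sketched as follows. This is only an illustration, assuming the `concurrent.futures` module as it ships in the Python standard library, which may differ in detail from the draft text in this diff. The helper `square` is a hypothetical name introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Hypothetical helper used only for this illustration.
    return x * x


# Leaving the "with" block calls shutdown(wait=True) via __exit__,
# so all submitted work has finished once the block ends.
with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(pow, 2, 10)   # schedules pow(2, 10), returns a Future
    single = future.result()               # blocks until the call has finished
    squares = list(executor.map(square, [1, 2, 3]))

# Submitting work after shutdown raises RuntimeError.
try:
    executor.submit(square, 4)
    rejected = False
except RuntimeError:
    rejected = True
```

In the standard library implementation, `map()` yields results in the order of the input iterables even when the calls themselves run out of order.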
ProcessPoolExecutor ProcessPoolExecutor
''''''''''''''''''' '''''''''''''''''''
The `ProcessPoolExecutor` class is an `Executor` subclass that uses a pool of The `ProcessPoolExecutor` class is an `Executor` subclass that uses a
processes to execute calls asynchronously. The callable objects and arguments pool of processes to execute calls asynchronously. The callable
passed to `ProcessPoolExecutor.submit` must be serializable according to the objects and arguments passed to `ProcessPoolExecutor.submit` must be
same limitations as the multiprocessing module. serializable according to the same limitations as the multiprocessing
module.
Calling `Executor` or `Future` methods from within a callable submitted to a Calling `Executor` or `Future` methods from within a callable
`ProcessPoolExecutor` will result in deadlock. submitted to a `ProcessPoolExecutor` will result in deadlock.
`__init__(max_workers)` ``__init__(max_workers)``
Executes calls asynchronously using a pool of at most *max_workers* Executes calls asynchronously using a pool of at most *max_workers*
processes. If *max_workers* is ``None`` or not given then as many worker processes. If *max_workers* is ``None`` or not given then as many
processes will be created as the machine has processors. worker processes will be created as the machine has processors.
ThreadPoolExecutor ThreadPoolExecutor
'''''''''''''''''' ''''''''''''''''''
The `ThreadPoolExecutor` class is an `Executor` subclass that uses a pool of The `ThreadPoolExecutor` class is an `Executor` subclass that uses a
threads to execute calls asynchronously. pool of threads to execute calls asynchronously.
Deadlock can occur when the callable associated with a `Future` waits on Deadlock can occur when the callable associated with a `Future` waits
the results of another `Future`. For example: on the results of another `Future`. For example::
::
import time import time
def wait_on_b(): def wait_on_b():
@ -185,9 +187,7 @@ the results of another `Future`. For example:
a = executor.submit(wait_on_b) a = executor.submit(wait_on_b)
b = executor.submit(wait_on_a) b = executor.submit(wait_on_a)
And: And::
::
def wait_on_future(): def wait_on_future():
f = executor.submit(pow, 5, 2) f = executor.submit(pow, 5, 2)
@ -198,125 +198,134 @@ And:
executor = ThreadPoolExecutor(max_workers=1) executor = ThreadPoolExecutor(max_workers=1)
executor.submit(wait_on_future) executor.submit(wait_on_future)
`__init__(max_workers)` ``__init__(max_workers)``
Executes calls asynchronously using a pool of at most *max_workers* threads. Executes calls asynchronously using a pool of at most
*max_workers* threads.
Future Objects Future Objects
'''''''''''''' ''''''''''''''
The `Future` class encapsulates the asynchronous execution of a function The `Future` class encapsulates the asynchronous execution of a
or method call. `Future` instances are returned by `Executor.submit`. function or method call. `Future` instances are returned by
`Executor.submit`.
`cancel()` ``cancel()``
Attempt to cancel the call. If the call is currently being executed then Attempt to cancel the call. If the call is currently being
it cannot be cancelled and the method will return `False`, otherwise the call executed then it cannot be cancelled and the method will return
will be cancelled and the method will return `True`. `False`, otherwise the call will be cancelled and the method will
return `True`.
`cancelled()` ``cancelled()``
Return `True` if the call was successfully cancelled. Return `True` if the call was successfully cancelled.
`running()` ``running()``
Return `True` if the call is currently being executed and cannot be cancelled. Return `True` if the call is currently being executed and cannot
be cancelled.
`done()` ``done()``
Return `True` if the call was successfully cancelled or finished running. Return `True` if the call was successfully cancelled or finished
running.
`result(timeout=None)` ``result(timeout=None)``
Return the value returned by the call. If the call hasn't yet completed then Return the value returned by the call. If the call hasn't yet
this method will wait up to *timeout* seconds. If the call hasn't completed completed then this method will wait up to *timeout* seconds. If
in *timeout* seconds then a `TimeoutError` will be raised. If *timeout* the call hasn't completed in *timeout* seconds then a
is not specified or ``None`` then there is no limit to the wait time. `TimeoutError` will be raised. If *timeout* is not specified or
`None` then there is no limit to the wait time.
If the future is cancelled before completing then `CancelledError` will If the future is cancelled before completing then `CancelledError`
be raised. will be raised.
If the call raised then this method will raise the same exception. If the call raised then this method will raise the same exception.
`exception(timeout=None)` ``exception(timeout=None)``
Return the exception raised by the call. If the call hasn't yet completed Return the exception raised by the call. If the call hasn't yet
then this method will wait up to *timeout* seconds. If the call hasn't completed then this method will wait up to *timeout* seconds. If
completed in *timeout* seconds then a `TimeoutError` will be raised. the call hasn't completed in *timeout* seconds then a
If *timeout* is not specified or ``None`` then there is no limit to the wait `TimeoutError` will be raised. If *timeout* is not specified or
time. ``None`` then there is no limit to the wait time.
If the future is cancelled before completing then `CancelledError` will If the future is cancelled before completing then `CancelledError`
be raised. will be raised.
If the call completed without raising then ``None`` is returned. If the call completed without raising then `None` is returned.
`add_done_callback(fn)` ``add_done_callback(fn)``
Attaches a function *fn* to the future that will be called when the future is Attaches a function *fn* to the future that will be called when
cancelled or finishes running. *fn* will be called with the future as its only the future is cancelled or finishes running. *fn* will be called
argument. with the future as its only argument.
If the future has already completed or been cancelled then *fn* will be called If the future has already completed or been cancelled then *fn*
immediately. If the same function is added several times then it will still only will be called immediately. If the same function is added several
be called once. times then it will still only be called once.
NOTE: This method can be used to create adapters from Futures to Twisted NOTE: This method can be used to create adapters from Futures to
Deferreds. Twisted Deferreds.
`remove_done_callback(fn)` ``remove_done_callback(fn)``
Removes the function *fn*, which was previously attached to the future using Removes the function *fn*, which was previously attached to the
`add_done_callback`. `KeyError` is raised if the function was not previously future using `add_done_callback`. `KeyError` is raised if the
attached. function was not previously attached.
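A minimal sketch of the `Future` methods above, assuming the standard library `concurrent.futures` module rather than the exact draft in this diff. It uses only `result()`, `exception()`, `done()`, `cancelled()` and `add_done_callback()`; `remove_done_callback` is deliberately left out because it is not part of the standard library module. The function `divide` and the list `seen` are hypothetical names for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def divide(a, b):
    # Hypothetical helper; raises ZeroDivisionError when b == 0.
    return a / b


seen = []  # filled by the done-callback below

with ThreadPoolExecutor(max_workers=1) as executor:
    ok = executor.submit(divide, 6, 3)
    bad = executor.submit(divide, 1, 0)

    # The callback receives the future itself as its only argument.
    ok.add_done_callback(lambda f: seen.append(f.result()))

    value = ok.result(timeout=5)       # value returned by the call
    error = bad.exception(timeout=5)   # exception raised by the call

no_error = ok.exception()              # None: the call completed without raising
finished = ok.done() and not ok.cancelled()
```

Calling `bad.result()` instead of `bad.exception()` would re-raise the `ZeroDivisionError` in the caller.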
Internal Future Methods Internal Future Methods
^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
The following `Future` methods are meant for use in unit tests and `Executor` The following `Future` methods are meant for use in unit tests and
implementations. `Executor` implementations.
`set_running_or_notify_cancel()` ``set_running_or_notify_cancel()``
Should be called by `Executor` implementations before executing the work Should be called by `Executor` implementations before executing
associated with the `Future`. the work associated with the `Future`.
If the method returns `False` then the `Future` was cancelled i.e. If the method returns `False` then the `Future` was cancelled,
`Future.cancel` was called and returned `True`. Any threads waiting on the i.e. `Future.cancel` was called and returned `True`. Any threads
`Future` completing (i.e. through `as_completed()` or `wait()`) will be woken waiting on the `Future` completing (i.e. through `as_completed()`
up. or `wait()`) will be woken up.
If the method returns `True` then the `Future` was not cancelled and has been If the method returns `True` then the `Future` was not cancelled
put in the running state i.e. calls to `Future.running()` will return `True`. and has been put in the running state, i.e. calls to
`Future.running()` will return `True`.
This method can only be called once and cannot be called after This method can only be called once and cannot be called after
`Future.set_result()` or `Future.set_exception()` have been called. `Future.set_result()` or `Future.set_exception()` have been
called.
`set_result(result)` ``set_result(result)``
Sets the result of the work associated with the `Future`. Sets the result of the work associated with the `Future`.
`set_exception(exception)` ``set_exception(exception)``
Sets the result of the work associated with the `Future` to the given Sets the result of the work associated with the `Future` to the
`Exception`. given `Exception`.
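To illustrate how an `Executor` implementation might drive these internal methods, here is a sketch of a hypothetical `InlineExecutor` that runs each call synchronously inside `submit()`. It assumes the standard library `concurrent.futures` module and is not part of the proposal or its reference implementation.

```python
from concurrent.futures import Executor, Future


class InlineExecutor(Executor):
    """Hypothetical executor that evaluates each call immediately in submit()."""

    def submit(self, fn, *args, **kwargs):
        future = Future()
        # Returns False if the future was cancelled before it could start,
        # in which case the work must be skipped.
        if future.set_running_or_notify_cancel():
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                future.set_exception(exc)
            else:
                future.set_result(result)
        return future


executor = InlineExecutor()
done = executor.submit(pow, 5, 2)
failed = executor.submit(int, "not a number")
```

Because the work runs before `submit()` returns, both futures are already complete when the caller receives them; a real executor would hand the future to a worker thread or process instead.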
Module Functions Module Functions
'''''''''''''''' ''''''''''''''''
`wait(fs, timeout=None, return_when=ALL_COMPLETED)` ``wait(fs, timeout=None, return_when=ALL_COMPLETED)``
Wait for the `Future` instances given by *fs* to complete. Returns a named Wait for the `Future` instances given by *fs* to complete.
2-tuple of sets. The first set, named "finished", contains the futures that Returns a named 2-tuple of sets. The first set, named "finished",
completed (finished or were cancelled) before the wait completed. The second contains the futures that completed (finished or were cancelled)
set, named "not_finished", contains uncompleted futures. before the wait completed. The second set, named "not_finished",
contains uncompleted futures.
*timeout* can be used to control the maximum number of seconds to wait before *timeout* can be used to control the maximum number of seconds to
returning. If timeout is not specified or None then there is no limit to the wait before returning. If timeout is not specified or None then
wait time. there is no limit to the wait time.
*return_when* indicates when the method should return. It must be one of the *return_when* indicates when the method should return. It must be
following constants: one of the following constants:
============================= ================================================== ============================= ==================================================
Constant Description Constant Description
@ -329,55 +338,60 @@ following constants:
`ALL_COMPLETED` The method will return when all calls finish. `ALL_COMPLETED` The method will return when all calls finish.
============================= ================================================== ============================= ==================================================
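The behaviour of `wait()` can be sketched as below, again assuming the standard library `concurrent.futures` module. Note two assumptions: the standard library names the fields of the returned tuple `done` and `not_done` (the draft text here calls them "finished" and "not_finished"), and `FIRST_COMPLETED` is the standard library constant for returning as soon as any future completes. The functions `quick` and `blocked` and the event `release` are hypothetical.

```python
import threading
from concurrent.futures import (
    ALL_COMPLETED,
    FIRST_COMPLETED,
    ThreadPoolExecutor,
    wait,
)

release = threading.Event()  # keeps the slow call pending until we set it


def quick():
    return "quick"


def blocked():
    release.wait(timeout=10)
    return "blocked"


with ThreadPoolExecutor(max_workers=2) as executor:
    fast = executor.submit(quick)
    slow = executor.submit(blocked)

    # Returns as soon as at least one future has completed.
    first = wait([fast, slow], timeout=5, return_when=FIRST_COMPLETED)
    fast_finished_first = fast in first.done
    slow_still_pending = slow in first.not_done

    release.set()
    # Returns only when every future has completed (or the timeout expires).
    final = wait([fast, slow], timeout=5, return_when=ALL_COMPLETED)
```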
`as_completed(fs, timeout=None)` ``as_completed(fs, timeout=None)``
Returns an iterator over the `Future` instances given by *fs* that yields Returns an iterator over the `Future` instances given by *fs* that
futures as they complete (finished or were cancelled). Any futures that yields futures as they complete (finished or were cancelled). Any
completed before `as_completed()` was called will be yielded first. The returned futures that completed before `as_completed()` was called will be
iterator raises a `TimeoutError` if `__next__()` is called and the result isn't yielded first. The returned iterator raises a `TimeoutError` if
available after *timeout* seconds from the original call to `as_completed()`. If `__next__()` is called and the result isn't available after
*timeout* is not specified or `None` then there is no limit to the wait time. *timeout* seconds from the original call to `as_completed()`. If
*timeout* is not specified or `None` then there is no limit to the
wait time.
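A short sketch of `as_completed()`, assuming the standard library `concurrent.futures` module. The functions `first` and `second` and the event `gate` are hypothetical and exist only to make the completion order predictable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

gate = threading.Event()  # holds back one call until the other has been yielded


def first():
    return "first"


def second():
    gate.wait(timeout=10)
    return "second"


order = []
with ThreadPoolExecutor(max_workers=2) as executor:
    # Submitted in the opposite order to the one in which they will complete.
    futures = [executor.submit(second), executor.submit(first)]
    for future in as_completed(futures, timeout=10):
        order.append(future.result())
        gate.set()  # release the blocked call once a result has arrived
```

The futures are yielded in completion order, not submission order, which is why `order` ends up as `["first", "second"]`.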
========= =========
Rationale Rationale
========= =========
The proposed design of this module was heavily influenced by the Java The proposed design of this module was heavily influenced by the
java.util.concurrent package [1]_. The conceptual basis of the module, as in Java java.util.concurrent package [1]_. The conceptual basis of the
Java, is the Future class, which represents the progress and result of an module, as in Java, is the Future class, which represents the progress
asynchronous computation. The Future class makes little commitment to the and result of an asynchronous computation. The Future class makes
evaluation mode being used e.g. it can be used to represent lazy or eager little commitment to the evaluation mode being used e.g. it can be
evaluation, for evaluation using threads, processes or remote procedure call. used to represent lazy or eager evaluation, for evaluation using
threads, processes or remote procedure call.
Futures are created by concrete implementations of the Executor class Futures are created by concrete implementations of the Executor class
(called ExecutorService in Java). The reference implementation provides (called ExecutorService in Java). The reference implementation
classes that use either a process or a thread pool to eagerly evaluate provides classes that use either a process or a thread pool to eagerly
computations. evaluate computations.
Futures have already been seen in Python as part of a popular Python Futures have already been seen in Python as part of a popular Python
cookbook recipe [2]_ and have been discussed on the Python-3000 mailing list [3]_. cookbook recipe [2]_ and have been discussed on the Python-3000 mailing
list [3]_.
The proposed design is explicit i.e. it requires that clients be aware that The proposed design is explicit, i.e. it requires that clients be
they are consuming Futures. It would be possible to design a module that aware that they are consuming Futures. It would be possible to design
would return proxy objects (in the style of `weakref`) that could be used a module that would return proxy objects (in the style of `weakref`)
transparently. It is possible to build a proxy implementation on top of that could be used transparently. It is possible to build a proxy
the proposed explicit mechanism. implementation on top of the proposed explicit mechanism.
The proposed design does not introduce any changes to Python language syntax The proposed design does not introduce any changes to Python language
or semantics. Special syntax could be introduced [4]_ to mark function and syntax or semantics. Special syntax could be introduced [4]_ to mark
method calls as asynchronous. A proxy result would be returned while the function and method calls as asynchronous. A proxy result would be
operation is eagerly evaluated asynchronously, and execution would only returned while the operation is eagerly evaluated asynchronously, and
block if the proxy object were used before the operation completed. execution would only block if the proxy object were used before the
operation completed.
Anh Hai Trinh proposed a simpler but more limited API concept [5]_ and the API Anh Hai Trinh proposed a simpler but more limited API concept [5]_ and
has been discussed in some detail on stdlib-sig [6]_. the API has been discussed in some detail on stdlib-sig [6]_.
======================== ========================
Reference Implementation Reference Implementation
======================== ========================
The reference implementation [7]_ contains a complete implementation of the The reference implementation [7]_ contains a complete implementation
proposed design. It has been tested on Linux and Mac OS X. of the proposed design. It has been tested on Linux and Mac OS X.
========== ==========
References References