changes from PEP authors; corrections
parent 43996147f7
commit 1cabbf4c39
|
@ -96,7 +96,7 @@ Index by Category
|
||||||
S 364 Transitioning to the Py3K Standard Library Warsaw
|
S 364 Transitioning to the Py3K Standard Library Warsaw
|
||||||
S 368 Standard image protocol and class Mastrodomenico
|
S 368 Standard image protocol and class Mastrodomenico
|
||||||
S 369 Post import hooks Heimes
|
S 369 Post import hooks Heimes
|
||||||
S 371 Addition of the Processing module Noller, Oudkerk
|
S 371 Addition of the multiprocessing package Noller, Oudkerk
|
||||||
S 3134 Exception Chaining and Embedded Tracebacks Yee
|
S 3134 Exception Chaining and Embedded Tracebacks Yee
|
||||||
S 3135 New Super Spealman, Delaney
|
S 3135 New Super Spealman, Delaney
|
||||||
S 3138 String representation in Python 3000 Ishimoto
|
S 3138 String representation in Python 3000 Ishimoto
|
||||||
|
@ -475,7 +475,7 @@ Numerical Index
|
||||||
S 368 Standard image protocol and class Mastrodomenico
|
S 368 Standard image protocol and class Mastrodomenico
|
||||||
S 369 Post import hooks Heimes
|
S 369 Post import hooks Heimes
|
||||||
SA 370 Per user site-packages directory Heimes
|
SA 370 Per user site-packages directory Heimes
|
||||||
S 371 Addition of the Processing module Noller, Oudkerk
|
S 371 Addition of the multiprocessing package Noller, Oudkerk
|
||||||
SR 666 Reject Foolish Indentation Creighton
|
SR 666 Reject Foolish Indentation Creighton
|
||||||
SR 754 IEEE 754 Floating Point Special Values Warnes
|
SR 754 IEEE 754 Floating Point Special Values Warnes
|
||||||
P 3000 Python 3000 GvR
|
P 3000 Python 3000 GvR
|
||||||
|
|
132
pep-0371.txt
|
@ -1,5 +1,5 @@
|
||||||
PEP: 371
|
PEP: 371
|
||||||
Title: Addition of the Processing module to standard library
|
Title: Addition of the multiprocessing package to the standard library
|
||||||
Version: $Revision: $
|
Version: $Revision: $
|
||||||
Last-Modified: $Date: $
|
Last-Modified: $Date: $
|
||||||
Author: Jesse Noller <jnoller@gmail.com>
|
Author: Jesse Noller <jnoller@gmail.com>
|
||||||
|
@ -14,22 +14,22 @@ Post-History:
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
|
|
||||||
This PEP proposes the inclusion of the pyProcessing [1] module into the
|
This PEP proposes the inclusion of the pyProcessing [1] package into the
|
||||||
python standard library.
|
Python standard library, renamed to "multiprocessing".
|
||||||
|
|
||||||
The processing module mimics the standard library threading module and API
|
The processing package mimics the standard library threading module and API
|
||||||
to provide a process-based approach to "threaded programming" allowing
|
to provide a process-based approach to "threaded programming" allowing
|
||||||
end-users to dispatch multiple tasks that effectively side-step the global
|
end-users to dispatch multiple tasks that effectively side-step the global
|
||||||
interpreter lock.
|
interpreter lock.
|
||||||
|
|
||||||
The module also provides server and client modules to provide remote-
|
The package also provides server and client functionality (processing.Manager)
|
||||||
sharing and management of objects and tasks so that applications may not
|
to provide remote sharing and management of objects and tasks so that
|
||||||
only leverage multiple cores on the local machine, but also distribute
|
applications may not only leverage multiple cores on the local machine,
|
||||||
objects and tasks across a cluster of networked machines.
|
but also distribute objects and tasks across a cluster of networked machines.
|
||||||
|
|
||||||
While the distributed capabilities of the module are beneficial, the primary
|
While the distributed capabilities of the package are beneficial, the primary
|
||||||
focus of this PEP is the core threading-like API and capabilities of the
|
focus of this PEP is the core threading-like API and capabilities of the
|
||||||
module.
|
package.
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
|
||||||
|
@ -41,20 +41,20 @@ Rationale
|
||||||
Python programmers who are leveraging multi-core machines.
|
Python programmers who are leveraging multi-core machines.
|
||||||
|
|
||||||
The GIL itself prevents more than a single thread from running within the
|
The GIL itself prevents more than a single thread from running within the
|
||||||
interpreter at any given point in time, effectively removing python's
|
interpreter at any given point in time, effectively removing Python's
|
||||||
ability to take advantage of multi-processor systems. While I/O bound
|
ability to take advantage of multi-processor systems. While I/O bound
|
||||||
applications do not suffer the same slow-down when using threading, they do
|
applications do not suffer the same slow-down when using threading, they do
|
||||||
suffer some performance cost due to the GIL.
|
suffer some performance cost due to the GIL.
|
||||||
|
|
||||||
The Processing module offers a method to side-step the GIL allowing
|
The pyProcessing package offers a method to side-step the GIL allowing
|
||||||
applications within CPython to take advantage of multi-core architectures
|
applications within CPython to take advantage of multi-core architectures
|
||||||
without asking users to completely change their programming paradigm (i.e.:
|
without asking users to completely change their programming paradigm (i.e.:
|
||||||
dropping threaded programming for another "concurrent" approach - Twisted,
|
dropping threaded programming for another "concurrent" approach - Twisted,
|
||||||
etc).
|
etc).
|
||||||
|
|
||||||
The Processing module offers CPython users a known API (that of the
|
The Processing package offers CPython users a known API (that of the
|
||||||
threading module), with known semantics and easy-scalability. In the
|
threading module), with known semantics and easy-scalability. In the
|
||||||
future, the module might not be as relevant should the CPython interpreter
|
future, the package might not be as relevant should the CPython interpreter
|
||||||
enable "true" threading, however for some applications, forking an OS
|
enable "true" threading, however for some applications, forking an OS
|
||||||
process may sometimes be more desirable than using lightweight threads,
|
process may sometimes be more desirable than using lightweight threads,
|
||||||
especially on those platforms where process creation is fast/optimized.
|
especially on those platforms where process creation is fast/optimized.
|
||||||
|
@ -70,7 +70,7 @@ Rationale
|
||||||
t.start()
|
t.start()
|
||||||
t.join()
|
t.join()
|
||||||
|
|
||||||
The pyprocessing module mirrors the API so well, that with a simple change
|
The pyprocessing package mirrors the API so well, that with a simple change
|
||||||
of the import to:
|
of the import to:
|
||||||
|
|
||||||
from processing import Process as worker
|
from processing import Process as worker
|
||||||
|
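Note on the hunk above: the import swap it describes can be sketched as follows. This is a minimal illustration, not code from the PEP. It assumes the package is importable under its post-rename name, multiprocessing, and the helper names busy_work and run_workers are hypothetical. The same driver function accepts either threading.Thread or multiprocessing.Process because both expose the target=, start(), join() and is_alive() interface.

```python
import multiprocessing
import threading


def busy_work(n=10000):
    """CPU-bound placeholder task (hypothetical, for illustration only)."""
    total = 0
    for i in range(n):
        total += i * i


def run_workers(worker_cls, count=4):
    """Start `count` workers of the given class, wait for them all,
    and return how many have finished."""
    workers = [worker_cls(target=busy_work) for _ in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sum(1 for w in workers if not w.is_alive())


if __name__ == "__main__":
    # Same driver code; only the worker class changes.
    print("threads finished:  ", run_workers(threading.Thread))
    print("processes finished:", run_workers(multiprocessing.Process))
```

The `if __name__ == "__main__"` guard matters for the process-based run: on platforms that start children by re-importing the main module, unguarded start-up code would be executed again in each child.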
@ -78,17 +78,17 @@ Rationale
|
||||||
The code now executes through the processing.Process class. This type of
|
The code now executes through the processing.Process class. This type of
|
||||||
compatibility means that, with a minor (in most cases) change in code,
|
compatibility means that, with a minor (in most cases) change in code,
|
||||||
users' applications will be able to leverage all cores and processors on a
|
users' applications will be able to leverage all cores and processors on a
|
||||||
given machine for parallel execution. In many cases the pyprocessing module
|
given machine for parallel execution. In many cases the pyprocessing package
|
||||||
is even faster than the normal threading approach for I/O bound programs.
|
is even faster than the normal threading approach for I/O bound programs.
|
||||||
This of course, takes into account that the pyprocessing module is in
|
This of course, takes into account that the pyprocessing package is in
|
||||||
optimized C code, while the threading module is not.
|
optimized C code, while the threading module is not.
|
||||||
|
|
||||||
The "Distributed" Problem
|
The "Distributed" Problem
|
||||||
|
|
||||||
In the discussion on Python-Dev about the inclusion of this module [3] there
|
In the discussion on Python-Dev about the inclusion of this package [3] there
|
||||||
was confusion about the intentions this PEP with an attempt to solve the
|
was confusion about the intentions this PEP with an attempt to solve the
|
||||||
"Distributed" problem - frequently comparing the functionality of this
|
"Distributed" problem - frequently comparing the functionality of this
|
||||||
module with other solutions like MPI-based communication [4], CORBA, or
|
package with other solutions like MPI-based communication [4], CORBA, or
|
||||||
other distributed object approaches [5].
|
other distributed object approaches [5].
|
||||||
|
|
||||||
The "distributed" problem is large and varied. Each programmer working
|
The "distributed" problem is large and varied. Each programmer working
|
||||||
|
@ -96,24 +96,24 @@ The "Distributed" Problem
|
||||||
module/method or a highly customized problem for which no existing solution
|
module/method or a highly customized problem for which no existing solution
|
||||||
works.
|
works.
|
||||||
|
|
||||||
The acceptance of this module does not preclude or recommend that
|
The acceptance of this package does not preclude or recommend that
|
||||||
programmers working on the "distributed" problem not examine other solutions
|
programmers working on the "distributed" problem not examine other solutions
|
||||||
for their problem domain. The intent of including this module is to provide
|
for their problem domain. The intent of including this package is to provide
|
||||||
entry-level capabilities for local concurrency and the basic support to
|
entry-level capabilities for local concurrency and the basic support to
|
||||||
spread that concurrency across a network of machines - although the two are
|
spread that concurrency across a network of machines - although the two are
|
||||||
not tightly coupled, the pyprocessing module could in fact, be used in
|
not tightly coupled, the pyprocessing package could in fact, be used in
|
||||||
conjunction with any of the other solutions including MPI/etc.
|
conjunction with any of the other solutions including MPI/etc.
|
||||||
|
|
||||||
If necessary - it is possible to completely decouple the local concurrency
|
If necessary - it is possible to completely decouple the local concurrency
|
||||||
abilities of the module from the network-capable/shared aspects of the
|
abilities of the package from the network-capable/shared aspects of the
|
||||||
module. Without serious concerns or cause however, the author of this PEP
|
package. Without serious concerns or cause however, the author of this PEP
|
||||||
does not recommend that approach.
|
does not recommend that approach.
|
||||||
|
|
||||||
Performance Comparison
|
Performance Comparison
|
||||||
|
|
||||||
As we all know - there are "lies, damned lies, and benchmarks". These speed
|
As we all know - there are "lies, damned lies, and benchmarks". These speed
|
||||||
comparisons, while aimed at showcasing the performance of the pyprocessing
|
comparisons, while aimed at showcasing the performance of the pyprocessing
|
||||||
module, are by no means comprehensive or applicable to all possible use
|
package, are by no means comprehensive or applicable to all possible use
|
||||||
cases or environments. Especially for those platforms with sluggish process
|
cases or environments. Especially for those platforms with sluggish process
|
||||||
forking timing.
|
forking timing.
|
||||||
|
|
||||||
|
@ -157,10 +157,10 @@ Performance Comparison
|
||||||
threaded (8 threads) 0.007990 seconds
|
threaded (8 threads) 0.007990 seconds
|
||||||
processes (8 procs) 0.005512 seconds
|
processes (8 procs) 0.005512 seconds
|
||||||
|
|
||||||
As you can see, process forking via the pyprocessing module is faster than
|
As you can see, process forking via the pyprocessing package is faster than
|
||||||
the speed of building and then executing the threaded version of the code.
|
the speed of building and then executing the threaded version of the code.
|
||||||
|
|
||||||
The second test calculates 50000 fibonacci numbers inside of each thread
|
The second test calculates 50000 Fibonacci numbers inside of each thread
|
||||||
(isolated and shared nothing):
|
(isolated and shared nothing):
|
||||||
|
|
||||||
cmd: python run_benchmarks.py fibonacci.py
|
cmd: python run_benchmarks.py fibonacci.py
|
||||||
|
@ -209,7 +209,7 @@ Performance Comparison
|
||||||
showcase how the current threading implementation does hinder non-I/O
|
showcase how the current threading implementation does hinder non-I/O
|
||||||
applications. Obviously, these tests could be improved to use a queue for
|
applications. Obviously, these tests could be improved to use a queue for
|
||||||
coordination of results and chunks of work but that is not required to show
|
coordination of results and chunks of work but that is not required to show
|
||||||
the performance of the module.
|
the performance of the package and core Processing module.
|
||||||
|
|
||||||
The next test is an I/O bound test. This is normally where we see a steep
|
The next test is an I/O bound test. This is normally where we see a steep
|
||||||
improvement in the threading module approach versus a single-threaded
|
improvement in the threading module approach versus a single-threaded
|
||||||
|
@ -264,51 +264,93 @@ Performance Comparison
|
||||||
processes (8 procs) 0.298625 seconds
|
processes (8 procs) 0.298625 seconds
|
||||||
|
|
||||||
We finally see threaded performance surpass that of single-threaded
|
We finally see threaded performance surpass that of single-threaded
|
||||||
execution, but the pyprocessing module is still faster when increasing the
|
execution, but the pyprocessing package is still faster when increasing the
|
||||||
number of workers. If you stay with one or two threads/workers, then the
|
number of workers. If you stay with one or two threads/workers, then the
|
||||||
timing between threads and pyprocessing is fairly close.
|
timing between threads and pyprocessing is fairly close.
|
||||||
|
|
||||||
Additional benchmarks can be found in the pyprocessing module's source
|
One item of note however, is that there is an implicit overhead within the
|
||||||
distribution's examples/ directory.
|
pyprocessing package's Queue implementation due to the object serialization.
|
||||||
|
|
||||||
|
Alec Thomas provided a short example based on the run_benchmarks.py script
|
||||||
|
to demonstrate this overhead versus the default Queue implementation:
|
||||||
|
|
||||||
|
cmd: run_bench_queue.py
|
||||||
|
non_threaded (1 iters) 0.010546 seconds
|
||||||
|
threaded (1 threads) 0.015164 seconds
|
||||||
|
processes (1 procs) 0.066167 seconds
|
||||||
|
|
||||||
|
non_threaded (2 iters) 0.020768 seconds
|
||||||
|
threaded (2 threads) 0.041635 seconds
|
||||||
|
processes (2 procs) 0.084270 seconds
|
||||||
|
|
||||||
|
non_threaded (4 iters) 0.041718 seconds
|
||||||
|
threaded (4 threads) 0.086394 seconds
|
||||||
|
processes (4 procs) 0.144176 seconds
|
||||||
|
|
||||||
|
non_threaded (8 iters) 0.083488 seconds
|
||||||
|
threaded (8 threads) 0.184254 seconds
|
||||||
|
processes (8 procs) 0.302999 seconds
|
||||||
|
|
||||||
|
Additional benchmarks can be found in the pyprocessing package's source
|
||||||
|
distribution's examples/ directory. The examples will be included in the
|
||||||
|
package's documentation.
|
||||||
|
|
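Note on the Queue overhead described above: one way to see where the serialization cost comes from is to round-trip an object through both queue types inside a single process. This is a hedged sketch, not the run_bench_queue.py script mentioned in the diff. It assumes the modern module names queue and multiprocessing, and the helper names are hypothetical. The thread-oriented queue hands back the very same object, while the process-oriented queue pickles the object on put() and unpickles it on get(), so an equal copy comes back.

```python
import multiprocessing
import queue


def round_trip(q, payload):
    """Put one object on the queue and take it straight back off."""
    q.put(payload)
    return q.get(timeout=5)


def same_object_after_round_trip(q):
    """Return (is_equal, is_identical) for a payload sent through `q`."""
    payload = {"numbers": list(range(1000))}
    result = round_trip(q, payload)
    return result == payload, result is payload


if __name__ == "__main__":
    # Thread queue: the object is passed by reference, no copying.
    print("queue.Queue:          ", same_object_after_round_trip(queue.Queue()))

    # Process queue: the object is serialized and rebuilt, so the
    # result is equal to the payload but is a different object.
    mp_queue = multiprocessing.Queue()
    print("multiprocessing.Queue:", same_object_after_round_trip(mp_queue))
    mp_queue.close()
    mp_queue.join_thread()
```

The copy is what makes sharing across process boundaries possible, and it is also the per-item cost that shows up in the benchmark figures quoted above.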
||||||
Maintenance
|
Maintenance
|
||||||
|
|
||||||
Richard M. Oudkerk - the author of the pyprocessing module has agreed to
|
Richard M. Oudkerk - the author of the pyprocessing package has agreed to
|
||||||
maintaing the module within Python SVN. Jesse Noller has volunteered to
|
maintain the package within Python SVN. Jesse Noller has volunteered to
|
||||||
also help maintain/document and test the module.
|
also help maintain/document and test the package.
|
||||||
|
|
||||||
|
API Naming
|
||||||
|
|
||||||
|
The API of the pyprocessing package is designed to closely mimic that of
|
||||||
|
the threading and Queue modules. It has been proposed that instead of
|
||||||
|
adding the package as-is, we rename it to be PEP 8 compliant instead.
|
||||||
|
|
||||||
|
Since the aim of the package is to be a drop-in for the threading
|
||||||
|
module, the authors feel that the current API should be used.
|
||||||
|
When the threading and Queue modules are updated to fully reflect
|
||||||
|
PEP 8, the pyprocessing/multiprocessing naming can be revised.
|
||||||
|
|
||||||
Timing/Schedule
|
Timing/Schedule
|
||||||
|
|
||||||
Some concerns have been raised about the timing/lateness of this PEP
|
Some concerns have been raised about the timing/lateness of this PEP
|
||||||
for the 2.6 and 3.0 releases this year, however it is felt by both
|
for the 2.6 and 3.0 releases this year, however it is felt by both
|
||||||
the authors and others that the functionality this module offers
|
the authors and others that the functionality this package offers
|
||||||
surpasses the risk of inclusion.
|
surpasses the risk of inclusion.
|
||||||
|
|
||||||
However, taking into account the desire not to destabilize python-core, some
|
However, taking into account the desire not to destabilize Python-core, some
|
||||||
refactoring of pyprocessing's code "into" python-core can be withheld until
|
refactoring of pyprocessing's code "into" Python-core can be withheld until
|
||||||
the next 2.x/3.x releases. This means that the actual risk to python-core
|
the next 2.x/3.x releases. This means that the actual risk to Python-core
|
||||||
is minimal, and largely constrained to the actual module itself.
|
is minimal, and largely constrained to the actual package itself.
|
||||||
|
|
||||||
Open Issues
|
Open Issues
|
||||||
|
|
||||||
* All existing tests for the module should be converted to UnitTest format.
|
* All existing tests for the package should be converted to UnitTest format.
|
||||||
* Existing documentation has to be moved to ReST formatting.
|
* Existing documentation has to be moved to ReST formatting.
|
||||||
* Verify code coverage percentage of existing test suite.
|
* Verify code coverage percentage of existing test suite.
|
||||||
* Identify any requirements to achieve a 1.0 milestone if required.
|
* Identify any requirements to achieve a 1.0 milestone if required.
|
||||||
* Verify current source tree conforms to standard library practices.
|
* Verify current source tree conforms to standard library practices.
|
||||||
* Rename top-level module from "pyprocessing" to "multiprocessing".
|
* Rename top-level package from "pyprocessing" to "multiprocessing".
|
||||||
* Confirm no "default" remote connection capabilities, if needed enable the
|
* Confirm no "default" remote connection capabilities, if needed enable the
|
||||||
remote security mechanisms by default for those classes which offer remote
|
remote security mechanisms by default for those classes which offer remote
|
||||||
capabilities.
|
capabilities.
|
||||||
* Some of the API (Queue methods qsize(), task_done() and join()) either
|
* Some of the API (Queue methods qsize(), task_done() and join()) either
|
||||||
need to be added, or the reason for their exclusion needs to be identified
|
need to be added, or the reason for their exclusion needs to be identified
|
||||||
and documented clearly.
|
and documented clearly.
|
||||||
|
* Add in "multiprocessing.setExecutable()" method to override the default
|
||||||
|
behavior of the package to spawn processes using the current executable
|
||||||
|
name rather than the Python interpreter. Note that Mark Hammond has
|
||||||
|
suggested a factory-style interface for this[7].
|
||||||
|
* Also note that the default behavior of process spawning does not make
|
||||||
|
it compatible with use within IDLE as-is, this will be examined as
|
||||||
|
a bug-fix or "setExecutable" enhancement.
|
||||||
|
|
||||||
Closed Issues
|
Closed Issues
|
||||||
|
|
||||||
* Reliance on ctypes: The pyprocessing module's reliance on ctypes prevents
|
* Reliance on ctypes: The pyprocessing package's reliance on ctypes prevents
|
||||||
the module from functioning on platforms where ctypes is not supported.
|
the package from functioning on platforms where ctypes is not supported.
|
||||||
This is not a restriction of this module, but rather ctypes.
|
This is not a restriction of this package, but rather of ctypes.
|
||||||
|
|
||||||
References
|
References
|
||||||
|
|
||||||
|
@ -330,6 +372,8 @@ References
|
||||||
Magazine in December 2008: "Python Threads and the Global Interpreter
|
Magazine in December 2008: "Python Threads and the Global Interpreter
|
||||||
Lock" by Jesse Noller. It has been modified for this PEP.
|
Lock" by Jesse Noller. It has been modified for this PEP.
|
||||||
|
|
||||||
|
[7] http://groups.google.com/group/python-dev2/msg/54cf06d15cbcbc34
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
|
|
||||||
This document has been placed in the public domain.
|
This document has been placed in the public domain.
|
||||||
|
|