changes from PEP authors; corrections

David Goodger 2008-06-03 14:14:50 +00:00
parent 43996147f7
commit 1cabbf4c39
2 changed files with 90 additions and 46 deletions

pep-0000.txt

@@ -96,7 +96,7 @@ Index by Category
 S   364  Transitioning to the Py3K Standard Library   Warsaw
 S   368  Standard image protocol and class             Mastrodomenico
 S   369  Post import hooks                             Heimes
-S   371  Addition of the Processing module             Noller, Oudkerk
+S   371  Addition of the multiprocessing package       Noller, Oudkerk
 S  3134  Exception Chaining and Embedded Tracebacks    Yee
 S  3135  New Super                                     Spealman, Delaney
 S  3138  String representation in Python 3000          Ishimoto
@@ -475,7 +475,7 @@ Numerical Index
 S   368  Standard image protocol and class             Mastrodomenico
 S   369  Post import hooks                             Heimes
 SA  370  Per user site-packages directory              Heimes
-S   371  Addition of the Processing module             Noller, Oudkerk
+S   371  Addition of the multiprocessing package       Noller, Oudkerk
 SR  666  Reject Foolish Indentation                    Creighton
 SR  754  IEEE 754 Floating Point Special Values        Warnes
 P  3000  Python 3000                                   GvR

pep-0371.txt

@@ -1,5 +1,5 @@
PEP: 371
-Title: Addition of the Processing module to standard library
+Title: Addition of the multiprocessing package to the standard library
Version: $Revision: $
Last-Modified: $Date: $
Author: Jesse Noller <jnoller@gmail.com>
@@ -14,22 +14,22 @@ Post-History:
Abstract
-This PEP proposes the inclusion of the pyProcessing [1] module into the
-python standard library.
+This PEP proposes the inclusion of the pyProcessing [1] package into the
+Python standard library, renamed to "multiprocessing".

-The processing module mimics the standard library threading module and API
+The processing package mimics the standard library threading module and API
to provide a process-based approach to "threaded programming" allowing
end-users to dispatch multiple tasks that effectively side-step the global
interpreter lock.
-The module also provides server and client modules to provide remote-
-sharing and management of objects and tasks so that applications may not
-only leverage multiple cores on the local machine, but also distribute
-objects and tasks across a cluster of networked machines.
+The package also provides server and client functionality (processing.Manager)
+to provide remote sharing and management of objects and tasks so that
+applications may not only leverage multiple cores on the local machine,
+but also distribute objects and tasks across a cluster of networked machines.
-While the distributed capabilities of the module are beneficial, the primary
+While the distributed capabilities of the package are beneficial, the primary
focus of this PEP is the core threading-like API and capabilities of the
-module.
+package.
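
A rough sketch of the Manager-based sharing described above, written against
the post-rename name "multiprocessing" (the 2008 spelling was
processing.Manager); a proxied list is shared across process boundaries:

    import multiprocessing

    def work(shared):
        # Mutations travel through a proxy to the manager's server process.
        shared.append("done")

    if __name__ == "__main__":
        with multiprocessing.Manager() as manager:
            shared = manager.list()   # process-safe, proxied list
            p = multiprocessing.Process(target=work, args=(shared,))
            p.start()
            p.join()
            print(list(shared))       # ['done']

The same machinery (multiprocessing.managers.BaseManager) can listen on a
network address, which is the remote-sharing capability referred to above.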
Rationale
@@ -41,20 +41,20 @@ Rationale
Python programmers who are leveraging multi-core machines.
The GIL itself prevents more than a single thread from running within the
-interpreter at any given point in time, effectively removing python's
+interpreter at any given point in time, effectively removing Python's
ability to take advantage of multi-processor systems. While I/O bound
applications do not suffer the same slow-down when using threading, they do
suffer some performance cost due to the GIL.
-The Processing module offers a method to side-step the GIL allowing
+The pyProcessing package offers a method to side-step the GIL allowing
applications within CPython to take advantage of multi-core architectures
without asking users to completely change their programming paradigm (i.e.,
dropping threaded programming for another "concurrent" approach such as
Twisted).
-The Processing module offers CPython users a known API (that of the
+The Processing package offers CPython users a known API (that of the
threading module), with known semantics and easy scalability. In the
-future, the module might not be as relevant should the CPython interpreter
+future, the package might not be as relevant should the CPython interpreter
enable "true" threading; however, for some applications, forking an OS
process may sometimes be more desirable than using lightweight threads,
especially on those platforms where process creation is fast/optimized.
@@ -70,7 +70,7 @@ Rationale
    t.start()
    t.join()
-The pyprocessing module mirrors the API so well, that with a simple change
+The pyprocessing package mirrors the API so well, that with a simple change
of the import to:
    from processing import Process as worker
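
Filled out, the swap reads roughly as follows (a sketch, not the PEP's
verbatim example; "afunc" is an illustrative stand-in for real work):

    from threading import Thread
    from multiprocessing import Process   # "processing" before the rename

    def afunc(number):
        print(number * 3)

    if __name__ == "__main__":
        t = Thread(target=afunc, args=(4,))
        t.start()
        t.join()

        # Identical construction, start, and join semantics:
        p = Process(target=afunc, args=(4,))
        p.start()
        p.join()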
@@ -78,17 +78,17 @@ Rationale
The code now executes through the processing.Process class. This type of
compatibility means that, with a minor (in most cases) change in code,
users' applications will be able to leverage all cores and processors on a
-given machine for parallel execution. In many cases the pyprocessing module
+given machine for parallel execution. In many cases the pyprocessing package
is even faster than the normal threading approach for I/O bound programs.
-This, of course, takes into account that the pyprocessing module is in
+This, of course, takes into account that the pyprocessing package is in
optimized C code, while the threading module is not.
The "Distributed" Problem
-In the discussion on Python-Dev about the inclusion of this module [3] there
+In the discussion on Python-Dev about the inclusion of this package [3] there
was confusion about the intentions of this PEP, conflating it with an attempt
to solve the "Distributed" problem, and frequently comparing the functionality
-of this module with other solutions like MPI-based communication [4], CORBA, or
+of this package with other solutions like MPI-based communication [4], CORBA, or
other distributed object approaches [5].
The "distributed" problem is large and varied. Each programmer working
@@ -96,24 +96,24 @@ The "Distributed" Problem
module/method or a highly customized problem for which no existing solution
works.
-The acceptance of this module does not preclude or discourage
+The acceptance of this package does not preclude or discourage
programmers working on the "distributed" problem from examining other solutions
-for their problem domain. The intent of including this module is to provide
+for their problem domain. The intent of including this package is to provide
entry-level capabilities for local concurrency and the basic support to
spread that concurrency across a network of machines. Although the two are
-not tightly coupled, the pyprocessing module could, in fact, be used in
+not tightly coupled, the pyprocessing package could, in fact, be used in
conjunction with any of the other solutions, including MPI, etc.
If necessary, it is possible to completely decouple the local concurrency
-abilities of the module from the network-capable/shared aspects of the
-module. Without serious concerns or cause, however, the author of this PEP
+abilities of the package from the network-capable/shared aspects of the
+package. Without serious concerns or cause, however, the author of this PEP
does not recommend that approach.
Performance Comparison
As we all know, there are "lies, damned lies, and benchmarks". These speed
comparisons, while aimed at showcasing the performance of the pyprocessing
-module, are by no means comprehensive or applicable to all possible use
+package, are by no means comprehensive or applicable to all possible use
cases or environments, especially on platforms where process forking is slow.
@@ -157,10 +157,10 @@ Performance Comparison
threaded (8 threads) 0.007990 seconds
processes (8 procs) 0.005512 seconds
-As you can see, process forking via the pyprocessing module is faster than
+As you can see, process forking via the pyprocessing package is faster than
the speed of building and then executing the threaded version of the code.
-The second test calculates 50000 fibonacci numbers inside of each thread
+The second test calculates 50000 Fibonacci numbers inside of each thread
(isolated and shared nothing):
cmd: python run_benchmarks.py fibonacci.py
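
The shape of such a CPU-bound test can be sketched as follows (a hedged
reconstruction, not the run_benchmarks.py harness itself); each worker
computes Fibonacci numbers independently, so threads serialize on the GIL
while processes run in parallel:

    import time
    from threading import Thread
    from multiprocessing import Process

    def fib_worker(count=50000):
        # Iteratively compute `count` Fibonacci numbers, sharing nothing.
        a, b = 0, 1
        for _ in range(count):
            a, b = b, a + b

    def run(worker_cls, n=8):
        workers = [worker_cls(target=fib_worker) for _ in range(n)]
        start = time.perf_counter()
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return time.perf_counter() - start

    if __name__ == "__main__":
        print("threaded  (8 threads) %.6f seconds" % run(Thread))
        print("processes (8 procs)   %.6f seconds" % run(Process))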
@@ -209,7 +209,7 @@ Performance Comparison
showcase how the current threading implementation does hinder non-I/O
applications. Obviously, these tests could be improved to use a queue for
coordination of results and chunks of work, but that is not required to show
-the performance of the module.
+the performance of the package and core Processing module.
The next test is an I/O bound test. This is normally where we see a steep
improvement in the threading module approach versus a single-threaded
@@ -264,51 +264,93 @@ Performance Comparison
processes (8 procs) 0.298625 seconds
We finally see threaded performance surpass that of single-threaded
-execution, but the pyprocessing module is still faster when increasing the
+execution, but the pyprocessing package is still faster when increasing the
number of workers. If you stay with one or two threads/workers, then the
timing between threads and pyprocessing is fairly close.
-Additional benchmarks can be found in the pyprocessing module's source
-distribution's examples/ directory.
+One item of note, however, is that there is an implicit overhead within the
+pyprocessing package's Queue implementation due to the object serialization.
+
+Alec Thomas provided a short example based on the run_benchmarks.py script
+to demonstrate this overhead versus the default Queue implementation:
+
+cmd: run_bench_queue.py
+non_threaded (1 iters)  0.010546 seconds
+threaded (1 threads)    0.015164 seconds
+processes (1 procs)     0.066167 seconds
+
+non_threaded (2 iters)  0.020768 seconds
+threaded (2 threads)    0.041635 seconds
+processes (2 procs)     0.084270 seconds
+
+non_threaded (4 iters)  0.041718 seconds
+threaded (4 threads)    0.086394 seconds
+processes (4 procs)     0.144176 seconds
+
+non_threaded (8 iters)  0.083488 seconds
+threaded (8 threads)    0.184254 seconds
+processes (8 procs)     0.302999 seconds
+
+Additional benchmarks can be found in the pyprocessing package's source
+distribution's examples/ directory. The examples will be included in the
+package's documentation.
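
The overhead being measured can be sketched in a single process (a rough
modern equivalent, not Alec Thomas's run_bench_queue.py script): each put()
on a multiprocessing queue pickles the object and pushes it through an OS
pipe, whereas the threading Queue merely passes a reference:

    import time
    from queue import Queue                       # passes references
    from multiprocessing import Queue as MPQueue  # pickles through a pipe

    def roundtrip(q, n=10000):
        start = time.perf_counter()
        for i in range(n):
            q.put(i)
            q.get()   # blocks until the feeder thread delivers the item
        return time.perf_counter() - start

    if __name__ == "__main__":
        print("threading Queue:       %.4f seconds" % roundtrip(Queue()))
        print("multiprocessing Queue: %.4f seconds" % roundtrip(MPQueue()))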
Maintenance
-Richard M. Oudkerk - the author of the pyprocessing module has agreed to
-maintaing the module within Python SVN. Jesse Noller has volunteered to
-also help maintain/document and test the module.
+Richard M. Oudkerk, the author of the pyprocessing package, has agreed to
+maintain the package within Python SVN. Jesse Noller has volunteered to
+also help maintain, document, and test the package.
+API Naming
+
+The API of the pyprocessing package is designed to closely mimic that of
+the threading and Queue modules. It has been proposed that, instead of
+adding the package as-is, we rename it to be PEP 8 compliant.
+
+Since the aim of the package is to be a drop-in replacement for the
+threading module, the authors feel that the current API should be used.
+When the threading and Queue modules are updated to fully reflect
+PEP 8, the pyprocessing/multiprocessing naming can be revised.
Timing/Schedule
Some concerns have been raised about the timing/lateness of this PEP
for the 2.6 and 3.0 releases this year; however, it is felt by both
-the authors and others that the functionality this module offers
+the authors and others that the functionality this package offers
outweighs the risk of inclusion.
-However, taking into account the desire not to destabilize python-core, some
-refactoring of pyprocessing's code "into" python-core can be withheld until
-the next 2.x/3.x releases. This means that the actual risk to python-core
-is minimal, and largely constrained to the actual module itself.
+However, taking into account the desire not to destabilize Python-core, some
+refactoring of pyprocessing's code "into" Python-core can be withheld until
+the next 2.x/3.x releases. This means that the actual risk to Python-core
+is minimal, and largely constrained to the actual package itself.
Open Issues
-* All existing tests for the module should be converted to UnitTest format.
+* All existing tests for the package should be converted to unittest format.
* Existing documentation has to be moved to reST formatting.
* Verify code coverage percentage of existing test suite.
* Identify any requirements for achieving a 1.0 milestone, if one is required.
* Verify current source tree conforms to standard library practices.
-* Rename top-level module from "pyprocessing" to "multiprocessing".
+* Rename top-level package from "pyprocessing" to "multiprocessing".
* Confirm no "default" remote connection capabilities; if needed, enable the
  remote security mechanisms by default for those classes which offer remote
  capabilities.
* Some API methods (Queue's qsize(), task_done(), and join()) either need
  to be added, or the reasons for their exclusion need to be identified and
  documented clearly.
+* Add a "multiprocessing.setExecutable()" method to override the default
+  behavior of the package, which is to spawn processes using the current
+  executable name rather than the Python interpreter. Note that Mark
+  Hammond has suggested a factory-style interface for this [7].
+* Also note that the default behavior of process spawning does not make
+  it compatible with use within IDLE as-is; this will be examined as
+  a bug fix or a "setExecutable" enhancement.
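
For reference, the override proposed in the "setExecutable" item above
corresponds to what is now multiprocessing.set_executable(). A minimal
sketch of the intended use (it only matters when children are spawned
rather than forked):

    import sys
    import multiprocessing

    def child():
        print("spawned by:", sys.executable)

    if __name__ == "__main__":
        # Point the package at a real Python interpreter; relevant when the
        # current process was started by some other executable, e.g. an
        # embedding application or a frozen binary.
        multiprocessing.set_executable(sys.executable)
        p = multiprocessing.Process(target=child)
        p.start()
        p.join()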
Closed Issues
-* Reliance on ctypes: The pyprocessing module's reliance on ctypes prevents
-  the module from functioning on platforms where ctypes is not supported.
-  This is not a restriction of this module, but rather ctypes.
+* Reliance on ctypes: The pyprocessing package's reliance on ctypes prevents
+  the package from functioning on platforms where ctypes is not supported.
+  This is not a restriction of this package, but rather of ctypes.
References
@@ -330,6 +372,8 @@ References
Magazine in December 2007: "Python Threads and the Global Interpreter
Lock" by Jesse Noller. It has been modified for this PEP.
+[7] http://groups.google.com/group/python-dev2/msg/54cf06d15cbcbc34
Copyright
This document has been placed in the public domain.