re-wrapped text to 70 columns
parent 1cabbf4c39
commit 1f996f3ebc

pep-0371.txt | 314 changed lines
@@ -14,50 +14,54 @@ Post-History:

Abstract

This PEP proposes the inclusion of the pyProcessing [1] package
into the Python standard library, renamed to "multiprocessing".

The processing package mimics the standard library threading
module and API to provide a process-based approach to "threaded
programming", allowing end-users to dispatch multiple tasks that
effectively side-step the global interpreter lock.

The package also provides server and client functionality
(processing.Manager) to provide remote sharing and management of
objects and tasks so that applications may not only leverage
multiple cores on the local machine, but also distribute objects
and tasks across a cluster of networked machines.

While the distributed capabilities of the package are beneficial,
the primary focus of this PEP is the core threading-like API and
capabilities of the package.

Rationale

The current CPython interpreter implements the Global Interpreter
Lock (GIL) and, barring work in Python 3000 or other versions
currently planned [2], the GIL will remain as-is within the
CPython interpreter for the foreseeable future.  While the GIL
itself enables clean and easy-to-maintain C code for the
interpreter and extension base, it is frequently an issue for
those Python programmers who are leveraging multi-core machines.

The GIL itself prevents more than a single thread from running
within the interpreter at any given point in time, effectively
removing Python's ability to take advantage of multi-processor
systems.  While I/O-bound applications do not suffer the same
slow-down when using threading, they do suffer some performance
cost due to the GIL.
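The GIL's effect on CPU-bound threading can be illustrated with a short sketch (not from the PEP; the `countdown` function, loop counts, and timing harness are assumptions for illustration):

```python
import threading
import time

def countdown(n):
    # Pure-Python CPU-bound work; the GIL serializes its bytecode.
    while n > 0:
        n -= 1

N = 2_000_000

start = time.perf_counter()
countdown(N)
sequential = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=countdown, args=(N // 2,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# Despite two threads, the GIL allows only one of them to execute
# Python bytecode at any instant, so the threaded run is typically
# no faster (and often slower, due to lock contention).
print(f"sequential: {sequential:.3f}s  threaded: {threaded:.3f}s")
```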

The pyProcessing package offers a method to side-step the GIL,
allowing applications within CPython to take advantage of
multi-core architectures without asking users to completely change
their programming paradigm (i.e.: dropping threaded programming
for another "concurrent" approach such as Twisted).

The Processing package offers CPython users a known API (that of
the threading module), with known semantics and easy scalability.
In the future the package might not be as relevant should the
CPython interpreter enable "true" threading; however, for some
applications, forking an OS process may sometimes be more
desirable than using lightweight threads, especially on those
platforms where process creation is fast/optimized.

For example, a simple threaded application:

@@ -70,52 +74,56 @@ Rationale

        t.start()
        t.join()

The pyprocessing package mirrors the API so well that, with a
simple change of the import to:

    from processing import Process as worker

the code now executes through the processing.Process class.  This
type of compatibility means that, with a minor (in most cases)
change in code, users' applications will be able to leverage all
cores and processors on a given machine for parallel execution.
In many cases the pyprocessing package is even faster than the
normal threading approach for I/O-bound programs.  This, of
course, takes into account that the pyprocessing package is in
optimized C code, while the threading module is not.

The "Distributed" Problem

In the discussion on Python-Dev about the inclusion of this
package [3] there was confusion about the intentions of this PEP,
with an attempt to solve the "Distributed" problem - frequently
comparing the functionality of this package with other solutions
like MPI-based communication [4], CORBA, or other distributed
object approaches [5].

The "distributed" problem is large and varied.  Each programmer
working within this domain has either very strong opinions about
their favorite module/method or a highly customized problem for
which no existing solution works.

The acceptance of this package neither precludes nor discourages
programmers working on the "distributed" problem from examining
other solutions for their problem domain.  The intent of including
this package is to provide entry-level capabilities for local
concurrency and the basic support to spread that concurrency
across a network of machines - although the two are not tightly
coupled, the pyprocessing package could, in fact, be used in
conjunction with any of the other solutions, including MPI/etc.

If necessary, it is possible to completely decouple the local
concurrency abilities of the package from the
network-capable/shared aspects of the package.  Without serious
concerns or cause, however, the author of this PEP does not
recommend that approach.

Performance Comparison

As we all know - there are "lies, damned lies, and benchmarks".
These speed comparisons, while aimed at showcasing the performance
of the pyprocessing package, are by no means comprehensive or
applicable to all possible use cases or environments, especially
for those platforms with sluggish process forking timing.

All benchmarks were run using the following:

    * 4 Core Intel Xeon CPU @ 3.00GHz

@@ -127,16 +135,17 @@ Performance Comparison

    http://jessenoller.com/code/bench-src.tgz

The basic method of execution for these benchmarks is in the
run_benchmarks.py script, which is simply a wrapper to execute a
target function through a single-threaded (linear), multi-threaded
(via threading), and multi-process (via pyprocessing) function for
a static number of iterations with increasing numbers of execution
loops and/or threads.

The run_benchmarks.py script executes each function 100 times,
picking the best run of that 100 iterations via the timeit module.
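The best-of-100 selection described above can be sketched with the timeit module (the `target` placeholder and the `best_of` helper are assumptions, not the PEP's actual run_benchmarks.py code):

```python
import timeit

def target():
    # Placeholder for the function under test (empty_func,
    # fibonacci, crunch_primes, ...); not the PEP's actual code.
    pass

def best_of(func, runs=100):
    # Run `func` `runs` times and keep the best (minimum) timing,
    # mirroring the best-of-100 selection the script performs.
    return min(timeit.Timer(func).repeat(repeat=runs, number=1))

print(f"non_threaded (1 iters)  {best_of(target):.6f} seconds")
```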

First, to identify the overhead of spawning the workers, we
execute a function which is simply a pass statement (empty):

    cmd: python run_benchmarks.py empty_func.py
    Importing empty_func

@@ -157,11 +166,12 @@ Performance Comparison

    threaded (8 threads)    0.007990 seconds
    processes (8 procs)     0.005512 seconds

As you can see, process forking via the pyprocessing package is
faster than the speed of building and then executing the threaded
version of the code.

The second test calculates 50000 Fibonacci numbers inside of each
thread (isolated and shared nothing):

    cmd: python run_benchmarks.py fibonacci.py
    Importing fibonacci

@@ -182,8 +192,8 @@ Performance Comparison

    threaded (8 threads)    1.596824 seconds
    processes (8 procs)     0.417899 seconds
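The per-worker Fibonacci workload can be sketched as follows (fibonacci.py itself is not reproduced in the PEP, so this iterative generator is an assumed stand-in):

```python
def fib_seq(count):
    # Generate `count` Fibonacci numbers; each benchmark worker
    # runs this in isolation, sharing nothing with its siblings.
    a, b = 0, 1
    out = []
    for _ in range(count):
        out.append(a)
        a, b = b, a + b
    return out

# Each worker in the test would call fib_seq(50000) independently.
print(fib_seq(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```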

The third test calculates the sum of all primes below 100000,
again sharing nothing.

    cmd: run_benchmarks.py crunch_primes.py
    Importing crunch_primes

@@ -204,17 +214,18 @@ Performance Comparison

    threaded (8 threads)    5.109192 seconds
    processes (8 procs)     1.077939 seconds

The reason why tests two and three focused on pure numeric
crunching is to showcase how the current threading implementation
does hinder non-I/O applications.  Obviously, these tests could be
improved to use a queue for coordination of results and chunks of
work, but that is not required to show the performance of the
package and core Processing module.
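The prime-crunching workload can likewise be sketched (crunch_primes.py is not reproduced in the PEP, so this sieve is an assumed stand-in for the kind of pure-Python numeric work the GIL serializes across threads):

```python
def sum_primes(limit):
    # Sum all primes below `limit` using a simple sieve of
    # Eratosthenes; each benchmark worker would run this in
    # isolation, e.g. sum_primes(100000).
    if limit < 3:
        return 0
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return sum(i for i, flag in enumerate(sieve) if flag)

print(sum_primes(100))  # 1060
```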

The next test is an I/O bound test.  This is normally where we see
a steep improvement in the threading module approach versus a
single-threaded approach.  In this case, each worker is opening a
descriptor to lorem.txt, randomly seeking within it and writing
lines to /dev/null:

    cmd: python run_benchmarks.py file_io.py
    Importing file_io

@@ -235,14 +246,14 @@ Performance Comparison

    threaded (8 threads)    2.437204 seconds
    processes (8 procs)     0.203438 seconds

As you can see, pyprocessing is still faster on this I/O operation
than using multiple threads.  And using multiple threads is slower
than the single-threaded execution itself.

Finally, we will run a socket-based test to show network I/O
performance.  This function grabs a URL from a server on the LAN
that is a simple error page from tomcat.  It gets the page 100
times.  The network is otherwise silent, and the connection is
10G:

    cmd: python run_benchmarks.py url_get.py
    Importing url_get

@@ -263,16 +274,19 @@ Performance Comparison

    threaded (8 threads)    0.659298 seconds
    processes (8 procs)     0.298625 seconds

We finally see threaded performance surpass that of
single-threaded execution, but the pyprocessing package is still
faster when increasing the number of workers.  If you stay with
one or two threads/workers, then the timing between threads and
pyprocessing is fairly close.

One item of note, however, is that there is an implicit overhead
within the pyprocessing package's Queue implementation due to the
object serialization.
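The source of that overhead can be sketched without the package itself: items placed on a process-based queue must cross a process boundary, so they are serialized with pickle and rebuilt on the other side, whereas threading's Queue merely hands over a reference (the `item` payload below is an arbitrary example, not from the PEP):

```python
import pickle
import queue

item = {"task": "crunch", "payload": list(range(1000))}

# threading's Queue passes the very same object -- no copying.
q = queue.Queue()
q.put(item)
assert q.get() is item

# A process-based queue must serialize the object to bytes and
# reconstruct it in the receiving process; this round-trip is the
# implicit overhead noted above.
wire = pickle.dumps(item)       # what the sending side does
received = pickle.loads(wire)   # what the receiving side does
assert received == item and received is not item
print(f"pickled size: {len(wire)} bytes")
```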

Alec Thomas provided a short example based on the
run_benchmarks.py script to demonstrate this overhead versus the
default Queue implementation:

    cmd: run_bench_queue.py
    non_threaded (1 iters)  0.010546 seconds

@@ -291,21 +305,23 @@ Performance Comparison

    threaded (8 threads)    0.184254 seconds
    processes (8 procs)     0.302999 seconds

Additional benchmarks can be found in the pyprocessing package's
source distribution's examples/ directory.  The examples will be
included in the package's documentation.

Maintenance

Richard M. Oudkerk - the author of the pyprocessing package - has
agreed to maintain the package within Python SVN.  Jesse Noller
has volunteered to also help maintain, document, and test the
package.

API Naming

The API of the pyprocessing package is designed to closely mimic
that of the threading and Queue modules.  It has been proposed
that, instead of adding the package as-is, we rename it to be
PEP 8 compliant.

Since the aim of the package is to be a drop-in for the threading
module, the authors feel that the current API should be used.

@@ -314,43 +330,50 @@ API Naming

Timing/Schedule

Some concerns have been raised about the timing/lateness of this
PEP for the 2.6 and 3.0 releases this year; however, it is felt by
both the authors and others that the functionality this package
offers outweighs the risk of inclusion.

However, taking into account the desire not to destabilize
Python-core, some refactoring of pyprocessing's code "into"
Python-core can be withheld until the next 2.x/3.x releases.  This
means that the actual risk to Python-core is minimal, and largely
constrained to the package itself.

Open Issues

    * All existing tests for the package should be converted to
      UnitTest format.
    * Existing documentation has to be moved to ReST formatting.
    * Verify code coverage percentage of existing test suite.
    * Identify any requirements to achieve a 1.0 milestone if
      required.
    * Verify current source tree conforms to standard library
      practices.
    * Rename top-level package from "pyprocessing" to
      "multiprocessing".
    * Confirm no "default" remote connection capabilities; if
      needed, enable the remote security mechanisms by default for
      those classes which offer remote capabilities.
    * Some of the API (Queue methods qsize(), task_done() and
      join()) either need to be added, or the reason for their
      exclusion needs to be identified and documented clearly.
    * Add a "multiprocessing.setExecutable()" method to override
      the default behavior of the package to spawn processes using
      the current executable name rather than the Python
      interpreter.  Note that Mark Hammond has suggested a
      factory-style interface for this [7].
    * Also note that the default behavior of process spawning does
      not make it compatible with use within IDLE as-is; this will
      be examined as a bug-fix or "setExecutable" enhancement.

Closed Issues

    * Reliance on ctypes: The pyprocessing package's reliance on
      ctypes prevents the package from functioning on platforms
      where ctypes is not supported.  This is not a restriction of
      this package, but rather of ctypes.

References

@@ -369,8 +392,9 @@ References

    http://wiki.python.org/moin/ParallelProcessing

    [6] The original run_benchmark.py code was published in Python
        Magazine in December 2008: "Python Threads and the Global
        Interpreter Lock" by Jesse Noller.  It has been modified
        for this PEP.

    [7] http://groups.google.com/group/python-dev2/msg/54cf06d15cbcbc34