PEP 734: Updates After Discussion (#3664)
parent c55835e170
commit 13022a6d12
@@ -25,9 +25,9 @@ This PEP proposes to add a new module, ``interpreters``, to support
 inspecting, creating, and running code in multiple interpreters in the
 current process.  This includes ``Interpreter`` objects that represent
 the underlying interpreters.  The module will also provide a basic
-``Queue`` class for communication between interpreters.  Finally, we
-will add a new ``concurrent.futures.InterpreterPoolExecutor`` based
-on the ``interpreters`` module.
+``Queue`` class for communication between interpreters.
+Finally, we will add a new ``concurrent.futures.InterpreterPoolExecutor``
+based on the ``interpreters`` module.


 Introduction
@@ -92,7 +92,7 @@ Interpreters and Threads
 ------------------------

 Thread states are related to interpreter states in much the same way
-that OS threads and processes are related (at a hight level).  To
+that OS threads and processes are related (at a high level).  To
 begin with, the relationship is one-to-many.
 A thread state belongs to a single interpreter (and stores
 a pointer to it).  That thread state is never used for a different
@@ -276,106 +276,6 @@ interpreters.  Without one, multiple interpreters are a much less
 useful feature.


-Rationale
-=========
-
-A Minimal API
--------------
-
-Since the core dev team has no real experience with
-how users will make use of multiple interpreters in Python code, this
-proposal purposefully keeps the initial API as lean and minimal as
-possible.  The objective is to provide a well-considered foundation
-on which further (more advanced) functionality may be added later,
-as appropriate.
-
-That said, the proposed design incorporates lessons learned from
-existing use of subinterpreters by the community, from existing stdlib
-modules, and from other programming languages.  It also factors in
-experience from using subinterpreters in the CPython test suite and
-using them in `concurrency benchmarks`_.
-
-.. _concurrency benchmarks:
-   https://github.com/ericsnowcurrently/concurrency-benchmarks
-
-Interpreter.prepare_main() Sets Multiple Variables
---------------------------------------------------
-
-``prepare_main()`` may be seen as a setter function of sorts.
-It supports setting multiple names at once,
-e.g. ``interp.prepare_main(spam=1, eggs=2)``, whereas most setters
-set one item at a time.  The main reason is for efficiency.
-
-To set a value in the interpreter's ``__main__.__dict__``, the
-implementation must first switch the OS thread to the identified
-interpreter, which involves some non-negligible overhead.  After
-setting the value it must switch back.
-Furthermore, there is some additional overhead to the mechanism
-by which it passes objects between interpreters, which can be
-reduced in aggregate if multiple values are set at once.
-
-Therefore, ``prepare_main()`` supports setting multiple
-values at once.
-
-Propagating Exceptions
-----------------------
-
-An uncaught exception from a subinterpreter,
-via ``Interpreter.exec_sync()``,
-could either be (effectively) ignored, like ``threading.Thread()`` does,
-or propagated, like the builtin ``exec()`` does.  Since ``exec_sync()``
-is a synchronous operation, like the builtin ``exec()``,
-uncaught exceptions are propagated.
-
-However, such exceptions are not raised directly.  That's because
-interpreters are isolated from each other and must not share objects,
-including exceptions.  That could be addressed by raising a surrogate
-of the exception, whether a summary, a copy, or a proxy that wraps it.
-Any of those could preserve the traceback, which is useful for
-debugging.  The ``ExecFailure`` that gets raised
-is such a surrogate.
-
-There's another concern to consider.  If a propagated exception isn't
-immediately caught, it will bubble up through the call stack until
-caught (or not).  In the case that code somewhere else may catch it,
-it is helpful to identify that the exception came from a subinterpreter
-(i.e. a "remote" source), rather than from the current interpreter.
-That's why ``Interpreter.exec_sync()`` raises ``ExecFailure`` and why
-it is a plain ``Exception``, rather than a copy or proxy with a class
-that matches the original exception.  For example, an uncaught
-``ValueError`` from a subinterpreter would never get caught in a later
-``try: ... except ValueError: ...``.  Instead, ``ExecFailure``
-must be handled directly.
-
-Limited Object Sharing
-----------------------
-
-As noted in `Interpreter Isolation`_, only a small number of builtin
-objects may be truly shared between interpreters.  In all other cases
-objects can only be shared indirectly, through copies or proxies.
-
-The set of objects that are shareable as copies through queues
-(and ``Interpreter.prepare_main()``) is limited for the sake of
-efficiency.
-
-Supporting sharing of *all* objects is possible (via pickle)
-but not part of this proposal.  For one thing, it's helpful to know
-that only an efficient implementation is being used.  Furthermore,
-for mutable objects pickling would violate the guarantee that "shared"
-objects be equivalent (and stay that way).
-
-Objects vs. ID Proxies
-----------------------
-
-For both interpreters and queues, the low-level module makes use of
-proxy objects that expose the underlying state by their corresponding
-process-global IDs.  In both cases the state is likewise process-global
-and will be used by multiple interpreters.  Thus they aren't suitable
-to be implemented as ``PyObject``, which is only really an option for
-interpreter-specific data.  That's why the ``interpreters`` module
-instead provides objects that are weakly associated through the ID.
-
-
 Specification
 =============

|
@@ -407,7 +307,7 @@ The module defines the following functions:
   for it.  The interpreter doesn't do anything on its own and is
   not inherently tied to any OS thread.  That only happens when
   something is actually run in the interpreter
-  (e.g. ``Interpreter.exec_sync()``), and only while running.
+  (e.g. ``Interpreter.exec()``), and only while running.
   The interpreter may or may not have thread states ready to use,
   but that is strictly an internal implementation detail.

@@ -439,7 +339,7 @@ Attributes and methods:

   It refers only to if there is an OS thread
   running a script (code) in the interpreter's ``__main__`` module.
-  That basically means whether or not ``Interpreter.exec_sync()``
+  That basically means whether or not ``Interpreter.exec()``
   is running in some OS thread.  Code running in sub-threads
   is ignored.

@@ -454,7 +354,7 @@ Attributes and methods:
   ``prepare_main()`` is helpful for initializing the
   globals for an interpreter before running code in it.

-* ``exec_sync(code, /)``
+* ``exec(code, /)``

   Execute the given source code in the interpreter
   (in the current OS thread), using its ``__main__`` module.
   It doesn't return anything.
@@ -465,39 +365,59 @@ Attributes and methods:
   the globals and locals.

   The code running in the current OS thread (a different
-  interpreter) is effectively paused until ``exec_sync()``
+  interpreter) is effectively paused until ``Interpreter.exec()``
   finishes.  To avoid pausing it, create a new ``threading.Thread``
-  and call ``exec_sync()`` in it.
+  and call ``Interpreter.exec()`` in it
+  (like ``Interpreter.call_in_thread()`` does).

-  ``exec_sync()`` does not reset the interpreter's state nor
+  ``Interpreter.exec()`` does not reset the interpreter's state nor
   the ``__main__`` module, neither before nor after, so each
   successive call picks up where the last one left off.  This can
   be useful for running some code to initialize an interpreter
   (e.g. with imports) before later performing some repeated task.

   If there is an uncaught exception, it will be propagated into
-  the calling interpreter as a ``ExecFailure``, which
-  preserves enough information for a helpful error display.  That
-  means if the ``ExecFailure`` isn't caught then the full
-  traceback of the propagated exception, including details about
-  syntax errors, etc., will be displayed.  Having the full
-  traceback is particularly useful when debugging.
+  the calling interpreter as an ``ExecutionFailed``.  The full error
+  display of the original exception, generated relative to the
+  called interpreter, is preserved on the propagated ``ExecutionFailed``.
+  That includes the full traceback, with all the extra info like
+  syntax error details and chained exceptions.
+  If the ``ExecutionFailed`` is not caught then that full error display
+  will be shown, much like it would be if the propagated exception
+  had been raised in the main interpreter and uncaught.  Having
+  the full traceback is particularly useful when debugging.

   If exception propagation is not desired then an explicit
   try-except should be used around the *code* passed to
-  ``exec_sync()``.  Likewise any error handling that depends
+  ``Interpreter.exec()``.  Likewise any error handling that depends
   on specific information from the exception must use an explicit
-  try-except around the given *code*, since ``ExecFailure``
+  try-except around the given *code*, since ``ExecutionFailed``
   will not preserve that information.

-* ``run(code, /) -> threading.Thread``
+* ``call(callable, /)``

-  Create a new thread and call ``exec_sync()`` in it.
-  Exceptions are not propagated.
+  Call the callable object in the interpreter.
+  The return value is discarded.  If the callable raises an exception
+  then it gets propagated as an ``ExecutionFailed`` exception,
+  in the same way as ``Interpreter.exec()``.

-  This is roughly equivalent to::
+  For now only plain functions are supported and only ones that
+  take no arguments and have no cell vars.  Free globals are resolved
+  against the target interpreter's ``__main__`` module.
+
+  In the future, we can add support for arguments, closures,
+  and a broader variety of callables, at least partly via pickle.
+  We can also consider not discarding the return value.
+  The initial restrictions are in place to allow us to get the basic
+  functionality of the module out to users sooner.
+
+* ``call_in_thread(callable, /) -> threading.Thread``
+
+  Essentially, apply ``Interpreter.call()`` in a new thread.
+  Return values are discarded and exceptions are not propagated.
+
+  ``call_in_thread()`` is roughly equivalent to::

      def task():
-         interp.exec_sync(code)
+         interp.run(func)
      t = threading.Thread(target=task)
      t.start()

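The propagation semantics above can be approximated in today's Python. The sketch below is illustrative only: the ``interpreters`` module is not yet importable, so the builtin ``exec()`` stands in for running code in another interpreter, and a hypothetical ``ExecutionFailedSketch`` class stands in for the proposed ``ExecutionFailed`` surrogate.

```python
import traceback

class ExecutionFailedSketch(Exception):
    """Hypothetical stand-in for the proposed ExecutionFailed.

    A plain Exception carrying only the formatted error display,
    not the original exception object (interpreters must not
    share objects, including exceptions).
    """

def exec_in_fake_interpreter(code):
    # Analogous to Interpreter.exec(): run code against a fresh
    # __main__-like namespace and propagate uncaught exceptions
    # as a surrogate rather than directly.
    ns = {}
    try:
        exec(code, ns)
    except Exception:
        raise ExecutionFailedSketch(traceback.format_exc()) from None

# An uncaught ValueError is never catchable as ValueError by the
# caller; only the surrogate exception can be handled.
try:
    exec_in_fake_interpreter("raise ValueError('boom')")
except ValueError:
    caught = "ValueError"
except ExecutionFailedSketch as exc:
    caught = "surrogate"
    assert "ValueError: boom" in str(exc)  # error display preserved
print(caught)  # → surrogate
```

This mirrors why a later ``except ValueError:`` in the calling interpreter never fires: the surrogate is a plain ``Exception`` with a different class.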
@@ -518,7 +438,7 @@ the back and each "get" pops the next one off the front.  Every added
 object will be popped off in the order it was pushed on.

 Only objects that are specifically supported for passing
-between interpreters may be sent through a ``Queue``.
+between interpreters may be sent through an ``interpreters.Queue``.
 Note that the actual objects aren't sent, but rather their
 underlying data.  However, the popped object will still be
 strictly equivalent to the original.
@@ -526,10 +446,12 @@ See `Shareable Objects`_.

 The module defines the following functions:

-* ``create_queue(maxsize=0) -> Queue``
+* ``create_queue(maxsize=0, *, syncobj=False) -> Queue``

   Create a new queue.  If the maxsize is zero or negative then the
   queue is unbounded.

+  "syncobj" is used as the default for ``put()`` and ``put_nowait()``.
+
 Queue Objects
 -------------

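The ``maxsize`` convention here matches the existing stdlib ``queue.Queue``, where zero or a negative value means unbounded. That existing behavior, which the proposed ``create_queue()`` mirrors, can be checked directly:

```python
import queue

# queue.Queue uses the same convention the proposed create_queue()
# documents: maxsize <= 0 means the queue is unbounded.
q = queue.Queue(maxsize=0)
for i in range(1000):
    q.put_nowait(i)          # never raises queue.Full when unbounded
assert q.qsize() == 1000

# A positive maxsize bounds the queue and a full queue rejects puts.
bounded = queue.Queue(maxsize=1)
bounded.put_nowait("x")
full = False
try:
    bounded.put_nowait("y")
except queue.Full:
    full = True
assert full
```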
@@ -552,7 +474,8 @@ Attributes and methods:
   used for a pipe.

 * ``maxsize``

-  Number of items allowed in the queue.  Zero means "unbounded".
+  (read-only) Number of items allowed in the queue.
+  Zero means "unbounded".

 * ``__hash__()``

   Return the hash of the queue's ``id``.  This is the same
@@ -579,18 +502,25 @@ Attributes and methods:
   This is only a snapshot of the state at the time of the call.
   Other threads or interpreters may cause this to change.

-* ``put(obj, timeout=None)``
+* ``put(obj, timeout=None, *, syncobj=None)``

   Add the object to the queue.

-  The object must be `shareable <Shareable Objects_>`_, which means
-  the object's data is passed through rather than the object itself.
-
   If ``maxsize > 0`` and the queue is full then this blocks until
   a free slot is available.  If *timeout* is a positive number
   then it only blocks at least that many seconds and then raises
   ``interpreters.QueueFull``.  Otherwise it blocks forever.

-* ``put_nowait(obj)``
+  If "syncobj" is true then the object must be
+  `shareable <Shareable Objects_>`_, which means the object's data
+  is passed through rather than the object itself.
+  If "syncobj" is false then all objects are supported.  However,
+  there are some performance penalties and all objects are copies
+  (e.g. via pickle).  Thus mutable objects will never be
+  automatically synchronized between interpreters.
+  If "syncobj" is None (the default) then the queue's default
+  value is used.
+
+* ``put_nowait(obj, *, syncobj=None)``

   Like ``put()`` but effectively with an immediate timeout.
   Thus if the queue is full, it immediately raises
   ``interpreters.QueueFull``.
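The distinction the ``syncobj`` flag draws (data passed through versus a pickle round-trip copy) can be demonstrated with plain ``pickle``, which is what the hunk above names as the copying mechanism. Nothing here uses the proposed module; it only shows why pickled mutable objects are never automatically synchronized:

```python
import pickle

# A pickle round-trip produces an independent copy, which is what a
# queue with syncobj=False would hand to the receiving interpreter.
original = [1, 2, 3]
received = pickle.loads(pickle.dumps(original))

assert received == original      # equivalent at the moment of transfer
original.append(4)               # ...but later mutations do not carry over
assert received == [1, 2, 3]     # the copy is not synchronized
```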
@@ -609,8 +539,8 @@ Attributes and methods:
 Shareable Objects
 -----------------

-Both ``Interpreter.prepare_main()`` and ``Queue`` work only with
-"shareable" objects.
+``Interpreter.prepare_main()`` only works with "shareable" objects.
+The same goes for ``interpreters.Queue`` (optionally).

 A "shareable" object is one which may be passed from one interpreter
 to another.  The object is not necessarily actually directly shared
@@ -640,7 +570,7 @@ Here's the initial list of supported objects:
 * ``bool`` (``True``/``False``)
 * ``None``
 * ``tuple`` (only with shareable items)
-* ``Queue``
+* ``interpreters.Queue``
 * ``memoryview`` (underlying buffer actually shared)

 Note that the last two on the list, queues and ``memoryview``, are
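The parenthetical "underlying buffer actually shared" describes behavior that ``memoryview`` already exhibits within a single interpreter, which is worth seeing concretely: writes through the view are immediately visible via the original object, because no copy is made.

```python
# A memoryview exposes the buffer of its source object directly, so
# mutating through the view is reflected in the original object.
buf = bytearray(4)
view = memoryview(buf)
view[0] = 0xFF
assert buf[0] == 0xFF  # same underlying buffer, not a copy
```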
@@ -655,12 +585,13 @@ a token back and forth through a queue to indicate safety
 (see `Synchronization`_), or by assigning sub-range exclusivity
 to individual interpreters.

-Most objects will be shared through queues (``Queue``), as interpreters
-communicate information between each other.  Less frequently, objects
-will be shared through ``prepare_main()`` to set up an interpreter
-prior to running code in it.  However, ``prepare_main()`` is the
-primary way that queues are shared, to provide another interpreter
-with a means of further communication.
+Most objects will be shared through queues (``interpreters.Queue``),
+as interpreters communicate information between each other.
+Less frequently, objects will be shared through ``prepare_main()``
+to set up an interpreter prior to running code in it.  However,
+``prepare_main()`` is the primary way that queues are shared,
+to provide another interpreter with a means
+of further communication.

 Finally, a reminder: for a few types the actual object is shared,
 whereas for the rest only the underlying data is shared, whether
@@ -675,9 +606,9 @@ had been shared directly, whether or not it actually was.
 That's a slightly different and stronger promise than just equality.

 The guarantee is especially important for mutable objects, like
-``Queue`` and ``memoryview``.  Mutating the object in one interpreter
-will always be reflected immediately in every other interpreter
-sharing the object.
+``interpreters.Queue`` and ``memoryview``.  Mutating the object
+in one interpreter will always be reflected immediately in every
+other interpreter sharing the object.

 Synchronization
 ---------------
@@ -692,8 +623,8 @@ However, interpreters cannot share objects which means they cannot
 share ``threading.Lock`` objects.

 The ``interpreters`` module does not provide any such dedicated
-synchronization primitives.  Instead, ``Queue`` objects provide
-everything one might need.
+synchronization primitives.  Instead, ``interpreters.Queue``
+objects provide everything one might need.

 For example, if there's a shared resource that needs managed
 access then a queue may be used to manage it, where the interpreters
@@ -709,7 +640,7 @@ pass an object around to indicate who can use the resource::
     def worker():
         interp = interpreters.create()
         interp.prepare_main(control=control, data=data)
-        interp.exec_sync("""if True:
+        interp.exec("""if True:
             from mymodule import edit_data
             while True:
                 token = control.get()
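The token-passing pattern in this example can be sketched with today's stdlib, using ``threading`` and ``queue.Queue`` standing in for a subinterpreter and an ``interpreters.Queue``. This is an analogy only, not the proposed API:

```python
import queue
import threading

control = queue.Queue()   # stands in for interpreters.create_queue()
data = bytearray(1)       # the managed shared resource

def worker():
    # Wait for the token, mutate the resource, hand the token back.
    token = control.get()
    data[0] += 1
    control.put(token)

t = threading.Thread(target=worker)
t.start()
control.put("token")      # grant the worker access to the resource
t.join()
token = control.get()     # take the token back before touching data
assert data[0] == 1
```

Holding the token is what makes access to ``data`` safe: only the side currently holding it may mutate the resource.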
@@ -731,12 +662,12 @@ pass an object around to indicate who can use the resource::

 Exceptions
 ----------

-* ``ExecFailure``
+* ``ExecutionFailed``

-  Raised from ``Interpreter.exec_sync()`` when there's an
-  uncaught exception.  The error display for this exception
-  includes the traceback of the uncaught exception, which gets
-  shown after the normal error display, much like happens for
-  ``ExceptionGroup``.
+  Raised from ``Interpreter.exec()`` and ``Interpreter.call()``
+  when there's an uncaught exception.
+  The error display for this exception includes the traceback
+  of the uncaught exception, which gets shown after the normal
+  error display, much like happens for ``ExceptionGroup``.

   Attributes:

@@ -766,7 +697,18 @@ InterpreterPoolExecutor
 Along with the new ``interpreters`` module, there will be a new
 ``concurrent.futures.InterpreterPoolExecutor``.  Each worker executes
 in its own thread with its own subinterpreter.  Communication may
-still be done through ``Queue`` objects, set with the initializer.
+still be done through ``interpreters.Queue`` objects,
+set with the initializer.
+
+sys.implementation.supports_isolated_interpreters
+-------------------------------------------------
+
+Python implementations are not required to support subinterpreters,
+though most major ones do.  If an implementation does support them
+then ``sys.implementation.supports_isolated_interpreters`` will be
+set to ``True``.  Otherwise it will be ``False``.  If the feature
+is not supported then importing the ``interpreters`` module will
+raise an ``ImportError``.

 Examples
 --------
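Code that wants to degrade gracefully on implementations without subinterpreter support can probe the flag defensively. Using ``getattr`` with a default keeps the check working even on versions where the attribute does not exist yet (the flag is part of this proposal, not of already-released Pythons):

```python
import sys

# On implementations/versions without the attribute, fall back
# to False rather than raising AttributeError.
supported = getattr(sys.implementation,
                    "supports_isolated_interpreters", False)

if supported:
    print("subinterpreters available")
else:
    print("falling back to threads or processes")
```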
@@ -818,7 +760,7 @@ via workers in sub-threads.
     def worker():
         interp = interpreters.create()
         interp.prepare_main(tasks=tasks, results=results)
-        interp.exec_sync("""if True:
+        interp.exec("""if True:
             from mymodule import handle_request, capture_exception

             while True:
@@ -880,7 +822,7 @@ so the code takes advantage of directly sharing ``memoryview`` buffers.
     def worker(id):
         interp = interpreters.create()
         interp.prepare_main(data=buf, results=results, tasks=tasks)
-        interp.exec_sync("""if True:
+        interp.exec("""if True:
             from mymodule import reduce_chunk

             while True:
@@ -914,6 +856,132 @@ so the code takes advantage of directly sharing ``memoryview`` buffers.
     use_results(results)


+Rationale
+=========
+
+A Minimal API
+-------------
+
+Since the core dev team has no real experience with
+how users will make use of multiple interpreters in Python code, this
+proposal purposefully keeps the initial API as lean and minimal as
+possible.  The objective is to provide a well-considered foundation
+on which further (more advanced) functionality may be added later,
+as appropriate.
+
+That said, the proposed design incorporates lessons learned from
+existing use of subinterpreters by the community, from existing stdlib
+modules, and from other programming languages.  It also factors in
+experience from using subinterpreters in the CPython test suite and
+using them in `concurrency benchmarks`_.
+
+.. _concurrency benchmarks:
+   https://github.com/ericsnowcurrently/concurrency-benchmarks
+
+create(), create_queue()
+------------------------
+
+Typically, users call a type to create instances of the type, at which
+point the object's resources get provisioned.  The ``interpreters``
+module takes a different approach, where users must call ``create()``
+to get a new interpreter or ``create_queue()`` for a new queue.
+Calling ``interpreters.Interpreter()`` directly only returns a wrapper
+around an existing interpreter (likewise for
+``interpreters.Queue()``).
+
+This is because interpreters (and queues) are special resources.
+They exist globally in the process and are not managed/owned by the
+current interpreter.  Thus the ``interpreters`` module makes creating
+an interpreter (or queue) a visibly distinct operation from creating
+an instance of ``interpreters.Interpreter``
+(or ``interpreters.Queue``).
+
+Interpreter.prepare_main() Sets Multiple Variables
+--------------------------------------------------
+
+``prepare_main()`` may be seen as a setter function of sorts.
+It supports setting multiple names at once,
+e.g. ``interp.prepare_main(spam=1, eggs=2)``, whereas most setters
+set one item at a time.  The main reason is for efficiency.
+
+To set a value in the interpreter's ``__main__.__dict__``, the
+implementation must first switch the OS thread to the identified
+interpreter, which involves some non-negligible overhead.  After
+setting the value it must switch back.
+Furthermore, there is some additional overhead to the mechanism
+by which it passes objects between interpreters, which can be
+reduced in aggregate if multiple values are set at once.
+
+Therefore, ``prepare_main()`` supports setting multiple
+values at once.
+
+Propagating Exceptions
+----------------------
+
+An uncaught exception from a subinterpreter,
+via ``Interpreter.exec()``,
+could either be (effectively) ignored,
+like ``threading.Thread()`` does,
+or propagated, like the builtin ``exec()`` does.
+Since ``Interpreter.exec()`` is a synchronous operation,
+like the builtin ``exec()``, uncaught exceptions are propagated.
+
+However, such exceptions are not raised directly.  That's because
+interpreters are isolated from each other and must not share objects,
+including exceptions.  That could be addressed by raising a surrogate
+of the exception, whether a summary, a copy, or a proxy that wraps it.
+Any of those could preserve the traceback, which is useful for
+debugging.  The ``ExecutionFailed`` that gets raised
+is such a surrogate.
+
+There's another concern to consider.  If a propagated exception isn't
+immediately caught, it will bubble up through the call stack until
+caught (or not).  In the case that code somewhere else may catch it,
+it is helpful to identify that the exception came from a subinterpreter
+(i.e. a "remote" source), rather than from the current interpreter.
+That's why ``Interpreter.exec()`` raises ``ExecutionFailed`` and why
+it is a plain ``Exception``, rather than a copy or proxy with a class
+that matches the original exception.  For example, an uncaught
+``ValueError`` from a subinterpreter would never get caught in a later
+``try: ... except ValueError: ...``.  Instead, ``ExecutionFailed``
+must be handled directly.
+
+In contrast, exceptions propagated from ``Interpreter.call()`` do not
+involve ``ExecutionFailed`` but are raised directly, as though originating
+in the calling interpreter.  This is because ``Interpreter.call()`` is
+a higher level method that uses pickle to support objects that can't
+normally be passed between interpreters.
+
+Limited Object Sharing
+----------------------
+
+As noted in `Interpreter Isolation`_, only a small number of builtin
+objects may be truly shared between interpreters.  In all other cases
+objects can only be shared indirectly, through copies or proxies.
+
+The set of objects that are shareable as copies through queues
+(and ``Interpreter.prepare_main()``) is limited for the sake of
+efficiency.
+
+Supporting sharing of *all* objects is possible (via pickle)
+but not part of this proposal.  For one thing, it's helpful to know
+in those cases that only an efficient implementation is being used.
+Furthermore, in those cases supporting mutable objects via pickling
+would violate the guarantee that "shared" objects be equivalent
+(and stay that way).
+
+Objects vs. ID Proxies
+----------------------
+
+For both interpreters and queues, the low-level module makes use of
+proxy objects that expose the underlying state by their corresponding
+process-global IDs.  In both cases the state is likewise process-global
+and will be used by multiple interpreters.  Thus they aren't suitable
+to be implemented as ``PyObject``, which is only really an option for
+interpreter-specific data.  That's why the ``interpreters`` module
+instead provides objects that are weakly associated through the ID.
+
+
 Rejected Ideas
 ==============
