PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow <ericsnowcurrently@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-554-multiple-interpreters-in-the-stdlib/24855
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Sep-2017
Python-Version: 3.13
Post-History: `07-Sep-2017 <https://mail.python.org/archives/list/python-ideas@python.org/thread/HQQWEE527HG3ILJVKQTXVSJIQO6NUSIA/>`__,
              `08-Sep-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/NBWMA6LVD22XOUYC5ZMPBFWDQOECRP77/>`__,
              `13-Sep-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/EG4FSFG5E3O22FTIUQOXMQ6X6B5X3DP7/>`__,
              `05-Dec-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/BCSRGAMCYB3NGXNU42U66J56XNZVMQP2/>`__,
              `04-May-2020 <https://mail.python.org/archives/list/python-dev@python.org/thread/X2KPCSRVBD2QD5GP5IMXXZTGZ46OXD3D/>`__,
              `14-Mar-2023 <https://discuss.python.org/t/pep-554-multiple-interpreters-in-the-stdlib/24855/2/>`__,
|
|
Abstract
|
|
========
|
|
|
|
CPython has supported multiple interpreters in the same process (AKA
|
|
"subinterpreters") since version 1.5 (1997). The feature has been
|
|
available via the C-API. [c-api]_ Multiple interpreters operate in
|
|
`relative isolation from one another <Interpreter Isolation_>`_, which
|
|
facilitates novel alternative approaches to
|
|
`concurrency <Concurrency_>`_.
|
|
|
|
This proposal introduces the stdlib ``interpreters`` module. It exposes
|
|
the basic functionality of multiple interpreters already provided by the
|
|
C-API, along with describing a *very* basic way to communicate
|
|
(i.e. pass data between interpreters).
|
|
|
|
|
|
A Disclaimer about the GIL
|
|
==========================
|
|
|
|
To avoid any confusion up front: This PEP is meant to be independent
|
|
of any efforts to stop sharing the GIL between interpreters (:pep:`684`).
|
|
At most this proposal will allow users to take advantage of any
|
|
GIL-related work.
|
|
|
|
The author's position here is that exposing multiple interpreters
|
|
to Python code is worth doing, even if they still share the GIL.
|
|
Conversations with past steering councils indicate they do not
|
|
necessarily agree.
|
|
|
|
|
|
Proposal
|
|
========
|
|
|
|
Summary:
|
|
|
|
* add a new stdlib module: "interpreters"
|
|
* help for extension module maintainers
|
|
|
|
|
|
The "interpreters" Module
|
|
-------------------------
|
|
|
|
The ``interpreters`` module will provide a high-level interface
|
|
to the multiple interpreter functionality, and wrap a new low-level
|
|
``_interpreters`` module (in the same way as the ``threading`` module).
|
|
See the `Examples`_ section for concrete usage and use cases.
|
|
|
|
Along with exposing the existing (in CPython) multiple interpreter
|
|
support, the module will also support a very basic mechanism for
|
|
passing data between interpreters. That involves setting simple objects
|
|
in the ``__main__`` module of a target subinterpreter. If one end of
|
|
an ``os.pipe()`` is passed this way then that pipe can be used to send
|
|
bytes between the two interpreters.
|
|
|
|
Note that *objects* are not shared between interpreters since they are
|
|
tied to the interpreter in which they were created. Instead, the
|
|
objects' *data* is passed between interpreters. See the `Shared Data`_
|
|
and `API For Sharing Data`_ sections for more details about
|
|
sharing/communicating between interpreters.
|
|
|
|
API summary for interpreters module
|
|
-----------------------------------
|
|
|
|
Here is a summary of the API for the ``interpreters`` module. For a
|
|
more in-depth explanation of the proposed classes and functions, see
|
|
the `"interpreters" Module API`_ section below.
|
|
|
|
For creating and using interpreters:
|
|
|
|
+----------------------------------+----------------------------------------------+
| signature                        | description                                  |
+==================================+==============================================+
| ``list_all() -> [Interpreter]``  | Get all existing interpreters.               |
+----------------------------------+----------------------------------------------+
| ``get_current() -> Interpreter`` | Get the currently running interpreter.       |
+----------------------------------+----------------------------------------------+
| ``get_main() -> Interpreter``    | Get the main interpreter.                    |
+----------------------------------+----------------------------------------------+
| ``create() -> Interpreter``      | Initialize a new (idle) Python interpreter.  |
+----------------------------------+----------------------------------------------+
|
|
|
|
|
|
|
|
|
+----------------------------------+---------------------------------------------------+
| signature                        | description                                       |
+==================================+===================================================+
| ``class Interpreter``            | A single interpreter.                             |
+----------------------------------+---------------------------------------------------+
| ``.id``                          | The interpreter's ID (read-only).                 |
+----------------------------------+---------------------------------------------------+
| ``.is_running() -> bool``        | Is the interpreter currently executing code?      |
+----------------------------------+---------------------------------------------------+
| ``.close()``                     | Finalize and destroy the interpreter.             |
+----------------------------------+---------------------------------------------------+
| ``.run(src_str, /)``             | | Run the given source code in the interpreter    |
|                                  | | (in the current thread).                        |
+----------------------------------+---------------------------------------------------+
|
|
|
|
.. XXX Support blocking interp.run() until the interpreter
   finishes its current work.
|
|
|
|
|
|
|
|
|
+--------------------+------------------+------------------------------------------------------+
| exception          | base             | description                                          |
+====================+==================+======================================================+
| ``RunFailedError`` | ``RuntimeError`` | Interpreter.run() resulted in an uncaught exception. |
+--------------------+------------------+------------------------------------------------------+
|
|
|
|
.. XXX Add "InterpreterAlreadyRunningError"?
|
|
|
|
Help for Extension Module Maintainers
|
|
-------------------------------------
|
|
|
|
In practice, an extension that implements multi-phase init (:pep:`489`)
|
|
is considered isolated and thus compatible with multiple interpreters.
|
|
Otherwise it is "incompatible".
|
|
|
|
Many extension modules are still incompatible. The maintainers and
|
|
users of such extension modules will both benefit when they are updated
|
|
to support multiple interpreters. In the meantime, users may become
|
|
confused by failures when using multiple interpreters, which could
|
|
negatively impact extension maintainers. See `Concerns`_ below.
|
|
|
|
To mitigate that impact and accelerate compatibility, we will do the
|
|
following:
|
|
|
|
* be clear that extension modules are *not* required to support use in
|
|
multiple interpreters
|
|
* raise ``ImportError`` when an incompatible module is imported
|
|
in a subinterpreter
|
|
* provide resources (e.g. docs) to help maintainers reach compatibility
|
|
* reach out to the maintainers of Cython and of the most used extension
|
|
modules (on PyPI) to get feedback and possibly provide assistance
|
|
|
|
|
|
Examples
|
|
========
|
|
|
|
Run isolated code
-----------------

::

   import interpreters

   interp = interpreters.create()
   print('before')
   interp.run('print("during")')
   print('after')
|
|
|
|
Run in a thread
---------------

::

   import interpreters
   import threading

   interp = interpreters.create()
   def run():
       interp.run('print("during")')
   t = threading.Thread(target=run)
   print('before')
   t.start()
   t.join()
   print('after')
|
|
|
|
Pre-populate an interpreter
---------------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   interp.run(tw.dedent("""
       import some_lib
       import an_expensive_module
       some_lib.set_up()
       """))
   wait_for_request()  # placeholder: block until there is work to do
   interp.run(tw.dedent("""
       some_lib.handle_request()
       """))
|
|
|
|
Handling an exception
---------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   try:
       interp.run(tw.dedent("""
           raise KeyError
           """))
   except interpreters.RunFailedError as exc:
       print(f"got the error from the subinterpreter: {exc}")
|
|
|
|
Re-raising an exception
-----------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   try:
       try:
           interp.run(tw.dedent("""
               raise KeyError
               """))
       except interpreters.RunFailedError as exc:
           raise exc.__cause__
   except KeyError:
       print("got a KeyError from the subinterpreter")
|
|
|
|
Note that this pattern is a candidate for later improvement.
|
|
|
|
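Synchronize using an OS pipe
----------------------------

::

   import interpreters
   import os
   import textwrap as tw
   import threading

   interp = interpreters.create()
   r, s = os.pipe()
   def run():
       # run() blocks its thread, so the waiting side needs its own thread.
       interp.run(tw.dedent(f"""
           import os
           os.read({r}, 1)
           print("during")
           """))
   t = threading.Thread(target=run)
   print('before')
   t.start()
   print('after')
   os.write(s, b'x')  # a single (non-empty) byte releases the reader
   t.join()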
|
|
|
|
Sharing a file descriptor
-------------------------

::

   import interpreters
   import os
   import textwrap as tw
   import threading

   interp = interpreters.create()
   r1, s1 = os.pipe()
   r2, s2 = os.pipe()
   def run():
       interp.run(tw.dedent(f"""
           import os
           fd = int.from_bytes(
                   os.read({r1}, 10), 'big')
           for line in os.fdopen(fd, closefd=False):
               print(line)
           os.write({s2}, b'x')
           """))
   # run() blocks its thread, so the reader gets a thread of its own.
   t = threading.Thread(target=run)
   t.start()
   with open('spamspamspam') as infile:
       fd = infile.fileno().to_bytes(1, 'big')
       os.write(s1, fd)
       # Keep the file open until the subinterpreter reports it is done.
       os.read(r2, 1)
   t.join()
|
|
|
|
Passing objects via pickle
--------------------------

::

   import interpreters
   import os
   import pickle
   import textwrap as tw
   import threading

   interp = interpreters.create()
   r, s = os.pipe()
   interp.run(tw.dedent(f"""
       import os
       import pickle
       reader = {r}
       """))
   def run():
       # Raw string: the subinterpreter must see the text \x00, not a NUL.
       interp.run(tw.dedent(r"""
           c = os.read(reader, 1)
           while c != b'\x00':
               data = b''
               while c != b'\x00':
                   data += c
                   c = os.read(reader, 1)
               obj = pickle.loads(data)
               do_something(obj)
               c = os.read(reader, 1)
           """))
   t = threading.Thread(target=run)
   t.start()
   for obj in input:  # "input" stands for any iterable of objects
       # The text-based protocol 0 keeps NUL bytes out of the payload,
       # which the NUL-delimited framing relies on.
       data = pickle.dumps(obj, protocol=0)
       os.write(s, data)
       os.write(s, b'\x00')
   os.write(s, b'\x00')
   t.join()
|
|
|
|
Capturing an interpreter's stdout
---------------------------------

::

   import contextlib
   import interpreters
   import io
   import textwrap as tw

   interp = interpreters.create()
   stdout = io.StringIO()
   with contextlib.redirect_stdout(stdout):
       interp.run(tw.dedent("""
           print('spam!')
           """))
   assert stdout.getvalue() == 'spam!\n'
|
|
|
|
A pipe (``os.pipe()``) could be used similarly.
|
|
|
|
Running a module
----------------

::

   import interpreters

   interp = interpreters.create()
   main_module = mod_name
   interp.run(f'import runpy; runpy.run_module({main_module!r})')
|
|
|
|
Running as script (including zip archives & directories)
--------------------------------------------------------

::

   import interpreters

   interp = interpreters.create()
   main_script = path_name
   interp.run(f"import runpy; runpy.run_path({main_script!r})")
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
Running code in multiple interpreters provides a useful level of
|
|
isolation within the same process. This can be leveraged in a number
|
|
of ways. Furthermore, subinterpreters provide a well-defined framework
|
|
in which such isolation may be extended. (See :pep:`684`.)
|
|
|
|
Nick Coghlan explained some of the benefits through a comparison with
multi-processing [benefits]_::

   [I] expect that communicating between subinterpreters is going
   to end up looking an awful lot like communicating between
   subprocesses via shared memory.

   The trade-off between the two models will then be that one still
   just looks like a single process from the point of view of the
   outside world, and hence doesn't place any extra demands on the
   underlying OS beyond those required to run CPython with a single
   interpreter, while the other gives much stricter isolation
   (including isolating C globals in extension modules), but also
   demands much more from the OS when it comes to its IPC
   capabilities.

   The security risk profiles of the two approaches will also be quite
   different, since using subinterpreters won't require deliberately
   poking holes in the process isolation that operating systems give
   you by default.
|
|
|
|
CPython has supported multiple interpreters, with increasing levels
|
|
of support, since version 1.5. While the feature has the potential
|
|
to be a powerful tool, it has suffered from neglect
|
|
because the multiple interpreter capabilities are not readily available
|
|
directly from Python. Exposing the existing functionality
|
|
in the stdlib will help reverse the situation.
|
|
|
|
This proposal is focused on enabling the fundamental capability of
|
|
multiple interpreters, isolated from each other,
|
|
in the same Python process. This is a
|
|
new area for Python so there is relative uncertainty about the best
|
|
tools to provide as companions to interpreters. Thus we minimize
|
|
the functionality we add in the proposal as much as possible.
|
|
|
|
Concerns
|
|
--------
|
|
|
|
* "subinterpreters are not worth the trouble"
|
|
|
|
Some have argued that subinterpreters do not add sufficient benefit
|
|
to justify making them an official part of Python. Adding features
|
|
to the language (or stdlib) has a cost in increasing the size of
|
|
the language. So an addition must pay for itself.
|
|
|
|
In this case, multiple interpreter support provides a novel concurrency
model focused on isolated threads of execution. Furthermore, it
provides an opportunity for changes in CPython that will allow
|
|
simultaneous use of multiple CPU cores (currently prevented
|
|
by the GIL--see :pep:`684`).
|
|
|
|
Alternatives to subinterpreters include threading, async, and
|
|
multiprocessing. Threading is limited by the GIL and async isn't
|
|
the right solution for every problem (nor for every person).
|
|
Multiprocessing is likewise valuable in some but not all situations.
|
|
Direct IPC (rather than via the multiprocessing module) provides
|
|
similar benefits but with the same caveat.
|
|
|
|
Notably, subinterpreters are not intended as a replacement for any of
|
|
the above. Certainly they overlap in some areas, but the benefits of
|
|
subinterpreters include isolation and (potentially) performance. In
|
|
particular, subinterpreters provide a direct route to an alternate
|
|
concurrency model (e.g. CSP) which has found success elsewhere and
|
|
will appeal to some Python users. That is the core value that the
|
|
``interpreters`` module will provide.
|
|
|
|
* "stdlib support for multiple interpreters adds extra burden
|
|
on C extension authors"
|
|
|
|
In the `Interpreter Isolation`_ section below we identify ways in
|
|
which isolation in CPython's subinterpreters is incomplete. Most
|
|
notable is extension modules that use C globals to store internal
|
|
state. (:pep:`3121` and :pep:`489` provide a solution to that problem,
|
|
followed by some extra APIs that improve efficiency, e.g. :pep:`573`).
|
|
|
|
Consequently, projects that publish extension modules may face an
|
|
increased maintenance burden as their users start using subinterpreters,
|
|
where their modules may break. This situation is limited to modules
|
|
that use C globals (or use libraries that use C globals) to store
|
|
internal state. For numpy, the reported-bug rate is one every 6
|
|
months. [bug-rate]_
|
|
|
|
Ultimately this comes down to a question of how often it will be a
|
|
problem in practice: how many projects would be affected, how often
|
|
their users will be affected, what the additional maintenance burden
|
|
will be for projects, and what the overall benefit of subinterpreters
|
|
is to offset those costs. The position of this PEP is that the actual
|
|
extra maintenance burden will be small and well below the threshold at
|
|
which subinterpreters are worth it.
|
|
|
|
* "creating a new concurrency API deserves much more thought and
|
|
experimentation, so the new module shouldn't go into the stdlib
|
|
right away, if ever"
|
|
|
|
Introducing an API for a new concurrency model, as happened with
|
|
asyncio, is an extremely large project that requires a lot of careful
|
|
consideration. It is not something that can be done as simply as this
|
|
PEP proposes and likely deserves significant time on PyPI to mature.
|
|
(See `Nathaniel's post <nathaniel-asyncio_>`_ on python-dev.)
|
|
|
|
However, this PEP does not propose any new concurrency API.
|
|
At most it exposes minimal tools (e.g. subinterpreters)
|
|
which may be used to write code that follows patterns associated with
|
|
(relatively) new-to-Python `concurrency models <Concurrency_>`_.
|
|
Those tools could also be used as the basis for APIs for such
|
|
concurrency models. Again, this PEP does not propose any such API.
|
|
|
|
* "there is no point to exposing subinterpreters if they still share
|
|
the GIL"
|
|
* "the effort to make the GIL per-interpreter is disruptive and risky"
|
|
|
|
A common misconception is that this PEP also includes a promise that
|
|
interpreters will no longer share the GIL. When that is clarified,
|
|
the next question is "what is the point?". This is already answered
|
|
at length in this PEP. Just to be clear, the value lies in::
|
|
|
|
* increase exposure of the existing feature, which helps improve
|
|
the code health of the entire CPython runtime
|
|
* expose the (mostly) isolated execution of interpreters
|
|
* preparation for per-interpreter GIL
|
|
* encourage experimentation
|
|
|
|
* "data sharing can have a negative impact on cache performance
|
|
in multi-core scenarios"
|
|
|
|
(See [cache-line-ping-pong]_.)
|
|
|
|
This shouldn't be a problem for now as we have no immediate plans
|
|
to actually share data between interpreters, instead focusing
|
|
on copying.
|
|
|
|
|
|
About Subinterpreters
|
|
=====================
|
|
|
|
Concurrency
|
|
-----------
|
|
|
|
Concurrency is a challenging area of software development. Decades of
|
|
research and practice have led to a wide variety of concurrency models,
|
|
each with different goals. Most center on correctness and usability.
|
|
|
|
One class of concurrency models focuses on isolated threads of
|
|
execution that interoperate through some message passing scheme. A
|
|
notable example is Communicating Sequential Processes [CSP]_ (upon
|
|
which Go's concurrency is roughly based). The intended isolation
|
|
inherent to CPython's interpreters makes them well-suited
|
|
to this approach.
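
As a rough illustration of the message-passing style described above, the
following sketch uses ordinary threads and ``queue.Queue`` objects as
stand-ins. This is only an analogy: threads share memory and are isolated
here purely by convention, whereas interpreters are meant to provide that
isolation themselves. The ``worker`` function and the ``None`` sentinel are
choices made for this example, not part of the proposal.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive numbers until the None sentinel arrives; send back squares."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)


inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

# The two sides interact only by sending messages, never by touching
# each other's variables.
for n in (1, 2, 3):
    inbox.put(n)
inbox.put(None)  # sentinel: no more messages
t.join()

squares = [outbox.get() for _ in range(3)]
```
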
|
|
|
|
Shared Data
|
|
-----------
|
|
|
|
CPython's interpreters are inherently isolated (with caveats
|
|
explained below), in contrast to threads. So the same
|
|
communicate-via-shared-memory approach doesn't work. Without an
|
|
alternative, effective use of concurrency via multiple interpreters
|
|
is significantly limited.
|
|
|
|
The key challenge here is that sharing objects between interpreters
|
|
faces complexity due to various constraints on object ownership,
|
|
visibility, and mutability. At a conceptual level it's easier to
|
|
reason about concurrency when objects only exist in one interpreter
|
|
at a time. At a technical level, CPython's current memory model
|
|
limits how Python *objects* may be shared safely between interpreters;
|
|
effectively, objects are bound to the interpreter in which they were
|
|
created. Furthermore, the complexity of *object* sharing increases as
|
|
interpreters become more isolated, e.g. after GIL removal (though this
is mitigated somewhat for some "immortal" objects; see :pep:`683`).
|
|
|
|
Consequently, the mechanism for sharing needs to be carefully considered.
|
|
There are a number of valid solutions, several of which may be
|
|
appropriate to support in Python. Earlier versions of this proposal
|
|
included a basic capability ("channels"), though most of the options
|
|
were quite similar.
|
|
|
|
Note that the implementation of ``Interpreter.run()`` will be done
|
|
in a way that allows for many of these solutions to be implemented
|
|
independently and to coexist, but doing so is not technically
|
|
a part of the proposal here.
|
|
|
|
The fundamental enabling feature for communication is that most objects
|
|
can be converted to some encoding of underlying raw data, which is safe
|
|
to be passed between interpreters. For example, an ``int`` object can
|
|
be turned into a C ``long`` value, sent to another interpreter, and
|
|
turned back into an ``int`` object there.
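
The ``int`` round trip can be sketched in pure Python. The example below
assumes a fixed 8-byte encoding (``struct`` format ``"=q"``, a C
``long long``) rather than a platform ``long``, and it uses a pipe within a
single interpreter only to show that raw bytes, not the object, cross the
boundary. The helper names ``send_int`` and ``recv_int`` are hypothetical.

```python
import os
import struct

INT_FORMAT = "=q"  # assumption: fixed-size 8-byte signed integer


def send_int(fd, value):
    """Encode a Python int as raw bytes and write them to the pipe."""
    os.write(fd, struct.pack(INT_FORMAT, value))


def recv_int(fd):
    """Read one encoded value and build a new int object from it."""
    raw = os.read(fd, struct.calcsize(INT_FORMAT))
    (value,) = struct.unpack(INT_FORMAT, raw)
    return value


r, w = os.pipe()
original = 12345
send_int(w, original)
copy = recv_int(r)  # an equal value, rebuilt from the raw data
os.close(r)
os.close(w)
```
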
|
|
|
|
Regardless, the effort to determine the best way forward here is outside
|
|
the scope of this PEP. In the meantime, this proposal provides a basic
|
|
interim solution, described in `API For Sharing Data`_ below.
|
|
|
|
Interpreter Isolation
|
|
---------------------
|
|
|
|
CPython's interpreters are intended to be strictly isolated from each
|
|
other. Each interpreter has its own copy of all modules, classes,
|
|
functions, and variables. The same applies to state in C, including in
|
|
extension modules. The CPython C-API docs explain more. [caveats]_
|
|
|
|
However, there are ways in which interpreters share some state. First
|
|
of all, some process-global state remains shared:
|
|
|
|
* file descriptors
|
|
* builtin types (e.g. dict, bytes)
|
|
* singletons (e.g. None)
|
|
* underlying static module data (e.g. functions) for
|
|
builtin/extension/frozen modules
|
|
|
|
There are no plans to change this.
|
|
|
|
Second, some isolation is faulty due to bugs or implementations that did
|
|
not take subinterpreters into account. This includes things like
|
|
extension modules that rely on C globals. [cryptography]_ In these
|
|
cases bugs should be opened (some are already):
|
|
|
|
* readline module hook functions (http://bugs.python.org/issue4202)
|
|
* memory leaks on re-init (http://bugs.python.org/issue21387)
|
|
|
|
Finally, some potential isolation is missing due to the current design
|
|
of CPython. Improvements are currently underway to address gaps in this
|
|
area:
|
|
|
|
* GC is not run per-interpreter [global-gc]_
|
|
* at-exit handlers are not run per-interpreter [global-atexit]_
|
|
* extensions using the ``PyGILState_*`` API are incompatible [gilstate]_
|
|
* interpreters share memory management (e.g. allocators, gc)
|
|
* interpreters share the GIL
|
|
|
|
Existing Usage
|
|
--------------
|
|
|
|
Multiple interpreter support is not a widely used feature. In fact,
|
|
the only documented cases of widespread usage are
|
|
`mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_,
|
|
`OpenStack Ceph <https://github.com/ceph/ceph/pull/14971>`_, and
|
|
`JEP <https://github.com/ninia/jep>`_. On the one hand, these cases
|
|
provide confidence that existing multiple interpreter support is
|
|
relatively stable. On the other hand, there isn't much of a sample
|
|
size from which to judge the utility of the feature.
|
|
|
|
|
|
Alternate Python Implementations
|
|
================================
|
|
|
|
I've solicited feedback from various Python implementors about support
|
|
for subinterpreters. Each has indicated that they would be able to
|
|
support multiple interpreters in the same process (if they choose to)
|
|
without a lot of trouble. Here are the projects I contacted:
|
|
|
|
* jython ([jython]_)
|
|
* ironpython (personal correspondence)
|
|
* pypy (personal correspondence)
|
|
* micropython (personal correspondence)
|
|
|
|
|
|
.. _interpreters-list-all:
|
|
.. _interpreters-get-current:
|
|
.. _interpreters-create:
|
|
.. _interpreters-Interpreter:
|
|
|
|
"interpreters" Module API
|
|
=========================
|
|
|
|
The module provides the following functions::

   list_all() -> [Interpreter]

      Return a list of all existing interpreters.

   get_current() -> Interpreter

      Return the currently running interpreter.

   get_main() -> Interpreter

      Return the main interpreter. If the Python implementation
      has no concept of a main interpreter then return None.

   create() -> Interpreter

      Initialize a new Python interpreter and return it.
      It will remain idle until something is run in it and always
      run in its own thread.
|
|
|
|
|
|
The module also provides the following classes::

   class Interpreter(id):

      id -> int:

         The interpreter's ID. (read-only)

      is_running() -> bool:

         Return whether or not the interpreter's "run()" is currently
         executing code. Code running in subthreads is ignored.
         Calling this on the current interpreter will always return True.

      close():

         Finalize and destroy the interpreter.

         This may not be called on an already running interpreter.
         Doing so results in a RuntimeError.

      run(source_str, /):

         Run the provided Python source code in the interpreter,
         in its __main__ module.

         This may not be called on an already running interpreter.
         Doing so results in a RuntimeError.

         A "run()" call is similar to an exec() call (or calling
         a function that returns None). Once "run()" completes,
         the code that called "run()" continues executing (in the
         original interpreter). Likewise, if there is any uncaught
         exception then it effectively (see below) propagates into
         the code where "run()" was called. Like exec() (and threads),
         but unlike function calls, there is no return value. If any
         "return" value from the code is needed, send the data out
         via a pipe (os.pipe()).

         The big difference from exec() or functions is that "run()"
         executes the code in an entirely different interpreter,
         with entirely separate state. The interpreters are completely
         isolated from each other, so the state of the original interpreter
         (including the code it was executing in the current OS thread)
         does not affect the state of the target interpreter
         (the one that will execute the code). Likewise, the target
         does not affect the original, nor any of its other threads.

         Instead, the state of the original interpreter (for this thread)
         is frozen, and the code it's executing completely blocks.
         At that point, the target interpreter is given control of the
         OS thread. Then, when it finishes executing, the original
         interpreter gets control back and continues executing.

         So calling "run()" will effectively cause the current Python
         thread to completely pause. Sometimes you won't want that pause,
         in which case you should make the "run()" call in another thread.
         To do so, add a function that calls "run()" and then run that
         function in a normal "threading.Thread".

         Note that the interpreter's state is never reset, neither
         before "run()" executes the code nor after. Thus the
         interpreter state is preserved between calls to "run()".
         This includes "sys.modules", the "builtins" module, and the
         internal state of C extension modules.

         Also note that "run()" executes in the namespace of the
         "__main__" module, just like scripts, the REPL, "-m", and
         "-c". Just as the interpreter's state is not ever reset, the
         "__main__" module is never reset. You can imagine
         concatenating the code from each "run()" call into one long
         script. This is the same as how the REPL operates.

         Supported code: source text.
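
The "one long script" behavior of repeated ``run()`` calls can be imitated
with ``exec()`` and a single namespace dict. This is only a sketch of the
namespace semantics: the ``run`` function and ``main_ns`` dict below are
hypothetical stand-ins, and unlike a real interpreter they provide no
isolation at all.

```python
# Stand-in for an interpreter's __main__ namespace (assumption: a plain dict).
main_ns = {"__name__": "__main__"}


def run(source):
    """Hypothetical stand-in for Interpreter.run(): exec() in one namespace."""
    exec(source, main_ns)


run("counter = 1")
run("counter += 1")  # sees the name defined by the previous call
run("import math")   # imports also persist between calls
run("result = math.sqrt(counter + 2)")
```

Each call picks up where the previous one left off, much as successive
inputs do in the REPL.
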
|
|
|
|
Uncaught Exceptions
|
|
-------------------
|
|
|
|
Regarding uncaught exceptions in ``Interpreter.run()``, we noted that
|
|
they are "effectively" propagated into the code where ``run()`` was
|
|
called. To prevent leaking exceptions (and tracebacks) between
|
|
interpreters, we create a surrogate of the exception and its traceback
|
|
(see :class:`traceback.TracebackException`), set it to ``__cause__``
|
|
on a new ``RunFailedError``, and raise that.
|
|
|
|
Directly raising (a proxy of) the exception is problematic since it's
|
|
harder to distinguish between an error in the ``run()`` call and an
|
|
uncaught exception from the subinterpreter.
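
The wrapping idea can be sketched with the stdlib alone. The example below
is not the proposed implementation: ``RunFailedError`` and ``run`` are local
stand-ins, ``exec()`` replaces a real subinterpreter, and the
``TracebackException`` snapshot is stored on a plain ``snapshot`` attribute
(an assumption made here because ``__cause__`` only accepts exception
instances).

```python
import traceback


class RunFailedError(RuntimeError):
    """Local stand-in for the proposed interpreters.RunFailedError."""

    def __init__(self, snapshot):
        self.snapshot = snapshot  # holds strings, not live objects
        summary = "".join(snapshot.format_exception_only()).strip()
        super().__init__(summary)


def run(source):
    """Hypothetical stand-in for Interpreter.run(), using exec()."""
    try:
        exec(source, {"__name__": "__main__"})
    except Exception as exc:
        # Capture a surrogate so the original exception and its
        # traceback objects do not leak to the caller.
        snapshot = traceback.TracebackException.from_exception(exc)
        raise RunFailedError(snapshot) from None


try:
    run("raise KeyError('spam')")
except RunFailedError as err:
    message = str(err)
    leaked_cause = err.__cause__
```

The caller can tell a failure of the executed code (``RunFailedError``) apart
from an error raised by the ``run()`` machinery itself, which is the
distinction the text above is after.
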
|
|
|
|
API For Sharing Data
|
|
--------------------
|
|
|
|
As discussed in `Shared Data`_ above, multiple interpreter support
|
|
is less useful without a mechanism for sharing data (communicating)
|
|
between them. Sharing actual Python objects between interpreters,
|
|
however, has enough potential problems that we are avoiding support
|
|
for that in this proposal. Nor, as mentioned earlier, are we adding
|
|
anything more than the most minimal mechanism for communication.
|
|
|
|
That very basic mechanism, using pipes (see ``os.pipe()``), will allow
|
|
users to send data (bytes) from one interpreter to another. We'll
|
|
take a closer look in a moment. Fundamentally, it's a simple
|
|
application of the underlying sharing capability proposed here.
|
|
|
|
The various aspects of the approach, including keeping the API minimal,
help us avoid further exposing any underlying complexity
|
|
to Python users.
|
|
|
|
Communicating Through OS Pipes
|
|
''''''''''''''''''''''''''''''
|
|
|
|
As noted, this proposal enables a very basic mechanism for
|
|
communicating between interpreters, which makes use of
|
|
``Interpreter.run()``:
|
|
|
|
1. interpreter A calls ``os.pipe()`` to get a read/write pair
|
|
of file descriptors (both ``int`` objects)
|
|
2. interpreter A calls ``run()`` on interpreter B, including
|
|
the read FD via string formatting
|
|
3. interpreter A writes some bytes to the write FD
|
|
4. interpreter B reads those bytes
|
|
|
|
Several of the earlier examples demonstrate this, such as
|
|
`Synchronize using an OS pipe`_.
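
The four steps above can be traced in a self-contained sketch. Since the
``interpreters`` module is only proposed here, the example substitutes a
hypothetical ``run`` helper built on ``exec()`` in a separate thread; the
second pipe for the reply is an addition made so the result can be checked.

```python
import os
import textwrap
import threading


def run(source):
    """Hypothetical stand-in for Interpreter.run(): exec() in a fresh namespace."""
    exec(source, {"__name__": "__main__"})


# Step 1: "interpreter A" creates the pipe (plus one for the reply).
r, w = os.pipe()
reply_r, reply_w = os.pipe()

# Step 2: the FDs are plain ints, so string formatting can carry them.
source = textwrap.dedent(f"""
    import os
    data = os.read({r}, 5)              # step 4: read the bytes
    os.write({reply_w}, data.upper())
    """)
t = threading.Thread(target=run, args=(source,))
t.start()

os.write(w, b"hello")                   # step 3: write some bytes
t.join()
reply = os.read(reply_r, 5)

for fd in (r, w, reply_r, reply_w):
    os.close(fd)
```

Only integers and bytes cross the boundary, so no Python object is ever
shared between the two sides.
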
|
|
|
|
|
|
Interpreter Restrictions
|
|
========================
|
|
|
|
Every new interpreter created by ``interpreters.create()``
|
|
now has specific restrictions on any code it runs. This includes the
|
|
following:
|
|
|
|
* importing an extension module fails if it does not implement
|
|
multi-phase init
|
|
* daemon threads may not be created
|
|
* ``os.fork()`` is not allowed (so no ``multiprocessing``)
|
|
* ``os.exec*()`` is not allowed
|
|
(but "fork+exec", a la ``subprocess`` is okay)
|
|
|
|
Note that interpreters created with the existing C-API do not have these
|
|
restrictions. The same is true for the "main" interpreter, so
|
|
existing use of Python will not change.
|
|
|
|
.. Mention the similar restrictions in PEP 684?
|
|
|
|
We may choose to later loosen some of the above restrictions or provide
|
|
a way to enable/disable granular restrictions individually. Regardless,
|
|
requiring multi-phase init from extension modules will always be a
|
|
default restriction.
|
|
|
|
|
|
Documentation
|
|
=============
|
|
|
|
The new stdlib docs page for the ``interpreters`` module will include
|
|
the following:
|
|
|
|
* (at the top) a clear note that support for multiple interpreters
|
|
is not required from extension modules
|
|
* some explanation about what subinterpreters are
|
|
* brief examples of how to use multiple interpreters
|
|
(and communicating between them)
|
|
* a summary of the limitations of using multiple interpreters
|
|
* (for extension maintainers) a link to the resources for ensuring
|
|
multiple interpreters compatibility
|
|
* much of the API information in this PEP
|
|
|
|
Docs about resources for extension maintainers already exist on the
|
|
`Isolating Extension Modules <isolation-howto_>`_ howto page. Any
|
|
extra help will be added there. For example, it may prove helpful
|
|
to discuss strategies for dealing with linked libraries that keep
|
|
their own subinterpreter-incompatible global state.
|
|
|
|
.. _isolation-howto:
   https://docs.python.org/3/howto/isolating-extensions.html
|
|
|
|
Note that the documentation will play a large part in mitigating any
|
|
negative impact that the new ``interpreters`` module might have on
|
|
extension module maintainers.
|
|
|
|
Also, the ``ImportError`` for incompatible extension modules will have
|
|
a message that clearly says it is due to missing multiple interpreters
|
|
compatibility and that extensions are not required to provide it. This
|
|
will help set user expectations properly.
|
|
|
|
Alternative Solutions
|
|
=====================
|
|
|
|
One possible alternative to a new module is to add support for interpreters
|
|
to ``concurrent.futures``. There are several reasons why that wouldn't work:
|
|
|
|
* the obvious place to look for multiple interpreters support
|
|
is an "interpreters" module, much as with "threading", etc.
|
|
* ``concurrent.futures`` is all about executing functions
|
|
but currently we don't have a good way to run a function
|
|
from one interpreter in another
|
|
|
|
Similar reasoning applies for support in the ``multiprocessing`` module.
|
|
|
|
|
|
Deferred Functionality
======================

In the interest of keeping this proposal minimal, the following
functionality has been left out for future consideration. Note that
this is not a judgement against any of these capabilities, but rather a
deferment. That said, each is arguably valid.

Shareable Objects
-----------------

Earlier versions of this proposal included a mechanism by which the
data underlying a given object could be passed to another interpreter
or even shared, even if the object can't be. Without channels there
isn't enough benefit to keep the concept of shareable objects around.

Interpreter.call()
------------------

It would be convenient to run existing functions in subinterpreters
directly. ``Interpreter.run()`` could be adjusted to support this or
a ``call()`` method could be added::

   Interpreter.call(f, *args, **kwargs)

This suffers from the same problem as sharing objects between
interpreters via queues. The minimal solution (running a source string)
is sufficient for us to get the feature out where it can be explored.

Interpreter.run_in_thread()
---------------------------

This method would make a ``run()`` call for you in a thread. Doing this
using only ``threading.Thread`` and ``run()`` is relatively trivial, so
we've left it out.
|
|
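As a rough illustration of how little is involved, the helper below
starts a ``run()`` call in a new ``threading.Thread``. The names
``StubInterpreter`` and ``run_in_thread()`` are hypothetical and exist
only for this sketch. The stub executes the script with ``exec()`` in
the current interpreter, whereas a real ``Interpreter.run()`` would
execute it in a separate one.

```python
import threading


class StubInterpreter:
    """Hypothetical stand-in for the proposed Interpreter class."""

    def __init__(self):
        self._ns = {"__name__": "__main__"}

    def run(self, script):
        # A real Interpreter.run() would execute in another interpreter.
        exec(script, self._ns)


def run_in_thread(interp, script):
    """Start interp.run(script) in a new thread and return the thread."""
    t = threading.Thread(target=interp.run, args=(script,))
    t.start()
    return t


interp = StubInterpreter()
t = run_in_thread(interp, "result = sum(range(10))")
t.join()  # wait for the script to finish
```

The caller keeps the ``Thread`` object, so it decides when (or whether)
to wait for the script to finish.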
|
|
Synchronization Primitives
--------------------------

The ``threading`` module provides a number of synchronization primitives
for coordinating concurrent operations. This is especially necessary
due to the shared-state nature of threading. In contrast,
interpreters do not share state. Data sharing is restricted to the
runtime's shareable objects capability, which does away with the need
for explicit synchronization. If any sort of opt-in shared state
support is added to CPython's interpreters in the future, that same
effort can introduce synchronization primitives to meet that need.

CSP Library
-----------

A ``csp`` module would not be a large step away from the functionality
provided by this PEP. However, adding such a module is outside the
minimalist goals of this proposal.

Syntactic Support
-----------------

The ``Go`` language provides a concurrency model based on CSP,
so it's similar to the concurrency model that multiple interpreters
support. However, ``Go`` also provides syntactic support, as well as
several builtin concurrency primitives, to make concurrency a
first-class feature. Conceivably, similar syntactic (and builtin)
support could be added to Python using interpreters. However,
that is *way* outside the scope of this PEP!

Multiprocessing
---------------

The ``multiprocessing`` module could support interpreters in the same
way it supports threads and processes. In fact, the module's
maintainer, Davin Potts, has indicated this is a reasonable feature
request. However, it is outside the narrow scope of this PEP.

C-extension opt-in/opt-out
--------------------------

By using the ``PyModuleDef_Slot`` introduced by :pep:`489`, we could
easily add a mechanism by which C-extension modules could opt out of
multiple interpreter support. Then the import machinery, when operating
in a subinterpreter, would need to check the module for support.
It would raise an ``ImportError`` if unsupported.

Alternately we could support opting in to multiple interpreters support.
However, that would probably exclude many more modules (unnecessarily)
than the opt-out approach. Also, note that :pep:`489` defined that an
extension's use of the PEP's machinery implies multiple interpreters
support.

The scope of adding the ModuleDef slot and fixing up the import
machinery is non-trivial, but could be worth it. It all depends on
how many extension modules break under subinterpreters. Given that
there are relatively few cases we know of through mod_wsgi, we can
leave this for later.
|
|
|
|
Resetting __main__
------------------

As proposed, every call to ``Interpreter.run()`` will execute in the
namespace of the interpreter's existing ``__main__`` module. This means
that data persists there between ``run()`` calls. Sometimes this isn't
desirable and you want to execute in a fresh ``__main__``. Also,
you don't necessarily want to leak objects there that you aren't using
any more.

Note that the following won't work right because it will clear too much
(e.g. ``__name__`` and the other "__dunder__" attributes)::

   interp.run('globals().clear()')

Possible solutions include:

* a ``create()`` arg to indicate resetting ``__main__`` after each
  ``run`` call
* an ``Interpreter.reset_main`` flag to support opting in or out
  after the fact
* an ``Interpreter.reset_main()`` method to opt in when desired
* ``importlib.util.reset_globals()`` [reset_globals]_

Also note that resetting ``__main__`` does nothing about state stored
in other modules. So any solution would have to be clear about the
scope of what is being reset. Conceivably we could invent a mechanism
by which any (or every) module could be reset, unlike ``reload()``
which does not clear the module before loading into it.

Regardless, since ``__main__`` is the execution namespace of the
interpreter, resetting it has a much more direct correlation to
interpreters and their dynamic state than does resetting other modules.
So a more generic module reset mechanism may prove unnecessary.

This isn't a critical feature initially. It can wait until later
if desirable.
|
|
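The difference between clearing everything and a targeted reset can be
sketched with a plain dict standing in for an interpreter's ``__main__``
namespace. The helper name ``reset_namespace()`` is hypothetical and not
part of the proposed API, and ``exec()`` is used here only as a stand-in
for ``Interpreter.run()``.

```python
def reset_namespace(ns):
    """Drop user-defined names from ns but keep the "__dunder__" entries."""
    for name in list(ns):
        if not (name.startswith("__") and name.endswith("__")):
            del ns[name]


# A dict stands in for the interpreter's __main__ namespace.
main_ns = {"__name__": "__main__"}

exec("x = 1", main_ns)       # the first "run()" leaves state behind
exec("y = x + 1", main_ns)   # the second call still sees x

reset_namespace(main_ns)     # targeted reset: x and y are gone
kept = sorted(main_ns)       # only the dunder names remain

cleared = dict(main_ns)
exec("globals().clear()", cleared)  # clears too much, including __name__
```

A real solution would still have to decide what "fresh" means for
``__main__``; this sketch only shows why a blanket ``clear()`` is too
aggressive.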
|
|
Resetting an interpreter's state
--------------------------------

It may be nice to re-use an existing subinterpreter instead of
spinning up a new one. Since an interpreter has substantially more
state than just the ``__main__`` module, it isn't so easy to put an
interpreter back into a pristine/fresh state. In fact, there *may*
be parts of the state that cannot be reset from Python code.

A possible solution is to add an ``Interpreter.reset()`` method. This
would put the interpreter back into the state it was in when newly
created. If called on a running interpreter it would fail (hence the
main interpreter could never be reset). This would likely be more
efficient than creating a new interpreter, though that depends on
what optimizations will be made later to interpreter creation.

While this would potentially provide functionality that is not
otherwise available from Python code, it isn't a fundamental
functionality. So in the spirit of minimalism here, this can wait.
Regardless, I doubt it would be controversial to add it post-PEP.

Copy an existing interpreter's state
------------------------------------

Relatedly, it may be useful to support creating a new interpreter
based on an existing one, e.g. ``Interpreter.copy()``. This ties
into the idea that a snapshot could be made of an interpreter's memory,
which would make starting up CPython, or creating new interpreters,
faster in general. The same mechanism could be used for a
hypothetical ``Interpreter.reset()``, as described previously.

Shareable file descriptors and sockets
--------------------------------------

Given that file descriptors and sockets are process-global resources,
making them shareable is a reasonable idea. They would be a good
candidate for the first effort at expanding the supported shareable
types. They aren't strictly necessary for the initial API.
|
|
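Even without dedicated shareable types, a descriptor is only an integer
that is valid throughout the process, so it can be formatted into the
source string passed to ``run()``. The sketch below assumes nothing
beyond ``os.pipe()``. Running the script with ``exec()`` in a thread is
a stand-in for calling ``Interpreter.run()`` on another interpreter.

```python
import os
import threading

r, w = os.pipe()

# The write end is a plain int, so it can be embedded in the script text.
script = f"import os\nos.write({w}, b'hello from script')\nos.close({w})\n"

# Stand-in for interp.run(script) being called in another thread.
t = threading.Thread(target=exec, args=(script, {"__name__": "__main__"}))
t.start()
t.join()

data = os.read(r, 100)  # the bytes written by the script
os.close(r)
```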
|
|
Integration with async
----------------------

Per Antoine Pitrou [async]_::

   Has any thought been given to how FIFOs could integrate with async
   code driven by an event loop (e.g. asyncio)? I think the model of
   executing several asyncio (or Tornado) applications each in their
   own subinterpreter may prove quite interesting to reconcile
   multi-core concurrency with ease of programming. That would require
   the FIFOs to be able to synchronize on something an event loop can
   wait on (probably a file descriptor?).

The basic functionality of multiple interpreters support does not depend
on async, so async integration can be added later.
|
|
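To make the "something an event loop can wait on" idea concrete, here is
a minimal sketch in which an ``asyncio`` task waits for a pipe to become
readable. It assumes a Unix-like platform whose event loop supports
``loop.add_reader()``. The helper name ``wait_readable()`` is
hypothetical, and the delayed ``os.write()`` merely stands in for
another interpreter sending data.

```python
import asyncio
import os


async def wait_readable(fd):
    """Suspend the current task until fd has data to read."""
    loop = asyncio.get_running_loop()
    ready = loop.create_future()

    def on_readable():
        if not ready.done():
            ready.set_result(None)

    loop.add_reader(fd, on_readable)
    try:
        await ready
    finally:
        loop.remove_reader(fd)


async def main():
    r, w = os.pipe()
    loop = asyncio.get_running_loop()
    # Stand-in for another interpreter writing to the pipe a bit later.
    loop.call_later(0.01, os.write, w, b"ping")
    try:
        await wait_readable(r)
        return os.read(r, 1024)
    finally:
        os.close(r)
        os.close(w)


message = asyncio.run(main())
```

Other tasks on the same loop keep running while ``wait_readable()`` is
suspended, which is the property the quoted message asks for.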
|
|
Channels
--------

We could introduce some relatively efficient, native data types for
passing data between interpreters, to use instead of OS pipes. Earlier
versions of this PEP introduced one such mechanism, called "channels".
This can be pursued later.

Pipes and Queues
----------------

With the proposed object passing mechanism of ``os.pipe()``, other similar
basic types aren't strictly required to achieve the minimal useful
functionality of multiple interpreters. Such types include pipes
(like unbuffered channels, but one-to-one) and queues (like channels,
but more generic). See below in `Rejected Ideas`_ for more information.

Even though these types aren't part of this proposal, they may still
be useful in the context of concurrency. Adding them later is entirely
reasonable. They could be trivially implemented as wrappers around
channels. Alternatively they could be implemented for efficiency at the
same low level as channels.

Support inheriting settings (and more?)
---------------------------------------

Folks might find it useful, when creating a new interpreter, to be
able to indicate that they would like some things "inherited" by the
new interpreter. The mechanism could be a strict copy or it could be
copy-on-write. The motivating example is with the warnings module
(e.g. copy the filters).

The feature isn't critical, nor would it be widely useful, so it
can wait until there's interest. Notably, both suggested solutions
will require significant work, especially when it comes to complex
objects and most especially for mutable containers of mutable
complex objects.

Make exceptions shareable
-------------------------

Exceptions are propagated out of ``run()`` calls, so it isn't a big
leap to make them shareable. However, as noted elsewhere,
it isn't essential (or particularly common), so we can wait on doing
that.
|
|
|
|
Make RunFailedError.__cause__ lazy
----------------------------------

An uncaught exception in a subinterpreter (from ``run()``) is copied
to the calling interpreter and set as ``__cause__`` on a
``RunFailedError`` which is then raised. That copying part involves
some sort of deserialization in the calling interpreter, which can be
expensive (e.g. due to imports) yet is not always necessary.

So it may be useful to use an ``ExceptionProxy`` type to wrap the
serialized exception and only deserialize it when needed. That could
be via ``ExceptionProxy.__getattribute__()`` or perhaps through
``RunFailedError.resolve()`` (which would raise the deserialized
exception and set ``RunFailedError.__cause__`` to the exception).

It may also make sense to have ``RunFailedError.__cause__`` be a
descriptor that does the lazy deserialization (and sets ``__cause__``)
on the ``RunFailedError`` instance.
|
|
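One way the ``resolve()`` variant might look is sketched below. This is
an illustration under stated assumptions rather than the planned
implementation: ``pickle`` is assumed as the serialization format, and
both classes are minimal stand-ins whose constructors are invented for
the example.

```python
import pickle


class ExceptionProxy:
    """Holds a serialized exception and deserializes it on first use."""

    def __init__(self, data):
        self._data = data
        self._exc = None

    def resolve(self):
        if self._exc is None:
            # The potentially expensive step (it may trigger imports).
            self._exc = pickle.loads(self._data)
        return self._exc


class RunFailedError(RuntimeError):
    """Stand-in for the proposed exception type."""

    def __init__(self, msg, proxy):
        super().__init__(msg)
        self._proxy = proxy

    def resolve(self):
        """Deserialize the original exception, set __cause__, and raise it."""
        exc = self._proxy.resolve()
        self.__cause__ = exc
        raise exc


# As if the bytes had been produced in the subinterpreter.
payload = pickle.dumps(ValueError("bad input"))
err = RunFailedError("script failed", ExceptionProxy(payload))

try:
    err.resolve()
except ValueError as exc:
    resolved = exc
```

Code that only logs ``err`` never pays for the deserialization. Only
callers that ask for the original exception do.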
|
|
Make everything shareable through serialization
-----------------------------------------------

We could use pickle (or marshal) to serialize every object and thus
make them all shareable. Doing this is potentially inefficient,
but it may be a matter of convenience in the end.
We can add it later, but trying to remove it later
would be significantly more painful.
|
|
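The trade-off between the two modules can be seen in a short sketch.
It only exercises the stdlib ``pickle`` and ``marshal`` modules in a
single interpreter. How serialized bytes would actually travel between
interpreters is left open here.

```python
import datetime
import marshal
import pickle

simple = {"numbers": [1, 2, 3], "label": "example"}
rich = datetime.date(2017, 9, 5)

# Both modules round-trip plain builtin containers into equal copies.
via_marshal = marshal.loads(marshal.dumps(simple))
via_pickle = pickle.loads(pickle.dumps(simple))

# pickle also handles many non-builtin types...
rich_copy = pickle.loads(pickle.dumps(rich))

# ...while marshal rejects them.
try:
    marshal.dumps(rich)
    marshal_accepts_date = True
except ValueError:
    marshal_accepts_date = False
```

In both cases the receiver gets a copy, not the original object, so
mutations on one side would not be visible on the other.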
|
|
Return a value from ``run()``
-----------------------------

Currently ``run()`` always returns ``None``. One idea is to return the
return value from whatever the subinterpreter ran. However, for now
it doesn't make sense. The only thing folks can run is a string of
code (i.e. a script). This is equivalent to ``PyRun_StringFlags()``,
``exec()``, or a module body. None of those "return" anything. We can
revisit this once ``run()`` supports functions, etc.
|
|
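The ``exec()`` comparison can be checked directly in a few lines. This
only illustrates the builtin in the current interpreter, not the
proposed ``run()`` itself. With a real subinterpreter the value would
have to be passed back by other means, such as ``os.pipe()``.

```python
namespace = {}

# exec() runs statements and always returns None, like a module body.
result = exec("answer = 6 * 7", namespace)

# Any value the script computes has to be retrieved some other way,
# here from the namespace the script ran in.
answer = namespace["answer"]
```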
|
|
Add a shareable synchronization primitive
-----------------------------------------

This would be ``threading.Lock`` (or something like it) where
interpreters would actually share the underlying mutex. The main
concern is that locks and isolated interpreters may not mix well
(as learned in Go).

If it proves desirable, we can add this later without much trouble.

Propagate SystemExit and KeyboardInterrupt Differently
------------------------------------------------------

The exception types that inherit from ``BaseException`` (aside from
``Exception``) are usually treated specially. These types are:
``KeyboardInterrupt``, ``SystemExit``, and ``GeneratorExit``. It may
make sense to treat them specially when it comes to propagation from
``run()``. Here are some options:

* propagate like normal via ``RunFailedError``
* do not propagate (handle them somehow in the subinterpreter)
* propagate them directly (avoid ``RunFailedError``)
* propagate them directly (set ``RunFailedError`` as ``__cause__``)

We aren't going to worry about handling them differently. Threads
already ignore ``SystemExit``, so for now we will follow that pattern.
|
|
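The thread behavior referred to above is easy to observe. In this small
sketch a worker thread raises ``SystemExit``; the default
``threading.excepthook`` silently ignores it, so only that thread ends
and the rest of the program carries on.

```python
import threading

finished = []


def worker():
    finished.append("before exit")
    raise SystemExit(3)  # ends only this thread, silently


t = threading.Thread(target=worker)
t.start()
t.join()

# Execution continues here; the process did not exit with status 3.
finished.append("main thread continues")
```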
|
|
|
|
Rejected Ideas
==============

Add an API based on pipes
-------------------------

(Earlier versions of this PEP proposed "channels" for communicating
between interpreters. This idea is written relative to that.)

A pipe would be a simplex FIFO between exactly two interpreters. For
most use cases this would be sufficient. It could potentially simplify
the implementation as well. However, it isn't a big step to supporting
a many-to-many simplex FIFO via channels. Also, with pipes the API
ends up being slightly more complicated, requiring naming the pipes.

Add an API based on queues
--------------------------

(Earlier versions of this PEP proposed "channels" for communicating
between interpreters. This idea is written relative to that.)

Queues and buffered channels are almost the same thing. The main
difference is that channels have a stronger relationship with context
(i.e. the associated interpreter).

The name "Channel" was used instead of "Queue" to avoid confusion with
the stdlib ``queue.Queue``.

"enumerate"
-----------

The ``list_all()`` function provides the list of all interpreters.
In the threading module, which partly inspired the proposed API, the
function is called ``enumerate()``. The name is different here to
avoid confusing Python users that are not already familiar with the
threading API. For them "enumerate" is rather unclear, whereas
"list_all" is clear.
|
|
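The potential for confusion is visible when the two existing functions
named ``enumerate`` are placed side by side. The snippet uses only the
stdlib and does not involve the proposed module.

```python
import threading

# threading.enumerate() returns the Thread objects that are currently alive.
alive = threading.enumerate()

# The builtin enumerate() does something unrelated: it pairs items
# with their indexes.
pairs = list(enumerate(["a", "b"]))
```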
|
|
Alternate solutions to prevent leaking exceptions across interpreters
---------------------------------------------------------------------

In function calls, uncaught exceptions propagate to the calling frame.
The same approach could be taken with ``run()``. However, this would
mean that exception objects would leak across the inter-interpreter
boundary. Likewise, the frames in the traceback would potentially leak.

While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). We've
resolved the semantics of how the exceptions propagate by raising a
``RunFailedError`` instead, for which ``__cause__`` wraps a safe proxy
for the original exception and traceback.

Rejected possible solutions:

* reproduce the exception and traceback in the original interpreter
  and raise that
* raise a subclass of ``RunFailedError`` that proxies the original
  exception and traceback
* raise ``RuntimeError`` instead of ``RunFailedError``
* convert at the boundary (a la ``subprocess.CalledProcessError``)
  (requires a cross-interpreter representation)
* support customization via ``Interpreter.excepthook``
  (requires a cross-interpreter representation)
* wrap in a proxy at the boundary (including with support for
  something like ``err.raise()`` to propagate the traceback)
* return the exception (or its proxy) from ``run()`` instead of
  raising it
* return a result object (like ``subprocess`` does) [result-object]_
  (unnecessary complexity?)
* throw the exception away and expect users to deal with unhandled
  exceptions explicitly in the script they pass to ``run()``
  (they can pass error info out via ``os.pipe()``);
  with threads you have to do something similar
|
|
|
|
Always associate each new interpreter with its own thread
---------------------------------------------------------

As implemented in the C-API, an interpreter is not inherently tied to
any thread. Furthermore, it will run in any existing thread, whether
created by Python or not. You only have to activate one of its thread
states (``PyThreadState``) in the thread first. This means that the
same thread may run more than one interpreter (though obviously
not at the same time).

The proposed module maintains this behavior. Interpreters are not
tied to threads. Only calls to ``Interpreter.run()`` are. However,
one of the key objectives of this PEP is to provide a more
human-centric concurrency model. With that in mind, from a conceptual
standpoint the module *might* be easier to understand if each
interpreter were associated with its own thread.

That would mean ``interpreters.create()`` would create a new thread
and ``Interpreter.run()`` would only execute in that thread (and
nothing else would). The benefit is that users would not have to
wrap ``Interpreter.run()`` calls in a new ``threading.Thread``. Nor
would they be in a position to accidentally pause the current
interpreter (in the current thread) while their interpreter
executes.

The idea is rejected because the benefit is small and the cost is high.
The difference from the capability in the C-API would be potentially
confusing. The implicit creation of threads is magical. The early
creation of threads is potentially wasteful. The inability to run
arbitrary interpreters in an existing thread would prevent some valid
use cases, frustrating users. Tying interpreters to threads would
require extra runtime modifications. It would also make the module's
implementation overly complicated. Finally, it might not even make
the module easier to understand.

Allow multiple simultaneous calls to Interpreter.run()
------------------------------------------------------

This would make sense especially if ``Interpreter.run()`` were to
manage new threads for you (which we've rejected). Essentially,
each call would run independently, which would be mostly fine
from a narrow technical standpoint, since each interpreter
can have multiple threads.

The problem is that the interpreter has only one ``__main__`` module
and simultaneous ``Interpreter.run()`` calls would have to sort out
sharing ``__main__`` or we'd have to invent a new mechanism. Neither
would be simple enough to be worth doing.
|
|
|
|
Add a "reraise" method to RunFailedError
----------------------------------------

While having ``__cause__`` set on ``RunFailedError`` helps produce a
more useful traceback, it's less helpful when handling the original
error. To help facilitate this, we could add
``RunFailedError.reraise()``. This method would enable the following
pattern::

   try:
       try:
           interp.run(script)
       except RunFailedError as exc:
           exc.reraise()
   except MyException:
       ...

This would be made even simpler if there existed a ``__reraise__``
protocol.

All that said, this is completely unnecessary. Using ``__cause__``
is good enough::

   try:
       try:
           interp.run(script)
       except RunFailedError as exc:
           raise exc.__cause__
   except MyException:
       ...

Note that in extreme cases it may require a little extra boilerplate::

   try:
       try:
           interp.run(script)
       except RunFailedError as exc:
           if exc.__cause__ is not None:
               raise exc.__cause__
           raise  # re-raise
   except MyException:
       ...
|
|
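For readers who want to try the ``__cause__`` pattern without the
proposed module, the following self-contained sketch emulates it. The
``run()`` function and both exception classes are stand-ins written for
this example. In particular, the stand-in chains the original exception
object directly, whereas the proposal would set ``__cause__`` to a copy
or proxy of it.

```python
class RunFailedError(RuntimeError):
    """Stand-in for the proposed exception type."""


class MyException(Exception):
    pass


def run(script, ns):
    """Stand-in for Interpreter.run(): wrap any failure in RunFailedError."""
    try:
        exec(script, ns)
    except Exception as exc:
        raise RunFailedError(f"{type(exc).__name__}: {exc}") from exc


handled = None
try:
    try:
        run("raise MyException('boom')", {"MyException": MyException})
    except RunFailedError as exc:
        if exc.__cause__ is not None:
            raise exc.__cause__
        raise  # re-raise
except MyException as exc:
    handled = exc
```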
|
|
|
|
Implementation
==============

The implementation of the PEP has 4 parts:

* the high-level module described in this PEP (mostly a light wrapper
  around a low-level C extension)
* the low-level C extension module
* additions to the ("private") C-API needed by the low-level module
* secondary fixes/changes in the CPython runtime that facilitate
  the low-level module (among other benefits)

These are at various levels of completion, with more done the lower
you go:

* the high-level module has been, at best, roughly implemented.
  However, fully implementing it will be almost trivial.
* the low-level module is mostly complete. The bulk of the
  implementation was merged into master in December 2018 as the
  "_xxsubinterpreters" module (for the sake of testing multiple
  interpreters functionality). Only 3 parts of the implementation
  remain: "send_wait()", "send_buffer()", and exception propagation.
  All three have been mostly finished, but were blocked by work
  related to ceval. That blocker is basically resolved now and
  finishing the low-level module will not require extensive work.
* all necessary C-API work has been finished
* all anticipated work in the runtime has been finished

The implementation effort for :pep:`554` is being tracked as part of
a larger project aimed at improving multi-core support in CPython.
[multi-core-project]_
|
|
|
|
|
|
References
==========

.. [c-api]
   https://docs.python.org/3/c-api/init.html#sub-interpreter-support

.. [CSP]
   https://en.wikipedia.org/wiki/Communicating_sequential_processes
   https://github.com/futurecore/python-csp

.. [caveats]
   https://docs.python.org/3/c-api/init.html#bugs-and-caveats

.. [cryptography]
   https://github.com/pyca/cryptography/issues/2299

.. [global-gc]
   http://bugs.python.org/issue24554

.. [gilstate]
   https://bugs.python.org/issue10915
   http://bugs.python.org/issue15751

.. [global-atexit]
   https://bugs.python.org/issue6531

.. [bug-rate]
   https://mail.python.org/pipermail/python-ideas/2017-September/047094.html

.. [benefits]
   https://mail.python.org/pipermail/python-ideas/2017-September/047122.html

.. [reset_globals]
   https://mail.python.org/pipermail/python-dev/2017-September/149545.html

.. [async]
   https://mail.python.org/pipermail/python-dev/2017-September/149420.html
   https://mail.python.org/pipermail/python-dev/2017-September/149585.html

.. [result-object]
   https://mail.python.org/pipermail/python-dev/2017-September/149562.html

.. [jython]
   https://mail.python.org/pipermail/python-ideas/2017-May/045771.html

.. [multi-core-project]
   https://github.com/ericsnowcurrently/multi-core-python

.. [cache-line-ping-pong]
   https://mail.python.org/archives/list/python-dev@python.org/message/3HVRFWHDMWPNR367GXBILZ4JJAUQ2STZ/

.. _nathaniel-asyncio:
   https://mail.python.org/archives/list/python-dev@python.org/message/TUEAZNZHVJGGLL4OFD32OW6JJDKM6FAS/

* mp-conn
  https://docs.python.org/3/library/multiprocessing.html#connection-objects

* main-thread
  https://mail.python.org/pipermail/python-ideas/2017-September/047144.html
  https://mail.python.org/pipermail/python-dev/2017-September/149566.html

* petr-c-ext
  https://mail.python.org/pipermail/import-sig/2016-June/001062.html
  https://mail.python.org/pipermail/python-ideas/2016-April/039748.html

Copyright
=========

This document has been placed in the public domain.