PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow <ericsnowcurrently@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-554-multiple-interpreters-in-the-stdlib/24855
Status: Superseded
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Sep-2017
Python-Version: 3.13
Post-History: `07-Sep-2017 <https://mail.python.org/archives/list/python-ideas@python.org/thread/HQQWEE527HG3ILJVKQTXVSJIQO6NUSIA/>`__,
              `08-Sep-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/NBWMA6LVD22XOUYC5ZMPBFWDQOECRP77/>`__,
              `13-Sep-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/EG4FSFG5E3O22FTIUQOXMQ6X6B5X3DP7/>`__,
              `05-Dec-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/BCSRGAMCYB3NGXNU42U66J56XNZVMQP2/>`__,
              `04-May-2020 <https://mail.python.org/archives/list/python-dev@python.org/thread/X2KPCSRVBD2QD5GP5IMXXZTGZ46OXD3D/>`__,
              `14-Mar-2023 <https://discuss.python.org/t/pep-554-multiple-interpreters-in-the-stdlib/24855/2/>`__,
              `01-Nov-2023 <https://discuss.python.org/t/pep-554-multiple-interpreters-in-the-stdlib/24855/26/>`__,
Superseded-By: 734

.. note::
   This PEP effectively continues in a cleaner form in :pep:`734`.
   This PEP is kept as-is for the sake of the various sections of
   background information and deferred/rejected ideas that have
   been stripped from :pep:`734`.


Abstract
========

CPython has supported multiple interpreters in the same process (AKA
"subinterpreters") since version 1.5 (1997). The feature has been
available via the C-API. [c-api]_ Multiple interpreters operate in
`relative isolation from one another <Interpreter Isolation_>`_, which
facilitates novel alternative approaches to
`concurrency <Concurrency_>`_.

This proposal introduces the stdlib ``interpreters`` module. It exposes
the basic functionality of multiple interpreters already provided by the
C-API, along with basic support for communicating between interpreters.
This module is especially relevant since :pep:`684` introduced a
per-interpreter GIL in Python 3.12.


Proposal
========

Summary:

* add a new stdlib module: "interpreters"
* add concurrent.futures.InterpreterPoolExecutor
* help for extension module maintainers


The "interpreters" Module
-------------------------

The ``interpreters`` module will provide a high-level interface
to the multiple interpreter functionality, and wrap a new low-level
``_interpreters`` module (in the same way as the ``threading`` module
wraps ``_thread``).
See the `Examples`_ section for concrete usage and use cases.

Along with exposing the existing (in CPython) multiple interpreter
support, the module will also support a basic mechanism for
passing data between interpreters. That involves setting "shareable"
objects in the ``__main__`` module of a target subinterpreter. Some
such objects, like the file descriptors returned by ``os.pipe()``, may
be used to communicate further.
The module will also provide a minimal implementation of "channels"
as a demonstration of cross-interpreter communication.

Note that *objects* are not shared between interpreters since they are
tied to the interpreter in which they were created. Instead, the
objects' *data* is passed between interpreters. See the `Shared Data`_
and `API For Communication`_ sections for more details about
sharing/communicating between interpreters.


API summary for interpreters module
-----------------------------------

Here is a summary of the API for the ``interpreters`` module. For a
more in-depth explanation of the proposed classes and functions, see
the `"interpreters" Module API`_ section below.

For creating and using interpreters:

+----------------------------------+----------------------------------------------+
| signature                        | description                                  |
+==================================+==============================================+
| ``list_all() -> [Interpreter]``  | Get all existing interpreters.               |
+----------------------------------+----------------------------------------------+
| ``get_current() -> Interpreter`` | Get the currently running interpreter.       |
+----------------------------------+----------------------------------------------+
| ``get_main() -> Interpreter``    | Get the main interpreter.                    |
+----------------------------------+----------------------------------------------+
| ``create() -> Interpreter``      | Initialize a new (idle) Python interpreter.  |
+----------------------------------+----------------------------------------------+

|

+----------------------------------+---------------------------------------------------+
| signature                        | description                                       |
+==================================+===================================================+
| ``class Interpreter``            | A single interpreter.                             |
+----------------------------------+---------------------------------------------------+
| ``.id``                          | The interpreter's ID (read-only).                 |
+----------------------------------+---------------------------------------------------+
| ``.is_running() -> bool``        | Is the interpreter currently executing code?      |
+----------------------------------+---------------------------------------------------+
| ``.close()``                     | Finalize and destroy the interpreter.             |
+----------------------------------+---------------------------------------------------+
| ``.set_main_attrs(**kwargs)``    | Bind "shareable" objects in ``__main__``.         |
+----------------------------------+---------------------------------------------------+
| ``.get_main_attr(name)``         | Get a "shareable" object from ``__main__``.       |
+----------------------------------+---------------------------------------------------+
| ``.exec(src_str, /)``            | | Run the given source code in the interpreter    |
|                                  | | (in the current thread).                        |
+----------------------------------+---------------------------------------------------+

.. XXX Support blocking interp.exec() until the interpreter
   finishes its current work.

For communicating between interpreters:

+---------------------------------------------------------+--------------------------------------------+
| signature                                               | description                                |
+=========================================================+============================================+
| ``is_shareable(obj) -> bool``                           | | Can the object's data be passed          |
|                                                         | | between interpreters?                    |
+---------------------------------------------------------+--------------------------------------------+
| ``create_channel() -> (RecvChannel, SendChannel)``      | | Create a new channel for passing         |
|                                                         | | data between interpreters.               |
+---------------------------------------------------------+--------------------------------------------+


concurrent.futures.InterpreterPoolExecutor
------------------------------------------

An executor will be added that extends ``ThreadPoolExecutor`` to run
per-thread tasks in subinterpreters. Initially, the only supported
tasks will be whatever ``Interpreter.exec()`` takes (e.g. a ``str``
script). However, we may also support some functions, as well as
eventually a separate method for pickling the task and arguments,
to reduce friction (at the expense of performance
for short-running tasks).
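
To give a feel for the shape of such an executor, here is a minimal
sketch. It is not the proposed implementation: it submits ``str``
scripts to a plain ``ThreadPoolExecutor``, and the hypothetical helper
``run_script()`` uses ``exec()`` with a fresh namespace as a stand-in
for ``Interpreter.exec()``. A namespace dict provides none of the
isolation of a real interpreter, and the ``result`` variable is only a
convention of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def run_script(src):
    # Stand-in for Interpreter.exec(): run the source in a fresh namespace.
    # A real subinterpreter would isolate far more than just the globals.
    namespace = {"__name__": "__main__"}
    exec(src, namespace)
    # By convention (in this sketch only), a script leaves its answer
    # in a global named "result".
    return namespace.get("result")


scripts = ["result = 6 * 7", "result = sum(range(10))"]
with ThreadPoolExecutor(max_workers=2) as pool:
    # map() preserves the order of the submitted scripts.
    results = list(pool.map(run_script, scripts))
```

Each worker thread runs one script at a time, which mirrors the
"per-thread tasks" wording above.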


Help for Extension Module Maintainers
-------------------------------------

In practice, an extension that implements multi-phase init (:pep:`489`)
is considered isolated and thus compatible with multiple interpreters.
Otherwise it is "incompatible".

Many extension modules are still incompatible. The maintainers and
users of such extension modules will both benefit when they are updated
to support multiple interpreters. In the meantime, users may become
confused by failures when using multiple interpreters, which could
negatively impact extension maintainers. See `Concerns`_ below.

To mitigate that impact and accelerate compatibility, we will do the
following:

* be clear that extension modules are *not* required to support use in
  multiple interpreters
* raise ``ImportError`` when an incompatible module is imported
  in a subinterpreter
* provide resources (e.g. docs) to help maintainers reach compatibility
* reach out to the maintainers of Cython and of the most used extension
  modules (on PyPI) to get feedback and possibly provide assistance


Examples
========

Run isolated code in current OS thread
--------------------------------------

::

   import interpreters

   interp = interpreters.create()
   print('before')
   interp.exec('print("during")')
   print('after')

Run in a different thread
-------------------------

::

   import interpreters
   import threading

   interp = interpreters.create()
   def run():
       interp.exec('print("during")')
   t = threading.Thread(target=run)
   print('before')
   t.start()
   t.join()
   print('after')

Pre-populate an interpreter
---------------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   interp.exec(tw.dedent("""
       import some_lib
       import an_expensive_module
       some_lib.set_up()
       """))
   wait_for_request()
   interp.exec(tw.dedent("""
       some_lib.handle_request()
       """))

Handling an exception
---------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   try:
       interp.exec(tw.dedent("""
           raise KeyError
           """))
   except interpreters.RunFailedError as exc:
       print(f"got the error from the subinterpreter: {exc}")

Re-raising an exception
-----------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   try:
       try:
           interp.exec(tw.dedent("""
               raise KeyError
               """))
       except interpreters.RunFailedError as exc:
           raise exc.__cause__
   except KeyError:
       print("got a KeyError from the subinterpreter")

Note that this pattern is a candidate for later improvement.
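
The re-raise pattern relies on ordinary exception chaining. The sketch
below shows that mechanism in isolation, under stated assumptions:
``RunFailedError`` is defined locally as a hypothetical stand-in for
``interpreters.RunFailedError``, and the hypothetical helper
``run_isolated()`` uses ``exec()`` in place of ``Interpreter.exec()``.

```python
class RunFailedError(RuntimeError):
    """Hypothetical stand-in for interpreters.RunFailedError."""


def run_isolated(src):
    # Stand-in for Interpreter.exec(): wrap any failure in RunFailedError
    # and chain the original exception as __cause__.
    try:
        exec(src, {"__name__": "__main__"})
    except Exception as exc:
        raise RunFailedError(f"script failed: {exc!r}") from exc


caught = None
try:
    try:
        run_isolated("raise KeyError('spam')")
    except RunFailedError as exc:
        # __cause__ holds the exception given to "raise ... from ...".
        raise exc.__cause__
except KeyError as exc:
    caught = exc
```

After this runs, ``caught`` is the original ``KeyError`` instance, so
callers can handle it by type as they would any local exception.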

Interact with the __main__ namespace
------------------------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   interp.set_main_attrs(a=1, b=2)
   interp.exec(tw.dedent("""
       res = do_something(a, b)
       """))
   res = interp.get_main_attr('res')
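
The bind, run, read-back flow can be tried today with a plain
namespace dict. This is only an analogy: ``main_ns`` is a hypothetical
stand-in for the target interpreter's ``__main__`` namespace, and
``exec()`` stands in for ``Interpreter.exec()``, so nothing here is
actually isolated or copied.

```python
import textwrap as tw

# Stand-in for the target interpreter's __main__ namespace.
main_ns = {"__name__": "__main__"}

# Like set_main_attrs(a=1, b=2): bind names before the script runs.
main_ns.update(a=1, b=2)

exec(tw.dedent("""
    res = a + b
    """), main_ns)

# Like get_main_attr('res'): read a name back afterwards.
res = main_ns["res"]
```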

Synchronize using an OS pipe
----------------------------

::

   import interpreters
   import os
   import textwrap as tw
   import threading

   interp = interpreters.create()
   r1, s1 = os.pipe()
   r2, s2 = os.pipe()

   def task():
       interp.exec(tw.dedent(f"""
           import os
           os.read({r1}, 1)
           print('during B')
           os.write({s2}, b'x')
           """))

   t = threading.Thread(target=task)
   t.start()
   print('before')
   # os.write() needs a non-empty bytes payload to wake up the reader.
   os.write(s1, b'x')
   print('during A')
   os.read(r2, 1)
   print('after')
   t.join()
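
The same handshake can be exercised without the proposed module. In
the sketch below a regular thread stands in for the subinterpreter
(an assumption made so the code runs on any current Python), and the
``events`` list is a hypothetical replacement for the ``print()``
calls so the ordering can be checked.

```python
import os
import threading

r1, s1 = os.pipe()
r2, s2 = os.pipe()
events = []


def task():
    os.read(r1, 1)           # block until the main thread signals
    events.append("during B")
    os.write(s2, b"x")       # signal back; the payload must be non-empty bytes


t = threading.Thread(target=task)
t.start()
events.append("before")
os.write(s1, b"x")           # wake the worker
os.read(r2, 1)               # block until the worker has finished its step
events.append("after")
t.join()

for fd in (r1, s1, r2, s2):
    os.close(fd)
```

Each one-byte write releases exactly one blocked ``os.read()``, so the
recorded order is always "before", "during B", "after".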

Sharing a file descriptor
-------------------------

::

   import interpreters
   import textwrap as tw

   interp = interpreters.create()
   with open('spamspamspam') as infile:
       interp.set_main_attrs(fd=infile.fileno())
       interp.exec(tw.dedent(f"""
           import os
           for line in os.fdopen(fd):
               print(line)
           """))

Passing objects via pickle
--------------------------

::

   import interpreters
   import os
   import pickle
   import textwrap as tw

   interp = interpreters.create()
   r, s = os.pipe()
   interp.exec(tw.dedent(f"""
       import os
       import pickle
       reader = {r}
       """))
   # Raw string, so the exec'd source contains the text \x00, not a NUL byte.
   interp.exec(tw.dedent(r"""
       data = b''
       c = os.read(reader, 1)
       while c != b'\x00':
           while c != b'\x00':
               data += c
               c = os.read(reader, 1)
           obj = pickle.loads(data)
           do_something(obj)
           data = b''
           c = os.read(reader, 1)
       """))
   for obj in input:
       data = pickle.dumps(obj)
       os.write(s, data)
       os.write(s, b'\x00')
   os.write(s, b'\x00')
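
A NUL delimiter only works if the pickled bytes never contain
``b'\x00'``, which pickle does not guarantee. One common alternative,
sketched here as an assumption rather than as part of this proposal,
is to prefix each pickle with its length. A thread stands in for the
receiving interpreter, and the helper names (``read_exact``,
``send_obj``, ``recv_all``) are hypothetical.

```python
import os
import pickle
import struct
import threading


def read_exact(fd, n):
    # os.read() may return fewer bytes than requested, so loop until done.
    chunks = []
    while n:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def send_obj(fd, obj):
    data = pickle.dumps(obj)
    # 4-byte big-endian length header, followed by the pickle itself.
    os.write(fd, struct.pack("!I", len(data)) + data)


def recv_all(fd, out):
    while True:
        (size,) = struct.unpack("!I", read_exact(fd, 4))
        if size == 0:        # zero-length frame marks the end of the stream
            break
        out.append(pickle.loads(read_exact(fd, size)))


r, s = os.pipe()
received = []
reader = threading.Thread(target=recv_all, args=(r, received))
reader.start()
for obj in [1, "spam", {"eggs": b"\x00\x01"}]:
    send_obj(s, obj)
os.write(s, struct.pack("!I", 0))
reader.join()
os.close(r)
os.close(s)
```

A pickle is never empty, so a zero-length frame is safe to use as the
end-of-stream marker.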

Capturing an interpreter's stdout
---------------------------------

::

   import contextlib
   import interpreters
   import io
   import textwrap as tw

   interp = interpreters.create()
   stdout = io.StringIO()
   with contextlib.redirect_stdout(stdout):
       interp.exec(tw.dedent("""
           print('spam!')
           """))
   # print() appends a newline.
   assert stdout.getvalue() == 'spam!\n'

   # alternately:
   interp.exec(tw.dedent("""
       import contextlib, io
       stdout = io.StringIO()
       with contextlib.redirect_stdout(stdout):
           print('spam!')
       captured = stdout.getvalue()
       """))
   captured = interp.get_main_attr('captured')
   assert captured == 'spam!\n'

A pipe (``os.pipe()``) could be used similarly.

Running a module
----------------

::

   interp = interpreters.create()
   main_module = mod_name
   interp.exec(f'import runpy; runpy.run_module({main_module!r})')

Running as script (including zip archives & directories)
--------------------------------------------------------

::

   interp = interpreters.create()
   main_script = path_name
   interp.exec(f"import runpy; runpy.run_path({main_script!r})")
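
For reference, this is what the ``runpy`` call in the source string
does when run directly in the current interpreter. The script name
``hello.py`` and its contents are made up for this illustration.

```python
import os
import runpy
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "hello.py")   # hypothetical script
    with open(script, "w") as f:
        f.write("greeting = 'hello from ' + __name__\n")
    # run_path() executes the file and returns the resulting module globals.
    ns = runpy.run_path(script, run_name="__main__")

greeting = ns["greeting"]
```

Passing ``run_name="__main__"`` makes the script see itself as the
main module, as it would when launched from the command line.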

Using a channel to communicate
------------------------------

::

   import interpreters
   import textwrap as tw
   import threading

   tasks_recv, tasks = interpreters.create_channel()
   results, results_send = interpreters.create_channel()

   def worker():
       interp = interpreters.create()
       interp.set_main_attrs(tasks=tasks_recv, results=results_send)
       interp.exec(tw.dedent("""
           def handle_request(req):
               ...

           def capture_exception(exc):
               ...

           while True:
               try:
                   req = tasks.recv()
               except Exception:
                   # channel closed
                   break
               try:
                   res = handle_request(req)
               except Exception as exc:
                   res = capture_exception(exc)
               results.send_nowait(res)
           """))
   threads = [threading.Thread(target=worker) for _ in range(20)]
   for t in threads:
       t.start()

   requests = ...
   for req in requests:
       tasks.send(req)
   tasks.close()

   for t in threads:
       t.join()

Sharing a memoryview (imagine map-reduce)
-----------------------------------------

::

   import interpreters
   import textwrap as tw
   import threading

   data, chunksize = read_large_data_set()
   buf = memoryview(data)
   # Ceiling division, so a partial final chunk is counted.
   numchunks = (len(buf) + chunksize - 1) // chunksize
   # A bytearray keeps the results buffer writable.
   results = memoryview(bytearray(numchunks))

   tasks_recv, tasks = interpreters.create_channel()

   def worker():
       interp = interpreters.create()
       interp.set_main_attrs(data=buf, results=results, tasks=tasks_recv)
       interp.exec(tw.dedent("""
           while True:
               try:
                   req = tasks.recv()
               except Exception:
                   # channel closed
                   break
               resindex, start, end = req
               chunk = data[start: end]
               res = reduce_chunk(chunk)
               results[resindex] = res
           """))
   t = threading.Thread(target=worker)
   t.start()

   for i in range(numchunks):
       if not workers_running():
           raise ...
       start = i * chunksize
       end = start + chunksize
       if end > len(buf):
           end = len(buf)
       # Same field order as the worker unpacks: (resindex, start, end).
       tasks.send((i, start, end))
   tasks.close()
   t.join()

   use_results(results)
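
The chunk arithmetic is easy to get wrong, so here it is in isolation.
This is a sketch with a hypothetical helper, ``chunk_bounds()``, and a
toy reduction (a byte sum modulo 256); no interpreters or channels are
involved.

```python
def chunk_bounds(total, chunksize):
    # Ceiling division: a partial final chunk still counts as a chunk.
    numchunks = (total + chunksize - 1) // chunksize
    for i in range(numchunks):
        start = i * chunksize
        end = min(start + chunksize, total)
        yield i, start, end


data = bytes(range(10))
buf = memoryview(data)
bounds = list(chunk_bounds(len(buf), 4))

# memoryview(bytes) is read-only; a bytearray gives a writable buffer.
results = memoryview(bytearray(len(bounds)))
for resindex, start, end in bounds:
    # Toy reduction: sum of the chunk's bytes, kept within one byte.
    results[resindex] = sum(buf[start:end]) % 256
```

With ten bytes and a chunk size of four, the last chunk covers only
bytes 8 and 9, and each result slot holds a single unsigned byte.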


Rationale
=========

Running code in multiple interpreters provides a useful level of
isolation within the same process. This can be leveraged in a number
of ways. Furthermore, subinterpreters provide a well-defined framework
in which such isolation may be extended. (See :pep:`684`.)

Alyssa (Nick) Coghlan explained some of the benefits through a comparison with
multi-processing [benefits]_::

   [I] expect that communicating between subinterpreters is going
   to end up looking an awful lot like communicating between
   subprocesses via shared memory.

   The trade-off between the two models will then be that one still
   just looks like a single process from the point of view of the
   outside world, and hence doesn't place any extra demands on the
   underlying OS beyond those required to run CPython with a single
   interpreter, while the other gives much stricter isolation
   (including isolating C globals in extension modules), but also
   demands much more from the OS when it comes to its IPC
   capabilities.

   The security risk profiles of the two approaches will also be quite
   different, since using subinterpreters won't require deliberately
   poking holes in the process isolation that operating systems give
   you by default.

CPython has supported multiple interpreters, with increasing levels
of support, since version 1.5. While the feature has the potential
to be a powerful tool, it has suffered from neglect
because the multiple interpreter capabilities are not readily available
directly from Python. Exposing the existing functionality
in the stdlib will help reverse the situation.

This proposal is focused on enabling the fundamental capability of
multiple interpreters, isolated from each other,
in the same Python process. This is a
new area for Python so there is relative uncertainty about the best
tools to provide as companions to interpreters. Thus we minimize
the functionality we add in the proposal as much as possible.

Concerns
--------

* "subinterpreters are not worth the trouble"

  Some have argued that subinterpreters do not add sufficient benefit
  to justify making them an official part of Python. Adding features
  to the language (or stdlib) has a cost in increasing the size of
  the language. So an addition must pay for itself.

  In this case, multiple interpreter support provides a novel concurrency
  model focused on isolated threads of execution. Furthermore, it
  provides an opportunity for changes in CPython that will allow
  simultaneous use of multiple CPU cores (currently prevented
  by the GIL--see :pep:`684`).

  Alternatives to subinterpreters include threading, async, and
  multiprocessing. Threading is limited by the GIL and async isn't
  the right solution for every problem (nor for every person).
  Multiprocessing is likewise valuable in some but not all situations.
  Direct IPC (rather than via the multiprocessing module) provides
  similar benefits but with the same caveat.

  Notably, subinterpreters are not intended as a replacement for any of
  the above. Certainly they overlap in some areas, but the benefits of
  subinterpreters include isolation and (potentially) performance. In
  particular, subinterpreters provide a direct route to an alternate
  concurrency model (e.g. CSP) which has found success elsewhere and
  will appeal to some Python users. That is the core value that the
  ``interpreters`` module will provide.

* "stdlib support for multiple interpreters adds extra burden
  on C extension authors"

  In the `Interpreter Isolation`_ section below we identify ways in
  which isolation in CPython's subinterpreters is incomplete. Most
  notable is extension modules that use C globals to store internal
  state. (:pep:`3121` and :pep:`489` provide a solution to that problem,
  followed by some extra APIs that improve efficiency, e.g. :pep:`573`).

  Consequently, projects that publish extension modules may face an
  increased maintenance burden as their users start using subinterpreters,
  where their modules may break. This situation is limited to modules
  that use C globals (or use libraries that use C globals) to store
  internal state. For numpy, the reported-bug rate is one every 6
  months. [bug-rate]_

  Ultimately this comes down to a question of how often it will be a
  problem in practice: how many projects would be affected, how often
  their users will be affected, what the additional maintenance burden
  will be for projects, and what the overall benefit of subinterpreters
  is to offset those costs. The position of this PEP is that the actual
  extra maintenance burden will be small and well below the threshold at
  which subinterpreters are worth it.

* "creating a new concurrency API deserves much more thought and
  experimentation, so the new module shouldn't go into the stdlib
  right away, if ever"

  Introducing an API for a new concurrency model, as happened with
  asyncio, is an extremely large project that requires a lot of careful
  consideration. It is not something that can be done as simply as this
  PEP proposes and likely deserves significant time on PyPI to mature.
  (See `Nathaniel's post <nathaniel-asyncio_>`_ on python-dev.)

  However, this PEP does not propose any new concurrency API.
  At most it exposes minimal tools (e.g. subinterpreters, channels)
  which may be used to write code that follows patterns associated with
  (relatively) new-to-Python `concurrency models <Concurrency_>`_.
  Those tools could also be used as the basis for APIs for such
  concurrency models. Again, this PEP does not propose any such API.

* "there is no point to exposing subinterpreters if they still share
  the GIL"
* "the effort to make the GIL per-interpreter is disruptive and risky"

  A common misconception is that this PEP also includes a promise that
  interpreters will no longer share the GIL. When that is clarified,
  the next question is "what is the point?". This is already answered
  at length in this PEP. Just to be clear, the value lies in:

  * increasing exposure of the existing feature, which helps improve
    the code health of the entire CPython runtime
  * exposing the (mostly) isolated execution of interpreters
  * preparing for a per-interpreter GIL
  * encouraging experimentation

* "data sharing can have a negative impact on cache performance
  in multi-core scenarios"

  (See [cache-line-ping-pong]_.)

  This shouldn't be a problem for now as we have no immediate plans
  to actually share data between interpreters, instead focusing
  on copying.


About Subinterpreters
=====================

Concurrency
-----------

Concurrency is a challenging area of software development. Decades of
research and practice have led to a wide variety of concurrency models,
each with different goals. Most center on correctness and usability.

One class of concurrency models focuses on isolated threads of
execution that interoperate through some message passing scheme. A
notable example is Communicating Sequential Processes [CSP]_ (upon
which Go's concurrency is roughly based). The intended isolation
inherent to CPython's interpreters makes them well-suited
to this approach.

Shared Data
-----------

CPython's interpreters are inherently isolated (with caveats
explained below), in contrast to threads. So the same
communicate-via-shared-memory approach doesn't work. Without an
alternative, effective use of concurrency via multiple interpreters
is significantly limited.

The key challenge here is that sharing objects between interpreters
faces complexity due to various constraints on object ownership,
visibility, and mutability. At a conceptual level it's easier to
reason about concurrency when objects only exist in one interpreter
at a time. At a technical level, CPython's current memory model
limits how Python *objects* may be shared safely between interpreters;
effectively, objects are bound to the interpreter in which they were
created. Furthermore, the complexity of *object* sharing increases as
interpreters become more isolated, e.g. after GIL removal (though this
is mitigated somewhat for some "immortal" objects; see :pep:`683`).

Consequently, the mechanism for sharing needs to be carefully considered.
There are a number of valid solutions, several of which may be
appropriate to support in Python's stdlib and C-API. Any such solution
is likely to share many characteristics with the others.
|
|
|
|
|
|
|
|
In the meantime, we propose here a minimal solution
|
|
|
|
(``Interpreter.set_main_attrs()``), which sets some precedent for how
|
|
|
|
objects are shared. More importantly, it facilitates the introduction
|
|
|
|
of more advanced approaches later and allows them to coexist and cooperate.
|
|
|
|
In part to demonstrate that, we will provide a basic implementation of
|
|
|
|
"channels", as a somewhat more advanced sharing solution.
|
2017-12-05 21:16:00 -05:00
|
|
|
|
2023-11-01 19:06:47 -04:00
|
|
|
Separate proposals may cover:
|
|
|
|
|
|
|
|
* the addition of a public C-API based on the implementation
|
|
|
|
``Interpreter.set_main_attrs()``
|
|
|
|
* the addition of other sharing approaches to the "interpreters" module
|
2023-01-20 13:12:46 -05:00
|
|
|
|
|
|
|
The fundamental enabling feature for communication is that most objects
|
|
|
|
can be converted to some encoding of underlying raw data, which is safe
|
|
|
|
to be passed between interpreters. For example, an ``int`` object can
|
2023-11-01 19:06:47 -04:00
|
|
|
be turned into a C ``long`` value, sent to another interpreter, and
|
|
|
|
turned back into an ``int`` object there. As another example,
|
|
|
|
``None`` may be passed as-is.
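
The ``int`` round trip described above can be sketched in pure Python.
This is only an illustration, not how CPython performs the conversion:
the helper names ``int_to_raw`` and ``raw_to_int`` are hypothetical, and
a fixed-width signed 64-bit encoding stands in for a C ``long``.

```python
import struct


def int_to_raw(value: int) -> bytes:
    """Reduce an int to 8 raw bytes (signed 64-bit, little-endian)."""
    # struct.error is raised if the value does not fit in 64 bits.
    return struct.pack("<q", value)


def raw_to_int(raw: bytes) -> int:
    """Rebuild a brand-new int object from the raw bytes."""
    return struct.unpack("<q", raw)[0]


original = 42
raw = int_to_raw(original)   # interpreter-independent data
rebuilt = raw_to_int(raw)    # a new object on the "other side"
```

Only the raw bytes cross the boundary in this sketch; the receiving side
builds its own object and never touches the sender's.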

Regardless, the effort to determine the best way forward here is mostly
outside the scope of this PEP. In the meantime, this proposal describes
a basic interim solution using pipes (``os.pipe()``), as well as
providing a dedicated capability ("channels").
See `API For Communication`_ below.


Interpreter Isolation
---------------------

CPython's interpreters are intended to be strictly isolated from each
other. Each interpreter has its own copy of all modules, classes,
functions, and variables. The same applies to state in C, including in
extension modules. The CPython C-API docs explain more. [caveats]_

However, there are ways in which interpreters do share some state.
First of all, some process-global state remains shared:

* file descriptors
* low-level env vars
* process memory (though allocators *are* isolated)
* builtin types (e.g. dict, bytes)
* singletons (e.g. None)
* underlying static module data (e.g. functions) for
  builtin/extension/frozen modules

There are no plans to change this.

Second, some isolation is faulty due to bugs or implementations that did
not take subinterpreters into account. This includes things like
extension modules that rely on C globals. [cryptography]_ In these
cases bugs should be opened (some are already):

* readline module hook functions (http://bugs.python.org/issue4202)
* memory leaks on re-init (http://bugs.python.org/issue21387)

Finally, some potential isolation is missing due to the current design
of CPython. Work is currently under way to address gaps in this
area:

* extensions using the ``PyGILState_*`` API are somewhat incompatible [gilstate]_


Existing Usage
--------------

Multiple interpreter support has not been a widely used feature.
In fact, there have been only a handful of documented cases of
widespread usage, including
`mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_,
`OpenStack Ceph <https://github.com/ceph/ceph/pull/14971>`_, and
`JEP <https://github.com/ninia/jep>`_. On the one hand, these cases
provide confidence that existing multiple interpreter support is
relatively stable. On the other hand, there isn't much of a sample
size from which to judge the utility of the feature.



Alternate Python Implementations
================================

I've solicited feedback from various Python implementors about support
for subinterpreters. Each has indicated that they would be able to
support multiple interpreters in the same process (if they choose to)
without a lot of trouble. Here are the projects I contacted:

* jython ([jython]_)
* ironpython (personal correspondence)
* pypy (personal correspondence)
* micropython (personal correspondence)



.. _interpreters-list-all:
.. _interpreters-get-current:
.. _interpreters-create:
.. _interpreters-Interpreter:
.. _interpreters-is-shareable:

"interpreters" Module API
=========================

The module provides the following functions::

   list_all() -> [Interpreter]

      Return a list of all existing interpreters.

   get_current() -> Interpreter

      Return the currently running interpreter.

   get_main() -> Interpreter

      Return the main interpreter. If the Python implementation
      has no concept of a main interpreter then return None.

   create() -> Interpreter

      Initialize a new Python interpreter and return it.
      It will remain idle until something is run in it and always
      run in its own thread.

   is_shareable(obj) -> bool:

      Return True if the object may be "shared" between interpreters.
      This does not necessarily mean that the actual objects will be
      shared. Instead, it means that the objects' underlying data will
      be shared in a cross-interpreter way, whether via a proxy, a
      copy, or some other means.


The module also provides the following class::

   class Interpreter(id):

      id -> int:

         The interpreter's ID. (read-only)

      is_running() -> bool:

         Return whether or not the interpreter's "exec()" is currently
         executing code. Code running in subthreads is ignored.
         Calling this on the current interpreter will always return True.

      close():

         Finalize and destroy the interpreter.

         This may not be called on an already running interpreter.
         Doing so results in a RuntimeError.

      set_main_attrs(iterable_or_mapping, /):
      set_main_attrs(**kwargs):

         Set attributes in the interpreter's __main__ module
         corresponding to the given name-value pairs. Each value
         must be a "shareable" object and will be converted to a new
         object (e.g. copy, proxy) in whatever way that object's type
         defines. If an attribute with the same name is already set,
         it will be overwritten.

         This method is helpful for setting up an interpreter before
         calling exec().

      get_main_attr(name, default=None, /):

         Return the value of the corresponding attribute of the
         interpreter's __main__ module. If the attribute isn't set
         then the default is returned. If it is set, but the value
         isn't "shareable" then a ValueError is raised.

         This may be used to introspect the __main__ module, as well
         as a very basic mechanism for "returning" one or more results
         from Interpreter.exec().

      exec(source_str, /):

         Run the provided Python source code in the interpreter,
         in its __main__ module.

         This may not be called on an already running interpreter.
         Doing so results in a RuntimeError.

         An "interp.exec()" call is similar to a builtin exec() call
         (or to calling a function that returns None). Once
         "interp.exec()" completes, the code that called "exec()"
         continues executing (in the original interpreter). Likewise,
         if there is any uncaught exception then it effectively
         (see below) propagates into the code where ``interp.exec()``
         was called. Like exec() (and threads), but unlike function
         calls, there is no return value. If any "return" value from
         the code is needed, send the data out via a pipe (os.pipe())
         or channel or other cross-interpreter communication mechanism.

         The big difference from exec() or functions is that
         "interp.exec()" executes the code in an entirely different
         interpreter, with entirely separate state. The interpreters
         are completely isolated from each other, so the state of the
         original interpreter (including the code it was executing in
         the current OS thread) does not affect the state of the target
         interpreter (the one that will execute the code). Likewise,
         the target does not affect the original, nor any of its other
         threads.

         Instead, the state of the original interpreter (for this thread)
         is frozen, and the code it's executing completely blocks.
         At that point, the target interpreter is given control of the
         OS thread. Then, when it finishes executing, the original
         interpreter gets control back and continues executing.

         So calling "interp.exec()" will effectively cause the current
         Python thread to completely pause. Sometimes you won't want
         that pause, in which case you should make the "exec()" call in
         another thread. To do so, add a function that calls
         "interp.exec()" and then run that function in a normal
         "threading.Thread".

         Note that the interpreter's state is never reset, neither
         before "interp.exec()" executes the code nor after. Thus the
         interpreter state is preserved between calls to
         "interp.exec()". This includes "sys.modules", the "builtins"
         module, and the internal state of C extension modules.

         Also note that "interp.exec()" executes in the namespace of the
         "__main__" module, just like scripts, the REPL, "-m", and
         "-c". Just as the interpreter's state is not ever reset, the
         "__main__" module is never reset. You can imagine
         concatenating the code from each "interp.exec()" call into one
         long script. This is the same as how the REPL operates.

         Supported code: source text.
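
The point that ``__main__`` is never reset between ``interp.exec()``
calls can be illustrated with a toy model. ``ToyInterpreter`` below is a
hypothetical stand-in built on the builtin ``exec()`` and one persistent
dict; it provides no isolation at all and is not the proposed
implementation. It only mimics how state accumulates across calls.

```python
class ToyInterpreter:
    """Hypothetical model: one persistent __main__ namespace."""

    def __init__(self):
        # This dict plays the role of the interpreter's __main__ module.
        self._main = {"__name__": "__main__"}

    def exec(self, source):
        # Inside the method body, "exec" still names the builtin.
        exec(source, self._main)

    def get_main_attr(self, name, default=None):
        return self._main.get(name, default)


interp = ToyInterpreter()
interp.exec("count = 1")
interp.exec("count += 1")   # sees the name bound by the previous call
result = interp.get_main_attr("count")
```

Running the two snippets one after the other behaves like running them
as a single concatenated script, which is the REPL-like behavior the
``exec()`` description refers to.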

.. XXX Add "InterpreterAlreadyRunningError"?

In addition to the functionality of ``Interpreter.set_main_attrs()``,
the module provides a related way to pass data between interpreters:
channels. See `Channels`_ below.


Uncaught Exceptions
-------------------

Regarding uncaught exceptions in ``Interpreter.exec()``, we noted that
they are "effectively" propagated into the code where ``interp.exec()``
was called. To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its traceback
(see :class:`traceback.TracebackException`), set it as ``__cause__``
on a new ``interpreters.RunFailedError``, and raise that.

Directly raising (a proxy of) the exception is problematic since it's
harder to distinguish between an error in the ``interp.exec()`` call
and an uncaught exception from the subinterpreter.
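
A rough sketch of the surrogate idea follows, under loud assumptions:
the ``RunFailedError`` class and ``run_isolated()`` helper here are
hypothetical stand-ins, the code runs via the builtin ``exec()`` rather
than in another interpreter, and the snapshot is stored on an attribute
(named ``snapshot`` for illustration) because a
``traceback.TracebackException`` is not itself an exception and so
cannot literally be assigned to ``__cause__`` in pure Python.

```python
import traceback


class RunFailedError(RuntimeError):
    """Hypothetical stand-in for interpreters.RunFailedError."""

    def __init__(self, snapshot):
        summary = "".join(snapshot.format_exception_only()).strip()
        super().__init__(summary)
        # A plain-data description of the failure, not the exception.
        self.snapshot = snapshot


def run_isolated(source):
    """Run source; report failures without leaking the live exception."""
    namespace = {"__name__": "__main__"}
    snapshot = None
    try:
        exec(source, namespace)
    except Exception as exc:
        snapshot = traceback.TracebackException.from_exception(exc)
    # Raise outside the handler so the original exception object is not
    # attached as __context__ of the new error.
    if snapshot is not None:
        raise RunFailedError(snapshot)


try:
    run_isolated("1/0")
except RunFailedError as err:
    message = str(err)
```

The caller only ever sees ``RunFailedError``, so an error raised by the
call machinery itself stays distinguishable from a failure in the
executed code.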


Interpreter Restrictions
========================

Every new interpreter created by ``interpreters.create()``
now has specific restrictions on any code it runs. This includes the
following:

* importing an extension module fails if it does not implement
  multi-phase init
* daemon threads may not be created
* ``os.fork()`` is not allowed (so no ``multiprocessing``)
* ``os.exec*()`` is not allowed
  (but "fork+exec", a la ``subprocess`` is okay)

Note that interpreters created with the existing C-API do not have these
restrictions. The same is true for the "main" interpreter, so
existing use of Python will not change.

.. XXX Mention the similar restrictions in PEP 684?

We may choose to later loosen some of the above restrictions or provide
a way to enable/disable granular restrictions individually. Regardless,
requiring multi-phase init from extension modules will always be a
default restriction.



API For Communication
=====================

As discussed in `Shared Data`_ above, multiple interpreter support
is less useful without a mechanism for sharing data (communicating)
between them. Sharing actual Python objects between interpreters,
however, has enough potential problems that we are avoiding support
for that in this proposal. Nor, as mentioned earlier, are we adding
anything more than a basic mechanism for communication.

That mechanism is the ``Interpreter.set_main_attrs()`` method.
It may be used to set up global variables before ``Interpreter.exec()``
is called. The name-value pairs passed to ``set_main_attrs()`` are
bound as attributes of the interpreter's ``__main__`` module.
The values must be "shareable". See `Shareable Types`_ below.

Additional approaches to communicating and sharing objects are enabled
through ``Interpreter.set_main_attrs()``. A shareable object could be
implemented which works like a queue, but with cross-interpreter safety.
In fact, this PEP does include an example of such an approach: channels.


Shareable Types
---------------

An object is "shareable" if its type supports shareable instances.
The type must implement a new internal protocol, which is used to
convert an object to interpreter-independent data and then convert
it back to an object on the other side. Also see
`is_shareable() <interpreters-is-shareable_>`_ above.

A minimal set of simple, immutable builtin types will be supported
initially, including:

* ``None``
* ``bool``
* ``bytes``
* ``str``
* ``int``
* ``float``

We will also support a small number of complex types initially:

* ``memoryview``, to allow sharing :pep:`3118` buffers
* `channels <Channels_>`_

Further builtin types may be supported later, complex or not.
Limiting the initial shareable types is a practical matter, reducing
the potential complexity of the initial implementation. There are a
number of strategies we may pursue in the future to expand supported
objects, once we have more experience with interpreter isolation.

In the meantime, a separate proposal will discuss making the internal
protocol (and C-API) used by ``Interpreter.set_main_attrs()`` public.
With that protocol, support for other types could be added
by extension modules.


Communicating Through OS Pipes
''''''''''''''''''''''''''''''

Even without a dedicated object for communication, users may already
use existing tools. For example, one basic approach for sending data
between interpreters is to use a pipe (see ``os.pipe()``):

1. interpreter A calls ``os.pipe()`` to get a read/write pair
   of file descriptors (both ``int`` objects)
2. interpreter A calls ``interp.set_main_attrs()``, binding the read FD
   (or embeds it using string formatting)
3. interpreter A calls ``interp.exec()`` on interpreter B
4. interpreter A writes some bytes to the write FD
5. interpreter B reads those bytes

Several of the earlier examples demonstrate this, such as
`Synchronize using an OS pipe`_.
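
The pipe mechanics in the steps above can be tried out with only the
standard library. In this sketch a plain thread plays the role of
interpreter B (an assumption made so the example runs without the
proposed module), and the read FD is handed over as an ordinary ``int``
argument, much as ``set_main_attrs()`` would bind it.

```python
import os
import threading


def interpreter_b(read_fd, received):
    """Stand-in for the code interpreter B would run."""
    with os.fdopen(read_fd, "rb") as reader:
        # read() returns once the write end has been closed.
        received.append(reader.read())


read_fd, write_fd = os.pipe()          # both are plain int objects
received = []

worker = threading.Thread(target=interpreter_b, args=(read_fd, received))
worker.start()

with os.fdopen(write_fd, "wb") as writer:
    writer.write(b"hello")             # "interpreter A" sends raw bytes

worker.join()
```

Only an integer file descriptor and raw bytes cross the boundary here,
which is why this approach needs no object sharing at all.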

.. _interpreters-create-channel:
.. _interpreters-RecvChannel:
.. _interpreters-SendChannel:

Channels
--------

The ``interpreters`` module will include a dedicated solution for
passing object data between interpreters: channels. They are included
in the module in part to provide an easier mechanism than using
``os.pipe()`` and in part to demonstrate how libraries may take
advantage of ``Interpreter.set_main_attrs()``
and the protocol it uses.

A channel is a simplex FIFO. It is a basic, opt-in data sharing
mechanism that draws inspiration from pipes, queues, and CSP's
channels. [fifo]_ The main difference from pipes is that channels can
be associated with zero or more interpreters on either end. Like
queues, which are also many-to-many, channels are buffered (though
they also offer methods with unbuffered semantics).

Channels have two operations: send and receive. A key characteristic
of those operations is that channels transmit data derived from Python
objects rather than the objects themselves. When objects are sent,
their data is extracted. When the "object" is received in the other
interpreter, the data is converted back into an object owned by that
interpreter.

To make this work, the mutable shared state will be managed by the
Python runtime, not by any of the interpreters. Initially we will
support only one type of object for shared state: the channels provided
by ``interpreters.create_channel()``. Channels, in turn, will carefully
manage passing objects between interpreters.

This approach, including keeping the API minimal, helps us avoid further
exposing any underlying complexity to Python users.

The ``interpreters`` module provides the following function related
to channels::

   create_channel() -> (RecvChannel, SendChannel):

      Create a new channel and return (recv, send), the RecvChannel
      and SendChannel corresponding to the ends of the channel.

      Both ends of the channel are supported "shared" objects (i.e.
      may be safely shared by different interpreters). Thus they
      may be set using "Interpreter.set_main_attrs()".

The module also provides the following channel-related classes::

   class RecvChannel(id):

      The receiving end of a channel. An interpreter may use this to
      receive objects from another interpreter. Any type supported by
      Interpreter.set_main_attrs() will be supported here, though at
      first only a few of the simple, immutable builtin types
      will be supported.

      id -> int:

         The channel's unique ID. The "send" end has the same one.

      recv(*, timeout=None):

         Return the next object from the channel. If none have been
         sent then wait until the next send (or until the timeout is hit).

         At the least, the object will be equivalent to the sent object.
         That will almost always mean the same type with the same data,
         though it could also be a compatible proxy. Regardless, it may
         use a copy of that data or actually share the data. That's up
         to the object's type.

      recv_nowait(default=None):

         Return the next object from the channel. If none have been
         sent then return the default. Otherwise, this is the same
         as the "recv()" method.


   class SendChannel(id):

      The sending end of a channel. An interpreter may use this to
      send objects to another interpreter. Any type supported by
      Interpreter.set_main_attrs() will be supported here, though
      at first only a few of the simple, immutable builtin types
      will be supported.

      id -> int:

         The channel's unique ID. The "recv" end has the same one.

      send(obj, *, timeout=None):

         Send the object (i.e. its data) to the "recv" end of the
         channel. Wait until the object is received. If the object
         is not shareable then ValueError is raised.

         The builtin memoryview is supported, so sending a buffer
         across involves first wrapping the object in a memoryview
         and then sending that.

      send_nowait(obj):

         Send the object to the "recv" end of the channel. This
         behaves the same as "send()", except for the waiting part.
         If no interpreter is currently receiving (waiting on the
         other end) then queue the object and return False. Otherwise
         return True.
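
To make the shape of this API concrete, here is a toy, single-process
model of a channel built on ``queue.Queue``. It is a sketch under
stated assumptions, not the proposed implementation: it does not cross
interpreter boundaries, it omits the blocking ``send()`` and the ``id``
attribute, its ``send_nowait()`` does not report whether a receiver was
waiting, and the ``_SHAREABLE`` tuple is a hypothetical stand-in for
the real shareability protocol.

```python
import queue

# Hypothetical stand-in for the initial set of simple shareable types.
_SHAREABLE = (type(None), bool, bytes, str, int, float)


class RecvChannel:
    """Toy receiving end of a simplex FIFO."""

    def __init__(self, items):
        self._items = items

    def recv(self, *, timeout=None):
        # Blocks until an item arrives; queue.Empty on timeout.
        return self._items.get(timeout=timeout)

    def recv_nowait(self, default=None):
        try:
            return self._items.get_nowait()
        except queue.Empty:
            return default


class SendChannel:
    """Toy sending end of a simplex FIFO."""

    def __init__(self, items):
        self._items = items

    def send_nowait(self, obj):
        if type(obj) not in _SHAREABLE:
            raise ValueError(f"{type(obj).__name__} is not shareable")
        self._items.put_nowait(obj)


def create_channel():
    """Return (recv, send), two ends backed by one buffer."""
    items = queue.Queue()
    return RecvChannel(items), SendChannel(items)


recv, send = create_channel()
send.send_nowait("spam")
first = recv.recv()
second = recv.recv_nowait("nothing queued")
```

Each end exposes only one direction, which is the simplex property
described in the Channels section.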

Caveats For Shared Objects
--------------------------

Again, Python objects are not shared between interpreters.
However, in some cases the data those objects wrap is actually shared
and not just copied. One example might be :pep:`3118` buffers.

In those cases the object in the original interpreter is kept alive
until the shared data in the other interpreter is no longer used.
Then object destruction can happen like normal in the original
interpreter, along with the previously shared data.
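
The difference between sharing data and copying it can be observed with
an ordinary ``memoryview`` inside a single interpreter. This is offered
only as an analogy for the cross-interpreter case: it shows that a
buffer view exposes the owner's memory directly and that the owner is
constrained while the view is in use.

```python
buf = bytearray(b"spam")
view = memoryview(buf)      # shares buf's memory; no copy is made

view[0] = ord("S")          # a write through the view...
shared = bytes(buf)         # ...is visible through the owning object

try:
    buf.append(33)          # resizing is refused while the view exists
    resized_while_shared = True
except BufferError:
    resized_while_shared = False

view.release()              # once the view is no longer in use...
buf.append(33)              # ...the owner can be changed freely again
```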


Documentation
=============

The new stdlib docs page for the ``interpreters`` module will include
the following:

* (at the top) a clear note that support for multiple interpreters
  is not required from extension modules
* some explanation about what subinterpreters are
* brief examples of how to use multiple interpreters
  (and communicating between them)
* a summary of the limitations of using multiple interpreters
* (for extension maintainers) a link to the resources for ensuring
  multiple interpreters compatibility
* much of the API information in this PEP

Docs about resources for extension maintainers already exist on the
`Isolating Extension Modules <isolation-howto_>`_ howto page. Any
extra help will be added there. For example, it may prove helpful
to discuss strategies for dealing with linked libraries that keep
their own subinterpreter-incompatible global state.

.. _isolation-howto:
   https://docs.python.org/3/howto/isolating-extensions.html

Note that the documentation will play a large part in mitigating any
negative impact that the new ``interpreters`` module might have on
extension module maintainers.

Also, the ``ImportError`` for incompatible extension modules will be
updated to clearly say it is due to missing multiple interpreters
compatibility and that extensions are not required to provide it. This
will help set user expectations properly.


Alternative Solutions
=====================

One possible alternative to a new module is to add support for interpreters
to ``concurrent.futures``. There are several reasons why that wouldn't work:

* the obvious place to look for multiple interpreters support
  is an "interpreters" module, much as with "threading", etc.
* ``concurrent.futures`` is all about executing functions
  but currently we don't have a good way to run a function
  from one interpreter in another

Similar reasoning applies for support in the ``multiprocessing`` module.



Open Questions
==============

* will it be too confusing that ``interp.exec()`` runs in the current thread?
* should we add pickling fallbacks right now for ``interp.exec()``, and/or
  ``Interpreter.set_main_attrs()`` and ``Interpreter.get_main_attr()``?
* should we support (limited) functions in ``interp.exec()`` right now?
* rename ``Interpreter.close()`` to ``Interpreter.destroy()``?
* drop ``Interpreter.get_main_attr()``, since we have channels?
* should channels be its own PEP?



Deferred Functionality
======================

In the interest of keeping this proposal minimal, the following
functionality has been left out for future consideration. Note that
this is not a judgement against any of said capabilities, but rather a
deferment. That said, each is arguably valid.

Add convenience API
-------------------

There are a number of things I can imagine would smooth out
*hypothetical* rough edges with the new module:

* add something like ``Interpreter.run()`` or ``Interpreter.call()``
  that calls ``interp.exec()`` and falls back to pickle
* fall back to pickle in ``Interpreter.set_main_attrs()``
  and ``Interpreter.get_main_attr()``

These would be easy to do if this proves to be a pain point.

Avoid possible confusion about interpreters running in the current thread
-------------------------------------------------------------------------

One regular point of confusion has been that ``Interpreter.exec()``
executes in the current OS thread, temporarily blocking the current
Python thread. It may be worth doing something to avoid that confusion.

Some possible solutions for this hypothetical problem:

* by default, run in a new thread?
* add ``Interpreter.exec_in_thread()``?
* add ``Interpreter.exec_in_current_thread()``?

In earlier versions of this PEP the method was ``interp.run()``.
The simple change to ``interp.exec()`` alone will probably reduce
confusion sufficiently, when coupled with educating users via
the docs. If it turns out to be a real problem, we can pursue
one of the alternatives at that point.


Clarify "running" vs. "has threads"
-----------------------------------

``Interpreter.is_running()`` refers specifically to whether or not
``Interpreter.exec()`` (or similar) is running somewhere. It does not
say anything about whether the interpreter has any subthreads running. That
information might be helpful.

Some things we could do:

* rename ``Interpreter.is_running()`` to ``Interpreter.is_running_main()``
* add ``Interpreter.has_threads()``, to complement ``Interpreter.is_running()``
* expand to ``Interpreter.is_running(main=True, threads=False)``

None of these are urgent and any could be done later, if desired.

A Dunder Method For Sharing
---------------------------

We could add a special method, like ``__xid__`` to correspond to ``tp_xid``.
At the very least, it would allow Python types to convert their instances
to some other type that implements ``tp_xid``.

The problem is that exposing this capability to Python code presents
a degree of complexity that hasn't been explored yet, nor is there
a compelling case to investigate that complexity.

Interpreter.call()
|
|
|
|
------------------
|
|
|
|
|
|
|
|
It would be convenient to run existing functions in subinterpreters
|
2023-11-01 19:06:47 -04:00
|
|
|
directly. ``Interpreter.exec()`` could be adjusted to support this or
|
2017-09-12 15:31:24 -04:00
|
|
|
a ``call()`` method could be added::
|
|
|
|
|
|
|
|
Interpreter.call(f, *args, **kwargs)
|
|
|
|
|
|
|
|
This suffers from the same problem as sharing objects between
|
|
|
|
interpreters via queues. The minimal solution (running a source string)
|
|
|
|
is sufficient for us to get the feature out where it can be explored.
|
|
|
|
|
|
|
|
Interpreter.run_in_thread()
|
|
|
|
---------------------------
|
|
|
|
|
2023-11-01 19:06:47 -04:00
|
|
|
This method would make a ``interp.exec()`` call for you in a thread.
|
|
|
|
Doing this using only ``threading.Thread`` and ``interp.exec()`` is
|
|
|
|
relatively trivial so we've left it out.
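As a rough illustration of how little code is involved, here is a minimal
sketch. ``StandInInterpreter`` and ``run_in_thread()`` are hypothetical names
invented for this example. The stand-in merely runs the source in a private
namespace of the current interpreter, so it shows only the threading
boilerplate, not real isolation.

```python
import threading


class StandInInterpreter:
    """Hypothetical stand-in: runs source in a private namespace."""

    def __init__(self):
        self.namespace = {"__name__": "__main__"}

    def exec(self, source):
        # A real Interpreter.exec() would run the code in another interpreter.
        exec(source, self.namespace)


def run_in_thread(interp, source):
    """Start interp.exec(source) in a new thread and return the thread."""
    thread = threading.Thread(target=interp.exec, args=(source,))
    thread.start()
    return thread


interp = StandInInterpreter()
t = run_in_thread(interp, "answer = 6 * 7")
t.join()
print(interp.namespace["answer"])
```

The caller keeps the ``threading.Thread`` object, so joining and naming the
thread stay under the caller's control.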
|
2017-09-12 15:31:24 -04:00
|
|
|
|
|
|
|
Synchronization Primitives
--------------------------

The ``threading`` module provides a number of synchronization primitives
for coordinating concurrent operations. This is especially necessary
due to the shared-state nature of threading. In contrast,
interpreters do not share state. Data sharing is restricted to the
runtime's shareable objects capability, which does away with the need
for explicit synchronization. If any sort of opt-in shared state
support is added to CPython's interpreters in the future, that same
effort can introduce synchronization primitives to meet that need.

CSP Library
-----------

A ``csp`` module would not be a large step away from the functionality
provided by this PEP. However, adding such a module is outside the
minimalist goals of this proposal.

Syntactic Support
-----------------

The ``Go`` language provides a concurrency model based on CSP,
so it's similar to the concurrency model that multiple interpreters
support. However, ``Go`` also provides syntactic support, as well as
several builtin concurrency primitives, to make concurrency a
first-class feature. Conceivably, similar syntactic (and builtin)
support could be added to Python using interpreters. However,
that is *way* outside the scope of this PEP!

Multiprocessing
---------------

The ``multiprocessing`` module could support interpreters in the same
way it supports threads and processes. In fact, the module's
maintainer, Davin Potts, has indicated this is a reasonable feature
request. However, it is outside the narrow scope of this PEP.

C-extension opt-in/opt-out
--------------------------

By using the ``PyModuleDef_Slot`` introduced by :pep:`489`, we could
easily add a mechanism by which C-extension modules could opt out of
multiple interpreter support. Then the import machinery, when operating
in a subinterpreter, would need to check the module for support.
It would raise an ImportError if unsupported.

Alternately we could support opting in to multiple interpreters support.
However, that would probably exclude many more modules (unnecessarily)
than the opt-out approach. Also, note that :pep:`489` defined that an
extension's use of the PEP's machinery implies multiple interpreters
support.

The scope of adding the ModuleDef slot and fixing up the import
machinery is non-trivial, but could be worth it. It all depends on
how many extension modules break under subinterpreters. Given that
there are relatively few cases we know of through mod_wsgi, we can
leave this for later.

Poisoning channels
------------------

CSP has the concept of poisoning a channel. Once a channel has been
poisoned, any ``send()`` or ``recv()`` call on it would raise a special
exception, effectively ending execution in the interpreter that tried
to use the poisoned channel.

This could be accomplished by adding a ``poison()`` method to both ends
of the channel. The ``close()`` method can be used in this way
(mostly), but these semantics are relatively specialized and can wait.

Resetting __main__
------------------

As proposed, every call to ``Interpreter.exec()`` will execute in the
namespace of the interpreter's existing ``__main__`` module. This means
that data persists there between ``interp.exec()`` calls. Sometimes
this isn't desirable and you want to execute in a fresh ``__main__``.
Also, you don't necessarily want to leak objects there that you aren't
using any more.

Note that the following won't work right because it will clear too much
(e.g. ``__name__`` and the other "__dunder__" attributes)::

   interp.exec('globals().clear()')

Possible solutions include:

* a ``create()`` arg to indicate resetting ``__main__`` after each
  ``interp.exec()`` call
* an ``Interpreter.reset_main`` flag to support opting in or out
  after the fact
* an ``Interpreter.reset_main()`` method to opt in when desired
* ``importlib.util.reset_globals()`` [reset_globals]_
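To make the difference from a blanket ``globals().clear()`` concrete, the
following sketch shows one way a reset could spare the dunder attributes.
``reset_namespace()`` is a hypothetical helper written for this example, not
one of the solutions listed above. A plain ``types.ModuleType`` object stands
in for an interpreter's ``__main__``.

```python
import types


def reset_namespace(ns):
    """Remove everything except the module's dunder attributes."""
    for name in list(ns):
        if not (name.startswith("__") and name.endswith("__")):
            del ns[name]


# A stand-in for an interpreter's __main__ module.
main = types.ModuleType("__main__")
exec("x = 1\ny = [x, 2]", main.__dict__)

reset_namespace(main.__dict__)
print("x" in main.__dict__, main.__name__)

# By contrast, a blanket clear() also drops __name__ and friends.
scratch = types.ModuleType("__main__")
scratch.__dict__.clear()
print("__name__" in scratch.__dict__)
```

Note that this only touches the one namespace. State held by other modules
is unaffected.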
|
|
|
|
|
2019-03-26 14:39:43 -04:00
|
|
|
Also note that resetting ``__main__`` does nothing about state stored
in other modules. So any solution would have to be clear about the
scope of what is being reset. Conceivably we could invent a mechanism
by which any (or every) module could be reset, unlike ``reload()``
which does not clear the module before loading into it.

Regardless, since ``__main__`` is the execution namespace of the
interpreter, resetting it has a much more direct correlation to
interpreters and their dynamic state than does resetting other modules.
So a more generic module reset mechanism may prove unnecessary.

This isn't a critical feature initially. It can wait until later
if desirable.

Resetting an interpreter's state
--------------------------------

It may be nice to re-use an existing subinterpreter instead of
spinning up a new one. Since an interpreter has substantially more
state than just the ``__main__`` module, it isn't so easy to put an
interpreter back into a pristine/fresh state. In fact, there *may*
be parts of the state that cannot be reset from Python code.

A possible solution is to add an ``Interpreter.reset()`` method. This
would put the interpreter back into the state it was in when newly
created. If called on a running interpreter it would fail (hence the
main interpreter could never be reset). This would likely be more
efficient than creating a new interpreter, though that depends on
what optimizations will be made later to interpreter creation.

While this would potentially provide functionality that is not
otherwise available from Python code, it isn't a fundamental
functionality. So in the spirit of minimalism here, this can wait.
Regardless, I doubt it would be controversial to add it post-PEP.

Copy an existing interpreter's state
------------------------------------

Relatedly, it may be useful to support creating a new interpreter
based on an existing one, e.g. ``Interpreter.copy()``. This ties
into the idea that a snapshot could be made of an interpreter's memory,
which would make starting up CPython, or creating new interpreters,
faster in general. The same mechanism could be used for a
hypothetical ``Interpreter.reset()``, as described previously.

Shareable file descriptors and sockets
--------------------------------------

Given that file descriptors and sockets are process-global resources,
making them shareable is a reasonable idea. They would be a good
candidate for the first effort at expanding the supported shareable
types. They aren't strictly necessary for the initial API.

Integration with async
----------------------

Per Antoine Pitrou [async]_::

   Has any thought been given to how FIFOs could integrate with async
   code driven by an event loop (e.g. asyncio)? I think the model of
   executing several asyncio (or Tornado) applications each in their
   own subinterpreter may prove quite interesting to reconcile
   multi-core concurrency with ease of programming. That would require
   the FIFOs to be able to synchronize on something an event loop can
   wait on (probably a file descriptor?).

The basic functionality of multiple interpreters support does not depend
on async, so async integration can be added later.

A possible solution is to provide async implementations of the blocking
channel methods (``recv()`` and ``send()``).

Alternately, "readiness callbacks" could be used to simplify use in
async scenarios. This would mean adding an optional ``callback``
(kw-only) parameter to the ``recv_nowait()`` and ``send_nowait()``
channel methods. The callback would be called once the object was
received or sent (respectively).

(Note that making channels buffered makes readiness callbacks less
important.)
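The first option (async versions of the blocking methods) could look roughly
like the sketch below. ``StandInChannel`` and ``recv_async()`` are
hypothetical names for this example. The stand-in is backed by
``queue.Queue`` within a single interpreter, and the blocking ``recv()`` is
simply pushed onto a worker thread with ``asyncio.to_thread()``. That is one
possible technique, not necessarily how a real implementation would wait.

```python
import asyncio
import queue


class StandInChannel:
    """Hypothetical stand-in for a channel, backed by queue.Queue."""

    def __init__(self):
        self._items = queue.Queue()

    def send(self, obj):
        self._items.put(obj)

    def recv(self):
        # Blocks the calling thread until an object is available.
        return self._items.get()

    async def recv_async(self):
        # Run the blocking recv() in a worker thread so the event loop
        # stays free to schedule other tasks in the meantime.
        return await asyncio.to_thread(self.recv)


async def main():
    channel = StandInChannel()
    # Schedule a send shortly after the receiver starts waiting.
    asyncio.get_running_loop().call_later(0.01, channel.send, "spam")
    return await channel.recv_async()


result = asyncio.run(main())
print(result)
```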
|
|
|
|
|
|
|
|
Support for iteration
---------------------

Supporting iteration on ``RecvChannel`` (via ``__iter__()`` or
``__next__()``) may be useful. A trivial implementation would use the
``recv()`` method, similar to how files do iteration. Since this isn't
a fundamental capability and has a simple analog, adding iteration
support can wait until later.
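A minimal sketch of such a ``recv()``-based iterator follows. The class,
the sentinel and ``ChannelClosedError`` are all hypothetical stand-ins built
on ``queue.Queue``. How a real ``RecvChannel`` would signal the end of the
stream is an assumption made for this example.

```python
import queue


class ChannelClosedError(Exception):
    """Hypothetical error raised by recv() on a closed, empty channel."""


_CLOSED = object()  # sentinel marking the end of the stream


class StandInRecvChannel:
    def __init__(self):
        self._items = queue.Queue()

    # These two methods stand in for the sending end.
    def send(self, obj):
        self._items.put(obj)

    def close(self):
        self._items.put(_CLOSED)

    def recv(self):
        obj = self._items.get()
        if obj is _CLOSED:
            self._items.put(_CLOSED)  # keep the channel closed for later calls
            raise ChannelClosedError
        return obj

    def __iter__(self):
        return self

    def __next__(self):
        # Iteration is just repeated recv() until the channel is closed.
        try:
            return self.recv()
        except ChannelClosedError:
            raise StopIteration from None


ch = StandInRecvChannel()
for value in (1, 2, 3):
    ch.send(value)
ch.close()

received = list(ch)
print(received)
```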
|
2017-09-22 19:51:38 -04:00
|
|
|
|
2023-11-01 19:06:47 -04:00
|
|
|
Channel context managers
------------------------

Context manager support on ``RecvChannel`` and ``SendChannel`` may be
helpful. The implementation would be simple, wrapping a call to
``close()`` (or maybe ``release()``) like files do. As with iteration,
this can wait.
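The wrapping described above amounts to a few lines. In this sketch
``StandInSendChannel`` is a hypothetical class that only records what was
sent and whether ``close()`` ran. It is not the proposed ``SendChannel``.

```python
class StandInSendChannel:
    """Hypothetical stand-in that only tracks whether it was closed."""

    def __init__(self):
        self.closed = False
        self.sent = []

    def send(self, obj):
        if self.closed:
            raise ValueError("channel is closed")
        self.sent.append(obj)

    def close(self):
        self.closed = True

    # Context manager support simply wraps close().
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


with StandInSendChannel() as ch:
    ch.send("spam")

print(ch.closed)
```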
|
2017-09-22 19:51:38 -04:00
|
|
|
|
|
|
|
Pipes and Queues
----------------

With the proposed object passing mechanism of "os.pipe()", other similar
basic types aren't strictly required to achieve the minimal useful
functionality of multiple interpreters. Such types include pipes
(like unbuffered channels, but one-to-one) and queues (like channels,
but more generic). See below in `Rejected Ideas`_ for more information.

Even though these types aren't part of this proposal, they may still
be useful in the context of concurrency. Adding them later is entirely
reasonable. They could be trivially implemented as wrappers around
channels. Alternatively they could be implemented for efficiency at the
same low level as channels.

Return a lock from send()
-------------------------

When sending an object through a channel, you don't have a way of knowing
when the object gets received on the other end. One way to work around
this is to return a locked ``threading.Lock`` from ``SendChannel.send()``
that unlocks once the object is received.

Alternately, the proposed ``SendChannel.send()`` (blocking) and
``SendChannel.send_nowait()`` provide an explicit distinction that is
less likely to confuse users.

Note that returning a lock would matter for buffered channels
(i.e. queues). For unbuffered channels it is a non-issue.
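The lock-returning variant might behave as in the sketch below, which uses a
hypothetical ``StandInChannel`` built on ``queue.Queue`` inside one
interpreter. It relies on the fact that a ``threading.Lock`` may be released
by a thread other than the one that acquired it.

```python
import queue
import threading


class StandInChannel:
    """Hypothetical buffered channel whose send() returns a lock."""

    def __init__(self):
        self._items = queue.Queue()

    def send(self, obj):
        received = threading.Lock()
        received.acquire()  # stays locked until the object is received
        self._items.put((obj, received))
        return received

    def recv(self):
        obj, received = self._items.get()
        received.release()  # signal the sender
        return obj


channel = StandInChannel()
lock = channel.send("spam")
locked_before = lock.locked()

receiver = threading.Thread(target=channel.recv)
receiver.start()
# Block until the receiver has taken the object (with a safety timeout).
acquired = lock.acquire(timeout=5)
receiver.join()
print(locked_before, acquired)
```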
|
|
|
|
|
|
|
|
Support prioritization in channels
----------------------------------

A simple example is ``queue.PriorityQueue`` in the stdlib.
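For reference, this is how ``queue.PriorityQueue`` orders items within a
single interpreter. The ``(priority, item)`` tuples are just an illustrative
convention, and nothing here implies a particular API for channels.

```python
import queue

pq = queue.PriorityQueue()
pq.put((2, "routine"))
pq.put((0, "urgent"))
pq.put((1, "soon"))

# Entries come back lowest priority number first, not in insertion order.
order = []
while not pq.empty():
    priority, item = pq.get()
    order.append(item)
print(order)  # ['urgent', 'soon', 'routine']
```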
|
|
|
|
|
2019-04-27 11:49:25 -04:00
|
|
|
Support inheriting settings (and more?)
---------------------------------------

Folks might find it useful, when creating a new interpreter, to be
able to indicate that they would like some things "inherited" by the
new interpreter. The mechanism could be a strict copy or it could be
copy-on-write. The motivating example is with the warnings module
(e.g. copy the filters).

The feature isn't critical, nor would it be widely useful, so it
can wait until there's interest. Notably, both suggested solutions
will require significant work, especially when it comes to complex
objects and most especially for mutable containers of mutable
complex objects.

Make exceptions shareable
-------------------------

Exceptions are propagated out of ``interp.exec()`` calls, so it isn't
a big leap to make them shareable. However, as noted elsewhere,
it isn't essential (or particularly common), so we can wait on doing
that.

Make everything shareable through serialization
-----------------------------------------------

We could use pickle (or marshal) to serialize everything and thus
make it all shareable. Doing this is potentially inefficient,
but it may be a matter of convenience in the end.
We can add it later, but trying to remove it later
would be significantly more painful.
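The underlying idea can be seen with plain ``pickle`` in one interpreter:
the object is reduced to bytes and an independent copy is rebuilt from them.
The sketch below is only that round trip. How the bytes would actually cross
the interpreter boundary is not shown.

```python
import pickle

original = {"name": "spam", "values": [1, 2, 3]}

# The sending side would serialize the object to bytes...
payload = pickle.dumps(original)

# ...and the receiving side would rebuild an independent copy.
copy = pickle.loads(payload)

print(copy == original, copy is original)  # True False
```

The copy is equal to the original but shares no state with it, which is also
where the cost comes from: every send pays for a full serialization.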
|
|
|
|
|
2020-04-29 19:48:23 -04:00
|
|
|
Make RunFailedError.__cause__ lazy
----------------------------------

An uncaught exception in a subinterpreter (from ``interp.exec()``) is
copied to the calling interpreter and set as ``__cause__`` on a
``RunFailedError`` which is then raised. That copying part involves
some sort of deserialization in the calling interpreter, which can be
expensive (e.g. due to imports) yet is not always necessary.

So it may be useful to use an ``ExceptionProxy`` type to wrap the
serialized exception and only deserialize it when needed. That could
be via ``ExceptionProxy.__getattribute__()`` or perhaps through
``RunFailedError.resolve()`` (which would raise the deserialized
exception and set ``RunFailedError.__cause__`` to the exception).

It may also make sense to have ``RunFailedError.__cause__`` be a
descriptor that does the lazy deserialization (and set ``__cause__``)
on the ``RunFailedError`` instance.
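A rough sketch of the ``resolve()`` variant is shown below. Both classes are
stand-ins written for this example, and the use of ``pickle`` as the
serialization format and the ``deserialized`` property are assumptions. The
point is only that nothing is deserialized until ``resolve()`` is called.

```python
import pickle


class ExceptionProxy:
    """Hypothetical wrapper holding a serialized exception."""

    def __init__(self, payload):
        self._payload = payload
        self._resolved = None

    @property
    def deserialized(self):
        return self._resolved is not None

    def resolve(self):
        # Deserialize on first use only, then reuse the cached object.
        if self._resolved is None:
            self._resolved = pickle.loads(self._payload)
        return self._resolved


class RunFailedError(RuntimeError):
    """Stand-in for the PEP's exception type."""

    def __init__(self, proxy):
        super().__init__("interp.exec() failed")
        self.proxy = proxy

    def resolve(self):
        # Set __cause__ lazily and raise the original exception.
        self.__cause__ = self.proxy.resolve()
        raise self.__cause__


payload = pickle.dumps(ValueError("bad input"))
err = RunFailedError(ExceptionProxy(payload))
before = err.proxy.deserialized

try:
    err.resolve()
except ValueError as exc:
    caught = exc

print(before, err.proxy.deserialized)
```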
|
|
|
|
|
2023-11-01 19:06:47 -04:00
|
|
|
Return a value from ``interp.exec()``
-------------------------------------

Currently ``interp.exec()`` always returns None. One idea is to return
the return value from whatever the subinterpreter ran. However, for now
it doesn't make sense. The only thing folks can run is a string of
code (i.e. a script). This is equivalent to ``PyRun_StringFlags()``,
``exec()``, or a module body. None of those "return" anything. We can
revisit this once ``interp.exec()`` supports functions, etc.

Add a shareable synchronization primitive
-----------------------------------------

This would be ``_threading.Lock`` (or something like it) where
interpreters would actually share the underlying mutex. The main
concern is that locks and isolated interpreters may not mix well
(as learned in Go).

If it proves desirable, we can add this later without much trouble.

Propagate SystemExit and KeyboardInterrupt Differently
------------------------------------------------------

The exception types that inherit from ``BaseException`` (aside from
``Exception``) are usually treated specially. These types are:
``KeyboardInterrupt``, ``SystemExit``, and ``GeneratorExit``. It may
make sense to treat them specially when it comes to propagation from
``interp.exec()``. Here are some options::

   * propagate like normal via RunFailedError
   * do not propagate (handle them somehow in the subinterpreter)
   * propagate them directly (avoid RunFailedError)
   * propagate them directly (set RunFailedError as __cause__)

We aren't going to worry about handling them differently. Threads
already ignore ``SystemExit``, so for now we will follow that pattern.
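The thread behavior referred to here can be observed directly with the
``threading`` module: a ``SystemExit`` raised in a worker thread ends that
thread only and is otherwise silently ignored. The sketch uses ordinary
threads in one interpreter and says nothing about how ``interp.exec()``
itself is implemented.

```python
import threading

events = []


def worker():
    events.append("worker started")
    raise SystemExit(1)  # ends only this thread


t = threading.Thread(target=worker)
t.start()
t.join()

# The main thread keeps running; the SystemExit was silently dropped.
events.append("main continues")
print(events)
```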
|
|
|
|
|
2023-11-01 19:06:47 -04:00
|
|
|
Add an explicit release() and close() to channel end classes
------------------------------------------------------------

It can be convenient to have an explicit way to close a channel against
further global use. Likewise it could be useful to have an explicit
way to release one of the channel ends relative to the current
interpreter. Among other reasons, such a mechanism is useful for
communicating overall state between interpreters without the extra
boilerplate that passing objects through a channel directly would
require.

The challenge is getting automatic release/close right without making
it hard to understand. This is especially true when dealing with a
non-empty channel. We should be able to get by without release/close
for now.

Add SendChannel.send_buffer()
-----------------------------

This method would allow no-copy sending of an object through a channel
if it supports the :pep:`3118` buffer protocol (e.g. memoryview).

Support for this is not fundamental to channels and can be added on
later without much disruption.
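What "no-copy" means for buffer-protocol objects can be seen with a plain
``memoryview`` in a single interpreter. The sketch below only shows that a
view shares memory with the object it wraps. Whether and how a
``send_buffer()`` method would hand such a view to another interpreter is
left open.

```python
data = bytearray(b"spam and eggs")

# A memoryview exposes the bytearray's buffer without copying it.
view = memoryview(data)

# Writes through the view are visible in the original object, which
# shows that both names refer to the same underlying memory.
view[0:4] = b"SPAM"
print(data)

# Slicing a memoryview also avoids a copy of the underlying bytes.
tail = view[-4:]
print(tail.tobytes())
```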
|
|
|
|
|
|
|
|
Auto-run in a thread
--------------------

The PEP proposes a hard separation between subinterpreters and threads:
if you want to run in a thread you must create the thread yourself and
call ``interp.exec()`` in it. However, it might be convenient if
``interp.exec()`` could do that for you, meaning there would be less
boilerplate.

Furthermore, we anticipate that users will want to run in a thread much
more often than not. So it would make sense to make this the default
behavior. We would add a kw-only param "threaded" (default ``True``)
to ``interp.exec()`` to allow the run-in-the-current-thread operation.
|
|
|
|
|
2017-09-13 21:35:40 -04:00
|
|
|
|
|
|
|
Rejected Ideas
==============

Explicit channel association
----------------------------

Interpreters are implicitly associated with channels upon ``recv()`` and
``send()`` calls. They are de-associated with ``release()`` calls. The
alternative would be explicit methods. It would be either
``add_channel()`` and ``remove_channel()`` methods on ``Interpreter``
objects or something similar on channel objects.

In practice, this level of management shouldn't be necessary for users.
So adding more explicit support would only add clutter to the API.

Add an API based on pipes
-------------------------

A pipe would be a simplex FIFO between exactly two interpreters. For
most use cases this would be sufficient. It could potentially simplify
the implementation as well. However, it isn't a big step to supporting
a many-to-many simplex FIFO via channels. Also, with pipes the API
ends up being slightly more complicated, requiring naming the pipes.

Add an API based on queues
--------------------------

Queues and buffered channels are almost the same thing. The main
difference is that channels have a stronger relationship with context
(i.e. the associated interpreter).

The name "Channel" was used instead of "Queue" to avoid confusion with
the stdlib ``queue.Queue``.

"enumerate"
-----------

The ``list_all()`` function provides the list of all interpreters.
In the threading module, which partly inspired the proposed API, the
function is called ``enumerate()``. The name is different here to
avoid confusing Python users that are not already familiar with the
threading API. For them "enumerate" is rather unclear, whereas
"list_all" is clear.

Alternate solutions to prevent leaking exceptions across interpreters
---------------------------------------------------------------------

In function calls, uncaught exceptions propagate to the calling frame.
The same approach could be taken with ``interp.exec()``. However, this
would mean that exception objects would leak across the inter-interpreter
boundary. Likewise, the frames in the traceback would potentially leak.

While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). We've
resolved the semantics of how the exceptions propagate by raising a
``RunFailedError`` instead, for which ``__cause__`` wraps a safe proxy
for the original exception and traceback.

Rejected possible solutions:

* reproduce the exception and traceback in the original interpreter
  and raise that.
* raise a subclass of RunFailedError that proxies the original
  exception and traceback.
* raise RuntimeError instead of RunFailedError
* convert at the boundary (a la ``subprocess.CalledProcessError``)
  (requires a cross-interpreter representation)
* support customization via ``Interpreter.excepthook``
  (requires a cross-interpreter representation)
* wrap in a proxy at the boundary (including with support for
  something like ``err.raise()`` to propagate the traceback).
* return the exception (or its proxy) from ``interp.exec()`` instead of
  raising it
* return a result object (like ``subprocess`` does) [result-object]_
  (unnecessary complexity?)
* throw the exception away and expect users to deal with unhandled
  exceptions explicitly in the script they pass to ``interp.exec()``
  (they can pass error info out via channels);
  with threads you have to do something similar
|
2017-12-05 21:16:00 -05:00
|
|
|
|
2019-03-26 14:39:43 -04:00
|
|
|
Always associate each new interpreter with its own thread
---------------------------------------------------------

As implemented in the C-API, an interpreter is not inherently tied to
any thread. Furthermore, it will run in any existing thread, whether
created by Python or not. You only have to activate one of its thread
states (``PyThreadState``) in the thread first. This means that the
same thread may run more than one interpreter (though obviously
not at the same time).

The proposed module maintains this behavior. Interpreters are not
tied to threads. Only calls to ``Interpreter.exec()`` are. However,
one of the key objectives of this PEP is to provide a more
human-centric concurrency model. With that in mind, from a conceptual
standpoint the module *might* be easier to understand if each
interpreter were associated with its own thread.

That would mean ``interpreters.create()`` would create a new thread
and ``Interpreter.exec()`` would only execute in that thread (and
nothing else would). The benefit is that users would not have to
wrap ``Interpreter.exec()`` calls in a new ``threading.Thread``. Nor
would they be in a position to accidentally pause the current
interpreter (in the current thread) while their interpreter
executes.

The idea is rejected because the benefit is small and the cost is high.
The difference from the capability in the C-API would be potentially
confusing. The implicit creation of threads is magical. The early
creation of threads is potentially wasteful. The inability to run
arbitrary interpreters in an existing thread would prevent some valid
use cases, frustrating users. Tying interpreters to threads would
require extra runtime modifications. It would also make the module's
implementation overly complicated. Finally, it might not even make
the module easier to understand.

Only associate interpreters upon use
------------------------------------

Associate interpreters with channel ends only once ``recv()``,
``send()``, etc. are called.

Doing this is potentially confusing and also can lead to unexpected
races where a channel is auto-closed before it can be used in the
original (creating) interpreter.

Allow multiple simultaneous calls to Interpreter.exec()
-------------------------------------------------------

This would make sense especially if ``Interpreter.exec()`` were to
manage new threads for you (which we've rejected). Essentially,
each call would run independently, which would be mostly fine
from a narrow technical standpoint, since each interpreter
can have multiple threads.

The problem is that the interpreter has only one ``__main__`` module
and simultaneous ``Interpreter.exec()`` calls would have to sort out
sharing ``__main__`` or we'd have to invent a new mechanism. Neither
would be simple enough to be worth doing.
|
|
|
|
|
2020-04-29 19:48:23 -04:00
|
|
|
Add a "reraise" method to RunFailedError
----------------------------------------

While having ``__cause__`` set on ``RunFailedError`` helps produce a
more useful traceback, it's less helpful when handling the original
error. To help facilitate this, we could add
``RunFailedError.reraise()``. This method would enable the following
pattern::

   try:
       try:
           interp.exec(script)
       except RunFailedError as exc:
           exc.reraise()
   except MyException:
       ...

This would be made even simpler if there existed a ``__reraise__``
protocol.

All that said, this is completely unnecessary. Using ``__cause__``
is good enough::

   try:
       try:
           interp.exec(script)
       except RunFailedError as exc:
           raise exc.__cause__
   except MyException:
       ...

Note that in extreme cases it may require a little extra boilerplate::

   try:
       try:
           interp.exec(script)
       except RunFailedError as exc:
           if exc.__cause__ is not None:
               raise exc.__cause__
           raise  # re-raise
   except MyException:
       ...
|
|
|
|
|
2017-09-12 15:31:24 -04:00
|
|
|
|
2019-03-23 02:12:14 -04:00
|
|
|
Implementation
==============

The implementation of the PEP has 4 parts:

* the high-level module described in this PEP (mostly a light wrapper
  around a low-level C extension)
* the low-level C extension module
* additions to the internal C-API needed by the low-level module
* secondary fixes/changes in the CPython runtime that facilitate
  the low-level module (among other benefits)

These are at various levels of completion, with more done the lower
you go:

* the high-level module has been, at best, roughly implemented.
  However, fully implementing it will be almost trivial.
* the low-level module is mostly complete. The bulk of the
  implementation was merged into master in December 2018 as the
  "_xxsubinterpreters" module (for the sake of testing multiple
  interpreters functionality). Only the exception propagation
  implementation remains to be finished, which will not require
  extensive work.
* all necessary C-API work has been finished
* all anticipated work in the runtime has been finished

The implementation effort for :pep:`554` is being tracked as part of
a larger project aimed at improving multi-core support in CPython.
[multi-core-project]_


References
==========

.. [c-api]
   https://docs.python.org/3/c-api/init.html#sub-interpreter-support

.. [CSP]
   https://en.wikipedia.org/wiki/Communicating_sequential_processes
   https://github.com/futurecore/python-csp

.. [fifo]
   https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Pipe
   https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue
   https://docs.python.org/3/library/queue.html#module-queue
   http://stackless.readthedocs.io/en/2.7-slp/library/stackless/channels.html
   https://golang.org/doc/effective_go.html#sharing
   http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/

.. [caveats]
   https://docs.python.org/3/c-api/init.html#bugs-and-caveats

.. [cryptography]
   https://github.com/pyca/cryptography/issues/2299

.. [gilstate]
   https://bugs.python.org/issue10915
   http://bugs.python.org/issue15751

.. [bug-rate]
   https://mail.python.org/pipermail/python-ideas/2017-September/047094.html

.. [benefits]
   https://mail.python.org/pipermail/python-ideas/2017-September/047122.html

.. [reset_globals]
   https://mail.python.org/pipermail/python-dev/2017-September/149545.html

.. [async]
   https://mail.python.org/pipermail/python-dev/2017-September/149420.html
   https://mail.python.org/pipermail/python-dev/2017-September/149585.html

.. [result-object]
   https://mail.python.org/pipermail/python-dev/2017-September/149562.html

.. [jython]
   https://mail.python.org/pipermail/python-ideas/2017-May/045771.html

.. [multi-core-project]
   https://github.com/ericsnowcurrently/multi-core-python

.. [cache-line-ping-pong]
   https://mail.python.org/archives/list/python-dev@python.org/message/3HVRFWHDMWPNR367GXBILZ4JJAUQ2STZ/

.. _nathaniel-asyncio:
   https://mail.python.org/archives/list/python-dev@python.org/message/TUEAZNZHVJGGLL4OFD32OW6JJDKM6FAS/

* mp-conn
    https://docs.python.org/3/library/multiprocessing.html#connection-objects

* main-thread
    https://mail.python.org/pipermail/python-ideas/2017-September/047144.html
    https://mail.python.org/pipermail/python-dev/2017-September/149566.html

* petr-c-ext
    https://mail.python.org/pipermail/import-sig/2016-June/001062.html
    https://mail.python.org/pipermail/python-ideas/2016-April/039748.html


Copyright
=========

This document has been placed in the public domain.