355 lines
12 KiB
ReStructuredText
355 lines
12 KiB
ReStructuredText
PEP: 554
|
||
Title: Multiple Interpreters in the Stdlib
|
||
Author: Eric Snow <ericsnowcurrently@gmail.com>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 2017-09-05
|
||
Python-Version: 3.7
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This proposal introduces the stdlib ``interpreters`` module. It exposes
|
||
the basic functionality of subinterpreters that already exists in the
|
||
C-API. Each subinterpreter runs with its own state (see
|
||
``Interpreter Isolation`` below). The module will be "provisional", as
|
||
described by PEP 411.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Running code in multiple interpreters provides a useful level of
|
||
isolation within the same process. This can be leveraged in number
|
||
of ways. Furthermore, subinterpreters provide a well-defined framework
|
||
in which such isolation may extended.
|
||
|
||
CPython has supported subinterpreters, with increasing levels of
|
||
support, since version 1.5. While the feature has the potential
|
||
to be a powerful tool, subinterpreters have suffered from neglect
|
||
because they are not available directly from Python. Exposing the
|
||
existing functionality in the stdlib will help reverse the situation.
|
||
|
||
This proposal is focused on enabling the fundamental capability of
|
||
multiple isolated interpreters in the same Python process. This is a
|
||
new area for Python so there is relative uncertainly about the best
|
||
tools to provide as companions to subinterpreters. Thus we minimize
|
||
the functionality we add in the proposal as much as possible.
|
||
|
||
Concerns
|
||
--------
|
||
|
||
* "subinterpreters are not worth the trouble"
|
||
|
||
Some have argued that subinterpreters do not add sufficient benefit
|
||
to justify making them an official part of Python. Adding features
|
||
to the language (or stdlib) has a cost in increasing the size of
|
||
the language. So it must pay for itself. In this case, subinterpreters
|
||
provide a novel concurrency model focused on isolated threads of
|
||
execution. Furthermore, they present an opportunity for changes in
|
||
CPython that will allow simulateous use of multiple CPU cores (currently
|
||
prevented by the GIL).
|
||
|
||
Alternatives to subinterpreters include threading, async, and
|
||
multiprocessing. Threading is limited by the GIL and async isn't
|
||
the right solution for every problem (nor for every person).
|
||
Multiprocessing is likewise valuable in some but not all situations.
|
||
Direct IPC (rather than via the multiprocessing module) provides
|
||
similar benefits but with the same caveat.
|
||
|
||
Notably, subinterpreters are not intended as a replacement for any of
|
||
the above. Certainly they overlap in some areas, but the benefits of
|
||
subinterpreters include isolation and (potentially) performance. In
|
||
particular, subinterpreters provide a direct route to an alternate
|
||
concurrency model (e.g. CSP) which has found success elsewhere and
|
||
will appeal to some Python users. That is the core value that the
|
||
``interpreters`` module will provide.
|
||
|
||
* "stdlib support for subinterpreters adds extra burden
|
||
on C extension authors"
|
||
|
||
In the ``Interpreter Isolation`` section below we identify ways in
|
||
which isolation in CPython's subinterpreters is incomplete. Most
|
||
notable is extension modules that use C globals to store internal
|
||
state. PEP 3121 and PEP 489 provide a solution for most of the
|
||
problem, but one still remains. [petr-c-ext]_ Until that is resolved,
|
||
C extension authors will face extra difficulty to support
|
||
subinterpreters.
|
||
|
||
Consequently, projects that publish extension modules may face an
|
||
increased maintenance burden as their users start using subinterpreters,
|
||
where their modules may break. This situation is limited to modules
|
||
that use C globals (or use libraries that use C globals) to store
|
||
internal state.
|
||
|
||
Ultimately this comes down to a question of how often it will be a
|
||
problem in practice: how many projects would be affected, how often
|
||
their users will be affected, what the additional maintenance burden
|
||
will be for projects, and what the overall benefit of subinterpreters
|
||
is to offset those costs. The position of this PEP is that the actual
|
||
extra maintenance burden will be small and well below the threshold at
|
||
which subinterpreters are worth it.
|
||
|
||
|
||
Proposal
|
||
========
|
||
|
||
The ``interpreters`` module will be added to the stdlib. It will
|
||
provide a high-level interface to subinterpreters and wrap the low-level
|
||
``_interpreters`` module. The proposed API is inspired by the
|
||
``threading`` module.
|
||
|
||
The module provides the following functions:
|
||
|
||
``list()``::
|
||
|
||
Return a list of all existing interpreters.
|
||
|
||
``get_current()``::
|
||
|
||
Return the currently running interpreter.
|
||
|
||
``get_main()``::
|
||
|
||
Return the main interpreter.
|
||
|
||
``create()``::
|
||
|
||
Initialize a new Python interpreter and return it. The
|
||
interpreter will be created in the current thread and will remain
|
||
idle until something is run in it.
|
||
|
||
The module also provides the following classes:
|
||
|
||
``Interpreter(id)``::
|
||
|
||
id:
|
||
|
||
The interpreter's ID (read-only).
|
||
|
||
is_running():
|
||
|
||
Return whether or not the interpreter is currently executing code.
|
||
|
||
destroy():
|
||
|
||
Finalize and destroy the interpreter. If called on a running
|
||
interpreter then raise RuntimeError in the interpreter that
|
||
called ``destroy()``.
|
||
|
||
run(code):
|
||
|
||
Run the provided Python code in the interpreter, in the current
|
||
OS thread. If the interpreter is already running then raise
|
||
RuntimeError in the interpreter that called ``run()``.
|
||
|
||
The current interpreter (which called ``run()``) will block until
|
||
the subinterpreter finishes running the requested code. Any
|
||
uncaught exception in that code will bubble up to the current
|
||
interpreter.
|
||
|
||
Supported code: source text.
|
||
|
||
get_fifo(name):
|
||
|
||
Return the FIFO object with the given name that is associated
|
||
with this interpreter. If no such FIFO exists then raise
|
||
KeyError. The FIFO will be either a "FIFOReader" or a
|
||
"FIFOWriter", depending on which "add_*_fifo()" was called.
|
||
|
||
list_fifos():
|
||
|
||
Return a list of all fifos associated with the interpreter.
|
||
|
||
add_recv_fifo(name=None):
|
||
|
||
Create a new FIFO, associate the two ends with the involved
|
||
interpreters, and return the side associated with the interpreter
|
||
in which "add_recv_fifo()" was called. A FIFOReader gets tied to
|
||
this interpreter. A FIFOWriter gets tied to the interpreter that
|
||
called "add_recv_fifo()".
|
||
|
||
The FIFO's name is set to the provided value. If no name is
|
||
provided then a dynamically generated one is used. If a FIFO
|
||
with the given name is already associated with this interpreter
|
||
(or with the one in which "add_recv_fifo()" was called) then raise
|
||
KeyError.
|
||
|
||
add_send_fifo(name=None):
|
||
|
||
Create a new FIFO, associate the two ends with the involved
|
||
interpreters, and return the side associated with the interpreter
|
||
in which "add_recv_fifo()" was called. A FIFOWriter gets tied to
|
||
this interpreter. A FIFOReader gets tied to the interpreter that
|
||
called "add_recv_fifo()".
|
||
|
||
The FIFO's name is set to the provided value. If no name is
|
||
provided then a dynamically generated one is used. If a FIFO
|
||
with the given name is already associated with this interpreter
|
||
(or with the one in which "add_send_fifo()" was called) then raise
|
||
KeyError.
|
||
|
||
remove_fifo(name):
|
||
|
||
Drop the association between the named FIFO and this interpreter.
|
||
If the named FIFO is not found then raise KeyError.
|
||
|
||
|
||
``FIFOReader(name)``::
|
||
|
||
The receiving end of a FIFO. An interpreter may use this to receive
|
||
objects from another interpreter. At first only bytes and None will
|
||
be supported.
|
||
|
||
name:
|
||
|
||
The FIFO's name.
|
||
|
||
__next__():
|
||
|
||
Return the next bytes object from the pipe. If none have been
|
||
pushed on then block.
|
||
|
||
pop(*, block=True):
|
||
|
||
Return the next bytes object from the pipe. If none have been
|
||
pushed on and "block" is True (the default) then block.
|
||
Otherwise return None.
|
||
|
||
|
||
``FIFOWriter(name)``::
|
||
|
||
The sending end of a FIFO. An interpreter may use this to send
|
||
objects to another interpreter. At first only bytes and None will
|
||
be supported.
|
||
|
||
name:
|
||
|
||
The FIFO's name.
|
||
|
||
push(object, *, block=True):
|
||
|
||
Add the object to the FIFO. If "block" is true then block
|
||
until the object is popped off. If the FIFO does not support
|
||
the object's type then TypeError is raised.
|
||
|
||
About FIFOs
|
||
-----------
|
||
|
||
Subinterpreters are inherently isolated (with caveats explained below),
|
||
in contrast to threads. This enables a different concurrency model than
|
||
currently exists in Python. CSP (Communicating Sequential Processes),
|
||
upon which Go's concurrency is based, is one example of this model.
|
||
|
||
A key component of this approach to concurrency is message passing. So
|
||
providing a message/object passing mechanism alongside ``Interpreter``
|
||
is a fundamental requirement. This proposal includes a basic mechanism
|
||
upon which more complex machinery may be built. That basic mechanism
|
||
draws inspiration from pipes, queues, and CSP's channels.
|
||
|
||
The key challenge here is that sharing objects between interpreters
|
||
faces complexity due in part to CPython's current memory model.
|
||
Furthermore, in this class of concurrency, the ideal is that objects
|
||
only exist in one interpreter at a time. However, this is not practical
|
||
for Python so we initially constrain supported objects to ``bytes`` and
|
||
``None``. There are a number of strategies we may pursue in the future
|
||
to expand supported objects and object sharing strategies.
|
||
|
||
Note that the complexity of object sharing increases as subinterpreters
|
||
become more isolated, e.g. after GIL removal. So the mechanism for
|
||
message passing needs to be carefully considered. Keeping the API
|
||
minimal and initially restricting the supported types helps us avoid
|
||
further exposing any underlying complexity to Python users.
|
||
|
||
|
||
Deferred Functionality
|
||
======================
|
||
|
||
In the interest of keeping this proposal minimal, the following
|
||
functionality has been left out for future consideration. Note that
|
||
this is not a judgement against any of said capability, but rather a
|
||
deferment. That said, each is arguably valid.
|
||
|
||
Interpreter.call()
|
||
------------------
|
||
|
||
It would be convenient to run existing functions in subinterpreters
|
||
directly. ``Interpreter.run()`` could be adjusted to support this or
|
||
a ``call()`` method could be added::
|
||
|
||
Interpreter.call(f, *args, **kwargs)
|
||
|
||
This suffers from the same problem as sharing objects between
|
||
interpreters via queues. The minimal solution (running a source string)
|
||
is sufficient for us to get the feature out where it can be explored.
|
||
|
||
timeout arg to pop() and push()
|
||
-------------------------------
|
||
|
||
Typically functions that have a ``block`` argument also have a
|
||
``timeout`` argument. We can add it later if needed.
|
||
|
||
|
||
Interpreter Isolation
|
||
=====================
|
||
|
||
CPython's interpreters are intended to be strictly isolated from each
|
||
other. Each interpreter has its own copy of all modules, classes,
|
||
functions, and variables. The same applies to state in C, including in
|
||
extension modules. The CPython C-API docs explain more. [c-api]_
|
||
|
||
However, there are ways in which interpreters share some state. First
|
||
of all, some process-global state remains shared, like file descriptors.
|
||
There are no plans to change this.
|
||
|
||
Second, some isolation is faulty due to bugs or implementations that did
|
||
not take subinterpreters into account. This includes things like
|
||
at-exit handlers and extension modules that rely on C globals. In these
|
||
cases bugs should be opened (some are already).
|
||
|
||
Finally, some potential isolation is missing due to the current design
|
||
of CPython. This includes the GIL and memory management. Improvements
|
||
are currently going on to address gaps in this area.
|
||
|
||
|
||
Provisional Status
|
||
==================
|
||
|
||
The new ``interpreters`` module will be added with "provisional" status
|
||
(see PEP 411). This allows Python users to experiment with the feature
|
||
and provide feedback while still allowing us to adjust to that feedback.
|
||
The module will be provisional in Python 3.7 and we will make a decision
|
||
before the 3.8 release whether to keep it provisional, graduate it, or
|
||
remove it.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [c-api]
|
||
https://docs.python.org/3/c-api/init.html#bugs-and-caveats
|
||
|
||
.. [petr-c-ext]
|
||
https://mail.python.org/pipermail/import-sig/2016-June/001062.html
|
||
https://mail.python.org/pipermail/python-ideas/2016-April/039748.html
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|