887 lines
35 KiB
ReStructuredText
887 lines
35 KiB
ReStructuredText
PEP: 684
|
|
Title: A Per-Interpreter GIL
|
|
Author: Eric Snow <ericsnowcurrently@gmail.com>
|
|
Discussions-To: https://discuss.python.org/t/pep-684-a-per-interpreter-gil/19583
|
|
Status: Accepted
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Requires: 683
|
|
Created: 08-Mar-2022
|
|
Python-Version: 3.12
|
|
Post-History: `08-Mar-2022 <https://mail.python.org/archives/list/python-dev@python.org/thread/CF7B7FMACFYDAHU6NPBEVEY6TOSGICXU/>`__,
|
|
`29-Sep-2022 <https://discuss.python.org/t/pep-684-a-per-interpreter-gil/19583>`__,
|
|
`28-Oct-2022 <https://discuss.python.org/t/pep-684-a-per-interpreter-gil/19583/19/>`__,
|
|
Resolution: https://discuss.python.org/t/19583/42
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
Since Python 1.5 (1997), CPython users can run multiple interpreters
|
|
in the same process. However, interpreters in the same process
|
|
have always shared a significant
|
|
amount of global state. This is a source of bugs, with a growing
|
|
impact as more and more people use the feature. Furthermore,
|
|
sufficient isolation would facilitate true multi-core parallelism,
|
|
where interpreters no longer share the GIL. The changes outlined in
|
|
this proposal will result in that level of interpreter isolation.
|
|
|
|
|
|
High-Level Summary
|
|
==================
|
|
|
|
At a high level, this proposal changes CPython in the following ways:
|
|
|
|
* stops sharing the GIL between interpreters, given sufficient isolation
|
|
* adds several new interpreter config options for isolation settings
|
|
* keeps incompatible extensions from causing problems
|
|
|
|
The GIL
|
|
-------
|
|
|
|
The GIL protects concurrent access to most of CPython's runtime state.
|
|
So all that GIL-protected global state must move to each interpreter
|
|
before the GIL can.
|
|
|
|
(In a handful of cases, other mechanisms can be used to ensure
|
|
thread-safe sharing instead, such as locks or "immortal" objects.)
|
|
|
|
CPython Runtime State
|
|
---------------------
|
|
|
|
Properly isolating interpreters requires that most of CPython's
|
|
runtime state be stored in the ``PyInterpreterState`` struct. Currently,
|
|
only a portion of it is; the rest is found either in C global variables
|
|
or in ``_PyRuntimeState``. Most of that will have to be moved.
|
|
|
|
This directly coincides with an ongoing effort (of many years) to greatly
|
|
reduce internal use of global variables and consolidate the runtime
|
|
state into ``_PyRuntimeState`` and ``PyInterpreterState``.
|
|
(See `Consolidating Runtime Global State`_ below.) That project has
|
|
`significant merit on its own <Benefits to Consolidation_>`_
|
|
and has faced little controversy. So, while a per-interpreter GIL
|
|
relies on the completion of that effort, that project should not be
|
|
considered a part of this proposal--only a dependency.
|
|
|
|
Other Isolation Considerations
|
|
------------------------------
|
|
|
|
CPython's interpreters must be strictly isolated from each other, with
|
|
few exceptions. To a large extent they already are. Each interpreter
|
|
has its own copy of all modules, classes, functions, and variables.
|
|
The CPython C-API docs `explain further <caveats_>`_.
|
|
|
|
.. _caveats: https://docs.python.org/3/c-api/init.html#bugs-and-caveats
|
|
|
|
However, aside from what has already been mentioned (e.g. the GIL),
|
|
there are a couple of ways in which interpreters still share some state.
|
|
|
|
First of all, some process-global resources (e.g. memory,
|
|
file descriptors, environment variables) are shared. There are no
|
|
plans to change this.
|
|
|
|
Second, some isolation is faulty due to bugs or implementations that
|
|
did not take multiple interpreters into account. This includes
|
|
CPython's runtime and the stdlib, as well as extension modules that
|
|
rely on global variables. Bugs should be opened in these cases,
|
|
as some already have been.
|
|
|
|
Depending on Immortal Objects
|
|
-----------------------------
|
|
|
|
:pep:`683` introduces immortal objects as a CPython-internal feature.
|
|
With immortal objects, we can share any otherwise immutable global
|
|
objects between all interpreters. Consequently, this PEP does not
|
|
need to address how to deal with the various objects
|
|
`exposed in the public C-API <capi objects_>`_.
|
|
It also simplifies the question of what to do about the builtin
|
|
static types. (See `Global Objects`_ below.)
|
|
|
|
Both issues have alternate solutions, but everything is simpler with
|
|
immortal objects. If PEP 683 is not accepted then this one will be
|
|
updated with the alternatives. This lets us reduce noise in this
|
|
proposal.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
The fundamental problem we're solving here is a lack of true multi-core
|
|
parallelism (for Python code) in the CPython runtime. The GIL is the
|
|
cause. While it usually isn't a problem in practice, at the very least
|
|
it makes Python's multi-core story murky, which makes the GIL
|
|
a consistent distraction.
|
|
|
|
Isolated interpreters are also an effective mechanism to support
|
|
certain concurrency models. :pep:`554` discusses this in more detail.
|
|
|
|
Indirect Benefits
|
|
-----------------
|
|
|
|
Most of the effort needed for a per-interpreter GIL has benefits that
|
|
make those tasks worth doing anyway:
|
|
|
|
* makes multiple-interpreter behavior more reliable
|
|
* has led to fixes for long-standing runtime bugs that otherwise
|
|
hadn't been prioritized
|
|
* has been exposing (and inspiring fixes for) previously unknown runtime bugs
|
|
* has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
|
|
* has driven cleaner and more complete runtime finalization
|
|
* led to structural layering of the C-API (e.g. ``Include/internal``)
|
|
* also see `Benefits to Consolidation`_ below
|
|
|
|
.. XXX Add links to example GitHub issues?
|
|
|
|
Furthermore, much of that work benefits other CPython-related projects:
|
|
|
|
* performance improvements ("`faster-cpython`_")
|
|
* pre-fork application deployment (e.g. `Instagram server`_)
|
|
* extension module isolation (see :pep:`630`, etc.)
|
|
* embedding CPython
|
|
|
|
.. _faster-cpython: https://github.com/faster-cpython/ideas
|
|
|
|
.. _Instagram server: https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf
|
|
|
|
Existing Use of Multiple Interpreters
|
|
-------------------------------------
|
|
|
|
The C-API for multiple interpreters has been used for many years.
|
|
However, until relatively recently the feature wasn't widely known,
|
|
nor extensively used (with the exception of mod_wsgi).
|
|
|
|
In the last few years use of multiple interpreters has been increasing.
|
|
Here are some of the public projects using the feature currently:
|
|
|
|
* `mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_
|
|
* `OpenStack Ceph <https://github.com/ceph/ceph/pull/14971>`_
|
|
* `JEP <https://github.com/ninia/jep>`_
|
|
* `Kodi <https://github.com/xbmc/xbmc>`_
|
|
|
|
Note that, with :pep:`554`, multiple interpreter usage would likely
|
|
grow significantly (via Python code rather than the C-API).
|
|
|
|
PEP 554 (Multiple Interpreters in the Stdlib)
|
|
---------------------------------------------
|
|
|
|
:pep:`554` is strictly about providing a minimal stdlib module
|
|
to give users access to multiple interpreters from Python code.
|
|
In fact, it specifically avoids proposing any changes related to
|
|
the GIL. Consider, however, that users of that module would benefit
|
|
from a per-interpreter GIL, which makes PEP 554 more appealing.
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
During initial investigations in 2014, a variety of possible solutions
|
|
for multi-core Python were explored, but each had its drawbacks
|
|
without simple solutions:
|
|
|
|
* the existing practice of releasing the GIL in extension modules
|
|
|
|
* doesn't help with Python code
|
|
|
|
* other Python implementations (e.g. Jython, IronPython)
|
|
|
|
* CPython dominates the community
|
|
|
|
* remove the GIL (e.g. gilectomy, "no-gil")
|
|
|
|
* too much technical risk (at the time)
|
|
|
|
* Trent Nelson's "PyParallel" project
|
|
|
|
* incomplete; Windows-only at the time
|
|
|
|
* ``multiprocessing``
|
|
|
|
* too much work to make it effective enough;
|
|
high penalties in some situations (at large scale, Windows)
|
|
|
|
* other parallelism tools (e.g. dask, ray, MPI)
|
|
|
|
* not a fit for the runtime/stdlib
|
|
|
|
* give up on multi-core (e.g. async, do nothing)
|
|
|
|
* this can only end in tears
|
|
|
|
Even in 2014, it was fairly clear that a solution using isolated
|
|
interpreters did not have a high level of technical risk and that
|
|
most of the work was worth doing anyway.
|
|
(The downside was the volume of work to be done.)
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
As `summarized above <High-Level Summary_>`__, this proposal involves the
|
|
following changes, in the order they must happen:
|
|
|
|
1. `consolidate global runtime state <Consolidating Runtime Global State_>`_
|
|
(including objects) into ``_PyRuntimeState``
|
|
2. move nearly all of the state down into ``PyInterpreterState``
|
|
3. finally, move the GIL down into ``PyInterpreterState``
|
|
4. everything else
|
|
|
|
* update the C-API
|
|
* implement extension module restrictions
|
|
* work with popular extension maintainers to help
|
|
with multi-interpreter support
|
|
|
|
Per-Interpreter State
|
|
---------------------
|
|
|
|
The following runtime state will be moved to ``PyInterpreterState``:
|
|
|
|
* all global objects that are not safely shareable (fully immutable)
|
|
* the GIL
|
|
* most mutable data that's currently protected by the GIL
|
|
* mutable data that's currently protected by some other per-interpreter lock
|
|
* mutable data that may be used independently in different interpreters
|
|
(also applies to extension modules, including those with multi-phase init)
|
|
* all other mutable data not otherwise excluded below
|
|
|
|
Furthermore, a portion of the full global state has already been
|
|
moved to the interpreter, including GC, warnings, and atexit hooks.
|
|
|
|
The following runtime state will not be moved:
|
|
|
|
* global objects that are safely shareable, if any
|
|
* immutable data, often ``const``
|
|
* effectively immutable data (treated as immutable), for example:
|
|
|
|
* some state is initialized early and never modified again
|
|
* hashes for strings (``PyUnicodeObject``) are idempotently calculated
|
|
when first needed and then cached
|
|
|
|
* all data that is guaranteed to be modified exclusively in the main thread,
|
|
including:
|
|
|
|
* state used only in CPython's ``main()``
|
|
* the REPL's state
|
|
* data modified only during runtime init (effectively immutable afterward)
|
|
|
|
* mutable data that's protected by some global lock (other than the GIL)
|
|
* global state in atomic variables
|
|
* mutable global state that can be changed (sensibly) to atomic variables
|
|
|
|
Memory Allocators
|
|
'''''''''''''''''
|
|
|
|
This is one of the most sensitive parts of the work to isolate interpreters.
|
|
The simplest solution is to move the global state of the internal
|
|
"small block" allocator to ``PyInterpreterState``, as we are doing with
|
|
nearly all other runtime state. The following elaborates on the details
|
|
and rationale.
|
|
|
|
CPython provides a memory management C-API, with `three allocator domains`_:
|
|
"raw", "mem", and "object". Each provides the equivalent of ``malloc()``,
|
|
``calloc()``, ``realloc()``, and ``free()``. A custom allocator for each
|
|
domain can be set during runtime initialization and the current allocator
|
|
can be wrapped with a hook using the same API (for example, the stdlib
|
|
tracemalloc module). The allocators are currently runtime-global,
|
|
shared by all interpreters.
|
|
|
|
.. _three allocator domains: https://docs.python.org/3/c-api/memory.html#allocator-domains
|
|
|
|
The "raw" allocator is expected to be thread-safe and defaults to glibc's
|
|
allocator (``malloc()``, etc.). However, the "mem" and "object" allocators
|
|
are not expected to be thread-safe and currently may rely on the GIL for
|
|
thread-safety. This is partly because the default allocator for both,
|
|
AKA "pyobject", `is not thread-safe`_. This is due to how all state for
|
|
that allocator is stored in C global variables.
|
|
(See ``Objects/obmalloc.c``.)
|
|
|
|
.. _is not thread-safe: https://peps.python.org/pep-0445/#gil-free-pymem-malloc
|
|
|
|
Thus we come back to the question of isolating runtime state. In order
|
|
for interpreters to stop sharing the GIL, allocator thread-safety
|
|
must be addressed. If interpreters continue sharing the allocators
|
|
then we need some other way to get thread-safety. Otherwise interpreters
|
|
must stop sharing the allocators. In both cases there are a number of
|
|
possible solutions, each with potential downsides.
|
|
|
|
To keep sharing the allocators, the simplest solution is to use
|
|
a granular runtime-global lock around the calls to the "mem" and "object"
|
|
allocators in ``PyMem_Malloc()``, ``PyObject_Malloc()``, etc. This would
|
|
impact performance, but there are some ways to mitigate that (e.g. only
|
|
start locking once the first subinterpreter is created).
|
|
|
|
Another way to keep sharing the allocators is to require that the "mem"
|
|
and "object" allocators be thread-safe. This would mean we'd have to
|
|
make the pyobject allocator implementation thread-safe. That could
|
|
even involve re-implementing it using an extensible allocator like
|
|
mimalloc. The potential downside is in the cost to re-implement
|
|
the allocator and the risk of defects inherent to such an endeavor.
|
|
|
|
Regardless, a switch to requiring thread-safe allocators would impact
|
|
anyone that embeds CPython and currently sets a thread-unsafe allocator.
|
|
We'd need to consider who might be affected and how we reduce any
|
|
negative impact (e.g. add a basic C-API to help make an allocator
|
|
thread-safe).
|
|
|
|
If we did stop sharing the allocators between interpreters, we'd have
|
|
to do so only for the "mem" and "object" allocators. We might also need
|
|
to keep a full set of global allocators for certain runtime-level usage.
|
|
There would be some performance penalty due to looking up the current
|
|
interpreter and then pointer indirection to get the allocators.
|
|
Embedders would also likely have to provide a new allocator context
|
|
for each interpreter. On the plus side, allocator hooks (e.g. tracemalloc)
|
|
would not be affected.
|
|
|
|
Ultimately, we will go with the simplest option:
|
|
|
|
* keep the allocators in the global runtime state
|
|
* require that they be thread-safe
|
|
* move the state of the default object allocator (AKA "small block"
|
|
allocator) to ``PyInterpreterState``
|
|
|
|
We experimented with `a rough implementation`_ and found it was fairly
|
|
straightforward, and the performance penalty was essentially zero.
|
|
|
|
.. _a rough implementation: https://github.com/ericsnowcurrently/cpython/tree/try-per-interpreter-alloc
|
|
|
|
.. _proposed capi:
|
|
|
|
C-API
|
|
-----
|
|
|
|
Internally, the interpreter state will now track how the import system
|
|
should handle extension modules which do not support use with multiple
|
|
interpreters. See `Restricting Extension Modules`_ below. We'll refer
|
|
to that setting here as "PyInterpreterState.strict_extension_compat".
|
|
|
|
The following API will be made public, if they haven't been already:
|
|
|
|
* ``PyInterpreterConfig`` (struct)
|
|
* ``PyInterpreterConfig_INIT`` (macro)
|
|
* ``PyInterpreterConfig_LEGACY_INIT`` (macro)
|
|
* ``PyThreadState * Py_NewInterpreterFromConfig(PyInterpreterConfig *)``
|
|
|
|
We will add two new fields to ``PyInterpreterConfig``:
|
|
|
|
* ``int own_gil``
|
|
* ``int strict_extensions_compat``
|
|
|
|
We may add other fields over time, as needed (e.g. "own_initial_thread").
|
|
|
|
Regarding the initializer macros, ``PyInterpreterConfig_INIT`` would
|
|
be used to get an isolated interpreter that also avoids
|
|
subinterpreter-unfriendly features. It would be the default for
|
|
interpreters created through :pep:`554`. The unrestricted (status quo)
|
|
will continue to be available through ``PyInterpreterConfig_LEGACY_INIT``,
|
|
which is already used for the main interpreter and ``Py_NewInterpreter()``.
|
|
This will not change.
|
|
|
|
A note about the "main" interpreter:
|
|
|
|
Below, we mention the "main" interpreter several times. This refers
|
|
to the interpreter created during runtime initialization, for which
|
|
the initial ``PyThreadState`` corresponds to the process's main thread.
|
|
It is has a number of unique responsibilities (e.g. handling signals),
|
|
as well as a special role during runtime initialization/finalization.
|
|
It is also usually (for now) the only interpreter.
|
|
(Also see https://docs.python.org/3/c-api/init.html#sub-interpreter-support.)
|
|
|
|
PyInterpreterConfig.own_gil
|
|
'''''''''''''''''''''''''''
|
|
|
|
If ``true`` (``1``) then the new interpreter will have its own "global"
|
|
interpreter lock. This means the new interpreter can run without
|
|
getting interrupted by other interpreters. This effectively unblocks
|
|
full use of multiple cores. That is the fundamental goal of this PEP.
|
|
|
|
If ``false`` (``0``) then the new interpreter will use the main
|
|
interpreter's lock. This is the legacy (pre-3.12) behavior in CPython,
|
|
where all interpreters share a single GIL. Sharing the GIL like this
|
|
may be desirable when using extension modules that still depend
|
|
on the GIL for thread safety.
|
|
|
|
In ``PyInterpreterConfig_INIT``, this will be ``true``.
|
|
In ``PyInterpreterConfig_LEGACY_INIT``, this will be ``false``.
|
|
|
|
Also, to play it safe, for now we will not allow ``own_gil`` to be true
|
|
if a custom allocator was set during runtime init. Wrapping the allocator,
|
|
a la tracemalloc, will still be fine.
|
|
|
|
PyInterpreterConfig.strict_extensions_compat
|
|
''''''''''''''''''''''''''''''''''''''''''''
|
|
|
|
``PyInterpreterConfig.strict_extension_compat`` is basically the initial
|
|
value used for "PyInterpreterState.strict_extension_compat".
|
|
|
|
Restricting Extension Modules
|
|
-----------------------------
|
|
|
|
Extension modules have many of the same problems as the runtime when
|
|
state is stored in global variables. :pep:`630` covers all the details
|
|
of what extensions must do to support isolation, and thus safely run in
|
|
multiple interpreters at once. This includes dealing with their globals.
|
|
|
|
If an extension implements multi-phase init (see :pep:`489`) it is
|
|
considered compatible with multiple interpreters. All other extensions
|
|
are considered incompatible. (See `Extension Module Thread Safety`_
|
|
for more details about how a per-interpreter GIL may affect that
|
|
classification.)
|
|
|
|
If an incompatible extension is imported and the current
|
|
"PyInterpreterState.strict_extension_compat" value is ``true`` then the import
|
|
system will raise ``ImportError``. (For ``false`` it simply doesn't check.)
|
|
This will be done through
|
|
``importlib._bootstrap_external.ExtensionFileLoader`` (really, through
|
|
``_imp.create_dynamic()``, ``_PyImport_LoadDynamicModuleWithSpec()``, and
|
|
``PyModule_FromDefAndSpec2()``).
|
|
|
|
Such imports will never fail in the main interpreter (or in interpreters
|
|
created through ``Py_NewInterpreter()``) since
|
|
"PyInterpreterState.strict_extension_compat" initializes to ``false`` in both
|
|
cases. Thus the legacy (pre-3.12) behavior is preserved.
|
|
|
|
We will work with popular extensions to help them support use in
|
|
multiple interpreters. This may involve adding to CPython's public C-API,
|
|
which we will address on a case-by-case basis.
|
|
|
|
Extension Module Compatibility
|
|
''''''''''''''''''''''''''''''
|
|
|
|
As noted in `Extension Modules`_, many extensions work fine in multiple
|
|
interpreters (and under a per-interpreter GIL) without needing any
|
|
changes. The import system will still fail if such a module doesn't
|
|
explicitly indicate support. At first, not many extension modules
|
|
will, so this is a potential source of frustration.
|
|
|
|
We will address this by adding a context manager to temporarily disable
|
|
the check on multiple interpreter support:
|
|
``importlib.util.allow_all_extensions()``. More or less, it will modify
|
|
the current "PyInterpreterState.strict_extension_compat" value (e.g. through
|
|
a private ``sys`` function).
|
|
|
|
Extension Module Thread Safety
|
|
''''''''''''''''''''''''''''''
|
|
|
|
If a module supports use with multiple interpreters, that mostly implies
|
|
it will work even if those interpreters do not share the GIL. The one
|
|
caveat is where a module links against a library with internal global
|
|
state that isn't thread-safe. (Even something as innocuous as a static
|
|
local variable as a temporary buffer can be a problem.) With a shared
|
|
GIL, that state is protected. Without one, such modules must wrap any
|
|
use of that state (e.g. through calls) with a lock.
|
|
|
|
Currently, it isn't clear whether or not supports-multiple-interpreters
|
|
is sufficiently equivalent to supports-per-interpreter-gil, such that
|
|
we can avoid any special accommodations. This is still a point of
|
|
meaningful discussion and investigation. The practical distinction
|
|
between the two (in the Python community, e.g. PyPI) is not yet
|
|
understood well enough to settle the matter. Likewise, it isn't clear
|
|
what we might be able to do to help extension maintainers mitigate
|
|
the problem (assuming it is one).
|
|
|
|
In the meantime, we must proceed as though the difference would be
|
|
large enough to cause problems for enough extension modules out there.
|
|
The solution we would apply is:
|
|
|
|
* add a ``PyModuleDef`` slot that indicates an extension can be imported
|
|
under a per-interpreter GIL (i.e. opt in)
|
|
* add that slot as part of the definition of a "compatible" extension,
|
|
as discussed earlier
|
|
|
|
The downside is that not a single extension module will be able to take
|
|
advantage of the per-interpreter GIL without extra effort by the module
|
|
maintainer, regardless of how minor that effort. This compounds the
|
|
problem described in `Extension Module Compatibility`_ and the same
|
|
workaround applies. Ideally, we would determine that there isn't enough
|
|
difference to matter.
|
|
|
|
If we do end up requiring an opt-in for imports under a per-interpreter
|
|
GIL, and later determine it isn't necessary, then we can switch the
|
|
default at that point, make the old opt-in slot a noop, and add a new
|
|
``PyModuleDef`` slot for explicitly opting *out*. In fact, it makes
|
|
sense to add that opt-out slot from the beginning.
|
|
|
|
|
|
Documentation
|
|
-------------
|
|
|
|
* C-API: the "Sub-interpreter support" section of ``Doc/c-api/init.rst``
|
|
will detail the updated API
|
|
* C-API: that section will explain about the consequences of
|
|
a per-interpreter GIL
|
|
* importlib: the ``ExtensionFileLoader`` entry will note import
|
|
may fail in subinterpreters
|
|
* importlib: there will be a new entry about
|
|
``importlib.util.allow_all_extensions()``
|
|
|
|
|
|
Impact
|
|
======
|
|
|
|
Backwards Compatibility
|
|
-----------------------
|
|
|
|
No behavior or APIs are intended to change due to this proposal,
|
|
with two exceptions:
|
|
|
|
* some extensions will fail to import in some subinterpreters
|
|
(see `the next section <Extension Modules_>`_)
|
|
* "mem" and "object" allocators that are currently not thread-safe
|
|
may now be susceptible to data races when used in combination
|
|
with multiple interpreters
|
|
|
|
The existing C-API for managing interpreters will preserve its current
|
|
behavior, with new behavior exposed through new API. No other API
|
|
or runtime behavior is meant to change, including compatibility with
|
|
the stable ABI.
|
|
|
|
See `Objects Exposed in the C-API`_ below for related discussion.
|
|
|
|
Extension Modules
|
|
'''''''''''''''''
|
|
|
|
Currently the most common usage of Python, by far, is with the main
|
|
interpreter running by itself. This proposal has zero impact on
|
|
extension modules in that scenario. Likewise, for better or worse,
|
|
there is no change in behavior under multiple interpreters created
|
|
using the existing ``Py_NewInterpreter()``.
|
|
|
|
Keep in mind that some extensions already break when used in multiple
|
|
interpreters, due to keeping module state in global variables (or
|
|
due to the `internal state of linked libraries`_). They
|
|
may crash or, worse, experience inconsistent behavior. That was part
|
|
of the motivation for :pep:`630` and friends, so this is not a new
|
|
situation nor a consequence of this proposal.
|
|
|
|
.. _internal state of linked libraries: https://github.com/pyca/cryptography/issues/2299
|
|
|
|
In contrast, when the `proposed API <proposed capi_>`_ is used to
|
|
create multiple interpreters, with the appropriate settings,
|
|
the behavior will change for incompatible extensions. In that case,
|
|
importing such an extension will fail (outside the main interpreter),
|
|
as explained in `Restricting Extension Modules`_. For extensions that
|
|
already break in multiple interpreters, this will be an improvement.
|
|
|
|
Additionally, some extension modules link against libraries with
|
|
thread-unsafe internal global state.
|
|
(See `Extension Module Thread Safety`_.)
|
|
Such modules will have to start wrapping any direct or indirect use
|
|
of that state in a lock. This is the key difference from other modules
|
|
that also implement multi-phase init and thus indicate support for
|
|
multiple interpreters (i.e. isolation).
|
|
|
|
Now we get to the break in compatibility mentioned above. Some
|
|
extensions are safe under multiple interpreters (and a per-interpreter
|
|
GIL), even though they haven't indicated that. Unfortunately, there is
|
|
no reliable way for the import system to infer that such an extension
|
|
is safe, so importing them will still fail. This case is addressed
|
|
in `Extension Module Compatibility`_ above.
|
|
|
|
Extension Module Maintainers
|
|
----------------------------
|
|
|
|
One related consideration is that a per-interpreter GIL will likely
|
|
drive increased use of multiple interpreters, particularly if :pep:`554`
|
|
is accepted. Some maintainers of large extension modules have expressed
|
|
concern about the increased burden they anticipate due to increased
|
|
use of multiple interpreters.
|
|
|
|
Specifically, enabling support for multiple interpreters will require
|
|
substantial work for some extension modules (albeit likely not many).
|
|
To add that support, the maintainer(s) of such a module (often
|
|
volunteers) would have to set aside their normal priorities and
|
|
interests to focus on compatibility (see :pep:`630`).
|
|
|
|
Of course, extension maintainers are free to not add support for use
|
|
in multiple interpreters. However, users will increasingly demand
|
|
such support, especially if the feature grows in popularity.
|
|
|
|
Either way, the situation can be stressful for maintainers of such
|
|
extensions, particularly when they are doing the work in their spare
|
|
time. The concerns they have expressed are understandable, and we address
|
|
the partial solution in the `Restricting Extension Modules`_ and
|
|
`Extension Module Compatibility`_ sections.
|
|
|
|
Alternate Python Implementations
|
|
--------------------------------
|
|
|
|
Other Python implementation are not required to provide support for
|
|
multiple interpreters in the same process (though some do already).
|
|
|
|
Security Implications
|
|
---------------------
|
|
|
|
There is no known impact to security with this proposal.
|
|
|
|
Maintainability
|
|
---------------
|
|
|
|
On the one hand, this proposal has already motivated a number of
|
|
improvements that make CPython *more* maintainable. That is expected
|
|
to continue. On the other hand, the underlying work has already
|
|
exposed various pre-existing defects in the runtime that have had
|
|
to be fixed. That is also expected to continue as multiple interpreters
|
|
receive more use. Otherwise, there shouldn't be a significant impact
|
|
on maintainability, so the net effect should be positive.
|
|
|
|
Performance
|
|
-----------
|
|
|
|
The work to consolidate globals has already provided a number of
|
|
improvements to CPython's performance, both speeding it up and using
|
|
less memory, and this should continue. The performance benefits of a
|
|
per-interpreter GIL specifically have not been explored. At the very
|
|
least, it is not expected to make CPython slower
|
|
(as long as interpreters are sufficiently isolated). And, obviously,
|
|
it enable a variety of multi-core parallelism in Python code.
|
|
|
|
|
|
How to Teach This
|
|
=================
|
|
|
|
Unlike :pep:`554`, this is an advanced feature meant for a narrow set
|
|
of users of the C-API. There is no expectation that the specifics of
|
|
the API nor its direct application will be taught.
|
|
|
|
That said, if it were taught then it would boil down to the following:
|
|
|
|
In addition to Py_NewInterpreter(), you can use
|
|
Py_NewInterpreterFromConfig() to create an interpreter.
|
|
The config you pass it indicates how you want that
|
|
interpreter to behave.
|
|
|
|
Furthermore, the maintainers of any extension modules that create
|
|
isolated interpreters will likely need to explain the consequences
|
|
of a per-interpreter GIL to their users. The first thing to explain
|
|
is what :pep:`554` teaches about the concurrency model that isolated
|
|
interpreters enables. That leads into the point that Python software
|
|
written using that concurrency model can then take advantage
|
|
of multi-core parallelism, which is currently
|
|
prevented by the GIL.
|
|
|
|
.. XXX We should add docs (a la PEP 630) that spell out how to make
|
|
an extension compatible with per-interpreter GIL.
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
<TBD>
|
|
|
|
|
|
Open Issues
|
|
===========
|
|
|
|
* Are we okay to require "mem" and "object" allocators to be thread-safe?
|
|
* How would a per-interpreter tracemalloc module relate to global allocators?
|
|
* Would the faulthandler module be limited to the main interpreter
|
|
(like the signal module) or would we leak that global state between
|
|
interpreters (protected by a granular lock)?
|
|
* Split out an informational PEP with all the relevant info,
|
|
based on the "Consolidating Runtime Global State" section?
|
|
* How likely is it that a module works under multiple interpreters
|
|
(isolation) but doesn't work under a per-interpreter GIL?
|
|
(See `Extension Module Thread Safety`_.)
|
|
* If it is likely enough, what can we do to help extension maintainers
|
|
mitigate the problem and enjoy use under a per-interpreter GIL?
|
|
* What would be a better (scarier-sounding) name
|
|
for ``allow_all_extensions``?
|
|
|
|
|
|
Deferred Functionality
|
|
======================
|
|
|
|
* ``PyInterpreterConfig`` option to always run the interpreter in a new thread
|
|
* ``PyInterpreterConfig`` option to assign a "main" thread to the interpreter
|
|
and only run in that thread
|
|
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
<TBD>
|
|
|
|
|
|
Extra Context
|
|
=============
|
|
|
|
Sharing Global Objects
|
|
----------------------
|
|
|
|
We are sharing some global objects between interpreters.
|
|
This is an implementation detail and relates more to
|
|
`globals consolidation <Consolidating Runtime Global State>`_
|
|
than to this proposal, but it is a significant enough detail
|
|
to explain here.
|
|
|
|
The alternative is to share no objects between interpreters, ever.
|
|
To accomplish that, we'd have to sort out the fate of all our static
|
|
types, as well as deal with compatibility issues for the many objects
|
|
`exposed in the public C-API <capi objects_>`_.
|
|
|
|
That approach introduces a meaningful amount of extra complexity
|
|
and higher risk, though prototyping has demonstrated valid solutions.
|
|
Also, it would likely result in a performance penalty.
|
|
|
|
`Immortal objects <Depending on Immortal Objects_>`_ allow us to
|
|
share the otherwise immutable global objects. That way we avoid
|
|
the extra costs.
|
|
|
|
.. _capi objects:
|
|
|
|
Objects Exposed in the C-API
|
|
''''''''''''''''''''''''''''
|
|
|
|
The C-API (including the limited API) exposes all the builtin types,
|
|
including the builtin exceptions, as well as the builtin singletons.
|
|
The exceptions are exposed as ``PyObject *`` but the rest are exposed
|
|
as the static values rather than pointers. This was one of the few
|
|
non-trivial problems we had to solve for per-interpreter GIL.
|
|
|
|
With immortal objects this is a non-issue.
|
|
|
|
|
|
Consolidating Runtime Global State
|
|
----------------------------------
|
|
|
|
As noted in `CPython Runtime State`_ above, there is an active effort
|
|
(separate from this PEP) to consolidate CPython's global state into the
|
|
``_PyRuntimeState`` struct. Nearly all the work involves moving that
|
|
state from global variables. The project is particularly relevant to
|
|
this proposal, so below is some extra detail.
|
|
|
|
Benefits to Consolidation
|
|
'''''''''''''''''''''''''
|
|
|
|
Consolidating the globals has a variety of benefits:
|
|
|
|
* greatly reduces the number of C globals (best practice for C code)
|
|
* the move draws attention to runtime state that is unstable or broken
|
|
* encourages more consistency in how runtime state is used
|
|
* makes it easier to discover/identify CPython's runtime state
|
|
* makes it easier to statically allocate runtime state in a consistent way
|
|
* better memory locality for runtime state
|
|
|
|
Furthermore all the benefits listed in `Indirect Benefits`_ above also
|
|
apply here, and the same projects listed there benefit.
|
|
|
|
Scale of Work
|
|
'''''''''''''
|
|
|
|
The number of global variables to be moved is large enough to matter,
|
|
but most are Python objects that can be dealt with in large groups
|
|
(like ``Py_IDENTIFIER``). In nearly all cases, moving these globals
|
|
to the interpreter is highly mechanical. That doesn't require
|
|
cleverness but instead requires someone to put in the time.
|
|
|
|
State To Be Moved
|
|
'''''''''''''''''
|
|
|
|
The remaining global variables can be categorized as follows:
|
|
|
|
* global objects
|
|
|
|
* static types (incl. exception types)
|
|
* non-static types (incl. heap types, structseq types)
|
|
* singletons (static)
|
|
* singletons (initialized once)
|
|
* cached objects
|
|
|
|
* non-objects
|
|
|
|
* will not (or unlikely to) change after init
|
|
* only used in the main thread
|
|
* initialized lazily
|
|
* pre-allocated buffers
|
|
* state
|
|
|
|
Those globals are spread between the core runtime, the builtin modules,
|
|
and the stdlib extension modules.
|
|
|
|
For a breakdown of the remaining globals, run:
|
|
|
|
.. code-block:: bash
|
|
|
|
./python Tools/c-analyzer/table-file.py Tools/c-analyzer/cpython/globals-to-fix.tsv
|
|
|
|
Already Completed Work
|
|
''''''''''''''''''''''
|
|
|
|
As mentioned, this work has been going on for many years. Here are some
|
|
of the things that have already been done:
|
|
|
|
* cleanup of runtime initialization (see :pep:`432` / :pep:`587`)
|
|
* extension module isolation machinery (see :pep:`384` / :pep:`3121` / :pep:`489`)
|
|
* isolation for many builtin modules
|
|
* isolation for many stdlib extension modules
|
|
* addition of ``_PyRuntimeState``
|
|
* no more ``_Py_IDENTIFIER()``
|
|
* statically allocated:
|
|
|
|
* empty string
|
|
* string literals
|
|
* identifiers
|
|
* latin-1 strings
|
|
* length-1 bytes
|
|
* empty tuple
|
|
|
|
Tooling
|
|
'''''''
|
|
|
|
As already indicated, there are several tools to help identify the
|
|
globals and reason about them.
|
|
|
|
* ``Tools/c-analyzer/cpython/globals-to-fix.tsv`` - the list of remaining globals
|
|
* ``Tools/c-analyzer/c-analyzer.py``
|
|
|
|
* ``analyze`` - identify all the globals
|
|
* ``check`` - fail if there are any unsupported globals that aren't ignored
|
|
|
|
* ``Tools/c-analyzer/table-file.py`` - summarize the known globals
|
|
|
|
Also, the check for unsupported globals is incorporated into CI so that
|
|
no new globals are accidentally added.
|
|
|
|
Global Objects
|
|
''''''''''''''
|
|
|
|
Global objects that are safe to be shared (without a GIL) between
|
|
interpreters can stay on ``_PyRuntimeState``. Not only must the object
|
|
be effectively immutable (e.g. singletons, strings), but not even the
|
|
refcount can change for it to be safe. Immortality (:pep:`683`)
|
|
provides that. (The alternative is that no objects are shared, which
|
|
adds significant complexity to the solution, particularly for the
|
|
objects `exposed in the public C-API <capi objects_>`_.)
|
|
|
|
Builtin static types are a special case of global objects that will be
|
|
shared. They are effectively immutable except for one part:
|
|
``__subclasses__`` (AKA ``tp_subclasses``). We expect that nothing
|
|
else on a builtin type will change, even the content
|
|
of ``__dict__`` (AKA ``tp_dict``).
|
|
|
|
``__subclasses__`` for the builtin types will be dealt with by making
|
|
it a getter that does a lookup on the current ``PyInterpreterState``
|
|
for that type.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
Related:
|
|
|
|
* :pep:`384` "Defining a Stable ABI"
|
|
* :pep:`432` "Restructuring the CPython startup sequence"
|
|
* :pep:`489` "Multi-phase extension module initialization"
|
|
* :pep:`554` "Multiple Interpreters in the Stdlib"
|
|
* :pep:`573` "Module State Access from C Extension Methods"
|
|
* :pep:`587` "Python Initialization Configuration"
|
|
* :pep:`630` "Isolating Extension Modules"
|
|
* :pep:`683` "Immortal Objects, Using a Fixed Refcount"
|
|
* :pep:`3121` "Extension Module Initialization and Finalization"
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|