<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="color-scheme" content="light dark">
<title>PEP 684 A Per-Interpreter GIL | peps.python.org</title>
<link rel="shortcut icon" href="../_static/py.png">
<link rel="canonical" href="https://peps.python.org/pep-0684/">
<link rel="stylesheet" href="../_static/style.css" type="text/css">
<link rel="stylesheet" href="../_static/mq.css" type="text/css">
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light">
<link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark">
<link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss">
<meta property="og:title" content='PEP 684 A Per-Interpreter GIL | peps.python.org'>
<meta property="og:description" content="Since Python 1.5 (1997), CPython users can run multiple interpreters in the same process. However, interpreters in the same process have always shared a significant amount of global state. This is a source of bugs, with a growing impact as more and mo...">
<meta property="og:type" content="website">
<meta property="og:url" content="https://peps.python.org/pep-0684/">
<meta property="og:site_name" content="Python Enhancement Proposals (PEPs)">
<meta property="og:image" content="https://peps.python.org/_static/og-image.png">
<meta property="og:image:alt" content="Python PEPs">
<meta property="og:image:width" content="200">
<meta property="og:image:height" content="200">
<meta name="description" content="Since Python 1.5 (1997), CPython users can run multiple interpreters in the same process. However, interpreters in the same process have always shared a significant amount of global state. This is a source of bugs, with a growing impact as more and mo...">
<meta name="theme-color" content="#3776ab">
</head>
<body>
<svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
<symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all">
<title>Following system colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="9"></circle>
<path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path>
</svg>
</symbol>
<symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all">
<title>Selected dark colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path stroke="none" d="M0 0h24v24H0z" fill="none"></path>
<path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path>
</svg>
</symbol>
<symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all">
<title>Selected light colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="5"></circle>
<line x1="12" y1="1" x2="12" y2="3"></line>
<line x1="12" y1="21" x2="12" y2="23"></line>
<line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
<line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
<line x1="1" y1="12" x2="3" y2="12"></line>
<line x1="21" y1="12" x2="23" y2="12"></line>
<line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
<line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
</svg>
</symbol>
</svg>
<script>
document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto"
</script>
<section id="pep-page-section">
<header>
<h1>Python Enhancement Proposals</h1>
<ul class="breadcrumbs">
<li><a href="https://www.python.org/" title="The Python Programming Language">Python</a> &raquo; </li>
<li><a href="../pep-0000/">PEP Index</a> &raquo; </li>
<li>PEP 684</li>
</ul>
<button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())">
<svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg>
<span class="visually-hidden">Toggle light / dark / auto colour theme</span>
</button>
</header>
<article>
<section id="pep-content">
<h1 class="page-title">PEP 684 A Per-Interpreter GIL</h1>
<dl class="rfc2822 field-list simple">
<dt class="field-odd">Author<span class="colon">:</span></dt>
<dd class="field-odd">Eric Snow &lt;ericsnowcurrently&#32;&#97;t&#32;gmail.com&gt;</dd>
<dt class="field-even">Discussions-To<span class="colon">:</span></dt>
<dd class="field-even"><a class="reference external" href="https://discuss.python.org/t/pep-684-a-per-interpreter-gil/19583">Discourse thread</a></dd>
<dt class="field-odd">Status<span class="colon">:</span></dt>
<dd class="field-odd"><abbr title="Accepted and implementation complete, or no longer active">Final</abbr></dd>
<dt class="field-even">Type<span class="colon">:</span></dt>
<dd class="field-even"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd>
<dt class="field-odd">Requires<span class="colon">:</span></dt>
<dd class="field-odd"><a class="reference external" href="../pep-0683/">683</a></dd>
<dt class="field-even">Created<span class="colon">:</span></dt>
<dd class="field-even">08-Mar-2022</dd>
<dt class="field-odd">Python-Version<span class="colon">:</span></dt>
<dd class="field-odd">3.12</dd>
<dt class="field-even">Post-History<span class="colon">:</span></dt>
<dd class="field-even"><a class="reference external" href="https://mail.python.org/archives/list/python-dev&#64;python.org/thread/CF7B7FMACFYDAHU6NPBEVEY6TOSGICXU/" title="Python-Dev thread">08-Mar-2022</a>,
<a class="reference external" href="https://discuss.python.org/t/pep-684-a-per-interpreter-gil/19583" title="Discourse thread">29-Sep-2022</a>,
<a class="reference external" href="https://discuss.python.org/t/pep-684-a-per-interpreter-gil/19583/19/" title="Discourse message">28-Oct-2022</a></dd>
<dt class="field-odd">Resolution<span class="colon">:</span></dt>
<dd class="field-odd"><a class="reference external" href="https://discuss.python.org/t/19583/42">Discourse message</a></dd>
</dl>
<hr class="docutils" />
<section id="contents">
<details><summary>Table of Contents</summary><ul class="simple">
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#high-level-summary">High-Level Summary</a><ul>
<li><a class="reference internal" href="#the-gil">The GIL</a></li>
<li><a class="reference internal" href="#cpython-runtime-state">CPython Runtime State</a></li>
<li><a class="reference internal" href="#other-isolation-considerations">Other Isolation Considerations</a></li>
<li><a class="reference internal" href="#depending-on-immortal-objects">Depending on Immortal Objects</a></li>
</ul>
</li>
<li><a class="reference internal" href="#motivation">Motivation</a><ul>
<li><a class="reference internal" href="#indirect-benefits">Indirect Benefits</a></li>
<li><a class="reference internal" href="#existing-use-of-multiple-interpreters">Existing Use of Multiple Interpreters</a></li>
<li><a class="reference internal" href="#pep-554-multiple-interpreters-in-the-stdlib">PEP 554 (Multiple Interpreters in the Stdlib)</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rationale">Rationale</a></li>
<li><a class="reference internal" href="#specification">Specification</a><ul>
<li><a class="reference internal" href="#per-interpreter-state">Per-Interpreter State</a><ul>
<li><a class="reference internal" href="#memory-allocators">Memory Allocators</a></li>
</ul>
</li>
<li><a class="reference internal" href="#c-api">C-API</a><ul>
<li><a class="reference internal" href="#pyinterpreterconfig-own-gil">PyInterpreterConfig.own_gil</a></li>
<li><a class="reference internal" href="#pyinterpreterconfig-strict-extensions-compat">PyInterpreterConfig.strict_extensions_compat</a></li>
</ul>
</li>
<li><a class="reference internal" href="#restricting-extension-modules">Restricting Extension Modules</a><ul>
<li><a class="reference internal" href="#extension-module-compatibility">Extension Module Compatibility</a></li>
<li><a class="reference internal" href="#extension-module-thread-safety">Extension Module Thread Safety</a></li>
</ul>
</li>
<li><a class="reference internal" href="#documentation">Documentation</a></li>
</ul>
</li>
<li><a class="reference internal" href="#impact">Impact</a><ul>
<li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a><ul>
<li><a class="reference internal" href="#extension-modules">Extension Modules</a></li>
</ul>
</li>
<li><a class="reference internal" href="#extension-module-maintainers">Extension Module Maintainers</a></li>
<li><a class="reference internal" href="#alternate-python-implementations">Alternate Python Implementations</a></li>
<li><a class="reference internal" href="#security-implications">Security Implications</a></li>
<li><a class="reference internal" href="#maintainability">Maintainability</a></li>
<li><a class="reference internal" href="#performance">Performance</a></li>
</ul>
</li>
<li><a class="reference internal" href="#how-to-teach-this">How to Teach This</a></li>
<li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li>
<li><a class="reference internal" href="#open-issues">Open Issues</a></li>
<li><a class="reference internal" href="#deferred-functionality">Deferred Functionality</a></li>
<li><a class="reference internal" href="#rejected-ideas">Rejected Ideas</a></li>
<li><a class="reference internal" href="#extra-context">Extra Context</a><ul>
<li><a class="reference internal" href="#sharing-global-objects">Sharing Global Objects</a><ul>
<li><a class="reference internal" href="#objects-exposed-in-the-c-api">Objects Exposed in the C-API</a></li>
</ul>
</li>
<li><a class="reference internal" href="#consolidating-runtime-global-state">Consolidating Runtime Global State</a><ul>
<li><a class="reference internal" href="#benefits-to-consolidation">Benefits to Consolidation</a></li>
<li><a class="reference internal" href="#scale-of-work">Scale of Work</a></li>
<li><a class="reference internal" href="#state-to-be-moved">State To Be Moved</a></li>
<li><a class="reference internal" href="#already-completed-work">Already Completed Work</a></li>
<li><a class="reference internal" href="#tooling">Tooling</a></li>
<li><a class="reference internal" href="#global-objects">Global Objects</a></li>
</ul>
</li>
</ul>
</li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</details></section>
<section id="abstract">
<h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2>
<p>Since Python 1.5 (1997), CPython users have been able to run multiple interpreters
in the same process. However, interpreters in the same process
have always shared a significant
amount of global state. This is a source of bugs, with a growing
impact as more and more people use the feature. Furthermore,
sufficient isolation would facilitate true multi-core parallelism,
where interpreters no longer share the GIL. The changes outlined in
this proposal will result in that level of interpreter isolation.</p>
</section>
<section id="high-level-summary">
<h2><a class="toc-backref" href="#high-level-summary" role="doc-backlink">High-Level Summary</a></h2>
<p>At a high level, this proposal changes CPython in the following ways:</p>
<ul class="simple">
<li>stops sharing the GIL between interpreters, given sufficient isolation</li>
<li>adds several new interpreter config options for isolation settings</li>
<li>keeps incompatible extensions from causing problems</li>
</ul>
<section id="the-gil">
<h3><a class="toc-backref" href="#the-gil" role="doc-backlink">The GIL</a></h3>
<p>The GIL protects concurrent access to most of CPython's runtime state.
So all that GIL-protected global state must move to each interpreter
before the GIL can.</p>
<p>(In a handful of cases, other mechanisms can be used to ensure
thread-safe sharing instead, such as locks or “immortal” objects.)</p>
</section>
<section id="cpython-runtime-state">
<h3><a class="toc-backref" href="#cpython-runtime-state" role="doc-backlink">CPython Runtime State</a></h3>
<p>Properly isolating interpreters requires that most of CPython's
runtime state be stored in the <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code> struct. Currently,
only a portion of it is; the rest is found either in C global variables
or in <code class="docutils literal notranslate"><span class="pre">_PyRuntimeState</span></code>. Most of that will have to be moved.</p>
<p>This directly coincides with an ongoing effort (of many years) to greatly
reduce internal use of global variables and consolidate the runtime
state into <code class="docutils literal notranslate"><span class="pre">_PyRuntimeState</span></code> and <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code>.
(See <a class="reference internal" href="#consolidating-runtime-global-state">Consolidating Runtime Global State</a> below.) That project has
<a class="reference internal" href="#benefits-to-consolidation">significant merit on its own</a>
and has faced little controversy. So, while a per-interpreter GIL
relies on the completion of that effort, that project should not be
considered a part of this proposal, only a dependency.</p>
</section>
<section id="other-isolation-considerations">
<h3><a class="toc-backref" href="#other-isolation-considerations" role="doc-backlink">Other Isolation Considerations</a></h3>
<p>CPython's interpreters must be strictly isolated from each other, with
few exceptions. To a large extent they already are. Each interpreter
has its own copy of all modules, classes, functions, and variables.
The CPython C-API docs <a class="reference external" href="https://docs.python.org/3/c-api/init.html#bugs-and-caveats">explain further</a>.</p>
<p>However, aside from what has already been mentioned (e.g. the GIL),
there are a couple of ways in which interpreters still share some state.</p>
<p>First of all, some process-global resources (e.g. memory,
file descriptors, environment variables) are shared. There are no
plans to change this.</p>
<p>Second, some isolation is faulty due to bugs or implementations that
did not take multiple interpreters into account. This includes
CPython's runtime and the stdlib, as well as extension modules that
rely on global variables. Bugs should be opened in these cases,
as some already have been.</p>
</section>
<section id="depending-on-immortal-objects">
<h3><a class="toc-backref" href="#depending-on-immortal-objects" role="doc-backlink">Depending on Immortal Objects</a></h3>
<p><a class="pep reference internal" href="../pep-0683/" title="PEP 683 Immortal Objects, Using a Fixed Refcount">PEP 683</a> introduces immortal objects as a CPython-internal feature.
With immortal objects, we can share any otherwise immutable global
objects between all interpreters. Consequently, this PEP does not
need to address how to deal with the various objects
<a class="reference internal" href="#capi-objects">exposed in the public C-API</a>.
It also simplifies the question of what to do about the builtin
static types. (See <a class="reference internal" href="#global-objects">Global Objects</a> below.)</p>
<p>Both issues have alternate solutions, but everything is simpler with
immortal objects. If PEP 683 is not accepted then this one will be
updated with the alternatives. This lets us reduce noise in this
proposal.</p>
</section>
</section>
<section id="motivation">
<h2><a class="toc-backref" href="#motivation" role="doc-backlink">Motivation</a></h2>
<p>The fundamental problem we're solving here is a lack of true multi-core
parallelism (for Python code) in the CPython runtime. The GIL is the
cause. While it usually isn't a problem in practice, at the very least
it makes Python's multi-core story murky, which makes the GIL
a consistent distraction.</p>
<p>Isolated interpreters are also an effective mechanism to support
certain concurrency models. <a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a> discusses this in more detail.</p>
<section id="indirect-benefits">
<h3><a class="toc-backref" href="#indirect-benefits" role="doc-backlink">Indirect Benefits</a></h3>
<p>Most of the effort needed for a per-interpreter GIL has benefits that
make those tasks worth doing anyway:</p>
<ul class="simple">
<li>makes multiple-interpreter behavior more reliable</li>
<li>has led to fixes for long-standing runtime bugs that otherwise
hadn't been prioritized</li>
<li>has been exposing (and inspiring fixes for) previously unknown runtime bugs</li>
<li>has driven cleaner runtime initialization (<a class="pep reference internal" href="../pep-0432/" title="PEP 432 Restructuring the CPython startup sequence">PEP 432</a>, <a class="pep reference internal" href="../pep-0587/" title="PEP 587 Python Initialization Configuration">PEP 587</a>)</li>
<li>has driven cleaner and more complete runtime finalization</li>
<li>led to structural layering of the C-API (e.g. <code class="docutils literal notranslate"><span class="pre">Include/internal</span></code>)</li>
<li>also see <a class="reference internal" href="#benefits-to-consolidation">Benefits to Consolidation</a> below</li>
</ul>
<p>Furthermore, much of that work benefits other CPython-related projects:</p>
<ul class="simple">
<li>performance improvements (“<a class="reference external" href="https://github.com/faster-cpython/ideas">faster-cpython</a>”)</li>
<li>pre-fork application deployment (e.g. <a class="reference external" href="https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf">Instagram server</a>)</li>
<li>extension module isolation (see <a class="pep reference internal" href="../pep-0630/" title="PEP 630 Isolating Extension Modules">PEP 630</a>, etc.)</li>
<li>embedding CPython</li>
</ul>
</section>
<section id="existing-use-of-multiple-interpreters">
<h3><a class="toc-backref" href="#existing-use-of-multiple-interpreters" role="doc-backlink">Existing Use of Multiple Interpreters</a></h3>
<p>The C-API for multiple interpreters has been used for many years.
However, until relatively recently the feature wasn't widely known,
nor extensively used (with the exception of mod_wsgi).</p>
<p>In the last few years use of multiple interpreters has been increasing.
Here are some of the public projects using the feature currently:</p>
<ul class="simple">
<li><a class="reference external" href="https://github.com/GrahamDumpleton/mod_wsgi">mod_wsgi</a></li>
<li><a class="reference external" href="https://github.com/ceph/ceph/pull/14971">OpenStack Ceph</a></li>
<li><a class="reference external" href="https://github.com/ninia/jep">JEP</a></li>
<li><a class="reference external" href="https://github.com/xbmc/xbmc">Kodi</a></li>
</ul>
<p>Note that, with <a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a>, multiple interpreter usage would likely
grow significantly (via Python code rather than the C-API).</p>
</section>
<section id="pep-554-multiple-interpreters-in-the-stdlib">
<h3><a class="toc-backref" href="#pep-554-multiple-interpreters-in-the-stdlib" role="doc-backlink">PEP 554 (Multiple Interpreters in the Stdlib)</a></h3>
<p><a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a> is strictly about providing a minimal stdlib module
to give users access to multiple interpreters from Python code.
In fact, it specifically avoids proposing any changes related to
the GIL. Consider, however, that users of that module would benefit
from a per-interpreter GIL, which makes PEP 554 more appealing.</p>
</section>
</section>
<section id="rationale">
<h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2>
<p>During initial investigations in 2014, a variety of possible solutions
for multi-core Python were explored, but each had its drawbacks
without simple solutions:</p>
<ul class="simple">
<li>the existing practice of releasing the GIL in extension modules<ul>
<li>doesn't help with Python code</li>
</ul>
</li>
<li>other Python implementations (e.g. Jython, IronPython)<ul>
<li>CPython dominates the community</li>
</ul>
</li>
<li>remove the GIL (e.g. gilectomy, “no-gil”)<ul>
<li>too much technical risk (at the time)</li>
</ul>
</li>
<li>Trent Nelson's “PyParallel” project<ul>
<li>incomplete; Windows-only at the time</li>
</ul>
</li>
<li><code class="docutils literal notranslate"><span class="pre">multiprocessing</span></code><ul>
<li>too much work to make it effective enough;
high penalties in some situations (at large scale, Windows)</li>
</ul>
</li>
<li>other parallelism tools (e.g. dask, ray, MPI)<ul>
<li>not a fit for the runtime/stdlib</li>
</ul>
</li>
<li>give up on multi-core (e.g. async, do nothing)<ul>
<li>this can only end in tears</li>
</ul>
</li>
</ul>
<p>Even in 2014, it was fairly clear that a solution using isolated
interpreters did not have a high level of technical risk and that
most of the work was worth doing anyway.
(The downside was the volume of work to be done.)</p>
</section>
<section id="specification">
<h2><a class="toc-backref" href="#specification" role="doc-backlink">Specification</a></h2>
<p>As <a class="reference internal" href="#high-level-summary">summarized above</a>, this proposal involves the
following changes, in the order they must happen:</p>
<ol class="arabic simple">
<li><a class="reference internal" href="#consolidating-runtime-global-state">consolidate global runtime state</a>
(including objects) into <code class="docutils literal notranslate"><span class="pre">_PyRuntimeState</span></code></li>
<li>move nearly all of the state down into <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code></li>
<li>finally, move the GIL down into <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code></li>
<li>everything else<ul class="simple">
<li>update the C-API</li>
<li>implement extension module restrictions</li>
<li>work with popular extension maintainers to help
with multi-interpreter support</li>
</ul>
</li>
</ol>
<section id="per-interpreter-state">
<h3><a class="toc-backref" href="#per-interpreter-state" role="doc-backlink">Per-Interpreter State</a></h3>
<p>The following runtime state will be moved to <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code>:</p>
<ul class="simple">
<li>all global objects that are not safely shareable (i.e. not fully immutable)</li>
<li>the GIL</li>
<li>most mutable data that's currently protected by the GIL</li>
<li>mutable data that's currently protected by some other per-interpreter lock</li>
<li>mutable data that may be used independently in different interpreters
(also applies to extension modules, including those with multi-phase init)</li>
<li>all other mutable data not otherwise excluded below</li>
</ul>
<p>Furthermore, a portion of the full global state has already been
moved to the interpreter, including GC, warnings, and atexit hooks.</p>
<p>The following runtime state will not be moved:</p>
<ul class="simple">
<li>global objects that are safely shareable, if any</li>
<li>immutable data, often <code class="docutils literal notranslate"><span class="pre">const</span></code></li>
<li>effectively immutable data (treated as immutable), for example:<ul>
<li>some state is initialized early and never modified again</li>
<li>hashes for strings (<code class="docutils literal notranslate"><span class="pre">PyUnicodeObject</span></code>) are idempotently calculated
when first needed and then cached</li>
</ul>
</li>
<li>all data that is guaranteed to be modified exclusively in the main thread,
including:<ul>
<li>state used only in CPython's <code class="docutils literal notranslate"><span class="pre">main()</span></code></li>
<li>the REPL's state</li>
<li>data modified only during runtime init (effectively immutable afterward)</li>
</ul>
</li>
<li>mutable data that's protected by some global lock (other than the GIL)</li>
<li>global state in atomic variables</li>
<li>mutable global state that can be changed (sensibly) to atomic variables</li>
</ul>
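<p>The split described above can be pictured with a toy model in Python.
This is only a sketch, not CPython's implementation: <code class="docutils literal notranslate"><span class="pre">InterpreterState</span></code>,
<code class="docutils literal notranslate"><span class="pre">SHARED_CONSTANTS</span></code>, and <code class="docutils literal notranslate"><span class="pre">add_filter()</span></code> are hypothetical names
chosen for illustration. Immutable data stays global, while mutable data
and the lock that guards it live on each interpreter.</p>
<pre>
```python
import threading
from dataclasses import dataclass, field
from types import MappingProxyType

# Hypothetical stand-in for immutable runtime data: read-only, so it can
# stay global and be shared by every interpreter.
SHARED_CONSTANTS = MappingProxyType({"hexdigits": "0123456789abcdef"})


@dataclass
class InterpreterState:
    """Toy analogue of PyInterpreterState (not the real struct)."""

    name: str
    # Each interpreter gets its own lock and its own mutable containers.
    gil: threading.Lock = field(default_factory=threading.Lock)
    warnings_filters: list = field(default_factory=list)
    atexit_hooks: list = field(default_factory=list)


def add_filter(interp, rule):
    # Only the owning interpreter's lock is taken, because no other
    # interpreter can reach this list.
    with interp.gil:
        interp.warnings_filters.append(rule)


main = InterpreterState("main")
sub = InterpreterState("sub")
add_filter(sub, "ignore::DeprecationWarning")
```
</pre>
<p>In this model, changing the filters of <code class="docutils literal notranslate"><span class="pre">sub</span></code> leaves <code class="docutils literal notranslate"><span class="pre">main</span></code> untouched,
which is the property the move to per-interpreter state is meant to guarantee.</p>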
<section id="memory-allocators">
<h4><a class="toc-backref" href="#memory-allocators" role="doc-backlink">Memory Allocators</a></h4>
<p>This is one of the most sensitive parts of the work to isolate interpreters.
The simplest solution is to move the global state of the internal
“small block” allocator to <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code>, as we are doing with
nearly all other runtime state. The following elaborates on the details
and rationale.</p>
<p>CPython provides a memory management C-API, with <a class="reference external" href="https://docs.python.org/3/c-api/memory.html#allocator-domains">three allocator domains</a>:
“raw”, “mem”, and “object”. Each provides the equivalent of <code class="docutils literal notranslate"><span class="pre">malloc()</span></code>,
<code class="docutils literal notranslate"><span class="pre">calloc()</span></code>, <code class="docutils literal notranslate"><span class="pre">realloc()</span></code>, and <code class="docutils literal notranslate"><span class="pre">free()</span></code>. A custom allocator for each
domain can be set during runtime initialization and the current allocator
can be wrapped with a hook using the same API (for example, the stdlib
tracemalloc module). The allocators are currently runtime-global,
shared by all interpreters.</p>
<p>The “raw” allocator is expected to be thread-safe and defaults to glibc's
allocator (<code class="docutils literal notranslate"><span class="pre">malloc()</span></code>, etc.). However, the “mem” and “object” allocators
are not expected to be thread-safe and currently may rely on the GIL for
thread-safety. This is partly because the default allocator for both,
AKA “pyobject”, <a class="reference external" href="https://peps.python.org/pep-0445/#gil-free-pymem-malloc">is not thread-safe</a>. This is due to how all state for
that allocator is stored in C global variables.
(See <code class="docutils literal notranslate"><span class="pre">Objects/obmalloc.c</span></code>.)</p>
<p>Thus we come back to the question of isolating runtime state. In order
for interpreters to stop sharing the GIL, allocator thread-safety
must be addressed. If interpreters continue sharing the allocators
then we need some other way to get thread-safety. Otherwise interpreters
must stop sharing the allocators. In both cases there are a number of
possible solutions, each with potential downsides.</p>
<p>To keep sharing the allocators, the simplest solution is to use
a granular runtime-global lock around the calls to the “mem” and “object”
allocators in <code class="docutils literal notranslate"><span class="pre">PyMem_Malloc()</span></code>, <code class="docutils literal notranslate"><span class="pre">PyObject_Malloc()</span></code>, etc. This would
impact performance, but there are some ways to mitigate that (e.g. only
start locking once the first subinterpreter is created).</p>
<p>Another way to keep sharing the allocators is to require that the “mem”
and “object” allocators be thread-safe. This would mean we'd have to
make the pyobject allocator implementation thread-safe. That could
even involve re-implementing it using an extensible allocator like
mimalloc. The potential downside is in the cost to re-implement
the allocator and the risk of defects inherent to such an endeavor.</p>
<p>Regardless, a switch to requiring thread-safe allocators would impact
anyone that embeds CPython and currently sets a thread-unsafe allocator.
We'd need to consider who might be affected and how we reduce any
negative impact (e.g. add a basic C-API to help make an allocator
thread-safe).</p>
<p>If we did stop sharing the allocators between interpreters, we'd have
to do so only for the “mem” and “object” allocators. We might also need
to keep a full set of global allocators for certain runtime-level usage.
There would be some performance penalty due to looking up the current
interpreter and then pointer indirection to get the allocators.
Embedders would also likely have to provide a new allocator context
for each interpreter. On the plus side, allocator hooks (e.g. tracemalloc)
would not be affected.</p>
<p>Ultimately, we will go with the simplest option:</p>
<ul class="simple">
<li>keep the allocators in the global runtime state</li>
<li>require that they be thread-safe</li>
<li>move the state of the default object allocator (AKA “small block”
allocator) to <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code></li>
</ul>
<p>We experimented with <a class="reference external" href="https://github.com/ericsnowcurrently/cpython/tree/try-per-interpreter-alloc">a rough implementation</a> and found it was fairly
straightforward, and the performance penalty was essentially zero.</p>
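<p>As a rough illustration of why per-interpreter allocator state avoids
the need for a shared lock, consider the following toy model in Python.
It is a sketch under simplifying assumptions and does not reflect the
actual code in <code class="docutils literal notranslate"><span class="pre">Objects/obmalloc.c</span></code>; <code class="docutils literal notranslate"><span class="pre">SmallBlockState</span></code> and <code class="docutils literal notranslate"><span class="pre">worker()</span></code>
are hypothetical names. Each thread stands in for one interpreter and
touches only its own allocator state.</p>
<pre>
```python
import threading


class SmallBlockState:
    """Toy per-interpreter allocator state (hypothetical, not obmalloc)."""

    def __init__(self):
        self.free_blocks = []  # recycled blocks owned by one interpreter
        self.allocated = 0     # blocks handed out and not yet freed

    def alloc(self):
        self.allocated += 1
        # Reuse a recycled block when one is available.
        return self.free_blocks.pop() if self.free_blocks else bytearray(16)

    def free(self, block):
        self.allocated -= 1
        self.free_blocks.append(block)


def worker(state, rounds):
    # This thread is the only user of `state`, so no shared lock is taken.
    for _ in range(rounds):
        state.free(state.alloc())
    state.alloc()  # leave exactly one block live


states = [SmallBlockState() for _ in range(4)]
threads = [threading.Thread(target=worker, args=(s, 1000)) for s in states]
for t in threads:
    t.start()
for t in threads:
    t.join()
```
</pre>
<p>Because no state is shared between the workers, each one ends with a
consistent count regardless of how the threads are scheduled.</p>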
</section>
</section>
<section id="c-api">
<span id="proposed-capi"></span><h3><a class="toc-backref" href="#c-api" role="doc-backlink">C-API</a></h3>
<p>Internally, the interpreter state will now track how the import system
should handle extension modules which do not support use with multiple
interpreters. See <a class="reference internal" href="#restricting-extension-modules">Restricting Extension Modules</a> below. We'll refer
to that setting here as “PyInterpreterState.strict_extension_compat”.</p>
<p>The following APIs will be made public, if they haven't been already:</p>
<ul class="simple">
<li><code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig</span></code> (struct)</li>
<li><code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig_INIT</span></code> (macro)</li>
<li><code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig_LEGACY_INIT</span></code> (macro)</li>
<li><code class="docutils literal notranslate"><span class="pre">PyThreadState</span> <span class="pre">*</span> <span class="pre">Py_NewInterpreterFromConfig(PyInterpreterConfig</span> <span class="pre">*)</span></code></li>
</ul>
<p>We will add two new fields to <code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig</span></code>:</p>
<ul class="simple">
<li><code class="docutils literal notranslate"><span class="pre">int</span> <span class="pre">own_gil</span></code></li>
<li><code class="docutils literal notranslate"><span class="pre">int</span> <span class="pre">strict_extensions_compat</span></code></li>
</ul>
<p>We may add other fields over time, as needed (e.g. “own_initial_thread”).</p>
<p>Regarding the initializer macros, <code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig_INIT</span></code> would
be used to get an isolated interpreter that also avoids
subinterpreter-unfriendly features. It would be the default for
interpreters created through <a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a>. The unrestricted (status quo)
will continue to be available through <code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig_LEGACY_INIT</span></code>,
which is already used for the main interpreter and <code class="docutils literal notranslate"><span class="pre">Py_NewInterpreter()</span></code>.
This will not change.</p>
<p>A note about the “main” interpreter:</p>
<p>Below, we mention the “main” interpreter several times. This refers
to the interpreter created during runtime initialization, for which
the initial <code class="docutils literal notranslate"><span class="pre">PyThreadState</span></code> corresponds to the process’s main thread.
It has a number of unique responsibilities (e.g. handling signals),
as well as a special role during runtime initialization/finalization.
It is also usually (for now) the only interpreter.
(Also see <a class="reference external" href="https://docs.python.org/3/c-api/init.html#sub-interpreter-support">https://docs.python.org/3/c-api/init.html#sub-interpreter-support</a>.)</p>
<section id="pyinterpreterconfig-own-gil">
<h4><a class="toc-backref" href="#pyinterpreterconfig-own-gil" role="doc-backlink">PyInterpreterConfig.own_gil</a></h4>
<p>If <code class="docutils literal notranslate"><span class="pre">true</span></code> (<code class="docutils literal notranslate"><span class="pre">1</span></code>) then the new interpreter will have its own “global”
interpreter lock. This means the new interpreter can run without
getting interrupted by other interpreters. This effectively unblocks
full use of multiple cores. That is the fundamental goal of this PEP.</p>
<p>If <code class="docutils literal notranslate"><span class="pre">false</span></code> (<code class="docutils literal notranslate"><span class="pre">0</span></code>) then the new interpreter will use the main
interpreter’s lock. This is the legacy (pre-3.12) behavior in CPython,
where all interpreters share a single GIL. Sharing the GIL like this
may be desirable when using extension modules that still depend
on the GIL for thread safety.</p>
<p>In <code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig_INIT</span></code>, this will be <code class="docutils literal notranslate"><span class="pre">true</span></code>.
In <code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig_LEGACY_INIT</span></code>, this will be <code class="docutils literal notranslate"><span class="pre">false</span></code>.</p>
<p>Also, to play it safe, for now we will not allow <code class="docutils literal notranslate"><span class="pre">own_gil</span></code> to be true
if a custom allocator was set during runtime init. Wrapping the allocator,
a la tracemalloc, will still be fine.</p>
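<p>As a rough illustration of the two initializers and the allocator rule,
the following Python sketch models the settings described above. The names
<code class="docutils literal notranslate"><span class="pre">InterpreterConfigModel</span></code>, <code class="docutils literal notranslate"><span class="pre">CONFIG_INIT</span></code>, <code class="docutils literal notranslate"><span class="pre">CONFIG_LEGACY_INIT</span></code> and
<code class="docutils literal notranslate"><span class="pre">validate_config()</span></code> are hypothetical stand-ins and not part of any API.
The value of <code class="docutils literal notranslate"><span class="pre">strict_extensions_compat</span></code> in the isolated config and the use
of <code class="docutils literal notranslate"><span class="pre">ValueError</span></code> are assumptions made for the sketch.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>from dataclasses import dataclass


@dataclass(frozen=True)
class InterpreterConfigModel:
    """Hypothetical Python stand-in for the C struct PyInterpreterConfig."""
    own_gil: bool
    strict_extensions_compat: bool


# Modeled on PyInterpreterConfig_INIT: isolated, with its own GIL.
# strict_extensions_compat=True is an assumption, not stated in the text.
CONFIG_INIT = InterpreterConfigModel(
    own_gil=True, strict_extensions_compat=True)

# Modeled on PyInterpreterConfig_LEGACY_INIT: status quo, shared GIL.
CONFIG_LEGACY_INIT = InterpreterConfigModel(
    own_gil=False, strict_extensions_compat=False)


def validate_config(config, custom_allocator_set):
    """Reject own_gil when a custom allocator was set during runtime init."""
    if config.own_gil and custom_allocator_set:
        raise ValueError("own_gil is not allowed with a custom allocator")
    return config
</pre></div>
</div>
<p>In this model, <code class="docutils literal notranslate"><span class="pre">validate_config(CONFIG_LEGACY_INIT,</span> <span class="pre">True)</span></code> succeeds because the
legacy config shares the GIL, while <code class="docutils literal notranslate"><span class="pre">validate_config(CONFIG_INIT,</span> <span class="pre">True)</span></code> is rejected.</p>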
</section>
<section id="pyinterpreterconfig-strict-extensions-compat">
<h4><a class="toc-backref" href="#pyinterpreterconfig-strict-extensions-compat" role="doc-backlink">PyInterpreterConfig.strict_extensions_compat</a></h4>
<p><code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig.strict_extensions_compat</span></code> provides the initial
value used for “PyInterpreterState.strict_extension_compat”.</p>
</section>
</section>
<section id="restricting-extension-modules">
<h3><a class="toc-backref" href="#restricting-extension-modules" role="doc-backlink">Restricting Extension Modules</a></h3>
<p>Extension modules have many of the same problems as the runtime when
state is stored in global variables. <a class="pep reference internal" href="../pep-0630/" title="PEP 630 Isolating Extension Modules">PEP 630</a> covers all the details
of what extensions must do to support isolation, and thus safely run in
multiple interpreters at once. This includes dealing with their globals.</p>
<p>If an extension implements multi-phase init (see <a class="pep reference internal" href="../pep-0489/" title="PEP 489 Multi-phase extension module initialization">PEP 489</a>) it is
considered compatible with multiple interpreters. All other extensions
are considered incompatible. (See <a class="reference internal" href="#extension-module-thread-safety">Extension Module Thread Safety</a>
for more details about how a per-interpreter GIL may affect that
classification.)</p>
<p>If an incompatible extension is imported and the current
“PyInterpreterState.strict_extension_compat” value is <code class="docutils literal notranslate"><span class="pre">true</span></code> then the import
system will raise <code class="docutils literal notranslate"><span class="pre">ImportError</span></code>. (For <code class="docutils literal notranslate"><span class="pre">false</span></code> it simply doesn’t check.)
This will be done through
<code class="docutils literal notranslate"><span class="pre">importlib._bootstrap_external.ExtensionFileLoader</span></code> (really, through
<code class="docutils literal notranslate"><span class="pre">_imp.create_dynamic()</span></code>, <code class="docutils literal notranslate"><span class="pre">_PyImport_LoadDynamicModuleWithSpec()</span></code>, and
<code class="docutils literal notranslate"><span class="pre">PyModule_FromDefAndSpec2()</span></code>).</p>
<p>Such imports will never fail in the main interpreter (or in interpreters
created through <code class="docutils literal notranslate"><span class="pre">Py_NewInterpreter()</span></code>) since
“PyInterpreterState.strict_extension_compat” initializes to <code class="docutils literal notranslate"><span class="pre">false</span></code> in both
cases. Thus the legacy (pre-3.12) behavior is preserved.</p>
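<p>The decision the import system makes can be sketched in a few lines of
Python. This is only a model of the rule described above, not the actual
implementation. The function <code class="docutils literal notranslate"><span class="pre">check_extension_import()</span></code>, its parameters and
the error message are hypothetical.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>def check_extension_import(name, uses_multi_phase_init,
                           strict_extension_compat):
    """Model the check: raise ImportError for an incompatible extension
    when the strict setting is on; otherwise allow the import."""
    # PEP 489 multi-phase init is treated as the marker of compatibility.
    compatible = uses_multi_phase_init
    if strict_extension_compat and not compatible:
        raise ImportError(
            f"module {name} does not support use in multiple interpreters"
        )
    return True
</pre></div>
</div>
<p>With the setting off, as in the main interpreter, the function never raises,
which mirrors the preserved legacy behavior.</p>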
<p>We will work with popular extensions to help them support use in
multiple interpreters. This may involve adding to CPython’s public C-API,
which we will address on a case-by-case basis.</p>
<section id="extension-module-compatibility">
<h4><a class="toc-backref" href="#extension-module-compatibility" role="doc-backlink">Extension Module Compatibility</a></h4>
<p>As noted in <a class="reference internal" href="#extension-modules">Extension Modules</a>, many extensions work fine in multiple
interpreters (and under a per-interpreter GIL) without needing any
changes. The import system will still fail if such a module doesn’t
explicitly indicate support. At first, not many extension modules
will, so this is a potential source of frustration.</p>
<p>We will address this by adding a context manager to temporarily disable
the check on multiple interpreter support:
<code class="docutils literal notranslate"><span class="pre">importlib.util.allow_all_extensions()</span></code>. More or less, it will modify
the current “PyInterpreterState.strict_extension_compat” value (e.g. through
a private <code class="docutils literal notranslate"><span class="pre">sys</span></code> function).</p>
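<p>A minimal sketch of how such a context manager might behave is shown below.
It is a model only: <code class="docutils literal notranslate"><span class="pre">InterpreterStateModel</span></code> is a hypothetical stand-in for
the current interpreter’s state, and the sketch passes it explicitly, whereas
the proposed function takes no arguments and would reach the state through a
private hook.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>import contextlib


class InterpreterStateModel:
    """Hypothetical stand-in for the current PyInterpreterState."""

    def __init__(self, strict_extension_compat):
        self.strict_extension_compat = strict_extension_compat


@contextlib.contextmanager
def allow_all_extensions(interp):
    """Temporarily disable the multiple-interpreter check, then restore it."""
    previous = interp.strict_extension_compat
    interp.strict_extension_compat = False
    try:
        yield
    finally:
        # Restore the earlier value even if the import in the block fails.
        interp.strict_extension_compat = previous
</pre></div>
</div>
<p>An import placed inside the <code class="docutils literal notranslate"><span class="pre">with</span></code> block would skip the check, and the
previous setting comes back as soon as the block exits.</p>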
</section>
<section id="extension-module-thread-safety">
<h4><a class="toc-backref" href="#extension-module-thread-safety" role="doc-backlink">Extension Module Thread Safety</a></h4>
<p>If a module supports use with multiple interpreters, that mostly implies
it will work even if those interpreters do not share the GIL. The one
caveat is where a module links against a library with internal global
state that isn’t thread-safe. (Even something as innocuous as a static
local variable used as a temporary buffer can be a problem.) With a shared
GIL, that state is protected. Without one, such modules must wrap any
use of that state (e.g. through calls) with a lock.</p>
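<p>The locking pattern itself can be illustrated in Python. This is a sketch of
the idea only: in a real extension the lock would be a process-wide lock in C
that lives next to the library state, and the names <code class="docutils literal notranslate"><span class="pre">_shared_buffer</span></code>,
<code class="docutils literal notranslate"><span class="pre">_library_lock</span></code> and <code class="docutils literal notranslate"><span class="pre">call_library()</span></code> are hypothetical.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>import threading

# Stand-in for a library's internal global state that is not thread-safe.
_shared_buffer = []

# A single lock takes over the protection a shared GIL used to provide.
_library_lock = threading.Lock()


def call_library(value):
    """Wrap every use of the shared state in the lock."""
    with _library_lock:
        _shared_buffer.append(value)
        return len(_shared_buffer)
</pre></div>
</div>
<p>Every entry point that touches the shared state has to take the same lock.
A single unguarded call is enough to reintroduce the race.</p>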
<p>Currently, it isn’t clear whether or not supports-multiple-interpreters
is sufficiently equivalent to supports-per-interpreter-gil, such that
we can avoid any special accommodations. This is still a point of
meaningful discussion and investigation. The practical distinction
between the two (in the Python community, e.g. PyPI) is not yet
understood well enough to settle the matter. Likewise, it isn’t clear
what we might be able to do to help extension maintainers mitigate
the problem (assuming it is one).</p>
<p>In the meantime, we must proceed as though the difference would be
large enough to cause problems for enough extension modules out there.
The solution we would apply is:</p>
<ul class="simple">
<li>add a <code class="docutils literal notranslate"><span class="pre">PyModuleDef</span></code> slot that indicates an extension can be imported
under a per-interpreter GIL (i.e. opt in)</li>
<li>add that slot as part of the definition of a “compatible” extension,
as discussed earlier</li>
</ul>
<p>The downside is that not a single extension module will be able to take
advantage of the per-interpreter GIL without extra effort by the module
maintainer, regardless of how minor that effort. This compounds the
problem described in <a class="reference internal" href="#extension-module-compatibility">Extension Module Compatibility</a> and the same
workaround applies. Ideally, we would determine that there isn’t enough
difference to matter.</p>
<p>If we do end up requiring an opt-in for imports under a per-interpreter
GIL, and later determine it isn’t necessary, then we can switch the
default at that point, make the old opt-in slot a no-op, and add a new
<code class="docutils literal notranslate"><span class="pre">PyModuleDef</span></code> slot for explicitly opting <em>out</em>. In fact, it makes
sense to add that opt-out slot from the beginning.</p>
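<p>The two regimes discussed above (opt-in required versus opt-in as a no-op
with an opt-out available) can be modeled as a small classification function.
This is a sketch under stated assumptions: the function name and parameters
are hypothetical, and giving the opt-out precedence over the opt-in is an
assumption rather than something the text specifies.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>def is_compatible(multi_phase_init, opt_in_slot, opt_out_slot,
                  require_opt_in):
    """Classify an extension for import under a per-interpreter GIL."""
    if not multi_phase_init:
        # Never compatible without PEP 489 multi-phase init.
        return False
    if opt_out_slot:
        # Assumption: an explicit opt-out always wins.
        return False
    if require_opt_in:
        # Conservative regime: the opt-in slot is required.
        return opt_in_slot
    # Relaxed regime: the opt-in slot is a no-op.
    return True
</pre></div>
</div>
<p>Switching <code class="docutils literal notranslate"><span class="pre">require_opt_in</span></code> from true to false corresponds to changing the
default later: modules that never set either slot become importable, and
modules that set the opt-out slot stay excluded.</p>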
</section>
</section>
<section id="documentation">
<h3><a class="toc-backref" href="#documentation" role="doc-backlink">Documentation</a></h3>
<ul class="simple">
<li>C-API: the “Sub-interpreter support” section of <code class="docutils literal notranslate"><span class="pre">Doc/c-api/init.rst</span></code>
will detail the updated API</li>
<li>C-API: that section will explain the consequences of
a per-interpreter GIL</li>
<li>importlib: the <code class="docutils literal notranslate"><span class="pre">ExtensionFileLoader</span></code> entry will note import
may fail in subinterpreters</li>
<li>importlib: there will be a new entry about
<code class="docutils literal notranslate"><span class="pre">importlib.util.allow_all_extensions()</span></code></li>
</ul>
</section>
</section>
<section id="impact">
<h2><a class="toc-backref" href="#impact" role="doc-backlink">Impact</a></h2>
<section id="backwards-compatibility">
<h3><a class="toc-backref" href="#backwards-compatibility" role="doc-backlink">Backwards Compatibility</a></h3>
<p>No behavior or APIs are intended to change due to this proposal,
with two exceptions:</p>
<ul class="simple">
<li>some extensions will fail to import in some subinterpreters
(see <a class="reference internal" href="#extension-modules">the next section</a>)</li>
<li>“mem” and “object” allocators that are currently not thread-safe
may now be susceptible to data races when used in combination
with multiple interpreters</li>
</ul>
<p>The existing C-API for managing interpreters will preserve its current
behavior, with new behavior exposed through new API. No other API
or runtime behavior is meant to change, including compatibility with
the stable ABI.</p>
<p>See <a class="reference internal" href="#objects-exposed-in-the-c-api">Objects Exposed in the C-API</a> below for related discussion.</p>
<section id="extension-modules">
<h4><a class="toc-backref" href="#extension-modules" role="doc-backlink">Extension Modules</a></h4>
<p>Currently the most common usage of Python, by far, is with the main
interpreter running by itself. This proposal has zero impact on
extension modules in that scenario. Likewise, for better or worse,
there is no change in behavior under multiple interpreters created
using the existing <code class="docutils literal notranslate"><span class="pre">Py_NewInterpreter()</span></code>.</p>
<p>Keep in mind that some extensions already break when used in multiple
interpreters, due to keeping module state in global variables (or
due to the <a class="reference external" href="https://github.com/pyca/cryptography/issues/2299">internal state of linked libraries</a>). They
may crash or, worse, experience inconsistent behavior. That was part
of the motivation for <a class="pep reference internal" href="../pep-0630/" title="PEP 630 Isolating Extension Modules">PEP 630</a> and friends, so this is not a new
situation nor a consequence of this proposal.</p>
<p>In contrast, when the <a class="reference internal" href="#proposed-capi">proposed API</a> is used to
create multiple interpreters, with the appropriate settings,
the behavior will change for incompatible extensions. In that case,
importing such an extension will fail (outside the main interpreter),
as explained in <a class="reference internal" href="#restricting-extension-modules">Restricting Extension Modules</a>. For extensions that
already break in multiple interpreters, this will be an improvement.</p>
<p>Additionally, some extension modules link against libraries with
thread-unsafe internal global state.
(See <a class="reference internal" href="#extension-module-thread-safety">Extension Module Thread Safety</a>.)
Such modules will have to start wrapping any direct or indirect use
of that state in a lock. This is the key difference from other modules
that also implement multi-phase init and thus indicate support for
multiple interpreters (i.e. isolation).</p>
<p>Now we get to the break in compatibility mentioned above. Some
extensions are safe under multiple interpreters (and a per-interpreter
GIL), even though they haven’t indicated that. Unfortunately, there is
no reliable way for the import system to infer that such an extension
is safe, so importing them will still fail. This case is addressed
in <a class="reference internal" href="#extension-module-compatibility">Extension Module Compatibility</a> above.</p>
</section>
</section>
<section id="extension-module-maintainers">
<h3><a class="toc-backref" href="#extension-module-maintainers" role="doc-backlink">Extension Module Maintainers</a></h3>
<p>One related consideration is that a per-interpreter GIL will likely
drive increased use of multiple interpreters, particularly if <a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a>
is accepted. Some maintainers of large extension modules have expressed
concern about the increased burden they anticipate due to increased
use of multiple interpreters.</p>
<p>Specifically, enabling support for multiple interpreters will require
substantial work for some extension modules (albeit likely not many).
To add that support, the maintainer(s) of such a module (often
volunteers) would have to set aside their normal priorities and
interests to focus on compatibility (see <a class="pep reference internal" href="../pep-0630/" title="PEP 630 Isolating Extension Modules">PEP 630</a>).</p>
<p>Of course, extension maintainers are free to not add support for use
in multiple interpreters. However, users will increasingly demand
such support, especially if the feature grows in popularity.</p>
<p>Either way, the situation can be stressful for maintainers of such
extensions, particularly when they are doing the work in their spare
time. The concerns they have expressed are understandable, and we describe
a partial solution in the <a class="reference internal" href="#restricting-extension-modules">Restricting Extension Modules</a> and
<a class="reference internal" href="#extension-module-compatibility">Extension Module Compatibility</a> sections.</p>
</section>
<section id="alternate-python-implementations">
<h3><a class="toc-backref" href="#alternate-python-implementations" role="doc-backlink">Alternate Python Implementations</a></h3>
<p>Other Python implementations are not required to provide support for
multiple interpreters in the same process (though some do already).</p>
</section>
<section id="security-implications">
<h3><a class="toc-backref" href="#security-implications" role="doc-backlink">Security Implications</a></h3>
<p>There is no known impact to security with this proposal.</p>
</section>
<section id="maintainability">
<h3><a class="toc-backref" href="#maintainability" role="doc-backlink">Maintainability</a></h3>
<p>On the one hand, this proposal has already motivated a number of
improvements that make CPython <em>more</em> maintainable. That is expected
to continue. On the other hand, the underlying work has already
exposed various pre-existing defects in the runtime that have had
to be fixed. That is also expected to continue as multiple interpreters
receive more use. Otherwise, there shouldn’t be a significant impact
on maintainability, so the net effect should be positive.</p>
</section>
<section id="performance">
<h3><a class="toc-backref" href="#performance" role="doc-backlink">Performance</a></h3>
<p>The work to consolidate globals has already provided a number of
improvements to CPython’s performance, both speeding it up and using
less memory, and this should continue. The performance benefits of a
per-interpreter GIL specifically have not been explored. At the very
least, it is not expected to make CPython slower
(as long as interpreters are sufficiently isolated). And, obviously,
it enables a variety of multi-core parallelism in Python code.</p>
</section>
</section>
<section id="how-to-teach-this">
<h2><a class="toc-backref" href="#how-to-teach-this" role="doc-backlink">How to Teach This</a></h2>
<p>Unlike <a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a>, this is an advanced feature meant for a narrow set
of users of the C-API. There is no expectation that the specifics of
the API nor its direct application will be taught.</p>
<p>That said, if it were taught then it would boil down to the following:</p>
<blockquote>
<div>In addition to Py_NewInterpreter(), you can use
Py_NewInterpreterFromConfig() to create an interpreter.
The config you pass it indicates how you want that
interpreter to behave.</div></blockquote>
<p>Furthermore, the maintainers of any extension modules that create
isolated interpreters will likely need to explain the consequences
of a per-interpreter GIL to their users. The first thing to explain
is what <a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a> teaches about the concurrency model that isolated
interpreters enable. That leads into the point that Python software
written using that concurrency model can then take advantage
of multi-core parallelism, which is currently
prevented by the GIL.</p>
</section>
<section id="reference-implementation">
<h2><a class="toc-backref" href="#reference-implementation" role="doc-backlink">Reference Implementation</a></h2>
<p>&lt;TBD&gt;</p>
</section>
<section id="open-issues">
<h2><a class="toc-backref" href="#open-issues" role="doc-backlink">Open Issues</a></h2>
<ul class="simple">
<li>Are we okay to require “mem” and “object” allocators to be thread-safe?</li>
<li>How would a per-interpreter tracemalloc module relate to global allocators?</li>
<li>Would the faulthandler module be limited to the main interpreter
(like the signal module) or would we leak that global state between
interpreters (protected by a granular lock)?</li>
<li>Split out an informational PEP with all the relevant info,
based on the “Consolidating Runtime Global State” section?</li>
<li>How likely is it that a module works under multiple interpreters
(isolation) but doesn’t work under a per-interpreter GIL?
(See <a class="reference internal" href="#extension-module-thread-safety">Extension Module Thread Safety</a>.)</li>
<li>If it is likely enough, what can we do to help extension maintainers
mitigate the problem and enjoy use under a per-interpreter GIL?</li>
<li>What would be a better (scarier-sounding) name
for <code class="docutils literal notranslate"><span class="pre">allow_all_extensions</span></code>?</li>
</ul>
</section>
<section id="deferred-functionality">
<h2><a class="toc-backref" href="#deferred-functionality" role="doc-backlink">Deferred Functionality</a></h2>
<ul class="simple">
<li><code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig</span></code> option to always run the interpreter in a new thread</li>
<li><code class="docutils literal notranslate"><span class="pre">PyInterpreterConfig</span></code> option to assign a “main” thread to the interpreter
and only run in that thread</li>
</ul>
</section>
<section id="rejected-ideas">
<h2><a class="toc-backref" href="#rejected-ideas" role="doc-backlink">Rejected Ideas</a></h2>
<p>&lt;TBD&gt;</p>
</section>
<section id="extra-context">
<h2><a class="toc-backref" href="#extra-context" role="doc-backlink">Extra Context</a></h2>
<section id="sharing-global-objects">
<h3><a class="toc-backref" href="#sharing-global-objects" role="doc-backlink">Sharing Global Objects</a></h3>
<p>We are sharing some global objects between interpreters.
This is an implementation detail and relates more to
<a class="reference internal" href="#consolidating-runtime-global-state">globals consolidation</a>
than to this proposal, but it is a significant enough detail
to explain here.</p>
<p>The alternative is to share no objects between interpreters, ever.
To accomplish that, we’d have to sort out the fate of all our static
types, as well as deal with compatibility issues for the many objects
<a class="reference internal" href="#capi-objects">exposed in the public C-API</a>.</p>
<p>That approach introduces a meaningful amount of extra complexity
and higher risk, though prototyping has demonstrated valid solutions.
Also, it would likely result in a performance penalty.</p>
<p><a class="reference internal" href="#depending-on-immortal-objects">Immortal objects</a> allow us to
share the otherwise immutable global objects. That way we avoid
the extra costs.</p>
<section id="objects-exposed-in-the-c-api">
<span id="capi-objects"></span><h4><a class="toc-backref" href="#objects-exposed-in-the-c-api" role="doc-backlink">Objects Exposed in the C-API</a></h4>
<p>The C-API (including the limited API) exposes all the builtin types,
including the builtin exceptions, as well as the builtin singletons.
The exceptions are exposed as <code class="docutils literal notranslate"><span class="pre">PyObject</span> <span class="pre">*</span></code> but the rest are exposed
as the static values rather than pointers. This was one of the few
non-trivial problems we had to solve for per-interpreter GIL.</p>
<p>With immortal objects this is a non-issue.</p>
</section>
</section>
<section id="consolidating-runtime-global-state">
<h3><a class="toc-backref" href="#consolidating-runtime-global-state" role="doc-backlink">Consolidating Runtime Global State</a></h3>
<p>As noted in <a class="reference internal" href="#cpython-runtime-state">CPython Runtime State</a> above, there is an active effort
(separate from this PEP) to consolidate CPython’s global state into the
<code class="docutils literal notranslate"><span class="pre">_PyRuntimeState</span></code> struct. Nearly all the work involves moving that
state from global variables. The project is particularly relevant to
this proposal, so below is some extra detail.</p>
<section id="benefits-to-consolidation">
<h4><a class="toc-backref" href="#benefits-to-consolidation" role="doc-backlink">Benefits to Consolidation</a></h4>
<p>Consolidating the globals has a variety of benefits:</p>
<ul class="simple">
<li>greatly reduces the number of C globals (best practice for C code)</li>
<li>the move draws attention to runtime state that is unstable or broken</li>
<li>encourages more consistency in how runtime state is used</li>
<li>makes it easier to discover/identify CPython’s runtime state</li>
<li>makes it easier to statically allocate runtime state in a consistent way</li>
<li>better memory locality for runtime state</li>
</ul>
<p>Furthermore all the benefits listed in <a class="reference internal" href="#indirect-benefits">Indirect Benefits</a> above also
apply here, and the same projects listed there benefit.</p>
</section>
<section id="scale-of-work">
<h4><a class="toc-backref" href="#scale-of-work" role="doc-backlink">Scale of Work</a></h4>
<p>The number of global variables to be moved is large enough to matter,
but most are Python objects that can be dealt with in large groups
(like <code class="docutils literal notranslate"><span class="pre">Py_IDENTIFIER</span></code>). In nearly all cases, moving these globals
to the interpreter is highly mechanical. That doesn’t require
cleverness but instead requires someone to put in the time.</p>
</section>
<section id="state-to-be-moved">
<h4><a class="toc-backref" href="#state-to-be-moved" role="doc-backlink">State To Be Moved</a></h4>
<p>The remaining global variables can be categorized as follows:</p>
<ul class="simple">
<li>global objects<ul>
<li>static types (incl. exception types)</li>
<li>non-static types (incl. heap types, structseq types)</li>
<li>singletons (static)</li>
<li>singletons (initialized once)</li>
<li>cached objects</li>
</ul>
</li>
<li>non-objects<ul>
<li>will not (or unlikely to) change after init</li>
<li>only used in the main thread</li>
<li>initialized lazily</li>
<li>pre-allocated buffers</li>
<li>state</li>
</ul>
</li>
</ul>
<p>Those globals are spread between the core runtime, the builtin modules,
and the stdlib extension modules.</p>
<p>For a breakdown of the remaining globals, run:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>./python<span class="w"> </span>Tools/c-analyzer/table-file.py<span class="w"> </span>Tools/c-analyzer/cpython/globals-to-fix.tsv
</pre></div>
</div>
</section>
<section id="already-completed-work">
<h4><a class="toc-backref" href="#already-completed-work" role="doc-backlink">Already Completed Work</a></h4>
<p>As mentioned, this work has been going on for many years. Here are some
of the things that have already been done:</p>
<ul class="simple">
<li>cleanup of runtime initialization (see <a class="pep reference internal" href="../pep-0432/" title="PEP 432 Restructuring the CPython startup sequence">PEP 432</a> / <a class="pep reference internal" href="../pep-0587/" title="PEP 587 Python Initialization Configuration">PEP 587</a>)</li>
<li>extension module isolation machinery (see <a class="pep reference internal" href="../pep-0384/" title="PEP 384 Defining a Stable ABI">PEP 384</a> / <a class="pep reference internal" href="../pep-3121/" title="PEP 3121 Extension Module Initialization and Finalization">PEP 3121</a> / <a class="pep reference internal" href="../pep-0489/" title="PEP 489 Multi-phase extension module initialization">PEP 489</a>)</li>
<li>isolation for many builtin modules</li>
<li>isolation for many stdlib extension modules</li>
<li>addition of <code class="docutils literal notranslate"><span class="pre">_PyRuntimeState</span></code></li>
<li>no more <code class="docutils literal notranslate"><span class="pre">_Py_IDENTIFIER()</span></code></li>
<li>statically allocated:<ul>
<li>empty string</li>
<li>string literals</li>
<li>identifiers</li>
<li>latin-1 strings</li>
<li>length-1 bytes</li>
<li>empty tuple</li>
</ul>
</li>
</ul>
</section>
<section id="tooling">
<h4><a class="toc-backref" href="#tooling" role="doc-backlink">Tooling</a></h4>
<p>As already indicated, there are several tools to help identify the
globals and reason about them.</p>
<ul class="simple">
<li><code class="docutils literal notranslate"><span class="pre">Tools/c-analyzer/cpython/globals-to-fix.tsv</span></code> - the list of remaining globals</li>
<li><code class="docutils literal notranslate"><span class="pre">Tools/c-analyzer/c-analyzer.py</span></code><ul>
<li><code class="docutils literal notranslate"><span class="pre">analyze</span></code> - identify all the globals</li>
<li><code class="docutils literal notranslate"><span class="pre">check</span></code> - fail if there are any unsupported globals that aren’t ignored</li>
</ul>
</li>
<li><code class="docutils literal notranslate"><span class="pre">Tools/c-analyzer/table-file.py</span></code> - summarize the known globals</li>
</ul>
<p>Also, the check for unsupported globals is incorporated into CI so that
no new globals are accidentally added.</p>
</section>
<section id="global-objects">
<h4><a class="toc-backref" href="#global-objects" role="doc-backlink">Global Objects</a></h4>
<p>Global objects that are safe to be shared (without a GIL) between
interpreters can stay on <code class="docutils literal notranslate"><span class="pre">_PyRuntimeState</span></code>. Not only must the object
be effectively immutable (e.g. singletons, strings), but not even the
refcount can change for it to be safe. Immortality (<a class="pep reference internal" href="../pep-0683/" title="PEP 683 Immortal Objects, Using a Fixed Refcount">PEP 683</a>)
provides that. (The alternative is that no objects are shared, which
adds significant complexity to the solution, particularly for the
objects <a class="reference internal" href="#capi-objects">exposed in the public C-API</a>.)</p>
<p>Builtin static types are a special case of global objects that will be
shared. They are effectively immutable except for one part:
<code class="docutils literal notranslate"><span class="pre">__subclasses__</span></code> (AKA <code class="docutils literal notranslate"><span class="pre">tp_subclasses</span></code>). We expect that nothing
else on a builtin type will change, even the content
of <code class="docutils literal notranslate"><span class="pre">__dict__</span></code> (AKA <code class="docutils literal notranslate"><span class="pre">tp_dict</span></code>).</p>
<p><code class="docutils literal notranslate"><span class="pre">__subclasses__</span></code> for the builtin types will be dealt with by making
it a getter that does a lookup on the current <code class="docutils literal notranslate"><span class="pre">PyInterpreterState</span></code>
for that type.</p>
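<p>The idea of a getter that consults the current interpreter can be sketched
as follows. This is a simplified model rather than the CPython implementation:
<code class="docutils literal notranslate"><span class="pre">InterpreterModel</span></code>, <code class="docutils literal notranslate"><span class="pre">register_subclass()</span></code> and <code class="docutils literal notranslate"><span class="pre">get_subclasses()</span></code> are
hypothetical names, and types are identified by plain strings to keep the
example short.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>class InterpreterModel:
    """Hypothetical stand-in for PyInterpreterState."""

    def __init__(self):
        # Maps a shared builtin type's name to this interpreter's subclasses.
        self.builtin_subclasses = {}


def register_subclass(interp, base_name, subclass_name):
    """Record a subclass on one interpreter only."""
    interp.builtin_subclasses.setdefault(base_name, []).append(subclass_name)


def get_subclasses(interp, base_name):
    """Getter: look up the subclasses recorded on the given interpreter."""
    return list(interp.builtin_subclasses.get(base_name, []))
</pre></div>
</div>
<p>Because the list lives on the interpreter rather than on the shared type
object, a subclass registered in one interpreter is not visible from another,
and the type itself never has to be mutated.</p>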
</section>
</section>
</section>
<section id="references">
<h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2>
<p>Related:</p>
<ul class="simple">
<li><a class="pep reference internal" href="../pep-0384/" title="PEP 384 Defining a Stable ABI">PEP 384</a> “Defining a Stable ABI”</li>
<li><a class="pep reference internal" href="../pep-0432/" title="PEP 432 Restructuring the CPython startup sequence">PEP 432</a> “Restructuring the CPython startup sequence”</li>
<li><a class="pep reference internal" href="../pep-0489/" title="PEP 489 Multi-phase extension module initialization">PEP 489</a> “Multi-phase extension module initialization”</li>
<li><a class="pep reference internal" href="../pep-0554/" title="PEP 554 Multiple Interpreters in the Stdlib">PEP 554</a> “Multiple Interpreters in the Stdlib”</li>
<li><a class="pep reference internal" href="../pep-0573/" title="PEP 573 Module State Access from C Extension Methods">PEP 573</a> “Module State Access from C Extension Methods”</li>
<li><a class="pep reference internal" href="../pep-0587/" title="PEP 587 Python Initialization Configuration">PEP 587</a> “Python Initialization Configuration”</li>
<li><a class="pep reference internal" href="../pep-0630/" title="PEP 630 Isolating Extension Modules">PEP 630</a> “Isolating Extension Modules”</li>
<li><a class="pep reference internal" href="../pep-0683/" title="PEP 683 Immortal Objects, Using a Fixed Refcount">PEP 683</a> “Immortal Objects, Using a Fixed Refcount”</li>
<li><a class="pep reference internal" href="../pep-3121/" title="PEP 3121 Extension Module Initialization and Finalization">PEP 3121</a> “Extension Module Initialization and Finalization”</li>
</ul>
</section>
<section id="copyright">
<h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2>
<p>This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.</p>
</section>
</section>
<hr class="docutils" />
<p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0684.rst">https://github.com/python/peps/blob/main/peps/pep-0684.rst</a></p>
<p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0684.rst">2024-06-04 17:05:36 GMT</a></p>
</article>
<nav id="pep-sidebar">
<h2>Contents</h2>
<ul>
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#high-level-summary">High-Level Summary</a><ul>
<li><a class="reference internal" href="#the-gil">The GIL</a></li>
<li><a class="reference internal" href="#cpython-runtime-state">CPython Runtime State</a></li>
<li><a class="reference internal" href="#other-isolation-considerations">Other Isolation Considerations</a></li>
<li><a class="reference internal" href="#depending-on-immortal-objects">Depending on Immortal Objects</a></li>
</ul>
</li>
<li><a class="reference internal" href="#motivation">Motivation</a><ul>
<li><a class="reference internal" href="#indirect-benefits">Indirect Benefits</a></li>
<li><a class="reference internal" href="#existing-use-of-multiple-interpreters">Existing Use of Multiple Interpreters</a></li>
<li><a class="reference internal" href="#pep-554-multiple-interpreters-in-the-stdlib">PEP 554 (Multiple Interpreters in the Stdlib)</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rationale">Rationale</a></li>
<li><a class="reference internal" href="#specification">Specification</a><ul>
<li><a class="reference internal" href="#per-interpreter-state">Per-Interpreter State</a><ul>
<li><a class="reference internal" href="#memory-allocators">Memory Allocators</a></li>
</ul>
</li>
<li><a class="reference internal" href="#c-api">C-API</a><ul>
<li><a class="reference internal" href="#pyinterpreterconfig-own-gil">PyInterpreterConfig.own_gil</a></li>
<li><a class="reference internal" href="#pyinterpreterconfig-strict-extensions-compat">PyInterpreterConfig.strict_extensions_compat</a></li>
</ul>
</li>
<li><a class="reference internal" href="#restricting-extension-modules">Restricting Extension Modules</a><ul>
<li><a class="reference internal" href="#extension-module-compatibility">Extension Module Compatibility</a></li>
<li><a class="reference internal" href="#extension-module-thread-safety">Extension Module Thread Safety</a></li>
</ul>
</li>
<li><a class="reference internal" href="#documentation">Documentation</a></li>
</ul>
</li>
<li><a class="reference internal" href="#impact">Impact</a><ul>
<li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a><ul>
<li><a class="reference internal" href="#extension-modules">Extension Modules</a></li>
</ul>
</li>
<li><a class="reference internal" href="#extension-module-maintainers">Extension Module Maintainers</a></li>
<li><a class="reference internal" href="#alternate-python-implementations">Alternate Python Implementations</a></li>
<li><a class="reference internal" href="#security-implications">Security Implications</a></li>
<li><a class="reference internal" href="#maintainability">Maintainability</a></li>
<li><a class="reference internal" href="#performance">Performance</a></li>
</ul>
</li>
<li><a class="reference internal" href="#how-to-teach-this">How to Teach This</a></li>
<li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li>
<li><a class="reference internal" href="#open-issues">Open Issues</a></li>
<li><a class="reference internal" href="#deferred-functionality">Deferred Functionality</a></li>
<li><a class="reference internal" href="#rejected-ideas">Rejected Ideas</a></li>
<li><a class="reference internal" href="#extra-context">Extra Context</a><ul>
<li><a class="reference internal" href="#sharing-global-objects">Sharing Global Objects</a><ul>
<li><a class="reference internal" href="#objects-exposed-in-the-c-api">Objects Exposed in the C-API</a></li>
</ul>
</li>
<li><a class="reference internal" href="#consolidating-runtime-global-state">Consolidating Runtime Global State</a><ul>
<li><a class="reference internal" href="#benefits-to-consolidation">Benefits to Consolidation</a></li>
<li><a class="reference internal" href="#scale-of-work">Scale of Work</a></li>
<li><a class="reference internal" href="#state-to-be-moved">State To Be Moved</a></li>
<li><a class="reference internal" href="#already-completed-work">Already Completed Work</a></li>
<li><a class="reference internal" href="#tooling">Tooling</a></li>
<li><a class="reference internal" href="#global-objects">Global Objects</a></li>
</ul>
</li>
</ul>
</li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</nav>
</section>
<script src="../_static/colour_scheme.js"></script>
<script src="../_static/wrap_tables.js"></script>
<script src="../_static/sticky_banner.js"></script>
</body>
</html>