PEP 683: Immortal Objects v3 (#2372)

This mostly contains changes in response to https://mail.python.org/archives/list/python-dev@python.org/thread/KDAR6CCMPOX36GQJUDWHQBKRD5USNV3B/.  It also increases the focus on the immutability of per-object runtime state.
Eric Snow 2022-02-28 17:55:07 -07:00 committed by GitHub
parent 07e537864a
commit 0a6375d4e3
1 changed file with 223 additions and 104 deletions


@@ -7,7 +7,7 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History: 15-Feb-2022, 19-Feb-2022, 28-Feb-2022
Resolution:
@@ -19,29 +19,64 @@ Currently the CPython runtime maintains a
allocated memory of each object. Because of this, otherwise immutable
objects are actually mutable. This can have a large negative impact
on CPU and memory performance, especially for approaches to increasing
Python's scalability.

This proposal mandates that, internally, CPython will support marking
an object as one for which that runtime state will no longer change.
Consequently, such an object's refcount will never reach 0, and so the
object will never be cleaned up. We call these objects "immortal".
(Normally, only a relatively small number of internal objects
will ever be immortal.) The fundamental improvement here
is that now an object can be truly immutable.

Scope
-----

Object immortality is meant to be an internal-only feature. So this
proposal does not include any changes to public API or behavior
(with one exception). As usual, we may still add some private
(yet publicly accessible) API to do things like immortalize an object
or tell if one is immortal. Any effort to expose this feature to users
would need to be proposed separately.

There is one exception to "no change in behavior": refcounting semantics
for immortal objects will differ in some cases from user expectations.
This exception, and the solution, are discussed below.

Most of this PEP focuses on an internal implementation that satisfies
the above mandate. However, those implementation details are not meant
to be strictly proscriptive. Instead, at the least they are included
to help illustrate the technical considerations required by the mandate.
The actual implementation may deviate somewhat as long as it satisfies
the constraints outlined below. Furthermore, the acceptability of any
specific implementation detail described below does not depend on
the status of this PEP, unless explicitly specified.

For example, the particular details of:

* how to mark something as immortal
* how to recognize something as immortal
* which subset of functionally immortal objects are marked as immortal
* which memory-management activities are skipped or modified for immortal objects

are not only CPython-specific but are also private implementation
details that are expected to change in subsequent versions.

Implementation Summary
----------------------

Here's a high-level look at the implementation:

If an object's refcount matches a very specific value (defined below)
then that object is treated as immortal. The CPython C-API and runtime
will not modify the refcount (or other runtime state) of an immortal
object.
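
To make that concrete, here is a minimal sketch (not the actual
implementation) of how ``Py_INCREF()`` and ``Py_DECREF()`` could skip
immortal objects; the bit value and helper names are placeholders, with
the real constants discussed under `_Py_IMMORTAL_REFCNT`_ below::

    /* Illustrative sketch only -- not the actual implementation.
       The bit value here is a placeholder; the real constants are
       described under _Py_IMMORTAL_REFCNT below. */
    #define _Py_IMMORTAL_BIT  (1LL << 62)
    #define _Py_IsImmortal(op)  (((op)->ob_refcnt & _Py_IMMORTAL_BIT) != 0)

    static inline void _Py_IncRef_sketch(PyObject *op)
    {
        if (_Py_IsImmortal(op)) {
            return;   /* leave the refcount (and other state) untouched */
        }
        op->ob_refcnt++;
    }

    static inline void _Py_DecRef_sketch(PyObject *op)
    {
        if (_Py_IsImmortal(op)) {
            return;   /* the refcount never reaches 0, so no dealloc */
        }
        if (--op->ob_refcnt == 0) {
            _Py_Dealloc(op);
        }
    }
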
Aside from the change to refcounting semantics, there is one other
possible negative impact to consider. A naive implementation of the
approach described below makes CPython roughly 4% slower. However,
the implementation is performance-neutral once known mitigations
are applied.

Motivation
@@ -153,7 +188,7 @@ Impact

Benefits
--------

Most notably, the cases described in the above examples stand
to benefit greatly from immortal objects. Projects using pre-fork
can drop their workarounds. For the per-interpreter GIL project,
immortal objects greatly simplify the solution for existing static
@@ -167,10 +202,9 @@ usage. This is reflected in most of the above cases.

Performance
-----------

A naive implementation shows `a 4% slowdown`_. We have demonstrated
a return to performance-neutral with a handful of basic mitigations
applied. See the `mitigation`_ section below.

On the positive side, immortal objects save a significant amount of
memory when used with a pre-fork model. Also, immortal objects provide
@@ -182,59 +216,52 @@ performance.

Backward Compatibility
----------------------

Ideally this internal-only feature would be completely compatible.
However, it does involve a change to refcount semantics in some cases.
Only immortal objects are affected, but this includes high-use objects
like ``None``, ``True``, and ``False``.

Specifically, when an immortal object is involved:

* code that inspects the refcount will see a really, really large value
* the new noop behavior may break code that:

  * depends specifically on the refcount to always increment or decrement
    (or have a specific value from ``Py_SET_REFCNT()``)
  * relies on any specific refcount value, other than 0 or 1
  * directly manipulates the refcount to store extra information there

* in 32-bit pre-3.11 `Stable ABI`_ extensions,
  objects may leak due to `Accidental Immortality`_
* such extensions may crash due to `Accidental De-Immortalizing`_
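
For illustration, here is the kind of hypothetical extension code the
first bullets describe; with an immortal object like ``None`` the incref
is a noop, so the assertion can fail::

    /* Hypothetical example of code that relies on exact refcount changes.
       This was never a guaranteed behavior, and immortal objects break it. */
    #include <Python.h>
    #include <assert.h>

    static void fragile_refcount_check(void)
    {
        Py_ssize_t before = Py_REFCNT(Py_None);
        Py_INCREF(Py_None);
        /* With immortal objects the incref above does not change
           ob_refcnt, so this assertion can fail. */
        assert(Py_REFCNT(Py_None) == before + 1);
        Py_DECREF(Py_None);
    }
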
Again, those changes in behavior only apply to immortal objects, not
most of the objects a user will access. Furthermore, users cannot mark
an object as immortal so no user-created objects will ever have that
changed behavior. Users that rely on any of the changing behavior for
global (builtin) objects are already in trouble. So the overall impact
should be small.

Also note that code which checks for refleaks should keep working fine,
unless it checks for hard-coded small values relative to some immortal
object. The problems noticed by `Pyston`_ shouldn't apply here since
we do not modify the refcount.

See `Public Refcount Details`_ below for further discussion.

Accidental Immortality
''''''''''''''''''''''

Hypothetically, a non-immortal object could be incref'ed so much
that it reaches the magic value needed to be considered immortal.
That means it would accidentally never be cleaned up
(by going back to 0).

On 64-bit builds, this accidental scenario is so unlikely that we need
not worry. Even if done deliberately by using ``Py_INCREF()`` in a
tight loop and each iteration only took 1 CPU cycle, it would take
2^60 cycles (if the immortal bit were 2^60). At a fast 5 GHz that would
still take nearly 250,000,000 seconds (over 2,500 days)!

Also note that it is doubly unlikely to be a problem because it wouldn't
matter until the refcount got back to 0 and the object was cleaned up.
@@ -245,9 +272,106 @@ would be noticed.

Again, the only realistic way that the magic refcount would be reached
(and then reversed) is if it were done deliberately. (Of course, the
same thing could be done efficiently using ``Py_SET_REFCNT()`` though
that would be even less of an accident.) At that point we don't
consider it a concern of this proposal.

On 32-bit builds it isn't so obvious. Let's say the magic refcount
were 2^30. Using the same specs as above, it would take roughly
4 seconds to accidentally immortalize an object. Under reasonable
conditions, it is still highly unlikely that an object would be
accidentally immortalized. It would have to meet these criteria:

* targeting a non-immortal object (so not one of the high-use builtins)
* the extension increfs without a corresponding decref
  (e.g. returns from a function or method)
* no other code decrefs the object in the meantime

Even at a much less frequent incref rate it would not take long to reach
accidental immortality (on 32-bit). However, then it would have to run
through the same number of (now noop-ing) decrefs before that one object
would be effectively leaking. This is highly unlikely, especially because
the calculations assume no decrefs.

Furthermore, this isn't all that different from how such 32-bit extensions
can already incref an object past 2^31 and turn the refcount negative.
If that were an actual problem then we would have heard about it.

Between all of the above cases, the proposal doesn't consider
accidental immortality a problem.
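
The rough numbers above can be reproduced with a small back-of-the-envelope
calculation; the per-incref costs used here (one CPU cycle at 5 GHz for the
idealized 64-bit case, roughly 4 ns per incref for the 32-bit case) are
illustrative assumptions only::

    /* Back-of-the-envelope arithmetic only; the costs are assumed. */
    #include <stdio.h>

    int main(void)
    {
        const double hz = 5e9;                             /* a fast 5 GHz CPU */
        const double increfs_64 = 1152921504606846976.0;   /* 2^60 increfs */
        const double increfs_32 = 1073741824.0;            /* 2^30 increfs */
        const double secs_per_incref_32 = 4e-9;            /* assumed realistic cost */

        /* 64-bit: one incref per cycle -> roughly 230 million seconds (~2,670 days) */
        printf("64-bit: %.0f seconds (~%.0f days)\n",
               increfs_64 / hz, increfs_64 / hz / 86400.0);
        /* 32-bit: ~4 ns per incref -> roughly 4 seconds */
        printf("32-bit: %.1f seconds\n", increfs_32 * secs_per_incref_32);
        return 0;
    }
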
Stable ABI
''''''''''

The implementation approach described in this PEP is compatible
with extensions compiled to the stable ABI (with the exception
of `Accidental Immortality`_ and `Accidental De-Immortalizing`_).
Due to the nature of the stable ABI, unfortunately, such extensions
use versions of ``Py_INCREF()``, etc. that directly modify the object's
``ob_refcnt`` field. This will invalidate all the performance benefits
of immortal objects.

However, we do ensure that immortal objects (mostly) stay immortal
in that situation. We set the initial refcount of immortal objects to
a value high above the magic refcount value, but one that still matches
the high bit. Thus we can still identify such objects as immortal.
(See `_Py_IMMORTAL_REFCNT`_.) At worst, objects in that situation
would feel the effects described in the `Motivation`_ section.
Even then the overall impact is unlikely to be significant.
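
To see why such extensions bypass the immortal check, compare (roughly)
the unconditional incref they have compiled in with the guarded form
described in this PEP; this is a simplified sketch, not the exact
header code::

    /* Simplified sketch; not the exact CPython headers. */

    /* What a pre-3.11 stable ABI extension effectively compiled in:
       an unconditional field update, so it still mutates ob_refcnt
       even for immortal objects. */
    #define OLD_Py_INCREF(op) ((op)->ob_refcnt++)

    /* The guarded form from this PEP: a no-op for immortal objects. */
    #define NEW_Py_INCREF(op)                                       \
        do {                                                        \
            if (((op)->ob_refcnt & _Py_IMMORTAL_BIT) == 0) {        \
                (op)->ob_refcnt++;                                  \
            }                                                       \
        } while (0)
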
Accidental De-Immortalizing
'''''''''''''''''''''''''''

32-bit builds of older stable ABI extensions can take `Accidental Immortality`_
to the next level.

Hypothetically, such an extension could incref an object to a value on
the next highest bit above the magic refcount value. For example, if
the magic value were 2^30 and the initial immortal refcount were thus
2^30 + 2^29 then it would take 2^29 increfs by the extension to reach
a value of 2^31, making the object non-immortal.
(Of course, a refcount that high would probably already cause a crash,
regardless of immortal objects.)

The more problematic case is where such a 32-bit stable ABI extension
goes crazy decref'ing an already immortal object. Continuing with the
above example, it would take 2^29 asymmetric decrefs to drop below the
magic immortal refcount value. So an object like ``None`` could be
made mortal and subject to decref. That still wouldn't be a problem
until somehow the decrefs continue on that object until it reaches 0.
For many immortal objects, like ``None``, the extension will crash
the process if it tries to dealloc the object. For the other
immortal objects, the dealloc might be okay. However, there will
be runtime code expecting the formerly-immortal object to be around
forever. That code will probably crash.

Again, the likelihood of this happening is extremely small, even on
32-bit builds. It would require roughly a billion decrefs on that
one object without a corresponding incref. The most likely scenario is
the following:

A "new" reference to ``None`` is returned by many functions and methods.
Unlike with non-immortal objects, the 3.11 runtime will almost never
incref ``None`` before giving it to the extension. However, the
extension *will* decref it when done with it (unless it returns it).
Each time that exchange happens with the one object, we get one step
closer to a crash.

How realistic is it that some form of that exchange (with a single
object) will happen a billion times in the lifetime of a Python process
on 32-bit? If it is a problem, how could it be addressed?

As to how realistic it is, the answer isn't currently clear. However, the
mitigation is simple enough that we can safely proceed under the
assumption that it would be a problem.

Here are some possible solutions (only needed on 32-bit):

* periodically reset the refcount for immortal objects
  (only enable this if a stable ABI extension is imported?)
* special-case immortal objects in tp_dealloc() for the relevant types
  (but not int, due to frequency?)
* provide a runtime flag for disabling immortality
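
As a rough sketch of the first option above (shown only to illustrate
the idea), a periodic pass could re-pin the refcount of the runtime's
known immortal objects; the helper name and the object list here are
hypothetical::

    /* Hypothetical sketch of "periodically reset the refcount for
       immortal objects" -- only relevant on 32-bit builds, and perhaps
       only when a stable ABI extension has been imported. */
    static void
    _Py_ResetImmortalRefcounts(void)
    {
        /* The real list would come from the runtime's registry of
           immortal objects; a few singletons are shown here. */
        PyObject *immortals[] = { Py_None, Py_True, Py_False };
        for (size_t i = 0; i < sizeof(immortals) / sizeof(immortals[0]); i++) {
            Py_SET_REFCNT(immortals[i], _Py_IMMORTAL_REFCNT);
        }
    }
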
Alternate Python Implementations
--------------------------------
@@ -318,8 +442,10 @@ to the following questions:

As part of this proposal, we must make sure that users can clearly
understand on which parts of the refcount behavior they can rely and
which are considered implementation details. Specifically, they should
use the existing public refcount-related API and the only refcount
values with any meaning are 0 and 1. (Some code relies on 1 as an
indicator that the object can be safely modified.) All other values
are considered "not 0 or 1".

This information will be clarified in the `documentation <Documentation_>`_.
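
For example, the "refcount of 1" idiom that remains meaningful looks
roughly like the following hypothetical helper::

    /* Hypothetical illustration: 0 and 1 are meaningful, everything
       else (including the huge refcount of an immortal object) is
       just "in use". */
    static int
    can_mutate_in_place(PyObject *obj)
    {
        if (Py_REFCNT(obj) == 1) {
            return 1;   /* sole reference: safe to modify/recycle */
        }
        /* Any other non-zero value -- small, large, or immortal --
           only means the object is in use somewhere. */
        return 0;
    }
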
@@ -343,27 +469,15 @@ Constraints

* be careful when immortalizing objects that we don't actually expect
  to persist until runtime finalization.
* be careful when immortalizing objects that are not otherwise immutable
* ``__del__`` and weakrefs must continue working properly

Regarding "truly" immutable objects, this PEP doesn't impact the
effective immutability of any objects, other than the per-object
runtime state (e.g. refcount). So whether or not some immortal object
is truly (or even effectively) immutable can only be settled separately
from this proposal. For example, str objects are generally considered
immutable, but ``PyUnicodeObject`` holds some lazily cached data. This
PEP has no influence on how that state affects str immutability.
Immortal Mutable Objects
------------------------

@@ -390,9 +504,6 @@ it immortal, we no longer incur the extra overhead during incref/decref.

We explore this idea further in the `mitigation`_ section below.

Implicitly Immortal Objects
---------------------------
@@ -437,14 +548,18 @@ _Py_IMMORTAL_REFCNT

We will add two internal constants::

    _Py_IMMORTAL_BIT - has the top-most available bit set (e.g. 2^62)
    _Py_IMMORTAL_REFCNT - has the two top-most available bits set

The actual top-most bit depends on existing uses for refcount bits,
e.g. the sign bit or some GC uses. We will use the highest bit possible
after consideration of existing uses.

The refcount for immortal objects will be set to ``_Py_IMMORTAL_REFCNT``
(meaning the value will be halfway between ``_Py_IMMORTAL_BIT`` and the
value at the next highest bit). However, to check if an object is
immortal we will compare (bitwise-and) its refcount against just
``_Py_IMMORTAL_BIT``.

The difference means that an immortal object will still be considered
immortal, even if somehow its refcount were modified (e.g. by an older
stable ABI extension).
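
For concreteness, one possible spelling of those constants and the
corresponding check (assuming a 64-bit build where bit 62 is the highest
available bit) is::

    /* One possible spelling; the actual bit and values may differ. */
    #define _Py_IMMORTAL_BIT    (1LL << 62)
    #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT >> 1))

    static inline int
    _Py_IsImmortal(PyObject *op)
    {
        /* Compare against just the bit rather than the full value, so
           an object stays immortal even if ob_refcnt has drifted. */
        return (op->ob_refcnt & _Py_IMMORTAL_BIT) != 0;
    }
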
@@ -471,24 +586,21 @@ API that exposes refcounts (unchanged but may now return large values):

(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
will not be affected.)

Also, immortal objects will not participate in GC.

Immortal Global Objects
-----------------------

All runtime-global (builtin) objects will be made immortal.
That includes the following:

* singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
* all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
* all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
  small ints)

The question of making them actually immutable (e.g. for
per-interpreter GIL) is not in the scope of this PEP.

Object Cleanup
--------------

@@ -515,6 +627,8 @@ accessible on the runtime state, e.g. in a ``_PyRuntimeState`` or
``PyInterpreterState`` field. We may need to add a tracking mechanism
to the runtime state for a small number of objects.

None of the cleanup will have a significant effect on performance.

.. _mitigation:

Performance Regression Mitigation
@@ -557,11 +671,18 @@ However, we will update the documentation to make public guarantees
about refcount behavior more clear. That includes, specifically:

* ``Py_INCREF()`` - change "Increment the reference count for object o."
  to "Indicate taking a new reference to object o."
* ``Py_DECREF()`` - change "Decrement the reference count for object o."
  to "Indicate no longer using a previously taken reference to object o."
* similar for ``Py_XINCREF()``, ``Py_XDECREF()``, ``Py_NewRef()``,
  ``Py_XNewRef()``, ``Py_Clear()``
* ``Py_REFCNT()`` - add "The refcounts 0 and 1 have specific meanings
  and all others only mean code somewhere is using the object,
  regardless of the value.
  0 means the object is not used and will be cleaned up.
  1 means code holds exactly a single reference."
* ``Py_SET_REFCNT()`` - refer to ``Py_REFCNT()`` about how values over 1
  may be substituted with some other value
We *may* also add a note about immortal objects to the following,
to help reduce any surprise users may have with the change:
@@ -586,9 +707,7 @@ https://github.com/python/cpython/pull/19474

Open Issues
===========

* how realistic is the `Accidental De-Immortalizing`_ concern?

References