|
|
|
PEP: 683
|
|
|
|
Title: Immortal Objects, Using a Fixed Refcount
|
|
|
|
Author: Eric Snow <ericsnowcurrently@gmail.com>, Eddie Elizondo <eduardo.elizondorueda@gmail.com>
|
|
|
|
Discussions-To: https://mail.python.org/archives/list/python-dev@python.org/thread/TPLEYDCXFQ4AMTW6F6OQFINSIFYBRFCR/
|
|
|
|
Status: Draft
|
|
|
|
Type: Standards Track
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
Created: 10-Feb-2022
|
|
|
|
Python-Version: 3.11
|
|
|
|
Post-History: 15-Feb-2022
|
|
|
|
Resolution:
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
========
|
|
|
|
|
|
|
|
Currently the CPython runtime maintains a
|
|
|
|
`small amount of mutable state <Runtime Object State_>`_ in the
|
|
|
|
allocated memory of each object. Because of this, otherwise immutable
|
|
|
|
objects are actually mutable. This can have a large negative impact
|
|
|
|
on CPU and memory performance, especially for approaches to increasing
|
|
|
|
Python's scalability. The solution proposed here provides a way
|
|
|
|
to mark an object as one for which that per-object
|
|
|
|
runtime state should not change.
|
|
|
|
|
|
|
|
Specifically, if an object's refcount matches a very specific value
|
|
|
|
(defined below) then that object is treated as "immortal". If an object
|
|
|
|
is immortal then its refcount will never be modified by ``Py_INCREF()``,
|
|
|
|
etc. Consequently, the refcount will never reach 0, so that object will
|
|
|
|
never be cleaned up (unless explicitly done, e.g. during runtime
|
|
|
|
finalization). Additionally, all other per-object runtime state
|
|
|
|
for an immortal object will be considered immutable.
|
|
|
|
|
|
|
|
This approach has some possible negative impact, which is explained
|
|
|
|
below, along with mitigations. A critical requirement for this change
|
|
|
|
is that the performance regression be no more than 2-3%. Anything worse
|
|
|
|
than performance-neutral requires that the other benefits are proportionally
|
|
|
|
large. Aside from specific applications, the fundamental improvement
|
|
|
|
here is that now an object can be truly immutable.
|
|
|
|
|
|
|
|
(This proposal is meant to be CPython-specific and to affect only
|
|
|
|
internal implementation details. There are some slight exceptions
|
|
|
|
to that which are explained below. See `Backward Compatibility`_,
|
|
|
|
`Public Refcount Details`_, and `scope`_.)
|
|
|
|
|
|
|
|
|
|
|
|
Motivation
|
|
|
|
==========
|
|
|
|
|
|
|
|
As noted above, currently all objects are effectively mutable. That
|
|
|
|
includes "immutable" objects like ``str`` instances. This is because
|
|
|
|
every object's refcount is frequently modified as the object is used
|
|
|
|
during execution. This is especially significant for a number of
|
|
|
|
commonly used global (builtin) objects, e.g. ``None``. Such objects
|
|
|
|
are used a lot, both in Python code and internally. That adds up to
|
|
|
|
a consistent high volume of refcount changes.
|
|
|
|
|
|
|
|
The effective mutability of all Python objects has a concrete impact
|
|
|
|
on parts of the Python community, e.g. projects that aim for
|
|
|
|
scalability like Instagram or the effort to make the GIL
|
|
|
|
per-interpreter. Below we describe several ways in which refcount
|
|
|
|
modification has a real negative effect on such projects.
|
|
|
|
None of that would happen for objects that are truly immutable.
|
|
|
|
|
|
|
|
Reducing CPU Cache Invalidation
|
|
|
|
-------------------------------
|
|
|
|
|
|
|
|
Every modification of a refcount causes the corresponding CPU cache
|
|
|
|
line to be invalidated. This has a number of effects.
|
|
|
|
|
|
|
|
For one, the write must be propagated to other cache levels
|
|
|
|
and to main memory. This has a small effect on all Python programs.
|
|
|
|
Immortal objects would provide a slight relief in that regard.
|
|
|
|
|
|
|
|
On top of that, multi-core applications pay a price. If two threads
|
|
|
|
(running simultaneously on distinct cores) are interacting with the
|
|
|
|
same object (e.g. ``None``) then they will end up invalidating each
|
|
|
|
other's caches with each incref and decref. This is true even for
|
|
|
|
otherwise immutable objects like ``True``, ``0``, and ``str`` instances.
|
|
|
|
CPython's GIL helps reduce this effect, since only one thread runs at a
|
|
|
|
time, but it doesn't completely eliminate the penalty.
|
|
|
|
|
|
|
|
Avoiding Data Races
|
|
|
|
-------------------
|
|
|
|
|
|
|
|
Speaking of multi-core, we are considering making the GIL
|
|
|
|
a per-interpreter lock, which would enable true multi-core parallelism.
|
|
|
|
Among other things, the GIL currently protects against races between
|
|
|
|
multiple concurrent threads that may incref or decref the same object.
|
|
|
|
Without a shared GIL, two running interpreters could not safely share
|
|
|
|
any objects, even otherwise immutable ones like ``None``.
|
|
|
|
|
|
|
|
This means that, to have a per-interpreter GIL, each interpreter must
|
|
|
|
have its own copy of *every* object. That includes the singletons and
|
|
|
|
static types. We have a viable strategy for that but it will require
|
|
|
|
a meaningful amount of extra effort and extra complexity.
|
|
|
|
|
|
|
|
The alternative is to ensure that all shared objects are truly immutable.
|
|
|
|
There would be no races because there would be no modification. This
|
|
|
|
is something that the immortality proposed here would enable for
|
|
|
|
otherwise immutable objects. With immortal objects,
|
|
|
|
support for a per-interpreter GIL
|
|
|
|
becomes much simpler.
|
|
|
|
|
|
|
|
Avoiding Copy-on-Write
|
|
|
|
----------------------
|
|
|
|
|
|
|
|
For some applications it makes sense to get the application into
|
|
|
|
a desired initial state and then fork the process for each worker.
|
|
|
|
This can result in a large performance improvement, especially in
|
|
|
|
memory usage. Several enterprise Python users (e.g. Instagram,
|
|
|
|
YouTube) have taken advantage of this. However, the above
|
|
|
|
refcount semantics drastically reduce the benefits and
|
|
|
|
have led to some sub-optimal workarounds.
|
|
|
|
|
|
|
|
Also note that "fork" isn't the only operating system mechanism
|
|
|
|
that uses copy-on-write semantics. Anything that uses ``mmap``
|
|
|
|
relies on copy-on-write, including sharing data from shared object
|
|
|
|
files between processes.
|
|
|
|
|
|
|
|
|
|
|
|
Rationale
|
|
|
|
=========
|
|
|
|
|
|
|
|
The proposed solution is obvious enough that both of this proposal's
|
|
|
|
authors came to the same conclusion (and implementation, more or less)
|
|
|
|
independently. The Pyston project `uses a similar approach <Pyston_>`_.
|
|
|
|
Other designs were also considered. Several possibilities have also
|
|
|
|
been discussed on python-dev in past years.
|
|
|
|
|
|
|
|
Alternatives include:
|
|
|
|
|
|
|
|
* use a high bit to mark "immortal" but do not change ``Py_INCREF()``
|
|
|
|
* add an explicit flag to objects
|
|
|
|
* implement via the type (``tp_dealloc()`` is a no-op)
|
|
|
|
* track via the object's type object
|
|
|
|
* track with a separate table
|
|
|
|
|
|
|
|
Each of the above makes objects immortal, but none of them address
|
|
|
|
the performance penalties from refcount modification described above.
|
|
|
|
|
|
|
|
In the case of per-interpreter GIL, the only realistic alternative
|
|
|
|
is to move all global objects into ``PyInterpreterState`` and add
|
|
|
|
one or more lookup functions to access them. Then we'd have to
|
|
|
|
add some hacks to the C-API to preserve compatibility for the
|
|
|
|
many objects exposed there. The story is much, much simpler
|
|
|
|
with immortal objects.
|
|
|
|
|
|
|
|
|
|
|
|
Impact
|
|
|
|
======
|
|
|
|
|
|
|
|
Benefits
|
|
|
|
--------
|
|
|
|
|
|
|
|
Most notably, the cases described in the two examples above stand
|
|
|
|
to benefit greatly from immortal objects. Projects using pre-fork
|
|
|
|
can drop their workarounds. For the per-interpreter GIL project,
|
|
|
|
immortal objects greatly simplify the solution for existing static
|
|
|
|
types, as well as objects exposed by the public C-API.
|
|
|
|
|
|
|
|
In general, a strong immutability guarantee for objects enables Python
|
|
|
|
applications to scale like never before. This is because they can
|
|
|
|
then leverage multi-core parallelism without a tradeoff in memory
|
|
|
|
usage. This is reflected in most of the above cases.
|
|
|
|
|
|
|
|
Performance
|
|
|
|
-----------
|
|
|
|
|
|
|
|
A naive implementation shows `a 4% slowdown`_.
|
|
|
|
Several promising mitigation strategies will be pursued in the effort
|
|
|
|
to bring it closer to performance-neutral. See the `mitigation`_
|
|
|
|
section below.
|
|
|
|
|
|
|
|
On the positive side, immortal objects save a significant amount of
|
|
|
|
memory when used with a pre-fork model. Also, immortal objects provide
|
|
|
|
opportunities for specialization in the eval loop that would improve
|
|
|
|
performance.
|
|
|
|
|
|
|
|
.. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709
|
|
|
|
|
|
|
|
Backward Compatibility
|
|
|
|
----------------------
|
|
|
|
|
|
|
|
This proposal is meant to be completely compatible. It focuses strictly
|
|
|
|
on internal implementation details. It does not involve changes to any
|
|
|
|
public API, other than a few minor changes in behavior related to refcounts
|
|
|
|
(but only for immortal objects):
|
|
|
|
|
|
|
|
* code that inspects the refcount will see a really, really large value
|
|
|
|
* the new no-op behavior may break code that:
|
|
|
|
|
|
|
|
  * depends specifically on the refcount to always increment or decrement
    (or have a specific value from ``Py_SET_REFCNT()``)
  * relies on any specific refcount value, other than 0
  * directly manipulates the refcount to store extra information there
|
|
|
|
|
|
|
|
Again, those changes in behavior only apply to immortal objects, not
|
|
|
|
most of the objects a user will access. Furthermore, users cannot mark
|
|
|
|
an object as immortal so no user-created objects will ever have that
|
|
|
|
changed behavior. Users that rely on any of the changing behavior for
|
|
|
|
global (builtin) objects are already in trouble.
|
|
|
|
|
|
|
|
Also note that code which checks for refleaks should keep working fine,
|
|
|
|
unless it checks for hard-coded small values relative to some immortal
|
|
|
|
object. The problems noticed by `Pyston`_ shouldn't apply here since
|
|
|
|
we do not modify the refcount.
|
|
|
|
|
|
|
|
See `Public Refcount Details`_ and `scope`_ below for further discussion.
|
|
|
|
|
|
|
|
Stable ABI
|
|
|
|
----------
|
|
|
|
|
|
|
|
The approach is also compatible with extensions compiled to the stable
|
|
|
|
ABI. Unfortunately, they will modify the refcount and invalidate all
|
|
|
|
the performance benefits of immortal objects. However, the high bit
|
|
|
|
of the refcount `will still match _Py_IMMORTAL_REFCNT <_Py_IMMORTAL_REFCNT_>`_
|
|
|
|
so we can still identify such objects as immortal. At worst, objects
|
|
|
|
in that situation would feel the effects described in the `Motivation`_
|
|
|
|
section. Even then the overall impact is unlikely to be significant.
|
|
|
|
|
|
|
|
Also see `_Py_IMMORTAL_REFCNT`_ below.
|
|
|
|
|
|
|
|
Accidental Immortality
|
|
|
|
----------------------
|
|
|
|
|
|
|
|
Hypothetically, a regular object could be incref'ed so much that it
|
|
|
|
reaches the magic value needed to be considered immortal. That means
|
|
|
|
it would accidentally never be cleaned up (by going back to 0).
|
|
|
|
|
|
|
|
While it isn't impossible, this accidental scenario is so unlikely
|
|
|
|
that we need not worry. Even if done deliberately by using
|
|
|
|
``Py_INCREF()`` in a tight loop and each iteration only took 1 CPU
|
|
|
|
cycle, it would take 2^61 cycles (on a 64-bit processor). At a fast
|
|
|
|
5 GHz that would still take nearly 500,000,000 seconds (over 5,000 days)!
|
|
|
|
If that CPU were 32-bit then it is (technically) more possible though
|
|
|
|
still highly unlikely.
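
The 64-bit estimate above can be reproduced with a few lines of arithmetic.
This is a minimal sketch in Python that takes its inputs from the text
(one ``Py_INCREF()`` per CPU cycle, 2^61 increments, a 5 GHz clock); the
variable names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted in the text.
# Assumptions (taken from the text): one Py_INCREF() per CPU cycle,
# 2**61 increments needed, and a 5 GHz clock.
INCREMENTS_NEEDED = 2 ** 61
CYCLES_PER_SECOND = 5_000_000_000  # 5 GHz
SECONDS_PER_DAY = 60 * 60 * 24

seconds = INCREMENTS_NEEDED / CYCLES_PER_SECOND
days = seconds / SECONDS_PER_DAY

print(f"{seconds:,.0f} seconds, or about {days:,.0f} days")
```

Under these assumptions the result is roughly 461 million seconds, which is
a little over 5,300 days and matches the "nearly 500,000,000 seconds"
figure given above.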
|
|
|
|
|
|
|
|
Also note that it is doubly unlikely to be a problem because it wouldn't
|
|
|
|
matter until the refcount got back to 0 and the object was cleaned up.
|
|
|
|
So any object that hit that magic "immortal" refcount value would have
|
|
|
|
to be decref'ed that many times again before the change in behavior
|
|
|
|
would be noticed.
|
|
|
|
|
|
|
|
Again, the only realistic way that the magic refcount would be reached
|
|
|
|
(and then reversed) is if it were done deliberately. (Of course, the
|
|
|
|
same thing could be done efficiently using ``Py_SET_REFCNT()`` though
|
|
|
|
that would be even less of an accident.) At that point we don't
|
|
|
|
consider it a concern of this proposal.
|
|
|
|
|
|
|
|
Alternate Python Implementations
|
|
|
|
--------------------------------
|
|
|
|
|
|
|
|
This proposal is CPython-specific. However, it does relate to the
|
|
|
|
behavior of the C-API, which may affect other Python implementations.
|
|
|
|
Consequently, the effect of changed behavior described in
|
|
|
|
`Backward Compatibility`_ above also applies here (e.g. if another
|
|
|
|
implementation is tightly coupled to specific refcount values, other
|
|
|
|
than 0, or on exactly how refcounts change, then it may be impacted).
|
|
|
|
|
|
|
|
Security Implications
|
|
|
|
---------------------
|
|
|
|
|
|
|
|
This feature has no known impact on security.
|
|
|
|
|
|
|
|
Maintainability
|
|
|
|
---------------
|
|
|
|
|
|
|
|
This is not a complex feature so it should not cause much mental
|
|
|
|
overhead for maintainers. The basic implementation doesn't touch
|
|
|
|
much code so it should not have much impact on maintainability. There
|
|
|
|
may be some extra complexity due to performance penalty mitigation.
|
|
|
|
However, that should be limited to where we immortalize all
|
|
|
|
objects post-init and that code will be in one place.
|
|
|
|
|
|
|
|
|
|
|
|
Specification
|
|
|
|
=============
|
|
|
|
|
|
|
|
The approach involves these fundamental changes:
|
|
|
|
|
|
|
|
* add `_Py_IMMORTAL_REFCNT`_ (the magic value) to the internal C-API
|
|
|
|
* update ``Py_INCREF()`` and ``Py_DECREF()`` to no-op for objects with
|
|
|
|
  the magic refcount (or its most significant bit)
|
|
|
|
* do the same for any other API that modifies the refcount
|
|
|
|
* stop modifying ``PyGC_Head`` for immortal GC objects ("containers")
|
|
|
|
* ensure that all immortal objects are cleaned up during
|
|
|
|
  runtime finalization
|
|
|
|
|
|
|
|
Then setting any object's refcount to ``_Py_IMMORTAL_REFCNT``
|
|
|
|
makes it immortal.
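
The core of the changes listed above can be pictured with a small toy model.
The sketch below is written in Python and is not CPython code:
``ToyObject``, ``incref()``, ``decref()``, and ``is_immortal()`` are
hypothetical stand-ins for ``PyObject``, ``Py_INCREF()``, ``Py_DECREF()``,
and the immortality check, and a 64-bit ``Py_ssize_t`` is assumed.

```python
# Toy model of the proposed refcount behavior (illustrative only).
POINTER_BITS = 64  # assumed width of Py_ssize_t
IMMORTAL_BIT = 1 << (POINTER_BITS - 4)
IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2


class ToyObject:
    """Stand-in for a PyObject with only a refcount field."""

    def __init__(self):
        self.refcnt = 1
        self.deallocated = False


def is_immortal(obj):
    return (obj.refcnt & IMMORTAL_BIT) != 0


def incref(obj):
    if is_immortal(obj):
        return  # no-op: the per-object state is left untouched
    obj.refcnt += 1


def decref(obj):
    if is_immortal(obj):
        return  # no-op: the refcount can never reach 0
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.deallocated = True  # a real runtime would free the object here


mortal = ToyObject()
immortal = ToyObject()
immortal.refcnt = IMMORTAL_REFCNT  # "immortalize" the object

# One incref followed by two decrefs: enough to drop a fresh object to 0.
for obj in (mortal, immortal):
    incref(obj)
    decref(obj)
    decref(obj)

print(mortal.refcnt, mortal.deallocated)
print(immortal.refcnt == IMMORTAL_REFCNT, immortal.deallocated)
```

In this model the mortal object reaches a refcount of 0 and is marked as
deallocated, while the immortal object's refcount is never written to.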
|
|
|
|
|
|
|
|
(There are other minor, internal changes which are not described here.)
|
|
|
|
|
|
|
|
In the following sub-sections we dive into the details. First we will
|
|
|
|
cover some conceptual topics, followed by more concrete aspects like
|
|
|
|
specific affected APIs.
|
|
|
|
|
|
|
|
Public Refcount Details
|
|
|
|
-----------------------
|
|
|
|
|
|
|
|
In `Backward Compatibility`_ we introduced possible ways that user code
|
|
|
|
might be broken by the change in this proposal. Any contributing
|
|
|
|
misunderstanding by users is likely due in large part to the names of
|
|
|
|
the refcount-related API and to how the documentation explains those
|
|
|
|
API (and refcounting in general).
|
|
|
|
|
|
|
|
Between the names and the docs, we can clearly see answers
|
|
|
|
to the following questions:
|
|
|
|
|
|
|
|
* what behavior do users expect?
|
|
|
|
* what guarantees do we make?
|
|
|
|
* do we indicate how to interpret the refcount value they receive?
|
|
|
|
* what are the use cases under which a user would set an object's
|
|
|
|
  refcount to a specific value?
|
|
|
|
* are users setting the refcount of objects they did not create?
|
|
|
|
|
|
|
|
As part of this proposal, we must make sure that users can clearly
|
|
|
|
understand on which parts of the refcount behavior they can rely and
|
|
|
|
which are considered implementation details. Specifically, they should
|
|
|
|
use the existing public refcount-related API and the only refcount value
|
|
|
|
with any meaning is 0. All other values are considered "not 0".
|
|
|
|
|
|
|
|
This information will be clarified in the `documentation <Documentation_>`_.
|
|
|
|
|
|
|
|
Arguably, the existing refcount-related API should be modified to reflect
|
|
|
|
what we want users to expect. Something like the following:
|
|
|
|
|
|
|
|
* ``Py_INCREF()`` -> ``Py_ACQUIRE_REF()`` (or only support ``Py_NewRef()``)
|
|
|
|
* ``Py_DECREF()`` -> ``Py_RELEASE_REF()``
|
|
|
|
* ``Py_REFCNT()`` -> ``Py_HAS_REFS()``
|
|
|
|
* ``Py_SET_REFCNT()`` -> ``Py_RESET_REFS()`` and ``Py_SET_NO_REFS()``
|
|
|
|
|
|
|
|
However, such a change is not a part of this proposal. It is included
|
|
|
|
here to demonstrate the tighter focus for user expectations that would
|
|
|
|
benefit this change.
|
|
|
|
|
|
|
|
Constraints
|
|
|
|
-----------
|
|
|
|
|
|
|
|
* ensure that otherwise immutable objects can be truly immutable
|
|
|
|
* minimize performance penalty for normal Python use cases
|
|
|
|
* be careful when immortalizing objects that we don't actually expect
|
|
|
|
  to persist until runtime finalization.
|
|
|
|
* be careful when immortalizing objects that are not otherwise immutable
|
|
|
|
|
|
|
|
.. _scope:
|
|
|
|
|
|
|
|
Scope of Changes
|
|
|
|
----------------
|
|
|
|
|
|
|
|
Object immortality is not meant to be a public feature but rather an
|
|
|
|
internal one. So the proposal does *not* include adding any new
|
|
|
|
public C-API, nor any Python API. However, this does not prevent
|
|
|
|
us from adding (publicly accessible) private API to do things
|
|
|
|
like immortalize an object or tell if one is immortal.
|
|
|
|
|
|
|
|
The particular details of:
|
|
|
|
|
|
|
|
* how to mark something as immortal
|
|
|
|
* how to recognize something as immortal
|
|
|
|
* which subset of functionally immortal objects are marked as immortal
|
|
|
|
* which memory-management activities are skipped or modified for immortal objects
|
|
|
|
|
|
|
|
are not only CPython-specific but are also private implementation
|
|
|
|
details that are expected to change in subsequent versions.
|
|
|
|
|
|
|
|
Immortal Mutable Objects
|
|
|
|
------------------------
|
|
|
|
|
|
|
|
Any object can be marked as immortal. We do not propose any
|
|
|
|
restrictions or checks. However, in practice the value of making an
|
|
|
|
object immortal relates to its mutability and depends on the likelihood
|
|
|
|
it would be used for a sufficient portion of the application's lifetime.
|
|
|
|
Marking a mutable object as immortal can make sense in some situations.
|
|
|
|
|
|
|
|
Many of the use cases for immortal objects center on immutability, so
|
|
|
|
that threads can safely and efficiently share such objects without
|
|
|
|
locking. For this reason a mutable object, like a dict or list, would
|
|
|
|
never be shared (and thus would not be made immortal). However, immortality may
|
|
|
|
be appropriate if there is sufficient guarantee that the normally
|
|
|
|
mutable object won't actually be modified.
|
|
|
|
|
|
|
|
On the other hand, some mutable objects will never be shared between
|
|
|
|
threads (at least not without a lock like the GIL). In some cases it
|
|
|
|
may be practical to make some of those immortal too. For example,
|
|
|
|
``sys.modules`` is a per-interpreter dict that we do not expect to ever
|
|
|
|
get freed until the corresponding interpreter is finalized. By making
|
|
|
|
it immortal, we no longer incur the extra overhead during incref/decref.
|
|
|
|
|
|
|
|
We explore this idea further in the `mitigation`_ section below.
|
|
|
|
|
|
|
|
(Note that we are still investigating the impact on GC
|
|
|
|
of immortalizing containers.)
|
|
|
|
|
|
|
|
Implicitly Immortal Objects
|
|
|
|
---------------------------
|
|
|
|
|
|
|
|
If an immortal object holds a reference to a normal (mortal) object
|
|
|
|
then that held object is effectively immortal. This is because that
|
|
|
|
object's refcount can never reach 0 until the immortal object releases
|
|
|
|
it.
|
|
|
|
|
|
|
|
Examples:
|
|
|
|
|
|
|
|
* containers like ``dict`` and ``list``
|
|
|
|
* objects that hold references internally like ``PyTypeObject.tp_subclasses``
|
|
|
|
* an object's type (held in ``ob_type``)
|
|
|
|
|
|
|
|
Such held objects are thus implicitly immortal for as long as they are
|
|
|
|
held. In practice, this should have no real consequences since it
|
|
|
|
really isn't a change in behavior. The only difference is that the
|
|
|
|
immortal object (holding the reference) doesn't ever get cleaned up.
|
|
|
|
|
|
|
|
We do not propose that such implicitly immortal objects be changed
|
|
|
|
in any way. They should not be explicitly marked as immortal just
|
|
|
|
because they are held by an immortal object. That would provide
|
|
|
|
no advantage over doing nothing.
|
|
|
|
|
|
|
|
Un-Immortalizing Objects
|
|
|
|
------------------------
|
|
|
|
|
|
|
|
This proposal does not include any mechanism for taking an immortal
|
|
|
|
object and returning it to a "normal" condition. Currently there
|
|
|
|
is no need for such an ability.
|
|
|
|
|
|
|
|
On top of that, the obvious approach is to simply set the refcount
|
|
|
|
to a small value. However, at that point there is no way of knowing
|
|
|
|
which value would be safe. Ideally we'd set it to the value that it
|
|
|
|
would have been if it hadn't been made immortal. However, that value
|
|
|
|
has long been lost. Hence the complexities involved make it less
|
|
|
|
likely that an object could safely be un-immortalized, even if we
|
|
|
|
had a good reason to do so.
|
|
|
|
|
|
|
|
_Py_IMMORTAL_REFCNT
|
|
|
|
-------------------
|
|
|
|
|
|
|
|
We will add two internal constants::
|
|
|
|
|
|
|
|
    #define _Py_IMMORTAL_BIT (1LL << (8 * sizeof(Py_ssize_t) - 4))
    #define _Py_IMMORTAL_REFCNT (_Py_IMMORTAL_BIT + (_Py_IMMORTAL_BIT / 2))
|
|
|
|
|
|
|
|
The refcount for immortal objects will be set to ``_Py_IMMORTAL_REFCNT``.
|
|
|
|
However, to check if an object is immortal we will compare its refcount
|
|
|
|
against just the bit::
|
|
|
|
|
|
|
|
    (op->ob_refcnt & _Py_IMMORTAL_BIT) != 0
|
|
|
|
|
|
|
|
The difference means that an immortal object will still be considered
|
|
|
|
immortal, even if somehow its refcount were modified (e.g. by an older
|
|
|
|
stable ABI extension).
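
To see why checking only the bit tolerates such modifications, the
arithmetic can be modeled in Python. This is a sketch under the assumption
of a 64-bit ``Py_ssize_t``; the names mirror the constants above but the
code is an illustrative model, not CPython code, and the "headroom" reading
of the constant is our interpretation rather than a stated design goal.

```python
# Model of the two constants for an assumed 64-bit Py_ssize_t.
BITS = 64
IMMORTAL_BIT = 1 << (BITS - 4)
IMMORTAL_REFCNT = IMMORTAL_BIT + IMMORTAL_BIT // 2


def is_immortal(refcnt):
    """Same test as (op->ob_refcnt & _Py_IMMORTAL_BIT) != 0."""
    return (refcnt & IMMORTAL_BIT) != 0


# An extension built against an older stable ABI would increment and
# decrement unconditionally. Model a large imbalance in each direction.
drift = 1_000_000_000
after_increfs = IMMORTAL_REFCNT + drift
after_decrefs = IMMORTAL_REFCNT - drift

# The bit only flips once the value drifts by half of IMMORTAL_BIT.
headroom = IMMORTAL_BIT // 2

print(is_immortal(after_increfs), is_immortal(after_decrefs))
print(is_immortal(IMMORTAL_REFCNT - headroom))      # lowest value with the bit set
print(is_immortal(IMMORTAL_REFCNT - headroom - 1))  # one below: bit is clear
```

Because ``_Py_IMMORTAL_REFCNT`` sits halfway between two values at which
the bit would change, a stray billion increfs or decrefs leaves the check
unaffected in this model.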
|
|
|
|
|
|
|
|
Note that the top two bits of the refcount are already reserved for other
|
|
|
|
uses. That's why we are using the third top-most bit.
|
|
|
|
|
|
|
|
Affected API
|
|
|
|
------------
|
|
|
|
|
|
|
|
API that will now ignore immortal objects:
|
|
|
|
|
|
|
|
* (public) ``Py_INCREF()``
|
|
|
|
* (public) ``Py_DECREF()``
|
|
|
|
* (public) ``Py_SET_REFCNT()``
|
|
|
|
* (private) ``_Py_NewReference()``
|
|
|
|
|
|
|
|
API that exposes refcounts (unchanged but may now return large values):
|
|
|
|
|
|
|
|
* (public) ``Py_REFCNT()``
|
|
|
|
* (public) ``sys.getrefcount()``
|
|
|
|
|
|
|
|
(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
|
|
|
|
will not be affected.)
|
|
|
|
|
|
|
|
Immortal Global Objects
|
|
|
|
-----------------------
|
|
|
|
|
|
|
|
All objects that we expect to be shared globally (between interpreters)
|
|
|
|
will be made immortal. That includes the following:
|
|
|
|
|
|
|
|
* singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
|
|
|
|
* all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
|
|
|
|
* all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
|
|
|
|
  small ints)
|
|
|
|
|
|
|
|
All such objects will be immutable. In the case of the static types,
|
|
|
|
they will only be effectively immutable. ``PyTypeObject`` has some mutable
|
|
|
|
state (``tp_dict`` and ``tp_subclasses``), but we can work around this
|
|
|
|
by storing that state on ``PyInterpreterState`` instead of on the
|
|
|
|
respective static type object. Then the ``__dict__``, etc. getter
|
|
|
|
will do a lookup on the current interpreter, if appropriate, instead
|
|
|
|
of using ``tp_dict``.
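
To make the workaround above more concrete, here is a toy model, in Python,
of keeping a static type's mutable state on the interpreter rather than on
the shared type object. All names (``ToyInterpreterState``,
``ToyStaticType``, ``get_type_dict()``) are hypothetical and only illustrate
the idea; the actual data structures in CPython may look quite different.

```python
# Toy model only: none of these names exist in CPython.
class ToyStaticType:
    """A shared, effectively immutable type object."""

    def __init__(self, name):
        self.name = name


class ToyInterpreterState:
    """Per-interpreter storage for state that would otherwise live on the type."""

    def __init__(self):
        # Maps a static type's name to this interpreter's private state.
        self.static_type_state = {}


def get_type_dict(interp, static_type):
    """Look up the type's dict on the given interpreter, creating it lazily."""
    state = interp.static_type_state.setdefault(
        static_type.name, {"tp_dict": {}, "tp_subclasses": []}
    )
    return state["tp_dict"]


shared_type = ToyStaticType("toy")
main_interp = ToyInterpreterState()
sub_interp = ToyInterpreterState()

# A change made through one interpreter is invisible to the other,
# and the shared type object itself is never written to.
get_type_dict(main_interp, shared_type)["attr"] = 1
print("attr" in get_type_dict(main_interp, shared_type))
print("attr" in get_type_dict(sub_interp, shared_type))
```

The point of the indirection is that the shared object stays untouched, so
it can be immortal and shared between interpreters, while each interpreter
still sees its own mutable view.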
|
|
|
|
|
|
|
|
Object Cleanup
|
|
|
|
--------------
|
|
|
|
|
|
|
|
In order to clean up all immortal objects during runtime finalization,
|
|
|
|
we must keep track of them.
|
|
|
|
|
|
|
|
For GC objects ("containers") we'll leverage the GC's permanent
|
|
|
|
generation by pushing all immortalized containers there. During
|
|
|
|
runtime shutdown, the strategy will be to first let the runtime try
|
|
|
|
its best to deallocate these instances normally. Most
|
|
|
|
of the module deallocation will now be handled by
|
|
|
|
``pylifecycle.c:finalize_modules()`` which cleans up the remaining
|
|
|
|
modules as best as we can. It will change which modules are available
|
|
|
|
during ``__del__``, but that's already explicitly undefined behavior in the
|
|
|
|
docs. Optionally, we could do some topological ordering to guarantee
|
|
|
|
that user modules will be deallocated first before the stdlib modules.
|
|
|
|
Finally, anything left over (if any) can be found through the permanent
|
|
|
|
generation GC list which we can clear after ``finalize_modules()``.
|
|
|
|
|
|
|
|
For non-container objects, the tracking approach will vary on a
|
|
|
|
case-by-case basis. In nearly every case, each such object is directly
|
|
|
|
accessible on the runtime state, e.g. in a ``_PyRuntimeState`` or
|
|
|
|
``PyInterpreterState`` field. We may need to add a tracking mechanism
|
|
|
|
to the runtime state for a small number of objects.
|
|
|
|
|
|
|
|
.. _mitigation:
|
|
|
|
|
|
|
|
Performance Regression Mitigation
|
|
|
|
---------------------------------
|
|
|
|
|
|
|
|
In the interest of clarity, here are some of the ways we are going
|
|
|
|
to try to recover some of the lost `performance <Performance_>`_:
|
|
|
|
|
|
|
|
* at the end of runtime init, mark all objects as immortal
|
|
|
|
* drop refcount operations in code where we know the object is immortal
|
|
|
|
  (e.g. ``Py_RETURN_NONE``)
|
|
|
|
* specialize for immortal objects in the eval loop (see `Pyston`_)
|
|
|
|
|
|
|
|
Regarding that first point, we can apply the concept from
|
|
|
|
`Immortal Mutable Objects`_ in the pursuit of getting back some of
|
|
|
|
that 4% performance we lose with the naive implementation of immortal
|
|
|
|
objects. At the end of runtime init we can mark *all* objects as
|
|
|
|
immortal and avoid the extra cost in incref/decref. We only need
|
|
|
|
to worry about immutability with objects that we plan on sharing
|
|
|
|
between threads without a GIL.
|
|
|
|
|
|
|
|
Note that none of this section is part of the proposal.
|
|
|
|
The above is included here for clarity.
|
|
|
|
|
|
|
|
Possible Changes
|
|
|
|
----------------
|
|
|
|
|
|
|
|
* mark every interned string as immortal
|
|
|
|
* mark the "interned" dict as immortal if it is shared; otherwise share all interned strings
|
|
|
|
* (Larry,MvL) mark all constants unmarshalled for a module as immortal
|
|
|
|
* (Larry,MvL) allocate (immutable) immortal objects in their own memory page(s)
|
|
|
|
|
|
|
|
Documentation
|
|
|
|
-------------
|
|
|
|
|
|
|
|
The immortal objects behavior and API are internal, implementation
|
|
|
|
details and will not be added to the documentation.
|
|
|
|
|
|
|
|
However, we will update the documentation to make public guarantees
|
|
|
|
about refcount behavior clearer. That includes, specifically:
|
|
|
|
|
|
|
|
* ``Py_INCREF()`` - change "Increment the reference count for object o."
|
|
|
|
  to "Acquire a new reference to object o."
|
|
|
|
* ``Py_DECREF()`` - change "Decrement the reference count for object o."
|
|
|
|
  to "Release a reference to object o."
|
|
|
|
* similar for ``Py_XINCREF()``, ``Py_XDECREF()``, ``Py_NewRef()``,
|
|
|
|
  ``Py_XNewRef()``, ``Py_CLEAR()``, ``Py_REFCNT()``, and ``Py_SET_REFCNT()``
|
|
|
|
|
|
|
|
We *may* also add a note about immortal objects to the following,
|
|
|
|
to help reduce any surprise users may have with the change:
|
|
|
|
|
|
|
|
* ``Py_SET_REFCNT()`` (a no-op for immortal objects)
|
|
|
|
* ``Py_REFCNT()`` (value may be surprisingly large)
|
|
|
|
* ``sys.getrefcount()`` (value may be surprisingly large)
|
|
|
|
|
|
|
|
Other API that might benefit from such notes are currently undocumented.
|
|
|
|
We wouldn't add such a note anywhere else (including for ``Py_INCREF()``
|
|
|
|
and ``Py_DECREF()``) since the feature is otherwise transparent to users.
|
|
|
|
|
|
|
|
|
|
|
|
Reference Implementation
|
|
|
|
========================
|
|
|
|
|
|
|
|
The implementation is proposed on GitHub:
|
|
|
|
|
|
|
|
https://github.com/python/cpython/pull/19474
|
|
|
|
|
|
|
|
|
|
|
|
Open Issues
|
|
|
|
===========
|
|
|
|
|
|
|
|
* is there any other impact on GC?
|
|
|
|
* `are the copy-on-write benefits real? <https://mail.python.org/archives/list/python-dev@python.org/message/J53GY7XKFOI4KWHSTTA7FUL7TJLE7WG6/>`__
|
|
|
|
* must the fate of this PEP be tied to acceptance of a per-interpreter GIL PEP?
|
|
|
|
|
|
|
|
|
|
|
|
References
|
|
|
|
==========
|
|
|
|
|
|
|
|
.. _Pyston: https://mail.python.org/archives/list/python-dev@python.org/message/JLHRTBJGKAENPNZURV4CIJSO6HI62BV3/
|
|
|
|
|
|
|
|
Prior Art
|
|
|
|
---------
|
|
|
|
|
|
|
|
* `Pyston`_
|
|
|
|
|
|
|
|
Discussions
|
|
|
|
-----------
|
|
|
|
|
|
|
|
This was discussed in December 2021 on python-dev:
|
|
|
|
|
|
|
|
* https://mail.python.org/archives/list/python-dev@python.org/thread/7O3FUA52QGTVDC6MDAV5WXKNFEDRK5D6/#TBTHSOI2XRWRO6WQOLUW3X7S5DUXFAOV
|
|
|
|
* https://mail.python.org/archives/list/python-dev@python.org/thread/PNLBJBNIQDMG2YYGPBCTGOKOAVXRBJWY
|
|
|
|
|
|
|
|
Runtime Object State
|
|
|
|
--------------------
|
|
|
|
|
|
|
|
Here is the internal state that the CPython runtime keeps
|
|
|
|
for each Python object:
|
|
|
|
|
|
|
|
* `PyObject.ob_refcnt`_: the object's `refcount <refcounting_>`_
|
|
|
|
* `PyGC_Head`_: (optional) the object's node in a list of `"GC" objects <refcounting_>`_
|
|
|
|
* `_PyObject_HEAD_EXTRA <PyObject_HEAD_EXTRA_>`_: (optional) the object's node in the list of heap objects
|
|
|
|
|
|
|
|
``ob_refcnt`` is part of the memory allocated for every object.
|
|
|
|
However, ``_PyObject_HEAD_EXTRA`` is allocated only if CPython was built
|
|
|
|
with ``Py_TRACE_REFS`` defined. ``PyGC_Head`` is allocated only if the
|
|
|
|
object's type has ``Py_TPFLAGS_HAVE_GC`` set. Typically this is only
|
|
|
|
container types (e.g. ``list``). Also note that ``PyObject.ob_refcnt``
|
|
|
|
and ``_PyObject_HEAD_EXTRA`` are part of ``PyObject_HEAD``.
|
|
|
|
|
|
|
|
.. _PyObject.ob_refcnt: https://github.com/python/cpython/blob/80a9ba537f1f1666a9e6c5eceef4683f86967a1f/Include/object.h#L107
|
|
|
|
.. _PyGC_Head: https://github.com/python/cpython/blob/80a9ba537f1f1666a9e6c5eceef4683f86967a1f/Include/internal/pycore_gc.h#L11-L20
|
|
|
|
.. _PyObject_HEAD_EXTRA: https://github.com/python/cpython/blob/80a9ba537f1f1666a9e6c5eceef4683f86967a1f/Include/object.h#L68-L72
|
|
|
|
|
|
|
|
.. _refcounting:
|
|
|
|
|
|
|
|
Reference Counting, with Cyclic Garbage Collection
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
Garbage collection is a memory management feature of some programming
|
|
|
|
languages. It means objects are cleaned up (e.g. memory freed)
|
|
|
|
once they are no longer used.
|
|
|
|
|
|
|
|
Refcounting is one approach to garbage collection. The language runtime
|
|
|
|
tracks how many references are held to an object. When code takes
|
|
|
|
ownership of a reference to an object or releases it, the runtime
|
|
|
|
is notified and it increments or decrements the refcount accordingly.
|
|
|
|
When the refcount reaches 0, the runtime cleans up the object.
|
|
|
|
|
|
|
|
With CPython, code must explicitly take or release references using
|
|
|
|
the C-API's ``Py_INCREF()`` and ``Py_DECREF()``. These macros happen
|
|
|
|
to directly modify the object's refcount (unfortunately, since that
|
|
|
|
causes ABI compatibility issues if we want to change our garbage
|
|
|
|
collection scheme). Also, when an object is cleaned up in CPython,
|
|
|
|
it releases any references (and resources) it owns
|
|
|
|
(before its memory is freed).
|
|
|
|
|
|
|
|
Sometimes objects may be involved in reference cycles, e.g. where
|
|
|
|
object A holds a reference to object B and object B holds a reference
|
|
|
|
to object A. Consequently, neither object would ever be cleaned up
|
|
|
|
even if no other references were held (i.e. a memory leak). The
|
|
|
|
most common objects involved in cycles are containers.
|
|
|
|
|
|
|
|
CPython has dedicated machinery to deal with reference cycles, which
|
|
|
|
we call the "cyclic garbage collector", or often just
|
|
|
|
"garbage collector" or "GC". Don't let the name confuse you.
|
|
|
|
It only deals with breaking reference cycles.
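
The difference between plain refcounting and the cyclic collector can be
observed from Python. The following is a small sketch that assumes CPython
(other implementations manage memory differently); the ``Node`` class and
``make_cycle()`` helper are made up for the example.

```python
import gc
import weakref


class Node:
    """A container-like object that can reference another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other = b
    b.other = a
    # Return only a weak reference so the cycle has no outside owners.
    return weakref.ref(a)


was_enabled = gc.isenabled()
gc.disable()  # keep automatic collection from running mid-example

ref = make_cycle()
alive_before = ref() is not None  # refcounts alone never reach 0 here
gc.collect()                      # run the cyclic garbage collector explicitly
alive_after = ref() is not None

if was_enabled:
    gc.enable()

print(alive_before, alive_after)
```

On CPython the two nodes survive until ``gc.collect()`` runs, because each
still holds a reference to the other; once the collector breaks the cycle,
the weak reference goes dead.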
|
|
|
|
|
|
|
|
See the docs for a more detailed explanation of refcounting
|
|
|
|
and cyclic garbage collection:
|
|
|
|
|
|
|
|
* https://docs.python.org/3.11/c-api/intro.html#reference-counts
|
|
|
|
* https://docs.python.org/3.11/c-api/refcounting.html
|
|
|
|
* https://docs.python.org/3.11/c-api/typeobj.html#c.PyObject.ob_refcnt
|
|
|
|
* https://docs.python.org/3.11/c-api/gcsupport.html
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
=========
|
|
|
|
|
|
|
|
This document is placed in the public domain or under the
|
|
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
Local Variables:
|
|
|
|
mode: indented-text
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
sentence-end-double-space: t
|
|
|
|
fill-column: 70
|
|
|
|
coding: utf-8
|
|
|
|
End:
|