PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow <ericsnowcurrently@gmail.com>, Eddie Elizondo <eduardo.elizondorueda@gmail.com>
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2022
Python-Version: 3.11
Post-History:
Resolution:


Abstract
========

Under this proposal, any object may be marked as immortal.
"Immortal" means the object will never be cleaned up (at least until
runtime finalization). Specifically, the `refcount`_ for an immortal
object is set to a sentinel value, and that refcount is never changed
by ``Py_INCREF()``, ``Py_DECREF()``, or ``Py_SET_REFCNT()``.
For immortal containers, the ``PyGC_Head`` is never
changed by the garbage collector.

Avoiding changes to the refcount is an essential part of this
proposal. For what we call "immutable" objects, it makes them
truly immutable. As described further below, this allows us
to avoid performance penalties in scenarios that
would otherwise be prohibitive.

This proposal is CPython-specific and, effectively, describes
internal implementation details.

.. _refcount: https://docs.python.org/3.11/c-api/intro.html#reference-counts


Motivation
==========

Without immortal objects, all objects are effectively mutable. That
includes "immutable" objects like ``None`` and ``str`` instances.
This is because every object's refcount is frequently modified
as it is used during execution. In addition, for containers
the runtime may modify the object's ``PyGC_Head``. This
runtime-internal state currently prevents
full immutability.

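As a small illustration of the point above, the following sketch uses
``sys.getrefcount()`` to watch the refcount of an ordinary object change
simply because references to it are created and dropped. The names
``obj`` and ``holders`` are ours, and the exact counts are a CPython
detail, so treat this as a demonstration rather than a specification.

```python
import sys

# A fresh, ordinary (non-immortal) object whose value never changes.
obj = object()

before = sys.getrefcount(obj)

# Creating ten more references does not touch the object's value,
# yet each one writes to the refcount stored in the object's header.
holders = [obj] * 10
during = sys.getrefcount(obj)

# Dropping the references writes to the header again.
del holders
after = sys.getrefcount(obj)

print(before, during, after)
```

On CPython the middle number is expected to be ten higher than the other
two, which is exactly the kind of hidden write the text describes.
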
This has a concrete impact on active projects in the Python community.
Below we describe several ways in which refcount modification has
a real negative effect on those projects. None of that would
happen for objects that are truly immutable.

Reducing Cache Invalidation
---------------------------

Every modification of a refcount causes the corresponding cache
line to be invalidated. This has a number of effects.

For one, the write must be propagated to other cache levels
and to main memory. This has a small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.

On top of that, multi-core applications pay a price. If two threads
are interacting with the same object (e.g. ``None``) then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances. This is also true even with
the GIL, though the impact is smaller.

Avoiding Data Races
-------------------

Speaking of multi-core, we are considering making the GIL
a per-interpreter lock, which would enable true multi-core parallelism.
Among other things, the GIL currently protects against races between
multiple threads that concurrently incref or decref. Without a shared
GIL, two running interpreters could not safely share any objects,
even otherwise immutable ones like ``None``.

This means that, to have a per-interpreter GIL, each interpreter must
have its own copy of *every* object, including the singletons and
static types. We have a viable strategy for that but it will
require a meaningful amount of extra effort and extra
complexity.

The alternative is to ensure that all shared objects are truly immutable.
There would be no races because there would be no modification. This
is something that the immortality proposed here would enable for
otherwise immutable objects. With immortal objects,
support for a per-interpreter GIL
becomes much simpler.

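The race described above can be sketched with a deterministic toy model.
It is not a real multi-threaded program: it simply replays one unlucky
interleaving of two unsynchronized increfs. ``IMMORTAL_REFCNT`` and
``two_increfs()`` are hypothetical names chosen for this sketch, and the
sentinel value is an assumption.

```python
# Hypothetical sentinel; the real value of _Py_IMMORTAL_REFCNT is an
# implementation detail and is not specified here.
IMMORTAL_REFCNT = 1 << 62


def two_increfs(start, interleaved, honor_immortal=False):
    """Return the refcount after two modeled threads each incref once."""
    if honor_immortal and start >= IMMORTAL_REFCNT:
        # No store happens at all, so there is nothing to race on.
        return start
    if not interleaved:
        # Each thread loads, adds one, and stores before the other runs.
        return start + 1 + 1
    loaded_a = start        # thread A loads the count
    loaded_b = start        # thread B loads the same value
    stored = loaded_a + 1   # thread A stores its result
    stored = loaded_b + 1   # thread B overwrites A's store
    return stored


serial = two_increfs(5, interleaved=False)
racy = two_increfs(5, interleaved=True)
immortal = two_increfs(IMMORTAL_REFCNT, interleaved=True,
                       honor_immortal=True)
print(serial, racy, immortal == IMMORTAL_REFCNT)
```

In the interleaved case one incref is lost, which in a real runtime could
later lead to a premature deallocation. In the immortal case the count is
never written, so the interleaving is harmless.
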
Avoiding Copy-on-Write
----------------------

For some applications it makes sense to get the application into
a desired initial state and then fork the process for each worker.
This can result in a large performance improvement, especially
in memory usage. Several enterprise Python users (e.g. Instagram,
YouTube) have taken advantage of this. However, the above
refcount semantics drastically reduce the benefits and
have led to some sub-optimal workarounds.

Also note that "fork" isn't the only operating system mechanism
that uses copy-on-write semantics.

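To make the copy-on-write cost above more tangible, here is a rough
model of how refcount writes dirty shared pages in a forked child. The
page size, the uniform object size, and the helper ``pages_dirtied()``
are all assumptions made for this sketch; a real allocator lays objects
out differently.

```python
PAGE_SIZE = 4096    # assumed page size in bytes
OBJECT_SIZE = 64    # assumed size of every object in this model


def pages_dirtied(num_objects, touched, immortal=frozenset()):
    """Count the pages a forked child would have to copy.

    ``touched`` holds the indexes of objects the child merely reads.
    Reading still increfs, which writes to the object's page, unless
    the object is listed in ``immortal``.
    """
    per_page = PAGE_SIZE // OBJECT_SIZE
    dirty = {
        index // per_page
        for index in touched
        if 0 <= index < num_objects and index not in immortal
    }
    return len(dirty)


# The child reads only one object out of every 64 created before the fork.
read_by_child = set(range(0, 10_000, 64))

copied_without = pages_dirtied(10_000, read_by_child)
copied_with = pages_dirtied(10_000, read_by_child,
                            immortal=frozenset(read_by_child))
print(copied_without, copied_with)
```

Under these assumptions, reading under 2% of the objects is enough to
copy every one of the 157 shared pages, while none are copied if the
objects that were read are immortal.
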

Rationale
=========

The proposed solution is obvious enough that two people came to the
same conclusion (and implementation, more or less) independently.
Other designs were also considered. Several possibilities
have also been discussed on python-dev in past years.

Alternatives include:

* use a high bit to mark "immortal" but do not change ``Py_INCREF()``
* add an explicit flag to objects
* implement via the type (``tp_dealloc()`` is a no-op)
* track via the object's type object
* track with a separate table

Each of the above makes objects immortal, but none of them addresses
the performance penalties from refcount modification described above.

In the case of per-interpreter GIL, the only realistic alternative
is to move all global objects into ``PyInterpreterState`` and add
one or more lookup functions to access them. Then we'd have to
add some hacks to the C-API to preserve compatibility for the
many objects exposed there. The story is much, much simpler
with immortal objects.


Impact
======

Benefits
--------

Most notably, the pre-fork and per-interpreter GIL cases described
above stand to benefit greatly from immortal objects. Projects using
pre-fork can drop their workarounds. For the per-interpreter GIL project,
immortal objects greatly simplify the solution for existing static
types, as well as objects exposed by the public C-API.

In general, a strong immutability guarantee for objects enables Python
applications to scale like never before. This is because they can
then leverage multi-core parallelism without a tradeoff in memory
usage. This is reflected in most of the above cases.


Performance
-----------

A naive implementation shows `a 4% slowdown`_.
Several promising mitigation strategies will be pursued in the effort
to bring it closer to performance-neutral.

On the positive side, immortal objects save a significant amount of
memory when used with a pre-fork model. Also, immortal objects provide
opportunities for specialization in the eval loop that would improve
performance.

.. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709

Backward Compatibility
----------------------

This proposal is fully backward compatible. It is internal-only so no API
is changing.

The approach is also compatible with extensions compiled to the stable
ABI. Unfortunately, such extensions will still modify the refcount and
invalidate all the performance benefits of immortal objects. However,
the high bit of the refcount will still match ``_Py_IMMORTAL_REFCNT``
so we can still identify such objects as immortal.

No user-facing behavior changes, with the following exceptions:

* code that inspects the refcount (e.g. ``sys.getrefcount()``
  or directly via ``ob_refcnt``) will see a really, really large
  value
* ``Py_SET_REFCNT()`` will be a no-op for immortal objects

Neither should cause a problem.

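The stable ABI remark above can be illustrated with plain integer
arithmetic. The bit position and the sentinel value below are
assumptions made for this sketch (the proposal does not fix them here),
and ``legacy_decref()`` stands in for an extension compiled against the
old, unconditional ``Py_DECREF()``.

```python
# Hypothetical layout for a 64-bit refcount field.
IMMORTAL_BIT = 1 << 62
# Hypothetical sentinel: the marker bit plus some headroom below it, so
# that stray decrements do not immediately clear the marker bit.
IMMORTAL_REFCNT = IMMORTAL_BIT | (1 << 61)


def is_immortal(refcnt):
    """Check only the marker bit, not the exact sentinel value."""
    return bool(refcnt & IMMORTAL_BIT)


def legacy_decref(refcnt):
    """Model an old extension that decrements unconditionally."""
    return refcnt - 1


refcnt = IMMORTAL_REFCNT
for _ in range(1000):
    refcnt = legacy_decref(refcnt)

print(refcnt == IMMORTAL_REFCNT, is_immortal(refcnt))
```

The count has drifted away from the sentinel, so an equality test would
fail, but the bit test still identifies the object as immortal. In this
model that only works because the sentinel leaves headroom around the
marker bit.
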
Alternate Python Implementations
--------------------------------

This proposal is CPython-specific.

Security Implications
---------------------

This feature has no known impact on security.

Maintainability
---------------

This is not a complex feature so it should not cause much mental
overhead for maintainers. The basic implementation doesn't touch
much code so it should not have much impact on maintainability. There
may be some extra complexity due to performance penalty mitigation.
However, that should be limited to where we immortalize all
objects post-init and that code will be in one place.

Non-Obvious Consequences
------------------------

* immortal containers effectively immortalize each contained item
* the same is true for objects held internally by other objects
  (e.g. ``PyTypeObject.tp_subclasses``)
* an immortal object's type is effectively immortal
* though extremely unlikely (and technically hard), any object could
  be incref'ed enough to reach ``_Py_IMMORTAL_REFCNT`` and then
  be treated as immortal

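A back-of-the-envelope calculation suggests why the last consequence
above is unlikely on a 64-bit build. Every number here is an assumption
for illustration: the sentinel value, the incref rate, and the pointer
size are ours, not part of the proposal.

```python
IMMORTAL_REFCNT = 1 << 62             # hypothetical sentinel value
INCREFS_PER_SECOND = 1_000_000_000    # assumed, deliberately optimistic
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
POINTER_SIZE = 8                      # bytes, assuming a 64-bit build

# Time needed to reach the sentinel by incrementing in a tight loop.
years = IMMORTAL_REFCNT / INCREFS_PER_SECOND / SECONDS_PER_YEAR

# Memory needed if every reference were backed by a stored pointer.
bytes_for_pointers = IMMORTAL_REFCNT * POINTER_SIZE

print(f"about {years:.0f} years of continuous increfs")
print(bytes_for_pointers > 2 ** 64)
```

With these figures the loop would run for well over a century, and
holding that many genuine references would need more memory than a
64-bit address space can describe. Only a bug that increfs without
storing a reference could plausibly get there.
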

Specification
=============

The approach involves these fundamental changes:

* add ``_Py_IMMORTAL_REFCNT`` (the magic value) to the internal C-API
* update ``Py_INCREF()`` and ``Py_DECREF()`` to no-op for objects with
  the magic refcount (or its most significant bit)
* do the same for any other API that modifies the refcount
* stop modifying ``PyGC_Head`` for immortal containers
* ensure that all immortal objects are cleaned up during
  runtime finalization

Then setting any object's refcount to ``_Py_IMMORTAL_REFCNT``
makes it immortal.

To be clear, we will likely use the most-significant bit of
``_Py_IMMORTAL_REFCNT`` to tell if an object is immortal, rather
than comparing with ``_Py_IMMORTAL_REFCNT`` directly.

(There are other minor, internal changes which are not described here.)

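The refcount-related changes listed above can be modelled in a few lines
of Python. This is a toy model for readers, not the C implementation:
the class ``Obj``, the lowercase function names, and both constants are
hypothetical stand-ins for the corresponding C-level pieces.

```python
IMMORTAL_BIT = 1 << 62                      # hypothetical marker bit
IMMORTAL_REFCNT = IMMORTAL_BIT | (1 << 61)  # hypothetical magic value


class Obj:
    """Stand-in for an object header: a refcount and a liveness flag."""

    def __init__(self):
        self.refcnt = 1
        self.deallocated = False


def is_immortal(op):
    # Test the marker bit rather than comparing with the magic value.
    return bool(op.refcnt & IMMORTAL_BIT)


def incref(op):
    if not is_immortal(op):
        op.refcnt += 1


def decref(op):
    if is_immortal(op):
        return
    op.refcnt -= 1
    if op.refcnt == 0:
        op.deallocated = True


def set_refcnt(op, value):
    if not is_immortal(op):
        op.refcnt = value


def immortalize(op):
    op.refcnt = IMMORTAL_REFCNT


mortal, immortal = Obj(), Obj()
immortalize(immortal)
for op in (mortal, immortal):
    incref(op)
    decref(op)
    decref(op)
set_refcnt(immortal, 1)

print(mortal.deallocated, immortal.deallocated,
      immortal.refcnt == IMMORTAL_REFCNT)
```

The mortal object is released once its count reaches zero, while the
immortal one ignores every operation, including the explicit
``set_refcnt()`` call.
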
This is not meant to be a public feature but rather an internal one.
So the proposal does *not* include adding any new public C-API,
nor any Python API. However, this does not prevent us from
adding (publicly accessible) private API to do things
like immortalize an object or tell if one
is immortal.

Affected API
------------

API that will now ignore immortal objects:

* (public) ``Py_INCREF()``
* (public) ``Py_DECREF()``
* (public) ``Py_SET_REFCNT()``
* (private) ``_Py_NewReference()``

API that exposes refcounts (unchanged but may now return large values):

* (public) ``Py_REFCNT()``
* (public) ``sys.getrefcount()``

(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
will not be affected.)

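A quick way to see which behaviour a given interpreter has is to inspect
the refcount of ``None`` before and after creating many references to
it. This sketch only uses ``sys.getrefcount()``; what it prints depends
on the interpreter, so no particular output is claimed.

```python
import sys

before = sys.getrefcount(None)

# A thousand new references to the singleton.
extra = [None] * 1000
after = sys.getrefcount(None)

# Without immortal objects the count is expected to grow with the new
# references; with them it is expected to stay at a large fixed value.
state = "unchanged" if after == before else "changed"
print(before, after, state)
```
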
Immortal Global Objects
-----------------------

The following objects will be made immortal:

* singletons (``None``, ``True``, ``False``, ``Ellipsis``, ``NotImplemented``)
* all static types (e.g. ``PyLong_Type``, ``PyExc_Exception``)
* all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
  small ints)

There will likely be others we have not enumerated here.

Object Cleanup
--------------

In order to clean up all immortal objects during runtime finalization,
we must keep track of them.

For container objects we'll leverage the GC's permanent generation by
pushing all immortalized containers there. During runtime shutdown, the
strategy will be to first let the runtime make a best effort at
deallocating these instances normally. Most of the module deallocation
will now be handled by ``pylifecycle.c:finalize_modules``, which cleans up
the remaining modules as best as we can. It will change which modules
are available during ``__del__`` but that's already defined as undefined
behavior by the docs. Optionally, we could do some topological ordering
to guarantee that user modules will be deallocated before the
stdlib modules. Finally, anything left over (if any) can be found
through the permanent generation gc list which we can clear after
``finalize_modules``.

For non-container objects, the tracking approach will vary on a
case-by-case basis. In nearly every case, each such object is directly
accessible on the runtime state, e.g. in a ``_PyRuntimeState`` or
``PyInterpreterState`` field. We may need to add a tracking mechanism
to the runtime state for a small number of objects.

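The permanent generation mentioned above is already reachable from
Python through the ``gc`` module, which gives a feel for the mechanism.
The sketch below uses the public ``gc.freeze()``, ``gc.unfreeze()`` and
``gc.get_freeze_count()`` functions; how the proposal would push
immortalized containers there internally is not shown and may differ.
``Node`` is a throwaway class for this example.

```python
import gc


class Node:
    """A container type, so its instances are tracked by the GC."""

    def __init__(self):
        self.ref = self


nodes = [Node() for _ in range(100)]

# Move every currently tracked object into the permanent generation,
# where future collections ignore it.
gc.freeze()
frozen = gc.get_freeze_count()

# Move the objects back so they can be collected normally again.
gc.unfreeze()
thawed = gc.get_freeze_count()

print(frozen > 0, thawed)
```

Objects in the permanent generation stay enumerable by the runtime,
which is what makes it usable as the tracking list for cleanup at
finalization.
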
Documentation
-------------

The feature itself is internal and will not be added to the documentation.

We *may* add a note about immortal objects to the following,
to help reduce any surprise users may have with the change:

* ``Py_SET_REFCNT()`` (a no-op for immortal objects)
* ``Py_REFCNT()`` (value may be surprisingly large)
* ``sys.getrefcount()`` (value may be surprisingly large)

Other API that might benefit from such notes are currently undocumented.

We wouldn't add a note anywhere else (including for ``Py_INCREF()`` and
``Py_DECREF()``) since the feature is otherwise transparent to users.


Rejected Ideas
==============

Equate Immortal with Immutable
------------------------------

Making a mutable object immortal isn't particularly helpful.
The exception is if you can ensure the object isn't actually
modified again. Since we aren't enforcing any immutability
for immortal objects it didn't make sense to emphasize
that relationship.


Reference Implementation
========================

The implementation is proposed on GitHub:

https://github.com/python/cpython/pull/19474


Open Issues
===========

* is there any other impact on GC?


References
==========

This was discussed in December 2021 on python-dev:

* https://mail.python.org/archives/list/python-dev@python.org/thread/7O3FUA52QGTVDC6MDAV5WXKNFEDRK5D6/#TBTHSOI2XRWRO6WQOLUW3X7S5DUXFAOV
* https://mail.python.org/archives/list/python-dev@python.org/thread/PNLBJBNIQDMG2YYGPBCTGOKOAVXRBJWY


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: