PEP 683: Updates for Round 4 of Discussions (gh-2766)
parent 0850bfb6c0
commit 290853700e
62
pep-0683.rst
|
@ -72,14 +72,19 @@ Here's a high-level look at the implementation:
|
||||||
If an object's refcount matches a very specific value (defined below)
|
If an object's refcount matches a very specific value (defined below)
|
||||||
then that object is treated as immortal. The CPython C-API and runtime
|
then that object is treated as immortal. The CPython C-API and runtime
|
||||||
will not modify the refcount (or other runtime state) of an immortal
|
will not modify the refcount (or other runtime state) of an immortal
|
||||||
object.
|
object. The runtime will now be explicitly responsible for deallocating
|
||||||
|
all immortal objects during finalization, unless statically allocated.
|
||||||
|
(See `Object Cleanup`_ below.)
|
||||||
|
|
||||||
Aside from the change to refcounting semantics, there is one other
|
Aside from the change to refcounting semantics, there is one other
|
||||||
possible negative impact to consider. The threshold for an "acceptable"
|
possible negative impact to consider. The threshold for an "acceptable"
|
||||||
performance penalty for immortal objects is 2% (the consensus at the
|
performance penalty for immortal objects is 2% (the consensus at the
|
||||||
2022 Language Summit). A naive implementation of the approach described
|
2022 Language Summit). A naive implementation of the approach described
|
||||||
below makes CPython roughly 6% slower. However, the implementation
|
below makes CPython roughly 4% slower. However, the implementation
|
||||||
is performance-neutral once known mitigations are applied.
|
is ~performance-neutral~ once known mitigations are applied.
|
||||||
|
|
||||||
|
TODO: Update the performance impact for the latest branch
|
||||||
|
(both for GCC and for clang).
|
||||||
|
|
||||||
|
|
||||||
Motivation
|
Motivation
|
||||||
|
@ -95,7 +100,7 @@ a consistent high volume of refcount changes.
|
||||||
|
|
||||||
The effective mutability of all Python objects has a concrete impact
|
The effective mutability of all Python objects has a concrete impact
|
||||||
on parts of the Python community, e.g. projects that aim for
|
on parts of the Python community, e.g. projects that aim for
|
||||||
scalability like Instragram or the effort to make the GIL
|
scalability like Instagram or the effort to make the GIL
|
||||||
per-interpreter. Below we describe several ways in which refcount
|
per-interpreter. Below we describe several ways in which refcount
|
||||||
modification has a real negative effect on such projects.
|
modification has a real negative effect on such projects.
|
||||||
None of that would happen for objects that are truly immutable.
|
None of that would happen for objects that are truly immutable.
|
||||||
|
@ -152,9 +157,10 @@ refcount semantics drastically reduce the benefits and
|
||||||
have led to some sub-optimal workarounds.
|
have led to some sub-optimal workarounds.
|
||||||
|
|
||||||
Also note that "fork" isn't the only operating system mechanism
|
Also note that "fork" isn't the only operating system mechanism
|
||||||
that uses copy-on-write semantics. Anything that uses ``mmap``
|
that uses copy-on-write semantics. Another example is ``mmap``.
|
||||||
relies on copy-on-write, including sharing data from shared object
|
Any such utility will potentially benefit from fewer copy-on-writes
|
||||||
files between processes.
|
when immortal objects are involved, when compared to using only
|
||||||
|
"mortal" objects.
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
@ -198,24 +204,29 @@ immortal objects greatly simplifies the solution for existing static
|
||||||
types, as well as objects exposed by the public C-API.
|
types, as well as objects exposed by the public C-API.
|
||||||
|
|
||||||
In general, a strong immutability guarantee for objects enables Python
|
In general, a strong immutability guarantee for objects enables Python
|
||||||
applications to scale like never before. This is because they can
|
applications to scale better, particularly in
|
||||||
then leverage multi-core parallelism without a tradeoff in memory
|
`multi-process deployments <Facebook>`_. This is because they can then
|
||||||
usage. This is reflected in most of the above cases.
|
leverage multi-core parallelism without such a significant tradeoff in
|
||||||
|
memory usage as they now have. The cases we just described, as well as
|
||||||
|
those described above in `Motivation`_, reflect this improvement.
|
||||||
|
|
||||||
Performance
|
Performance
|
||||||
-----------
|
-----------
|
||||||
|
|
||||||
A naive implementation shows `a 4% slowdown`_. We have demonstrated
|
A naive implementation shows `a 4% slowdown`_. We have demonstrated
|
||||||
a return to performance-neutral with a handful of basic mitigations
|
a return to ~performance-neutral~ with a handful of basic mitigations
|
||||||
applied. See the `mitigations`_ section below.
|
applied. See the `mitigations`_ section below.
|
||||||
|
|
||||||
On the positive side, immortal objects save a significant amount of
|
On the positive side, immortal objects save a significant amount of
|
||||||
memory when used with a pre-fork model. Also, immortal objects provide
|
memory when used `with a pre-fork model <Facebook>`_. Also, immortal
|
||||||
opportunities for specialization in the eval loop that would improve
|
objects provide opportunities for specialization in the eval loop that
|
||||||
performance.
|
would improve performance.
|
||||||
|
|
||||||
.. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709
|
.. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709
|
||||||
|
|
||||||
|
TODO: Update the performance impact for the latest branch
|
||||||
|
(both for GCC and for clang).
|
||||||
|
|
||||||
Backward Compatibility
|
Backward Compatibility
|
||||||
----------------------
|
----------------------
|
||||||
|
|
||||||
|
@ -503,9 +514,10 @@ mutable object won't actually be modified.
|
||||||
On the other hand, some mutable objects will never be shared between
|
On the other hand, some mutable objects will never be shared between
|
||||||
threads (at least not without a lock like the GIL). In some cases it
|
threads (at least not without a lock like the GIL). In some cases it
|
||||||
may be practical to make some of those immortal too. For example,
|
may be practical to make some of those immortal too. For example,
|
||||||
``sys.modules`` is a per-interpreter dict that we do not expect to ever
|
``sys.modules`` is a per-interpreter dict that we do not expect to
|
||||||
get freed until the corresponding interpreter is finalized. By making
|
ever get freed until the corresponding interpreter is finalized
|
||||||
it immortal, we no longer incur the extra overhead during incref/decref.
|
(assuming it isn't replaced). By making it immortal, we would
|
||||||
|
no longer incur the extra overhead during incref/decref.
|
||||||
|
|
||||||
We explore this idea further in the `mitigations`_ section below.
|
We explore this idea further in the `mitigations`_ section below.
|
||||||
|
|
||||||
|
@ -520,7 +532,8 @@ it.
|
||||||
Examples:
|
Examples:
|
||||||
|
|
||||||
* containers like ``dict`` and ``list``
|
* containers like ``dict`` and ``list``
|
||||||
* objects that hold references internally like ``PyTypeObject.tp_subclasses``
|
* objects that hold references internally like ``PyTypeObject`` with
|
||||||
|
its ``tp_subclasses`` and ``tp_weaklist``
|
||||||
* an object's type (held in ``ob_type``)
|
* an object's type (held in ``ob_type``)
|
||||||
|
|
||||||
Such held objects are thus implicitly immortal for as long as they are
|
Such held objects are thus implicitly immortal for as long as they are
|
||||||
|
@ -573,6 +586,9 @@ stable ABI extension).
|
||||||
Note that the top two bits of the refcount are already reserved for other
|
Note that the top two bits of the refcount are already reserved for other
|
||||||
uses. That's why we are using the third top-most bit.
|
uses. That's why we are using the third top-most bit.
|
||||||
|
|
||||||
|
The implementation is also open to using other values for the immortal
|
||||||
|
bit, such as the sign bit or 2^31 (for saturated refcounts on 64-bit).
|
||||||
|
|
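The bit arithmetic described above can be sketched in pure Python. This is only an illustration, not the CPython implementation: it assumes a 64-bit refcount field, counts bits from the most significant end, and treats "immortal" as a simple bit test. The names ``REFCNT_BITS``, ``IMMORTAL_BIT``, ``is_immortal``, ``incref`` and ``decref`` are hypothetical.

```python
# Sketch only: models the refcount check with plain integers.
# All names here are hypothetical and do not exist in CPython.
REFCNT_BITS = 64  # assumed width of the refcount field

# Counting from the most significant bit, bits 63 and 62 are the
# "top two" reserved bits, so the third top-most bit is bit 61.
IMMORTAL_BIT = 1 << (REFCNT_BITS - 3)


def is_immortal(refcnt: int) -> bool:
    """Return True if the immortal bit is set in the given refcount."""
    return (refcnt & IMMORTAL_BIT) != 0


def incref(refcnt: int) -> int:
    """Return the refcount after an incref; immortal values are untouched."""
    return refcnt if is_immortal(refcnt) else refcnt + 1


def decref(refcnt: int) -> int:
    """Return the refcount after a decref; immortal values are untouched."""
    return refcnt if is_immortal(refcnt) else refcnt - 1


mortal = incref(1)                # an ordinary object gains a reference
immortal = incref(IMMORTAL_BIT)   # an immortal object is left unchanged
print(mortal, immortal == IMMORTAL_BIT)
```

Using the sign bit or a saturated 32-bit value, as the text mentions, would change only the constant and the test in ``is_immortal``; the incref/decref shape stays the same.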
||||||
Affected API
|
Affected API
|
||||||
------------
|
------------
|
||||||
|
|
||||||
|
@ -588,9 +604,11 @@ API that exposes refcounts (unchanged but may now return large values):
|
||||||
* (public) ``Py_REFCNT()``
|
* (public) ``Py_REFCNT()``
|
||||||
* (public) ``sys.getrefcount()``
|
* (public) ``sys.getrefcount()``
|
||||||
|
|
||||||
(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
|
(Note that ``_Py_RefTotal``, and consequently ``sys.gettotalrefcount()``,
|
||||||
will not be affected.)
|
will not be affected.)
|
||||||
|
|
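One way to see what "may now return large values" means is to probe ``sys.getrefcount()`` directly. The following is a minimal sketch that assumes CPython; the helper ``refcount_is_stable`` is a hypothetical name. It makes no claim about a specific version: on an interpreter without immortal objects, ``None`` behaves like any other object here.

```python
import sys


def refcount_is_stable(obj, extra_refs: int = 1000) -> bool:
    """Return True if new references leave sys.getrefcount(obj) unchanged."""
    before = sys.getrefcount(obj)
    holders = [obj] * extra_refs  # each list slot holds a strong reference
    after = sys.getrefcount(obj)
    del holders
    return after == before


# The reported value for None depends on the interpreter version and build.
print("refcount of None:", sys.getrefcount(None))
print("None unaffected by new references:", refcount_is_stable(None))
print("fresh object unaffected:", refcount_is_stable(object()))
```

Code that compares refcounts against small constants, or that expects them to change after taking a reference, is the kind of code that would notice the difference.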
||||||
|
TODO: clarify the status of ``_Py_RefTotal``.
|
||||||
|
|
||||||
Also, immortal objects will not participate in GC.
|
Also, immortal objects will not participate in GC.
|
||||||
|
|
||||||
Immortal Global Objects
|
Immortal Global Objects
|
||||||
|
@ -678,6 +696,7 @@ other possibilities
|
||||||
* mark the "interned" dict as immortal if shared else share all interned strings
|
* mark the "interned" dict as immortal if shared else share all interned strings
|
||||||
* (Larry,MAL) mark all constants unmarshalled for a module as immortal
|
* (Larry,MAL) mark all constants unmarshalled for a module as immortal
|
||||||
* (Larry,MAL) allocate (immutable) immortal objects in their own memory page(s)
|
* (Larry,MAL) allocate (immutable) immortal objects in their own memory page(s)
|
||||||
|
* saturated refcounts using the 32 least-significant bits
|
||||||
|
|
||||||
Solutions for Accidental De-Immortalization
|
Solutions for Accidental De-Immortalization
|
||||||
-------------------------------------------
|
-------------------------------------------
|
||||||
|
@ -730,6 +749,9 @@ has further detail.)
|
||||||
Regardless of the solution we end up with, we can do something else
|
Regardless of the solution we end up with, we can do something else
|
||||||
later if necessary.
|
later if necessary.
|
||||||
|
|
||||||
|
TODO: Add a note indicating that the implemented solution does not
|
||||||
|
affect the overall ~performance-neutral~ outcome.
|
||||||
|
|
||||||
Documentation
|
Documentation
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
|
@ -784,6 +806,8 @@ References
|
||||||
|
|
||||||
.. _Pyston: https://mail.python.org/archives/list/python-dev@python.org/message/JLHRTBJGKAENPNZURV4CIJSO6HI62BV3/
|
.. _Pyston: https://mail.python.org/archives/list/python-dev@python.org/message/JLHRTBJGKAENPNZURV4CIJSO6HI62BV3/
|
||||||
|
|
||||||
|
.. _Facebook: https://www.facebook.com/watch/?v=437636037237097&t=560
|
||||||
|
|
||||||
Prior Art
|
Prior Art
|
||||||
---------
|
---------
|
||||||
|
|
||||||
|
|