PEP 683: Updates for Round 4 of Discussions (gh-2766)
parent 0850bfb6c0
commit 290853700e
62
pep-0683.rst
@@ -72,14 +72,19 @@ Here's a high-level look at the implementation:
 If an object's refcount matches a very specific value (defined below)
 then that object is treated as immortal. The CPython C-API and runtime
 will not modify the refcount (or other runtime state) of an immortal
-object.
+object. The runtime will now be explicitly responsible for deallocating
+all immortal objects during finalization, unless statically allocated.
+(See `Object Cleanup`_ below.)
 
 Aside from the change to refcounting semantics, there is one other
 possible negative impact to consider. The threshold for an "acceptable"
 performance penalty for immortal objects is 2% (the consensus at the
 2022 Language Summit). A naive implementation of the approach described
-below makes CPython roughly 6% slower. However, the implementation
-is performance-neutral once known mitigations are applied.
+below makes CPython roughly 4% slower. However, the implementation
+is ~performance-neutral~ once known mitigations are applied.
+
+TODO: Update the performance impact for the latest branch
+(both for GCC and for clang).
 
 
 Motivation
@@ -95,7 +100,7 @@ a consistent high volume of refcount changes.
 
 The effective mutability of all Python objects has a concrete impact
 on parts of the Python community, e.g. projects that aim for
-scalability like Instragram or the effort to make the GIL
+scalability like Instagram or the effort to make the GIL
 per-interpreter. Below we describe several ways in which refcount
 modification has a real negative effect on such projects.
 None of that would happen for objects that are truly immutable.
@@ -152,9 +157,10 @@ refcount semantics drastically reduce the benefits and
 have led to some sub-optimal workarounds.
 
 Also note that "fork" isn't the only operating system mechanism
-that uses copy-on-write semantics. Anything that uses ``mmap``
-relies on copy-on-write, including sharing data from shared object
-files between processes.
+that uses copy-on-write semantics. Another example is ``mmap``.
+Any such utility will potentially benefit from fewer copy-on-writes
+when immortal objects are involved, when compared to using only
+"mortal" objects.
 
 
 Rationale
@@ -198,24 +204,29 @@ immortal objects greatly simplifies the solution for existing static
 types, as well as objects exposed by the public C-API.
 
 In general, a strong immutability guarantee for objects enables Python
-applications to scale like never before. This is because they can
-then leverage multi-core parallelism without a tradeoff in memory
-usage. This is reflected in most of the above cases.
+applications to scale better, particularly in
+`multi-process deployments <Facebook>`_. This is because they can then
+leverage multi-core parallelism without such a significant tradeoff in
+memory usage as they now have. The cases we just described, as well as
+those described above in `Motivation`_, reflect this improvement.
 
 Performance
 -----------
 
 A naive implementation shows `a 4% slowdown`_. We have demonstrated
-a return to performance-neutral with a handful of basic mitigations
+a return to ~performance-neutral~ with a handful of basic mitigations
 applied. See the `mitigations`_ section below.
 
 On the positive side, immortal objects save a significant amount of
-memory when used with a pre-fork model. Also, immortal objects provide
-opportunities for specialization in the eval loop that would improve
-performance.
+memory when used `with a pre-fork model <Facebook>`_. Also, immortal
+objects provide opportunities for specialization in the eval loop that
+would improve performance.
 
 .. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709
 
+TODO: Update the performance impact for the latest branch
+(both for GCC and for clang).
+
 Backward Compatibility
 ----------------------
 
@@ -503,9 +514,10 @@ mutable object won't actually be modified.
 On the other hand, some mutable objects will never be shared between
 threads (at least not without a lock like the GIL). In some cases it
 may be practical to make some of those immortal too. For example,
-``sys.modules`` is a per-interpreter dict that we do not expect to ever
-get freed until the corresponding interpreter is finalized. By making
-it immortal, we no longer incur the extra overhead during incref/decref.
+``sys.modules`` is a per-interpreter dict that we do not expect to
+ever get freed until the corresponding interpreter is finalized
+(assuming it isn't replaced). By making it immortal, we would
+no longer incur the extra overhead during incref/decref.
 
 We explore this idea further in the `mitigations`_ section below.
 
@@ -520,7 +532,8 @@ it.
 Examples:
 
 * containers like ``dict`` and ``list``
-* objects that hold references internally like ``PyTypeObject.tp_subclasses``
+* objects that hold references internally like ``PyTypeObject`` with
+  its ``tp_subclasses`` and ``tp_weaklist``
 * an object's type (held in ``ob_type``)
 
 Such held objects are thus implicitly immortal for as long as they are
@@ -573,6 +586,9 @@ stable ABI extension).
 Note that top two bits of the refcount are already reserved for other
 uses. That's why we are using the third top-most bit.
 
+The implementation is also open to using other values for the immortal
+bit, such as the sign bit or 2^31 (for saturated refcounts on 64-bit).
+
 Affected API
 ------------
 
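The hunk above describes marking immortality with a dedicated refcount bit. As a rough illustration only, the following C sketch shows how such a check might gate incref/decref. It is not CPython's code: the ``sketch_*`` names, the standalone struct, and the choice of bit 61 (the third top-most bit of a 64-bit signed count) are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an object header; not CPython's PyObject. */
typedef struct {
    int64_t refcnt;
} sketch_object;

/* Assumption: 64-bit signed refcount, immortal bit = third top-most bit. */
#define SKETCH_IMMORTAL_BIT ((int64_t)1 << 61)

/* An object is treated as immortal when the marker bit is set. */
static inline bool sketch_is_immortal(const sketch_object *op)
{
    return (op->refcnt & SKETCH_IMMORTAL_BIT) != 0;
}

/* Incref leaves the refcount of an immortal object untouched. */
static inline void sketch_incref(sketch_object *op)
{
    if (sketch_is_immortal(op)) {
        return;
    }
    op->refcnt++;
}

/* Decref returns true when the count reaches zero, i.e. when a real
   runtime would deallocate. Immortal objects never reach that point. */
static inline bool sketch_decref(sketch_object *op)
{
    if (sketch_is_immortal(op)) {
        return false;
    }
    op->refcnt--;
    return op->refcnt == 0;
}
```

Swapping in the sign bit or 2^31, as the added paragraph suggests, would only change the ``SKETCH_IMMORTAL_BIT`` definition in this sketch; the check itself keeps the same shape.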
@@ -588,9 +604,11 @@ API that exposes refcounts (unchanged but may now return large values):
 * (public) ``Py_REFCNT()``
 * (public) ``sys.getrefcount()``
 
-(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
+(Note that ``_Py_RefTotal``, and consequently ``sys.gettotalrefcount()``,
 will not be affected.)
 
+TODO: clarify the status of ``_Py_RefTotal``.
+
 Also, immortal objects will not participate in GC.
 
 Immortal Global Objects
@@ -678,6 +696,7 @@ other possibilities
 * mark the "interned" dict as immortal if shared else share all interned strings
 * (Larry,MAL) mark all constants unmarshalled for a module as immortal
 * (Larry,MAL) allocate (immutable) immortal objects in their own memory page(s)
+* saturated refcounts using the 32 least-significant bits
 
 Solutions for Accidental De-Immortalization
 -------------------------------------------
@@ -730,6 +749,9 @@ has further detail.)
 Regardless of the solution we end up with, we can do something else
 later if necessary.
 
+TODO: Add a note indicating that the implemented solution does not
+affect the overall ~performance-neutral~ outcome.
+
 Documentation
 -------------
 
@@ -784,6 +806,8 @@ References
 
 .. _Pyston: https://mail.python.org/archives/list/python-dev@python.org/message/JLHRTBJGKAENPNZURV4CIJSO6HI62BV3/
 
+.. _Facebook: https://www.facebook.com/watch/?v=437636037237097&t=560
+
 Prior Art
 ---------
 