PEP 683: Updates for Round 4 of Discussions (gh-2766)

Eric Snow 2022-10-04 17:06:56 -06:00 committed by GitHub
parent 0850bfb6c0
commit 290853700e
1 changed files with 43 additions and 19 deletions

@ -72,14 +72,19 @@ Here's a high-level look at the implementation:
If an object's refcount matches a very specific value (defined below)
then that object is treated as immortal. The CPython C-API and runtime
will not modify the refcount (or other runtime state) of an immortal
object.
object. The runtime will now be explicitly responsible for deallocating
all immortal objects during finalization, unless statically allocated.
(See `Object Cleanup`_ below.)
Aside from the change to refcounting semantics, there is one other
possible negative impact to consider. The threshold for an "acceptable"
performance penalty for immortal objects is 2% (the consensus at the
2022 Language Summit). A naive implementation of the approach described
below makes CPython roughly 6% slower. However, the implementation
is performance-neutral once known mitigations are applied.
below makes CPython roughly 4% slower. However, the implementation
is ~performance-neutral~ once known mitigations are applied.
TODO: Update the performance impact for the latest branch
(both for GCC and for clang).
Motivation
@ -95,7 +100,7 @@ a consistent high volume of refcount changes.
The effective mutability of all Python objects has a concrete impact
on parts of the Python community, e.g. projects that aim for
scalability like Instragram or the effort to make the GIL
scalability like Instagram or the effort to make the GIL
per-interpreter. Below we describe several ways in which refcount
modification has a real negative effect on such projects.
None of that would happen for objects that are truly immutable.
@ -152,9 +157,10 @@ refcount semantics drastically reduce the benefits and
have led to some sub-optimal workarounds.
Also note that "fork" isn't the only operating system mechanism
that uses copy-on-write semantics. Anything that uses ``mmap``
relies on copy-on-write, including sharing data from shared object
files between processes.
that uses copy-on-write semantics. Another example is ``mmap``.
Any such utility will potentially benefit from fewer copy-on-writes
when immortal objects are involved, when compared to using only
"mortal" objects.
Rationale
@ -198,24 +204,29 @@ immortal objects greatly simplifies the solution for existing static
types, as well as objects exposed by the public C-API.
In general, a strong immutability guarantee for objects enables Python
applications to scale like never before. This is because they can
then leverage multi-core parallelism without a tradeoff in memory
usage. This is reflected in most of the above cases.
applications to scale better, particularly in
`multi-process deployments <Facebook>`_. This is because they can then
leverage multi-core parallelism without such a significant tradeoff in
memory usage as they now have. The cases we just described, as well as
those described above in `Motivation`_, reflect this improvement.
Performance
-----------
A naive implementation shows `a 4% slowdown`_. We have demonstrated
a return to performance-neutral with a handful of basic mitigations
a return to ~performance-neutral~ with a handful of basic mitigations
applied. See the `mitigations`_ section below.
On the positive side, immortal objects save a significant amount of
memory when used with a pre-fork model. Also, immortal objects provide
opportunities for specialization in the eval loop that would improve
performance.
memory when used `with a pre-fork model <Facebook>`_. Also, immortal
objects provide opportunities for specialization in the eval loop that
would improve performance.
.. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709
TODO: Update the performance impact for the latest branch
(both for GCC and for clang).
Backward Compatibility
----------------------
@ -503,9 +514,10 @@ mutable object won't actually be modified.
On the other hand, some mutable objects will never be shared between
threads (at least not without a lock like the GIL). In some cases it
may be practical to make some of those immortal too. For example,
``sys.modules`` is a per-interpreter dict that we do not expect to ever
get freed until the corresponding interpreter is finalized. By making
it immortal, we no longer incur the extra overhead during incref/decref.
``sys.modules`` is a per-interpreter dict that we do not expect to
ever get freed until the corresponding interpreter is finalized
(assuming it isn't replaced). By making it immortal, we would
no longer incur the extra overhead during incref/decref.
We explore this idea further in the `mitigations`_ section below.
@ -520,7 +532,8 @@ it.
Examples:
* containers like ``dict`` and ``list``
* objects that hold references internally like ``PyTypeObject.tp_subclasses``
* objects that hold references internally like ``PyTypeObject`` with
its ``tp_subclasses`` and ``tp_weaklist``
* an object's type (held in ``ob_type``)
Such held objects are thus implicitly immortal for as long as they are
@ -573,6 +586,9 @@ stable ABI extension).
Note that top two bits of the refcount are already reserved for other
uses. That's why we are using the third top-most bit.
The implementation is also open to using other values for the immortal
bit, such as the sign bit or 2^31 (for saturated refcounts on 64-bit).
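
As a rough illustration of the bit-based check described here, the following toy model (not CPython code) marks an object immortal by setting the third top-most bit of its refcount and makes incref/decref skip such objects. The 64-bit field width and the names ``ToyObject``, ``IMMORTAL_BIT``, ``is_immortal``, ``incref`` and ``decref`` are assumptions made for this sketch only; the actual immortal refcount value is the one the PEP defines.

```python
# Toy model of an "immortal bit" check. Every name here is hypothetical
# and chosen for illustration; this is not how CPython spells it.
REFCNT_WIDTH = 64                        # assumed width of the refcount field
IMMORTAL_BIT = 1 << (REFCNT_WIDTH - 3)   # third top-most bit


class ToyObject:
    """Stand-in for an object header; only the refcount is modeled."""

    def __init__(self, refcnt=1):
        self.refcnt = refcnt


def is_immortal(obj):
    # An object counts as immortal when the immortal bit is set.
    return (obj.refcnt & IMMORTAL_BIT) != 0


def incref(obj):
    # Immortal objects are left untouched.
    if not is_immortal(obj):
        obj.refcnt += 1


def decref(obj):
    """Return True if a mortal object reached zero and would be freed."""
    if is_immortal(obj):
        return False
    obj.refcnt -= 1
    return obj.refcnt == 0


mortal = ToyObject()
immortal = ToyObject(refcnt=IMMORTAL_BIT)

# Refcount operations on the immortal object are no-ops.
incref(immortal)
decref(immortal)
```

In this model the immortal object's refcount field is never written, which is the property the copy-on-write and per-interpreter-GIL motivations rely on.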
Affected API
------------
@ -588,9 +604,11 @@ API that exposes refcounts (unchanged but may now return large values):
* (public) ``Py_REFCNT()``
* (public) ``sys.getrefcount()``
(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
(Note that ``_Py_RefTotal``, and consequently ``sys.gettotalrefcount()``,
will not be affected.)
TODO: clarify the status of ``_Py_RefTotal``.
Also, immortal objects will not participate in GC.
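
The note above that refcount-exposing API "may now return large values" can be observed from Python with ``sys.getrefcount()``. The sketch below assumes a CPython interpreter and assumes ``None`` is one of the objects made immortal; the helper name ``reported_refcount`` is hypothetical. The exact numbers are deliberately not shown, since they depend on the interpreter version and build.

```python
import sys


def reported_refcount(obj):
    # sys.getrefcount() reports the object's refcount as seen by the caller.
    return sys.getrefcount(obj)


# None is assumed to be immortal on an interpreter implementing this
# proposal, so its reported count may be very large there.
none_count = reported_refcount(None)

# A freshly created object is mortal and reports a small count.
fresh_count = reported_refcount(object())

print("None:", none_count)
print("fresh object:", fresh_count)
```

On an interpreter without immortal objects, both values are ordinary counts; code that treats the result of ``sys.getrefcount()`` as an exact number of references should not assume that holds for immortal objects.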
Immortal Global Objects
@ -678,6 +696,7 @@ other possibilities
* mark the "interned" dict as immortal if shared else share all interned strings
* (Larry,MAL) mark all constants unmarshalled for a module as immortal
* (Larry,MAL) allocate (immutable) immortal objects in their own memory page(s)
* saturated refcounts using the 32 least-significant bits
Solutions for Accidental De-Immortalization
-------------------------------------------
@ -730,6 +749,9 @@ has further detail.)
Regardless of the solution we end up with, we can do something else
later if necessary.
TODO: Add a note indicating that the implemented solution does not
affect the overall ~performance-neutral~ outcome.
Documentation
-------------
@ -784,6 +806,8 @@ References
.. _Pyston: https://mail.python.org/archives/list/python-dev@python.org/message/JLHRTBJGKAENPNZURV4CIJSO6HI62BV3/
.. _Facebook: https://www.facebook.com/watch/?v=437636037237097&t=560
Prior Art
---------