diff --git a/pep-0683.rst b/pep-0683.rst
index e3b54f38a..f4075821c 100644
--- a/pep-0683.rst
+++ b/pep-0683.rst
@@ -72,14 +72,19 @@ Here's a high-level look at the implementation:
 If an object's refcount matches a very specific value (defined below)
 then that object is treated as immortal. The CPython C-API and runtime
 will not modify the refcount (or other runtime state) of an immortal
-object.
+object. The runtime will now be explicitly responsible for deallocating
+all immortal objects during finalization, unless statically allocated.
+(See `Object Cleanup`_ below.)
 
 Aside from the change to refcounting semantics, there is one other
 possible negative impact to consider. The threshold for an "acceptable"
 performance penalty for immortal objects is 2% (the consensus at the
 2022 Language Summit). A naive implementation of the approach described
-below makes CPython roughly 6% slower. However, the implementation
-is performance-neutral once known mitigations are applied.
+below makes CPython roughly 4% slower. However, the implementation
+is ~performance-neutral~ once known mitigations are applied.
+
+TODO: Update the performance impact for the latest branch
+(both for GCC and for clang).
 
 
 Motivation
@@ -95,7 +100,7 @@ a consistent high volume of refcount changes.
 
 The effective mutability of all Python objects has a concrete impact
 on parts of the Python community, e.g. projects that aim for
-scalability like Instragram or the effort to make the GIL
+scalability like Instagram or the effort to make the GIL
 per-interpreter. Below we describe several ways in which refcount
 modification has a real negative effect on such projects.
 None of that would happen for objects that are truly immutable.
@@ -152,9 +157,10 @@ refcount semantics drastically reduce the benefits and have led to
 some sub-optimal workarounds.
 
 Also note that "fork" isn't the only operating system mechanism
-that uses copy-on-write semantics. Anything that uses ``mmap``
-relies on copy-on-write, including sharing data from shared object
-files between processes.
+that uses copy-on-write semantics. Another example is ``mmap``.
+Any such utility will potentially benefit from fewer copy-on-writes
+when immortal objects are involved, when compared to using only
+"mortal" objects.
 
 
 Rationale
@@ -198,24 +204,29 @@ immortal objects greatly simplifies the solution for existing static
 types, as well as objects exposed by the public C-API.
 
 In general, a strong immutability guarantee for objects enables Python
-applications to scale like never before. This is because they can
-then leverage multi-core parallelism without a tradeoff in memory
-usage. This is reflected in most of the above cases.
+applications to scale better, particularly in
+`multi-process deployments <Facebook_>`_. This is because they can then
+leverage multi-core parallelism without such a significant tradeoff in
+memory usage as they now have. The cases we just described, as well as
+those described above in `Motivation`_, reflect this improvement.
 
 Performance
 -----------
 
 A naive implementation shows `a 4% slowdown`_. We have demonstrated
-a return to performance-neutral with a handful of basic mitigations
+a return to ~performance-neutral~ with a handful of basic mitigations
 applied. See the `mitigations`_ section below.
 
 On the positive side, immortal objects save a significant amount of
-memory when used with a pre-fork model. Also, immortal objects provide
-opportunities for specialization in the eval loop that would improve
-performance.
+memory when used `with a pre-fork model <Facebook_>`_. Also, immortal
+objects provide opportunities for specialization in the eval loop that
+would improve performance.
 
 .. _a 4% slowdown: https://github.com/python/cpython/pull/19474#issuecomment-1032944709
 
+TODO: Update the performance impact for the latest branch
+(both for GCC and for clang).
+
 Backward Compatibility
 ----------------------
 
@@ -503,9 +514,10 @@ mutable object won't actually be modified.
 On the other hand, some mutable objects will never be shared between
 threads (at least not without a lock like the GIL). In some cases it
 may be practical to make some of those immortal too. For example,
-``sys.modules`` is a per-interpreter dict that we do not expect to ever
-get freed until the corresponding interpreter is finalized. By making
-it immortal, we no longer incur the extra overhead during incref/decref.
+``sys.modules`` is a per-interpreter dict that we do not expect to
+ever get freed until the corresponding interpreter is finalized
+(assuming it isn't replaced). By making it immortal, we would
+no longer incur the extra overhead during incref/decref.
 
 We explore this idea further in the `mitigations`_ section below.
 
@@ -520,7 +532,8 @@ it.
 Examples:
 
 * containers like ``dict`` and ``list``
-* objects that hold references internally like ``PyTypeObject.tp_subclasses``
+* objects that hold references internally like ``PyTypeObject`` with
+  its ``tp_subclasses`` and ``tp_weaklist``
 * an object's type (held in ``ob_type``)
 
 Such held objects are thus implicitly immortal for as long as they are
@@ -573,6 +586,9 @@ stable ABI extension).
 Note that top two bits of the refcount are already reserved for other
 uses. That's why we are using the third top-most bit.
 
+The implementation is also open to using other values for the immortal
+bit, such as the sign bit or 2^31 (for saturated refcounts on 64-bit).
+
 Affected API
 ------------
 
@@ -588,9 +604,11 @@ API that exposes refcounts (unchanged but may now return large values):
 * (public) ``Py_REFCNT()``
 * (public) ``sys.getrefcount()``
 
-(Note that ``_Py_RefTotal`` and ``sys.gettotalrefcount()``
+(Note that ``_Py_RefTotal``, and consequently ``sys.gettotalrefcount()``,
 will not be affected.)
 
+TODO: clarify the status of ``_Py_RefTotal``.
+
 Also, immortal objects will not participate in GC.
 
 Immortal Global Objects
@@ -678,6 +696,7 @@ other possibilities
 * mark the "interned" dict as immortal if shared else share all interned strings
 * (Larry,MAL) mark all constants unmarshalled for a module as immortal
 * (Larry,MAL) allocate (immutable) immortal objects in their own memory page(s)
+* saturated refcounts using the 32 least-significant bits
 
 Solutions for Accidental De-Immortalization
 -------------------------------------------
@@ -730,6 +749,9 @@ has further detail.)
 Regardless of the solution we end up with, we can do something else
 later if necessary.
 
+TODO: Add a note indicating that the implemented solution does not
+affect the overall ~performance-neutral~ outcome.
+
 Documentation
 -------------
 
@@ -784,6 +806,8 @@ References
 
 .. _Pyston: https://mail.python.org/archives/list/python-dev@python.org/message/JLHRTBJGKAENPNZURV4CIJSO6HI62BV3/
 
+.. _Facebook: https://www.facebook.com/watch/?v=437636037237097&t=560
+
 Prior Art
 ---------