PEP 683: Small Updates (#2622)

This covers typos, tweaks in wording, and some adjustments in response to [the last email thread](https://mail.python.org/archives/list/python-dev@python.org/thread/MI22URMVKC63OFMZTALHFZKAKVGAT4UF/).
2022-07-05 14:34:09 -06:00 · 2022-07-05 14:34:09 -06:00 · 3298a237a9
parent 3fe4784290
commit 3298a237a9
1 changed files with 154 additions and 85 deletions
--- a/pep-0683.rst
+++ b/pep-0683.rst
@ -6,7 +6,7 @@ Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 10-Feb-2022
-Python-Version: 3.11
+Python-Version: 3.12
 Post-History: 15-Feb-2022, 19-Feb-2022, 28-Feb-2022
 Resolution:

@ -23,16 +23,18 @@ Python's scalability.

 This proposal mandates that, internally, CPython will support marking
 an object as one for which that runtime state will no longer change.
-Consequently, such an object's refcount will never reach 0, and so the
-object will never be cleaned up.  We call these objects "immortal".
-(Normally, only a relatively small number of internal objects
-will ever be immortal.)  The fundamental improvement here
-is that now an object can be truly immutable.
+Consequently, such an object's refcount will never reach 0, and thus
+the object will never be cleaned up (except when the runtime knows
+it's safe to do so, like during runtime finalization).
+We call these objects "immortal".  (Normally, only a relatively small
+number of internal objects will ever be immortal.)
+The fundamental improvement here is that now an object
+can be truly immutable.

 Scope
 -----

-Object immortality is meant to be an internal-only feature.  So this
+Object immortality is meant to be an internal-only feature, so this
 proposal does not include any changes to public API or behavior
 (with one exception).  As usual, we may still add some private
 (yet publicly accessible) API to do things like immortalize an object
@ -73,10 +75,11 @@ will not modify the refcount (or other runtime state) of an immortal
 object.

 Aside from the change to refcounting semantics, there is one other
-possible negative impact to consider.  A naive implementation of the
-approach described below makes CPython roughly 4% slower.  However,
-the implementation is performance-neutral once known mitigations
-are applied.
+possible negative impact to consider.  The threshold for an "acceptable"
+performance penalty for immortal objects is 2% (the consensus at the
+2022 Language Summit).  A naive implementation of the approach described
+below makes CPython roughly 4% slower.  However, the implementation
+is performance-neutral once known mitigations are applied.


 Motivation
@ -179,7 +182,7 @@ is to move all global objects into ``PyInterpreterState`` and add
 one or more lookup functions to access them.  Then we'd have to
 add some hacks to the C-API to preserve compatibility for the
 many objects exposed there.  The story is much, much simpler
-with immortal objects
+with immortal objects.


 Impact
@ -204,7 +207,7 @@ Performance

 A naive implementation shows `a 4% slowdown`_.  We have demonstrated
 a return to performance-neutral with a handful of basic mitigations
-applied.  See the `mitigation`_ section below.
+applied.  See the `mitigations`_ section below.

 On the positive side, immortal objects save a significant amount of
 memory when used with a pre-fork model.  Also, immortal objects provide
@ -231,16 +234,16 @@ Specifically, when an immortal object is involved:
  * relies on any specific refcount value, other than 0 or 1
  * directly manipulates the refcount to store extra information there

-* in 32-bit pre-3.11 `Stable ABI`_ extensions,
+* in 32-bit pre-3.12 `Stable ABI`_ extensions,
  objects may leak due to `Accidental Immortality`_
 * such extensions may crash due to `Accidental De-Immortalizing`_

-Again, those changes in behavior only apply to immortal objects, not
-most of the objects a user will access.  Furthermore, users cannot mark
-an object as immortal so no user-created objects will ever have that
-changed behavior.  Users that rely on any of the changing behavior for
-global (builtin) objects are already in trouble.  So the overall impact
-should be small.
+Again, those changes in behavior only apply to immortal objects,
+not the vast majority of objects a user will use.  Furthermore,
+users cannot mark an object as immortal so no user-created objects
+will ever have that changed behavior.  Users that rely on any of
+the changing behavior for global (builtin) objects are already
+in trouble.  So the overall impact should be small.

 Also note that code which checks for refleaks should keep working fine,
 unless it checks for hard-coded small values relative to some immortal
@ -254,20 +257,20 @@ Accidental Immortality

 Hypothetically, a non-immortal object could be incref'ed so much
 that it reaches the magic value needed to be considered immortal.
-That means it would accidentally never be cleaned up
-(by going back to 0).
+That means it would never be decref'ed all the way back to 0, so it
+would accidentally leak (never be cleaned up).

-On 64-bit builds, this accidental scenario is so unlikely that we need
-not worry.  Even if done deliberately by using ``Py_INCREF()`` in a
-tight loop and each iteration only took 1 CPU cycle, it would take
+With 64-bit refcounts, this accidental scenario is so unlikely that
+we need not worry.  Even if done deliberately by using ``Py_INCREF()``
+in a tight loop and each iteration only took 1 CPU cycle, it would take
 2^60 cycles (if the immortal bit were 2^60).  At a fast 5 GHz that would
 still take nearly 250,000,000 seconds (over 2,500 days)!
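
The estimate above can be checked with a few lines of arithmetic.  This is a minimal sketch that takes its figures from the text (an immortal bit of 2^60, one incref per CPU cycle, a 5 GHz clock); the function name is made up for illustration.

```python
def seconds_to_reach(refcount_target: int, clock_hz: float) -> float:
    """Seconds needed to incref from 0 up to `refcount_target`,
    assuming one incref per CPU cycle (the best case used in the text)."""
    return refcount_target / clock_hz


SECONDS_PER_DAY = 24 * 60 * 60

seconds = seconds_to_reach(2**60, 5e9)
days = seconds / SECONDS_PER_DAY

print(f"{seconds:,.0f} seconds, about {days:,.0f} days")
```

Both values agree with the rounded figures quoted above.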

 Also note that it is doubly unlikely to be a problem because it wouldn't
-matter until the refcount got back to 0 and the object was cleaned up.
-So any object that hit that magic "immortal" refcount value would have
-to be decref'ed that many times again before the change in behavior
-would be noticed.
+matter until the refcount would have gotten back to 0 and the object
+cleaned up.  So any object that hit that magic "immortal" refcount value
+would have to be decref'ed that many times again before the change
+in behavior would be noticed.

 Again, the only realistic way that the magic refcount would be reached
 (and then reversed) is if it were done deliberately.  (Of course, the
@ -275,7 +278,8 @@ same thing could be done efficiently using ``Py_SET_REFCNT()`` though
 that would be even less of an accident.)  At that point we don't
 consider it a concern of this proposal.

-On 32-bit builds it isn't so obvious.  Let's say the magic refcount
+On builds with much smaller maximum refcounts, like 32-bit platforms,
+the consequences aren't so obvious.  Let's say the magic refcount
 were 2^30.  Using the same specs as above, it would take roughly
 0.2 seconds to accidentally immortalize an object.  Under reasonable
 conditions, it is still highly unlikely that an object be accidentally
@ -286,7 +290,7 @@ immortalized.  It would have to meet these criteria:
  (e.g. returns from a function or method)
 * no other code decrefs the object in the meantime

-Even at a much less frequent rate incref it would not take long to reach
+Even at a much less frequent rate it would not take long to reach
 accidental immortality (on 32-bit).  However, then it would have to run
 through the same number of (now noop-ing) decrefs before that one object
 would be effectively leaking.  This is highly unlikely, especially because
@ -312,17 +316,21 @@ of immortal objects.

 However, we do ensure that immortal objects (mostly) stay immortal
 in that situation.  We set the initial refcount of immortal objects to
-a value high above the magic refcount value, but one that still matches
-the high bit.  Thus we can still identify such objects as immortal.
-(See `_Py_IMMORTAL_REFCNT`_.)  At worst, objects in that situation
-would feel the effects described in the `Motivation`_ section.
-Even then the overall impact is unlikely to be significant.
+a value for which we can identify the object as immortal and which
+continues to do so even if the refcount is modified by an extension.
+(For example, suppose we used one of the high refcount bits to indicate
+that an object was immortal.  We would set the initial refcount to a
+higher value that still matches the bit, like halfway to the next bit.
+See `_Py_IMMORTAL_REFCNT`_.)
+At worst, objects in that situation would feel the effects
+described in the `Motivation`_ section.  Even then
+the overall impact is unlikely to be significant.
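
The "halfway to the next bit" idea can be modelled in plain Python.  This is only a sketch under stated assumptions: it assumes a 32-bit refcount with bit 30 as the immortality marker, and ``IMMORTAL_BIT``, ``IMMORTAL_INITIAL`` and the helper functions are hypothetical names, not the actual ``_Py_IMMORTAL_REFCNT`` definition.

```python
# Toy model of the scheme described above; not CPython's implementation.
# IMMORTAL_BIT and IMMORTAL_INITIAL are hypothetical stand-ins for the
# real _Py_IMMORTAL_REFCNT machinery, sized for a 32-bit refcount.
IMMORTAL_BIT = 1 << 30
IMMORTAL_INITIAL = IMMORTAL_BIT + (IMMORTAL_BIT >> 1)  # halfway to the next bit


def is_immortal(refcnt: int) -> bool:
    """An object counts as immortal while the marker bit is set."""
    return bool(refcnt & IMMORTAL_BIT)


def old_incref(refcnt: int) -> int:
    """What an older stable-ABI extension does: always add 1."""
    return refcnt + 1


def old_decref(refcnt: int) -> int:
    """What an older stable-ABI extension does: always subtract 1."""
    return refcnt - 1


# Headroom in each direction before the marker bit flips.
decrefs_until_mortal = IMMORTAL_INITIAL - IMMORTAL_BIT + 1
increfs_until_bit_lost = (IMMORTAL_BIT << 1) - IMMORTAL_INITIAL

# A thousand unbalanced decrefs from an old extension change nothing.
refcnt = IMMORTAL_INITIAL
for _ in range(1000):
    refcnt = old_decref(refcnt)
still_immortal = is_immortal(refcnt)
```

Starting in the middle of the marked range leaves roughly 2^29 unbalanced operations of slack in either direction before an object in this model stops looking immortal.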

 Accidental De-Immortalizing
 '''''''''''''''''''''''''''

-32-bit builds of older stable ABI extensions can take `Accidental Immortality`_
-to the next level.
+32-bit builds of older stable ABI extensions can take
+`Accidental Immortality`_ to the next level.

 Hypothetically, such an extension could incref an object to a value on
 the next highest bit above the magic refcount value.  For example, if
@ -338,11 +346,11 @@ above example, it would take 2^29 asymmetric decrefs to drop below the
 magic immortal refcount value.  So an object like ``None`` could be
 made mortal and subject to decref.  That still wouldn't be a problem
 until somehow the decrefs continue on that object until it reaches 0.
-For many immortal objects, like ``None``, the extension will crash
-the process if it tries to dealloc the object.  For the other
-immortal objects, the dealloc might be okay.  However, there will
-be runtime code expecting the formerly-immortal object to be around
-forever.  That code will probably crash.
+For statically allocated immortal objects, like ``None``, the extension
+would crash the process if it tried to dealloc the object.  For any
+other immortal objects, the dealloc might be okay.  However, there
+might be runtime code expecting the formerly-immortal object to be
+around forever.  That code would probably crash.

 Again, the likelihood of this happening is extremely small, even on
 32-bit builds.  It would require roughly a billion decrefs on that
@ -350,7 +358,7 @@ one object without a corresponding incref.  The most likely scenario is
 the following:

 A "new" reference to ``None`` is returned by many functions and methods.
-Unlike with non-immortal objects, the 3.11 runtime will almost never
+Unlike with non-immortal objects, the 3.12 runtime will basically never
 incref ``None`` before giving it to the extension.  However, the
 extension *will* decref it when done with it (unless it returns it).
 Each time that exchange happens with the one object, we get one step
@ -362,15 +370,10 @@ on 32-bit?  If it is a problem, how could it be addressed?

 As to how realistic, the answer isn't clear currently.  However, the
 mitigation is simple enough that we can safely proceed under the
-assumption that it would be a problem.
+assumption that it would not be a problem.

-Here are some possible solutions (only needed on 32-bit):
-
-* periodically reset the refcount for immortal objects
-  (only enable this if a stable ABI extension is imported?)
-* special-case immortal objects in tp_dealloc() for the relevant types
-  (but not int, due to frequency?)
-* provide a runtime flag for disabling immortality
+We look at possible solutions
+`later on <Solutions for Accidental De-Immortalization>`_.

 Alternate Python Implementations
 --------------------------------
@ -394,8 +397,9 @@ This is not a complex feature so it should not cause much mental
 overhead for maintainers.  The basic implementation doesn't touch
 much code so it should not have much impact on maintainability.  There
 may be some extra complexity due to performance penalty mitigation.
-However, that should be limited to where we immortalize all
-objects post-init and that code will be in one place.
+However, that should be limited to where we immortalize all objects
+post-init and later explicitly deallocate them during runtime
+finalization.  The code for this should be relatively concentrated.


 Specification
@ -404,8 +408,8 @@ Specification
 The approach involves these fundamental changes:

 * add `_Py_IMMORTAL_REFCNT`_ (the magic value) to the internal C-API
-* update ``Py_INCREF()`` and ``Py_DECREF()`` to no-op for objects with
-  the magic refcount (or its most significant bit)
+* update ``Py_INCREF()`` and ``Py_DECREF()`` to no-op for objects
+  that match the magic refcount
 * do the same for any other API that modifies the refcount
 * stop modifying ``PyGC_Head`` for immortal GC objects ("containers")
 * ensure that all immortal objects are cleaned up during
@ -416,9 +420,9 @@ makes it immortal.

 (There are other minor, internal changes which are not described here.)

-In the following sub-sections we dive into the details.  First we will
-cover some conceptual topics, followed by more concrete aspects like
-specific affected APIs.
+In the following sub-sections we dive into the most significant details.
+First we will cover some conceptual topics, followed by more concrete
+aspects like specific affected APIs.

 Public Refcount Details
 -----------------------
@ -447,7 +451,8 @@ values with any meaning are 0 and 1.  (Some code relies on 1 as an
 indicator that the object can be safely modified.)  All other values
 are considered "not 0 or 1".

-This information will be clarified in the `documentation <Documentation_>`_.
+This information will be clarified
+in the `documentation <Documentation_>`_.

 Arguably, the existing refcount-related API should be modified to reflect
 what we want users to expect.  Something like the following:
@ -502,7 +507,7 @@ may be practical to make some of those immortal too.  For example,
 get freed until the corresponding interpreter is finalized.  By making
 it immortal, we no longer incur the extra overhead during incref/decref.

-We explore this idea further in the `mitigation`_ section below.
+We explore this idea further in the `mitigations`_ section below.

 Implicitly Immortal Objects
 ---------------------------
@ -539,7 +544,7 @@ On top of that, the obvious approach is to simply set the refcount
 to a small value.  However, at that point there is no way of knowing
 which value would be safe.  Ideally we'd set it to the value that it
 would have been if it hadn't been made immortal.  However, that value
-has long been lost.  Hence the complexities involved make it less
+will have long been lost.  Hence the complexities involved make it less
 likely that an object could safely be un-immortalized, even if we
 had a good reason to do so.

@ -599,8 +604,8 @@ That includes the following:
 * all static objects in ``_PyRuntimeState.global_objects`` (e.g. identifiers,
  small ints)

-The question of making them actually immutable (e.g. for
-per-interpreter GIL) is not in the scope of this PEP.
+The question of making the full objects actually immutable (e.g.
+for per-interpreter GIL) is not in the scope of this PEP.

 Object Cleanup
 --------------
@ -613,13 +618,14 @@ generation by pushing all immortalized containers there.  During
 runtime shutdown, the strategy will be to first let the runtime make
 a best effort to deallocate these instances normally.  Most
 of the module deallocation will now be handled by
-``pylifecycle.c:finalize_modules()`` which cleans up the remaining
+``pylifecycle.c:finalize_modules()`` where we clean up the remaining
 modules as best as we can.  It will change which modules are available
-during ``__del__``, but that's already explicitly undefined behavior in the
-docs.  Optionally, we could do some topological ordering to guarantee
-that user modules will be deallocated first before the stdlib modules.
-Finally, anything left over (if any) can be found through the permanent
-generation GC list which we can clear after ``finalize_modules()``.
+during ``__del__``, but that's already explicitly undefined behavior
+in the docs.  Optionally, we could do some topological ordering
+to guarantee that user modules will be deallocated first before
+the stdlib modules.  Finally, anything left over (if any) can be found
+through the permanent generation GC list which we can clear
+after ``finalize_modules()`` is done.

 For non-container objects, the tracking approach will vary on a
 case-by-case basis.  In nearly every case, each such object is directly
@ -629,20 +635,21 @@ to the runtime state for a small number of objects.

 None of the cleanup will have a significant effect on performance.

-.. _mitigation:
+.. _mitigations:

-Performance Regression Mitigation
---------------------------------
+Performance Regression Mitigations
+----------------------------------

 In the interest of clarity, here are some of the ways we are going
-to try to recover some of the lost `performance <Performance_>`_:
+to try to recover some of the `4% performance <Performance_>`_
+we lose with the naive implementation of immortal objects.

-* at the end of runtime init, mark all objects as immortal
-* drop refcount operations in code where we know the object is immortal
-  (e.g. ``Py_RETURN_NONE``)
-* specialize for immortal objects in the eval loop (see `Pyston`_)
+Note that none of this section is actually part of the proposal.

-Regarding that first point, we can apply the concept from
+at the end of runtime init, mark all objects as immortal
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+We can apply the concept from
 `Immortal Mutable Objects`_ in the pursuit of getting back some of
 that 4% performance we lose with the naive implementation of immortal
 objects.  At the end of runtime init we can mark *all* objects as
@ -650,16 +657,78 @@ immortal and avoid the extra cost in incref/decref.  We only need
 to worry about immutability with objects that we plan on sharing
 between threads without a GIL.

-Note that none of this section is part of the proposal.
-The above is included here for clarity.
+drop unnecessary hard-coded refcount operations
+'''''''''''''''''''''''''''''''''''''''''''''''

-Possible Changes
----------------
+Parts of the C-API interact specifically with objects that we know
+to be immortal, like ``Py_RETURN_NONE``.  Such functions and macros
+can be updated to drop any refcount operations.
+
+specialize for immortal objects in the eval loop
+''''''''''''''''''''''''''''''''''''''''''''''''
+
+There are opportunities to optimize operations in the eval loop
+involving specific known immortal objects (e.g. ``None``).  The
+general mechanism is described in :pep:`659`.  Also see `Pyston`_.
+
+other possibilities
+'''''''''''''''''''

 * mark every interned string as immortal
 * mark the "interned" dict as immortal if shared else share all interned strings
-* (Larry,MvL) mark all constants unmarshalled for a module as immortal
-* (Larry,MvL) allocate (immutable) immortal objects in their own memory page(s)
+* (Larry,MAL) mark all constants unmarshalled for a module as immortal
+* (Larry,MAL) allocate (immutable) immortal objects in their own memory page(s)
+
+Solutions for Accidental De-Immortalization
+-------------------------------------------
+
+In the `Accidental De-Immortalizing`_ section we outlined a possible
+negative consequence of immortal objects.  Here we look at some
+of the options to deal with that.
+
+Note that we enumerate solutions here to illustrate that satisfactory
+options are available, rather than to dictate how the problem will
+be solved.
+
+Also note the following:
+
+* this only matters in the 32-bit stable-ABI case
+* it only affects immortal objects
+* there are no user-defined immortal objects, only built-in types
+* most immortal objects will be statically allocated
+  (and thus already must fail if ``tp_dealloc()`` is called)
+* only a handful of immortal objects will be used often enough
+  to possibly face this problem in practice (e.g. ``None``)
+* the main problem to solve is crashes coming from ``tp_dealloc()``
+
+One fundamental observation for a solution is that we can reset
+an immortal object's refcount to ``_Py_IMMORTAL_REFCNT``
+when some condition is met.
+
+With all that in mind, a simple, yet effective, solution would be
+to reset an immortal object's refcount in ``tp_dealloc()``.
+``NoneType`` and ``bool`` already have a ``tp_dealloc()`` that calls
+``Py_FatalError()`` if triggered.  The same goes for other types based
+on certain conditions, like ``PyUnicodeObject`` (depending on
+``unicode_is_singleton()``), ``PyTupleObject``, and ``PyTypeObject``.
+In fact, the same check is important for all statically declared objects.
+For those types, we would instead reset the refcount.  For the
+remaining cases we would introduce the check.  In all cases,
+the overhead of the check in ``tp_dealloc()`` should be too small
+to matter.
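
The "reset in ``tp_dealloc()``" idea can be illustrated with a small Python model.  This is a sketch, not CPython code: ``ToyObject``, ``toy_dealloc`` and ``old_extension_decref`` are hypothetical names, and the ``immortal`` attribute stands in for whatever check the runtime would use to recognize its own immortal objects (for example, that the object is statically allocated).

```python
# Toy model only; names are hypothetical and this is not CPython code.
IMMORTAL_REFCNT = (1 << 30) + (1 << 29)  # stand-in for _Py_IMMORTAL_REFCNT


class ToyObject:
    def __init__(self, name: str, immortal: bool = False):
        self.name = name
        # The runtime knows which objects it made immortal (e.g. static ones),
        # independently of whatever the refcount field currently holds.
        self.immortal = immortal
        self.refcnt = IMMORTAL_REFCNT if immortal else 1
        self.freed = False


def toy_dealloc(obj: ToyObject) -> None:
    """Stand-in for tp_dealloc(): revive immortal objects, free the rest."""
    if obj.immortal:
        obj.refcnt = IMMORTAL_REFCNT  # reset instead of crashing or freeing
        return
    obj.freed = True


def old_extension_decref(obj: ToyObject) -> None:
    """Decref as an older stable-ABI extension would: no immortality check."""
    obj.refcnt -= 1
    if obj.refcnt == 0:
        toy_dealloc(obj)


none_like = ToyObject("None", immortal=True)
none_like.refcnt = 1  # pretend asymmetric decrefs already wore it down
old_extension_decref(none_like)

ordinary = ToyObject("ordinary")
old_extension_decref(ordinary)
```

In this model the formerly worn-down immortal object comes back with its full refcount, while an ordinary object is still freed as usual.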
+
+Other (less practical) solutions:
+
+* periodically reset the refcount for immortal objects
+* only do that for high-use objects
+* only do it if a stable-ABI extension has been imported
+* provide a runtime flag for disabling immortality
+
+(`The discussion thread <https://mail.python.org/archives/list/python-dev@python.org/message/OXAYWH47ZGLOWXTNKCIW4YE5PXGHNT4Y/>`_
+has further detail.)
+
+Regardless of the solution we end up with, we can do something else
+later if necessary.

 Documentation
 -------------