PEP 558: Adopt Python level semantics from PEP 667 (#2124)

* fast locals proxy never assumes the value cache is already up to date
* operations become O(n) as required to avoid that assumption
* remove `*View()` APIs from proposal due to algorithmic complexity issue
* add Python pseudo-code to the PEP 667 comparison section
* reword PEP 667 comparison section to focus on the remaining differences
  in the C API proposal
Nick Coghlan 2021-12-23 09:25:51 +10:00 committed by GitHub
parent 107361803d
commit dedc9d250e
1 changed files with 301 additions and 202 deletions


@ -35,7 +35,6 @@ Python C API/ABI::
PyLocals_Kind PyLocals_GetKind(); PyLocals_Kind PyLocals_GetKind();
PyObject * PyLocals_Get(); PyObject * PyLocals_Get();
PyObject * PyLocals_GetCopy(); PyObject * PyLocals_GetCopy();
PyObject * PyLocals_GetView();
It also proposes the addition of several supporting functions and type It also proposes the addition of several supporting functions and type
definitions to the CPython C API. definitions to the CPython C API.
@ -281,10 +280,6 @@ Summary of proposed implementation-specific changes
the local namespace in the running frame:: the local namespace in the running frame::
PyObject * PyLocals_GetCopy(); PyObject * PyLocals_GetCopy();
* One new function is added to the stable ABI to get a read-only view of the
local namespace in the running frame::
PyObject * PyLocals_GetView();
* Corresponding frame accessor functions for these new public APIs are added to * Corresponding frame accessor functions for these new public APIs are added to
the CPython frame C API the CPython frame C API
* On optimised frames, the Python level ``f_locals`` API will return dynamically * On optimised frames, the Python level ``f_locals`` API will return dynamically
@ -309,7 +304,7 @@ Summary of proposed implementation-specific changes
mutable read/write mapping for the local variables. mutable read/write mapping for the local variables.
* The trace hook implementation will no longer call ``PyFrame_FastToLocals()`` * The trace hook implementation will no longer call ``PyFrame_FastToLocals()``
implicitly. The version porting guide will recommend migrating to implicitly. The version porting guide will recommend migrating to
``PyFrame_GetLocalsView()`` for read-only access and ``PyFrame_GetLocals()`` for read-only access and
``PyObject_GetAttrString(frame, "f_locals")`` for read/write access. ``PyObject_GetAttrString(frame, "f_locals")`` for read/write access.
@ -379,6 +374,7 @@ retained for two key purposes:
fast locals array (e.g. the ``__return__`` and ``__exception__`` keys set by fast locals array (e.g. the ``__return__`` and ``__exception__`` keys set by
``pdb`` when tracing code execution for debugging purposes) ``pdb`` when tracing code execution for debugging purposes)
With the changes in this PEP, this internal frame value cache is no longer With the changes in this PEP, this internal frame value cache is no longer
directly accessible from Python code (whereas historically it was both directly accessible from Python code (whereas historically it was both
returned by the ``locals()`` builtin and available as the ``frame.f_locals`` returned by the ``locals()`` builtin and available as the ``frame.f_locals``
@ -397,50 +393,46 @@ Fast locals proxy objects and the internal frame value cache returned by
to the frame itself, and will only be reliably visible via fast locals proxies to the frame itself, and will only be reliably visible via fast locals proxies
for the same frame if the change relates to extra variables that don't have for the same frame if the change relates to extra variables that don't have
slots in the frame's fast locals array slots in the frame's fast locals array
* changes made by executing code in the frame will be visible to newly created * changes made by executing code in the frame will be immediately visible to all
fast locals proxy objects, when directly accessing specific keys on existing fast locals proxy objects for that frame (both existing proxies and newly
fast locals proxy objects, and when performing intrinsically O(n) operations created ones). Visibility in the internal frame value cache returned
on existing fast locals proxy objects. Visibility in the internal frame value by ``PyEval_GetLocals()`` is subject to the cache update guidelines discussed
cache (and in fast locals proxy operations that rely on the frame) cache is in the next section
subject to the cache update guidelines discussed in the next section
Due to the last point, the frame API documentation will recommend that a new As a result of these points, only code using ``PyEval_GetLocals()``,
``frame.f_locals`` reference be retrieved whenever an optimised frame (or ``PyLocals_Get()``, or ``PyLocals_GetCopy()`` will need to be concerned about
a related frame) might have been running code that binds or unbinds local the frame value cache potentially becoming stale. Code using the new frame fast
variable or cell references, and the code iterates over the proxy, checks locals proxy API (whether from Python or from C) will always see the live state
its length, or calls ``popitem()``. This will be the most natural style of use of the frame.
in tracing function implementations, as those are passed references to frames
rather than directly to ``frames.f_locals``.
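The difference between a value cache that can go stale and a proxy that always reads the live frame state can be sketched with a toy model. This is only an illustration under stated assumptions: ``ToyFrame``, ``LiveView``, and ``_UNBOUND`` are hypothetical names invented for the sketch, not part of CPython or of either PEP.

```python
from collections.abc import Mapping

_UNBOUND = object()  # hypothetical sentinel standing in for a NULL slot


class ToyFrame:
    """Hypothetical stand-in for an optimised frame: names plus a slot array."""

    def __init__(self, names):
        self.names = list(names)
        self.slots = [_UNBOUND] * len(self.names)

    def bind(self, name, value):
        self.slots[self.names.index(name)] = value

    def snapshot(self):
        """Copy the currently bound variables, like a value cache refresh."""
        return {n: v for n, v in zip(self.names, self.slots) if v is not _UNBOUND}


class LiveView(Mapping):
    """Read-only toy proxy that consults the frame on every access."""

    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, name):
        try:
            value = self._frame.slots[self._frame.names.index(name)]
        except ValueError:
            raise KeyError(name) from None
        if value is _UNBOUND:
            raise KeyError(name)
        return value

    def __iter__(self):
        return iter(self._frame.snapshot())

    def __len__(self):
        return len(self._frame.snapshot())


frame = ToyFrame(["x", "y"])
frame.bind("x", 1)
cached = frame.snapshot()  # taken before "y" is bound
view = LiveView(frame)
frame.bind("y", 2)         # the frame keeps running code

print(sorted(cached))  # the snapshot no longer matches the frame
print(sorted(view))    # the view reflects the current frame state
```

In this model the snapshot plays the role of the mapping returned by ``PyEval_GetLocals()``, while the view plays the role of a fast locals proxy.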
Fast locals proxy implementation details Fast locals proxy implementation details
---------------------------------------- ----------------------------------------
Each fast locals proxy instance has two internal attributes that are not Each fast locals proxy instance has a single internal attribute that is not
exposed as part of the Python runtime API: exposed as part of the Python runtime API:
* *frame*: the underlying optimised frame that the proxy provides access to * *frame*: the underlying optimised frame that the proxy provides access to
* *frame_cache_updated*: whether this proxy has already updated the frame's
internal value cache at least once
In addition, proxy instances use and update the following attributes stored on the In addition, proxy instances use and update the following attributes stored on the
underlying frame: underlying frame or code object:
* *fast_refs*: a hidden mapping from variable names to either fast local storage * *_name_to_offset_mapping*: a hidden mapping from variable names to fast local
offsets (for local variables) or to closure cells (for closure variables). storage offsets. This mapping is lazily initialized on the first frame read or
This mapping is lazily initialized on the first frame read or write access write access through a fast locals proxy, rather than being eagerly populated
through a fast locals proxy, rather than being eagerly populated as soon as as soon as the first fast locals proxy is created. Since the mapping is
the first fast locals proxy is created. identical for all frames running a given code object, a single copy is stored
on the code object, rather than each frame object populating its own mapping
* *locals*: the internal frame value cache returned by the ``PyEval_GetLocals()`` * *locals*: the internal frame value cache returned by the ``PyEval_GetLocals()``
C API and updated by the ``PyFrame_FastToLocals()`` C API. This is the mapping C API and updated by the ``PyFrame_FastToLocals()`` C API. This is the mapping
that the ``locals()`` builtin returns in Python 3.10 and earlier. that the ``locals()`` builtin returns in Python 3.10 and earlier.
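A minimal sketch of the lazily built, per-code-object name mapping described above, assuming a hypothetical helper ``name_to_offset_mapping()`` that is not part of any real API. It uses only the public ``co_varnames``, ``co_cellvars``, and ``co_freevars`` attributes, so the offsets are illustrative positions and are not guaranteed to match the interpreter's actual fast locals layout.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def name_to_offset_mapping(code):
    """Build the mapping on first use and share it for that code object."""
    # dict.fromkeys() removes duplicate names while preserving their order.
    names = dict.fromkeys(code.co_varnames + code.co_cellvars + code.co_freevars)
    return {name: index for index, name in enumerate(names)}


def example(a, b):
    c = a + b
    return c


mapping = name_to_offset_mapping(example.__code__)
print(mapping)
```

Because the result is cached per code object, every frame running ``example`` would reuse the same mapping rather than building its own.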
``__getitem__`` operations on the proxy will populate the ``fast_refs`` mapping ``__getitem__`` operations on the proxy will populate the ``_name_to_offset_mapping``
(if it is not already populated), and then either return the relevant value on the code object (if it is not already populated), and then either return the
(if the key is found in either the ``fast_refs`` mapping or the internal frame relevant value (if the key is found in either the ``_name_to_offset_mapping``
value cache), or else raise ``KeyError``. Variables that are defined on the mapping or the internal frame value cache), or else raise ``KeyError``. Variables
frame but not currently bound raise ``KeyError`` (just as they're omitted from that are defined on the frame but not currently bound also raise ``KeyError``
the result of ``locals()``). (just as they're omitted from the result of ``locals()``).
As the frame storage is always accessed directly, the proxy will automatically As the frame storage is always accessed directly, the proxy will automatically
pick up name binding and unbinding operations that take place as the function pick up name binding and unbinding operations that take place as the function
@ -453,8 +445,7 @@ directly affect the corresponding fast local or cell reference on the underlying
frame, ensuring that changes are immediately visible to the running Python code, frame, ensuring that changes are immediately visible to the running Python code,
rather than needing to be written back to the runtime storage at some later time. rather than needing to be written back to the runtime storage at some later time.
Such changes are also immediately written to the internal frame value cache to Such changes are also immediately written to the internal frame value cache to
reduce the opportunities for the cache to get out of sync with the frame state make them visible to users of the ``PyEval_GetLocals()`` C API.
and to make them visible to users of the ``PyEval_GetLocals()`` C API.
Keys that are not defined as local or closure variables on the underlying frame Keys that are not defined as local or closure variables on the underlying frame
are still written to the internal value cache on optimised frames. This allows are still written to the internal value cache on optimised frames. This allows
@ -462,40 +453,11 @@ utilities like ``pdb`` (which writes ``__return__`` and ``__exception__``
values into the frame's ``f_locals`` mapping) to continue working as they always values into the frame's ``f_locals`` mapping) to continue working as they always
have. These additional keys that do not correspond to a local or closure have. These additional keys that do not correspond to a local or closure
variable on the frame will be left alone by future cache sync operations. variable on the frame will be left alone by future cache sync operations.
Using the frame value cache to store these extra keys (rather than defining a
Fast locals proxy objects offer a proxy-specific method that explicitly syncs new mapping that holds only the extra keys) provides full interoperability
the internal frame cache with the current state of the fast locals array: with the existing ``PyEval_GetLocals()`` API (since users of either API will
``proxy.sync_frame_cache()``. This method runs ``PyFrame_FastToLocalsWithError()`` see extra keys added by users of either API, rather than users of the new fast
to ensure the cache is consistent with the current frame state. locals proxy API only seeing keys added via that API).
Using a particular proxy instance to sync the frame cache sets the internal
``frame_cache_updated`` flag on that instance.
For most use cases, explicitly syncing the frame cache shouldn't be necessary,
as the following intrinsically O(n) operations implicitly sync the frame cache
whenever they're called on a proxy instance:
* ``__str__``
* ``__or__`` (dict union)
* ``copy()``
While the following operations will implicitly sync the frame cache if
``frame_cache_updated`` has not yet been set on that instance:
* ``__len__``
* ``__iter__``
* ``__reversed__``
* ``keys()``
* ``values()``
* ``items()``
* ``popitem()``
* value comparison operations
Other ``Mapping`` and ``MutableMapping`` methods on the proxy will behave as
expected for a mapping with these essential method semantics regardless of
whether the internal frame value cache is up to date or not.
An additional benefit of storing only the variable value cache on the frame An additional benefit of storing only the variable value cache on the frame
(rather than storing an instance of the proxy type), is that it avoids (rather than storing an instance of the proxy type), is that it avoids
@ -558,25 +520,25 @@ ensure that it is safe to cast arbitrary signed 32-bit signed integers to
This query API allows extension module code to determine the potential impact This query API allows extension module code to determine the potential impact
of mutating the mapping returned by ``PyLocals_Get()`` without needing access of mutating the mapping returned by ``PyLocals_Get()`` without needing access
to the details of the running frame object. to the details of the running frame object. Python code gets equivalent
information visually through lexical scoping (as covered in the new ``locals()``
builtin documentation).
To allow extension module code to behave consistently regardless of the active To allow extension module code to behave consistently regardless of the active
Python scope, the stable C ABI would gain the following new functions:: Python scope, the stable C ABI would gain the following new function::
PyObject * PyLocals_GetCopy(); PyObject * PyLocals_GetCopy();
PyObject * PyLocals_GetView();
``PyLocals_GetCopy()`` returns a new dict instance populated from the current ``PyLocals_GetCopy()`` returns a new dict instance populated from the current
locals namespace. Roughly equivalent to ``dict(locals())`` in Python code, but locals namespace. Roughly equivalent to ``dict(locals())`` in Python code, but
avoids the double-copy in the case where ``locals()`` already returns a shallow avoids the double-copy in the case where ``locals()`` already returns a shallow
copy. copy. Akin to the following code, but doesn't assume there will only ever be
two kinds of locals result::
``PyLocals_GetView()`` returns a new read-only mapping proxy instance for the locals = PyLocals_Get();
current locals namespace. This view immediately reflects all local variable if (PyLocals_GetKind() == PyLocals_DIRECT_REFERENCE) {
changes, independently of whether the running frame is optimised or not. locals = PyDict_Copy(locals);
However, some operations (e.g. length checking, iteration, mapping equality }
comparisons) may be subject to frame cache consistency issues on optimised
frames (as noted above when describing the behaviour of the fast locals proxy).
The existing ``PyEval_GetLocals()`` API will retain its existing behaviour in The existing ``PyEval_GetLocals()`` API will retain its existing behaviour in
CPython (mutable locals at class and module scope, shared dynamic snapshot CPython (mutable locals at class and module scope, shared dynamic snapshot
@ -587,8 +549,9 @@ The ``PyEval_GetLocals()`` documentation will also be updated to recommend
replacing usage of this API with whichever of the new APIs is most appropriate replacing usage of this API with whichever of the new APIs is most appropriate
for the use case: for the use case:
* Use ``PyLocals_GetView()`` for read-only access to the current locals * Use ``PyLocals_Get()`` (optionally combined with ``PyDictProxy_New()``) for
namespace. read-only access to the current locals namespace. This form of usage will
need to be aware that the copy may go stale in optimised frames.
* Use ``PyLocals_GetCopy()`` for a regular mutable dict that contains a copy of * Use ``PyLocals_GetCopy()`` for a regular mutable dict that contains a copy of
the current locals namespace, but has no ongoing connection to the active the current locals namespace, but has no ongoing connection to the active
frame. frame.
@ -619,14 +582,11 @@ will be updated only in the following circumstance:
* any call to ``PyFrame_GetLocals()``, ``PyFrame_GetLocalsCopy()``, * any call to ``PyFrame_GetLocals()``, ``PyFrame_GetLocalsCopy()``,
``_PyFrame_BorrowLocals()``, ``PyFrame_FastToLocals()``, or ``_PyFrame_BorrowLocals()``, ``PyFrame_FastToLocals()``, or
``PyFrame_FastToLocalsWithError()`` for the frame ``PyFrame_FastToLocalsWithError()`` for the frame
* retrieving the ``f_locals`` attribute from a Python level frame object * any operation on a fast locals proxy object that updates the shared
* any call to the ``sync_frame_cache()`` method on a fast locals proxy mapping as part of its implementation. In the initial reference
referencing that frame
* any operation on a fast locals proxy object that requires the shared
mapping to be up to date on the underlying frame. In the initial reference
implementation, those operations are those that are intrinsically ``O(n)`` implementation, those operations are those that are intrinsically ``O(n)``
operations (``flp.copy()`` and rendering as a string), as well as those that operations (``len(flp)``, mapping comparison, ``flp.copy()`` and rendering as
refresh the cache entries for individual keys. a string), as well as those that refresh the cache entries for individual keys.
Accessing the frame "view" APIs will *not* implicitly update the shared dynamic Accessing the frame "view" APIs will *not* implicitly update the shared dynamic
snapshot, and the CPython trace hook handling will no longer implicitly update snapshot, and the CPython trace hook handling will no longer implicitly update
@ -642,7 +602,6 @@ needed to support the stable C API/ABI updates::
PyLocals_Kind PyFrame_GetLocalsKind(frame); PyLocals_Kind PyFrame_GetLocalsKind(frame);
PyObject * PyFrame_GetLocals(frame); PyObject * PyFrame_GetLocals(frame);
PyObject * PyFrame_GetLocalsCopy(frame); PyObject * PyFrame_GetLocalsCopy(frame);
PyObject * PyFrame_GetLocalsView(frame);
PyObject * _PyFrame_BorrowLocals(frame); PyObject * _PyFrame_BorrowLocals(frame);
@ -654,8 +613,6 @@ needed to support the stable C API/ABI updates::
``PyFrame_GetLocalsCopy(frame)`` is the underlying API for ``PyFrame_GetLocalsCopy(frame)`` is the underlying API for
``PyLocals_GetCopy()``. ``PyLocals_GetCopy()``.
``PyFrame_GetLocalsView(frame)`` is the underlying API for ``PyLocals_GetView()``.
``_PyFrame_BorrowLocals(frame)`` is the underlying API for ``_PyFrame_BorrowLocals(frame)`` is the underlying API for
``PyEval_GetLocals()``. The underscore prefix is intended to discourage use and ``PyEval_GetLocals()``. The underscore prefix is intended to discourage use and
to indicate that code using it is unlikely to be portable across to indicate that code using it is unlikely to be portable across
@ -818,14 +775,6 @@ With the frame value cache being kept around anyway, it then further made sense
to rely on it to simplify the fast locals proxy mapping implementation. to rely on it to simplify the fast locals proxy mapping implementation.
Delaying implicit frame value cache updates
-------------------------------------------
Earlier iterations of this PEP proposed updating the internal frame value cache
whenever a new fast locals proxy instance was created for that frame. They also
proposed storing a separate copy of the ``fast_refs`` lookup mapping on each
What happens with the default args for ``eval()`` and ``exec()``? What happens with the default args for ``eval()`` and ``exec()``?
----------------------------------------------------------------- -----------------------------------------------------------------
@ -903,11 +852,9 @@ arbitrary frames, so the standard library test suite fails if that functionality
no longer works. no longer works.
Accordingly, the ability to store arbitrary keys was retained, at the expense Accordingly, the ability to store arbitrary keys was retained, at the expense
of certain operations on proxy objects currently either being slower than desired of certain operations on proxy objects being slower than could otherwise be
(as they need to update the dynamic snapshot in order to provide correct (since they can't assume that only names defined on the code object will be
behaviour), or else assuming that the cache is currently up to date (and hence accessible through the proxy).
potentially giving an incorrect answer if the frame state has changed in a
way that doesn't automatically update the cache contents).
It is expected that the exact details of the interaction between the fast locals It is expected that the exact details of the interaction between the fast locals
proxy and the ``f_locals`` value cache on the underlying frame will evolve over proxy and the ``f_locals`` value cache on the underlying frame will evolve over
@ -978,8 +925,9 @@ into the following cases:
current Python ``locals()`` namespace, but *not* wanting any changes to current Python ``locals()`` namespace, but *not* wanting any changes to
be visible to Python code. This is the ``PyLocals_GetCopy()`` API. be visible to Python code. This is the ``PyLocals_GetCopy()`` API.
* always wanting a read-only view of the current locals namespace, without * always wanting a read-only view of the current locals namespace, without
incurring the runtime overhead of making a full copy each time. This is the incurring the runtime overhead of making a full copy each time. This isn't
``PyLocals_GetView()`` API. readily offered for optimised frames due to the need to check whether names
are currently bound or not, so no specific API is being added to cover it.
Historically, these kinds of checks and operations would only have been Historically, these kinds of checks and operations would only have been
possible if a Python implementation emulated the full CPython frame API. With possible if a Python implementation emulated the full CPython frame API. With
@ -998,8 +946,8 @@ frames entirely.
These changes were originally offered as amendments to PEP 558, and the PEP These changes were originally offered as amendments to PEP 558, and the PEP
author rejected them for three main reasons: author rejected them for three main reasons:
* the claim that ``PyEval_GetLocals()`` is unfixable because it returns a * the initial claim that ``PyEval_GetLocals()`` was unfixable because it returns
borrowed reference is simply false, as it is still working in the PEP 558 a borrowed reference was simply false, as it is still working in the PEP 558
reference implementation. All that is required to keep it working is to reference implementation. All that is required to keep it working is to
retain the internal frame value cache and design the fast locals proxy in retain the internal frame value cache and design the fast locals proxy in
such a way that it is reasonably straightforward to keep the cache up to date such a way that it is reasonably straightforward to keep the cache up to date
@ -1016,11 +964,11 @@ author rejected them for three main reasons:
example, becomes consistently O(n) in the number of variables defined on the example, becomes consistently O(n) in the number of variables defined on the
frame, as the proxy has to iterate over the entire fast locals array to see frame, as the proxy has to iterate over the entire fast locals array to see
which names are currently bound to values before it can determine the answer. which names are currently bound to values before it can determine the answer.
By contrast, maintaining an internal frame value cache allows proxies to By contrast, maintaining an internal frame value cache potentially allows
largely be treated as normal dictionaries from an algorithmic complexity point proxies to largely be treated as normal dictionaries from an algorithmic
of view, with allowances only needing to be made for the initial implicit O(n) complexity point of view, with allowances only needing to be made for the
cache refresh that runs the first time an operation that relies on the cache initial implicit O(n) cache refresh that runs the first time an operation
being up to date is executed. that relies on the cache being up to date is executed.
* the claim that a cache-free implementation would be simpler is highly suspect, * the claim that a cache-free implementation would be simpler is highly suspect,
as PEP 667 includes only a pure Python sketch of a subset of a mutable mapping as PEP 667 includes only a pure Python sketch of a subset of a mutable mapping
implementation, rather than a full-fledged C implementation of a new mapping implementation, rather than a full-fledged C implementation of a new mapping
@ -1045,119 +993,269 @@ author rejected them for three main reasons:
Of the three reasons, the first is the most important (since we need compelling Of the three reasons, the first is the most important (since we need compelling
reasons to break API backwards compatibility, and we don't have them). reasons to break API backwards compatibility, and we don't have them).
The other two points relate to why the author of this PEP doesn't believe PEP However, after reviewing PEP 667's proposed Python level semantics, the author
667's proposal would actually offer any significant benefits to either API of this PEP eventually agreed that they *would* be simpler for users of the
consumers (while the author of this PEP concedes that PEP 558's internal frame Python ``locals()`` API, so this distinction between the two PEPs has been
cache sync management is more complex to deal with than PEP 667's API eliminated: regardless of which PEP and implementation is accepted, the fast
algorithmic complexity quirks, it's still markedly less complex than the locals proxy object *always* provides a consistent view of the current state
tracing mode semantics in current Python versions) or to CPython core developers of the local variables, even if this results in some operations becoming O(n)
(the author of this PEP certainly didn't want to write C implementations of five that would be O(1) on a regular dictionary (specifically, ``len(proxy)``
new fast locals proxy specific mutable mapping helper types when he could becomes O(n), since it needs to check which names are currently bound, and proxy
instead just write a single cache refresh helper method and then reuse the mapping comparisons avoid relying on the length check optimisation that allows
existing builtin dict method implementations). differences in the number of stored keys to be detected quickly for regular
mappings).
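The reason a length check becomes O(n) can be shown with a small sketch. It assumes a hypothetical ``NULL`` sentinel for unbound slots and a hypothetical ``proxy_len()`` helper; neither is taken from the reference implementation.

```python
NULL = object()  # hypothetical sentinel for an unbound fast local slot


def proxy_len(fast_locals, extra_names=()):
    """Count bound fast locals plus extra names; O(n) in the slot count."""
    # Every slot has to be inspected, since any of them may be unbound.
    bound = sum(1 for value in fast_locals if value is not NULL)
    return bound + len(extra_names)


slots = [1, NULL, "text", NULL]
print(proxy_len(slots))
print(proxy_len(slots, extra_names={"__return__"}))
```

A regular dict stores its size, so ``len()`` is O(1); here the number of bound names is only known after scanning the whole slot array.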
Taking the specific frame access example cited in PEP 667:: Due to the adoption of these non-standard performance characteristics in the
proxy implementation, the ``PyLocals_GetView()`` and ``PyFrame_GetLocalsView()``
C APIs were also removed from the proposal in this PEP.
def foo(): This leaves the only remaining points of distinction between the two PEPs as
x = sys._getframe().f_locals specifically related to the C API:
y = locals()
print(tuple(x))
print(tuple(y))
Following the implementation improvements prompted by the suggestions in PEP 667, * PEP 667 still proposes completely unnecessary C API breakage (the programmatic
PEP 558 prints the same result as PEP 667 does:: deprecation and eventual removal of ``PyEval_GetLocals()``,
``PyFrame_FastToLocalsWithError()``, and ``PyFrame_FastToLocals()``) without
justification, when it is entirely possible to keep these working indefinitely
(and interoperably) given a suitably designed fast locals proxy implementation
* the fast locals proxy handling of additional variables is defined in this PEP
in a way that is fully interoperable with the existing ``PyEval_GetLocals()``
API. In the proxy implementation proposed in PEP 667, users of the new frame
API will not see changes made to additional variables by users of the old API,
and changes made to additional variables via the old API will be overwritten
on subsequent calls to ``PyEval_GetLocals()``.
* the ``PyLocals_Get()`` API in this PEP is called ``PyEval_Locals()`` in PEP 667.
This function name is a bit strange as it lacks a verb, making it look more
like a type name than a data access API.
* this PEP adds ``PyLocals_GetCopy()`` and ``PyFrame_GetLocalsCopy()`` APIs to
allow extension modules to easily avoid incurring a double copy operation in
frames where ``PyLocals_Get()`` already makes a copy
* this PEP adds ``PyLocals_Kind``, ``PyLocals_GetKind()``, and
``PyFrame_GetLocalsKind()`` to allow extension modules to identify when code
is running at function scope without having to inspect non-portable frame and
code objects APIs (without the proposed query API, the existing equivalent to
the new ``PyLocals_GetKind() == PyLocals_SHALLOW_COPY`` check is to include
the CPython internal frame API headers and check if
``_PyFrame_GetCode(PyEval_GetFrame())->co_flags & CO_OPTIMIZED`` is set)
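For comparison, the same ``CO_OPTIMIZED`` check can be written at the Python level. This is a rough sketch that relies on the CPython-specific ``sys._getframe()``; the helper name ``running_in_optimised_frame()`` is hypothetical.

```python
import inspect
import sys


def running_in_optimised_frame(depth=1):
    """Report whether the calling frame's code object has CO_OPTIMIZED set."""
    code = sys._getframe(depth).f_code
    return bool(code.co_flags & inspect.CO_OPTIMIZED)


def at_function_scope():
    return running_in_optimised_frame()


class AtClassScope:
    # Evaluated while the class body is executing.
    optimised = running_in_optimised_frame()


print(at_function_scope())
print(AtClassScope.optimised)
```

Like the C equivalent, this depends on frame and code object details that other Python implementations are not required to provide, which is the portability gap the proposed query API is meant to close.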
('x', 'y') The Python pseudo-code below is based on the implementation sketch presented
('x',) in PEP 667 as of the time of writing (2021-10-24). The differences that
provide the improved interoperability between the new fast locals proxy API
and the existing ``PyEval_GetLocals()`` API are noted in comments.
That said, it's certainly possible to desynchronise the cache quite easily when As in PEP 667, all attributes that start with an underscore are invisible and
keeping proxy references around while letting code run in the frame. cannot be accessed directly. They serve only to illustrate the proposed design.
This isn't a new problem, as it's similar to the way that
``sys._getframe().f_locals`` behaves in existing versions when no trace hooks
are installed. The following example::
def foo(): For simplicity (and as in PEP 667), the handling of module and class level
x = sys._getframe().f_locals frames is omitted (they're much simpler, as ``_locals`` *is* the execution
print(tuple(x)) namespace, so no translation is required).
y = locals()
print(tuple(x))
print(tuple(y))
will print the following under PEP 558, as the first ``tuple(x)`` call consumes ::
the single implicit cache update performed by the proxy instance, and ``y``
hasn't been bound yet when the ``locals()`` call refreshes it again::
('x',) NULL: Object # NULL is a singleton representing the absence of a value.
('x',)
('x',)
However, this is the origin of the coding style guideline in the body of the class CodeType:
PEP: don't keep fast locals proxy references around if code might have been
executed in that frame since the proxy instance was created. With the code
updated to follow that guideline::
    def foo():
        x = sys._getframe().f_locals
        print(tuple(x))
        y = locals()
        x = sys._getframe().f_locals
        print(tuple(x))
        print(tuple(y))

The output once again becomes the same as it would be under PEP 667::

    ('x',)
    ('x', 'y',)
    ('x',)

Tracing function implementations, which are expected to be the main consumer of
the fast locals proxy API, generally won't run into the above problem, since
they get passed a reference to the frame object (and retrieve a fresh fast
locals proxy instance from that), while the frame itself isn't running code
while the trace function is running. If the trace function *does* allow code to
be run on the frame (e.g. it's a debugger), then it should also follow the
coding guideline and retrieve a new proxy instance each time it allows code
to run in the frame.

Most trace functions are going to be reading or writing individual keys, or
running intrinsically O(n) operations like iterating over all currently bound
variables, so they also shouldn't be impacted *too* badly by the performance
quirks in the PEP 667 proposal. The most likely source of annoyance would be
the O(n) ``len(proxy)`` implementation.

Note: the simplest way to convert the PEP 558 reference implementation into a
PEP 667 implementation that doesn't break ``PyEval_GetLocals()`` would be to
remove the ``frame_cache_updated`` checks in affected operations, and instead
always sync the frame cache in those methods. Adopting that approach would
change the algorithmic complexity of the following operations as shown
(where ``n`` is the number of local and cell variables defined on the frame):

    _name_to_offset_mapping_impl: dict | NULL
    ...

    def __init__(self, ...):
        self._name_to_offset_mapping_impl = NULL
        self._variable_names = deduplicate(
            self.co_varnames + self.co_cellvars + self.co_freevars
        )
        ...

    def _is_cell(self, offset):
        ... # How the interpreter identifies cells is an implementation detail

    @property
    def _name_to_offset_mapping(self):
        "Mapping of names to offsets in local variable array."
        if self._name_to_offset_mapping_impl is NULL:
            self._name_to_offset_mapping_impl = {
                name: index for (index, name) in enumerate(self._variable_names)
            }
        return self._name_to_offset_mapping_impl

class FrameType:

    _fast_locals : array[Object] # The values of the local variables, items may be NULL.
    _locals: dict | NULL # Dictionary returned by PyEval_GetLocals()

    def __init__(self, ...):
        self._locals = NULL
        ...

    @property
    def f_locals(self):
        return FastLocalsProxy(self)

class FastLocalsProxy:

    __slots__ = ("_frame",)

    def __init__(self, frame: FrameType):
        self._frame = frame

    def _set_locals_entry(self, name, val):
        f = self._frame
        if f._locals is NULL:
            f._locals = {}
        f._locals[name] = val

    def __getitem__(self, name):
        f = self._frame
        co = f.f_code
        if name in co._name_to_offset_mapping:
            index = co._name_to_offset_mapping[name]
            val = f._fast_locals[index]
            if val is NULL:
                raise KeyError(name)
            if co._is_cell(index):
                val = val.cell_contents
                if val is NULL:
                    raise KeyError(name)
            # PyEval_GetLocals() interop: implicit frame cache refresh
            self._set_locals_entry(name, val)
            return val
        # PyEval_GetLocals() interop: frame cache may contain additional names
        if f._locals is NULL:
            raise KeyError(name)
        return f._locals[name]

    def __setitem__(self, name, value):
        f = self._frame
        co = f.f_code
        if name in co._name_to_offset_mapping:
            index = co._name_to_offset_mapping[name]
            if co._is_cell(index):
                # Cells live in the fast locals array, not in the frame cache
                cell = f._fast_locals[index]
                cell.cell_contents = value
            else:
                f._fast_locals[index] = value
        # PyEval_GetLocals() interop: implicit frame cache update
        # even for names that are part of the fast locals array
        self._set_locals_entry(name, value)

    def __delitem__(self, name):
        f = self._frame
        co = f.f_code
        if name in co._name_to_offset_mapping:
            index = co._name_to_offset_mapping[name]
            if co._is_cell(index):
                # Cells live in the fast locals array, not in the frame cache
                cell = f._fast_locals[index]
                cell.cell_contents = NULL
            else:
                f._fast_locals[index] = NULL
        # PyEval_GetLocals() interop: implicit frame cache update
        # even for names that are part of the fast locals array
        if f._locals is not NULL:
            del f._locals[name]

    def __iter__(self):
        f = self._frame
        co = f.f_code
        for index, name in enumerate(co._variable_names):
            val = f._fast_locals[index]
            if val is NULL:
                continue
            if co._is_cell(index):
                val = val.cell_contents
                if val is NULL:
                    continue
            yield name
        if f._locals is not NULL:
            for name in f._locals:
                # Yield any extra names not defined on the frame
                if name in co._name_to_offset_mapping:
                    continue
                yield name

    def popitem(self):
        for name in self:
            val = self[name]
            # PyEval_GetLocals() interop: implicit frame cache update
            # even for names that are part of the fast locals array
            del self[name]
            return name, val
        raise KeyError("popitem(): no local variables are currently bound")

    def _sync_frame_cache(self):
        # This method underpins PyEval_GetLocals, PyFrame_FastToLocals
        # PyFrame_GetLocals, PyLocals_Get, mapping comparison, etc
        f = self._frame
        co = f.f_code
        if f._locals is NULL:
            f._locals = {}
        for index, name in enumerate(co._variable_names):
            val = f._fast_locals[index]
            if val is NULL:
                f._locals.pop(name, None)
                continue
            if co._is_cell(index):
                # Cache the cell's contents rather than the cell itself
                val = val.cell_contents
                if val is NULL:
                    f._locals.pop(name, None)
                    continue
            f._locals[name] = val

    def __len__(self):
        self._sync_frame_cache()
        return len(self._frame._locals)

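The pseudo-code above relies on a ``NULL`` sentinel and on interpreter internals, so it cannot be run as written. As a rough, runnable approximation, the sketch below models only the fast locals array. The ``ToyFrame`` and ``ToyProxy`` names and the ``_UNBOUND`` sentinel (standing in for ``NULL``) are hypothetical and not part of the PEP or of CPython, and cells, extra names and the ``PyEval_GetLocals()`` cache are deliberately left out. It is meant to show why key lookup stays O(1) while ``len()`` has to scan every slot once no cached snapshot is trusted.

```python
_UNBOUND = object()  # hypothetical stand-in for the pseudo-code's NULL


class ToyFrame:
    """Toy model of an optimised frame: names map to slots in an array."""

    def __init__(self, variable_names):
        self.variable_names = tuple(variable_names)
        self.name_to_offset = {n: i for i, n in enumerate(self.variable_names)}
        self.fast_locals = [_UNBOUND] * len(self.variable_names)


class ToyProxy:
    """Write-through view that reads the slot array on every operation."""

    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, name):
        f = self._frame
        index = f.name_to_offset[name]  # O(1); KeyError for unknown names
        val = f.fast_locals[index]
        if val is _UNBOUND:
            raise KeyError(name)
        return val

    def __setitem__(self, name, value):
        f = self._frame
        # Simplification: only names defined on the frame can be set here
        f.fast_locals[f.name_to_offset[name]] = value

    def __iter__(self):
        f = self._frame
        for name, val in zip(f.variable_names, f.fast_locals):
            if val is not _UNBOUND:
                yield name

    def __len__(self):
        # O(n): the only way to count bound variables is to check each slot
        return sum(1 for _ in self)


frame = ToyFrame(["x", "y", "z"])
proxy = ToyProxy(frame)
proxy["x"] = 1
frame.fast_locals[1] = 2  # simulates code in the frame binding ``y``
print(len(proxy), list(proxy))
```

Because the toy proxy holds no snapshot of its own, the binding made directly on the frame is visible through a proxy that was created earlier.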
Note: the simplest way to convert the earlier iterations of the PEP 558
reference implementation into a preliminary implementation of the now proposed
semantics is to remove the ``frame_cache_updated`` checks in affected operations,
and instead always sync the frame cache in those methods. Adopting that approach
changes the algorithmic complexity of the following operations as shown (where
``n`` is the number of local and cell variables defined on the frame):
* ``__len__``: O(1) -> O(n)
* value comparison operations: no longer benefit from O(1) length check shortcut
* ``__iter__``: O(1) -> O(n)
* ``__reversed__``: O(1) -> O(n)
* ``keys()``: O(1) -> O(n)
* ``values()``: O(1) -> O(n)
* ``items()``: O(1) -> O(n)
* ``popitem()``: O(1) -> O(n)

Keeping the iterator/iterable retrieval methods as ``O(1)`` would involve
writing custom replacements for the corresponding builtin dict helper types.
``popitem()`` could be improved from "always O(n)" to "O(n) worst case" by
creating a custom implementation that iterates over the fast locals array
directly. The length check and value comparison operations have very limited
opportunities for improvement: without a cache, the only way to know how many
variables are currently bound is to iterate over all of them and check, and if
the implementation is going to be spending that much time on an operation
anyway, it may as well spend it updating the frame value cache and then
consuming the result.

This feels worse than PEP 558 as written, where folks that don't want to think
too hard about the cache management details, and don't care about potential
performance issues with large frames, are free to add as many
``proxy.sync_frame_cache()`` (or other internal frame cache updating) calls to
their code as they like.

The length check and value comparison operations have relatively limited
opportunities for improvement: without allowing usage of a potentially stale
cache, the only way to know how many variables are currently bound is to iterate
over all of them and check, and if the implementation is going to be spending
that many cycles on an operation anyway, it may as well spend it updating the
frame value cache and then consuming the result. These operations are O(n) in
both this PEP and in PEP 667. Customised implementations could be provided that
*are* faster than updating the frame cache, but it's far from clear that the
extra code complexity needed to speed these operations up would be worthwhile
when it only offers a linear performance improvement rather than an algorithmic
complexity improvement.

The O(1) nature of the other operations can be restored by adding implementation
code that doesn't rely on the value cache being up to date.

Keeping the iterator/iterable retrieval methods as ``O(1)`` will involve
writing custom replacements for the corresponding builtin dict helper types,
just as proposed in PEP 667. As illustrated above, the implementations would
be similar to the pseudo-code presented in PEP 667, but not identical (due to
the improved ``PyEval_GetLocals()`` interoperability offered by this PEP
affecting the way it stores extra variables).

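As a rough illustration of what such a custom helper type might look like, the view below is created in O(1) and only walks the slot array when it is actually iterated. The ``ToyKeysView`` name, the ``_UNBOUND`` sentinel and the plain-list model of the fast locals array are assumptions made for this sketch and come from neither PEP; extra names stored outside the fast locals array are ignored.

```python
_UNBOUND = object()  # hypothetical stand-in for NULL


class ToyKeysView:
    """Live keys view over a (names, slots) pair; O(1) to create."""

    def __init__(self, names, slots):
        # Only references are stored, so creation cost does not depend on n
        self._names = names
        self._slots = slots

    def __iter__(self):
        # The O(n) work is deferred until the caller actually iterates
        for name, val in zip(self._names, self._slots):
            if val is not _UNBOUND:
                yield name

    def __len__(self):
        return sum(1 for _ in self)


names = ("a", "b", "c")
slots = [1, _UNBOUND, _UNBOUND]
view = ToyKeysView(names, slots)
before = list(view)
slots[2] = 3  # bind ``c`` after the view was created
after = list(view)
print(before, after)
```

The view reflects bindings made after it was created, which is the behaviour a snapshot-based ``dict.keys()`` result over a possibly stale cache could not guarantee.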
``popitem()`` can be improved from "always O(n)" to "O(n) worst case" by
creating a custom implementation that relies on the improved iteration APIs.
To ensure stale frame information is never presented in the Python fast locals
proxy API, these changes in the reference implementation will need to be
implemented before merging.
The current implementation at time of writing (2021-10-24) also still stores a
copy of the fast refs mapping on each frame rather than storing a single
instance on the underlying code object (as it still stores cell references
directly, rather than checking for cells on each fast locals array access).
Fixing this would also be required before merging.

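The difference between a per-frame copy and a single shared instance can be sketched as follows. The ``ToyCode`` and ``ToyFrame`` classes are hypothetical, and ``functools.cached_property`` is used here only as a convenient way to model the lazily built ``_name_to_offset_mapping`` from the pseudo-code; the reference implementation is written in C and need not work this way.

```python
from functools import cached_property


class ToyCode:
    """Toy code object that owns the name-to-offset mapping."""

    def __init__(self, variable_names):
        self.variable_names = tuple(variable_names)

    @cached_property
    def name_to_offset(self):
        # Built lazily on first access, then reused by every frame
        return {name: index for index, name in enumerate(self.variable_names)}


class ToyFrame:
    """Toy frame that holds only its own slot array and a code reference."""

    def __init__(self, code):
        self.code = code
        self.fast_locals = [None] * len(code.variable_names)


code = ToyCode(["x", "y"])
frames = [ToyFrame(code) for _ in range(3)]
mappings = [frame.code.name_to_offset for frame in frames]
print(all(m is mappings[0] for m in mappings))
```

With this layout, creating another frame for the same code object allocates a new slot array but no new mapping.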
Implementation
@ -1187,7 +1285,8 @@ restarting discussion on the PEP in early 2021 after a further year of
inactivity) [10,11,12]_. Mark's comments that were ultimately published as
PEP 667 also directly resulted in several implementation efficiency improvements
that avoid incurring the cost of redundant O(n) mapping refresh operations
when the relevant mappings aren't used, as well as the change to ensure that
the state reported through the Python level ``f_locals`` API is never stale.
References