PEP 558: Adopt Python level semantics from PEP 667 (#2124)

* fast locals proxy never assumes the value cache is already up to date
* operations become O(n) as required to avoid that assumption
* remove `*View()` APIs from proposal due to algorithmic complexity issue
* add Python pseudo-code to the PEP 667 comparison section
* reword PEP 667 comparison section to focus on the remaining differences
  in the C API proposal
Nick Coghlan 2021-12-23 09:25:51 +10:00 committed by GitHub
parent 107361803d
commit dedc9d250e
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 301 additions and 202 deletions


@ -35,7 +35,6 @@ Python C API/ABI::
PyLocals_Kind PyLocals_GetKind();
PyObject * PyLocals_Get();
PyObject * PyLocals_GetCopy();
PyObject * PyLocals_GetView();
It also proposes the addition of several supporting functions and type
definitions to the CPython C API.
@ -281,10 +280,6 @@ Summary of proposed implementation-specific changes
the local namespace in the running frame::
PyObject * PyLocals_GetCopy();
* One new function is added to the stable ABI to get a read-only view of the
local namespace in the running frame::
PyObject * PyLocals_GetView();
* Corresponding frame accessor functions for these new public APIs are added to
the CPython frame C API
* On optimised frames, the Python level ``f_locals`` API will return dynamically
@ -309,7 +304,7 @@ Summary of proposed implementation-specific changes
mutable read/write mapping for the local variables.
* The trace hook implementation will no longer call ``PyFrame_FastToLocals()``
implicitly. The version porting guide will recommend migrating to
``PyFrame_GetLocalsView()`` for read-only access and
``PyFrame_GetLocals()`` for read-only access and
``PyObject_GetAttrString(frame, "f_locals")`` for read/write access.
@ -379,6 +374,7 @@ retained for two key purposes:
fast locals array (e.g. the ``__return__`` and ``__exception__`` keys set by
``pdb`` when tracing code execution for debugging purposes)
With the changes in this PEP, this internal frame value cache is no longer
directly accessible from Python code (whereas historically it was both
returned by the ``locals()`` builtin and available as the ``frame.f_locals``
@ -397,50 +393,46 @@ Fast locals proxy objects and the internal frame value cache returned by
to the frame itself, and will only be reliably visible via fast locals proxies
for the same frame if the change relates to extra variables that don't have
slots in the frame's fast locals array
* changes made by executing code in the frame will be visible to newly created
fast locals proxy objects, when directly accessing specific keys on existing
fast locals proxy objects, and when performing intrinsically O(n) operations
on existing fast locals proxy objects. Visibility in the internal frame value
cache (and in fast locals proxy operations that rely on the frame cache) is
subject to the cache update guidelines discussed in the next section
* changes made by executing code in the frame will be immediately visible to all
fast locals proxy objects for that frame (both existing proxies and newly
created ones). Visibility in the internal frame value cache returned
by ``PyEval_GetLocals()`` is subject to the cache update guidelines discussed
in the next section
Due to the last point, the frame API documentation will recommend that a new
``frame.f_locals`` reference be retrieved whenever an optimised frame (or
a related frame) might have been running code that binds or unbinds local
variable or cell references, and the code iterates over the proxy, checks
its length, or calls ``popitem()``. This will be the most natural style of use
in tracing function implementations, as those are passed references to frames
rather than directly to ``frame.f_locals``.
As a result of these points, only code using ``PyEval_GetLocals()``,
``PyLocals_Get()``, or ``PyLocals_GetCopy()`` will need to be concerned about
the frame value cache potentially becoming stale. Code using the new frame fast
locals proxy API (whether from Python or from C) will always see the live state
of the frame.
Fast locals proxy implementation details
----------------------------------------
Each fast locals proxy instance has two internal attributes that are not
Each fast locals proxy instance has a single internal attribute that is not
exposed as part of the Python runtime API:
* *frame*: the underlying optimised frame that the proxy provides access to
* *frame_cache_updated*: whether this proxy has already updated the frame's
internal value cache at least once
In addition, proxy instances use and update the following attributes stored on the
underlying frame:
underlying frame or code object:
* *fast_refs*: a hidden mapping from variable names to either fast local storage
offsets (for local variables) or to closure cells (for closure variables).
This mapping is lazily initialized on the first frame read or write access
through a fast locals proxy, rather than being eagerly populated as soon as
the first fast locals proxy is created.
* *_name_to_offset_mapping*: a hidden mapping from variable names to fast local
storage offsets. This mapping is lazily initialized on the first frame read or
write access through a fast locals proxy, rather than being eagerly populated
as soon as the first fast locals proxy is created. Since the mapping is
identical for all frames running a given code object, a single copy is stored
on the code object, rather than each frame object populating its own mapping
* *locals*: the internal frame value cache returned by the ``PyEval_GetLocals()``
C API and updated by the ``PyFrame_FastToLocals()`` C API. This is the mapping
that the ``locals()`` builtin returns in Python 3.10 and earlier.
``__getitem__`` operations on the proxy will populate the ``fast_refs`` mapping
(if it is not already populated), and then either return the relevant value
(if the key is found in either the ``fast_refs`` mapping or the internal frame
value cache), or else raise ``KeyError``. Variables that are defined on the
frame but not currently bound raise ``KeyError`` (just as they're omitted from
the result of ``locals()``).
``__getitem__`` operations on the proxy will populate the ``_name_to_offset_mapping``
on the code object (if it is not already populated), and then either return the
relevant value (if the key is found in either the ``_name_to_offset_mapping``
mapping or the internal frame value cache), or else raise ``KeyError``. Variables
that are defined on the frame but not currently bound also raise ``KeyError``
(just as they're omitted from the result of ``locals()``).
As the frame storage is always accessed directly, the proxy will automatically
pick up name binding and unbinding operations that take place as the function
@ -453,8 +445,7 @@ directly affect the corresponding fast local or cell reference on the underlying
frame, ensuring that changes are immediately visible to the running Python code,
rather than needing to be written back to the runtime storage at some later time.
Such changes are also immediately written to the internal frame value cache to
reduce the opportunities for the cache to get out of sync with the frame state
and to make them visible to users of the ``PyEval_GetLocals()`` C API.
make them visible to users of the ``PyEval_GetLocals()`` C API.
Keys that are not defined as local or closure variables on the underlying frame
are still written to the internal value cache on optimised frames. This allows
@ -462,40 +453,11 @@ utilities like ``pdb`` (which writes ``__return__`` and ``__exception__``
values into the frame's ``f_locals`` mapping) to continue working as they always
have. These additional keys that do not correspond to a local or closure
variable on the frame will be left alone by future cache sync operations.
Fast locals proxy objects offer a proxy-specific method that explicitly syncs
the internal frame cache with the current state of the fast locals array:
``proxy.sync_frame_cache()``. This method runs ``PyFrame_FastToLocalsWithError()``
to ensure the cache is consistent with the current frame state.
Using a particular proxy instance to sync the frame cache sets the internal
``frame_cache_updated`` flag on that instance.
For most use cases, explicitly syncing the frame cache shouldn't be necessary,
as the following intrinsically O(n) operations implicitly sync the frame cache
whenever they're called on a proxy instance:
* ``__str__``
* ``__or__`` (dict union)
* ``copy()``
While the following operations will implicitly sync the frame cache if
``frame_cache_updated`` has not yet been set on that instance:
* ``__len__``
* ``__iter__``
* ``__reversed__``
* ``keys()``
* ``values()``
* ``items()``
* ``popitem()``
* value comparison operations
Other ``Mapping`` and ``MutableMapping`` methods on the proxy will behave as
expected for a mapping with these essential method semantics regardless of
whether the internal frame value cache is up to date or not.
Using the frame value cache to store these extra keys (rather than defining a
new mapping that holds only the extra keys) provides full interoperability
with the existing ``PyEval_GetLocals()`` API (since users of either API will
see extra keys added by users of either API, rather than users of the new fast
locals proxy API only seeing keys added via that API).
An additional benefit of storing only the variable value cache on the frame
(rather than storing an instance of the proxy type), is that it avoids
@ -558,25 +520,25 @@ ensure that it is safe to cast arbitrary signed 32-bit signed integers to
This query API allows extension module code to determine the potential impact
of mutating the mapping returned by ``PyLocals_Get()`` without needing access
to the details of the running frame object.
to the details of the running frame object. Python code gets equivalent
information visually through lexical scoping (as covered in the new ``locals()``
builtin documentation).
To allow extension module code to behave consistently regardless of the active
Python scope, the stable C ABI would gain the following new functions::
Python scope, the stable C ABI would gain the following new function::
PyObject * PyLocals_GetCopy();
PyObject * PyLocals_GetView();
``PyLocals_GetCopy()`` returns a new dict instance populated from the current
locals namespace. Roughly equivalent to ``dict(locals())`` in Python code, but
avoids the double-copy in the case where ``locals()`` already returns a shallow
copy.
copy. Akin to the following code, but doesn't assume there will only ever be
two kinds of locals result::
``PyLocals_GetView()`` returns a new read-only mapping proxy instance for the
current locals namespace. This view immediately reflects all local variable
changes, independently of whether the running frame is optimised or not.
However, some operations (e.g. length checking, iteration, mapping equality
comparisons) may be subject to frame cache consistency issues on optimised
frames (as noted above when describing the behaviour of the fast locals proxy).
locals = PyLocals_Get();
if (PyLocals_GetKind() == PyLocals_DIRECT_REFERENCE) {
locals = PyDict_Copy(locals);
}
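For comparison, the Python level behaviour that these C APIs mirror can be
sketched with the existing ``locals()`` builtin. This is only an illustration:
the names ``function_scope_copy`` and ``Example`` are hypothetical and are not
part of the proposal.

```python
def function_scope_copy():
    x = 1
    # dict(locals()) is the rough Python equivalent of PyLocals_GetCopy():
    # an independent dict with no ongoing connection to the frame.
    snapshot = dict(locals())
    snapshot["x"] = 2  # only changes the copy, never the local variable
    return x, snapshot["x"]


class Example:
    # At class scope, locals() is a direct reference to the namespace
    # being populated, so writes through it define class attributes.
    locals()["added_via_locals"] = 42


print(function_scope_copy())
print(Example.added_via_locals)
```

The function scope result shows the local variable is untouched by writes to
the copy, while the class scope write is visible as a regular attribute.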
The existing ``PyEval_GetLocals()`` API will retain its existing behaviour in
CPython (mutable locals at class and module scope, shared dynamic snapshot
@ -587,8 +549,9 @@ The ``PyEval_GetLocals()`` documentation will also be updated to recommend
replacing usage of this API with whichever of the new APIs is most appropriate
for the use case:
* Use ``PyLocals_GetView()`` for read-only access to the current locals
namespace.
* Use ``PyLocals_Get()`` (optionally combined with ``PyDictProxy_New()``) for
read-only access to the current locals namespace. This form of usage will
need to be aware that the copy may go stale in optimised frames.
* Use ``PyLocals_GetCopy()`` for a regular mutable dict that contains a copy of
the current locals namespace, but has no ongoing connection to the active
frame.
@ -619,14 +582,11 @@ will be updated only in the following circumstance:
* any call to ``PyFrame_GetLocals()``, ``PyFrame_GetLocalsCopy()``,
``_PyFrame_BorrowLocals()``, ``PyFrame_FastToLocals()``, or
``PyFrame_FastToLocalsWithError()`` for the frame
* retrieving the ``f_locals`` attribute from a Python level frame object
* any call to the ``sync_frame_cache()`` method on a fast locals proxy
referencing that frame
* any operation on a fast locals proxy object that requires the shared
mapping to be up to date on the underlying frame. In the initial reference
* any operation on a fast locals proxy object that updates the shared
mapping as part of its implementation. In the initial reference
implementation, those operations are those that are intrinsically ``O(n)``
operations (``flp.copy()`` and rendering as a string), as well as those that
refresh the cache entries for individual keys.
operations (``len(flp)``, mapping comparison, ``flp.copy()`` and rendering as
a string), as well as those that refresh the cache entries for individual keys.
Accessing the frame "view" APIs will *not* implicitly update the shared dynamic
snapshot, and the CPython trace hook handling will no longer implicitly update
@ -642,7 +602,6 @@ needed to support the stable C API/ABI updates::
PyLocals_Kind PyFrame_GetLocalsKind(frame);
PyObject * PyFrame_GetLocals(frame);
PyObject * PyFrame_GetLocalsCopy(frame);
PyObject * PyFrame_GetLocalsView(frame);
PyObject * _PyFrame_BorrowLocals(frame);
@ -654,8 +613,6 @@ needed to support the stable C API/ABI updates::
``PyFrame_GetLocalsCopy(frame)`` is the underlying API for
``PyLocals_GetCopy()``.
``PyFrame_GetLocalsView(frame)`` is the underlying API for ``PyLocals_GetView()``.
``_PyFrame_BorrowLocals(frame)`` is the underlying API for
``PyEval_GetLocals()``. The underscore prefix is intended to discourage use and
to indicate that code using it is unlikely to be portable across
@ -818,14 +775,6 @@ With the frame value cache being kept around anyway, it then further made sense
to rely on it to simplify the fast locals proxy mapping implementation.
Delaying implicit frame value cache updates
-------------------------------------------
Earlier iterations of this PEP proposed updating the internal frame value cache
whenever a new fast locals proxy instance was created for that frame. They also
proposed storing a separate copy of the ``fast_refs`` lookup mapping on each
What happens with the default args for ``eval()`` and ``exec()``?
-----------------------------------------------------------------
@ -903,11 +852,9 @@ arbitrary frames, so the standard library test suite fails if that functionality
no longer works.
Accordingly, the ability to store arbitrary keys was retained, at the expense
of certain operations on proxy objects currently either being slower than desired
(as they need to update the dynamic snapshot in order to provide correct
behaviour), or else assuming that the cache is currently up to date (and hence
potentially giving an incorrect answer if the frame state has changed in a
way that doesn't automatically update the cache contents).
of certain operations on proxy objects being slower than they could otherwise be
(since they can't assume that only names defined on the code object will be
accessible through the proxy).
It is expected that the exact details of the interaction between the fast locals
proxy and the ``f_locals`` value cache on the underlying frame will evolve over
@ -978,8 +925,9 @@ into the following cases:
current Python ``locals()`` namespace, but *not* wanting any changes to
be visible to Python code. This is the ``PyLocals_GetCopy()`` API.
* always wanting a read-only view of the current locals namespace, without
incurring the runtime overhead of making a full copy each time. This is the
``PyLocals_GetView()`` API.
incurring the runtime overhead of making a full copy each time. This isn't
readily offered for optimised frames due to the need to check whether names
are currently bound or not, so no specific API is being added to cover it.
Historically, these kinds of checks and operations would only have been
possible if a Python implementation emulated the full CPython frame API. With
@ -998,8 +946,8 @@ frames entirely.
These changes were originally offered as amendments to PEP 558, and the PEP
author rejected them for three main reasons:
* the claim that ``PyEval_GetLocals()`` is unfixable because it returns a
borrowed reference is simply false, as it is still working in the PEP 558
* the initial claim that ``PyEval_GetLocals()`` was unfixable because it returns
a borrowed reference was simply false, as it is still working in the PEP 558
reference implementation. All that is required to keep it working is to
retain the internal frame value cache and design the fast locals proxy in
such a way that it is reasonably straightforward to keep the cache up to date
@ -1016,11 +964,11 @@ author rejected them for three main reasons:
example, becomes consistently O(n) in the number of variables defined on the
frame, as the proxy has to iterate over the entire fast locals array to see
which names are currently bound to values before it can determine the answer.
By contrast, maintaining an internal frame value cache allows proxies to
largely be treated as normal dictionaries from an algorithmic complexity point
of view, with allowances only needing to be made for the initial implicit O(n)
cache refresh that runs the first time an operation that relies on the cache
being up to date is executed.
By contrast, maintaining an internal frame value cache potentially allows
proxies to largely be treated as normal dictionaries from an algorithmic
complexity point of view, with allowances only needing to be made for the
initial implicit O(n) cache refresh that runs the first time an operation
that relies on the cache being up to date is executed.
* the claim that a cache-free implementation would be simpler is highly suspect,
as PEP 667 includes only a pure Python sketch of a subset of a mutable mapping
implementation, rather than a full-fledged C implementation of a new mapping
@ -1045,119 +993,269 @@ author rejected them for three main reasons:
Of the three reasons, the first is the most important (since we need compelling
reasons to break API backwards compatibility, and we don't have them).
The other two points relate to why the author of this PEP doesn't believe PEP
667's proposal would actually offer any significant benefits to either API
consumers (while the author of this PEP concedes that PEP 558's internal frame
cache sync management is more complex to deal with than PEP 667's API
algorithmic complexity quirks, it's still markedly less complex than the
tracing mode semantics in current Python versions) or to CPython core developers
(the author of this PEP certainly didn't want to write C implementations of five
new fast locals proxy specific mutable mapping helper types when he could
instead just write a single cache refresh helper method and then reuse the
existing builtin dict method implementations).
However, after reviewing PEP 667's proposed Python level semantics, the author
of this PEP eventually agreed that they *would* be simpler for users of the
Python ``locals()`` API, so this distinction between the two PEPs has been
eliminated: regardless of which PEP and implementation is accepted, the fast
locals proxy object *always* provides a consistent view of the current state
of the local variables, even if this results in some operations becoming O(n)
that would be O(1) on a regular dictionary (specifically, ``len(proxy)``
becomes O(n), since it needs to check which names are currently bound, and proxy
mapping comparisons avoid relying on the length check optimisation that allows
differences in the number of stored keys to be detected quickly for regular
mappings).
Taking the specific frame access example cited in PEP 667::
Due to the adoption of these non-standard performance characteristics in the
proxy implementation, the ``PyLocals_GetView()`` and ``PyFrame_GetLocalsView()``
C APIs were also removed from the proposal in this PEP.
def foo():
x = sys._getframe().f_locals
y = locals()
print(tuple(x))
print(tuple(y))
This leaves the only remaining points of distinction between the two PEPs as
specifically related to the C API:
Following the implementation improvements prompted by the suggestions in PEP 667,
PEP 558 prints the same result as PEP 667 does::
* PEP 667 still proposes completely unnecessary C API breakage (the programmatic
deprecation and eventual removal of ``PyEval_GetLocals()``,
``PyFrame_FastToLocalsWithError()``, and ``PyFrame_FastToLocals()``) without
justification, when it is entirely possible to keep these working indefinitely
(and interoperably) given a suitably designed fast locals proxy implementation
* the fast locals proxy handling of additional variables is defined in this PEP
in a way that is fully interoperable with the existing ``PyEval_GetLocals()``
API. In the proxy implementation proposed in PEP 667, users of the new frame
API will not see changes made to additional variables by users of the old API,
and changes made to additional variables via the old API will be overwritten
on subsequent calls to ``PyEval_GetLocals()``.
* the ``PyLocals_Get()`` API in this PEP is called ``PyEval_Locals()`` in PEP 667.
This function name is a bit strange as it lacks a verb, making it look more
like a type name than a data access API.
* this PEP adds ``PyLocals_GetCopy()`` and ``PyFrame_GetLocalsCopy()`` APIs to
allow extension modules to easily avoid incurring a double copy operation in
frames where ``PyLocals_Get()`` already makes a copy
* this PEP adds ``PyLocals_Kind``, ``PyLocals_GetKind()``, and
``PyFrame_GetLocalsKind()`` to allow extension modules to identify when code
is running at function scope without having to inspect non-portable frame and
code objects APIs (without the proposed query API, the existing equivalent to
the new ``PyLocals_GetKind() == PyLocals_SHALLOW_COPY`` check is to include
the CPython internal frame API headers and check if
``_PyFrame_GetCode(PyEval_GetFrame())->co_flags & CO_OPTIMIZED`` is set)
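From Python code, a comparable check on the code object flags might look like
the sketch below. It relies on the CPython-specific ``sys._getframe()`` and on
the ``CO_OPTIMIZED`` constant exposed by the ``inspect`` module; the helper
names ``running_in_optimised_frame`` and ``at_function_scope`` are hypothetical.

```python
import inspect
import sys


def running_in_optimised_frame():
    # Inspect the *caller's* frame. CO_OPTIMIZED is set on code objects
    # that use fast locals (that is, function bodies).
    caller = sys._getframe(1)
    return bool(caller.f_code.co_flags & inspect.CO_OPTIMIZED)


def at_function_scope():
    return running_in_optimised_frame()


# Module level code is compiled without CO_OPTIMIZED.
module_code = compile("x = 1", "<example>", "exec")

print(at_function_scope())
print(bool(module_code.co_flags & inspect.CO_OPTIMIZED))
```

Like the C expression quoted above, this depends on CPython implementation
details, which is the portability problem the proposed query API is meant to
avoid.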
('x', 'y')
('x',)
The Python pseudo-code below is based on the implementation sketch presented
in PEP 667 as of the time of writing (2021-10-24). The differences that
provide the improved interoperability between the new fast locals proxy API
and the existing ``PyEval_GetLocals()`` API are noted in comments.
That said, it's certainly possible to desynchronise the cache quite easily when
keeping proxy references around while letting code run in the frame.
This isn't a new problem, as it's similar to the way that
``sys._getframe().f_locals`` behaves in existing versions when no trace hooks
are installed. The following example::
As in PEP 667, all attributes that start with an underscore are invisible and
cannot be accessed directly. They serve only to illustrate the proposed design.
def foo():
x = sys._getframe().f_locals
print(tuple(x))
y = locals()
print(tuple(x))
print(tuple(y))
For simplicity (and as in PEP 667), the handling of module and class level
frames is omitted (they're much simpler, as ``_locals`` *is* the execution
namespace, so no translation is required).
will print the following under PEP 558, as the first ``tuple(x)`` call consumes
the single implicit cache update performed by the proxy instance, and ``y``
hasn't been bound yet when the ``locals()`` call refreshes it again::
::
('x',)
('x',)
('x',)
NULL: Object # NULL is a singleton representing the absence of a value.
However, this is the origin of the coding style guideline in the body of the
PEP: don't keep fast locals proxy references around if code might have been
executed in that frame since the proxy instance was created. With the code
updated to follow that guideline::
class CodeType:
def foo():
x = sys._getframe().f_locals
print(tuple(x))
y = locals()
x = sys._getframe().f_locals
print(tuple(x))
print(tuple(y))
_name_to_offset_mapping_impl: dict | NULL
...
def __init__(self, ...):
self._name_to_offset_mapping_impl = NULL
self._variable_names = deduplicate(
self.co_varnames + self.co_cellvars + self.co_freevars
)
...
The output once again becomes the same as it would be under PEP 667::
def _is_cell(self, offset):
... # How the interpreter identifies cells is an implementation detail
('x',)
('x', 'y',)
('x',)
@property
def _name_to_offset_mapping(self):
"Mapping of names to offsets in local variable array."
if self._name_to_offset_mapping_impl is NULL:
Tracing function implementations, which are expected to be the main consumer of
the fast locals proxy API, generally won't run into the above problem, since
they get passed a reference to the frame object (and retrieve a fresh fast
locals proxy instance from that), while the frame itself isn't running code
while the trace function is running. If the trace function *does* allow code to
be run on the frame (e.g. it's a debugger), then it should also follow the
coding guideline and retrieve a new proxy instance each time it allows code
to run in the frame.
self._name_to_offset_mapping_impl = {
name: index for (index, name) in enumerate(self._variable_names)
}
return self._name_to_offset_mapping_impl
Most trace functions are going to be reading or writing individual keys, or
running intrinsically O(n) operations like iterating over all currently bound
variables, so they also shouldn't be impacted *too* badly by the performance
quirks in the PEP 667 proposal. The most likely source of annoyance would be
the O(n) ``len(proxy)`` implementation.
class FrameType:
Note: the simplest way to convert the PEP 558 reference implementation into a
PEP 667 implementation that doesn't break ``PyEval_GetLocals()`` would be to
remove the ``frame_cache_updated`` checks in affected operations, and instead
always sync the frame cache in those methods. Adopting that approach would
change the algorithmic complexity of the following operations as shown
(where ``n`` is the number of local and cell variables defined on the frame):
_fast_locals : array[Object] # The values of the local variables, items may be NULL.
_locals: dict | NULL # Dictionary returned by PyEval_GetLocals()
def __init__(self, ...):
self._locals = NULL
...
@property
def f_locals(self):
return FastLocalsProxy(self)
class FastLocalsProxy:
__slots__ = ("_frame",)
def __init__(self, frame:FrameType):
self._frame = frame
def _set_locals_entry(self, name, val):
f = self._frame
if f._locals is NULL:
f._locals = {}
f._locals[name] = val
def __getitem__(self, name):
f = self._frame
co = f.f_code
if name in co._name_to_offset_mapping:
index = co._name_to_offset_mapping[name]
val = f._fast_locals[index]
if val is NULL:
raise KeyError(name)
if co._is_cell(index):
val = val.cell_contents
if val is NULL:
raise KeyError(name)
# PyEval_GetLocals() interop: implicit frame cache refresh
self._set_locals_entry(name, val)
return val
# PyEval_GetLocals() interop: frame cache may contain additional names
if f._locals is NULL:
raise KeyError(name)
return f._locals[name]
def __setitem__(self, name, val):
f = self._frame
co = f.f_code
if name in co._name_to_offset_mapping:
index = co._name_to_offset_mapping[name]
kind = co._local_kinds[index]
if co._is_cell(index):
cell = f._fast_locals[index]
cell.cell_contents = val
else:
f._fast_locals[index] = val
# PyEval_GetLocals() interop: implicit frame cache update
# even for names that are part of the fast locals array
self._set_locals_entry(name, val)
def __delitem__(self, name):
f = self._frame
co = f.f_code
if name in co._name_to_offset_mapping:
index = co._name_to_offset_mapping[name]
kind = co._local_kinds[index]
if co._is_cell(index):
cell = f._fast_locals[index]
cell.cell_contents = NULL
else:
f._fast_locals[index] = NULL
# PyEval_GetLocals() interop: implicit frame cache update
# even for names that are part of the fast locals array
if f._locals is not NULL:
del f._locals[name]
def __iter__(self):
f = self._frame
co = f.f_code
for index, name in enumerate(co._variable_names):
val = f._fast_locals[index]
if val is NULL:
continue
if co._is_cell(index):
val = val.cell_contents
if val is NULL:
continue
yield name
for name in f._locals:
# Yield any extra names not defined on the frame
if name in co._name_to_offset_mapping:
continue
yield name
def popitem(self):
f = self._frame
co = f.f_code
for name in self:
val = self[name]
# PyEval_GetLocals() interop: implicit frame cache update
# even for names that are part of the fast locals array
del self[name]
return name, val
def _sync_frame_cache(self):
# This method underpins PyEval_GetLocals, PyFrame_FastToLocals
# PyFrame_GetLocals, PyLocals_Get, mapping comparison, etc
f = self._frame
co = f.f_code
res = 0
if f._locals is NULL:
f._locals = {}
for index, name in enumerate(co._variable_names):
val = f._fast_locals[index]
if val is NULL:
f._locals.pop(name, None)
continue
if co._is_cell(index):
val = val.cell_contents
if val is NULL:
f._locals.pop(name, None)
continue
f._locals[name] = val
def __len__(self):
self._sync_frame_cache()
return len(self._frame._locals)
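The interoperability behaviour described in the pseudo-code can be tried out
with a much smaller runnable model. The sketch below is a simplification under
stated assumptions: the ``Code``, ``Frame`` and ``Proxy`` classes are
hypothetical stand-ins, closure cells are omitted, and ``__len__`` counts bound
names by iterating rather than by syncing the cache.

```python
NULL = object()  # stand-in for an unbound slot


class Code:
    def __init__(self, names):
        self.names = tuple(names)
        self.offsets = {name: i for i, name in enumerate(self.names)}


class Frame:
    def __init__(self, code):
        self.code = code
        self.fast = [NULL] * len(code.names)  # models the fast locals array
        self.cache = {}  # models the PyEval_GetLocals() value cache


class Proxy:
    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, name):
        f = self._frame
        if name in f.code.offsets:
            val = f.fast[f.code.offsets[name]]
            if val is NULL:
                raise KeyError(name)
            f.cache[name] = val  # implicit cache refresh for this key
            return val
        return f.cache[name]  # extra keys live only in the cache

    def __setitem__(self, name, val):
        f = self._frame
        if name in f.code.offsets:
            f.fast[f.code.offsets[name]] = val
        f.cache[name] = val  # always mirrored into the cache

    def __iter__(self):
        f = self._frame
        for i, name in enumerate(f.code.names):
            if f.fast[i] is not NULL:
                yield name
        for name in f.cache:
            if name not in f.code.offsets:
                yield name  # extra names not defined on the frame

    def __len__(self):
        # O(n): every slot has to be checked to see whether it is bound
        return sum(1 for _ in self)


frame = Frame(Code(["x", "y"]))
proxy = Proxy(frame)
frame.fast[0] = 1            # the running code binds x
proxy["__return__"] = None   # a debugger stores an extra key
print(tuple(proxy), len(proxy))
frame.fast[1] = 2            # the running code binds y later
print(tuple(proxy), len(proxy))
```

The same proxy instance reports ``y`` as soon as it is bound, without any
explicit cache sync, while the extra ``__return__`` key stays visible through
the shared cache.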
Note: the simplest way to convert the earlier iterations of the PEP 558
reference implementation into a preliminary implementation of the now proposed
semantics is to remove the ``frame_cache_updated`` checks in affected operations,
and instead always sync the frame cache in those methods. Adopting that approach
changes the algorithmic complexity of the following operations as shown (where
``n`` is the number of local and cell variables defined on the frame):
* ``__len__``: O(1) -> O(n)
* value comparison operations: no longer benefit from O(1) length check shortcut
* ``__iter__``: O(1) -> O(n)
* ``__reversed__``: O(1) -> O(n)
* ``keys()``: O(1) -> O(n)
* ``values()``: O(1) -> O(n)
* ``items()``: O(1) -> O(n)
* ``popitem()``: O(1) -> O(n)
* value comparison operations: no longer benefit from O(1) length check shortcut
Keeping the iterator/iterable retrieval methods as ``O(1)`` would involve
writing custom replacements for the corresponding builtin dict helper types.
``popitem()`` could be improved from "always O(n)" to "O(n) worst case" by
creating a custom implementation that iterates over the fast locals array
directly. The length check and value comparison operations have very limited
opportunities for improvement: without a cache, the only way to know how many
variables are currently bound is to iterate over all of them and check, and if
the implementation is going to be spending that much time on an operation
anyway, it may as well spend it updating the frame value cache and then
consuming the result.
The length check and value comparison operations have relatively limited
opportunities for improvement: without allowing usage of a potentially stale
cache, the only way to know how many variables are currently bound is to iterate
over all of them and check, and if the implementation is going to be spending
that many cycles on an operation anyway, it may as well spend it updating the
frame value cache and then consuming the result. These operations are O(n) in
both this PEP and in PEP 667. Customised implementations could be provided that
*are* faster than updating the frame cache, but it's far from clear that the
extra code complexity needed to speed these operations up would be worthwhile
when it only offers a linear performance improvement rather than an algorithmic
complexity improvement.
This feels worse than PEP 558 as written, where folks that don't want to think
too hard about the cache management details, and don't care about potential
performance issues with large frames, are free to add as many
``proxy.sync_frame_cache()`` (or other internal frame cache updating) calls to
their code as they like.
The O(1) nature of the other operations can be restored by adding implementation
code that doesn't rely on the value cache being up to date.
Keeping the iterator/iterable retrieval methods as ``O(1)`` will involve
writing custom replacements for the corresponding builtin dict helper types,
just as proposed in PEP 667. As illustrated above, the implementations would
be similar to the pseudo-code presented in PEP 667, but not identical (due to
the improved ``PyEval_GetLocals()`` interoperability offered by this PEP
affecting the way it stores extra variables).
``popitem()`` can be improved from "always O(n)" to "O(n) worst case" by
creating a custom implementation that relies on the improved iteration APIs.
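As a rough illustration of the "O(n) worst case" behaviour, a ``popitem()``
that scans the slots directly can stop at the first bound variable. The
function below is a hypothetical sketch over plain lists and dicts, not the
reference implementation, and its choice of which pair to return is arbitrary.

```python
NULL = object()  # stand-in for an unbound fast locals slot


def popitem(names, fast_locals, extras):
    """Remove and return one (name, value) pair, scanning slots in order."""
    for index, name in enumerate(names):
        val = fast_locals[index]
        if val is not NULL:
            fast_locals[index] = NULL  # unbind the variable on the "frame"
            return name, val
    if extras:
        return extras.popitem()  # fall back to extra keys held in the cache
    raise KeyError("popitem(): no bound variables")


names = ("a", "b", "c")
slots = [NULL, 10, 20]
extras = {"__return__": None}
first = popitem(names, slots, extras)
print(first)
```

The scan only reaches the end of the array when no earlier slot is bound, so
the full O(n) cost is paid only in the worst case.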
To ensure stale frame information is never presented in the Python fast locals
proxy API, these changes in the reference implementation will need to be
implemented before merging.
The current implementation at time of writing (2021-10-24) also still stores a
copy of the fast refs mapping on each frame rather than storing a single
instance on the underlying code object (as it still stores cell references
directly, rather than check for cells on each fast locals array access). Fixing
this would also be required before merging.
Implementation
@ -1187,7 +1285,8 @@ restarting discussion on the PEP in early 2021 after a further year of
inactivity) [10,11,12]_. Mark's comments that were ultimately published as
PEP 667 also directly resulted in several implementation efficiency improvements
that avoid incurring the cost of redundant O(n) mapping refresh operations
when the relevant mappings aren't used.
when the relevant mappings aren't used, as well as the change to ensure that
the state reported through the Python level ``f_locals`` API is never stale.
References