PEP 558: Clarify rationale for locals() snapshots (#3895)
parent 85040f7b77
commit 34771adaea
@@ -34,6 +34,12 @@ are outweighed by the availability of a viable reference implementation.
|
|||
|
||||
Accordingly, this PEP has been withdrawn in favour of proceeding with :pep:`667`.
|
||||
|
||||
Note: while implementing :pep:`667` it became apparent that the rationale for and impact
|
||||
of ``locals()`` being updated to return independent snapshots in
|
||||
:term:`optimized scopes <py3.13:optimized scope>` was not entirely clear in either PEP.
|
||||
The Motivation and Rationale sections in this PEP have been updated accordingly (since those
|
||||
aspects are equally applicable to the accepted :pep:`667`).
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
|
@@ -64,6 +70,7 @@ Python C API/ABI:
|
|||
It also proposes the addition of several supporting functions and type
|
||||
definitions to the CPython C API.
|
||||
|
||||
.. _pep-558-motivation:
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
@@ -89,6 +96,32 @@ independent snapshot of the function locals and closure variables on each
|
|||
call, rather than continuing to return the semi-dynamic intermittently updated
|
||||
shared copy that it has historically returned in CPython.
|
||||
|
||||
Specifically, the proposal in this PEP eliminates the historical behaviour where
|
||||
adding a new local variable can change the behaviour of code executed with
|
||||
``exec()`` in function scopes, even if that code runs *before* the local variable
|
||||
is defined.
|
||||
|
||||
For example::
|
||||
|
||||
    def f():
        exec("x = 1")
        print(locals().get("x"))
    f()
|
||||
|
||||
prints ``1``, but::
|
||||
|
||||
    def f():
        exec("x = 1")
        print(locals().get("x"))
        x = 0
    f()
|
||||
|
||||
prints ``None`` (the default value from the ``.get()`` call).
|
||||
|
||||
With this PEP both examples would print ``None``, as the call to
|
||||
``exec()`` and the subsequent call to ``locals()`` would use
|
||||
independent dictionary snapshots of the local variables rather
|
||||
than using the same shared dictionary cached on the frame object.
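The snapshot behaviour can be approximated on any Python 3 version by passing an explicit copy of the local namespace to ``exec()``. The following is a minimal sketch rather than code from the PEP; the function name ``run_with_snapshot`` is hypothetical:

```python
def run_with_snapshot():
    # Take an explicit, independent copy of the current local namespace.
    snapshot = dict(locals())
    # exec() binds "x" in the copy, not in the function's real locals.
    exec("x = 1", globals(), snapshot)
    # A later locals() call does not see the name bound inside exec().
    return locals().get("x"), snapshot.get("x")


result = run_with_snapshot()
print(result)
```

The name bound by ``exec()`` is visible only in the dictionary that was passed in, which is the property the proposal makes the default in optimized scopes.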
|
||||
|
||||
Proposal
|
||||
========
|
||||
|
@@ -797,25 +830,6 @@ frame machinery will allow rebinding of local and nonlocal variable
|
|||
references in a way that is hidden from static analysis.
|
||||
|
||||
|
||||
Retaining the internal frame value cache
|
||||
----------------------------------------
|
||||
|
||||
Retaining the internal frame value cache results in some visible quirks when
|
||||
frame proxy instances are kept around and re-used after name binding and
|
||||
unbinding operations have been executed on the frame.
|
||||
|
||||
The primary reason for retaining the frame value cache is to maintain backwards
|
||||
compatibility with the ``PyEval_GetLocals()`` API. That API returns a borrowed
|
||||
reference, so it must refer to persistent state stored on the frame object.
|
||||
Storing a fast locals proxy object on the frame creates a problematic reference
|
||||
cycle, so the cleanest option is to instead continue to return a frame value
|
||||
cache, just as this function has done since optimised frames were first
|
||||
introduced.
|
||||
|
||||
With the frame value cache being kept around anyway, it then further made sense
|
||||
to rely on it to simplify the fast locals proxy mapping implementation.
|
||||
|
||||
|
||||
What happens with the default args for ``eval()`` and ``exec()``?
|
||||
-----------------------------------------------------------------
|
||||
|
||||
|
@@ -840,9 +854,241 @@ namespace on each iteration).
|
|||
to make a list from the keys).
|
||||
|
||||
|
||||
.. _pep-558-exec-eval-impact:
|
||||
|
||||
Additional considerations for ``eval()`` and ``exec()`` in optimized scopes
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Note: while implementing :pep:`667`, it was noted that neither that PEP nor this one
|
||||
clearly explained the impact the ``locals()`` changes would have on code execution APIs
|
||||
like ``exec()`` and ``eval()``. This section was added to this PEP's rationale to better
|
||||
describe the impact and explain the intended benefits of the change.
|
||||
|
||||
When ``exec()`` was converted from a statement to a builtin function
|
||||
in Python 3.0 (part of the core language changes in :pep:`3100`), the
|
||||
associated implicit call to ``PyFrame_LocalsToFast()`` was removed, so
|
||||
it typically appears as if attempts to write to local variables with
|
||||
``exec()`` in optimized frames are ignored::
|
||||
|
||||
    >>> def f():
    ...     x = 0
    ...     exec("x = 1")
    ...     print(x)
    ...     print(locals()["x"])
    ...
    >>> f()
    0
    0
|
||||
|
||||
In truth, the writes aren't being ignored; they just aren't
being copied from the dictionary cache back to the optimized local
variable array. The changes to the dictionary are then overwritten
the next time the dictionary cache is refreshed from the array::
|
||||
|
||||
    >>> def f():
    ...     x = 0
    ...     locals_cache = locals()
    ...     exec("x = 1")
    ...     print(x)
    ...     print(locals_cache["x"])
    ...     print(locals()["x"])
    ...
    >>> f()
    0
    1
    0
|
||||
|
||||
.. _pep-558-ctypes-example:
|
||||
|
||||
The behaviour becomes even stranger if a tracing function
|
||||
or another piece of code invokes ``PyFrame_LocalsToFast()`` before
|
||||
the cache is next refreshed. In those cases the change *is*
|
||||
written back to the optimized local variable array::
|
||||
|
||||
    >>> from sys import _getframe
    >>> from ctypes import pythonapi, py_object, c_int
    >>> _locals_to_fast = pythonapi.PyFrame_LocalsToFast
    >>> _locals_to_fast.argtypes = [py_object, c_int]
    >>> def f():
    ...     _frame = _getframe()
    ...     _f_locals = _frame.f_locals
    ...     x = 0
    ...     exec("x = 1")
    ...     _locals_to_fast(_frame, 0)
    ...     print(x)
    ...     print(locals()["x"])
    ...     print(_f_locals["x"])
    ...
    >>> f()
    1
    1
    1
|
||||
|
||||
This situation was more common in Python 3.10 and earlier
|
||||
versions, as merely installing a tracing function was enough
|
||||
to trigger implicit calls to ``PyFrame_LocalsToFast()`` after
|
||||
every line of Python code. However, it can still happen in Python
|
||||
3.11+ depending on exactly which tracing functions are active
|
||||
(e.g. interactive debuggers intentionally do this so that changes
|
||||
made at the debugging prompt are visible when code execution
|
||||
resumes).
|
||||
|
||||
All of the above comments in relation to ``exec()`` apply to
|
||||
*any* attempt to mutate the result of ``locals()`` in optimized
|
||||
scopes, and are the main reason that the ``locals()`` builtin
|
||||
docs contain this caveat:
|
||||
|
||||
    Note: The contents of this dictionary should not be modified;
    changes may not affect the values of local and free variables
    used by the interpreter.
|
||||
|
||||
While the exact wording in the library reference is not entirely explicit,
|
||||
both ``exec()`` and ``eval()`` have long used the results of calling
|
||||
``globals()`` and ``locals()`` in the calling Python frame as their default
|
||||
execution namespace.
|
||||
|
||||
This was historically also equivalent to using the calling frame's
|
||||
``frame.f_globals`` and ``frame.f_locals`` attributes, but this PEP maps
|
||||
the default namespace arguments for ``exec()`` and ``eval()`` to
|
||||
``globals()`` and ``locals()`` in the calling frame in order to preserve
|
||||
the property of defaulting to ignoring attempted writes to the local
|
||||
namespace in optimized scopes.
|
||||
|
||||
This poses a potential compatibility issue for some code, as with the
|
||||
previous implementation that returns the same dict when ``locals()`` is called
|
||||
multiple times in function scope, the following code usually worked due to
|
||||
the implicitly shared local variable namespace::
|
||||
|
||||
    def f():
        exec('a = 0') # equivalent to exec('a = 0', globals(), locals())
        exec('print(a)') # equivalent to exec('print(a)', globals(), locals())
        print(locals()) # {'a': 0}
        # However, print(a) will not work here
    f()
|
||||
|
||||
With ``locals()`` in an optimised scope returning the same shared dict for each call,
|
||||
it was possible to store extra "fake locals" in that dict. While these aren't real
|
||||
locals known by the compiler (so they can't be printed with code like ``print(a)``),
|
||||
they can still be accessed via ``locals()`` and shared between multiple ``exec()``
|
||||
calls in the same function scope. Furthermore, because they're *not* real locals,
|
||||
they don't get implicitly updated or removed when the shared cache is refreshed
|
||||
from the local variable storage array.
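Code that relied on these "fake locals" to share state between ``exec()`` calls can instead pass one explicit namespace dictionary to each call, which behaves the same way regardless of how ``locals()`` is implemented. A minimal sketch (the names ``run_steps`` and ``namespace`` are hypothetical, not taken from the PEP):

```python
def run_steps():
    # One explicit dict shared by every exec() call in this function.
    namespace = {}
    exec("a = 0", globals(), namespace)
    # The second call can read the name bound by the first one.
    exec("b = a + 1", globals(), namespace)
    return namespace


shared = run_steps()
print(shared)
```

Because the dictionary is created and passed explicitly, the sharing no longer depends on ``locals()`` returning the same object on each call.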
|
||||
|
||||
When the code in ``exec()`` tries to write to an existing local variable, the
|
||||
runtime behaviour gets harder to predict::
|
||||
|
||||
    def f():
        a = None
        exec('a = 0') # equivalent to exec('a = 0', globals(), locals())
        exec('print(a)') # equivalent to exec('print(a)', globals(), locals())
        print(locals()) # {'a': None}
    f()
|
||||
|
||||
``print(a)`` will print ``None`` because the implicit ``locals()`` call in
|
||||
``exec()`` refreshes the cached dict with the actual values on the frame.
|
||||
This means that, unlike the "fake" locals created by writing back to ``locals()``
|
||||
(including via previous calls to ``exec()``), the real locals known by the
|
||||
compiler can't easily be modified by ``exec()`` (it can be done, but it requires
|
||||
both retrieving the ``frame.f_locals`` attribute to enable writes back to the frame,
|
||||
and then invoking ``PyFrame_LocalsToFast()``, as :ref:`shown <pep-558-ctypes-example>`
|
||||
using ``ctypes`` above).
|
||||
|
||||
As noted in the :ref:`pep-558-motivation` section, this confusing side effect
|
||||
happens even if the local variable is only defined *after* the ``exec()`` calls::
|
||||
|
||||
    >>> def f():
    ...     exec("a = 0")
    ...     exec("print('a' in locals())") # Printing 'a' directly won't work
    ...     print(locals())
    ...     a = None
    ...     print(locals())
    ...
    >>> f()
    False
    {}
    {'a': None}
|
||||
|
||||
Because ``a`` is a real local variable that is not currently bound to a value, it
|
||||
gets explicitly removed from the dictionary returned by ``locals()`` whenever
|
||||
``locals()`` is called prior to the ``a = None`` line. This removal is intentional,
|
||||
as it allows the contents of ``locals()`` to be updated correctly in optimized
|
||||
scopes when ``del`` statements are used to delete previously bound local variables.
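The effect of ``del`` on the result of ``locals()`` can be checked directly. This is a small illustrative sketch; the function name ``track_deletion`` is hypothetical:

```python
def track_deletion():
    a = 1
    # "a" is bound, so it appears in the mapping returned by locals().
    bound_before = "a" in locals()
    del a
    # "a" is still a local variable known to the compiler, but it is
    # unbound, so it is left out of the mapping.
    bound_after = "a" in locals()
    return bound_before, bound_after


print(track_deletion())
```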
|
||||
|
||||
As noted in the ``ctypes`` :ref:`example <pep-558-ctypes-example>`, the above behavioural
|
||||
description may be invalidated if the CPython ``PyFrame_LocalsToFast()`` API gets invoked
|
||||
while the frame is still running. In that case, the changes to ``a`` *might* become visible
|
||||
to the running code, depending on exactly when that API is called (and whether the frame
|
||||
has been primed for locals modification by accessing the ``frame.f_locals`` attribute).
|
||||
|
||||
As described above, two options were considered to replace this confusing behaviour:
|
||||
|
||||
* make ``locals()`` return write-through proxy instances (similar
  to ``frame.f_locals``)
* make ``locals()`` return genuinely independent snapshots so that
  attempts to change the values of local variables via ``exec()``
  would be *consistently* ignored without any of the caveats
  noted above.
|
||||
|
||||
The PEP chooses the second option for the following reasons:
|
||||
|
||||
* returning independent snapshots in optimized scopes preserves
|
||||
the Python 3.0 change to ``exec()`` that resulted in attempts
|
||||
to mutate local variables via ``exec()`` being ignored in most
|
||||
cases
|
||||
* the distinction between "``locals()`` gives an instantaneous
|
||||
snapshot of the local variables in optimized scopes, and
|
||||
read/write access in other scopes" and "``frame.f_locals``
|
||||
gives read/write access to the local variables in all scopes,
|
||||
including optimized scopes" allows the intent of a piece of
|
||||
code to be clearer than it would be if both APIs granted
|
||||
full read/write access in optimized scopes, even when write
|
||||
access wasn't needed or desired
|
||||
* in addition to improving clarity for human readers, ensuring
|
||||
that name rebinding in optimized scopes remains lexically
|
||||
visible in the code (as long as the frame introspection APIs
|
||||
are not accessed) allows compilers and interpreters to apply
|
||||
related performance optimizations more consistently
|
||||
* only Python implementations that support the optional frame
|
||||
introspection APIs will need to provide the new write-through
|
||||
proxy support for optimized frames
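The distinction between the two APIs drawn in the list above can be sketched as follows. This assumes CPython with no tracing function or debugger active, and it assumes that the write-through ``frame.f_locals`` semantics are only available from Python 3.13 (where :pep:`667` was implemented); the function names are hypothetical:

```python
import sys


def write_via_locals():
    x = 0
    # Writing to the mapping returned by locals() does not rebind x.
    locals()["x"] = 1
    return x


def write_via_frame():
    x = 0
    # Assumption: on Python 3.13+ f_locals is a write-through proxy, so
    # this rebinds x. On earlier versions it only updates a cached dict.
    sys._getframe().f_locals["x"] = 1
    return x


print(write_via_locals(), write_via_frame())
```

Code that only needs to read local variables can keep using ``locals()``, while code that intends to rebind them has to reach for the frame API explicitly.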
|
||||
|
||||
With the semantic changes to ``locals()`` in this PEP, it becomes much easier to explain
|
||||
the behaviour of ``exec()`` and ``eval()``: in optimized scopes, they will *never* implicitly
|
||||
affect local variables; in other scopes, they will *always* implicitly affect local
|
||||
variables. In optimized scopes, any implicit assignment to the local variables will be
|
||||
discarded when the code execution API returns, since a fresh copy of the local variables
|
||||
is used on each invocation.
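The contrast between optimized and non-optimized scopes can be illustrated with a function body and a class body. This is a minimal sketch with hypothetical names, assuming no tracing function or debugger writes values back to the frame:

```python
def function_scope():
    y = 0
    # Function bodies are optimized scopes: the assignment made by
    # exec() does not rebind the real local variable.
    exec("y = 1")
    return y


class ClassScope:
    # Class bodies are not optimized scopes: exec() writes directly
    # into the class namespace.
    exec("y = 1")


print(function_scope(), ClassScope.y)
```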
|
||||
|
||||
|
||||
Retaining the internal frame value cache
|
||||
----------------------------------------
|
||||
|
||||
Retaining the internal frame value cache results in some visible quirks when
|
||||
frame proxy instances are kept around and re-used after name binding and
|
||||
unbinding operations have been executed on the frame.
|
||||
|
||||
The primary reason for retaining the frame value cache is to maintain backwards
|
||||
compatibility with the ``PyEval_GetLocals()`` API. That API returns a borrowed
|
||||
reference, so it must refer to persistent state stored on the frame object.
|
||||
Storing a fast locals proxy object on the frame creates a problematic reference
|
||||
cycle, so the cleanest option is to instead continue to return a frame value
|
||||
cache, just as this function has done since optimised frames were first
|
||||
introduced.
|
||||
|
||||
With the frame value cache being kept around anyway, it then further made sense
|
||||
to rely on it to simplify the fast locals proxy mapping implementation.
|
||||
|
||||
Note: the fact that :pep:`667` *doesn't* use the internal frame value cache as part of the
write-through proxy implementation is the key Python-level difference between the two PEPs.
|
||||
|
||||
|
||||
Changing the frame API semantics in regular operation
|
||||
-----------------------------------------------------
|
||||
|
||||
Note: when this PEP was first written, it predated the Python 3.11 change to drop the
|
||||
implicit writeback of the frame local variables whenever a tracing function was installed,
|
||||
so making that change was included as part of the proposal.
|
||||
|
||||
Earlier versions of this PEP proposed having the semantics of the frame
|
||||
``f_locals`` attribute depend on whether or not a tracing hook was currently
|
||||
installed - only providing the write-through proxy behaviour when a tracing hook
|
||||