352 lines
15 KiB
ReStructuredText
352 lines
15 KiB
ReStructuredText
PEP: 558
|
||
Title: Defined semantics for locals()
|
||
Author: Nick Coghlan <ncoghlan@gmail.com>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 2017-09-08
|
||
Python-Version: 3.7
|
||
Post-History: 2017-09-08
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
The semantics of the ``locals()`` builtin have historically been underspecified
|
||
and hence implementation dependent.
|
||
|
||
This PEP proposes formally standardising on the behaviour of the CPython 3.6
|
||
reference implementation for most execution scopes, with some adjustments to the
|
||
behaviour at function scope to make it more predictable and independent of the
|
||
presence or absence of tracing functions.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
While the precise semantics of the ``locals()`` builtin are nominally undefined,
|
||
in practice, many Python programs depend on it behaving exactly as it behaves in
|
||
CPython (at least when no tracing functions are installed).
|
||
|
||
Other implementations such as PyPy are currently replicating that behaviour,
|
||
up to and including replication of local variable mutation bugs that
|
||
can arise when a trace hook is installed [1]_.
|
||
|
||
|
||
Proposal
|
||
========
|
||
|
||
The expected semantics of the ``locals()`` builtin change based on the current
|
||
execution scope. For this purpose, the defined scopes of execution are:
|
||
|
||
* module scope: top-level module code, as well as any other code executed using
|
||
``exec()`` or ``eval()`` with a single namespace
|
||
* class scope: code in the body of a ``class`` statement, as well as any other
|
||
code executed using ``exec()`` or ``eval()`` with separate local and global
|
||
namespaces
|
||
* function scope: code in the body of a ``def`` or ``async def`` statement
|
||
|
||
|
||
Module scope
|
||
------------
|
||
|
||
At module scope, as well as when using ``exec()`` or ``eval()`` with a
|
||
single namespace, ``locals()`` must return the same object as ``globals()``,
|
||
which must be the actual execution namespace (available as
|
||
``inspect.currentframe().f_locals`` in implementations that provide access
|
||
to frame objects).
|
||
|
||
Variable assignments during subsequent code execution in the same scope must
|
||
dynamically change the contents of the returned mapping, and changes to the
|
||
returned mapping must change the values bound to local variable names in the
|
||
execution environment.
|
||
|
||
This part of the proposal does not require any changes to the reference
|
||
implementation - it is standardisation of the current behaviour.
|
||
|
||
|
||
Class scope
|
||
-----------
|
||
|
||
At class scope, as well as when using ``exec()`` or ``eval()`` with separate
|
||
global and local namespaces, ``locals()`` must return the specified local
|
||
namespace (which may be supplied by the metaclass ``__prepare__`` method
|
||
in the case of classes). As for module scope, this must be a direct reference
|
||
to the actual execution namespace (available as
|
||
``inspect.currentframe().f_locals`` in implementations that provide access
|
||
to frame objects).
|
||
|
||
Variable assignments during subsequent code execution in the same scope must
|
||
change the contents of the returned mapping, and changes to the returned mapping
|
||
must change the values bound to local variable names in the
|
||
execution environment.
|
||
|
||
The mapping returned by ``locals()`` will *not* be used as the actual class namespace
|
||
underlying the defined class (the class creation process will copy the contents
|
||
to a fresh dictionary that is only accessible by going through the class
|
||
machinery).
|
||
|
||
For nested classes defined inside a function, any nonlocal cells referenced from
|
||
the class scope are *not* included in the ``locals()`` mapping.
|
||
|
||
This part of the proposal does not require any changes to the reference
|
||
implementation - it is standardisation of the current behaviour.
|
||
|
||
|
||
Function scope
|
||
--------------
|
||
|
||
At function scope, interpreter implementations are granted significant freedom
|
||
to optimise local variable access, and hence are NOT required to permit
|
||
arbitrary modification of local and nonlocal variable bindings through the
|
||
mapping returned from ``locals()``.
|
||
|
||
Instead, ``locals()`` is expected to return a mutable *snapshot* of the
|
||
function's local variables and any referenced nonlocal cells with the following
|
||
semantics:
|
||
|
||
* each call to ``locals()`` returns the *same* mapping object
|
||
* each call to ``locals()`` updates the mapping to the current state of the
|
||
local variables and any nonlocal cells referenced from either the function
|
||
itself, or from any nested class definitions
|
||
* changes to the returned mapping are *not* written back to the
|
||
local variable bindings or the nonlocal cell references
|
||
* changes to the returned mapping may be overwritten by subsequent calls to
|
||
``locals()`` and other operations that cause the mapping to be refreshed from
|
||
the actual execution state
|
||
* for interpreters that provide access to frame objects, the reference returned
|
||
by ``locals()`` *must* be a reference to the same namespace as is returned by
|
||
``inspect.currentframe().f_locals`` (in a running function, generator, or
|
||
coroutine), ``inspect.getgeneratorlocals()`` (in a running or suspended
|
||
generator), and ``inspect.getcoroutinelocals()`` (in a running or suspended
|
||
coroutine)
|
||
|
||
Additional entries may also be added through ``locals()`` or ``frame.f_locals``
|
||
and will then be accessible through both ``frame.f_locals`` and ``locals()``,
|
||
but will not be accessible by name from within the function (as any
|
||
names which don't appear as local or nonlocal variables at compile time will
|
||
only be looked up in the module globals and process builtins, not in the
|
||
function locals).
|
||
|
||
|
||
Allowing trace hooks to reliably mutate local variables
|
||
-------------------------------------------------------
|
||
|
||
To allow for the implementation of runtime debuggers that can update local
|
||
variable state, trace functions are required to write changes made to
|
||
``frame.f_locals`` back to the actual execution namespace.
|
||
|
||
This is not a problem for trace hooks executed at module or class scope, as
|
||
any changes made via ``frame.f_locals`` are made directly to the actual local
|
||
namespace used for code execution, and hence no special handling of trace hooks
|
||
is required.
|
||
|
||
At function scope, however, special trace hook handling is needed in order to
|
||
copy changes made through ``frame.f_locals`` back into the actual execution
|
||
state.
|
||
|
||
For Python versions up to and including Python 3.6, this worked as follows:
|
||
|
||
1. Before calling the trace hook, update ``frame.f_locals`` from the current
|
||
execution state
|
||
2. Run the trace hook
|
||
3. After the trace hook returns, update the current execution state from
|
||
``frame.f_locals``
|
||
|
||
Due to the problems this behaviour creates for closure references (as reported
|
||
in [1]_), this PEP proposes to amend this behaviour as follows:
|
||
|
||
1. Before calling the trace hook, update ``frame.f_locals`` from the current
|
||
execution state, but include the actual cell object for all closure
|
||
references, *not* the value referred to by the cell
|
||
2. Run the trace hook
|
||
3. After the trace hook returns:
|
||
|
||
* update the current execution state from ``frame.f_locals``, but leave
|
||
closure reference values unmodified if ``frame.f_locals`` still contains
|
||
the relevant cell object for that variable reference (and hence clearly
|
||
hasn't been modified by the trace function)
|
||
* after updating the execution state, replace the cells for closure references
|
||
in ``frame.f_locals`` with the values referenced by those cells (restoring
|
||
the expected behaviour of ``locals()`` at function scope)
|
||
|
||
|
||
Open Questions
|
||
==============
|
||
|
||
How much compatibility is enough compatibility?
|
||
-----------------------------------------------
|
||
|
||
As discussed below, the proposed design aims to keep almost all current code
|
||
working, *except* code that relies on being able to read the values of
|
||
closure references directly from ``frame.f_locals`` while a trace hook is
|
||
running.
|
||
|
||
This is considered reasonable, as trace hooks may use
|
||
``frame.f_code.co_freevars`` and ``frame.f_code.co_cellvars`` to identify
|
||
variables for which they need to read ``frame.f_locals[varname].cell_contents``
|
||
to get the actual current value, rather than the cell object.
|
||
|
||
|
||
Design Discussion
|
||
=================
|
||
|
||
Ensuring ``locals()`` returns a shared snapshot at function scope
|
||
-----------------------------------------------------------------
|
||
|
||
The ``locals()`` builtin is a required part of the language, and in the
|
||
reference implementation it has historically returned a mutable mapping with
|
||
the following characteristics:
|
||
|
||
* each call to ``locals()`` returns the *same* mapping
|
||
* each call to ``locals()`` updates the mapping with the current
|
||
state of the local variables and any referenced nonlocal cells
|
||
* changes to the returned mapping *usually* aren't written back to the
|
||
local variable bindings or the nonlocal cell references, but write backs
|
||
can be triggered by doing one of the following:
|
||
|
||
* installing a Python level trace hook (write backs then happen whenever
|
||
the trace hook is called)
|
||
* running a function level wildcard import (requires bytecode injection in Py3)
|
||
* running an ``exec`` statement in the function's scope (Py2 only, since
|
||
``exec`` became an ordinary builtin in Python 3)
|
||
|
||
The current proposal aims to retain the first two properties (to maintain
|
||
backwards compatibility with as much code as possible) while still
|
||
eliminating the ability to dynamically alter local and nonlocal variable
|
||
bindings through the mapping returned by ``locals()``.
|
||
|
||
|
||
What happens with the default args for ``eval()`` and ``exec()``?
|
||
-----------------------------------------------------------------
|
||
|
||
These are formally defined as inheriting ``globals()`` and ``locals()`` from
|
||
the calling scope by default.
|
||
|
||
There doesn't seem to be any reason for the PEP to change this.
|
||
|
||
|
||
Historical semantics at function scope
|
||
--------------------------------------
|
||
|
||
The current semantics of mutating ``locals()`` and ``frame.f_locals`` in CPython
|
||
are rather quirky due to historical implementation details:
|
||
|
||
* actual execution uses the fast locals array for local variable bindings and
|
||
cell references for nonlocal variables
|
||
* there's a ``PyFrame_FastToLocals`` operation that populates the frame's
|
||
``f_locals`` attribute based on the current state of the fast locals array
|
||
and any referenced cells. This exists for three reasons:
|
||
|
||
* allowing trace functions to read the state of local variables
|
||
* allowing traceback processors to read the state of local variables
|
||
* allowing locals() to read the state of local variables
|
||
* a direct reference to ``frame.f_locals`` is returned from ``locals()``, so if
|
||
you hand out multiple concurrent references, then all those references will be
|
||
to the exact same dictionary
|
||
* the two common calls to the reverse operation, ``PyFrame_LocalsToFast``, were
|
||
removed in the migration to Python 3: ``exec`` is no longer a statement (and
|
||
hence can no longer affect function local namespaces), and the compiler now
|
||
disallows the use of ``from module import *`` operations at function scope
|
||
* however, two obscure calling paths remain: ``PyFrame_LocalsToFast`` is called
|
||
as part of returning from a trace function (which allows debuggers to make
|
||
changes to the local variable state), and you can also still inject the
|
||
``IMPORT_STAR`` opcode when creating a function directly from a code object
|
||
rather than via the compiler
|
||
|
||
This proposal deliberately *doesn't* formalise these semantics as is, since they
|
||
only make sense in terms of the historical evolution of the language and the
|
||
reference implementation, rather than being deliberately designed.
|
||
|
||
|
||
Rejected Alternatives
|
||
=====================
|
||
|
||
Allowing local variable binding mutation outside trace functions
|
||
----------------------------------------------------------------
|
||
|
||
Earlier versions of this PEP allowed local variable bindings to be mutated
|
||
whenever code had access to the frame object - it didn't restrict that ability
|
||
to trace functions the way the status quo does.
|
||
|
||
This was considered undesirable, so the design was changed to retain the
|
||
characteristic where only trace hooks can mutate local variable bindings
|
||
from outside a function.
|
||
|
||
|
||
Making ``frame.f_locals`` a write-through proxy at function scope
|
||
-----------------------------------------------------------------
|
||
|
||
While frame objects and related APIs are an explicitly optional feature of
|
||
Python implementations, there are nevertheless a lot of debuggers and other
|
||
introspection tools that expect them to behave in certain ways, including the
|
||
ability to update the bindings of local variables and nonlocal cell references
|
||
by modifying ``frame.f_locals`` in a trace hook, as well as being able to store
|
||
custom keys in the local namespace for arbitrary frames and retrieve those
|
||
values later.
|
||
|
||
Rather than the proposed approach of temporarily injecting the closure cells
|
||
into ``frame.f_locals`` and using that to determine if a trace hook has
|
||
rebound a particular local variable reference, it would technically be
|
||
possible to devise a write-through proxy that *immediately* wrote local variable
|
||
rebindings back to the frame execution state, closer to the way things work
|
||
at module and class scope.
|
||
|
||
However, in addition to being more complex to implement, adopting such an
|
||
approach would *also* allow arbitrary changes to local variables in suspended
|
||
generators and coroutines, as well as potentially allowing other threads to
|
||
mutate a regular synchronous function's local variables while it was running.
|
||
|
||
While it does introduce some additional runtime overhead when calling trace
|
||
hooks in frames that provide or reference closure variables, the proposal in
|
||
the PEP more specifically targets the actual problem being solved (i.e. updates
|
||
to closure variable references being unexpectedly overwritten by the trace hook
|
||
machinery) while otherwise preserving the existing semantics of both
|
||
``locals()`` and ``frame.f_locals``.
|
||
|
||
|
||
Making ``locals()`` and ``frame.f_locals`` refer to different namespaces
|
||
------------------------------------------------------------------------
|
||
|
||
Rather than replacing closure references in ``frame.f_locals`` before and
|
||
after calling trace hooks, it would also be possible to persistently maintain
|
||
two different namespaces, one containing the cell objects, and one containing
|
||
the values they reference.
|
||
|
||
Similar to the write-through proxy idea, this has been rejected mainly on the
|
||
basis of it being a larger divergence from established semantics than is needed
|
||
to actually solve the problem with changes to closure variable references being
|
||
unexpectedly overwritten by the trace hook machinery.
|
||
|
||
|
||
Implementation
|
||
==============
|
||
|
||
The reference implementation update is TBD - when available, it will be linked
|
||
from [2]_.
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Broken local variable assignment given threads + trace hook + closure
|
||
(https://bugs.python.org/issue30744)
|
||
|
||
.. [2] Clarify the required behaviour of ``locals()``
|
||
(https://bugs.python.org/issue17960)
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|