diff --git a/pep-0667.rst b/pep-0667.rst new file mode 100644 index 000000000..ec3afc78f --- /dev/null +++ b/pep-0667.rst @@ -0,0 +1,437 @@ +PEP: 667 +Title: Consistent views of namespaces +Author: Mark Shannon +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 30-Jul-2021 +Post-History: 20-Aug-2021 + + +Abstract +======== + +In early versions of Python all namespaces, whether in functions, +classes or modules, were all implemented the same way: as a dictionary. + +For performance reasons, the implementation of function namespaces was +changed. Unfortunately this meant that accessing these namespaces through +``locals()`` and ``frame.f_locals`` ceased to be consistent and some +odd bugs crept in over the years as threads, generators and coroutines +were added. + +This PEP proposes making these namespaces consistent once more. +Modifications to ``frame.f_locals`` will always be visible in +the underlying variables. Modifications to local variables will +immediately be visible in ``frame.f_locals``, and they will be +consistent regardless of threading or coroutines. + +The ``locals()`` function will act the same as it does now for class +and modules scopes. For function scopes it will return an instantaneous +snapshot of the underlying ``frame.f_locals``. + +Motivation +========== + +The current implementation of ``locals()`` and ``frame.f_locals`` is slow, +inconsistent and buggy. +We want to make it faster, consistent, and most importantly fix the bugs. + +For example:: + + class C: + x = 1 + sys._getframe().f_locals['x'] = 2 + print(x) + +prints ``2`` + +but:: + + def f(): + x = 1 + sys._getframe().f_locals['x'] = 2 + print(x) + f() + +prints ``1`` + +This is inconsistent, and confusing. +With this PEP both examples would print ``2``. + +Worse than that, the current behavior can result in strange bugs [1]_ + +There are no compensating advantages for the current behavior; +it is unreliable and slow. + +Rationale +========= + +The current implementation of ``frame.f_locals`` returns a dictionary +that is created on the fly from the array of local variables. +This can result in the array and dictionary getting out of sync with +each other. Writes to the ``f_locals`` may not show up as +modifications to local variables. Writes to local variables can get lost. + +By making ``frame.f_locals`` return a view on the +underlying frame, these problems go away. ``frame.f_locals`` is always in +sync with the frame because it is a view of it, not a copy of it. + +Specification +============= + +Python +------ + +``frame.f_locals`` will return a view object on the frame that +implements the ``collections.abc.Mapping`` interface. + +For module and class scopes ``frame.f_locals`` will be a dictionary, +for function scopes it will be a custom class. + +``locals()`` will be defined as:: + + def locals(): + f_locals = sys._getframe(1).f_locals + if not isinstance(f_locals, dict): + f_locals = dict(f_locals) + return f_locals + +All writes to the ``f_locals`` mapping will be immediately visible +in the underlying variables. All changes to the underlying variables +will be immediately visible in the mapping. The ``f_locals`` object will +be a full mapping, and can have arbitrary key-value pairs added to it. + +For example:: + + def l(): + "Get the locals of caller" + return sys._getframe(1).f_locals + + def test(): + x = 1 + l()['x'] = 2 + l()['y'] = 4 + l()['z'] = 5 + y + print(locals(), x) + +``test()`` will print ``{'x': 2, 'y': 4, 'z': 5} 2`` + +In Python 3.10, the above will fail with a ``NameError``, +as the definition of ``y`` by ``l()['y'] = 4`` is lost. + +C-API +----- + +Extensions to the API +''''''''''''''''''''' + +Two new C-API functions will be added:: + + PyObject *PyEval_Locals(void) + PyObject *PyFrame_GetLocals(PyFrameObject *f) + +``PyEval_Locals()`` is equivalent to: ``locals()``. + +``PyFrame_GetLocals(f)`` is equivalent to: ``f.f_locals``. + +Both functions will return a new reference. + +Changes to existing APIs +'''''''''''''''''''''''' + +The existing C-API function ``PyEval_GetLocals()`` will always raise an +exception with a message like:: + + PyEval_GetLocals() is unsafe. Please use PyEval_Locals() instead. + +This is necessary as ``PyEval_GetLocals()`` +returns a borrowed reference which cannot be made safe. + +The following functions will be retained, but will become no-ops:: + + PyFrame_FastToLocalsWithError() + PyFrame_FastToLocals() + PyFrame_LocalsToFast() + +Behavior of f_locals for optimized functions +-------------------------------------------- + +Although ``f.f_locals`` behaves as if it were the namespace of the function, +there will be some observable differences. +For example, ``f.f_locals is f.f_locals`` may be ``False``. + +However ``f.f_locals == f.f_locals`` will be ``True``, and +all changes to the underlying variables, by any means, will be +always be visible. + +Backwards Compatibility +======================= + +Python +------ + +The current implementation has many corner cases and oddities. +Code that works around those may need to be changed. +Code that uses ``locals()`` for simple templating, or print debugging, +will continue to work correctly. Debuggers and other tools that use +``f_locals`` to modify local variables, will now work correctly, +even in the presence of threaded code, coroutines and generators. + +C-API +----- + +PyEval_GetLocals +'''''''''''''''' + +Code that uses ``PyEval_GetLocals()`` will continue to operate safely, but +will need to be changed to use ``PyEval_Locals()`` to restore functionality. + +This code:: + + locals = PyEval_GetLocals(); + if (locals == NULL) { + goto error_handler; + } + Py_INCREF(locals); + +should be replaced with:: + + locals = PyEval_Locals(); + if (locals == NULL) { + goto error_handler; + } + +PyFrame_FastToLocals, etc. +'''''''''''''''''''''''''' + +These functions were designed to convert the internal "fast" representation +of the locals variables of a function to a dictionary, and vice versa. + +Calls to them are no longer required. C code that directly accesses the ``f_locals`` +field of a frame should be modified to call ``PyFrame_GetLocals()`` instead:: + + PyFrame_FastToLocals(frame); + PyObject *locals = frame.f_locals; + Py_INCREF(locals); + +becomes:: + + PyObject *locals = PyFrame_GetLocals(frame); + if (frame == NULL) + goto error_handler; + +Implementation +============== + +Each read of ``frame.f_locals`` will create a new proxy object that gives +the appearance of being the mapping of local (including cell and free) +variable names to the values of those local variables. + +A possible implementation is sketched out below. +All attributes that start with an underscore are invisible and +cannot be accessed directly. +They serve only to illustrate the proposed design. + +:: + + NULL: Object # NULL is a singleton representing the absence of a value. + + class CodeType: + + _name_to_offset_mapping_impl: dict | NULL + _cells: frozenset # Set of indexes of cell and free variables + ... + + def __init__(self, ...): + self._name_to_offset_mapping_impl = NULL + ... + + @property + def _name_to_offset_mapping(self): + "Mapping of names to offsets in local variable array." + if self._name_to_offset_mapping_impl is NULL: + self._name_to_offset_mapping_impl = { + name: index for (index, name) in enumerate(self.co_varnames) + } + return self._name_to_offset_mapping_impl + + class FrameType: + + _locals : array[Object] # The values of the local variables, items may be NULL. + _extra_locals: dict | NULL # Dictionary for storing extra locals not in _locals. + + def __init__(self, ...): + self._extra_locals = NULL + ... + + @property + def f_locals(self): + return FrameLocalsProxy(self) + + class FrameLocalsProxy: + + __slots__ "_frame" + + def __init__(self, frame:FrameType): + self._frame = frame + + def __getitem__(self, name): + f = self._frame + co = f.f_code + if name in co._name_to_offset_mapping: + index = co._name_to_offset_mapping[name] + val = f._locals[index] + if val is NULL: + raise KeyError(name) + if index in co._cells + val = val.cell_contents + if val is NULL: + raise KeyError(name) + return val + else: + if f._extra_locals is NULL: + raise KeyError(name) + return f._extra_locals[name] + + def __setitem__(self, name, value): + f = self._frame + co = f.f_code + if name in co._name_to_offset_mapping: + index = co._name_to_offset_mapping[name] + kind = co._local_kinds[index] + if index in co._cells + cell = f._locals[index] + cell.cell_contents = val + else: + f._locals[index] = val + else: + if f._extra_locals is NULL: + f._extra_locals = {} + f._extra_locals[name] = val + + def __iter__(self): + f = self._frame + co = f.f_code + yield from iter(f._extra_locals) + for index, name in enumerate(co._varnames): + val = f._locals[index] + if val is NULL: + continue + if index in co._cells: + val = val.cell_contents + if val is NULL: + continue + yield name + + def pop(self): + f = self._frame + co = f.f_code + if f._extra_locals: + return f._extra_locals.pop() + for index, _ in enumerate(co._varnames): + val = f._locals[index] + if val is NULL: + continue + if index in co._cells: + cell = val + val = cell.cell_contents + if val is NULL: + continue + cell.cell_contents = NULL + else: + f._locals[index] = NULL + return val + + def __len__(self): + f = self._frame + co = f.f_code + res = 0 + for index, _ in enumerate(co._varnames): + val = f._locals[index] + if val is NULL: + continue + if index in co._cells: + if val.cell_contents is NULL: + continue + res += 1 + return len(self._extra_locals) + res + + +Comparison with PEP 558 +======================= + +This PEP and PEP 558 [2]_ share a common goal: +to make the semantics of ``locals()`` and ``frame.f_locals()`` +intelligible, and their operation reliable. + +In the author's opinion, PEP 558 fails to do that as it is too +complex, and has many corner cases which will lead to bugs. + +The key difference between this PEP and PEP 558 is that +PEP 558 requires an internal copy of the local variables, +whereas this PEP does not. +Maintaining a copy would add considerably to the complexity of both +the specification and implementation, and bring no real benefits. + +The semantics of ``frame.f_locals`` +----------------------------------- + +In this PEP, ``frame.f_locals`` is a view onto the underlying frame. +It is always synchronized with the underlying frame. +In PEP 558, there is an additional copy of the local variables present +in the frame which is updated whenever ``frame.f_locals`` is accessed. +PEP 558 does not make it clear whether calls to ``locals()`` +update ``frame.f_locals`` or not. + +For example consider:: + + def foo(): + x = sys._getframe().f_locals + y = locals() + print(tuple(x)) + print(tuple(y)) + +It is not clear from PEP 558 (at time of writing) what would be printed. +Does the call to ``locals()`` update ``x``? +Would ``"y"`` be present in either ``x`` or ``y``? + +With this PEP it should be clear that the above would print:: + + ('x', 'y') + ('x',) + +Open Issues +=========== + +An alternative way to define ``locals()`` would be simply as:: + + def locals(): + return sys._getframe(1).f_locals + +This would be simpler and easier to understand. However, +there would be backwards compatibility issues when ``locals`` is assigned +to a local variable or when passed to ``eval``. + +References +========== + +.. [1] https://bugs.python.org/issue30744 + +.. [2] https://www.python.org/dev/peps/pep-0558/ + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive. + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: