PEP: 649
|
|
Title: Deferred Evaluation Of Annotations Using Descriptors
|
|
Author: Larry Hastings <larry@hastings.org>
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 11-Jan-2021
|
|
Post-History: 11-Jan-2021, 11-Apr-2021
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
As of Python 3.9, Python supports two different behaviors
|
|
for annotations:
|
|
|
|
* original or "stock" Python semantics, in which annotations
|
|
are evaluated at the time they are bound, and
|
|
* :pep:`563` semantics, currently enabled per-module by
|
|
``from __future__ import annotations``, in which annotations
|
|
are converted back into strings and must be reparsed and
|
|
executed by ``eval()`` to be used.
|
|
|
|
Original Python semantics created a circular reference problem
|
|
for static typing analysis. :pep:`563` solved that problem--but
|
|
its novel semantics introduced new problems, including its
|
|
restriction that annotations can only reference names at
|
|
module-level scope.
|
|
|
|
This PEP proposes a third way that embodies the best of both
|
|
previous approaches. It solves the same circular reference
|
|
problems solved by :pep:`563`, while otherwise preserving Python's
|
|
original annotation semantics, including allowing annotations
|
|
to refer to local and class variables.
|
|
|
|
In this new approach, the code to generate the annotations
|
|
dict is written to its own function which computes and returns
|
|
the annotations dict. Then, ``__annotations__`` is a "data
|
|
descriptor" which calls this annotation function once and
|
|
retains the result. This delays the evaluation of annotations
|
|
expressions until the annotations are examined, at which point
|
|
all circular references have likely been resolved. And if
|
|
the annotations are never examined, the function is never
|
|
called and the annotations are never computed.
|
|
|
|
Annotations defined using this PEP's semantics have the same
|
|
visibility into the symbol table as annotations under "stock"
|
|
semantics--any name visible to an annotation in Python 3.9
|
|
is visible to an annotation under this PEP. In addition,
|
|
annotations under this PEP can refer to names defined *after*
|
|
the annotation is defined, as long as the name is defined in
|
|
a scope visible to the annotation. Specifically, when this PEP
|
|
is active:
|
|
|
|
* An annotation can refer to a local variable defined in the
|
|
current function scope.
|
|
* An annotation can refer to a local variable defined in an
|
|
enclosing function scope.
|
|
* An annotation can refer to a class variable defined in the
|
|
current class scope.
|
|
* An annotation can refer to a global variable.
|
|
|
|
And in all four of these cases, the variable referenced by
|
|
the annotation needn't be defined at the time the annotation
|
|
is defined--it can be defined afterwards. The only restriction
|
|
is that the name or variable be defined before the annotation
|
|
is *evaluated.*
|
|
|
|
If accepted, these new semantics for annotations would initially
|
|
be gated behind ``from __future__ import co_annotations``.
|
|
However, these semantics would eventually be promoted to be
|
|
Python's default behavior. Thus this PEP would *supersede*
|
|
:pep:`563`, and :pep:`563`'s behavior would be deprecated and
|
|
eventually removed.
|
|
|
|
Overview
|
|
========
|
|
|
|
.. note:: The code presented in this section is simplified
|
|
for clarity. The intention is to communicate the high-level
|
|
concepts involved without getting lost in the details.
|
|
The actual details are often quite different. See the
|
|
Implementation_ section later in this PEP for a much more
|
|
accurate description of how this PEP works.
|
|
|
|
Consider this example code:
|
|
|
|
.. code-block::

    def foo(x: int = 3, y: MyType = None) -> float:
        ...
    class MyType:
        ...
    foo_y_type = foo.__annotations__['y']
|
|
|
|
As we see here, annotations are available at runtime through an
|
|
``__annotations__`` attribute on functions, classes, and modules.
|
|
When annotations are specified on one of these objects,
|
|
``__annotations__`` is a dictionary mapping the names of the
|
|
fields to the value specified as that field's annotation.
|
|
|
|
The default behavior in Python 3.9 is to evaluate the expressions
|
|
for the annotations, and build the annotations dict, at the time
|
|
the function, class, or module is bound. At runtime the above
|
|
code actually works something like this:
|
|
|
|
.. code-block::

    annotations = {'x': int, 'y': MyType, 'return': float}
    def foo(x = 3, y = None):
        ...
    foo.__annotations__ = annotations
    class MyType:
        ...
    foo_y_type = foo.__annotations__['y']
|
|
|
|
The crucial detail here is that the values ``int``, ``MyType``,
|
|
and ``float`` are looked up at the time the function object is
|
|
bound, and these values are stored in the annotations dict.
|
|
But this code doesn't run—it throws a ``NameError`` on the first
|
|
line, because ``MyType`` hasn't been defined yet.
|
|
|
|
:pep:`563`'s solution is to decompile the expressions back
|
|
into strings, and store those *strings* in the annotations dict.
|
|
The equivalent runtime code would look something like this:
|
|
|
|
.. code-block::

    annotations = {'x': 'int', 'y': 'MyType', 'return': 'float'}
    def foo(x = 3, y = None):
        ...
    foo.__annotations__ = annotations
    class MyType:
        ...
    foo_y_type = foo.__annotations__['y']
|
|
|
|
This code now runs successfully. However, ``foo_y_type``
|
|
is no longer a reference to ``MyType``, it is the *string*
|
|
``'MyType'``. The code would have to be further modified to
|
|
call ``eval()`` or ``typing.get_type_hints()`` to convert
|
|
the string into a useful reference to the actual ``MyType``
|
|
object.
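
As a rough illustration of that extra step, the sketch below uses explicitly
quoted annotations to mimic what :pep:`563` stores. That is an assumption
made so the snippet runs without the ``__future__`` import, and the default
values from the example above are omitted for brevity. Either ``eval()`` or
``typing.get_type_hints()`` recovers the real object:

```python
import typing

# Quoted annotations stand in for what PEP 563 would store automatically.
def foo(x: 'int', y: 'MyType') -> 'float':
    ...

class MyType:
    ...

# The raw annotation is only a string...
raw = foo.__annotations__['y']

# ...so the consumer has to evaluate it, either by hand or via typing.
by_eval = eval(raw, foo.__globals__)
by_hints = typing.get_type_hints(foo)['y']
```

Both ``by_eval`` and ``by_hints`` refer to the ``MyType`` class, while ``raw``
is still the string ``'MyType'``.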
|
|
|
|
This PEP proposes a third approach, delaying the evaluation of
|
|
the annotations by computing them in their own function. If
|
|
this PEP were active, the generated code would work something
|
|
like this:
|
|
|
|
.. code-block::

    class function:
        # __annotations__ on a function object is already a
        # "data descriptor" in Python, we're just changing what it does
        @property
        def __annotations__(self):
            return self.__co_annotations__()

    # ...

    def foo_annotations_fn():
        return {'x': int, 'y': MyType, 'return': float}
    def foo(x = 3, y = None):
        ...
    foo.__co_annotations__ = foo_annotations_fn
    class MyType:
        ...
    foo_y_type = foo.__annotations__['y']
|
|
|
|
The important change is that the code constructing the
|
|
annotations dict now lives in a function—here, called
|
|
``foo_annotations_fn()``. But this function isn't called
|
|
until we ask for the value of ``foo.__annotations__``,
|
|
and we don't do that until *after* the definition of ``MyType``.
|
|
So this code also runs successfully, and ``foo_y_type`` now
|
|
has the correct value--the class ``MyType``--even though
|
|
``MyType`` wasn't defined until *after* the annotation was
|
|
defined.
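
The same idea can be tried in today's Python with an ordinary class. The
following is only a sketch: ``LazyFunction`` and its ``annotations`` property
are hypothetical stand-ins for the function type and ``__annotations__``, not
part of the proposal. It also caches the result, matching the "called once"
behavior described in the Abstract:

```python
class LazyFunction:
    """Hypothetical stand-in for a function object (not CPython's real type)."""

    def __init__(self, co_annotations):
        self._co_annotations = co_annotations  # the annotation function
        self._cache = None                     # filled in on first access

    @property
    def annotations(self):
        # Call the annotation function at most once and retain the result.
        if self._cache is None:
            self._cache = self._co_annotations()
        return self._cache


def foo_annotations_fn():
    # MyType is looked up only when this function actually runs.
    return {'x': int, 'y': MyType, 'return': float}

foo = LazyFunction(foo_annotations_fn)

class MyType:
    ...

foo_y_type = foo.annotations['y']
```

Here ``foo_y_type`` is the ``MyType`` class itself, even though ``MyType`` is
defined after ``foo``.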
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
Python's original semantics for annotations made its use for
|
|
static type analysis painful due to forward reference problems.
|
|
This was the main justification for :pep:`563`, and we need not
|
|
revisit those arguments here.
|
|
|
|
However, :pep:`563`'s solution was to decompile code for Python
|
|
annotations back into strings at compile time, requiring
|
|
users of annotations to ``eval()`` those strings to restore
|
|
them to their actual Python values. This has several drawbacks:
|
|
|
|
* It requires Python implementations to stringize their
|
|
annotations. This is surprising behavior—unprecedented
|
|
for a language-level feature. Also, adding this feature
|
|
to CPython was complicated, and this complicated code would
|
|
need to be reimplemented independently by every other Python
|
|
implementation.
|
|
* It requires that all annotations be evaluated at module-level
|
|
scope. Annotations under :pep:`563` can no longer refer to
|
|
* class variables,
|
|
* local variables in the current function, or
|
|
* local variables in enclosing functions.
|
|
* It requires a code change every time existing code uses an
|
|
annotation, to handle converting the stringized
|
|
annotation back into a useful value.
|
|
* ``eval()`` is slow.
|
|
* ``eval()`` isn't always available; it's sometimes removed
|
|
from Python for space reasons.
|
|
* In order to evaluate the annotations on a class,
|
|
it requires obtaining a reference to that class's globals,
|
|
which :pep:`563` suggests should be done by looking up that class
|
|
by name in ``sys.modules``—another surprising requirement for
|
|
a language-level feature.
|
|
* It adds an ongoing maintenance burden to Python implementations.
|
|
Every time the language adds a new feature available in expressions,
|
|
the implementation's stringizing code must be updated in
|
|
tandem in order to support decompiling it.
|
|
|
|
This PEP also solves the forward reference problem outlined in
|
|
:pep:`563` while avoiding the problems listed above:
|
|
|
|
* Python implementations would generate annotations as code
|
|
objects. This is simpler than stringizing, and is something
|
|
Python implementations are already quite good at. This means:
|
|
- alternate implementations would need to write less code to
|
|
implement this feature, and
|
|
- the implementation would be simpler overall, which should
|
|
reduce its ongoing maintenance cost.
|
|
* Existing annotations would not need to be changed to only
|
|
use global scope. Actually, annotations would become much
|
|
easier to use, as they would now also handle forward
|
|
references.
|
|
* Code examining annotations at runtime would no longer need
|
|
to use ``eval()`` or anything else—it would automatically
|
|
see the correct values. This is easier, faster, and
|
|
removes the dependency on ``eval()``.
|
|
|
|
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
:pep:`563` changed the semantics of annotations. When its semantics
|
|
are active, annotations must assume they will be evaluated in
|
|
*module-level* scope. They may no longer refer directly
|
|
to local variables or class attributes.
|
|
|
|
This PEP removes that restriction; annotations may refer to globals,
|
|
local variables inside functions, local variables defined in enclosing
|
|
functions, and class members in the current class. In addition,
|
|
annotations may refer to any of these that haven't been defined yet
|
|
at the time the annotation is defined, as long as the not-yet-defined
|
|
name is created normally (in such a way that it is known to the symbol
|
|
table for the relevant block, or is a global or class variable found
|
|
using normal name resolution). Thus, this PEP demonstrates *improved*
|
|
backwards compatibility over :pep:`563`.
|
|
|
|
:pep:`563` also requires using ``eval()`` or ``typing.get_type_hints()``
|
|
to examine annotations. Code updated to work with :pep:`563` that calls
|
|
``eval()`` directly would have to be updated simply to remove the
|
|
``eval()`` call. Code using ``typing.get_type_hints()`` would
|
|
continue to work unchanged, though future use of that function
|
|
would become optional in most cases.
|
|
|
|
Because this PEP makes semantic changes to how annotations are
|
|
evaluated, this PEP will be initially gated with a per-module
|
|
``from __future__ import co_annotations`` before it eventually
|
|
becomes the default behavior.
|
|
|
|
Apart from the delay in evaluating values stored in annotations
|
|
dicts, this PEP preserves nearly all existing behavior of
|
|
annotations dicts. Specifically:
|
|
|
|
* Annotations dicts are mutable, and any changes to them are
|
|
preserved.
|
|
* The ``__annotations__`` attribute can be explicitly set,
|
|
and any value set this way will be preserved.
|
|
* The ``__annotations__`` attribute can be deleted using
|
|
the ``del`` statement.
|
|
|
|
However, there are two uncommon interactions possible with class
|
|
and module annotations that work today—both with stock semantics,
|
|
and with :pep:`563` semantics—that would no longer work when this PEP
|
|
is active. These two interactions would have to be prohibited.
|
|
The good news is, neither is common, and neither is considered good
|
|
practice. In fact, they're rarely seen outside of Python's own
|
|
regression test suite. They are:
|
|
|
|
* *Code that sets annotations on module or class attributes
|
|
from inside any kind of flow control statement.* It's
|
|
currently possible to set module and class attributes with
|
|
annotations inside an ``if`` or ``try`` statement, and it works
|
|
as one would expect. It's untenable to support this behavior
|
|
when this PEP is active.
|
|
* *Code in module or class scope that references or modifies the
|
|
local* ``__annotations__`` *dict directly.* Currently, when
|
|
setting annotations on module or class attributes, the generated
|
|
code simply creates a local ``__annotations__`` dict, then sets
|
|
mappings in it as needed. It's also possible for user code
|
|
to directly modify this dict, though this doesn't seem like it's
|
|
an intentional feature. Although it would be possible to support
|
|
this after a fashion when this PEP was active, the semantics
|
|
would likely be surprising and wouldn't make anyone happy.
|
|
|
|
Note that these are both also pain points for static type checkers,
|
|
and are unsupported by those checkers. It seems reasonable to
|
|
declare that both are at the very least unsupported, and their
|
|
use results in undefined behavior. It might be worth making a
|
|
small effort to explicitly prohibit them with compile-time checks.
|
|
|
|
In addition, there are a few operators that would no longer be
|
|
valid for use in annotations, because their side effects would
|
|
affect the *annotation function* instead of the
|
|
class/function/module the annotation was nominally defined in:
|
|
|
|
* ``:=`` (aka the "walrus operator"),
|
|
* ``yield`` and ``yield from``, and
|
|
* ``await``.
|
|
|
|
Use of any of these operators in an annotation will result in a
|
|
compile-time error.
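
To see why such side effects would land in the wrong place, consider this
minimal sketch. It uses an ordinary ``lambda`` as a stand-in for the
annotation function, and the names ``annotation_fn`` and ``tmp`` are purely
illustrative. The name bound by ``:=`` stays local to the function that
computes the dict and never reaches the enclosing scope:

```python
# A lambda plays the role of the compiler-generated annotation function.
annotation_fn = lambda: {'a': (tmp := int)}

result = annotation_fn()

# The walrus target was bound inside the lambda, not in this module.
leaked = 'tmp' in globals()
```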
|
|
|
|
Since delaying the evaluation of annotations until they are
|
|
accessed changes the semantics of the language, it's observable
|
|
from within the language. Therefore it's possible to write code
|
|
that behaves differently based on whether annotations are
|
|
evaluated at binding time or at access time, e.g.
|
|
|
|
.. code-block::

    mytype = str
    def foo(a:mytype): pass
    mytype = int
    print(foo.__annotations__['a'])
|
|
|
|
This will print ``<class 'str'>`` with stock semantics
|
|
and ``<class 'int'>`` when this PEP is active. Since
|
|
this is poor programming style to begin with, it seems
|
|
acceptable that this PEP changes its behavior.
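
The difference can be reproduced without any special interpreter support.
In the sketch below, a plain dict models binding-time evaluation and a
``lambda`` models access-time evaluation; both are illustrative stand-ins
rather than what the compiler would actually emit:

```python
mytype = str

# Stock semantics: the dict is built immediately, capturing the current value.
eager_annotations = {'a': mytype}

# This PEP: the dict is built by a function that runs only on first access.
lazy_annotations_fn = lambda: {'a': mytype}

mytype = int

stock_result = eager_annotations['a']
pep_result = lazy_annotations_fn()['a']
```

``stock_result`` is ``str`` and ``pep_result`` is ``int``, mirroring the two
outputs described above.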
|
|
|
|
Finally, there's a standard idiom that's actually somewhat common
|
|
when accessing class annotations, and which will become more
|
|
problematic when this PEP is active: code often accesses class
|
|
annotations via ``cls.__dict__.get("__annotations__", {})``
|
|
rather than simply ``cls.__annotations__``. It's due to a flaw
|
|
in the original design of annotations themselves. This topic
|
|
will be examined in a separate discussion; the outcome of
|
|
that discussion will likely guide the future evolution of this
|
|
PEP.
|
|
|
|
|
|
Mistaken Rejection Of This Approach In November 2017
|
|
====================================================
|
|
|
|
During the early days of discussion around :pep:`563`,
|
|
using code to delay the evaluation of annotations was
|
|
briefly discussed, in a November 2017 thread in
|
|
the ``python-dev`` mailing list. At the time the
|
|
technique was termed an "implicit lambda expression".
|
|
|
|
Guido van Rossum—Python's BDFL at the time—replied,
|
|
asserting that these "implicit lambda expressions" wouldn't
|
|
work, because they'd only be able to resolve symbols at
|
|
module-level scope:
|
|
|
|
    IMO the inability of referencing class-level definitions
    from annotations on methods pretty much kills this idea.
|
|
|
|
https://mail.python.org/pipermail/python-dev/2017-November/150109.html
|
|
|
|
This led to a short discussion about extending lambda-ized
|
|
annotations for methods to be able to refer to class-level
|
|
definitions, by maintaining a reference to the class-level
|
|
scope. This idea, too, was quickly rejected.
|
|
|
|
:pep:`PEP 563 summarizes the above discussion
|
|
<563#keeping-the-ability-to-use-function-local-state-when-defining-annotations>`
|
|
|
|
What's puzzling is :pep:`563`'s own changes to the scoping rules
|
|
of annotations—it *also* doesn't permit annotations to reference
|
|
class-level definitions. It's not immediately clear why an
|
|
inability to reference class-level definitions was enough to
|
|
reject using "implicit lambda expressions" for annotations,
|
|
but was acceptable for stringized annotations.
|
|
|
|
In retrospect there was probably a pivot during the development
|
|
of :pep:`563`. It seems that, early on, there was a prevailing
|
|
assumption that :pep:`563` would support references to class-level
|
|
definitions. But by the time :pep:`563` was finalized, this
|
|
assumption had apparently been abandoned. And it looks like
|
|
"implicit lambda expressions" were never reconsidered in this
|
|
new light.
|
|
|
|
In any case, annotations are still able to refer to class-level
|
|
definitions under this PEP, rendering the objection moot.
|
|
|
|
.. _Implementation:
|
|
|
|
Implementation
|
|
==============
|
|
|
|
There's a prototype implementation of this PEP, here:
|
|
|
|
https://github.com/larryhastings/co_annotations/
|
|
|
|
As of this writing, all features described in this PEP are
|
|
implemented, and there are some rudimentary tests in the
|
|
test suite. There are still some broken tests, and the
|
|
``co_annotations`` repo is many months behind the
|
|
CPython repo.
|
|
|
|
|
|
from __future__ import co_annotations
|
|
-------------------------------------
|
|
|
|
In the prototype, the semantics presented in this PEP are gated with:
|
|
|
|
.. code-block::

    from __future__ import co_annotations
|
|
|
|
|
|
|
|
__co_annotations__
|
|
------------------
|
|
|
|
Python supports runtime metadata for annotations for three different
|
|
types: functions, classes, and modules. The basic approach to
|
|
implement this PEP is much the same for all three with only minor
|
|
variations.
|
|
|
|
With this PEP, each of these types adds a new attribute,
|
|
``__co_annotations__``. ``__co_annotations__`` is a function:
|
|
it takes no arguments, and must return either ``None`` or a dict
|
|
(or subclass of dict). It adds the following semantics:
|
|
|
|
* ``__co_annotations__`` is always set, and may contain either
|
|
``None`` or a callable.
|
|
* ``__co_annotations__`` cannot be deleted.
|
|
* ``__annotations__`` and ``__co_annotations__`` can't both
|
|
be set to a useful value simultaneously:
|
|
|
|
- If you set ``__annotations__`` to a dict, this also sets
|
|
``__co_annotations__`` to None.
|
|
- If you set ``__co_annotations__`` to a callable, this also
|
|
deletes ``__annotations__``.
|
|
|
|
Internally, ``__co_annotations__`` is a "data descriptor",
|
|
where functions are called whenever user code gets, sets,
|
|
or deletes the attribute. In all three cases, the object
|
|
has separate internal storage for the current value
|
|
of the ``__co_annotations__`` attribute.
|
|
|
|
``__annotations__`` is also a data descriptor, with its own
|
|
separate internal storage for its internal value. The code
|
|
implementing the "get" for ``__annotations__`` works something
|
|
like this:
|
|
|
|
.. code-block::

    if (the internal value is set):
        return the internal annotations dict
    if (__co_annotations__ is not None):
        call the __co_annotations__ function
        if the result is a dict:
            store the result as the internal value
            set __co_annotations__ to None
            return the internal value
    do whatever this object does when there are no annotations
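
The pseudocode above, together with the rules linking the two attributes,
can be approximated in pure Python. This is a minimal sketch under stated
assumptions: the class ``Annotated`` and the attribute names ``annotations``
and ``co_annotations`` are hypothetical, and the "no annotations" fallback
simply returns an empty dict, as a function object would:

```python
class Annotated:
    """Hypothetical object with linked annotations / co_annotations storage."""

    def __init__(self):
        self._annotations = None      # internal value of the annotations dict
        self._co_annotations = None   # internal value of the annotation function

    @property
    def co_annotations(self):
        return self._co_annotations

    @co_annotations.setter
    def co_annotations(self, fn):
        if fn is not None and not callable(fn):
            raise TypeError("co_annotations must be None or a callable")
        self._co_annotations = fn
        if fn is not None:
            # Setting a callable discards any stored annotations dict.
            self._annotations = None

    @property
    def annotations(self):
        if self._annotations is not None:
            return self._annotations
        if self._co_annotations is not None:
            result = self._co_annotations()
            if isinstance(result, dict):
                self._annotations = result
                self._co_annotations = None
                return self._annotations
        # Fallback chosen for this sketch: behave like a function object
        # and produce an empty dict when there are no annotations.
        self._annotations = {}
        return self._annotations

    @annotations.setter
    def annotations(self, value):
        # Setting the dict directly clears the annotation function.
        self._annotations = value
        self._co_annotations = None


calls = []

def annotation_fn():
    calls.append(1)
    return {'x': int}

obj = Annotated()
obj.co_annotations = annotation_fn
first = obj.annotations
second = obj.annotations
```

The annotation function runs once; the second access returns the stored
dict, and ``obj.co_annotations`` has been reset to ``None``.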
|
|
|
|
|
|
Unbound code objects
|
|
--------------------
|
|
|
|
When Python code defines one of these three objects with
|
|
annotations, the Python compiler generates a separate code
|
|
object which builds and returns the appropriate annotations
|
|
dict. Wherever possible, the "annotation code object" is
|
|
then stored *unbound* as the internal value of
|
|
``__co_annotations__``; it is then bound on demand when
|
|
the user asks for ``__annotations__``.
|
|
|
|
This is a useful optimization for both speed and memory
|
|
consumption. Python processes rarely examine annotations
|
|
at runtime. Therefore, pre-binding these code objects to
|
|
function objects would usually be a waste of resources.
|
|
|
|
When is this optimization not possible?
|
|
|
|
* When an annotation function contains references to
|
|
free variables, in the current function or in an
|
|
outer function.
|
|
* When an annotation function is defined on a method
|
|
(a function defined inside a class) and the annotations
|
|
possibly refer directly to class variables.
|
|
|
|
Note that user code isn't permitted to directly access these
|
|
unbound code objects. If the user "gets" the value of
|
|
``__co_annotations__``, and the internal value of
|
|
``__co_annotations__`` is an unbound code object,
|
|
it immediately binds the code object, and the resulting
|
|
function object is stored as the new value of
|
|
``__co_annotations__`` and returned.
|
|
|
|
(However, these unbound code objects *are* stored in the
|
|
``.pyc`` file. So a determined user could examine them
|
|
should that be necessary for some reason.)
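
Binding a code object on demand can be imitated from Python with
``types.FunctionType``. The sketch below is an analogy only: the helper
``bind_on_demand`` is hypothetical, and the prototype performs the
equivalent step in C rather than through this API:

```python
import types

def _annotations_template():
    return {'x': int, 'return': float}

# An "unbound" code object: no globals or closure attached yet.
annotation_code = _annotations_template.__code__

def bind_on_demand(code, globals_dict):
    """Hypothetical helper: wrap a code object in a function when needed."""
    return types.FunctionType(code, globals_dict)

bound = bind_on_demand(annotation_code, globals())
annotations = bound()
```

This only works because the code object has no free variables. A code object
with free variables would also need a closure supplied at binding time, which
is consistent with the first exception listed above.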
|
|
|
|
|
|
|
|
|
|
Function Annotations
|
|
--------------------
|
|
|
|
When compiling a function, the CPython bytecode compiler
|
|
visits the annotations for the function all in one place,
|
|
starting with ``compiler_visit_annotations()`` in ``compile.c``.
|
|
If there are any annotations, they create the scope for
|
|
the annotations function on demand, and
|
|
``compiler_visit_annotations()`` assembles it.
|
|
|
|
The code object is passed in place of the annotations dict
|
|
for the ``MAKE_FUNCTION`` bytecode instruction.
|
|
``MAKE_FUNCTION`` supports a new bit in its oparg
|
|
bitfield, ``0x10``, which tells it to expect a
|
|
``co_annotations`` code object on the stack.
|
|
The bitfields for ``annotations`` (``0x04``) and
|
|
``co_annotations`` (``0x10``) are mutually exclusive.
|
|
|
|
When binding an unbound annotation code object, a function will
|
|
use its own ``__globals__`` as the new function's globals.
|
|
|
|
One quirk of Python: you can't actually remove the annotations
|
|
from a function object. If you delete the ``__annotations__``
|
|
attribute of a function, then get its ``__annotations__`` member,
|
|
it will create an empty dict and use that as its
|
|
``__annotations__``. The implementation of this PEP maintains
|
|
this quirk for backwards compatibility.
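
The quirk is easy to observe with an ordinary function on current CPython.
The names ``before`` and ``after`` below are just for illustration:

```python
def foo(x: int) -> float:
    ...

# Take a copy of the annotations before deleting the attribute.
before = dict(foo.__annotations__)

del foo.__annotations__

# Reading the attribute again recreates it as an empty dict.
after = foo.__annotations__
```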
|
|
|
|
|
|
Class Annotations
|
|
-----------------
|
|
|
|
When compiling a class body, the compiler maintains two scopes:
|
|
one for the normal class body code, and one for annotations.
|
|
(This is facilitated by four new functions: ``compile.c``
|
|
adds ``compiler_push_scope()`` and ``compiler_pop_scope()``,
|
|
and ``symtable.c`` adds ``symtable_push_scope()`` and
|
|
``symtable_pop_scope()``.)
|
|
Once the code generator reaches the end of the class body,
|
|
but before it generates the bytecode for the class body,
|
|
it assembles the bytecode for ``__co_annotations__``, then
|
|
assigns that to ``__co_annotations__`` using ``STORE_NAME``.
|
|
|
|
It also sets a new ``__globals__`` attribute. Currently it
|
|
does this by calling ``globals()`` and storing the result.
|
|
(Surely there's a more elegant way to find the class's
|
|
globals--but this was good enough for the prototype.) When
|
|
binding an unbound annotation code object, a class will use
|
|
the value of this ``__globals__`` attribute. When the class
|
|
drops its reference to the unbound code object--either because
|
|
it has bound it to a function, or because ``__annotations__``
|
|
has been explicitly set--it also deletes its ``__globals__``
|
|
attribute.
|
|
|
|
As discussed above, examination or modification of
|
|
``__annotations__`` from within the class body is no
|
|
longer supported. Also, any flow control (``if`` or ``try`` blocks)
|
|
around declarations of members with annotations is unsupported.
|
|
|
|
If you delete the ``__annotations__`` attribute of a class,
|
|
then get its ``__annotations__`` member, it will return the
|
|
annotations dict of the first base class with annotations set.
|
|
If no base classes have annotations set, it will raise
|
|
``AttributeError``.
|
|
|
|
Although it's an implementation-specific detail, currently
|
|
classes store the internal value of ``__co_annotations__``
|
|
in their ``tp_dict`` under the same name.
|
|
|
|
|
|
Module Annotations
|
|
------------------
|
|
|
|
Module annotations work much the same as class annotations.
|
|
The main difference is, a module uses its own dict as the
|
|
``__globals__`` when binding the function.
|
|
|
|
If you delete the ``__annotations__`` attribute of a module,
|
|
then get its ``__annotations__`` member, the module will
|
|
raise ``AttributeError``.
|
|
|
|
Annotations With Closures
|
|
-------------------------
|
|
|
|
It's possible to write annotations that refer to
|
|
free variables, and even free variables that have yet
|
|
to be defined. For example:
|
|
|
|
.. code-block::

    from __future__ import co_annotations

    def outer():
        def middle():
            def inner(a:mytype, b:mytype2): pass
            mytype = str
            return inner
        mytype2 = int
        return middle()

    fn = outer()
    print(fn.__annotations__)
|
|
|
|
At the time ``fn`` is set, ``inner.__co_annotations__()``
|
|
hasn't been run. So it has to retain a reference to
|
|
the *future* definitions of ``mytype`` and ``mytype2`` if
|
|
it is to correctly evaluate its annotations.
|
|
|
|
If an annotation function refers to a local variable
|
|
from the current function scope, or a free variable
|
|
from an enclosing function scope--if, in CPython, the
|
|
annotation function code object contains one or more
|
|
``LOAD_DEREF`` opcodes--then the annotation code object
|
|
is bound at definition time with references to these
|
|
variables. ``LOAD_DEREF`` instructions require the annotation
|
|
function to be bound with special run-time information
|
|
(in CPython, a ``freevars`` array). Rather than store
|
|
that separately and use that to later lazy-bind the
|
|
function object, the current implementation simply
|
|
early-binds the function object.
|
|
|
|
Note that, since the annotation function ``inner.__co_annotations__()``
|
|
is defined while parsing ``outer()``, from Python's perspective
|
|
the annotation function is a "nested function". So "local
|
|
variable inside the 'current' function" and "free variable
|
|
from an enclosing function" are, from the perspective of
|
|
the annotation function, the same thing.
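
Ordinary nested functions already behave this way, which makes the mechanism
easy to try out. In this sketch the explicitly written
``inner_annotations()`` is a hypothetical stand-in for the
compiler-generated ``inner.__co_annotations__()``:

```python
def outer():
    def middle():
        # Stand-in for inner.__co_annotations__; "inner" itself is omitted.
        def inner_annotations():
            return {'a': mytype, 'b': mytype2}
        mytype = str          # assigned after the annotation function exists
        return inner_annotations
    mytype2 = int             # likewise assigned later, in the outer scope
    return middle()

annotation_fn = outer()
free_names = set(annotation_fn.__code__.co_freevars)
annotations = annotation_fn()
```

Both ``mytype`` and ``mytype2`` appear in ``co_freevars``, and calling the
function after ``outer()`` has returned still resolves them to ``str`` and
``int``.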
|
|
|
|
|
|
Annotations That Refer To Class Variables
|
|
-----------------------------------------
|
|
|
|
It's possible to write annotations that refer to
|
|
class variables, and even class variables that haven't
|
|
yet been defined. For example:
|
|
|
|
.. code-block::

    from __future__ import co_annotations

    class C:
        def method(a:mytype): pass
        mytype = str

    print(C.method.__annotations__)
|
|
|
|
Internally, annotation functions are defined as
|
|
a new type of "block" in CPython's symbol table
|
|
called an ``AnnotationBlock``. An ``AnnotationBlock``
|
|
is almost identical to a ``FunctionBlock``. It differs
|
|
in that it's permitted to see names from an enclosing
|
|
class scope. (Again: annotation functions are functions,
|
|
and they're defined *inside* the same scope as
|
|
the thing they're being defined on. So in the above
|
|
example, the annotation function for ``C.method()``
|
|
is defined inside ``C``.)
|
|
|
|
If it's possible that an annotation function refers
|
|
to class variables--if all these conditions are true:
|
|
|
|
* The annotation function is being defined inside
|
|
a class scope.
|
|
* The generated code for the annotation function
|
|
has at least one ``LOAD_NAME`` instruction.
|
|
|
|
Then the annotation function is bound at the time
|
|
it's set on the class/function, and this binding
|
|
includes a reference to the class dict. The class
|
|
dict is pushed on the stack, and the ``MAKE_FUNCTION``
|
|
bytecode instruction takes a new second bitfield (0x20)
|
|
indicating that it should consume that stack argument
|
|
and store it as ``__locals__`` on the newly created
|
|
function object.
|
|
|
|
Then, at the time the function is executed, the
|
|
``f_locals`` field of the frame object is set to
|
|
the function's ``__locals__``, if set. This permits
|
|
``LOAD_NAME`` opcodes to work normally, which means
|
|
the code generated for annotation functions is nearly
|
|
identical to that generated for conventional Python
|
|
functions.
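
The effect of running code with a class namespace as its locals can be
approximated using ``compile()`` and ``eval()``. This is an analogy rather
than the prototype's mechanism: the expression string and the variable names
are illustrative, and a copy of the class namespace stands in for the
proposed ``__locals__`` attribute:

```python
class C:
    mytype = str

# Compiled in "eval" mode, so name lookups consult the locals mapping first.
annotation_code = compile("{'a': mytype}", "<annotations>", "eval")

# Supplying a copy of the class namespace as locals stands in for __locals__.
annotations = eval(annotation_code, {}, dict(vars(C)))
```

The name ``mytype`` is found in the supplied locals mapping, so the result
maps ``'a'`` to ``str``.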
|
|
|
|
|
|
Interactive REPL Shell
|
|
----------------------
|
|
|
|
Everything works the same inside Python's interactive REPL shell,
|
|
except for module annotations in the interactive module (``__main__``)
|
|
itself. Since that module is never "finished", there's no specific
|
|
point where we can compile the ``__co_annotations__`` function.
|
|
|
|
For the sake of simplicity, in this case we forgo delayed evaluation.
|
|
Module-level annotations in the REPL shell will continue to work
|
|
exactly as they do today, evaluating immediately and setting the
|
|
result directly inside the ``__annotations__`` dict.
|
|
|
|
(It might be possible to support delayed evaluation here.
|
|
But it gets complicated quickly, and for a nearly-non-existent
|
|
use case.)
|
|
|
|
|
|
Annotations On Local Variables Inside Functions
|
|
-----------------------------------------------
|
|
|
|
Python supports syntax for local variable annotations inside
|
|
functions. However, these annotations have no runtime
|
|
effect--they're discarded at compile-time. Therefore, this
|
|
PEP doesn't need to do anything to support them, the same
|
|
as stock semantics and :pep:`563`.
|
|
|
|
|
|
|
|
Performance Comparison
|
|
----------------------
|
|
|
|
Performance with this PEP should be favorable, when compared with either
|
|
stock behavior or :pep:`563`. In general, resources are only consumed
|
|
on demand—"you only pay for what you use".
|
|
|
|
There are three scenarios to consider:
|
|
|
|
* the runtime cost when annotations aren't defined,
|
|
* the runtime cost when annotations are defined but *not* referenced, and
|
|
* the runtime cost when annotations are defined *and* referenced.
|
|
|
|
We'll examine each of these scenarios in the context of all three
|
|
semantics for annotations: stock, :pep:`563`, and this PEP.
|
|
|
|
When there are no annotations, all three semantics have the same
|
|
runtime cost: zero. No annotations dict is created and no code is
|
|
generated for it. This requires no runtime processor time and
|
|
consumes no memory.
|
|
|
|
When annotations are defined but not referenced, the runtime cost
of Python with this PEP should be roughly equal to or slightly better
than :pep:`563` semantics, and slightly better than "stock" Python
semantics. The specifics depend on the object being annotated:

* With stock semantics, the annotations dict is always built, and
  set as an attribute of the object being annotated.
* In :pep:`563` semantics, for function objects, a single constant
  (a tuple) is set as an attribute of the function. For class and
  module objects, the annotations dict is always built and set as
  an attribute of the class or module.
* With this PEP, a single object is set as an attribute of the
  object being annotated. Most often, this object is a constant
  (a code object). In cases where the annotation refers to local
  variables or class variables, the code object will be bound to
  a function object, and the function object is set as the attribute
  of the object being annotated.

When annotations are both defined and referenced, code using
this PEP should be much faster than code using :pep:`563` semantics,
and equivalent to or slightly improved over original Python
semantics. :pep:`563` semantics requires invoking ``eval()`` for
every value inside an annotations dict, which is enormously slow.
And, as already mentioned, this PEP generates measurably more
efficient bytecode for class and module annotations than stock
semantics; for function annotations, this PEP and stock semantics
should be roughly equivalent.

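
To make the extra ``eval()`` step concrete, here is a small sketch.
It compiles a snippet with the compiler flag that
``from __future__ import annotations`` enables, so it can run without
placing the future import at the top of a file. The function ``scale``
and the filename ``<pep563-sketch>`` are made up for the example.

.. code-block:: python

    import __future__
    import typing

    source = (
        "def scale(value: int, factor: float) -> float:\n"
        "    return value * factor\n"
    )

    # Compile with the same flag that the future import turns on.
    code = compile(source, "<pep563-sketch>", "exec",
                   flags=__future__.annotations.compiler_flag)
    namespace = {}
    exec(code, namespace)
    scale = namespace["scale"]

    # Under PEP 563 the stored values are plain source strings.
    stringized = scale.__annotations__

    # Turning them back into objects means evaluating every string.
    evaluated = typing.get_type_hints(scale)

``stringized`` maps each name to a string such as ``'int'``, while
``evaluated`` maps the same names to the actual ``int`` and ``float``
objects. That conversion is the cost this section refers to.
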
Memory use should also be comparable in all three scenarios across
all three semantic contexts. In the first and third scenarios,
memory usage should be roughly equivalent in all cases.
In the second scenario, when annotations are defined but not
referenced, using this PEP's semantics will mean the
function/class/module will store one unused code object (possibly
bound to an unused function object); with the other two semantics,
they'll store one unused dictionary (or constant tuple).

Bytecode Comparison
-------------------

The bytecode generated for annotations functions with
this PEP uses the efficient ``BUILD_CONST_KEY_MAP`` opcode
to build the dict for all annotatable objects:
functions, classes, and modules.

Stock semantics also uses ``BUILD_CONST_KEY_MAP`` bytecode
for function annotations. :pep:`563` has an even more efficient
method for building annotations dicts on functions, leveraging
the fact that its annotations dicts only contain strings for
both keys and values. At compile-time it constructs a tuple
containing pairs of keys and values, then
at runtime it converts that tuple into a dict on demand.
This is a faster technique than either stock semantics
or this PEP can employ, because in those two cases
annotations dicts can contain Python values of any type.
Of course, this performance win is negated if the
annotations are examined, due to the overhead of ``eval()``.

For class and module annotations, both stock semantics
and :pep:`563` generate a longer and slightly-less-efficient
stanza of bytecode, creating the dict and setting the
annotations individually.


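
Readers who want to inspect the bytecode their own interpreter
generates for annotations can use the standard ``dis`` module. The
sketch below is only an inspection aid. The function ``area`` is
invented for the example, and the opcodes that appear (including
whether ``BUILD_CONST_KEY_MAP`` shows up at all) depend on the Python
version in use.

.. code-block:: python

    import dis
    import io

    source = (
        "def area(width: int, height: int) -> int:\n"
        "    return width * height\n"
    )
    module_code = compile(source, "<bytecode-sketch>", "exec")

    # Capture the disassembly of the module-level code, which includes
    # the instructions that build the function and its annotations.
    buffer = io.StringIO()
    dis.dis(module_code, file=buffer)
    listing = buffer.getvalue()

    print(listing)

Comparing the listing against the same snippet compiled with
``from __future__ import annotations`` shows how the two semantics
differ on a given interpreter.
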
For Future Discussion
=====================

Circular Imports
----------------

There is one unfortunately-common scenario where :pep:`563`
currently provides a better experience, and it has to do
with large code bases, with circular dependencies and
imports, that examine their annotations at run-time.

:pep:`563` permitted defining *and examining* invalid
expressions as annotations. Its implementation requires
annotations to be legal Python expressions, which it then
converts into strings at compile-time. But legal Python
expressions may not be computable at runtime, if for
example the expression references a name that isn't defined.
This is a problem for stringized annotations if they're
evaluated, e.g. with ``typing.get_type_hints()``. But
any stringized annotation may be examined harmlessly at
any time--as long as you don't evaluate it, and only
examine it as a string.

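
A short sketch of that distinction follows. It uses a hand-written
string annotation, which behaves like a :pep:`563` stringized
annotation for this purpose. The function ``connect`` and the name
``NotImportedYet`` are hypothetical.

.. code-block:: python

    import typing

    def connect(peer: "NotImportedYet") -> None:
        """NotImportedYet is never defined or imported anywhere."""

    # Examining the annotation as a string is always safe.
    raw = connect.__annotations__["peer"]

    # Evaluating it requires the name to be resolvable, and it isn't.
    try:
        typing.get_type_hints(connect)
        evaluation_failed = False
    except NameError:
        evaluation_failed = True

Reading ``raw`` yields the string ``'NotImportedYet'`` without
complaint; only the call to ``typing.get_type_hints()`` fails.
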
Some large organizations have code bases that unfortunately
have circular dependency problems with their annotations--class
A has methods annotated with class B, but class B has methods
annotated with class A--that can be difficult to resolve.
Since :pep:`563` stringizes their annotations, it allows them
to leave these circular dependencies in place, and they can
sidestep the circular import problem by never importing the
module that defines the types used in the annotations. Their
annotations can no longer be evaluated, but this appears not
to be a concern in practice. They can then examine the
stringized form of the annotations at runtime and this seems
to be sufficient for their needs.

This PEP allows for many of the same behaviors.
Annotations must be legal Python expressions, which
are compiled into a function at compile-time.
And if the code never examines an annotation, it won't
have any runtime effect, so here too annotations can
harmlessly refer to undefined names. (It's exactly
like defining a function that refers to undefined
names--then never calling that function. Until you
call the function, nothing bad will happen.)

But examining an annotation when this PEP is active
means evaluating it, which means the names evaluated
in that expression must be defined. An undefined name
will raise a ``NameError`` in an annotation function,
just as it would with a stringized annotation passed
in to ``typing.get_type_hints()``, and just like any
other context in Python where an expression is evaluated.

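
The analogy with an uncalled function can be shown directly. In this
sketch, ``annotations_function`` is a hand-written stand-in for the
function the compiler would generate under this PEP, and
``MissingClass`` is deliberately left undefined. Neither name comes
from the reference implementation.

.. code-block:: python

    def annotations_function():
        # Stands in for a compiler-generated annotation function.
        return {"item": MissingClass, "return": int}

    # Defining the function raised nothing.  Only calling it, which is
    # what examining the annotations would do, evaluates the names.
    try:
        annotations_function()
        outcome = "evaluated"
    except NameError as error:
        outcome = f"NameError: {error}"

After this runs, ``outcome`` holds the ``NameError`` message, because
the undefined name is only looked up once the function is called.
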
In discussions we have yet to find a solution to this
problem that makes all the participants in the
conversation happy. There are various avenues to explore
here:

* One workaround is to continue to stringize one's
  annotations, either by hand or automatically
  by the Python compiler (as it does today with
  ``from __future__ import annotations``). This might
  mean preserving Python's current stringizing of annotations
  going forward, although leaving it turned off by default,
  only available by explicit request (though likely with
  a different mechanism than
  ``from __future__ import annotations``).
* Another possible workaround involves importing
  the circularly-dependent modules separately, then
  externally adding ("monkey-patching") their dependencies
  to each other after the modules are loaded. As long
  as the modules don't examine their annotations until
  after they are completely loaded, this should work fine
  and be maintainable with a minimum of effort.
* A third and more radical approach would be to change the
  semantics of annotations so that they don't raise a
  ``NameError`` when an unknown name is evaluated,
  but instead create some sort of proxy "reference" object.
* Of course, even if we do deprecate :pep:`563`, it will be
  several releases before the functionality is removed,
  giving us several years in which to research and innovate
  new solutions for this problem.

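
To give a feel for the third avenue, the following is a rough sketch
of how a proxy "reference" object could stand in for an unknown name.
It is an illustration only, not a design proposed by this PEP; the
names ``Reference``, ``ForgivingNamespace``, and
``evaluate_annotation`` are all invented here. It relies on the fact
that ``eval()`` accepts an arbitrary mapping for its locals.

.. code-block:: python

    import builtins

    class Reference:
        """Hypothetical placeholder for a name that could not be resolved."""

        def __init__(self, name):
            self.name = name

        def __repr__(self):
            return f"Reference({self.name!r})"

    class ForgivingNamespace(dict):
        """Mapping that returns a Reference instead of failing on lookup."""

        def __missing__(self, key):
            # Keep builtins such as ``int`` working as usual.
            if hasattr(builtins, key):
                return getattr(builtins, key)
            return Reference(key)

    def evaluate_annotation(expression, known_names):
        # Names are looked up in the locals mapping first, so an
        # unknown name reaches __missing__ instead of raising NameError.
        return eval(expression, {}, ForgivingNamespace(known_names))

    resolved = evaluate_annotation("int", {})
    unresolved = evaluate_annotation("NotDefinedAnywhere", {})

Here ``resolved`` is the ``int`` type, while ``unresolved`` is a
``Reference`` object recording the unknown name for later resolution.
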
In any case, the participants in the discussion agree that
this PEP should still move forward, even as this issue remains
currently unresolved [1]_.

.. [1] https://github.com/larryhastings/co_annotations/issues/1


cls.__globals__ and fn.__locals__
---------------------------------

Is it permissible to add the ``__globals__`` reference to class
objects as proposed here? It's not clear why this hasn't already
been done; :pep:`563` could have made use of class globals, but instead
made do with looking up classes inside ``sys.modules``. Python
seems strangely allergic to adding a ``__globals__`` reference to
class objects.

If adding ``__globals__`` to class objects is indeed a bad idea
(for reasons I don't know), here are two alternatives as to
how classes could get a reference to their globals for the
implementation of this PEP:

* The generated code for a class could bind its annotations code
  object to a function at the time the class is bound, rather than
  waiting for ``__annotations__`` to be referenced, making them an
  exception to the rule (even though "special cases aren't special
  enough to break the rules"). This would result in a small
  additional runtime cost when annotations were defined but not
  referenced on class objects. Honestly I'm more worried about
  the lack of symmetry in semantics. (But I wouldn't want to
  pre-bind all annotations code objects, as that would become
  much more costly for function objects, even as annotations are
  rarely used at runtime.)
* Use the class's ``__module__`` attribute to look up its module
  by name in ``sys.modules``. This is what :pep:`563` advises.
  While this is passable for userspace or library code, it seems
  like a little bit of a code smell for this to be defined semantics
  baked into the language itself.

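
The ``sys.modules`` lookup described in the second alternative can be
sketched in a few lines. The helper name ``class_globals`` is invented
for this example, and ``collections.OrderedDict`` is used only as a
convenient class whose defining module is known.

.. code-block:: python

    import collections
    import sys

    def class_globals(cls):
        # Find the defining module by name, then use its namespace
        # dict as the "globals" for evaluating the class's annotations.
        module = sys.modules[cls.__module__]
        return vars(module)

    namespace = class_globals(collections.OrderedDict)

The lookup depends on the module still being registered in
``sys.modules`` under the name recorded in ``__module__``, which is
part of why it feels fragile as language-level semantics.
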
Also, the prototype gets globals for class objects by calling
``globals()`` then storing the result. I'm sure there's a much
faster way to do this; I just didn't know what it was when I was
prototyping. I'm sure we can revise this to something much faster
and much more sanitary. I'd prefer to make it completely internal
anyway, and not make it visible to the user (via this new
``__globals__`` attribute). There's possibly already a good place to
put it anyway--``ht_module``.

Similarly, this PEP adds one new dunder member to functions,
classes, and modules (``__co_annotations__``), and a second new
dunder member to functions (``__locals__``). This might be
considered excessive.


Bikeshedding the name
---------------------

During most of the development of this PEP, user code actually
could see the raw annotation code objects. ``__co_annotations__``
could only be set to a code object; functions and other callables
weren't permitted. In that context the name ``co_annotations``
makes a lot of sense. But with this last-minute pivot where
``__co_annotations__`` now presents itself as a callable,
perhaps the name of the attribute and the name of the
``from __future__ import`` need a re-think.


Acknowledgements
================

Thanks to Barry Warsaw, Eric V. Smith, Mark Shannon,
and Guido van Rossum for feedback and encouragement.
Thanks in particular to Mark Shannon for two key
suggestions--build the entire annotations dict inside
a single code object, and only bind it to a function
on demand--that quickly became among the best aspects
of this proposal. Also, thanks in particular to Guido
van Rossum for suggesting that ``__co_annotations__``
functions should duplicate the name visibility rules of
annotations under "stock" semantics--this resulted in
a sizeable improvement to the second draft. Finally,
special thanks to Jelle Zijlstra, who contributed not
just feedback--but code!


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: