diff --git a/pep-0649.rst b/pep-0649.rst index 95c028347..d6f179922 100644 --- a/pep-0649.rst +++ b/pep-0649.rst @@ -6,83 +6,123 @@ Type: Standards Track Topic: Typing Content-Type: text/x-rst Created: 11-Jan-2021 -Post-History: 11-Jan-2021, 11-Apr-2021 +Post-History: 11-Jan-2021, 11-Apr-2021, 19-Apr-2023 +******** Abstract -======== +******** -As of Python 3.9, Python supports two different behaviors -for annotations: +Annotations are a Python technology that allows expressing +type information and other metadata about Python functions, +classes, and modules. But Python's original semantics +for annotations required them to be eagerly evaluated, +at the time the annotated object was bound. This caused +chronic problems for static type analysis users using +"type hints", due to forward-reference and circular-reference +problems. -* original or "stock" Python semantics, in which annotations - are evaluated at the time they are bound, and -* :pep:`563` semantics, currently enabled per-module by - ``from __future__ import annotations``, in which annotations - are converted back into strings and must be reparsed and - executed by ``eval()`` to be used. +Python solved this by accepting :pep:`563`, incorporating +a new approach called "stringized annotations" in which +annotations were automatically converted into strings by +Python. This solved the forward-reference and circular-reference +problems, and also fostered intriguing new uses for annotation +metadata. But stringized annotations in turn caused chronic +problems for runtime users of annotations. -Original Python semantics created a circular references problem -for static typing analysis. :pep:`563` solved that problem--but -its novel semantics introduced new problems, including its -restriction that annotations can only reference names at -module-level scope. +This PEP proposes a new and comprehensive third approach +for representing and computing annotations. 
It adds a new +internal mechanism for lazily computing annotations on demand, +via a new object method called ``__annotate__``. +This approach, when combined with a novel technique for +coercing annotation values into alternative formats, solves +all the above problems, supports all existing use cases, +and should foster future innovations in annotations. -This PEP proposes a third way that embodies the best of both -previous approaches. It solves the same circular reference -problems solved by :pep:`563`, while otherwise preserving Python's -original annotation semantics, including allowing annotations -to refer to local and class variables. -In this new approach, the code to generate the annotations -dict is written to its own function which computes and returns -the annotations dict. Then, ``__annotations__`` is a "data -descriptor" which calls this annotation function once and -retains the result. This delays the evaluation of annotations -expressions until the annotations are examined, at which point -all circular references have likely been resolved. And if -the annotations are never examined, the function is never -called and the annotations are never computed. +******** +Overview +******** -Annotations defined using this PEP's semantics have the same -visibility into the symbol table as annotations under "stock" -semantics--any name visible to an annotation in Python 3.9 -is visible to an annotation under this PEP. In addition, -annotations under this PEP can refer to names defined *after* -the annotation is defined, as long as the name is defined in -a scope visible to the annotation. Specifically, when this PEP -is active: +This PEP adds a new dunder attribute to the objects that +support annotations--functions, classes, and modules. +The new attribute is called ``__annotate__``, and is +a reference to a function which computes and returns +that object's annotations dict. 
-* An annotation can refer to a local variable defined in the - current function scope. -* An annotation can refer to a local variable defined in an - enclosing function scope. -* An annotation can refer to a class variable defined in the - current class scope. -* An annotation can refer to a global variable. +At compile time, if the definition of an object includes +annotations, the Python compiler will write the expressions +computing the annotations into its own function. When run, +the function will return the annotations dict. The Python +compiler then stores a reference to this function in +``__annotate__`` on the object. -And in all four of these cases, the variable referenced by -the annotation needn't be defined at the time the annotation -is defined--it can be defined afterwards. The only restriction -is that the name or variable be defined before the annotation -is *evaluated.* +Furthermore, ``__annotations__`` is redefined to be a +"data descriptor" which calls this annotation function once +and caches the result. -If accepted, these new semantics for annotations would initially -be gated behind ``from __future__ import co_annotations``. -However, these semantics would eventually be promoted to be -Python's default behavior. Thus this PEP would *supersede* -:pep:`563`, and :pep:`563`'s behavior would be deprecated and +This mechanism delays the evaluation of annotations expressions +until the annotations are examined, which solves many circular +reference problems. + +This PEP also defines new functionality for two functions +in the Python standard library: +``inspect.get_annotations`` and ``typing.get_type_hints``. +The functionality is accessed via a new keyword-only parameter, +``format``. ``format`` allows the user to request +the annotations from these functions +in a specific format. +Format identifiers are always predefined integer values. +The formats defined by this PEP are: + +* ``inspect.VALUE = 1`` + + The default value.
+ The function will return the conventional Python + values for the annotations. This format is identical + to the return value for these functions under Python 3.11. + +* ``inspect.FORWARDREF = 2`` + + The function will attempt to return the conventional + Python values for the annotations. However, if it + encounters an undefined name, or a free variable that + has not yet been associated with a value, it dynamically + creates a proxy object (a ``ForwardRef``) that substitutes + for that value in the expression, then continues evaluation. + The resulting dict may contain a mixture of proxies and + real values. If all real values are defined at the time + the function is called, ``inspect.FORWARDREF`` and + ``inspect.VALUE`` produce identical results. + +* ``inspect.SOURCE = 3`` + + The function will produce an annotation dictionary + where the values have been replaced by strings containing + the original source code for the annotation expressions. + These strings may only be approximate, as they may be + reverse-engineered from another format, rather than + preserving the original source code, but the differences + will be minor. + +If accepted, this PEP would *supersede* :pep:`563`, +and :pep:`563`'s behavior would be deprecated and eventually removed. -Overview -======== + +Comparison Of Annotation Semantics +================================== .. note:: The code presented in this section is simplified - for clarity. The intention is to communicate the high-level - concepts involved without getting lost in with the details. - The actual details are often quite different. See the - Implementation_ section later in this PEP for a much more - accurate description of how this PEP works. + for clarity, and is intentionally inaccurate in some + critical aspects. This example is intended merely to + communicate the high-level concepts involved without + getting lost in the details. 
But readers should note + that the actual implementation is quite different in + several important ways. See the Implementation_ + section later in this PEP for a far more accurate + description of what this PEP proposes from a technical + level. Consider this example code: @@ -92,7 +132,7 @@ Consider this example code: ... class MyType: ... - foo_y_type = foo.__annotations__['y'] + foo_y_annotation = foo.__annotations__['y'] As we see here, annotations are available at runtime through an ``__annotations__`` attribute on functions, classes, and modules. @@ -100,7 +140,7 @@ When annotations are specified on one of these objects, ``__annotations__`` is a dictionary mapping the names of the fields to the value specified as that field's annotation. -The default behavior in Python 3.9 is to evaluate the expressions +The default behavior in Python is to evaluate the expressions for the annotations, and build the annotations dict, at the time the function, class, or module is bound. At runtime the above code actually works something like this: @@ -113,7 +153,7 @@ code actually works something like this: foo.__annotations__ = annotations class MyType: ... - foo_y_type = foo.__annotations__['y'] + foo_y_annotation = foo.__annotations__['y'] The crucial detail here is that the values ``int``, ``MyType``, and ``float`` are looked up at the time the function object is @@ -122,8 +162,9 @@ But this code doesn't run—it throws a ``NameError`` on the first line, because ``MyType`` hasn't been defined yet. :pep:`563`'s solution is to decompile the expressions back -into strings, and store those *strings* in the annotations dict. -The equivalent runtime code would look something like this: +into strings during compilation and store those strings as the +values in the annotations dict. The equivalent runtime code +would look something like this: ..
code-block:: @@ -133,14 +174,13 @@ The equivalent runtime code would look something like this: foo.__annotations__ = annotations class MyType: ... - foo_y_type = foo.__annotations__['y'] + foo_y_annotation = foo.__annotations__['y'] -This code now runs successfully. However, ``foo_y_type`` +This code now runs successfully. However, ``foo_y_annotation`` is no longer a reference to ``MyType``, it is the *string* -``'MyType'``. The code would have to be further modified to -call ``eval()`` or ``typing.get_type_hints()`` to convert -the string into a useful reference to the actual ``MyType`` -object. +``'MyType'``. To turn the string into the real value ``MyType``, +the user would need to evaluate the string using ``eval``, +``inspect.get_annotations``, or ``typing.get_type_hints``. This PEP proposes a third approach, delaying the evaluation of the annotations by computing them in their own function. If @@ -151,216 +191,311 @@ like this: class function: # __annotations__ on a function object is already a - # "data descriptor" in Python, we're just changing what it does + # "data descriptor" in Python, we're just changing + # what it does @property def __annotations__(self): - return self.__co_annotations__() + return self.__annotate__() # ... - def foo_annotations_fn(): + def annotate_foo(): return {'x': int, 'y': MyType, 'return': float} def foo(x = 3, y = "abc"): ... - foo.__co_annotations__ = foo_annotations_fn + foo.__annotate__ = annotate_foo class MyType: ... - foo_y_type = foo.__annotations__['y'] + foo_y_annotation = foo.__annotations__['y'] The important change is that the code constructing the annotations dict now lives in a function—here, called -``foo_annotations_fn()``. But this function isn't called +``annotate_foo()``. But this function isn't called until we ask for the value of ``foo.__annotations__``, and we don't do that until *after* the definition of ``MyType``. 
-So this code also runs successfully, and ``foo_y_type`` now +So this code also runs successfully, and ``foo_y_annotation`` now has the correct value--the class ``MyType``--even though ``MyType`` wasn't defined until *after* the annotation was defined. + +********** Motivation -========== +********** + +A History Of Annotations +======================== + +Python 3.0 shipped with a new syntax feature, "annotations", +defined in :pep:`3107`. +This allowed specifying a Python value that would be +associated with a parameter of a Python function, or +with the value that function returns. +Said another way, annotations gave Python users an interface +to provide rich metadata about a function parameter or return +value, for example type information. +All the annotations for a function were stored together in +a new attribute ``__annotations__``, in an "annotation dict" +that mapped parameter names (or, in the case of the return +annotation, using the name ``'return'``) to their Python value. + +In an effort to foster experimentation, Python +intentionally didn't define what form this metadata should take, +or what values should be used. User code began experimenting with +this new facility almost immediately. But popular libraries that +make use of this functionality were slow to emerge. + +After years of little progress, the BDFL chose a particular +approach for expressing static type information, called +*type hints,* as defined in :pep:`484`. Python 3.5 shipped +with a new ``typing`` module which quickly became very popular. + +Python 3.6 added syntax to annotate local variables, +class attributes, and module attributes, using the approach +proposed in :pep:`526`. Static type analysis continued to +grow in popularity. + +However, static type analysis users were increasingly frustrated +by an inconvenient problem: forward references.
In classic +Python, if a class C depends on a later-defined class D, +it's normally not a problem, because user code will usually +wait until both are defined before trying to use either. +But annotations added a new complication, because they were +computed at the time the annotated object (function, class, +or module) was bound. If methods on class C are annotated with +type D, and these annotation expressions are computed at the +time that the method is bound, D may not be defined yet. +And if methods in D are also annotated with type C, you now +have an unresolvable circular reference problem. + +Initially, static type users worked around this problem +by defining their problematic annotations as strings. +This worked because a string containing the type hint was +just as usable for the static type analysis tool. +And users of static type analysis tools rarely examine the +annotations at runtime, so this representation wasn't +itself an inconvenience. But manually stringizing type +hints was clumsy and error-prone. Also, code bases were +adding more and more annotations, which consumed more and +more CPU time to create and bind. + +To solve these problems, the BDFL accepted :pep:`563`, which +added a new feature to Python 3.7: "stringized annotations". +It was activated with a future import: + + from __future__ import annotations + +Normally, annotation expressions were evaluated at the time +the object was bound, with their values being stored in the +annotations dict. When stringized annotations were active, +these semantics changed: instead, at compile time, the compiler +converted all annotations in that module into string +representations of their source code--thus, *automatically* +turning the user's annotations into strings, obviating the +need to *manually* stringize them as before. :pep:`563` +suggested users could evaluate this string with ``eval`` +if the actual value was needed at runtime.
+ +(From here on out, this PEP will refer to the classic +semantics of :pep:`3107` and :pep:`526`, where the +values of annotation expressions are computed at the time +the object is bound, as *"stock" semantics,* to differentiate +them from the new :pep:`563` "stringized" annotation semantics.) + +The Current State Of Annotation Use Cases +========================================= + +Although there are many specific use cases for annotations, +annotation users in the discussion around this PEP tended +to fall into one of these four categories. + + +Static typing users +------------------- + +Static typing users use annotations to add type information +to their code. But they largely don't examine the annotations +at runtime. Instead, they use static type analysis tools +(mypy, pytype) to examine their source tree and determine +whether or not their code is using types consistently. +This is almost certainly the most popular use case for +annotations today. + +Many of the annotations use *type hints,* a la :pep:`484` +(and many subsequent PEPs). Type hints are passive objects, +mere representation of type information; they don't do any actual work. +Type hints are often parameterized with other types or other type hints. +Since they're agnostic about what these actual values are, type hints +work fine with ``ForwardRef`` proxy objects. +Users of static type hints discovered that extensive type hinting under +stock semantics often created large-scale circular reference and circular +import problems that could be difficult to solve. :pep:`563` was designed +specifically to solve this problem, and the solution worked great for +these users. The difficulty of rendering stringized annotations into +real values largely didn't inconvenience these users because of how +infrequently they examine annotations at runtime. + +Static typing users often combine :pep:`563` with the +``if typing.TYPE_CHECKING`` idiom to prevent their type hints from being +loaded at runtime. 
This means they often aren't able to evaluate their +stringized annotations and produce real values at runtime. On the rare +occasion that they do examine annotations at runtime, they often forgo +``eval``, instead using lexical analysis directly on the stringized +annotations. + +Under this PEP, static typing users will probably prefer ``FORWARDREF`` +or ``SOURCE`` format. + + +Runtime annotation users +------------------------ + +Runtime annotation users use annotations as a means of expressing rich +metadata about their functions and classes, which they use as input to +runtime behavior. Specific use cases include runtime type verification +(Pydantic) and glue logic to expose Python APIs in another domain +(FastAPI, Typer). The annotations may or may not be type hints. + +As runtime annotation users examine annotations at runtime, they were +traditionally better served with stock semantics. This use case is +largely incompatible with :pep:`563`, particularly with the +``if typing.TYPE_CHECKING`` idiom. + +Under this PEP, runtime annotation users will use ``VALUE`` format. + + +Wrappers +-------- + +Wrappers are functions or classes that wrap user functions or +classes and add functionality. Examples of this would be +``dataclass``, ``functools.partial``, ``attrs``, and ``wrapt``. + +Wrappers are a distinct subcategory of runtime annotation users. +Although they do use annotations at runtime, they may or may not +actually examine the annotations of the objects they wrap--it depends +on the functionality the wrapper provides. As a rule they should +propagate the annotations of the wrapped object to the wrapper +they create, although it's possible they may modify those annotations. + +Wrappers were generally designed to work well under stock semantics. +Whether or not they work well under :pep:`563` semantics depends on the +degree to which they examine the wrapped object's annotations. 
+Often wrappers don't care about the value per se, only needing +specific information about the annotations. Even so, :pep:`563` +and the ``if typing.TYPE_CHECKING`` idiom can make it difficult +for wrappers to reliably determine the information they need at +runtime. This is an ongoing, chronic problem. +Under this PEP, wrappers will probably prefer ``FORWARDREF`` format +for their internal logic. But the wrapped objects need to support +all formats for their users. + + +Documentation +------------- + +:pep:`563` stringized annotations were a boon for tools that +mechanically construct documentation. + +Stringized type hints make for excellent documentation; type hints +as expressed in source code are often succinct and readable. However, +at runtime these same type hints can produce values whose repr +is a sprawling, nested, unreadable mess. Thus documentation users were +well-served by :pep:`563` but poorly served with stock semantics. + +Under this PEP, documentation users are expected to use ``SOURCE`` format. + + +Motivation For This PEP +======================= Python's original semantics for annotations made its use for static type analysis painful due to forward reference problems. -This was the main justification for :pep:`563`, and we need not -revisit those arguments here. +:pep:`563` solved the forward reference problem, and many +static type analysis users became happy early adopters of it. +But its unconventional solution created new problems for two +of the above cited use cases: runtime annotation users, +and wrappers. -However, :pep:`563`'s solution was to decompile code for Python -annotations back into strings at compile time, requiring -users of annotations to ``eval()`` those strings to restore -them to their actual Python values. This has several drawbacks: +First, stringized annotations didn't permit referencing local or +free variables, which meant many useful, reasonable approaches +to creating annotations were no longer viable.
This was +particularly inconvenient for decorators that wrap existing +functions and classes, as these decorators often use closures. -* It requires Python implementations to stringize their - annotations. This is surprising behavior—unprecedented - for a language-level feature. Also, adding this feature - to CPython was complicated, and this complicated code would - need to be reimplemented independently by every other Python - implementation. -* It requires that all annotations be evaluated at module-level - scope. Annotations under :pep:`563` can no longer refer to +Second, in order for ``eval`` to correctly look up globals in a +stringized annotation, you must first obtain a reference +to the correct module. +But class objects don't retain a reference to their globals. +:pep:`563` suggests looking up a class's module by name in +``sys.modules``—a surprising requirement for a language-level +feature. - * class variables, - * local variables in the current function, or - * local variables in enclosing functions. +Additionally, complex but legitimate constructions can make it +difficult to determine the correct globals and locals dicts to +give to ``eval`` to properly evaluate a stringized annotation. +Even worse, in some situations it may simply be infeasible. -* It requires a code change every time existing code uses an - annotation, to handle converting the stringized - annotation back into a useful value. -* ``eval()`` is slow. -* ``eval()`` isn't always available; it's sometimes removed - from Python for space reasons. -* In order to evaluate the annotations on a class, - it requires obtaining a reference to that class's globals, - which :pep:`563` suggests should be done by looking up that class - by name in ``sys.modules``—another surprising requirement for - a language-level feature. -* It adds an ongoing maintenance burden to Python implementations.
- Every time the language adds a new feature available in expressions, - the implementation's stringizing code must be updated in - tandem in order to support decompiling it. +For example, some libraries (e.g. TypedDict, dataclass) wrap a user +class, then merge all the annotations from all that class's base +classes together into one cumulative annotations dict. If those +annotations were stringized, calling ``eval`` on them later may +not work properly, because the globals dictionary used for the +``eval`` will be the module where the *user class* was defined, +which may not be the same module where the *annotation* was +defined. However, if the annotations were stringized because +of forward-reference problems, calling ``eval`` on them early +may not work either, due to the forward reference not being +resolvable yet. This has proved to be difficult to reconcile; +of the three bug reports linked to below, only one has been +marked as fixed. -This PEP also solves the forward reference problem outlined in -:pep:`563` while avoiding the problems listed above: + https://github.com/python/cpython/issues/89687 + https://github.com/python/cpython/issues/85421 + https://github.com/python/cpython/issues/90531 -* Python implementations would generate annotations as code - objects. This is simpler than stringizing, and is something - Python implementations are already quite good at. This means: +Even with proper globals *and* locals, ``eval`` can be unreliable +on stringized annotations. +``eval`` can only succeed if all the symbols referenced in +an annotation are defined. If a stringized annotation refers +to a mixture of defined and undefined symbols, a simple ``eval`` +of that string will fail. This is a problem for libraries +that need to examine the annotation, because they can't reliably +convert these stringized annotations into real values.
- - alternate implementations would need to write less code to - implement this feature, and - - the implementation would be simpler overall, which should - reduce its ongoing maintenance cost. +* Some libraries (e.g. ``dataclass``) solved this by forgoing real + values and performing lexical analysis of the stringized annotation, + which requires a lot of work to get right. -* Existing annotations would not need to be changed to only - use global scope. Actually, annotations would become much - easier to use, as they would now also handle forward - references. -* Code examining annotations at runtime would no longer need - to use ``eval()`` or anything else—it would automatically - see the correct values. This is easier, faster, and - removes the dependency on ``eval()``. +* Other libraries still suffer with this problem, + which can produce surprising runtime behavior. + https://github.com/python/cpython/issues/97727 +Also, ``eval()`` is slow, and it isn't always available; it's +sometimes removed for space reasons on certain platforms. +``eval()`` on MicroPython doesn't support the ``locals`` +argument, which makes converting stringized annotations +into real values at runtime even harder. -Backwards Compatibility -======================= +Finally, :pep:`563` requires Python implementations to +stringize their annotations. This is surprising behavior—unprecedented +for a language-level feature, with a complicated implementation, +that must be updated whenever a new operator is added to the +language. -:pep:`563` changed the semantics of annotations. When its semantics -are active, annotations must assume they will be evaluated in -*module-level* scope. They may no longer refer directly -to local variables or class attributes. - -This PEP removes that restriction; annotations may refer to globals, -local variables inside functions, local variables defined in enclosing -functions, and class members in the current class.
In addition, -annotations may refer to any of these that haven't been defined yet -at the time the annotation is defined, as long as the not-yet-defined -name is created normally (in such a way that it is known to the symbol -table for the relevant block, or is a global or class variable found -using normal name resolution). Thus, this PEP demonstrates *improved* -backwards compatibility over :pep:`563`. - -:pep:`563` also requires using ``eval()`` or ``typing.get_type_hints()`` -to examine annotations. Code updated to work with :pep:`563` that calls -``eval()`` directly would have to be updated simply to remove the -``eval()`` call. Code using ``typing.get_type_hints()`` would -continue to work unchanged, though future use of that function -would become optional in most cases. - -Because this PEP makes semantic changes to how annotations are -evaluated, this PEP will be initially gated with a per-module -``from __future__ import co_annotations`` before it eventually -becomes the default behavior. - -Apart from the delay in evaluating values stored in annotations -dicts, this PEP preserves nearly all existing behavior of -annotations dicts. Specifically: - -* Annotations dicts are mutable, and any changes to them are - preserved. -* The ``__annotations__`` attribute can be explicitly set, - and any value set this way will be preserved. -* The ``__annotations__`` attribute can be deleted using - the ``del`` statement. - -However, there are two uncommon interactions possible with class -and module annotations that work today—both with stock semantics, -and with :pep:`563` semantics—that would no longer work when this PEP -was active. These two interactions would have to be prohibited. -The good news is, neither is common, and neither is considered good -practice. In fact, they're rarely seen outside of Python's own -regression test suite. 
They are: - -* *Code that sets annotations on module or class attributes - from inside any kind of flow control statement.* It's - currently possible to set module and class attributes with - annotations inside an ``if`` or ``try`` statement, and it works - as one would expect. It's untenable to support this behavior - when this PEP is active. -* *Code in module or class scope that references or modifies the - local* ``__annotations__`` *dict directly.* Currently, when - setting annotations on module or class attributes, the generated - code simply creates a local ``__annotations__`` dict, then sets - mappings in it as needed. It's also possible for user code - to directly modify this dict, though this doesn't seem like it's - an intentional feature. Although it would be possible to support - this after a fashion when this PEP was active, the semantics - would likely be surprising and wouldn't make anyone happy. - -Note that these are both also pain points for static type checkers, -and are unsupported by those checkers. It seems reasonable to -declare that both are at the very least unsupported, and their -use results in undefined behavior. It might be worth making a -small effort to explicitly prohibit them with compile-time checks. - -In addition, there are a few operators that would no longer be -valid for use in annotations, because their side effects would -affect the *annotation function* instead of the -class/function/module the annotation was nominally defined in: - -* ``:=`` (aka the "walrus operator"), -* ``yield`` and ``yield from``, and -* ``await``. - -Use of any of these operators in an annotation will result in a -compile-time error. - -Since delaying the evaluation of annotations until they are -evaluated changes the semantics of the language, it's observable -from within the language. Therefore it's possible to write code -that behaves differently based on whether annotations are -evaluated at binding time or at access time, e.g. - -.. 
code-block:: - - mytype = str - def foo(a:mytype): pass - mytype = int - print(foo.__annotations__['a']) - -This will print ``<class 'str'>`` with stock semantics -and ``<class 'int'>`` when this PEP is active. Since -this is poor programming style to begin with, it seems -acceptable that this PEP changes its behavior. - -Finally, there's a standard idiom that's actually somewhat common -when accessing class annotations, and which will become more -problematic when this PEP is active: code often accesses class -annotations via ``cls.__dict__.get("__annotations__", {})`` -rather than simply ``cls.__annotations__``. It's due to a flaw -in the original design of annotations themselves. This topic -will be examined in a separate discussion; the outcome of -that discussion will likely guide the future evolution of this -PEP. +These problems motivated the research into finding a new +approach to solve the problems facing annotations users, +resulting in this PEP. Mistaken Rejection Of This Approach In November 2017 ==================================================== During the early days of discussion around :pep:`563`, -using code to delay the evaluation of annotations was -briefly discussed, in a November 2017 thread in -``comp.lang.python-dev``. At the time the +in a November 2017 thread in ``comp.lang.python-dev``, +the idea of using code to delay the evaluation of +annotations was briefly discussed. At the time the technique was termed an "implicit lambda expression". Guido van Rossum—Python's BDFL at the time—replied, @@ -381,342 +516,545 @@ scope. This idea, too, was quickly rejected. :pep:`PEP 563 summarizes the above discussion <563#keeping-the-ability-to-use-function-local-state-when-defining-annotations>` -What's puzzling is :pep:`563`'s own changes to the scoping rules -of annotations—it *also* doesn't permit annotations to reference -class-level definitions.
It's not immediately clear why an -inability to reference class-level definitions was enough to -reject using "implicit lambda expressions" for annotations, -but was acceptable for stringized annotations. - -In retrospect there was probably a pivot during the development -of :pep:`563`. It seems that, early on, there was a prevailing -assumption that :pep:`563` would support references to class-level -definitions. But by the time :pep:`563` was finalized, this -assumption had apparently been abandoned. And it looks like -"implicit lambda expressions" were never reconsidered in this -new light. - -In any case, annotations are still able to refer to class-level -definitions under this PEP, rendering the objection moot. +The approach taken by this PEP doesn't suffer from these +restrictions. Annotations can access module-level definitions, +class-level definitions, and even local and free variables. .. _Implementation: +************** Implementation -============== - -There's a prototype implementation of this PEP, here: - -https://github.com/larryhastings/co_annotations/ - -As of this writing, all features described in this PEP are -implemented, and there are some rudimentary tests in the -test suite. There are still some broken tests, and the -``co_annotations`` repo is many months behind the -CPython repo. - - -from __future__ import co_annotations -------------------------------------- - -In the prototype, the semantics presented in this PEP are gated with: - -.. code-block:: - - from __future__ import co_annotations - - - -__co_annotations__ ------------------- - -Python supports runtime metadata for annotations for three different -types: function, classes, and modules. The basic approach to -implement this PEP is much the same for all three with only minor -variations. - -With this PEP, each of these types adds a new attribute, -``__co_annotations__``. 
``__co_annotations__`` is a function: -it takes no arguments, and must return either ``None`` or a dict -(or subclass of dict). It adds the following semantics: - -* ``__co_annotations__`` is always set, and may contain either - ``None`` or a callable. -* ``__co_annotations__`` cannot be deleted. -* ``__annotations__`` and ``__co_annotations__`` can't both - be set to a useful value simultaneously: - - - If you set ``__annotations__`` to a dict, this also sets - ``__co_annotations__`` to None. - - If you set ``__co_annotations__`` to a callable, this also - deletes ``__annotations__`` - -Internally, ``__co_annotations__`` is a "data descriptor", -where functions are called whenever user code gets, sets, -or deletes the attribute. In all three cases, the object -has separate internal storage for the current value -of the ``__co_annotations__`` attribute. - -``__annotations__`` is also as a data descriptor, with its own -separate internal storage for its internal value. The code -implementing the "get" for ``__annotations__`` works something -like this: - -.. code-block:: - - if (the internal value is set) - return the internal annotations dict - if (__co_annotations__ is not None) - call the __co_annotations__ function - if the result is a dict: - store the result as the internal value - set __co_annotations__ to None - return the internal value - do whatever this object does when there are no annotations - - -Unbound code objects --------------------- - -When Python code defines one of these three objects with -annotations, the Python compiler generates a separate code -object which builds and returns the appropriate annotations -dict. Wherever possible, the "annotation code object" is -then stored *unbound* as the internal value of -``__co_annotations__``; it is then bound on demand when -the user asks for ``__annotations__``. - -This is a useful optimization for both speed and memory -consumption. Python processes rarely examine annotations -at runtime. 
Therefore, pre-binding these code objects to -function objects would usually be a waste of resources. - -When is this optimization not possible? - -* When an annotation function contains references to - free variables, in the current function or in an - outer function. -* When an annotation function is defined on a method - (a function defined inside a class) and the annotations - possibly refer directly to class variables. - -Note that user code isn't permitted to directly access these -unbound code objects. If the user "gets" the value of -``__co_annotations__``, and the internal value of -``__co_annotations__`` is an unbound code object, -it immediately binds the code object, and the resulting -function object is stored as the new value of -``__co_annotations__`` and returned. - -(However, these unbound code objects *are* stored in the -``.pyc`` file. So a determined user could examine them -should that be necessary for some reason.) - - - - -Function Annotations --------------------- - -When compiling a function, the CPython bytecode compiler -visits the annotations for the function all in one place, -starting with ``compiler_visit_annotations()`` in ``compile.c``. -If there are any annotations, they create the scope for -the annotations function on demand, and -``compiler_visit_annotations()`` assembles it. - -The code object is passed in place of the annotations dict -for the ``MAKE_FUNCTION`` bytecode instruction. -``MAKE_FUNCTION`` supports a new bit in its oparg -bitfield, ``0x10``, which tells it to expect a -``co_annotations`` code object on the stack. -The bitfields for ``annotations`` (``0x04``) and -``co_annotations`` (``0x10``) are mutually exclusive. - -When binding an unbound annotation code object, a function will -use its own ``__globals__`` as the new function's globals. - -One quirk of Python: you can't actually remove the annotations -from a function object. 
If you delete the ``__annotations__`` -attribute of a function, then get its ``__annotations__`` member, -it will create an empty dict and use that as its -``__annotations__``. The implementation of this PEP maintains -this quirk for backwards compatibility. - - -Class Annotations ------------------ - -When compiling a class body, the compiler maintains two scopes: -one for the normal class body code, and one for annotations. -(This is facilitated by four new functions: ``compiler.c`` -adds ``compiler_push_scope()`` and ``compiler_pop_scope()``, -and ``symtable.c`` adds ``symtable_push_scope()`` and -``symtable_pop_scope()``.) -Once the code generator reaches the end of the class body, -but before it generates the bytecode for the class body, -it assembles the bytecode for ``__co_annotations__``, then -assigns that to ``__co_annotations__`` using ``STORE_NAME``. - -It also sets a new ``__globals__`` attribute. Currently it -does this by calling ``globals()`` and storing the result. -(Surely there's a more elegant way to find the class's -globals--but this was good enough for the prototype.) When -binding an unbound annotation code object, a class will use -the value of this ``__globals__`` attribute. When the class -drops its reference to the unbound code object--either because -it has bound it to a function, or because ``__annotations__`` -has been explicitly set--it also deletes its ``__globals__`` -attribute. - -As discussed above, examination or modification of -``__annotations__`` from within the class body is no -longer supported. Also, any flow control (``if`` or ``try`` blocks) -around declarations of members with annotations is unsupported. - -If you delete the ``__annotations__`` attribute of a class, -then get its ``__annotations__`` member, it will return the -annotations dict of the first base class with annotations set. -If no base classes have annotations set, it will raise -``AttributeError``. 
- -Although it's an implementation-specific detail, currently -classes store the internal value of ``__co_annotations__`` -in their ``tp_dict`` under the same name. - - -Module Annotations ------------------- - -Module annotations work much the same as class annotations. -The main difference is, a module uses its own dict as the -``__globals__`` when binding the function. - -If you delete the ``__annotations__`` attribute of a class, -then get its ``__annotations__`` member, the module will -raise ``AttributeError``. - -Annotations With Closures -------------------------- - -It's possible to write annotations that refer to -free variables, and even free variables that have yet -to be defined. For example: - -.. code-block:: - - from __future__ import co_annotations - - def outer(): - def middle(): - def inner(a:mytype, b:mytype2): pass - mytype = str - return inner - mytype2 = int - return middle() - - fn = outer() - print(fn.__annotations__) - -At the time ``fn`` is set, ``inner.__co_annotations__()`` -hasn't been run. So it has to retain a reference to -the *future* definitions of ``mytype`` and ``mytype2`` if -it is to correctly evaluate its annotations. - -If an annotation function refers to a local variable -from the current function scope, or a free variable -from an enclosing function scope--if, in CPython, the -annotation function code object contains one or more -``LOAD_DEREF`` opcodes--then the annotation code object -is bound at definition time with references to these -variables. ``LOAD_DEREF`` instructions require the annotation -function to be bound with special run-time information -(in CPython, a ``freevars`` array). Rather than store -that separately and use that to later lazy-bind the -function object, the current implementation simply -early-binds the function object. 
- -Note that, since the annotation function ``inner.__co_annotations__()`` -is defined while parsing ``outer()``, from Python's perspective -the annotation function is a "nested function". So "local -variable inside the 'current' function" and "free variable -from an enclosing function" are, from the perspective of -the annotation function, the same thing. - - -Annotations That Refer To Class Variables ------------------------------------------ - -It's possible to write annotations that refer to -class variables, and even class variables that haven't -yet been defined. For example: - -.. code-block:: - - from __future__ import co_annotations - - class C: - def method(a:mytype): pass - mytype = str - - print(C.method.__annotations__) - -Internally, annotation functions are defined as -a new type of "block" in CPython's symbol table -called an ``AnnotationBlock``. An ``AnnotationBlock`` -is almost identical to a ``FunctionBlock``. It differs -in that it's permitted to see names from an enclosing -class scope. (Again: annotation functions are functions, -and they're defined *inside* the same scope as -the thing they're being defined on. So in the above -example, the annotation function for ``C.method()`` -is defined inside ``C``.) - -If it's possible that an annotation function refers -to class variables--if all these conditions are true: - -* The annotation function is being defined inside - a class scope. -* The generated code for the annotation function - has at least one ``LOAD_NAME`` instruction. - -Then the annotation function is bound at the time -it's set on the class/function, and this binding -includes a reference to the class dict. The class -dict is pushed on the stack, and the ``MAKE_FUNCTION`` -bytecode instruction takes a new second bitfield (0x20) -indicating that it should consume that stack argument -and store it as ``__locals__`` on the newly created -function object. 
- -Then, at the time the function is executed, the -``f_locals`` field of the frame object is set to -the function's ``__locals__``, if set. This permits -``LOAD_NAME`` opcodes to work normally, which means -the code generated for annotation functions is nearly -identical to that generated for conventional Python -functions. +************** + +__annotate__ and __annotations__ +================================ + +Python supports annotations on three different types: +functions, classes, and modules. This PEP modifies +the semantics on all three of these types in a similar +way. + +First, this PEP adds a new "dunder" attribute, ``__annotate__``. +``__annotate__`` must be a "data descriptor", +implementing all three actions: get, set, and delete. +The ``__annotate__`` attribute is always defined, +and may only be set to either ``None`` or to a callable. +(``__annotate__`` cannot be deleted.) If an object +has no annotations, ``__annotate__`` should be +initialized to ``None``, rather than to a function +that returns an empty dict. + +The ``__annotate__`` data descriptor must have dedicated +storage inside the object to store the reference to its value. +The location of this storage at runtime is an implementation +detail. Even if it's visible to Python code, it should still +be considered an internal implementation detail, and Python +code should prefer to interact with it only via the +``__annotate__`` attribute. + +The callable stored in ``__annotate__`` must accept a +single required positional argument called ``format``, +which will always be an ``int``. It must either return +a dict (or subclass of dict) or raise +``NotImplementedError()``. + +Here's a formal definition of ``__annotate__``, as it will +appear in the "Magic methods" section of the Python +Language Reference: + + __annotate__(format: int) -> dict + + Returns a new dictionary object mapping attribute/parameter + names to their annotation values.
+ + Takes a ``format`` parameter specifying the format in which + annotation values should be provided. Must be one of the + following: + + ``1`` (exported as ``inspect.VALUE``) + + Values are the result of evaluating the annotation expressions. + + ``2`` (exported as ``inspect.FORWARDREF``) + + Values are real annotation values (as per ``inspect.VALUE`` format) + for defined values, and ``ForwardRef`` proxies for undefined values. + Real objects may be exposed to, or contain references to, + ``ForwardRef`` proxy objects. + + ``3`` (exported as ``inspect.SOURCE``) + + Values are the text string of the annotation as it + appears in the source code. May only be approximate; + whitespace may be normalized, and constant values may + be optimized. + + If an ``__annotate__`` function doesn't support the requested + format, it must raise ``NotImplementedError()``. + ``__annotate__`` functions must always support ``1`` (``inspect.VALUE``) + format; they must not raise ``NotImplementedError()`` when called with + ``format=1``. + + When called with ``format=1``, an ``__annotate__`` function + may raise ``NameError``; it must not raise ``NameError`` when called + requesting any other format. + + If an object doesn't have any annotations, ``__annotate__`` + should preferably be deleted or set to ``None``, rather than set to + a function that returns an empty dict. + +When the Python compiler compiles an object with +annotations, it simultaneously compiles the appropriate +annotate function. This function, called with +the single positional argument ``inspect.VALUE``, +computes and returns the annotations dict as defined +on that object. The Python compiler and runtime work +in concert to ensure that the function is bound to +the appropriate namespaces: + +* For functions and classes, the globals dictionary will + be that of the module where the object was defined. If the object + is itself a module, its globals dictionary will be its + own dict.
+* For methods on classes, and for classes, the locals dictionary + will be the class dictionary. +* If the annotations refer to free variables, the closure will + be the appropriate tuple containing free variables. + +Second, this PEP requires that the existing +``__annotations__`` must be a "data descriptor", +implementing all three actions: get, set, and delete. +``__annotations__`` must also have its own internal +storage it uses to cache a reference to the annotations dict: + +* Class and module objects must + cache the annotations dict in their ``__dict__``, using the key + ``__annotations__``. This is required for backwards + compatibility reasons. +* For function objects, storage for the annotations dict + cache is an implementation detail. It's preferably internal + to the function object and not visible in Python. + +This PEP defines semantics on how ``__annotations__`` and +``__annotate__`` interact, for all three types that implement them. +In the following examples, ``fn`` represents a function, ``cls`` +represents a class, ``mod`` represents a module, and ``o`` represents +an object of any of these three types: + +* When ``o.__annotations__`` is evaluated, and the internal storage + for ``o.__annotations__`` is unset, and ``o.__annotate__`` is set + to a callable, the getter for ``o.__annotations__`` calls + ``o.__annotate__(1)``, then caches the result in its internal + storage and returns the result. + + - To explicitly clarify one question that has come up multiple times: + this ``o.__annotations__`` cache is the *only* caching mechanism + defined in this PEP. There are *no other* caching mechanisms defined + in this PEP. The ``__annotate__`` functions generated by the Python + compiler explicitly don't cache any of the values they compute. + +* Setting ``o.__annotate__`` to a callable invalidates the + cached annotations dict. + +* Setting ``o.__annotate__`` to ``None`` has no effect on + the cached annotations dict.
+ +* Deleting ``o.__annotate__`` raises ``TypeError``. + ``__annotate__`` must always be set; this prevents unannotated + subclasses from inheriting the ``__annotate__`` method of one + of their base classes. + +* Setting ``o.__annotations__`` to a legal value + automatically sets ``o.__annotate__`` to ``None``. + + * Setting ``cls.__annotations__`` or ``mod.__annotations__`` + to ``None`` otherwise works like any other attribute; the + attribute is set to ``None``. + + * Setting ``fn.__annotations__`` to ``None`` invalidates + the cached annotations dict. If ``fn.__annotations__`` + doesn't have a cached annotations value, and ``fn.__annotate__`` + is ``None``, the ``fn.__annotations__`` data descriptor + creates, caches, and returns a new empty dict. (This is for + backwards compatibility with :pep:`3107` semantics.) + + + +Changes to ``inspect.get_annotations`` and ``typing.get_type_hints`` +==================================================================== + +(This PEP makes frequent reference to these two functions. From here on +it refers to them collectively as "the helper functions", as they help +user code work with annotations.) + +These two functions extract and return the annotations from an object. +``inspect.get_annotations`` returns the annotations unchanged; +for the convenience of static typing users, ``typing.get_type_hints`` +makes some modifications to the annotations before it returns them. + +This PEP adds a new keyword-only parameter to these two functions, +``format``. ``format`` specifies what format the values in the +annotations dict should be returned in. +``format`` accepts the following values, defined as attributes on the +``inspect`` module:: + + VALUE = 1 + FORWARDREF = 2 + SOURCE = 3 + +The default value for the ``format`` parameter is ``1``, +which is ``VALUE`` format.
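Because the ``__annotations__`` getter always requests ``VALUE`` format, the interaction rules between ``__annotations__`` and ``__annotate__`` listed in the previous section can be modelled in pure Python. The following is only an illustrative sketch under stated assumptions: ``FunctionLike``, ``_UNSET``, and the private ``_annotate`` / ``_cache`` slots are hypothetical names, the model follows the function-object flavor of the rules, and it is not how CPython implements these descriptors.

```python
VALUE = 1  # the PEP's inspect.VALUE format constant

_UNSET = object()  # hypothetical sentinel: "no annotations dict cached yet"


class FunctionLike:
    """Toy model of a function object's annotation attributes (an assumption)."""

    def __init__(self, annotate=None):
        self._annotate = annotate  # dedicated storage for __annotate__
        self._cache = _UNSET       # internal storage for __annotations__

    @property
    def __annotate__(self):
        return self._annotate

    @__annotate__.setter
    def __annotate__(self, value):
        if value is not None and not callable(value):
            raise TypeError("__annotate__ must be None or a callable")
        if value is not None:
            self._cache = _UNSET   # a new callable invalidates the cache
        self._annotate = value     # setting None leaves the cache untouched

    @__annotate__.deleter
    def __annotate__(self):
        raise TypeError("__annotate__ cannot be deleted")

    @property
    def __annotations__(self):
        if self._cache is _UNSET:
            if self._annotate is not None:
                # Computed lazily, once, then cached.
                self._cache = self._annotate(VALUE)
            else:
                # Function objects fall back to a fresh empty dict.
                self._cache = {}
        return self._cache

    @__annotations__.setter
    def __annotations__(self, value):
        # For functions, None invalidates the cache; anything else replaces it.
        self._cache = _UNSET if value is None else value
        self._annotate = None      # the two attributes never disagree


def annotate(format):
    # Stand-in for a compiler-generated annotate function.
    return {"x": int}


fn = FunctionLike(annotate)
print(fn.__annotations__ is fn.__annotations__)  # the same cached dict object
```

The only cache in this model lives behind ``__annotations__``; the ``annotate`` function itself recomputes its dict on every call, which mirrors the PEP's statement that there is no other caching mechanism.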
+ +The defined ``format`` values are guaranteed to be contiguous, +and the ``inspect`` module also publishes attributes representing +the minimum and maximum supported ``format`` values:: + + FORMAT_MIN = VALUE + FORMAT_MAX = SOURCE + + +Also, when either ``__annotations__`` or ``__annotate__`` +is updated on an object, the other of those two attributes +is now out-of-date and should also either be updated or +deleted. In general, the semantics established in the +previous section ensure that this happens automatically. +However, there's one case which for all practical +purposes can't be handled automatically: when the dict cached +by ``o.__annotations__`` is itself modified, or when mutable +values inside that dict are modified. + +Since this can't be handled in code, it must be handled in +documentation. This PEP proposes amending the documentation +for ``inspect.get_annotations`` (and similarly for +``typing.get_type_hints``) as follows: + + If you directly modify the ``__annotations__`` dict on an object, + by default these changes may not be reflected in the dictionary + returned by ``inspect.get_annotations`` when requesting either + ``SOURCE`` or ``FORWARDREF`` format on that object. Rather than + modifying the ``__annotations__`` dict directly, consider replacing + that object's ``__annotate__`` method with a function computing + the annotations dict with your desired values. Failing that, it's + best to overwrite the object's ``__annotate__`` method with ``None``, + or delete ``__annotate__`` from the object, to prevent + ``inspect.get_annotations`` from generating stale results + for ``SOURCE`` and ``FORWARDREF`` formats. + + + +The ``stringizer`` and the ``fake globals`` environment +======================================================= + +As originally proposed, this PEP supported many runtime +annotation user use cases, and many static type user use cases. 
+But this was insufficient--this PEP could not be accepted +until it satisfied *all* extant use cases. This became +a longtime blocker of this PEP until Carl Meyer proposed +the "stringizer" and the "fake globals" environment as +described below. These techniques allow this PEP to support +both the ``FORWARDREF`` and ``SOURCE`` formats, ably +satisfying all remaining use cases. + +In a nutshell, this technique involves running a +Python-compiler-generated ``__annotate__`` function in +an exotic runtime environment. Its normal ``globals`` +dict is replaced with what's called a "fake globals" dict. +A "fake globals" dict is a dict with one important difference: +every time you "get" a key from it that isn't mapped, +it creates, caches, and returns a new value for that key +(as per the ``__missing__`` callback for a +``collections.defaultdict``). +That value is an instance of a novel type referred to +as a "stringizer". + +A "stringizer" is a Python class with highly unusual behavior. +Every stringizer is initialized with its "value", initially +the name of the missing key in the "fake globals" dict. The +stringizer then implements every Python "dunder" method used to +implement operators, and the value returned by that method +is a new stringizer whose value is a text representation +of that operation. + +When these stringizers are used in expressions, the result +of the expression is a new stringizer whose value textually +represents that expression. For example, let's say +you have a variable ``f``, which is a reference to a +stringizer initialized with the value ``'f'``.
Here are +some examples of operations you could perform on ``f`` and +the values they would return:: + + >>> f + Stringizer('f') + >>> f + 3 + Stringizer('f + 3') + >>> f["key"] + Stringizer('f["key"]') + +Bringing it all together: if we run a Python-generated +``__annotate__`` function, but we replace its globals +with a "fake globals" dict, all undefined symbols it +references will be replaced with stringizer proxy objects +representing those symbols, and any operations performed +on those proxies will in turn result in proxies +representing that expression. This allows ``__annotate__`` +to complete, and to return an annotations dict, with +stringizer instances standing in for names and entire +expressions that could not have otherwise been evaluated. + +In practice, the "stringizer" functionality will be implemented +in the ``ForwardRef`` object currently defined in the +``typing`` module. ``ForwardRef`` will be extended to +implement all stringizer functionality; it will also be +extended to support evaluating the string it contains, +to produce the real value (assuming all symbols referenced +are defined). This means the ``ForwardRef`` object +will retain references to the appropriate "globals", +"locals", and even "closure" information needed to +evaluate the expression. + +This technique is the core of how ``inspect.get_annotations`` +supports ``FORWARDREF`` and ``SOURCE`` formats. Initially, +``inspect.get_annotations`` will call the object's +``__annotate__`` method requesting the desired format. +If that raises ``NotImplementedError``, ``inspect.get_annotations`` +will construct a "fake globals" environment, then call +the object's ``__annotate__`` method: + +* ``inspect.get_annotations`` produces ``SOURCE`` format + by creating a new empty "fake globals" dict, binding it + to the object's ``__annotate__`` method, calling that + requesting ``VALUE`` format, and + then extracting the "value" from each ``ForwardRef`` object + in the resulting dict.
+ +* ``inspect.get_annotations`` produces ``FORWARDREF`` format + by creating a new empty "fake globals" dict, pre-populating + it with the current contents of the ``__annotate__`` method's + globals dict, binding the "fake globals" dict to the object's + ``__annotate__`` method, calling that requesting ``VALUE`` + format, and returning the result. + +This entire technique works because the ``__annotate__`` functions +generated by the compiler are controlled by Python itself, and +are simple and predictable. They're +effectively a single ``return`` statement, computing and +returning the annotations dict. Since most operations needed +to compute an annotation are implemented in Python using dunder +methods, and the stringizer supports all the relevant dunder +methods, this approach is a reliable, practical solution. + +However, it's not reasonable to attempt this technique with +just any ``__annotate__`` method. This PEP assumes that +third-party libraries will implement their own ``__annotate__`` +methods, and those functions would almost certainly work +incorrectly when run in this "fake globals" environment. +For that reason, this PEP allocates a flag on code objects, +one of the unused bits in ``co_flags``, to mean "This code +object can be run in a 'fake globals' environment." This +makes the "fake globals" environment strictly opt-in, and +it's expected that only ``__annotate__`` methods generated +by the Python compiler will set it. + +The weakness in this technique is in handling operators which +don't directly map to dunder methods on an object. These are +all operators that implement some manner of flow control, +either branching or iteration: + +* Short-circuiting ``or`` +* Short-circuiting ``and`` +* Ternary operator (the ``if`` / ``else`` operator) +* Generator expressions +* List / dict / set comprehensions +* Iterable unpacking + +As a rule these techniques aren't used in annotations, +so this doesn't pose a problem in practice.
However, the +recent addition of ``TypeVarTuple`` to Python does use +iterable unpacking. The dunder methods +involved (``__iter__`` and ``__next__``) don't permit +distinguishing between iteration use cases; in order to +correctly detect which use case was involved, mere +"fake globals" and a "stringizer" wouldn't be sufficient; +this would require a custom bytecode interpreter designed +specifically around producing ``SOURCE`` and ``FORWARDREF`` +formats. + +Thankfully there's a shortcut that will work fine: +the stringizer will simply assume that when its +iteration dunder methods are called, it's in service +of iterable unpacking being performed by ``TypeVarTuple``. +It will hard-code this behavior. This means no other +technique using iteration will work, but in practice +this won't inconvenience real-world use cases. + + +Finally, note that the "fake globals" environment +will also require constructing a matching "fake locals" +dictionary, which for ``FORWARDREF`` format will be +pre-populated with the relevant locals dict. The +"fake globals" environment will also have to create +a fake "closure", a tuple of ``ForwardRef`` objects +pre-created with the names of the free variables +referenced by the ``__annotate__`` method. + +``ForwardRef`` proxies created from ``__annotate__`` +methods that reference free variables will map the +names and closure values of those free variables into +the locals dictionary, to ensure that ``eval`` uses +the correct values for those names. + + +Compiler-generated ``__annotate__`` functions +============================================== + +As mentioned in the previous section, the ``__annotate__`` +functions generated by the compiler are simple. They're +mainly a single ``return`` statement, computing and +returning the annotations dict. + +However, the protocol for ``inspect.get_annotations`` +to request either ``FORWARDREF`` or ``SOURCE`` format +requires first asking the ``__annotate__`` method to +produce it.
``__annotate__`` methods generated by +the Python compiler won't support either of these +formats and will raise ``NotImplementedError()``. + + +Third-party ``__annotate__`` functions +====================================== + +Third-party classes and functions will likely need +to implement their own ``__annotate__`` methods, +so that downstream users of +those objects can take full advantage of annotations. +In particular, wrappers will likely need to transform +the annotation dicts produced by the wrapped object--adding, +removing, or modifying the dictionary in some way. + +Most of the time, third-party code will implement +their ``__annotate__`` methods by calling +``inspect.get_annotations`` on some existing upstream +object. For example, wrappers will likely request the +annotations dict for their wrapped object, +in the format that was requested from them, then +modify the returned annotations dict as appropriate +and return that. This allows third-party code to +leverage the "fake globals" technique without +having to understand or participate in it. + +Third-party libraries that support both pre- and +post-PEP-649 versions of Python will have to innovate +their own best practices on how to support both. +One sensible approach would be for their wrapper to +always support ``__annotate__``, then call it requesting +``VALUE`` format and store the result as the +``__annotations__`` on their wrapper object. +This would support pre-649 Python semantics, and be +forward-compatible with post-649 semantics. 
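For readers who want to see what third-party code is being spared, the "stringizer" and "fake globals" technique described earlier can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the planned ``ForwardRef`` implementation: ``Stringizer``, ``FakeGlobals``, ``_text``, and the sample ``annotate`` function (with its undefined names ``Undefined``, ``Mapping``, ``Node``, and ``pkg``) are all hypothetical, and only a handful of dunder methods are covered. It rebinds the function's code object to a dict subclass whose ``__missing__`` hook supplies a proxy for every name, which approximates ``SOURCE`` format.

```python
import types


def _text(obj):
    """Render an operand as source-like text (approximate by design)."""
    if isinstance(obj, Stringizer):
        return obj.value
    if isinstance(obj, tuple):
        return ", ".join(_text(item) for item in obj)
    return repr(obj)


class Stringizer:
    """Proxy whose operators return new proxies describing the operation."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"Stringizer({self.value!r})"

    def __add__(self, other):
        return Stringizer(f"{self.value} + {_text(other)}")

    def __or__(self, other):
        return Stringizer(f"{self.value} | {_text(other)}")

    def __getitem__(self, key):
        return Stringizer(f"{self.value}[{_text(key)}]")

    def __getattr__(self, name):
        # Only reached for attributes that don't exist; leave dunders alone.
        if name.startswith("__"):
            raise AttributeError(name)
        return Stringizer(f"{self.value}.{name}")


class FakeGlobals(dict):
    """Dict that creates, caches, and returns a Stringizer for missing keys."""

    def __missing__(self, key):
        self[key] = proxy = Stringizer(key)
        return proxy


def annotate(format):
    # Stand-in for a compiler-generated annotate function: a single return.
    # None of these names are defined, so calling it directly would fail.
    return {"a": Undefined, "b": Mapping[str, Node | None], "c": pkg.Thing + 3}


# Rebind the same code object to an empty "fake globals" dict, call it
# requesting VALUE format (1), then extract the text from each proxy.
fake_annotate = types.FunctionType(annotate.__code__, FakeGlobals(), annotate.__name__)
source = {name: proxy.value for name, proxy in fake_annotate(1).items()}
print(source)
```

Pre-populating ``FakeGlobals`` with the function's real globals before the call would instead approximate ``FORWARDREF`` format, since only the names that are still undefined would come back as proxies.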
+ + + +Pseudocode +========== + +Here's high-level pseudocode for ``inspect.get_annotations``:: + + def get_annotations(o, format): + if format == VALUE: + return dict(o.__annotations__) + + if format == FORWARDREF: + try: + return dict(o.__annotations__) + except NameError: + pass + + if not hasattr(o, "__annotate__"): + return {} + + c_a = o.__annotate__ + try: + return c_a(format) + except NotImplementedError: + if not can_be_called_with_fake_globals(c_a): + return {} + c_a_with_fake_globals = make_fake_globals_version(c_a, format) + return c_a_with_fake_globals(VALUE) + +Here's what a Python compiler-generated ``__annotate__`` method +might look like if it were written in Python:: + + def __annotate__(self, format): + if format != 1: + raise NotImplementedError() + return { ... } + +Here's how a third-party wrapper class might implement +``__annotate__``. In this example, the wrapper works +like ``functools.partial``, pre-binding one parameter of +the wrapped callable, which for simplicity must be named +``arg``:: + + def __annotate__(self, format): + ann = inspect.get_annotations(self.wrapped_fn, format=format) + if 'arg' in ann: + del ann['arg'] + return ann + + +Other modifications to existing objects +======================================= + +This PEP adds two more attributes to existing Python objects: +a ``__locals__`` attribute to function objects, and +an optional ``__globals__`` attribute to class objects. + +In Python, the bytecode interpreter can reference both a +"globals" and a "locals" dictionary. However, the current +function object can only be bound to a globals dictionary, +via the ``__globals__`` attribute. Traditionally the +"locals" dictionary is only set when executing a class. +This PEP needs to set the "locals" dictionary to the class dict +when evaluating annotations defined inside a class namespace. +So this PEP defines a new ``__locals__`` attribute on +functions.
By default it is uninitialized, or rather is set +to an internal value that indicates it hasn't been explicitly set. +It can be set to either ``None`` or a dictionary. If it's set to +a dictionary, the interpreter will use that dictionary as +the "locals" dictionary when running the function. + +In Python, function objects contain a reference to their own +``__globals__``. However, class objects aren't currently +defined as doing so in Python. The implementation of +``__annotate__`` in CPython needs a reference to the module +globals in order to bind the unbound code object. So this PEP +defines a new ``__globals__`` attribute on class objects, +which stores a reference to the globals for the module where +the class was defined. Note that this attribute is optional, +but was useful for the CPython implementation. + +(The class ``__globals__`` attribute does create a new reference +cycle, between a class and its module. However, any class that +contains a method already participates in at least one such cycle.) Interactive REPL Shell ----------------------- +====================== -Everything works the same inside Python's interactive REPL shell, -except for module annotations in the interactive module (``__main__``) -itself. Since that module is never "finished", there's no specific -point where we can compile the ``__co_annotations__`` function. +The semantics established in this PEP also hold true when executing +code in Python's interactive REPL shell, except for module annotations +in the interactive module (``__main__``) itself. Since that module is +never "finished", there's no specific point where we can compile the +``__annotate__`` function. For the sake of simplicity, in this case we forego delayed evaluation. Module-level annotations in the REPL shell will continue to work -exactly as they do today, evaluating immediately and setting the -result directly inside the ``__annotations__`` dict. - -(It might be possible to support delayed evaluation here. 
-But it gets complicated quickly, and for a nearly-non-existent -use case.) +exactly as they do with "stock semantics", evaluating immediately and +setting the result directly inside the ``__annotations__`` dict. Annotations On Local Variables Inside Functions ------------------------------------------------ +=============================================== Python supports syntax for local variable annotations inside functions. However, these annotations have no runtime @@ -726,14 +1064,25 @@ as stock semantics and :pep:`563`. +Prototype +========= + +The original prototype implementation of this PEP can be found here: + +https://github.com/larryhastings/co_annotations/ + +As of this writing, the implementation is severely out of date; +it's based on Python 3.10 and implements the semantics of the +first draft of this PEP, from early 2021. It will be updated +shortly. + + + Performance Comparison ----------------------- +====================== -Performance with this PEP should be favorable, when compared with either -stock behavior or :pep:`563`. In general, resources are only consumed -on demand—"you only pay for what you use". - -There are three scenarios to consider: +Performance with this PEP is generally favorable. There are three +scenarios to consider: * the runtime cost when annotations aren't defined, * the runtime cost when annotations are defined but *not* referenced, and @@ -748,32 +1097,30 @@ generated for it. This requires no runtime processor time and consumes no memory. When annotations are defined but not referenced, the runtime cost -of Python with this PEP should be roughly equal to or slightly better -than :pep:`563` semantics, and slightly better than "stock" Python -semantics. The specifics depend on the object being annotated: +of Python with this PEP is roughly the same as :pep:`563`, and +improved over stock. 
The specifics depend on the object +being annotated: * With stock semantics, the annotations dict is always built, and set as an attribute of the object being annotated. -* In :pep:`563` semantics, for function objects, a single constant - (a tuple) is set as an attribute of the function. For class and - module objects, the annotations dict is always built and set as - an attribute of the class or module. +* In :pep:`563` semantics, for function objects, a precompiled + constant (a specially constructed tuple) is set as an attribute + of the function. For class and module objects, the annotations + dict is always built and set as an attribute of the class or module. * With this PEP, a single object is set as an attribute of the - object being annotated. Most often, this object is a constant - (a code object). In cases where the annotation refers to local - variables or class variables, the code object will be bound to - a function object, and the function object is set as the attribute - of the object being annotated. + object being annotated. Most of the time, this object is + a constant (a code object), but when the annotations require a + class namespace or closure, this object will be a tuple constructed + at binding time. When annotations are both defined and referenced, code using -this PEP should be much faster than code using :pep:`563` semantics, -and equivalent to or slightly improved over original Python -semantics. :pep:`563` semantics requires invoking ``eval()`` for -every value inside an annotations dict, which is enormously slow. -And, as already mentioned, this PEP generates measurably more -efficient bytecode for class and module annotations than stock +this PEP should be much faster than :pep:`563`, and be as fast +or faster than stock. :pep:`563` semantics requires invoking +``eval()`` for every value inside an annotations dict which is +enormously slow. 
And the implementation of this PEP generates measurably +more efficient bytecode for class and module annotations than stock semantics; for function annotations, this PEP and stock semantics -should be roughly equivalent. +should be about the same speed. Memory use should also be comparable in all three scenarios across all three semantic contexts. In the first and third scenarios, @@ -782,206 +1129,318 @@ In the second scenario, when annotations are defined but not referenced, using this PEP's semantics will mean the function/class/module will store one unused code object (possibly bound to an unused function object); with the other two semantics, -they'll store one unused dictionary (or constant tuple). - -Bytecode Comparison -------------------- - -The bytecode generated for annotations functions with -this PEP uses the efficient ``BUILD_CONST_KEY_MAP`` opcode -to build the dict for all annotatable objects: -functions, classes, and modules. - -Stock semantics also uses ``BUILD_CONST_KEY_MAP`` bytecode -for function annotations. :pep:`563` has an even more efficient -method for building annotations dicts on functions, leveraging -the fact that its annotations dicts only contain strings for -both keys and values. At compile-time it constructs a tuple -containing pairs of keys and values at compile-time, then -at runtime it converts that tuple into a dict on demand. -This is a faster technique than either stock semantics -or this PEP can employ, because in those two cases -annotations dicts can contain Python values of any type. -Of course, this performance win is negated if the -annotations are examined, due to the overhead of ``eval()``. - -For class and module annotations, both stock semantics -and :pep:`563` generate a longer and slightly-less-efficient -stanza of bytecode, creating the dict and setting the -annotations individually. +they'll store one unused dictionary or constant tuple. 
-For Future Discussion -===================== +*********************** +Backwards Compatibility +*********************** -Circular Imports ----------------- +Backwards Compatibility With Stock Semantics +============================================ -There is one unfortunately-common scenario where :pep:`563` -currently provides a better experience, and it has to do -with large code bases, with circular dependencies and -imports, that examine their annotations at run-time. +This PEP preserves nearly all existing behavior of +annotations from stock semantics: -:pep:`563` permitted defining *and examining* invalid -expressions as annotations. Its implementation requires -annotations to be legal Python expressions, which it then -converts into strings at compile-time. But legal Python -expressions may not be computable at runtime, if for -example the expression references a name that isn't defined. -This is a problem for stringized annotations if they're -evaluated, e.g. with ``typing.get_type_hints()``. But -any stringized annotation may be examined harmlessly at -any time--as long as you don't evaluate it, and only -examine it as a string. +* The format of the annotations dict stored in + the ``__annotations__`` attribute is unchanged. + Annotations dicts contain real values, not strings + as per :pep:`563`. +* Annotations dicts are mutable, and any changes to them are + preserved. +* The ``__annotations__`` attribute can be explicitly set, + and any legal value set this way will be preserved. +* The ``__annotations__`` attribute can be deleted using + the ``del`` statement. -Some large organizations have code bases that unfortunately -have circular dependency problems with their annotations--class -A has methods annotated with class B, but class B has methods -annotated with class A--that can be difficult to resolve. 
-Since :pep:`563` stringizes their annotations, it allows them -to leave these circular dependencies in place, and they can -sidestep the circular import problem by never importing the -module that defines the types used in the annotations. Their -annotations can no longer be evaluated, but this appears not -to be a concern in practice. They can then examine the -stringized form of the annotations at runtime and this seems -to be sufficient for their needs. +Most code that works with stock semantics should +continue to work when this PEP is active without any +modification necessary. But there are exceptions, +as follows. -This PEP allows for many of the same behaviors. -Annotations must be legal Python expressions, which -are compiled into a function at compile-time. -And if the code never examines an annotation, it won't -have any runtime effect, so here too annotations can -harmlessly refer to undefined names. (It's exactly -like defining a function that refers to undefined -names--then never calling that function. Until you -call the function, nothing bad will happen.) - -But examining an annotation when this PEP is active -means evaluating it, which means the names evaluated -in that expression must be defined. An undefined name -will throw a ``NameError`` in an annotation function, -just as it would with a stringized annotation passed -in to ``typing.get_type_hints()``, and just like any -other context in Python where an expression is evaluated. - -In discussions we have yet to find a solution to this -problem that makes all the participants in the -conversation happy. There are various avenues to explore -here: - -* One workaround is to continue to stringize one's - annotations, either by hand or done automatically - by the Python compiler (as it does today with - ``from __future__ import annotations``). 
This might - mean preserving Python's current stringizing annotations - going forward, although leaving it turned off by default, - only available by explicit request (though likely with - a different mechanism than - ``from __future__ import annotations``). -* Another possible workaround involves importing - the circularly-dependent modules separately, then - externally adding ("monkey-patching") their dependencies - to each other after the modules are loaded. As long - as the modules don't examine their annotations until - after they are completely loaded, this should work fine - and be maintainable with a minimum of effort. -* A third and more radical approach would be to change the - semantics of annotations so that they don't raise a - ``NameError`` when an unknown name is evaluated, - but instead create some sort of proxy "reference" object. -* Of course, even if we do deprecate :pep:`563`, it will be - several releases before the functionality is removed, - giving us several years in which to research and innovate - new solutions for this problem. - -In any case, the participants of the discussion agree that -this PEP should still move forward, even as this issue remains -currently unresolved [1]_. - -.. [1] https://github.com/larryhastings/co_annotations/issues/1 +First, there's a well-known idiom for accessing class +annotations which may not work correctly when this +PEP is active. The original implementation of class +annotations had what can only be called a bug: if a class +didn't define any annotations of its own, but one +of its base classes did define annotations, the class +would "inherit" those annotations. This behavior +was never desirable, so user code found a workaround: +instead of accessing the annotations on the class +directly via ``cls.__annotations__``, code would +access the class's annotations via its dict as in +``cls.__dict__.get("__annotations__", {})``. 
This +idiom worked because classes stored their annotations +in their ``__dict__``, and accessing them this way +avoided the lookups in the base classes. The technique +relied on implementation details of CPython, so it +was never supported behavior--though it was necessary. +However, when this PEP is active, a class may have +annotations defined but not yet have called ``__annotate__`` +and cached the result, in which case this approach +would lead to mistakenly assuming the class didn't have +annotations. +In any case, the bug was fixed as of Python 3.10, and the +idiom should no longer be used. Also as of Python 3.10, +there's an +`Annotations HOWTO `_ +that defines best practices +for working with annotations; code that follows these +guidelines will work correctly even when this PEP is +active, because it suggests using different approaches +to get annotations from class objects based on the +Python version the code runs under. -cls.__globals__ and fn.__locals__ ---------------------------------- +Since delaying the evaluation of annotations until they are +accessed changes the semantics of the language, it's observable +from within the language. Therefore it's *possible* to write code +that behaves differently based on whether annotations are +evaluated at binding time or at access time, e.g. -Is it permissible to add the ``__globals__`` reference to class -objects as proposed here? It's not clear why this hasn't already -been done; :pep:`563` could have made use of class globals, but instead -made do with looking up classes inside ``sys.modules``. Python -seems strangely allergic to adding a ``__globals__`` reference to -class objects. +.. 
code-block:: -If adding ``__globals__`` to class objects is indeed a bad idea -(for reasons I don't know), here are two alternatives as to -how classes could get a reference to their globals for the -implementation of this PEP: + mytype = str + def foo(a:mytype): pass + mytype = int + print(foo.__annotations__['a']) -* The generate code for a class could bind its annotations code - object to a function at the time the class is bound, rather than - waiting for ``__annotations__`` to be referenced, making them an - exception to the rule (even though "special cases aren't special - enough to break the rules"). This would result in a small - additional runtime cost when annotations were defined but not - referenced on class objects. Honestly I'm more worried about - the lack of symmetry in semantics. (But I wouldn't want to - pre-bind all annotations code objects, as that would become - much more costly for function objects, even as annotations are - rarely used at runtime.) -* Use the class's ``__module__`` attribute to look up its module - by name in ``sys.modules``. This is what :pep:`563` advises. - While this is passable for userspace or library code, it seems - like a little bit of a code smell for this to be defined semantics - baked into the language itself. - -Also, the prototype gets globals for class objects by calling -``globals()`` then storing the result. I'm sure there's a much -faster way to do this, I just didn't know what it was when I was -prototyping. I'm sure we can revise this to something much faster -and much more sanitary. I'd prefer to make it completely internal -anyway, and not make it visible to the user (via this new -__globals__ attribute). There's possibly already a good place to -put it anyway--``ht_module``. - -Similarly, this PEP adds one new dunder member to functions, -classes, and modules (``__co_annotations__``), and a second new -dunder member to functions (``__locals__``). This might be -considered excessive. 
+This will print ``<class 'str'>`` with stock semantics +and ``<class 'int'>`` when this PEP is active. This is +therefore a backwards-incompatible change. However, this +example is poor programming style, so this change seems +acceptable. -Bikeshedding the name ---------------------- +There are two uncommon interactions possible with class +and module annotations that work with stock semantics +that would no longer work when this PEP was active. +These two interactions would have to be prohibited. The +good news is, neither is common, and neither is considered +good practice. In fact, they're rarely seen outside of +Python's own regression test suite. They are: -During most of the development of this PEP, user code actually -could see the raw annotation code objects. ``__co_annotations__`` -could only be set to a code object; functions and other callables -weren't permitted. In that context the name ``co_annotations`` -makes a lot of sense. But with this last-minute pivot where -``__co_annotations__`` now presents itself as a callable, -perhaps the name of the attribute and the name of the -``from __future__ import`` needs a re-think. +* *Code that sets annotations on module or class attributes + from inside any kind of flow control statement.* It's + currently possible to set module and class attributes with + annotations inside an ``if`` or ``try`` statement, and it works + as one would expect. It's untenable to support this behavior + when this PEP is active. +* *Code in module or class scope that references or modifies the + local* ``__annotations__`` *dict directly.* Currently, when + setting annotations on module or class attributes, the generated + code simply creates a local ``__annotations__`` dict, then adds + mappings to it as needed. It's possible for user code + to directly modify this dict, though this doesn't seem to be + an intentional feature. 
Although it would be possible to support + this after a fashion once this PEP was active, the semantics + would likely be surprising and wouldn't make anyone happy. + +Note that these are both also pain points for static type checkers, +and are unsupported by those tools. It seems reasonable to +declare that both are at the very least unsupported, and their +use results in undefined behavior. It might be worth making a +small effort to explicitly prohibit them with compile-time checks. + +Finally, if this PEP is active, annotation values shouldn't use +the ``if / else`` ternary operator. Although this will work +correctly when accessing ``o.__annotations__`` or requesting +``inspect.VALUE`` from a helper function, the boolean expression +may not compute correctly with ``inspect.FORWARDREF`` when +some names are defined, and would be far less correct with +``inspect.SOURCE``. +Backwards Compatibility With PEP 563 Semantics +============================================== + +:pep:`563` changed the semantics of annotations. When its semantics +are active, annotations must assume they will be evaluated in +*module-level* or *class-level* scope. They may no longer refer directly +to local variables in the current function or an enclosing function. +This PEP removes that restriction, and annotations may refer to any +local variable. + +:pep:`563` requires using ``eval`` (or a helper function like +``typing.get_type_hints`` or ``inspect.get_annotations`` that +uses ``eval`` for you) to convert stringized annotations into +their "real" values. Existing code that activates stringized +annotations, and calls ``eval()`` directly to convert the strings +back into real values, can simply remove the ``eval()`` call. +Existing code using a helper function would continue to work +unchanged, though use of those functions may become optional. + +Static typing users often have modules that only contain +inert type hint definitions--but no live code. 
These modules +are only needed when running static type checking; they aren't +used at runtime. But under stock semantics, these modules +have to be imported in order for the runtime to evaluate and +compute the annotations. Meanwhile, these modules often +caused circular import problems that could be difficult or +even impossible to solve. :pep:`563` allowed users to solve +these circular import problems by doing two things. First, +they activated :pep:`563` in their modules, which meant annotations +were constant strings, and didn't require the real symbols to +be defined in order for the annotations to be computable. +Second, this permitted users to only import the problematic +modules in an ``if typing.TYPE_CHECKING`` block. This allowed +the static type checkers to import the modules and the type +definitions inside, but they wouldn't be imported at runtime. +So far, this approach will work unchanged when this PEP is +active; ``if typing.TYPE_CHECKING`` is supported behavior. + +However, some codebases actually *did* examine their +annotations at runtime, even when using the ``if typing.TYPE_CHECKING`` +technique and not importing definitions used in their annotations. +These codebases examined the annotation strings *without +evaluating them,* instead relying on identity checks or +simple lexical analysis on the strings. + +This PEP supports these techniques too. But users will need +to port their code to it. First, user code will need to use +``inspect.get_annotations`` or ``typing.get_type_hints`` to +access the annotations; they won't be able to simply get the +``__annotations__`` attribute from their object. Second, +they will need to specify either ``inspect.FORWARDREF`` +or ``inspect.SOURCE`` for the ``format`` when calling that +function. This means the helper function can succeed in +producing the annotations dict, even when not all the symbols +are defined. 
Code expecting stringized annotations should +work unmodified with ``inspect.SOURCE`` formatted annotations +dicts; however, users should consider switching to +``inspect.FORWARDREF``, as it may make their analysis easier. + +Similarly, :pep:`563` permitted use of class decorators on +annotated classes in a way that hadn't previously been possible. +Some class decorators (e.g. ``dataclasses``) examine the annotations +on the class. Because class decorators using the ``@`` decorator +syntax are run before the class name is bound, they can cause +unsolvable circular-definition problems. If you annotate attributes +of a class with references to the class itself, or annotate attributes +in multiple classes with circular references to each other, you +can't decorate those classes with the ``@`` decorator syntax +using decorators that examine the annotations. :pep:`563` allowed +this to work, as long as the decorators examined the strings lexically +and didn't use ``eval`` to evaluate them (or handled the ``NameError`` +with further workarounds). When this PEP is active, decorators will +be able to compute the annotations dict in ``inspect.SOURCE`` or +``inspect.FORWARDREF`` format using the helper functions. This +will permit them to analyze annotations containing undefined +symbols, in the format they prefer. + +Early adopters of :pep:`563` discovered that "stringized" +annotations were useful for automatically-generated documentation. +Users experimented with this use case, and Python's ``pydoc`` +has expressed some interest in this technique. This PEP supports +this use case; the code generating the documentation will have to be +updated to use a helper function to access the annotations in +``inspect.SOURCE`` format. + +Finally, the warnings about using the ``if / else`` ternary +operator in annotations apply equally to users of :pep:`563`. +It currently works for them, but could produce incorrect +results when requesting some formats from the helper functions. 
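To make the ternary-operator warning concrete, here is a small simulation of what can go wrong when undefined names are replaced by stand-in objects. This is only a sketch of the general idea, not this PEP's actual "fake globals" implementation: ``ForwardRefProxy``, ``FakeNamespace``, and the name ``USE_INT`` are all hypothetical, and the annotation is evaluated from a string with ``eval`` purely for demonstration.

```python
import builtins


class ForwardRefProxy:
    """Hypothetical stand-in for a name that isn't defined yet."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"ForwardRefProxy({self.name!r})"


class FakeNamespace(dict):
    """Mapping that answers every unknown name with a proxy object."""

    def __missing__(self, key):
        # Real builtins such as int and str should still resolve normally.
        if hasattr(builtins, key):
            return getattr(builtins, key)
        return ForwardRefProxy(key)


annotation = "int if USE_INT else str"

# With the real value available, the condition is honored.
real = eval(annotation, {}, {"USE_INT": False})

# With the name missing, the proxy is truthy like any plain object,
# so the first branch is chosen regardless of the intended value.
simulated = eval(annotation, {}, FakeNamespace())
```

Here ``real`` is ``str`` but ``simulated`` is ``int``: the stand-in object cannot know what the missing condition would have evaluated to, so the two results disagree.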
+ +If this PEP is accepted, :pep:`563` will be deprecated and +eventually removed. To facilitate this transition for early +adopters of :pep:`563`, who now depend on its semantics, +``inspect.get_annotations`` and ``typing.get_type_hints`` will +implement a special affordance. + +The Python compiler won't generate annotation code objects +for objects defined in a module where :pep:`563` semantics are +active, even if this PEP is accepted. So, under normal +circumstances, requesting ``inspect.SOURCE`` format from a +helper function would return an empty dict. As an affordance, +to facilitate the transition, if the helper functions detect +that an object was defined in a module with :pep:`563` active, +and the user requests ``inspect.SOURCE`` format, they'll return +the current value of the ``__annotations__`` dict, which in +this case will be the stringized annotations. This will allow +:pep:`563` users who lexically analyze stringized annotations +to immediately change over to requesting ``inspect.SOURCE`` format +from the helper functions, which will hopefully smooth their +transition away from :pep:`563`. + + +************** +Rejected Ideas +************** + +"Just store the strings" +======================== + +One proposed idea for supporting ``SOURCE`` format was for +the Python compiler to emit the actual source code for the +annotation values somewhere, and to furnish that when +the user requested ``SOURCE`` format. + +This idea wasn't rejected so much as categorized as +"not yet". We already know we need to support ``FORWARDREF`` +format, and that technique can be adapted to support +``SOURCE`` format in just a few lines. There are many +unanswered questions about this approach: + +* Where would we store the strings? Would they always + be loaded when the annotated object was created, or + would they be lazy-loaded on demand? If so, how + would the lazy-loading work? +* Would the "source code" include the newlines and + comments of the original? 
Would it preserve all + whitespace, including indents and extra spaces used + purely for formatting? + +It's possible we'll revisit this topic in the future, +if improving the fidelity of ``SOURCE`` values to the +original source code is judged sufficiently important. + + +**************** Acknowledgements -================ +**************** -Thanks to Barry Warsaw, Eric V. Smith, Mark Shannon, -and Guido van Rossum for feedback and encouragement. -Thanks in particular to Mark Shannon for two key -suggestions—build the entire annotations dict inside -a single code object, and only bind it to a function -on demand—that quickly became among the best aspects -of this proposal. Also, thanks in particular to Guido -van Rossum for suggesting that ``__co_annotations__`` -functions should duplicate the name visibility rules of -annotations under "stock" semantics--this resulted in -a sizeable improvement to the second draft. Finally, -special thanks to Jelle Zijlstra, who contributed not -just feedback--but code! +Thanks to Carl Meyer, Barry Warsaw, Eric V. Smith, +Mark Shannon, Jelle Zijlstra, and Guido van Rossum for ongoing +feedback and encouragement. + +Particular thanks to several individuals who contributed key ideas +that became some of the best aspects of this proposal: + +* Carl Meyer suggested the "stringizer" technique that made + ``FORWARDREF`` and ``SOURCE`` formats possible, which + allowed forward progress on this PEP after + a year of languishing due to seemingly-unfixable problems. + He also suggested the affordance for :pep:`563` users where + ``inspect.SOURCE`` will return the stringized annotations, + and many more suggestions besides. Carl was also the primary + correspondent in private email threads discussing this PEP, + and was a tireless resource and voice of sanity. This PEP + would almost certainly not have been accepted were it not + for Carl's contributions. 
+* Mark Shannon suggested building the entire annotations dict + inside a single code object, and only binding it to a function + on demand. +* Guido van Rossum suggested that ``__annotate__`` + functions should duplicate the name visibility rules of + annotations under "stock" semantics. +* Jelle Zijlstra contributed not only feedback--but code! +********** +References +********** + +https://github.com/larryhastings/co_annotations/issues + +https://discuss.python.org/t/two-polls-on-how-to-revise-pep-649/23628 + +https://discuss.python.org/t/a-massive-pep-649-update-with-some-major-course-corrections/25672 + + + +********* Copyright -========= +********* This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.