python-peps/peps/pep-0747.rst

640 lines
23 KiB
ReStructuredText
Raw Normal View History

PEP: 747
Title: Annotating Type Forms
Author: David Foster <david at dafoster.net>, Eric Traut <erictr at microsoft.com>
Sponsor: Jelle Zijlstra <jelle.zijlstra at gmail.com>
Discussions-To: https://discuss.python.org/t/pep-747-typeexpr-type-hint-for-a-type-expression/55984
Status: Draft
Type: Standards Track
Topic: Typing
Created: 27-May-2024
Python-Version: 3.14
Post-History: `19-Apr-2024 <https://discuss.python.org/t/typeform-spelling-for-a-type-annotation-object-at-runtime/51435>`__, `04-May-2024 <https://discuss.python.org/t/typeform-spelling-for-a-type-annotation-object-at-runtime/51435/7/>`__, `17-Jun-2024 <https://discuss.python.org/t/pep-747-typeexpr-type-hint-for-a-type-expression/55984>`__
Abstract
========
:ref:`Type expressions <typing:type-expression>` provide a standardized way
to specify types in the Python type system. When a type expression is
evaluated at runtime, the resulting *type form object* encodes the information
supplied in the type expression. This enables a variety of use cases including
runtime type checking, introspection, and metaprogramming.
Such use cases have proliferated, but there is currently no way to accurately
annotate functions that accept type form objects. Developers are forced to use
an overly-wide type like ``object``, which makes some use cases impossible and
generally reduces type safety. This PEP addresses this limitation by
introducing a new special form ``typing.TypeForm``.
This PEP makes no changes to the Python grammar. ``TypeForm`` is
intended to be enforced only by type checkers, not by the Python runtime.
Motivation
==========
A function that operates on type form objects must understand how type
expression details are encoded in these objects. For example, ``int | str``,
``"int | str"``, ``list[int]``, and ``MyTypeAlias`` are all valid type
expressions, and they evaluate to instances of ``types.UnionType``,
``builtins.str``, ``types.GenericAlias``, and ``typing.TypeAliasType``,
respectively.
There is currently no way to indicate to a type checker that a function accepts
type form objects and knows how to work with them. ``TypeForm`` addresses this
limitation. For example, here is a function that checks whether a value is
assignable to a specified type and returns None if it is not::
def trycast[T](typx: TypeForm[T], value: object) -> T | None: ...
The use of ``TypeForm`` and the type variable ``T`` describes a relationship
between the type form passed to parameter ``typx`` and the function's
return type.
``TypeForm`` can also be used with :ref:`typing:typeis` to define custom type
narrowing behaviors::
def isassignable[T](value: object, typx: TypeForm[T]) -> TypeIs[T]: ...
request_json: object = ...
if isassignable(request_json, MyTypedDict):
assert_type(request_json, MyTypedDict) # Type of variable is narrowed
The ``isassignable`` function implements something like an enhanced
``isinstance`` check. This is useful for validating whether a value decoded
from JSON conforms to a particular structure of nested ``TypedDict``\ s,
lists, unions, ``Literal``\ s, or any other type form that can be described
with a type expression. This kind of check was alluded to in
:pep:`PEP 589 <589#using-typeddict-types>` but could not be implemented without
``TypeForm``.
Why not ``type[C]``?
--------------------
One might think that ``type[C]`` would suffice for these use cases. However,
only class objects (instances of the ``builtins.type`` class) are assignable
to ``type[C]``. Many type form objects do not meet this requirement::
def trycast[T](typx: type[T], value: object) -> T | None: ...
trycast(str, 'hi') # OK
trycast(Literal['hi'], 'hi') # Type violation
trycast(str | None, 'hi') # Type violation
trycast(MyProtocolClass, obj) # Type violation
TypeForm use cases
------------------
`A survey of Python libraries`_ reveals several categories of functions that
would benefit from ``TypeForm``:
.. _A survey of Python libraries: https://github.com/python/mypy/issues/9773#issuecomment-2017998886
- Assignability checkers:
- Determines whether a value is assignable to a specified type
- Pattern 1:
``def is_assignable[T](value: object, typx: TypeForm[T]) -> TypeIs[T]``
- Pattern 2:
``def is_match[T](value: object, typx: TypeForm[T]) -> TypeGuard[T]``
- Examples: beartype.\ `is_bearable`_, trycast.\ `isassignable`_,
typeguard.\ `check_type`_, xdsl.\ `isa`_
.. _is_bearable: https://github.com/beartype/beartype/issues/255
.. _isassignable: https://github.com/davidfstr/trycast?tab=readme-ov-file#isassignable-api
.. _check_type: https://typeguard.readthedocs.io/en/latest/api.html#typeguard.check_type
.. _isa: https://github.com/xdslproject/xdsl/blob/ac12c9ab0d64618475efb98d1d197bdd79f593c3/xdsl/utils/hints.py#L23
- Converters:
- If a value is assignable to (or coercible to) a specified type,
a *converter* returns the value narrowed to (or coerced to) that type.
Otherwise, an exception is raised.
- Pattern 1:
``def convert[T](value: object, typx: TypeForm[T]) -> T``
- Examples: cattrs.BaseConverter.\ `structure`_, trycast.\ `checkcast`_,
typedload.\ `load`_
- Pattern 2:
::
class Converter[T]:
def __init__(self, typx: TypeForm[T]) -> None: ...
def convert(self, value: object) -> T: ...
- Examples: pydantic.\ `TypeAdapter(T).validate_python`_,
mashumaro.\ `JSONDecoder(T).decode`_
.. _structure: https://github.com/python-attrs/cattrs/blob/5f5c11627a7f67a23d6212bc7df9f96243c62dc5/src/cattrs/converters.py#L332-L334
.. _checkcast: https://github.com/davidfstr/trycast#checkcast-api
.. _load: https://ltworf.github.io/typedload/
.. _TypeAdapter(T).validate_python: https://stackoverflow.com/a/61021183/604063
.. _JSONDecoder(T).decode: https://github.com/Fatal1ty/mashumaro?tab=readme-ov-file#usage-example
- Typed field definitions:
- Pattern:
::
class Field[T]:
value_type: TypeForm[T]
- Examples: attrs.\ `make_class`_,
dataclasses.\ `make_dataclass`_ [#DataclassInitVar]_, `openapify`_
.. _make_class: https://www.attrs.org/en/stable/api.html#attrs.make_class
.. _make_dataclass: https://github.com/python/typeshed/issues/11653
.. _openapify: https://github.com/Fatal1ty/openapify/blob/c8d968c7c9c8fd7d4888bd2ddbe18ffd1469f3ca/openapify/core/models.py#L16
The survey also identified some introspection functions that accept runtime
type forms as input. Today, these functions are annotated with ``object``:
- General introspection operations:
- Pattern: ``def get_annotation_info(typx: object) -> object``
- Examples: typing.{`get_origin`_, `get_args`_},
`typing_inspect`_.{is_*_type, get_origin, get_parameters}
These functions accept values evaluated from arbitrary annotation expressions,
not just type expressions, so they cannot be altered to use ``TypeForm``.
.. _get_origin: https://docs.python.org/3/library/typing.html#typing.get_origin
.. _get_args: https://docs.python.org/3/library/typing.html#typing.get_args
.. _typing_inspect: https://github.com/ilevkivskyi/typing_inspect?tab=readme-ov-file#readme
Specification
=============
When a type expression is evaluated at runtime, the resulting value is a
*type form* object. This value encodes the information supplied in the type
expression, and it represents the type described by that type expression.
``TypeForm`` is a special form that, when used in a type expression, describes
a set of type form objects. It accepts a single type argument, which must be a
valid type expression. ``TypeForm[T]`` describes the set of all type form
objects that represent the type ``T`` or types that are
:term:`assignable to <typing:assignable>` ``T``. For example,
``TypeForm[str | None]`` describes the set of all type form objects
that represent a type assignable to ``str | None``::
ok1: TypeForm[str | None] = str | None # OK
ok2: TypeForm[str | None] = str # OK
ok3: TypeForm[str | None] = None # OK
ok4: TypeForm[str | None] = Literal[None] # OK
ok5: TypeForm[str | None] = Optional[str] # OK
ok6: TypeForm[str | None] = "str | None" # OK
ok7: TypeForm[str | None] = Any # OK
err1: TypeForm[str | None] = str | int # Error
err2: TypeForm[str | None] = list[str | None] # Error
By this same definition, ``TypeForm[Any]`` describes a type form object
that represents the type ``Any`` or any type that is assignable to ``Any``.
Since all types in the Python type system are assignable to ``Any``,
``TypeForm[Any]`` describes the set of all type form objects
evaluated from all valid type expressions.
The type expression ``TypeForm``, with no type argument provided, is
equivalent to ``TypeForm[Any]``.
Implicit ``TypeForm`` Evaluation
--------------------------------
When a static type checker encounters an expression that follows all of the
syntactic, semantic and contextual rules for a type expression as detailed
in the typing spec, the evaluated type of this expression should be assignable
to ``TypeForm[T]`` if the type it describes is assignable to ``T``.
For example, if a static type checker encounters the expression ``str | None``,
it may normally evaluate its type as ``UnionType`` because it produces a
runtime value that is an instance of ``types.UnionType``. However, because
this expression is a valid type expression, it is also assignable to the
type ``TypeForm[str | None]``:
v1_actual: UnionType = str | None # OK
v1_type_form: TypeForm[str | None] = str | None # OK
v2_actual: type = list[int] # OK
v2_type_form: TypeForm = list[int] # OK
The ``Annotated`` special form is allowed in type expressions, so it can
also appear in an expression that is assignable to ``TypeForm``. Consistent
with the typing spec's rules for ``Annotated``, a static type checker may choose
to ignore any ``Annotated`` metadata that it does not understand::
v3: TypeForm[int | str] = Annotated[int | str, "metadata"] # OK
v4: TypeForm[Annotated[int | str, "metadata"]] = int | str # OK
A string literal expression containing a valid type expression should likewise
be assignable to ``TypeForm``::
v5: TypeForm[set[str]] = "set[str]" # OK
Expressions that violate one or more of the syntactic, semantic, or contextual
rules for type expressions should not evaluate to a ``TypeForm`` type. The rules
for type expression validity are explained in detail within the typing spec, so
they are not repeated here::
bad1: TypeForm = tuple() # Error: Call expression not allowed in type expression
bad2: TypeForm = (1, 2) # Error: Tuple expression not allowed in type expression
bad3: TypeForm = 1 # Non-class object not allowed in type expression
bad4: TypeForm = Self # Error: Self not allowed outside of a class
bad5: TypeForm = Literal[var] # Error: Variable not allowed in type expression
bad6: TypeForm = Literal[f""] # Error: f-strings not allowed in type expression
bad7: TypeForm = ClassVar[int] # Error: ClassVar not allowed in type expression
bad8: TypeForm = Required[int] # Error: Required not allowed in type expression
bad9: TypeForm = Final[int] # Error: Final not allowed in type expression
bad10: TypeForm = Unpack[Ts] # Error: Unpack not allowed in this context
bad11: TypeForm = Optional # Error: Invalid use of Optional special form
bad12: TypeForm = T # Error if T is an out-of-scope TypeVar
bad13: TypeForm = "int + str" # Error: invalid quoted type expression
Explicit ``TypeForm`` Evaluation
--------------------------------
``TypeForm`` also acts as a function that can be called with a single argument.
Type checkers should validate that this argument is a valid type expression::
x1 = TypeForm(str | None)
reveal_type(v1) # Revealed type is "TypeForm[str | None]"
x2 = TypeForm("list[int]")
revealed_type(v2) # Revealed type is "TypeForm[list[int]]"
x3 = TypeForm('type(1)') # Error: invalid type expression
At runtime the ``TypeForm(...)`` callable simply returns the value passed to it.
This explicit syntax serves two purposes. First, it documents the developer's
intent to use the value as a type form object. Second, static type checkers
validate that all rules for type expressions are followed::
x4 = type(int) # No error, evaluates to "type[int]"
x5 = TypeForm(type(int)) # Error: call not allowed in type expression
Assignability
-------------
``TypeForm`` has a single type parameter, which is covariant. That means
``TypeForm[B]`` is assignable to ``TypeForm[A]`` if ``B`` is assignable to
``A``::
def get_type_form() -> TypeForm[int]: ...
t1: TypeForm[int | str] = get_type_form() # OK
t2: TypeForm[str] = get_type_form() # Error
``type[T]`` is a subtype of ``TypeForm[T]``, which means that ``type[B]`` is
assignable to ``TypeForm[A]`` if ``B`` is assignable to ``A``::
def get_type() -> type[int]: ...
t3: TypeForm[int | str] = get_type() # OK
t4: TypeForm[str] = get_type() # Error
``TypeForm`` is a subtype of ``object`` and is assumed to have all of the
attributes and methods of ``object``.
Backward Compatibility
======================
This PEP clarifies static type checker behaviors when evaluating type
expressions in "value expression" contexts (that is, contexts where type
expressions are not mandated by the typing spec). In the absence of a
``TypeForm`` type annotation, existing type evaluation behaviors persist,
so no backward compatibility issues are anticipated. For example, if a static
type checker previously evaluated the type of expression ``str | None`` as
``UnionType``, it will continue to do so unless this expression is assigned
to a variable or parameter whose type is annotated as ``TypeForm``.
How to Teach This
=================
Type expressions are used in annotations to describe which values are accepted
by a function parameter, returned by a function, or stored in a variable:
.. code-block:: text
parameter type return type
| |
v v
def plus(n1: int, n2: int) -> int:
sum: int = n1 + n2
^
|
variable type
return sum
Type expressions evaluate to valid *type form* objects at runtime and can be
assigned to variables and manipulated like any other data in a program:
.. code-block:: text
a variable a type expression
| |
v v
int_type_form: TypeForm = int | None
^
|
the type of a type form object
``TypeForm[]`` is how you spell the type of a *type form* object, which is
a runtime representation of a type.
``TypeForm`` is similar to ``type``, but ``type`` is compatible only with
**class objects** like ``int``, ``str``, ``list``, or ``MyClass``.
``TypeForm`` accommodates any type form that can be expressed using
a valid type expression, including those with brackets (``list[int]``), union
operators (``int | None``), and special forms (``Any``, ``LiteralString``,
``Never``, etc.).
Most programmers will not define their *own* functions that accept a ``TypeForm``
parameter or return a ``TypeForm`` value. It is more common to pass a type
form object to a library function that knows how to decode and use such objects.
For example, the ``isassignable`` function in the ``trycast`` library
can be used like Python's built-in ``isinstance`` function to check whether
a value matches the shape of a particular type. ``isassignable`` accepts *any*
type form object as input.
- Yes:
::
from trycast import isassignable
if isassignable(some_object, MyTypedDict): # OK: MyTypedDict is a TypeForm[]
...
- No:
::
if isinstance(some_object, MyTypedDict): # ERROR: MyTypedDict is not a type[]
...
Advanced Examples
=================
If you want to write your own runtime type checker or a function that
manipulates type form objects as values at runtime, this section provides
examples of how such a function can use ``TypeForm``.
Introspecting type form objects
-------------------------------
Functions like ``typing.get_origin`` and ``typing.get_args`` can be used to
extract components of some type form objects.
::
import typing
def strip_annotated_metadata(typx: TypeForm[T]) -> TypeForm[T]:
if typing.get_origin(typx) is typing.Annotated:
typx = cast(TypeForm[T], typing.get_args(typx)[0])
return typx
``isinstance`` and ``is`` can also be used to distinguish between different
kinds of type form objects:
::
import types
import typing
def split_union(typx: TypeForm) -> tuple[TypeForm, ...]:
if isinstance(typ, types.UnionType): # X | Y
return cast(tuple[TypeForm, ...], typing.get_args(typ))
if typing.get_origin(typ) is typing.Union: # Union[X, Y]
return cast(tuple[TypeForm, ...], typing.get_args(typ))
if typ in (typing.Never, typing.NoReturn,):
return ()
return (typ,)
Combining with a type variable
------------------------------
``TypeForm`` can be parameterized by a type variable that is used elsewhere
within the same function definition:
::
def as_instance[T](typx: TypeForm[T]) -> T | None:
return typ() if isinstance(typ, type) else None
Combining with ``type``
-----------------------
Both ``TypeForm`` and ``type`` can be parameterized by the same type
variable within the same function definition:
::
def as_type[T](typx: TypeForm[T]) -> type[T] | None:
return typ if isinstance(typ, type) else None
Combining with ``TypeIs`` and ``TypeGuard``
-------------------------------------------
A type variable can also be used by a ``TypeIs`` or ``TypeGuard`` return type:
::
def isassignable[T](value: object, typx: TypeForm[T]) -> TypeIs[T]: ...
count: int | str = ...
if isassignable(count, int):
assert_type(count, int)
else:
assert_type(count, str)
Challenges When Accepting All TypeForms
---------------------------------------
A function that takes an *arbitrary* ``TypeForm`` as input must support a
variety of possible type form objects. Such functions are not easy to write.
- New special forms are introduced with each new Python version, and
special handling may be required for each one.
- Quoted annotations [#quoted_less_common]_ (like ``'list[str]'``)
must be *parsed* (to something like ``list[str]``).
- Resolving quoted forward references inside type expressions is typically
done with ``eval()``, which is difficult to use in a safe way.
- Recursive types like ``IntTree = list[int | 'IntTree']`` are difficult
to resolve.
- User-defined generic types (like Djangos ``QuerySet[User]``) can introduce
non-standard behaviors that require runtime support.
Reference Implementation
========================
Pyright (version 1.1.379) provides a reference implementation for ``TypeForm``.
Mypy contributors also `plan to implement <https://github.com/python/mypy/issues/9773>`__
support for ``TypeForm``.
A reference implementation of the runtime component is provided in the
``typing_extensions`` module.
Rejected Ideas
==============
Alternative names
-----------------
Alternate names were considered for ``TypeForm``. ``TypeObject``
and ``TypeType`` were deemed too generic. ``TypeExpression`` and ``TypeExpr``
were also considered, but these were considered confusing because these objects
are not themselves "expressions" but rather the result of evaluating a type
expression.
Widen ``type[C]`` to support all type expressions
-------------------------------------------------
``type`` was `designed`_ to describe class objects, subclasses of the
``type`` class. A value with the type ``type`` is assumed to be instantiable
through a constructor call. Widening the meaning of ``type`` to represent
arbitrary type form objects would present backward compatibility problems
and would eliminate a way to describe the set of values limited to subclasses
of ``type``.
.. _designed: https://mail.python.org/archives/list/typing-sig@python.org/message/D5FHORQVPHX3BHUDGF3A3TBZURBXLPHD/
Accept arbitrary annotation expressions
---------------------------------------
Certain special forms act as type qualifiers and can be used in
*some* but not *all* annotation contexts:
For example. the type qualifier ``Final`` can be used as a variable type but
not as a parameter type or a return type:
::
some_const: Final[str] = ... # OK
def foo(not_reassignable: Final[object]): ... # Error: Final not allowed here
def nonsense() -> Final[object]: ... # Error: Final not alowed here
With the exception of ``Annotated``, type qualifiers are not allowed in type
expressions. ``TypeForm`` is limited to type expressions because its
assignability rules are based on the assignability rules for types. It is
nonsensical to ask whether ``Final[int]`` is assignable to ``int`` because the
former is not a valid type expression.
Functions that wish to operate on objects that are evaluated from annotation
expressions can continue to accept such inputs as ``object`` parameters.
Pattern matching on type forms
------------------------------
It was asserted that some functions may wish to pattern match on the
interior of type expressions in their signatures.
One use case is to allow a function to explicitly enumerate all the
*specific* kinds of type expressions it supports as input.
Consider the following possible pattern matching syntax:
::
@overload
def checkcast(typx: TypeForm[AT=Annotated[T, *A]], value: str) -> T: ...
@overload
def checkcast(typx: TypeForm[UT=Union[*Ts]], value: str) -> Union[*Ts]: ...
@overload
def checkcast(typx: type[C], value: str) -> C: ...
# ... (more)
All functions observed in the wild that conceptually accept type form
objects generally try to support *all* kinds of type expressions, so it
doesnt seem valuable to enumerate a particular subset.
Additionally, the above syntax isnt precise enough to fully describe the
input constraints for a typical function in the wild. For example, many
functions do not support type expressions with quoted subexpressions
like ``list['Movie']``.
A second use case for pattern matching is to explicitly match an ``Annotated``
form to extract the interior type argument and strip away any metadata:
::
def checkcast(
typx: TypeForm[T] | TypeForm[AT=Annotated[T, *A]],
value: object
) -> T:
However, ``Annotated[T, metadata]`` is already treated equivalent to ``T``
by static type checkers. Theres no additional value in being explicit about
this behavior. The example above could more simply be written as the equivalent:
::
def checkcast(typx: TypeForm[T], value: object) -> T:
Footnotes
=========
.. [#type_t]
:ref:`Type[T] <typing:type-brackets>` spells a class object
.. [#TypeIs]
:ref:`TypeIs[T] <typing:typeis>` is similar to bool
.. [#DataclassInitVar]
``dataclass.make_dataclass`` allows the type qualifier ``InitVar[...]``,
so ``TypeForm`` cannot be used in this case.
.. [#forward_ref_normalization]
Special forms normalize string arguments to ``ForwardRef`` instances
at runtime using internal helper functions in the ``typing`` module.
Runtime type checkers may wish to implement similar functions when
working with string-based forward references.
.. [#quoted_less_common]
Quoted annotations are expected to become less common starting in Python
3.14 when :pep:`deferred annotations <649>` is implemented. However,
code written for earlier Python versions relies on quoted annotations and
will need to be supported for several years.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.