python-peps/pep-0646.rst

1508 lines
52 KiB
ReStructuredText
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 646
Title: Variadic Generics
Author: Mark Mendoza <mendoza.mark.a@gmail.com>,
Matthew Rahtz <mrahtz@google.com>,
Pradeep Kumar Srinivasan <gohanpra@gmail.com>,
Vincent Siles <vsiles@fb.com>
Sponsor: Guido van Rossum <guido@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-Sep-2020
Python-Version: 3.10
Post-History: 07-Oct-2020, 23-Dec-2020, 29-Dec-2020
Abstract
========
PEP 484 introduced ``TypeVar``, enabling creation of generics parameterised
with a single type. In this PEP, we introduce ``TypeVarTuple``, enabling parameterisation
with an *arbitrary* number of types - that is, a *variadic* type variable,
enabling *variadic* generics. This enables a wide variety of use cases.
In particular, it allows the type of array-like structures
in numerical computing libraries such as NumPy and TensorFlow to be
parameterised with the array *shape*, enabling static type checkers
to catch shape-related bugs in code that uses these libraries.
Motivation
==========
Variadic generics have long been a requested feature, for a myriad of
use cases [#typing193]_. One particular use case - a use case with potentially
large impact, and the main case this PEP targets - concerns typing in
numerical libraries.
In the context of numerical computation with libraries such as NumPy and TensorFlow,
the *shape* of variables is often just as important as the variable *type*.
For example, consider the following function which converts a batch [#batch]_
of videos to grayscale:
::
def to_gray(videos: Array): ...
From the signature alone, it is not obvious what shape of array [#array]_
we should pass for the ``videos`` argument. Possibilities include, for
example,
batch × time × height × width × channels
and
time × batch × channels × height × width. [#timebatch]_
This is important for three reasons:
* **Documentation**. Without the required shape being clear in the signature,
the user must hunt in the docstring or the code in question to determine
what the input/output shape requirements are.
* **Catching shape bugs before runtime**. Ideally, use of incorrect shapes
should be an error we can catch ahead of time using static analysis.
(This is particularly important for machine learning code, where iteration
times can be slow.)
* **Preventing subtle shape bugs**. In the worst case, use of the wrong shape
will result in the program appearing to run fine, but with a subtle bug
that can take days to track down. (See `this exercise`_ in a popular machine learning
tutorial for a particularly pernicious example.)
Ideally, we should have some way of making shape requirements explicit in
type signatures. Multiple proposals [#numeric-stack]_ [#typing-ideas]_
[#syntax-proposal]_ have suggested the use of the standard generics syntax for
this purpose. We would write:
::
def to_gray(videos: Array[Time, Batch, Height, Width, Channels]): ...
However, note that arrays can be of arbitrary rank - ``Array`` as used above is
generic in an arbitrary number of axes. One way around this would be to use a different
``Array`` class for each rank...
::
Axis1 = TypeVar('Axis1')
Axis2 = TypeVar('Axis2')
class Array1(Generic[Axis1]): ...
class Array2(Generic[Axis1, Axis2]): ...
...but this would be cumbersome, both for users (who would have to sprinkle 1s and 2s
and so on throughout their code) and for the authors of array libraries (who would have to duplicate implementations throughout multiple classes).
Variadic generics are necessary for an ``Array`` that is generic in an arbitrary
number of axes to be cleanly defined as a single class.
Summary Examples
================
Cutting right to the chase, this PEP allows an ``Array`` class that is generic
in its shape (and datatype) to be defined using a newly-introduced
arbitrary-length type variable, ``TypeVarTuple``, as follows:
::
from typing import TypeVar, TypeVarTuple
DType = TypeVar('DType')
Shape = TypeVarTuple('Shape')
class Array(Generic[DType, *Shape]):
def __abs__(self) -> Array[DType, *Shape]: ...
def __add__(self, other: Array[DType, *Shape]) -> Array[DType, *Shape]: ...
Such an ``Array`` can be used to support a number of different kinds of
shape annotations. For example, we can add labels describing the
semantic meaning of each axis:
::
from typing import NewType
Height = NewType('Height', int)
Width = NewType('Width', int)
x: Array[float, Height, Width] = Array()
We could also add annotations describing the actual size of each axis:
::
from typing import Literal as L
x: Array[float, L[480], L[640]] = Array()
For consistency, we use semantic axis annotations as the basis of the examples
in this PEP, but this PEP is agnostic about which of these two (or possibly other)
ways of using ``Array`` is preferable; that decision is left to library authors.
(Note also that for the rest of this PEP, for conciseness of example, we use
a simpler version of ``Array`` which is generic only in the shape - *not* the
data type.)
Specification
=============
In order to support the above use cases, we introduce
``TypeVarTuple``. This serves as a placeholder not for a single type
but for a *tuple* of types.
In addition, we introduce a new use for the star operator: to 'unpack'
``TypeVarTuple`` instances and tuple types such as ``Tuple[int,
str]``. Unpacking a ``TypeVarTuple`` or tuple type is the typing
equivalent of unpacking a variable or a tuple of values.
Type Variable Tuples
--------------------
In the same way that a normal type variable is a stand-in for a single
type such as ``int``, a type variable *tuple* is a stand-in for a *tuple* type such as
``Tuple[int, str]``.
Type variable tuples are created with:
::
from typing import TypeVarTuple
Ts = TypeVarTuple('Ts')
Using Type Variable Tuples in Generic Classes
'''''''''''''''''''''''''''''''''''''''''''''
Type variable tuples behave like a number of individual type variables packed in a
``Tuple``. To understand this, consider the following example:
::
Shape = TypeVarTuple('Shape')
class Array(Generic[*Shape]): ...
Height = NewType('Height', int)
Width = NewType('Width', int)
x: Array[Height, Width] = Array()
The ``Shape`` type variable tuple here behaves like ``Tuple[T1, T2]``,
where ``T1`` and ``T2`` are type variables. To use these type variables
as type parameters of ``Array``, we must *unpack* the type variable tuple using
the star operator: ``*Shape``. The signature of ``Array`` then behaves
as if we had simply written ``class Array(Generic[T1, T2]): ...``.
In contrast to ``Generic[T1, T2]``, however, ``Generic[*Shape]`` allows
us to parameterise the class with an *arbitrary* number of type parameters.
That is, in addition to being able to define rank-2 arrays such as
``Array[Height, Width]``, we could also define rank-3 arrays, rank-4 arrays,
and so on:
::
Time = NewType('Time', int)
Batch = NewType('Batch', int)
y: Array[Batch, Height, Width] = Array()
z: Array[Time, Batch, Height, Width] = Array()
Using Type Variable Tuples in Functions
'''''''''''''''''''''''''''''''''''''''
Type variable tuples can be used anywhere a normal ``TypeVar`` can.
This includes class definitions, as shown above, as well as function
signatures and variable annotations:
::
class Array(Generic[*Shape]):
def __init__(self, shape: Tuple[*Shape]):
self._shape: Tuple[*Shape] = shape
def get_shape(self) -> Tuple[*Shape]:
return self._shape
shape = (Height(480), Width(640))
x: Array[Height, Width] = Array(shape)
y = abs(x) # Inferred type is Array[Height, Width]
z = x + x # ... is Array[Height, Width]
Type Variable Tuples Must Always be Unpacked
''''''''''''''''''''''''''''''''''''''''''''
Note that in the previous example, the ``shape`` argument to ``__init__``
was annotated as ``Tuple[*Shape]``. Why is this necessary - if ``Shape``
behaves like ``Tuple[T1, T2, ...]``, couldn't we have annotated the ``shape``
argument as ``Shape`` directly?
This is, in fact, deliberately not possible: type variable tuples must
*always* be used unpacked (that is, prefixed by the star operator). This is
for two reasons:
* To avoid potential confusion about whether to use a type variable tuple
in a packed or unpacked form ("Hmm, should I write '``-> Shape``',
or '``-> Tuple[Shape]``', or '``-> Tuple[*Shape]``'...?")
* To improve readability: the star also functions as an explicit visual
indicator that the type variable tuple is not a normal type variable.
``Unpack`` for Backwards Compatibility
''''''''''''''''''''''''''''''''''''''
Note that the use of the star operator in this context requires a grammar change,
and is therefore available only in new versions of Python. To enable use of type
variable tuples in older versions of Python, we introduce the ``Unpack`` type
operator that can be used in place of the star operator:
::
# Unpacking using the star operator in new versions of Python
class Array(Generic[*Shape]): ...
# Unpacking using ``Unpack`` in older versions of Python
class Array(Generic[Unpack[Shape]]): ...
Variance, Type Constraints and Type Bounds: Not (Yet) Supported
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
To keep this PEP minimal, ``TypeVarTuple`` does not yet support specification of:
* Variance (e.g. ``TypeVar('T', covariant=True)``)
* Type constraints (``TypeVar('T', int, float)``)
* Type bounds (``TypeVar('T', bound=ParentClass)``)
We leave the decision of how these arguments should behave to a future PEP, when variadic generics have been tested in the field. As of this PEP, type variable tuples are
invariant.
Type Variable Tuple Equality
''''''''''''''''''''''''''''
If the same ``TypeVarTuple`` instance is used in multiple places in a signature
or class, a valid type inference might be to bind the ``TypeVarTuple`` to
a ``Tuple`` of a ``Union`` of types:
::
def foo(arg1: Tuple[*Ts], arg2: Tuple[*Ts]): ...
a = (0,)
b = ('0',)
foo(a, b) # Can Ts be bound to Tuple[int | str]?
We do *not* allow this; type unions may *not* appear within the ``Tuple``.
If a type variable tuple appears in multiple places in a signature,
the types must match exactly (the list of type parameters must be the same
length, and the type parameters themselves must be identical):
::
def pointwise_multiply(
x: Array[*Shape],
y: Array[*Shape]
) -> Array[*Shape]: ...
x: Array[Height]
y: Array[Width]
z: Array[Height, Width]
pointwise_multiply(x, x) # Valid
pointwise_multiply(x, y) # Error
pointwise_multiply(x, z) # Error
Multiple Type Variable Tuples: Not Allowed
''''''''''''''''''''''''''''''''''''''''''
As of this PEP, only a single type variable tuple may appear in a type parameter list:
::
class Array(Generic[*Ts1, *Ts2]): ... # Error
The reason is that multiple type variable tuples make it ambiguous
which parameters get bound to which type variable tuple: ::
x: Array[int, str, bool] # Ts1 = ???, Ts2 = ???
Type Concatenation
------------------
Type variable tuples don't have to be alone; normal types can be
prefixed and/or suffixed:
::
Shape = TypeVarTuple('Shape')
Batch = NewType('Batch', int)
Channels = NewType('Channels', int)
def add_batch_axis(x: Array[*Shape]) -> Array[Batch, *Shape]: ...
def del_batch_axis(x: Array[Batch, *Shape]) -> Array[*Shape]: ...
def add_batch_channels(
x: Array[*Shape]
) -> Array[Batch, *Shape, Channels]: ...
a: Array[Height, Width]
b = add_batch_axis(a) # Inferred type is Array[Batch, Height, Width]
c = del_batch_axis(b) # Array[Height, Width]
d = add_batch_channels(a) # Array[Batch, Height, Width, Channels]
Normal ``TypeVar`` instances can also be prefixed and/or suffixed:
::
T = TypeVar('T')
Ts = TypeVarTuple('Ts')
def prefix_tuple(
x: T,
y: Tuple[*Ts]
) -> Tuple[T, *Ts]: ...
z = prefix_tuple(x=0, y=(True, 'a'))
# Inferred type of z is Tuple[int, bool, str]
Unpacking Tuple Types
---------------------
We mentioned that a ``TypeVarTuple`` stands for a tuple of types.
Since we can unpack a ``TypeVarTuple``, for consistency, we also
allow unpacking a tuple type. As we shall see, this also enables a
number of interesting features.
Unpacking Concrete Tuple Types
''''''''''''''''''''''''''''''
Unpacking a concrete tuple type is analogous to unpacking a tuple of
values at runtime. ``Tuple[int, *Tuple[bool, bool], str]`` is
equivalent to ``Tuple[int, bool, bool, str]``.
Unpacking Unbounded Tuple Types
'''''''''''''''''''''''''''''''
Unpacking an unbounded tuple preserves the unbounded tuple as it is.
That is, ``*Tuple[int, ...]`` remains ``*Tuple[int, ...]``; there's no
simpler form. This enables us to specify types such as ``Tuple[int,
*Tuple[str, ...], str]`` - a tuple type where the first element is
guaranteed to be of type ``int``, the last element is guaranteed to be
of type ``str``, and the elements in the middle are zero or more
elements of type ``str``. Note that ``Tuple[*Tuple[int, ...]]`` is
equivalent to ``Tuple[int, ...]``.
Unpacking unbounded tuples is also useful in function signatures where
we don't care about the exact elements and don't want to define an
unnecessary ``TypeVarTuple``:
::
def process_batch_channels(
x: Array[Batch, *Tuple[Any, ...], Channels]
) -> None:
...
x: Array[Batch, Height, Width, Channels]
process_batch_channels(x) # OK
y: Array[Batch, Channels]
process_batch_channels(y) # OK
z: Array[Batch]
process_batch_channels(z) # Error: Expected Channels.
We can also pass a ``*Tuple[int, ...]`` wherever a ``*Ts`` is
expected. This is useful when we have particularly dynamic code and
cannot state the precise number of dimensions or the precise types for
each of the dimensions. In those cases, we can smoothly fall back to
an unbounded tuple:
::
y: Array[*Tuple[Any, ...]] = read_from_file()
def expect_variadic_array(
x: Array[Batch, *Shape]
) -> None: ...
expect_variadic_array(y) # OK
def expect_precise_array(
x: Array[Batch, Height, Width, Channels]
) -> None: ...
expect_precise_array(y) # OK
``Array[*Tuple[Any, ...]]`` stands for an array with an arbitrary
number of dimensions of type ``Any``. This means that, in the call to
``expect_variadic_array``, ``Batch`` is bound to ``Any`` and ``Shape``
is bound to ``Tuple[Any, ...]``. In the call to
``expect_precise_array``, the variables ``Batch``, ``Height``,
``Width``, and ``Channels`` are all bound to ``Any``.
This allows users to handle dynamic code gracefully while still
explicitly marking the code as unsafe (by using ``y: Array[*Tuple[Any,
...]]``). Otherwise, users would face noisy errors from the type
checker every time they tried to use the variable ``y``, which would
hinder them when migrating a legacy code base to use ``TypeVarTuple``.
Multiple Unpackings in a Tuple: Not Allowed
'''''''''''''''''''''''''''''''''''''''''''
As with ``TypeVarTuples``, `only one <Multiple Type Variable Tuples:
Not Allowed_>`_ unpacking may appear in a tuple:
::
x: Tuple[int, *Ts, str, *Ts2] # Error
y: Tuple[int, *Tuple[int, ...], str, *Tuple[str, ...]] # Error
``*args`` as a Type Variable Tuple
----------------------------------
PEP 484 states that when a type annotation is provided for ``*args``, every argument
must be of the type annotated. That is, if we specify ``*args`` to be type ``int``,
then *all* arguments must be of type ``int``. This limits our ability to specify
the type signatures of functions that take heterogeneous argument types.
If ``*args`` is annotated as a type variable tuple, however, the types of the
individual arguments become the types in the type variable tuple:
::
Ts = TypeVarTuple('Ts')
def args_to_tuple(*args: *Ts) -> Tuple[*Ts]: ...
args_to_tuple(1, 'a') # Inferred type is Tuple[int, str]
In the above example, ``Ts`` is bound to ``Tuple[int, str]``. If no
arguments are passed, the type variable tuple behaves like an empty
tuple, ``Tuple[()]``.
As usual, we can unpack any tuple types. For example, by using a type
variable tuple inside a tuple of other types, we can refer to prefixes
or suffixes of the variadic argument list. For example:
::
# os.execle takes arguments 'path, arg0, arg1, ..., env'
def execle(path: str, *args: *Tuple[*Ts, Env]) -> None: ...
Note that this is different to
::
def execle(path: str, *args: *Ts, env: Env) -> None: ...
as this would make ``env`` a keyword-only argument.
Using an unpacked unbounded tuple is equivalent to the PEP 484
behavior [#pep-484-args]_ of ``*args: int``, which accepts zero or
more values of type ``int``:
::
def foo(*args: *Tuple[int, ...]) -> None: ...
# equivalent to:
def foo(*args: int) -> None: ...
Unpacking tuple types also allows more precise types for heterogeneous
``*args``. The following function expects an ``int`` at the beginning,
zero or more ``str`` values, and a ``str`` at the end:
::
def foo(*args: *Tuple[int, *Tuple[str, ...], str]) -> None: ...
For completeness, we mention that unpacking a concrete tuple allows us
to specify ``*args`` of a fixed number of heterogeneous types:
::
def foo(*args: *Tuple[int, str]) -> None: ...
foo(1, "hello") # OK
Note that, in keeping with the rule that type variable tuples must always
be used unpacked, annotating ``*args`` as being a plain type variable tuple
instance is *not* allowed:
::
def foo(*args: Ts): ... # NOT valid
``*args`` is the only case where an argument can be annotated as ``*Ts`` directly;
other arguments should use ``*Ts`` to parameterise something else, e.g. ``Tuple[*Ts]``.
If ``*args`` itself is annotated as ``Tuple[*Ts]``, the old behaviour still applies:
all arguments must be a ``Tuple`` parameterised with the same types.
::
def foo(*args: Tuple[*Ts]): ...
foo((0,), (1,)) # Valid
foo((0,), (1, 2)) # Error
foo((0,), ('1',)) # Error
Finally, note that a type variable tuple may *not* be used as the type of
``**kwargs``. (We do not yet know of a use case for this feature, so we prefer
to leave the ground fresh for a potential future PEP.)
::
# NOT valid
def foo(**kwargs: *Ts): ...
Type Variable Tuples with ``Callable``
--------------------------------------
Type variable tuples can also be used in the arguments section of a
``Callable``:
::
class Process:
def __init__(
self,
target: Callable[[*Ts], None],
args: Tuple[*Ts],
) -> None: ...
def func(arg1: int, arg2: str) -> None: ...
Process(target=func, args=(0, 'foo')) # Valid
Process(target=func, args=('foo', 0)) # Error
Other types and normal type variables can also be prefixed/suffixed
to the type variable tuple:
::
T = TypeVar('T')
def foo(f: Callable[[int, *Ts, T], Tuple[T, *Ts]]): ...
The behavior of a Callable containing an unpacked item, whether the
item is a ``TypeVarTuple`` or a tuple type, is to treat the elements
as if they were the type for ``*args``. So, ``Callable[[*Ts], None]``
is treated as the type of the function:
::
def foo(*args: *Ts) -> None: ...
``Callable[[int, *Ts, T], Tuple[T, *Ts]]`` is treated as the type of
the function:
::
def foo(*args: *Tuple[int, *Ts, T]) -> Tuple[T, *Ts]: ...
Behaviour when Type Parameters are not Specified
------------------------------------------------
When a generic class parameterised by a type variable tuple is used without
any type parameters, it behaves as if the type variable tuple was
substituted with ``Tuple[Any, ...]``:
::
def takes_any_array(arr: Array): ...
# equivalent to:
def takes_any_array(arr: Array[*Tuple[Any, ...]]): ...
x: Array[Height, Width]
takes_any_array(x) # Valid
y: Array[Time, Height, Width]
takes_any_array(y) # Also valid
This enables gradual typing: existing functions accepting, for example,
a plain TensorFlow ``Tensor`` will still be valid even if ``Tensor`` is made
generic and calling code passes a ``Tensor[Height, Width]``.
This also works in the opposite direction:
::
def takes_specific_array(arr: Array[Height, Width]): ...
z: Array
# equivalent to Array[*Tuple[Any, ...]]
takes_specific_array(z)
(For details, see the section on `Unpacking Unbounded Tuple Types`_.)
This way, even if libraries are updated to use types like ``Array[Height, Width]``,
users of those libraries won't be forced to also apply type annotations to
all of their code; users still have a choice about what parts of their code
to type and which parts to not.
Aliases
-------
Generic aliases can be created using a type variable tuple in
a similar way to regular type variables:
::
IntTuple = Tuple[int, *Ts]
NamedArray = Tuple[str, Array[*Ts]]
IntTuple[float, bool] # Equivalent to Tuple[int, float, bool]
NamedArray[Height] # Equivalent to Tuple[str, Array[Height]]
As this example shows, all type parameters passed to the alias are
bound to the type variable tuple.
Importantly for our original ``Array`` example (see `Summary Examples`_), this
allows us to define convenience aliases for arrays of a fixed shape
or datatype:
::
Shape = TypeVarTuple('Shape')
DType = TypeVar('DType')
class Array(Generic[DType, *Shape]):
# E.g. Float32Array[Height, Width, Channels]
Float32Array = Array[np.float32, *Shape]
# E.g. Array1D[np.uint8]
Array1D = Array[DType, Any]
If an explicitly empty type parameter list is given, the type variable
tuple in the alias is set empty:
::
IntTuple[()] # Equivalent to Tuple[int]
NamedArray[()] # Equivalent to Tuple[str, Array[()]]
If the type parameter list is omitted entirely, the unspecified type
variable tuples are treated as ``Tuple[Any, ...]`` (similar to
`Behaviour when Type Parameters are not Specified`_):
::
def takes_float_array_of_any_shape(x: Float32Array): ...
x: Float32Array[Height, Width] = Array()
takes_float_array_of_any_shape(x) # Valid
def takes_float_array_with_specific_shape(
y: Float32Array[Height, Width]
): ...
y: Float32Array = Array()
takes_float_array_with_specific_shape(y) # Valid
Normal ``TypeVar`` instances can also be used in such aliases:
::
T = TypeVar('T')
Foo = Tuple[T, *Ts]
# T bound to str, Ts to Tuple[int]
Foo[str, int]
# T bound to float, Ts to Tuple[()]
Foo[float]
# T bound to Any, Ts to an Tuple[Any, ...]
Foo
Overloads for Accessing Individual Types
----------------------------------------
For situations where we require access to each individual type in the type variable tuple,
overloads can be used with individual ``TypeVar`` instances in place of the type variable tuple:
::
Shape = TypeVarTuple('Shape')
Axis1 = TypeVar('Axis1')
Axis2 = TypeVar('Axis2')
Axis3 = TypeVar('Axis3')
class Array(Generic[*Shape]):
@overload
def transpose(
self: Array[Axis1, Axis2]
) -> Array[Axis2, Axis1]: ...
@overload
def transpose(
self: Array[Axis1, Axis2, Axis3]
) -> Array[Axis3, Axis2, Axis1]: ...
(For array shape operations in particular, having to specify
overloads for each possible rank is, of course, a rather cumbersome
solution. However, it's the best we can do without additional type
manipulation mechanisms. We plan to introduce these in a future PEP.)
Rationale and Rejected Ideas
============================
Shape Arithmetic
----------------
Considering the use case of array shapes in particular, note that as of
this PEP, it is not yet possible to describe arithmetic transformations
of array dimensions - for example,
``def repeat_each_element(x: Array[N]) -> Array[2*N]``. We consider
this out-of-scope for the current PEP, but plan to propose additional
mechanisms that *will* enable this in a future PEP.
Supporting Variadicity Through Aliases
--------------------------------------
As noted in the introduction, it *is* possible to avoid variadic generics
by simply defining aliases for each possible number of type parameters:
::
class Array1(Generic[Axis1]): ...
class Array2(Generic[Axis1, Axis2]): ...
However, this seems somewhat clumsy - it requires users to unnecessarily
pepper their code with 1s, 2s, and so on for each rank necessary.
Construction of ``TypeVarTuple``
--------------------------------
``TypeVarTuple`` began as ``ListVariadic``, based on its naming in
an early implementation in Pyre.
We then changed this to ``TypeVar(list=True)``, on the basis that a)
it better emphasises the similarity to ``TypeVar``, and b) the meaning
of 'list' is more easily understood than the jargon of 'variadic'.
Once we'd decided that a variadic type variable should behave like a ``Tuple``,
we also considered ``TypeVar(bound=Tuple)``, which is similarly intuitive
and accomplishes most what we wanted without requiring any new arguments to
``TypeVar``. However, we realised this may constrain us in the future, if
for example we want type bounds or variance to function slightly differently
for variadic type variables than what the semantics of ``TypeVar`` might
otherwise imply. Also, we may later wish to support arguments that should not be supported by regular type variables (such as ``arbitrary_len`` [#arbitrary_len]_).
We therefore settled on ``TypeVarTuple``.
Unspecified Type Parameters: Tuple vs TypeVarTuple
--------------------------------------------------
In order to support gradual typing, this PEP states that *both*
of the following examples should type-check correctly:
::
def takes_any_array(x: Array): ...
x: Array[Height, Width]
takes_any_array(x)
def takes_specific_array(y: Array[Height, Width]): ...
y: Array
takes_specific_array(y)
Note that this is in contrast to the behaviour of the only currently-existing
variadic type in Python, ``Tuple``:
::
def takes_any_tuple(x: Tuple): ...
x: Tuple[int, str]
takes_any_tuple(x) # Valid
def takes_specific_tuple(y: Tuple[int, str]): ...
y: Tuple
takes_specific_tuple(y) # Error
The rules for ``Tuple`` were deliberately chosen such that the latter case
is an error: it was thought to be more likely that the programmer has made a
mistake than that the function expects a specific kind of ``Tuple`` but the
specific kind of ``Tuple`` passed is unknown to the type checker. Additionally,
``Tuple`` is something of a special case, in that it is used to represent
immutable sequences. That is, if an object's type is inferred to be an
unparameterised ``Tuple``, it is not necessarily because of incomplete typing.
In contrast, if an object's type is inferred to be an unparameterised ``Array``,
it is much more likely that the user has simply not yet fully annotated their
code, or that the signature of a shape-manipulating library function cannot yet
be expressed using the typing system and therefore returning a plain ``Array``
is the only option. We rarely deal with arrays of truly arbitrary shape;
in certain cases, *some* parts of the shape will be arbitrary - for example,
when dealing with sequences, the first two parts of the shape are often
'batch' and 'time' - but we plan to support these cases explicitly in a
future PEP with a syntax such as ``Array[Batch, Time, ...]``.
We therefore made the decision to have variadic generics *other* than
``Tuple`` behave differently, in order to give the user more flexibility
in how much of their code they wish to annotate, and to enable compatibility
between old unannotated code and new versions of libraries which do use
these type annotations.
Alternatives
============
It should be noted that the approach outlined in this PEP to solve the
issue of shape checking in numerical libraries is *not* the only approach
possible. Examples of lighter-weight alternatives based on *runtime* checking include
ShapeGuard [#shapeguard]_, tsanley [#tsanley]_, and PyContracts [#pycontracts]_.
While these existing approaches improve significantly on the default
situation of shape checking only being possible through lengthy and verbose
assert statements, none of them enable *static* analysis of shape correctness.
As mentioned in `Motivation`_, this is particularly desirable for
machine learning applications where, due to library and infrastructure complexity,
even relatively simple programs must suffer long startup times; iterating
by running the program until it crashes, as is necessary with these
existing runtime-based approaches, can be a tedious and frustrating
experience.
Our hope with this PEP is to begin to codify generic type annotations as
an official, language-supported way of dealing with shape correctness.
With something of a standard in place, in the long run, this will
hopefully enable a thriving ecosystem of tools for analysing and verifying
shape properties of numerical computing programs.
Grammar Changes
===============
This PEP requires two grammar changes.
Change 1: Star Expressions in Indexes
-------------------------------------
The first grammar change enables use of star expressions in index operations (that is,
within square brackets), necessary to support star-unpacking of TypeVarTuples:
::
DType = TypeVar('DType')
Shape = TypeVarTuple('Shape')
class Array(Generic[DType, *Shape]):
...
Before:
::
slices:
| slice !','
| ','.slice+ [',']
After:
::
slices:
| slice !','
| ','.(slice | starred_expression)+ [',']
As with star-unpacking in other contexts, the star operator calls ``__iter__``
on the callee, and adds the contents of the resulting iterator to the argument
passed to ``__getitem__``. For example, if we do ``foo[a, *b, c]``, and
``b.__iter__`` produces an iterator yielding ``d`` and ``e``,
``foo.__getitem__`` would receive ``(a, d, e, c)``.
To put it another way, note that ``x[..., *a, ...]`` produces the same result
as ``x[(..., *a, ...)]``` (with any slices ``i:j`` in ``...`` replaced with
``slice(i, j)``, with the one edge case that ``x[*a]`` becomes ``x[(*a,)]``).
TypeVarTuple Implementation
'''''''''''''''''''''''''''
With this grammar change, ``TypeVarTuple`` is implemented as follows.
Note that this implementation is useful only for the benefit of a) correct
``repr()`` and b) runtime analysers; static analysers would not use the
implementation.
::
class TypeVarTuple:
def __init__(self, name):
self._name = name
self._unpacked = UnpackedTypeVarTuple(name)
def __iter__(self):
yield self._unpacked
def __repr__(self):
return self._name
class UnpackedTypeVarTuple:
def __init__(self, name):
self._name = name
def __repr__(self):
return '*' + self._name
Implications
''''''''''''
This grammar change implies a number of additional changes in behaviour not
required by this PEP. We choose to allow these additional changes rather than
disallowing them at a syntax level in order to keep the syntax change as small
as possible.
First, the grammar change enables star-unpacking of other structures, such
as lists, within indexing operations:
::
idxs_to_select = (1, 2)
array[0, *idxs_to_select, -1] # Equivalent to [0, 1, 2, -1]
Second, more than one instance of a star-unpack can occur within an index:
::
array[*idxs_to_select, *idxs_to_select] # Equivalent to array[1, 2, 1, 2]
Note that this PEP disallows multiple unpacked TypeVarTuples within a single
type parameter list. This requirement would therefore need to be implemented
in type checking tools themselves rather than at the syntax level.
Third, slices may co-occur with starred expressions:
::
array[3:5, *idxs_to_select] # Equivalent to array[3:5, 1, 2]
However, note that slices involving starred expressions are still invalid:
::
# Syntax error
array[*idxs_start:*idxs_end]
Change 2: ``*args`` as a TypeVarTuple
-------------------------------------
The second change enables use of ``*args: *Ts`` in function definitions.
Before:
::
star_etc:
| '*' param_no_default param_maybe_default* [kwds]
| '*' ',' param_maybe_default+ [kwds]
| kwds
After:
::
star_etc:
| '*' param_no_default param_maybe_default* [kwds]
| '*' param_no_default_star_annotation param_maybe_default* [kwds] # New
| '*' ',' param_maybe_default+ [kwds]
| kwds
Where:
::
param_no_default_star_annotation:
| param_star_annotation ',' TYPE_COMMENT?
| param_star_annotation TYPE_COMMENT? &')'
param_star_annotation: NAME star_annotation
star_annotation: ':' star_expression
We also need to deal with the ``star_expression`` that results from this
construction. Normally, a ``star_expression`` occurs within the context
of e.g. a list, so a ``star_expression`` is handled by essentially
calling ``iter()`` on the starred object, and inserting the results
of the resulting iterator into the list at the appropriate place. For
``*args: *Ts``, however, we must process the ``star_expression`` in a
different way.
We do this by instead making a special case for the ``star_expression``
resulting from ``*args: *Ts``, emitting code equivalent to
``[annotation_value] = [*Ts]``. That is, we create an iterator from
``Ts`` by calling ``Ts.__iter__``, fetch a single value from the iterator,
verify that the iterator is exhausted, and set that value as the annotation
value. This results in the unpacked ``TypeVarTuple`` being set directly
as the runtime annotation for ``*args``:
::
>>> Ts = TypeVarTuple('Ts')
>>> def foo(*args: *Ts): pass
>>> foo.__annotations__
{'args': *Ts}
# *Ts is the repr() of Ts._unpacked, an instance of UnpackedTypeVarTuple
This allows the runtime annotation to be consistent with an AST representation
that uses a ``Starred`` node for the annotations of ``args`` - in turn important
for tools that rely on the AST such as mypy to correctly recognise the construction:
::
>>> print(ast.dump(ast.parse('def foo(*args: *Ts): pass'), indent=2))
Module(
body=[
FunctionDef(
name='foo',
args=arguments(
posonlyargs=[],
args=[],
vararg=arg(
arg='args',
annotation=Starred(
value=Name(id='Ts', ctx=Load()),
ctx=Load())),
kwonlyargs=[],
kw_defaults=[],
defaults=[]),
body=[
Pass()],
decorator_list=[])],
type_ignores=[])
Note that the only scenario in which this grammar change allows ``*Ts`` to be
used as a direct annotation (rather than being wrapped in e.g. ``Tuple[*Ts]``)
is ``*args``. Other uses are still invalid:
::
x: *Ts # Syntax error
def foo(x: *Ts): pass # Syntax error
Implications
''''''''''''
As with the first grammar change, this change also has a number of side effects.
In particular, the annotation of ``*args`` could be set to a starred object
other than a ``TypeVarTuple`` - for example, the following nonsensical
annotations are possible:
::
>>> foo = [1]
>>> def bar(*args: *foo): pass
>>> bar.__annotations__
{'args': 1}
>>> foo = [1, 2]
>>> def bar(*args: *foo): pass
ValueError: too many values to unpack (expected 1)
Again, prevention of such annotations will need to be done by, say, static
checkers, rather than at the level of syntax.
Alternatives (Why Not Just Use ``Unpack``?)
-------------------------------------------
If these grammar changes are considered too burdensome, there are two
alternatives.
The first would be to **support change 1 but not change 2**. Variadic generics
are more important to us than the ability to annotate ``*args``.
The second alternative would be to **use ``Unpack`` instead**, requiring no
grammar changes. However, we regard this as a suboptimal solution for two
reasons:
* **Readability**. ``class Array(Generic[DType, Unpack[Shape]])`` is a bit
of a mouthful; the flow of reading is interrupted by length of ``Unpack`` and
the extra set of square brackets. ``class Array(Generic[DType, *Shape])``
is much easier to skim, while still marking ``Shape`` as special.
* **Intuitiveness**. We think a user is more likely to intuitively understand
the meaning of ``*Ts`` - especially when they see that ``Ts`` is a
TypeVar**Tuple** - than the meaning of ``Unpack[Ts]``. (This assumes
the user is familiar with star-unpacking in other contexts; if the
user is reading or writing code that uses variadic generics, this seems
reasonable.)
If even change 1 is thought too significant a change, therefore, it might be
better for us to reconsider our options before going ahead with this second
alternative.
Backwards Compatibility
=======================
The ``Unpack`` version of the PEP should be back-portable to previous
versions of Python.
Gradual typing is enabled by the fact that unparameterised variadic classes
are compatible with an arbitrary number of type parameters. This means
that if existing classes are made generic, a) all existing (unparameterised)
uses of the class will still work, and b) parameterised and unparameterised
versions of the class can be used together (relevant if, for example, library
code is updated to use parameters while user code is not, or vice-versa).
Reference Implementation
========================
Two reference implementations of type-checking functionality exist:
one in Pyre, as of v0.9.0, and one in Pyright, as of v1.1.108.
A preliminary implementation of the ``Unpack`` version of the PEP in CPython
is available in `cpython/23527`_. A preliminary version of the version
using the star operator, based on an early implementation of PEP 637,
is also available at `mrahtz/cpython/pep637+646`_.
Appendix A: Shape Typing Use Cases
==================================
To give this PEP additional context for those particularly interested in the
array typing use case, in this appendix we expand on the different ways
this PEP can be used for specifying shape-based subtypes.
Use Case 1: Specifying Shape Values
-----------------------------------
The simplest way to parameterise array types is using ``Literal``
type parameters - e.g. ``Array[Literal[64], Literal[64]]``.
We can attach names to each parameter using normal type variables:
::
K = TypeVar('K')
N = TypeVar('N')
def matrix_vector_multiply(x: Array[K, N], Array[N]) -> Array[K]: ...
a: Array[Literal[64], Literal[32]]
b: Array[Literal[32]]
matrix_vector_multiply(a, b)
# Result is Array[Literal[64]]
Note that such names have a purely local scope. That is, the name
``K`` is bound to ``Literal[64]`` only within ``matrix_vector_multiply``. To put it another
way, there's no relationship between the value of ``K`` in different
signatures. This is important: it would be inconvenient if every axis named ``K``
were constrained to have the same value throughout the entire program.
The disadvantage of this approach is that we have no ability to enforce shape semantics across
different calls. For example, we can't address the problem mentioned in `Motivation`_: if
one function returns an array with leading dimensions 'Time × Batch', and another function
takes the same array assuming leading dimensions 'Batch × Time', we have no way of detecting this.
The main advantage is that in some cases, axis sizes really are what we care about. This is true
for both simple linear algebra operations such as the matrix manipulations above, but also in more
complicated transformations such as convolutional layers in neural networks, where it would be of
great utility to the programmer to be able to inspect the array size after each layer using
static analysis. To aid this, in the future we would like to explore possibilities for additional
type operators that enable arithmetic on array shapes - for example:
::
def repeat_each_element(x: Array[N]) -> Array[Mul[2, N]]: ...
Such arithmetic type operators would only make sense if names such as ``N`` refer to axis size.
Use Case 2: Specifying Shape Semantics
--------------------------------------
A second approach (the one that most of the examples in this PEP are based around)
is to forgo annotation with actual axis size, and instead annotate axis *type*.
This would enable us to solve the problem of enforcing shape properties across calls.
For example:
::
# lib.py
class Batch: pass
class Time: pass
def make_array() -> Array[Batch, Time]: ...
# user.py
from lib import Batch, Time
# `Batch` and `Time` have the same identity as in `lib`,
# so must take array as produced by `lib.make_array`
def use_array(x: Array[Batch, Time]): ...
Note that in this case, names are *global* (to the extent that we use the
same ``Batch`` type in different place). However, because names refer only
to axis *types*, this doesn't constrain the *value* of certain axes to be
the same through (that is, this doesn't constrain all axes named ``Height``
to have a value of, say, 480 throughout).
The argument *for* this approach is that in many cases, axis *type* is the more
important thing to verify; we care more about which axis is which than what the
specific size of each axis is.
It also does not preclude cases where we wish to describe shape transformations
without knowing the type ahead of time. For example, we can still write:
::
K = TypeVar('K')
N = TypeVar('N')
def matrix_vector_multiply(x: Array[K, N], Array[N]) -> Array[K]: ...
We can then use this with:
class Batch: pass
class Values: pass
batch_of_values: Array[Batch, Values]
value_weights: Array[Values]
matrix_vector_multiply(batch_of_values, value_weights)
# Result is Array[Batch]
The disadvantages are the inverse of the advantages from use case 1.
In particular, this approach does not lend itself well to arithmetic
on axis types: ``Mul[2, Batch]`` would be as meaningless as ``2 * int``.
Discussion
----------
Note that use cases 1 and 2 are mutually exclusive in user code. Users
can verify size or semantic type but not both.
As of this PEP, we are agnostic about which approach will provide most benefit.
Since the features introduced in this PEP are compatible with both approaches, however,
we leave the door open.
Why Not Both?
-------------
Consider the following 'normal' code:
::
def f(x: int): ...
Note that we have symbols for both the value of the thing (``x``) and the type of
the thing (``int``). Why can't we do the same with axes? For example, with an imaginary
syntax, we could write:
::
def f(array: Array[TimeValue: TimeType]): ...
This would allow us to access the axis size (say, 32) through the symbol ``TimeValue``
*and* the type through the symbol ``TypeType``.
This might even be possible using existing syntax, through a second level of parameterisation:
::
def f(array: array[TimeValue[TimeType]]): ..
However, we leave exploration of this approach to the future.
Appendix B: Shaped Types vs Named Axes
======================================
An issue related to those addressed by this PEP concerns
axis *selection*. For example, if we have an image stored in an array of shape 64×64x3,
we might wish to convert to black-and-white by computing the mean over the third
axis, ``mean(image, axis=2)``. Unfortunately, the simple typo ``axis=1`` is
difficult to spot and will produce a result that means something completely different
(all while likely allowing the program to keep on running, resulting in a bug
that is serious but silent).
In response, some libraries have implemented so-called 'named tensors' (in this context,
'tensor' is synonymous with 'array'), in which axes are selected not by index but by
label - e.g. ``mean(image, axis='channels')``.
A question we are often asked about this PEP is: why not just use named tensors?
The answer is that we consider the named tensors approach insufficient, for two main reasons:
* **Static checking** of shape correctness is not possible. As mentioned in `Motivation`_,
this is a highly desirable feature in machine learning code where iteration times
are slow by default.
* **Interface documentation** is still not possible with this approach. If a function should
*only* be willing to take array arguments that have image-like shapes, this cannot be stipulated
with named tensors.
Additionally, there's the issue of **poor uptake**. At the time of writing, named tensors
have only been implemented in a small number of numerical computing libraries. Possible explanations for this
include difficulty of implementation (the whole API must be modified to allow selection by axis name
instead of index), and lack of usefulness due to the fact that axis ordering conventions are often
strong enough that axis names provide little benefit (e.g. when working with images, 3D tensors are
basically *always* height × width × channels). However, ultimately we are still uncertain
why this is the case.
Can the named tensors approach be combined with the approach we advocate for in
this PEP? We're not sure. One area of overlap is that in some contexts, we could do, say:
::
Image: Array[Height, Width, Channels]
im: Image
mean(im, axis=Image.axes.index(Channels)
Ideally, we might write something like ``im: Array[Height=64, Width=64, Channels=3]`` -
but this won't be possible in the short term, due to the rejection of PEP 637.
In any case, our attitude towards this is mostly "Wait and see what happens before
taking any further steps".
Footnotes
==========
.. [#batch] 'Batch' is machine learning parlance for 'a number of'.
.. [#array] We use the term 'array' to refer to a matrix with an arbitrary
number of dimensions. In NumPy, the corresponding class is the ``ndarray``;
in TensorFlow, the ``Tensor``; and so on.
.. [#timebatch] If the shape begins with 'batch × time', then
``videos_batch[0][1]`` would select the second frame of the first video. If the
shape begins with 'time × batch', then ``videos_batch[1][0]`` would select the
same frame.
Endorsements
============
Variadic generics have a wide range of uses. For the fraction of that range
involving numerical computing, how likely is it that relevant libraries will
actually make use of the features proposed in this PEP?
We reached out to a number of people with this question, and received the
following endorsements.
From **Stephan Hoyer**, member of the NumPy Steering Council:
[#stephan-endorsement]_
I just wanted to thank Matthew & Pradeep for writing this PEP and for
clarifications to the broader context of PEP 646 for array typing in
https://github.com/python/peps/pull/1904.
As someone who is heavily involved in the Python numerical computing
community (e.g., NumPy, JAX, Xarray), but who is not so familiar with the
details of Python's type system, it is reassuring to see that a broad range
of use-cases related to type checking of named axes & shapes have been
considered, and could build upon the infrastructure in this PEP.
Type checking for shapes is something the NumPy community is very
interested in -- there are more thumbs up on the relevant issue on NumPy's
GitHub than any others (https://github.com/numpy/numpy/issues/7370) and we
recently added a "typing" module that is under active development.
It will certainly require experimentation to figure out the best ways to
use type checking for ndarrays, but this PEP looks like an excellent
foundation for such work.
From **Bas van Beek**, who has worked on preliminary support for
shape-generics in NumPy:
I very much share Stephan's opinion here and look forward to integrating the
new PEP 646 variadics into numpy.
In the context of numpy (and tensor typing general): the typing of array
shapes is a fairly complicated subject and the introduction of variadics
will likely play in big role in laying its foundation, as it allows for the
expression of both dimensioability as well as basic shape manipulation.
All in all, I'm very interested in where both PEP 646 and future PEPs will
take us and look forward to further developments.
From **Dan Moldovan**, a Senior Software Engineer on the TensorFlow Dev Team
and author of the TensorFlow RFC, `TensorFlow Canonical Type System`_: [#dan-endorsement]_
I'd be interested in using this the mechanisms defined in this PEP to define
rank-generic Tensor types in TensorFlow, which are important in specifying
``tf.function`` signatures in a Pythonic way, using type annotations (rather
than the custom ``input_signature`` mechanism we have today - see this
issue: https://github.com/tensorflow/tensorflow/issues/31579). Variadic
generics are among the last few missing pieces to create an elegant set of
type definitions for tensors and shapes.
(For the sake of transparency - we also reached out to folks from a third popular
numerical computing library, PyTorch, but did *not* receive a statement of
endorsement from them. Our understanding is that although they are interested
in some of the same issues - e.g. static shape inference - they are currently
focusing on enabling this through a DSL rather than the Python type system.)
Acknowledgements
================
Thank you to **Alfonso Castaño**, **Antoine Pitrou**, **Bas v.B.**, **David Foster**, **Dimitris Vardoulakis**, **Eric Traut**, **Guido van Rossum**, **Jia Chen**,
**Lucio Fernandez-Arjona**, **Nikita Sobolev**, **Peilonrayz**, **Rebecca Chen**,
**Sergei Lebedev**, and **Vladimir Mikulik** for helpful feedback and suggestions on
drafts of this PEP.
Thank you especially to **Lucio** for suggesting the star syntax (which has made multiple aspects of this proposal much more concise and intuitive), and to **Stephan Hoyer** and **Dan Moldovan** for their endorsements.
Resources
=========
Discussions on variadic generics in Python started in 2016 with Issue 193
on the python/typing GitHub repository [#typing193]_.
Inspired by this discussion, **Ivan Levkivskyi** made a concrete proposal
at PyCon 2019, summarised in notes on 'Type system improvements' [#type-improvements]_
and 'Static typing of Python numeric stack' [#numeric-stack]_.
Expanding on these ideas, **Mark Mendoza** and **Vincent Siles** gave a presentation on
'Variadic Type Variables for Decorators and Tensors' [#variadic-type-variables]_ at the 2019 Python
Typing Summit.
References
==========
.. [#typing193] Python typing issue #193:
https://github.com/python/typing/issues/193
.. [#type-improvements] Ivan Levkivskyi, 'Type system improvements', PyCon 2019:
https://paper.dropbox.com/doc/Type-system-improvements-HHOkniMG9WcCgS0LzXZAe
.. [#numeric-stack] Ivan Levkivskyi, 'Static typing of Python numeric stack', PyCon 2019:
https://paper.dropbox.com/doc/Static-typing-of-Python-numeric-stack-summary-6ZQzTkgN6e0oXko8fEWwN
.. [#typing-ideas] Stephan Hoyer, 'Ideas for array shape typing in Python':
https://docs.google.com/document/d/1vpMse4c6DrWH5rq2tQSx3qwP_m_0lyn-Ij4WHqQqRHY/edit
.. [#variadic-type-variables] Mark Mendoza, 'Variadic Type Variables for Decorators and Tensors', Python Typing Summit 2019:
https://github.com/facebook/pyre-check/blob/ae85c0c6e99e3bbfc92ec55104bfdc5b9b3097b2/docs/Variadic_Type_Variables_for_Decorators_and_Tensors.pdf
.. [#syntax-proposal] Matthew Rahtz et al., 'Shape annotation syntax proposal':
https://docs.google.com/document/d/1But-hjet8-djv519HEKvBN6Ik2lW3yu0ojZo6pG9osY/edit
.. [#arbitrary_len] Discussion on Python typing-sig mailing list:
https://mail.python.org/archives/list/typing-sig@python.org/thread/SQVTQYWIOI4TIO7NNBTFFWFMSMS2TA4J/
.. [#tsanley] tsanley: https://github.com/ofnote/tsanley
.. [#pycontracts] PyContracts: https://github.com/AndreaCensi/contracts
.. [#shapeguard] ShapeGuard: https://github.com/Qwlouse/shapeguard
.. _cpython/23527: https://github.com/python/cpython/pull/24527
.. _mrahtz/cpython/pep637+646: https://github.com/mrahtz/cpython/tree/pep637%2B646
.. _this exercise: https://spinningup.openai.com/en/latest/spinningup/exercise2_2_soln.html
.. _TensorFlow Canonical Type System: https://github.com/tensorflow/community/pull/208
.. [#stephan-endorsement] https://mail.python.org/archives/list/python-dev@python.org/message/UDM7Y6HLHQBKXQEBIBD5ZLB5XNPDZDXV/
.. [#dan-endorsement] https://mail.python.org/archives/list/python-dev@python.org/message/HTCARTYYCHETAMHB6OVRNR5EW5T2CP4J/
.. [#pep-484-args] https://www.python.org/dev/peps/pep-0484/#arbitrary-argument-lists-and-default-argument-values
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: