[PEP 646] Allow cleanly substituting any tuple type for a TypeVarTuple (#2162)

A number of cleanups also happened, improving readability of various sections and examples, but the only *semantic* change here is that TypeVarTuples can now be substituted with unbounded tuples (previously that was explicitly forbidden, but we convinced ourselves that there is no reason for that).
Pradeep Kumar 2021-12-08 14:07:27 -08:00 committed by GitHub
parent 23307c1239
commit f1c02245cc
1 changed files with 255 additions and 118 deletions

@ -146,18 +146,21 @@ data type.)
Specification
=============
In order to support the above use cases, we introduce ``TypeVarTuple``. This serves as a placeholder not for a single type but for an *arbitrary* number of types, and behaves like a number of ``TypeVar`` instances packed in a ``Tuple``.
In order to support the above use cases, we introduce
``TypeVarTuple``. This serves as a placeholder not for a single type
but for a *tuple* of types.
In addition, we introduce a new use for the star operator: to 'unpack'
``TypeVarTuple`` instances, in order to access the type variables
contained in the tuple.
``TypeVarTuple`` instances and tuple types such as ``Tuple[int,
str]``. Unpacking a ``TypeVarTuple`` or tuple type is the typing
equivalent of unpacking a variable or a tuple of values.
Type Variable Tuples
--------------------
In the same way that a normal type variable is a stand-in for a single type,
a type variable *tuple* is a stand-in for an arbitrary number of types (zero or
more) in a flat ordered list.
In the same way that a normal type variable is a stand-in for a single
type such as ``int``, a type variable *tuple* is a stand-in for a *tuple* type such as
``Tuple[int, str]``.
Type variable tuples are created with:
@ -167,6 +170,9 @@ Type variable tuples are created with:
Ts = TypeVarTuple('Ts')
Using Type Variable Tuples in Generic Classes
'''''''''''''''''''''''''''''''''''''''''''''
Type variable tuples behave like a number of individual type variables packed in a
``Tuple``. To understand this, consider the following example:
@ -199,6 +205,9 @@ and so on:
y: Array[Batch, Height, Width] = Array()
z: Array[Time, Batch, Height, Width] = Array()
Using Type Variable Tuples in Functions
'''''''''''''''''''''''''''''''''''''''
Type variable tuples can be used anywhere a normal ``TypeVar`` can.
This includes class definitions, as shown above, as well as function
signatures and variable annotations:
@ -264,74 +273,6 @@ To keep this PEP minimal, ``TypeVarTuple`` does not yet support specification of
We leave the decision of how these arguments should behave to a future PEP, when variadic generics have been tested in the field. As of this PEP, type variable tuples are
invariant.
Behaviour when Type Parameters are not Specified
''''''''''''''''''''''''''''''''''''''''''''''''
When a generic class parameterised by a type variable tuple is used without
any type parameters, it behaves as if its type parameters are '``Any, ...``'
(an arbitrary number of ``Any``):
::
def takes_any_array(arr: Array): ...
x: Array[Height, Width]
takes_any_array(x) # Valid
y: Array[Time, Height, Width]
takes_any_array(y) # Also valid
This enables gradual typing: existing functions accepting, for example,
a plain TensorFlow ``Tensor`` will still be valid even if ``Tensor`` is made
generic and calling code passes a ``Tensor[Height, Width]``.
This also works in the opposite direction:
::
def takes_specific_array(arr: Array[Height, Width]): ...
z: Array
takes_specific_array(z)
This way, even if libraries are updated to use types like ``Array[Height, Width]``,
users of those libraries won't be forced to also apply type annotations to
all of their code; users still have a choice about what parts of their code
to type and which parts to not.
Type Variable Tuples Must Have Known Length
'''''''''''''''''''''''''''''''''''''''''''
Type variable tuples may not be bound to a type with unknown length.
That is:
::
def foo(x: Tuple[*Ts]): ...
x: Tuple[float, ...]
foo(x) # NOT valid; Ts would be bound to ``Tuple[float, ...]``
If this is confusing - didn't we say that type variable tuples are a stand-in
for an *arbitrary* number of types? - note the difference between the
length of the type variable tuple *itself*, and the length of the type it is
*bound* to. Type variable tuples themselves can be of arbitrary length -
that is, they can be bound to ``Tuple[int]``, ``Tuple[int, int]``, and
so on - but the types they are bound to must be of known length -
that is, ``Tuple[int, int]``, but not ``Tuple[int, ...]``.
Note that, as a result of this rule, omitting the type parameter list is the
*only* way of instantiating a generic type with an arbitrary number of
type parameters. (We plan to introduce a more deliberate syntax for this
case in a future PEP.) For example, an unparameterised ``Array`` may
*behave* like ``Array[Any, ...]``, but it cannot be instantiated using
``Array[Any, ...]``, because this would bind its type variable tuple to ``Tuple[Any, ...]``:
::
x: Array # Valid
y: Array[int, ...] # Error
z: Array[Any, ...] # Error
Type Variable Tuple Equality
''''''''''''''''''''''''''''
@ -375,6 +316,11 @@ As of this PEP, only a single type variable tuple may appear in a type parameter
class Array(Generic[*Ts1, *Ts2]): ... # Error
The reason is that multiple type variable tuples make it ambiguous
which parameters get bound to which type variable tuple: ::
x: Array[int, str, bool] # Ts1 = ???, Ts2 = ???
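To make the ambiguity concrete, the following sketch enumerates every way the three type arguments could be divided between ``Ts1`` and ``Ts2``. The helper ``possible_splits`` is a hypothetical name used only for illustration; it works on ordinary runtime tuples and is not how a type checker is specified to behave.

```python
from typing import List, Tuple


def possible_splits(params: tuple) -> List[Tuple[tuple, tuple]]:
    """Return every order-preserving way to split `params` into (Ts1, Ts2)."""
    return [(params[:i], params[i:]) for i in range(len(params) + 1)]


# Each pair is an equally plausible binding for Array[int, str, bool].
splits = possible_splits((int, str, bool))
for ts1, ts2 in splits:
    print("Ts1 =", ts1, "| Ts2 =", ts2)
```

With three arguments there are four candidate bindings and nothing in the annotation selects one of them, which is the ambiguity the rule avoids.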
Type Concatenation
------------------
@ -397,7 +343,7 @@ prefixed and/or suffixed:
b = add_batch_axis(a) # Inferred type is Array[Batch, Height, Width]
c = del_batch_axis(b) # Array[Height, Width]
d = add_batch_channels(a) # Array[Batch, Height, Width, Channels]
Normal ``TypeVar`` instances can also be prefixed and/or suffixed:
@ -414,6 +360,102 @@ Normal ``TypeVar`` instances can also be prefixed and/or suffixed:
z = prefix_tuple(x=0, y=(True, 'a'))
# Inferred type of z is Tuple[int, bool, str]
Unpacking Tuple Types
---------------------
We mentioned that a ``TypeVarTuple`` stands for a tuple of types.
Since we can unpack a ``TypeVarTuple``, for consistency, we also
allow unpacking a tuple type. As we shall see, this also enables a
number of interesting features.
Unpacking Concrete Tuple Types
''''''''''''''''''''''''''''''
Unpacking a concrete tuple type is analogous to unpacking a tuple of
values at runtime. ``Tuple[int, *Tuple[bool, bool], str]`` is
equivalent to ``Tuple[int, bool, bool, str]``.
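The runtime half of this analogy can be sketched with plain tuples. This is only an illustration of value-level unpacking; the variable names are arbitrary, and the snippet does not exercise the type-level syntax itself.

```python
# Unpacking a tuple of values splices its elements into the enclosing tuple.
inner = (True, False)
flattened = (1, *inner, "a")
print(flattened)

# The same splice applied to a tuple of type objects mirrors the claim that
# Tuple[int, *Tuple[bool, bool], str] is equivalent to Tuple[int, bool, bool, str].
inner_types = (bool, bool)
flattened_types = (int, *inner_types, str)
print(flattened_types)
```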
Unpacking Unbounded Tuple Types
'''''''''''''''''''''''''''''''
Unpacking an unbounded tuple preserves the unbounded tuple as it is.
That is, ``*Tuple[int, ...]`` remains ``*Tuple[int, ...]``; there's no
simpler form. This enables us to specify types such as ``Tuple[int,
*Tuple[str, ...], str]`` - a tuple type where the first element is
guaranteed to be of type ``int``, the last element is guaranteed to be
of type ``str``, and the elements in the middle are zero or more
elements of type ``str``. Note that ``Tuple[*Tuple[int, ...]]`` is
equivalent to ``Tuple[int, ...]``.
Unpacking unbounded tuples is also useful in function signatures where
we don't care about the exact elements and don't want to define an
unnecessary ``TypeVarTuple``:
::
def process_batch_channels(
x: Array[Batch, *Tuple[Any, ...], Channels]
) -> None:
...
x: Array[Batch, Height, Width, Channels]
process_batch_channels(x) # OK
y: Array[Batch, Channels]
process_batch_channels(y) # OK
z: Array[Batch]
process_batch_channels(z) # Error: Expected Channels.
We can also pass a ``*Tuple[int, ...]`` wherever a ``*Ts`` is
expected. This is useful when we have particularly dynamic code and
cannot state the precise number of dimensions or the precise types for
each of the dimensions. In those cases, we can smoothly fall back to
an unbounded tuple:
::
y: Array[*Tuple[Any, ...]] = read_from_file()
def expect_variadic_array(
x: Array[Batch, *Shape]
) -> None: ...
expect_variadic_array(y) # OK
def expect_precise_array(
x: Array[Batch, Height, Width, Channels]
) -> None: ...
expect_precise_array(y) # OK
``Array[*Tuple[Any, ...]]`` stands for an array with an arbitrary
number of dimensions of type ``Any``. This means that, in the call to
``expect_variadic_array``, ``Batch`` is bound to ``Any`` and ``Shape``
is bound to ``Tuple[Any, ...]``. In the call to
``expect_precise_array``, the variables ``Batch``, ``Height``,
``Width``, and ``Channels`` are all bound to ``Any``.
This allows users to handle dynamic code gracefully while still
explicitly marking the code as unsafe (by using ``y: Array[*Tuple[Any,
...]]``). Otherwise, users would face noisy errors from the type
checker every time they tried to use the variable ``y``, which would
hinder them when migrating a legacy code base to use ``TypeVarTuple``.
Multiple Unpackings in a Tuple: Not Allowed
'''''''''''''''''''''''''''''''''''''''''''
As with ``TypeVarTuples``, `only one <Multiple Type Variable Tuples:
Not Allowed_>`_ unpacking may appear in a tuple:
::
x: Tuple[int, *Ts, str, *Ts2] # Error
y: Tuple[int, *Tuple[int, ...], str, *Tuple[str, ...]] # Error
``*args`` as a Type Variable Tuple
----------------------------------
@ -428,13 +470,59 @@ individual arguments become the types in the type variable tuple:
::
Ts = TypeVarTuple('Ts')
def args_to_tuple(*args: *Ts) -> Tuple[*Ts]: ...
args_to_tuple(1, 'a') # Inferred type is Tuple[int, str]
If no arguments are passed, the type variable tuple behaves like an
empty tuple, ``Tuple[()]``.
In the above example, ``Ts`` is bound to ``Tuple[int, str]``. If no
arguments are passed, the type variable tuple behaves like an empty
tuple, ``Tuple[()]``.
As usual, we can unpack any tuple types. For example, by using a type
variable tuple inside a tuple of other types, we can refer to prefixes
or suffixes of the variadic argument list:
::
# os.execle takes arguments 'path, arg0, arg1, ..., env'
def execle(path: str, *args: *Tuple[*Ts, Env]) -> None: ...
Note that this is different to
::
def execle(path: str, *args: *Ts, env: Env) -> None: ...
as this would make ``env`` a keyword-only argument.
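The keyword-only behaviour can be checked at runtime with ``inspect``. The sketch below drops the annotations so that it does not depend on the new star syntax; the function names and the ``parameter_kinds`` helper are hypothetical and exist only for this illustration.

```python
import inspect


def execle_with_positional_env(path, *args):
    """First form: the environment is simply the last positional argument."""


def execle_with_keyword_env(path, *args, env):
    """Second form: `env` follows *args, so it can only be passed by keyword."""


def parameter_kinds(func):
    """Map each parameter name to the name of its inspect.Parameter kind."""
    return {
        name: param.kind.name
        for name, param in inspect.signature(func).parameters.items()
    }


print(parameter_kinds(execle_with_positional_env))
print(parameter_kinds(execle_with_keyword_env))
```

In the second form ``env`` is reported as ``KEYWORD_ONLY``, so a caller could no longer supply the environment as the final positional argument.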
Using an unpacked unbounded tuple is equivalent to the PEP 484
behavior [#pep-484-args]_ of ``*args: int``, which accepts zero or
more values of type ``int``:
::
def foo(*args: *Tuple[int, ...]) -> None: ...
# equivalent to:
def foo(*args: int) -> None: ...
Unpacking tuple types also allows more precise types for heterogeneous
``*args``. The following function expects an ``int`` at the beginning,
zero or more ``str`` values, and a ``str`` at the end:
::
def foo(*args: *Tuple[int, *Tuple[str, ...], str]) -> None: ...
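As a rough runtime analogue of what this annotation accepts, the sketch below checks a tuple of argument values against the same shape. The function ``matches_int_strs_str`` is a hypothetical helper, not part of the PEP, and a static type checker would of course do this without running any code.

```python
def matches_int_strs_str(args: tuple) -> bool:
    """Runtime analogue of Tuple[int, *Tuple[str, ...], str]."""
    if len(args) < 2:
        # The leading int and the trailing str are both mandatory.
        return False
    first, *middle, last = args
    return (
        isinstance(first, int)
        and all(isinstance(item, str) for item in middle)
        and isinstance(last, str)
    )


print(matches_int_strs_str((1, "end")))            # middle section is empty
print(matches_int_strs_str((1, "a", "b", "end")))  # two strings in the middle
print(matches_int_strs_str((1,)))                  # trailing str is missing
```

The point of the sketch is that the unbounded middle section may be empty, while the fixed prefix and suffix elements are always required.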
For completeness, we mention that unpacking a concrete tuple allows us
to specify ``*args`` of a fixed number of heterogeneous types:
::
def foo(*args: *Tuple[int, str]) -> None: ...
foo(1, "hello") # OK
Note that, in keeping with the rule that type variable tuples must always
be used unpacked, annotating ``*args`` as being a plain type variable tuple
@ -457,17 +545,6 @@ all arguments must be a ``Tuple`` parameterised with the same types.
foo((0,), (1, 2)) # Error
foo((0,), ('1',)) # Error
Following `Type Variable Tuples Must Have Known Length`_, note
that the following should *not* type-check as valid (even though it is, of
course, valid at runtime):
::
def foo(*args: *Ts): ...
def bar(x: Tuple[int, ...]):
foo(*x) # NOT valid
Finally, note that a type variable tuple may *not* be used as the type of
``**kwargs``. (We do not yet know of a use case for this feature, so we prefer
to leave the ground fresh for a potential future PEP.)
@ -488,12 +565,12 @@ Type variable tuples can also be used in the arguments section of a
class Process:
def __init__(
self,
target: Callable[[*Ts], Any],
args: Tuple[*Ts]
): ...
target: Callable[[*Ts], None],
args: Tuple[*Ts],
) -> None: ...
def func(arg1: int, arg2: str) -> None: ...
def func(arg1: int, arg2: str): ...
Process(target=func, args=(0, 'foo')) # Valid
Process(target=func, args=('foo', 0)) # Error
@ -506,6 +583,63 @@ to the type variable tuple:
def foo(f: Callable[[int, *Ts, T], Tuple[T, *Ts]]): ...
The behavior of a Callable containing an unpacked item, whether the
item is a ``TypeVarTuple`` or a tuple type, is to treat the elements
as if they were the type for ``*args``. So, ``Callable[[*Ts], None]``
is treated as the type of the function:
::
def foo(*args: *Ts) -> None: ...
``Callable[[int, *Ts, T], Tuple[T, *Ts]]`` is treated as the type of
the function:
::
def foo(*args: *Tuple[int, *Ts, T]) -> Tuple[T, *Ts]: ...
Behaviour when Type Parameters are not Specified
------------------------------------------------
When a generic class parameterised by a type variable tuple is used without
any type parameters, it behaves as if the type variable tuple was
substituted with ``Tuple[Any, ...]``:
::
def takes_any_array(arr: Array): ...
# equivalent to:
def takes_any_array(arr: Array[*Tuple[Any, ...]]): ...
x: Array[Height, Width]
takes_any_array(x) # Valid
y: Array[Time, Height, Width]
takes_any_array(y) # Also valid
This enables gradual typing: existing functions accepting, for example,
a plain TensorFlow ``Tensor`` will still be valid even if ``Tensor`` is made
generic and calling code passes a ``Tensor[Height, Width]``.
This also works in the opposite direction:
::
def takes_specific_array(arr: Array[Height, Width]): ...
z: Array
# equivalent to Array[*Tuple[Any, ...]]
takes_specific_array(z)
(For details, see the section on `Unpacking Unbounded Tuple Types`_.)
This way, even if libraries are updated to use types like ``Array[Height, Width]``,
users of those libraries won't be forced to also apply type annotations to
all of their code; users still have a choice about what parts of their code
to type and which parts to not.
Aliases
-------
@ -547,8 +681,9 @@ tuple in the alias is set empty:
IntTuple[()] # Equivalent to Tuple[int]
NamedArray[()] # Equivalent to Tuple[str, Array[()]]
If the type parameter list is omitted entirely, the alias is
compatible with arbitrary type parameters:
If the type parameter list is omitted entirely, the unspecified type
variable tuples are treated as ``Tuple[Any, ...]`` (similar to
`Behaviour when Type Parameters are not Specified`_):
::
@ -573,9 +708,9 @@ Normal ``TypeVar`` instances can also be used in such aliases:
Foo[str, int]
# T bound to float, Ts to Tuple[()]
Foo[float]
# T bound to Any, Ts to an arbitrary number of Any
# T bound to Any, Ts to Tuple[Any, ...]
Foo
Overloads for Accessing Individual Types
----------------------------------------
@ -654,8 +789,8 @@ otherwise imply. Also, we may later wish to support arguments that should not be
We therefore settled on ``TypeVarTuple``.
Behaviour when Type Parameters are not Specified
------------------------------------------------
Unspecified Type Parameters: Tuple vs TypeVarTuple
--------------------------------------------------
In order to support gradual typing, this PEP states that *both*
of the following examples should type-check correctly:
@ -756,7 +891,7 @@ within square brackets), necessary to support star-unpacking of TypeVarTuples:
Before:
::
slices:
| slice !','
| ','.slice+ [',']
@ -764,7 +899,7 @@ Before:
After:
::
slices:
| slice !','
| ','.(slice | starred_expression)+ [',']
@ -792,7 +927,7 @@ implementation.
class TypeVarTuple:
def __init__(self, name):
self._name = name
self._unpacked = UnpackedTypeVarTuple(name)
self._unpacked = UnpackedTypeVarTuple(name)
def __iter__(self):
yield self._unpacked
def __repr__(self):
@ -945,7 +1080,7 @@ reasons:
the user is familiar with star-unpacking in other contexts; if the
user is reading or writing code that uses variadic generics, this seems
reasonable.)
If even change 1 is thought too significant a change, therefore, it might be
better for us to reconsider our options before going ahead with this second
alternative.
@ -1000,7 +1135,7 @@ We can attach names to each parameter using normal type variables:
b: Array[Literal[32]]
matrix_vector_multiply(a, b)
# Result is Array[Literal[64]]
Note that such names have a purely local scope. That is, the name
``K`` is bound to ``Literal[64]`` only within ``matrix_vector_multiply``. To put it another
way, there's no relationship between the value of ``K`` in different
@ -1022,7 +1157,7 @@ type operators that enable arithmetic on array shapes - for example:
::
def repeat_each_element(x: Array[N]) -> Array[Mul[2, N]]: ...
Such arithmetic type operators would only make sense if names such as ``N`` refer to axis size.
Use Case 2: Specifying Shape Semantics
@ -1037,16 +1172,16 @@ For example:
::
# lib.py
class Batch: pass
class Time: pass
def make_array() -> Array[Batch, Time]: ...
# user.py
from lib import Batch, Time
# `Batch` and `Time` have the same identity as in `lib`,
# so must take array as produced by `lib.make_array`
def use_array(x: Array[Batch, Time]): ...
@ -1070,17 +1205,17 @@ without knowing the type ahead of time. For example, we can still write:
N = TypeVar('N')
def matrix_vector_multiply(x: Array[K, N], v: Array[N]) -> Array[K]: ...
We can then use this with:
class Batch: pass
class Values: pass
batch_of_values: Array[Batch, Values]
value_weights: Array[Values]
matrix_vector_multiply(batch_of_values, value_weights)
# Result is Array[Batch]
The disadvantages are the inverse of the advantages from use case 1.
In particular, this approach does not lend itself well to arithmetic
on axis types: ``Mul[2, Batch]`` would be as meaningless as ``2 * int``.
@ -1103,7 +1238,7 @@ Consider the following 'normal' code:
::
def f(x: int): ...
Note that we have symbols for both the value of the thing (``x``) and the type of
the thing (``int``). Why can't we do the same with axes? For example, with an imaginary
syntax, we could write:
@ -1111,7 +1246,7 @@ syntax, we could write:
::
def f(array: Array[TimeValue: TimeType]): ...
This would allow us to access the axis size (say, 32) through the symbol ``TimeValue``
*and* the type through the symbol ``TimeType``.
@ -1120,7 +1255,7 @@ This might even be possible using existing syntax, through a second level of par
::
def f(array: Array[TimeValue[TimeType]]): ...
However, we leave exploration of this approach to the future.
Appendix B: Shaped Types vs Named Axes
@ -1314,6 +1449,8 @@ References
.. [#dan-endorsement] https://mail.python.org/archives/list/python-dev@python.org/message/HTCARTYYCHETAMHB6OVRNR5EW5T2CP4J/
.. [#pep-484-args] https://www.python.org/dev/peps/pep-0484/#arbitrary-argument-lists-and-default-argument-values
Copyright
=========