diff --git a/pep-0646.rst b/pep-0646.rst index c1cc85141..41023483e 100644 --- a/pep-0646.rst +++ b/pep-0646.rst @@ -2,6 +2,7 @@ PEP: 0646 Title: Variadic Generics Author: Mark Mendoza , Matthew Rahtz , + Pradeep Kumar Srinivasan , Vincent Siles Sponsor: Guido van Rossum Status: Draft @@ -9,13 +10,13 @@ Type: Standards Track Content-Type: text/x-rst Created: 16-Sep-2020 Python-Version: 3.10 -Post-History: 07-Oct-2020 +Post-History: 07-Oct-2020, 23-Dec-2020, 29-Dec-2020 Abstract ======== PEP 484 introduced ``TypeVar``, enabling creation of generics parameterised -with a single type. In this PEP, we introduce ``TypeTuple``, enabling parameterisation +with a single type. In this PEP, we introduce ``TypeVarTuple``, enabling parameterisation with an *arbitrary* number of types - that is, a *variadic* type variable, enabling *variadic* generics. This allows the type of array-like structures in numerical computing libraries such as NumPy and TensorFlow to be @@ -25,7 +26,7 @@ to catch shape-related bugs in code that uses these libraries. Motivation ========== -There are two main use-cases for variadic generics. +There are two main use-cases for variadic generics. [#hkt]_ The primary motivation is to enable typing of array shapes in numerical computing libraries such as NumPy and TensorFlow. This is the use-case @@ -34,7 +35,7 @@ much of the PEP will focus on. Additionally, variadic generics allow us to concisely specify the type signature of ``map`` and ``zip``. -We discuss each of these three motivations below. +We discuss each of these motivations below. Array Shapes ------------- @@ -46,7 +47,7 @@ batch [#batch]_ of videos to grayscale: :: - def to_gray(videos: Tensor): ... + def to_gray(videos: Array): ... From the signature alone, it is not obvious what shape of array [#array]_ we should pass for the ``videos`` argument. Possibilities include, for @@ -65,25 +66,25 @@ this purpose. We would write: :: - def to_gray(videos: Tensor[Time, Batch, Height, Width, Channels]): ... + def to_gray(videos: Array[Time, Batch, Height, Width, Channels]): ... -However, note that arrays can be of arbitrary rank - ``Tensor`` as used above is +However, note that arrays can be of arbitrary rank - ``Array`` as used above is generic in an arbitrary number of axes. One way around this would be to use a different -``Tensor`` class for each rank... +``Array`` class for each rank... :: Axis1 = TypeVar('Axis1') Axis2 = TypeVar('Axis2') - class Tensor1(Generic[Axis1]): ... + class Array1(Generic[Axis1]): ... - class Tensor2(Generic[Axis1, Axis2]): ... + class Array2(Generic[Axis1, Axis2]): ... ...but this would be cumbersome, both for users (who would have to sprinkle 1s and 2s -and so on throughout their code) and for the authors of tensor libraries (who would have to duplicate implementations throughout multiple classes). +and so on throughout their code) and for the authors of array libraries (who would have to duplicate implementations throughout multiple classes). -Variadic generics are necessary for a ``Tensor`` that is generic in an arbitrary +Variadic generics are necessary for a ``Array`` that is generic in an arbitrary number of axes to be cleanly defined as a single class. ``map`` and ``zip`` @@ -130,126 +131,109 @@ Specification In order to support the above use-cases, we introduce: -* ``TypeTupleVar``, a ``TypeVar`` that acts as a placeholder not for a single - type but for an *arbitrary* number of types. -* A new syntax for parameterizing generic functions and classes using a - type tuple variable. 
-* Two new type operators, ``Apply`` and ``Map``. +* ``TypeVarTuple``, serving as a placeholder not for a single type but + for an *arbitrary* number of types, and behaving like a number of + ``TypeVar`` instances packed in a ``Tuple``. +* A new use for the star operator: unpacking of each individual type + from a ``TypeVarTuple``. +* Two new type operators, ``Unpack`` and ``Map``. These are described in detail below. -Type Tuple Variables +Type Variable Tuples -------------------- In the same way that a normal type variable is a stand-in for a single type, -a type *tuple* variable is a stand-in for an arbitrary number of types in a flat -ordered list. +a type variable *tuple* is a stand-in for an arbitrary number of types (zero or +more) in a flat ordered list. -Type tuple variables are created with: +Type variable tuples are created with: :: - from typing import TypeTupleVar + from typing import TypeVarTuple - Ts = TypeTupleVar('Ts') + Ts = TypeVarTuple('Ts') -A type tuple variable behaves in a similar way to a parameterized ``Tuple``. +A type variable tuple behaves in a similar way to a parameterized ``Tuple``. For example, in a generic object instantiated with type parameters -``int`` and ``str``, ``Ts`` behaves similarly to ``Tuple[int, str]``. +``int`` and ``str``, ``Ts`` is equivalent to ``Tuple[int, str]``. -Parameterizing Types: Star Operator -''''''''''''''''''''''''''''''''''' - -One use of type tuple variables are to parameterize variadic types -such as ``Tuple``. - -To differentiate type tuple variables from normal type variables, we introduce -a new use for the star operator: - -:: - - Tuple[*Ts] - -The star operator here serves to 'expand' the type tuple into -its component types. For example, in a generic object instantiated -with ``Ts`` being ``int`` and ``str``, then ``Tuple[*Ts]`` would -be equivalent to ``Tuple[int, str]``. - -For consistency, the star operator can also be applied directly to a -parameterised ``Tuple``: - -:: - - Types = Tuple[int, str, bool, float, double] - Tuple[*Types] # Also valid - - -Parameterizing Types: ``Expand`` -'''''''''''''''''''''''''''''''' - -Because the new use of the star operator requires a syntax change and is -therefore incompatible with previous versions of Python, we also introduce the -``Expand`` type operator for use in existing versions of Python. ``Expand`` -behaves identically to the star operator, but without requiring a syntax change. -In any place you would normally write ``*Ts``, you can also write ``Expand[Ts]``. - -Parameterizing Function Signatures and Classes -'''''''''''''''''''''''''''''''''''''''''''''' - -Type tuple variables can be used anywhere a normal ``TupleVar`` can. For example, -in class definitions, function signatures, and variable annotations: +Type variable tuples can be used anywhere a normal ``TupleVar`` can. +For example, in class definitions, function signatures, and variable annotations: :: Shape = TypeTupleVar('Shape') - class Tensor(Generic[*Shape]): + class Array(Generic[Shape]): - def __init__(self, shape: Tuple[int, ...]): + def __init__(self, shape: Shape): self.shape: Shape = shape - def __abs__(self) -> Tensor[*Shape]: ... + def __abs__(self) -> Array[Shape]: ... - def __add__(self, other: Tensor[*Shape]) -> Tensor[*Shape]: ... + def __add__(self, other: Array[Shape]) -> Array[Shape]: ... 
-    class Height: pass
-    class Width: pass
-
-    x: Tensor[Height, Width] = Tensor(shape=(640, 480))
+    Height = NewType('Height', int)
+    Width = NewType('Width', int)
+    shape = (Height(480), Width(640))
+    x: Array[Tuple[Height, Width]] = Array(shape)
     x.shape  # Inferred type is Tuple[Height, Width]
 
-    y = abs(x)  # Tensor[Height, Width]
-    z = x + y  # Tensor[Height, Width]
+    y = abs(x)  # Array[Tuple[Height, Width]]
+    z = x + y  # Array[Tuple[Height, Width]]
 
-Unexpanded Type Tuple Variables
-'''''''''''''''''''''''''''''''
+Variance and ``bound``: Not (Yet) Supported
+'''''''''''''''''''''''''''''''''''''''''''
 
-Until now, we have always expanded type tuple variables.
-However, type tuple variables can also be used without being expanded.
-When used in this way, the type tuple variable behaves like a
-``Tuple`` parameterised by the types that the type tuple variable
-is bound to. That is:
+To keep this PEP minimal, ``TypeVarTuple`` does not yet support
+the ``bound`` argument or specification of variance, as ``TypeVar``
+does. We leave the decision of how these arguments should be implemented
+to a future PEP, when use-cases for variadic generics have been
+explored more in practice.
+
+Unpacking: Star Operator
+''''''''''''''''''''''''
+
+Note that the fully-parameterised type of ``Array`` above is
+rather verbose. Wouldn't it be easier if we could just write
+``Array[Height, Width]``?
+
+To enable this, we introduce a new use for the star operator:
+to 'unpack' type variable tuples. When unpacked, a type variable tuple
+behaves as if its component types had been written
+directly into the signature, rather than being wrapped in a ``Tuple``.
+
+Rewriting the ``Array`` class using an unpacked type variable
+tuple, we can instead write:
 
 ::
 
-    def foo(x: Tuple[*Ts]) -> Tuple[*Ts]: ...
-    # could also be written as
-    def foo(x: Ts) -> Ts: ...
-
-Type tuple variables can also be used unexpanded in in the context
-of generic classes. However, note that when used in this way,
-type parameters to the generic class must be explicitly
-enclosed in a ``Tuple``.
+    Shape = TypeVarTuple('Shape')
 
-::
+    class Array(Generic[*Shape]):
 
-    class Foo(Generic[Ts]): ...
+        def __init__(self, shape: Shape):
+            self.shape: Shape = shape
 
-    foo: Foo[Tuple[int, str]]
+        def __add__(self, other: Array[*Shape]) -> Array[*Shape]: ...
 
-See `Concatenating Multiple type tuple Variables`_ below for why this
-is important.
+    shape = (Height(480), Width(640))
+    x: Array[Height, Width] = Array(shape)
+    x.shape  # Inferred type is Tuple[Height, Width]
+    z = x + x  # Array[Height, Width]
 
+Unpacking: ``Unpack`` Operator
+''''''''''''''''''''''''''''''
 
-``*args`` as a Type Tuple Variable
+Because the new use of the star operator requires a syntax change and is
+therefore incompatible with previous versions of Python, we also introduce the
+``typing.Unpack`` type operator for use in existing versions of Python. ``Unpack``
+takes a single type variable tuple argument, and behaves identically to the star
+operator, but without requiring a syntax change. In any place you would normally
+write ``*Ts``, you can also write ``Unpack[Ts]``.
+
+``*args`` as a Type Variable Tuple
 ''''''''''''''''''''''''''''''''''
 
 PEP 484 states that when a type annotation is provided for ``*args``, each argument
@@ -257,42 +241,74 @@ must be of the type annotated. That is, if we specify ``*args`` to be type ``int
 then *all* arguments must be of type ``int``. This limits our ability to specify
 the type signatures of functions that take heterogeneous argument types.
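+
+For example, the existing behaviour can describe any number of arguments of a
+single type, but not the heterogeneous case (a minimal illustration of the
+PEP 484 rule; the function name here is hypothetical):
+
+::
+
+    # PEP 484: every element of *args must be an int
+    def sum_ints(*args: int) -> int:
+        return sum(args)
+
+    sum_ints(1, 2, 3)    # Valid
+    sum_ints(1, 'two')   # Rejected: 'two' is not an int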
-If ``*args`` is annotated as being an expanded type tuple variable, however, the
-types of the individual arguments become the types in the type tuple:
+If ``*args`` is annotated as an unpacked type variable tuple, however, the
+types of the individual arguments become the types in the type variable tuple:
 
 ::
 
-    def args_to_tuple(*args: *Ts) -> Tuple[*Ts]: ...
+    def args_to_tuple(*args: *Ts) -> Ts: ...
 
     args_to_tuple(1, 'a')  # Inferred type is Tuple[int, str]
 
-Inside the body of ``args_to_tuple``, the type of ``args`` is ``Tuple[*Ts]``
-(with ``*Ts`` substituted for the actual types at runtime).
-
-Note that, for consistency, the following is also valid syntactically:
+Note that the type variable tuple must be unpacked in order for this new
+behaviour to apply. If the type variable tuple is not unpacked, the old
+behaviour still applies:
 
 ::
 
-    def foo(*args: *Tuple[int, str]): ...
+    # *All* arguments must be of type Tuple[T1, T2],
+    # where T1 and T2 are the same types for all arguments
+    def foo(*args: Ts) -> Ts: ...
 
-However, since it is a strange thing to do (why not just specify the arguments
-directly as ``arg1: int, arg2: str``?), we recommend type checkers
-emit a warning when coming across such annotations.
+    x: Tuple[int, str]
+    y: Tuple[int, str]
+    foo(x, y)  # Valid
 
-Also note that when a type tuple variable is used in this way, it *must*
-be in conjunction with the star operator:
+    z: Tuple[bool]
+    foo(x, z)  # Not valid
 
-::
-
-    def foo(*args: Ts): ...  # NOT valid
-
-Finally, note that a type tuple variable may *not* be used as the type of
+Finally, note that a type variable tuple may *not* be used as the type of
 ``**kwargs``. (We do not yet know of a use-case for this feature, so prefer
 to leave the ground fresh for a potential future PEP.)
 
 ::
 
-    def foo(**kwargs: *Ts): ...  # NOT valid
+    # NOT valid
+    def foo(**kwargs: Ts): ...
+    def foo(**kwargs: *Ts): ...
+
+Type Variable Tuples with ``Callable``
+''''''''''''''''''''''''''''''''''''''
+
+Type variable tuples can also be used in the arguments section of a
+``Callable``:
+
+::
+
+    class Process:
+        def __init__(self, target: Callable[[*Ts], Any], args: Tuple[*Ts]): ...
+
+    def func(arg1: int, arg2: str): ...
+
+    Process(target=func, args=(0, 'foo'))  # Passes type-check
+    Process(target=func, args=('foo', 0))  # Fails type-check
+
+Type Variable Tuples with ``Union``
+'''''''''''''''''''''''''''''''''''
+
+Finally, type variable tuples can be used with ``Union``:
+
+::
+
+    def f(*args: *Ts) -> Union[*Ts]:
+        return random.choice(args)
+
+    f(1, 'foo')  # Inferred type is Union[int, str]
+
+If the type variable tuple is empty (e.g. if we had ``*args: *Ts``
+and didn't pass any arguments), the type checker should
+raise an error on the ``Union`` (matching the behaviour of ``Union``
+at runtime, which requires at least one type argument).
-The result of ``Map`` is a ``Tuple``, where the Nth type in the ``Tuple`` is the
-first operand parameterized by the Nth type in the second operand.
+type (or type alias [#type_aliases]_) such as ``Tuple``, ``List``, or a
+user-defined generic class. The second operand is a type variable tuple.
+The result of ``Map`` is a ``Tuple``, where the Nth type in the ``Tuple`` is
+the first operand parameterized by the Nth type in the type variable tuple.
 
 Because ``Map`` returns a parameterized ``Tuple``, it can be used anywhere
-that a type tuple variable would be. For example:
+that a type variable tuple would be. For example, as the type of ``*args``:
 
 ::
 
     # Equivalent to 'arg1: List[T1], arg2: List[T2], ...'
     def foo(*args: *Map[List, Ts]): ...
 
+    # Ts is bound to Tuple[int, str]
+    foo([1], ['a'])
+
+As a return type:
+
+::
 
     # Equivalent to '-> Tuple[List[T1], List[T2], ...]'
     def bar(*args: *Ts) -> Map[List, Ts]: ...
 
+    # Ts is bound to Tuple[float, bool]
+    # Inferred type is Tuple[List[float], List[bool]]
+    bar(1.0, True)
 
-    bar()  # Inferred type is Tuple[()] (an empty tuple)
-    bar(1)  # Inferred type is Tuple[List[int]]
-    bar(1, 'a')  # Inferred type is Tuple[List[int], List[str]]
+And as an argument type:
+
+::
+
+    # Equivalent to 'arg: Tuple[List[T1], List[T2], ...]'
+    def baz(arg: Map[List, Ts]): ...
 
+    # Ts is bound to Tuple[bool, bool]
+    baz(([True], [False]))
 
 ``map`` and ``zip``
 '''''''''''''''''''
 
@@ -337,15 +367,15 @@ that a type tuple variable would be. For example:
 
 ::
 
-    ArgTs = TypeTupleVar('ArgTs')
-    ReturnT = TypeVar('ReturnT')
+    Ts = TypeVarTuple('Ts')
+    R = TypeVar('R')
 
-    def map(func: Callable[[*ArgTs], ReturnT],
-            *iterables: *Map[Iterable, ArgTs]) -> Iterable[ReturnT]: ...
+    def map(func: Callable[[*Ts], R],
+            *iterables: *Map[Iterable, Ts]) -> Iterable[R]: ...
 
     def func(int, str) -> float: ...
 
-    # ArgTs is bound to Tuple[int, str]
-    # Map[Iterable, ArgTs] is Iterable[int], Iterable[str]
+    # Ts is bound to Tuple[int, str]
+    # Map[Iterable, Ts] is Iterable[int], Iterable[str]
     # Therefore, iter1 must be type Iterable[int],
     # and iter2 must be type Iterable[str]
     map(func, iter1, iter2)
 
@@ -354,98 +384,314 @@ Similarly, we can specify the signature of ``zip`` as:
 
 ::
 
-    def zip(*iterables: *Map[Iterable, ArgTs]) -> Iterable[*ArgTs]): ...
+    def zip(*iterables: *Map[Iterable, Ts]) -> Iterator[Ts]: ...
 
     l1: List[int]
     l2: List[str]
-    zip(l1, l2)  # Iterable[int, str]
+    zip(l1, l2)  # Iterator[Tuple[int, str]]
 
-Nesting
-'''''''
-
-Because the type of the result of ``Map`` is the same as the type of its second
-operand, the result of one ``Map`` *can* be used as the input to another ``Map``:
-
-::
-
-    Map[Tuple, *Map[Tuple, Ts]]  # Valid!
-
-Accessing Individual Types
---------------------------
+Overloads for Accessing Individual Types
+----------------------------------------
 
 ``Map`` allows us to operate on types in a bulk fashion. For situations where we
 require access to each individual type, overloads can be used with individual
-``TypeVar`` instances in place of the type tuple variable:
+``TypeVar`` instances in place of the type variable tuple:
 
 ::
 
-    Shape = TypeTupleVar('Shape')
+    Shape = TypeVarTuple('Shape')
     Axis1 = TypeVar('Axis1')
     Axis2 = TypeVar('Axis2')
     Axis3 = TypeVar('Axis3')
 
-    class Tensor(Generic[*Shape]): ...
+    class Array(Generic[*Shape]): ...
 
-    @overload
-    class Tensor(Generic[Axis1, Axis2]):
+    @overload
+    def transpose(
+        self: Array[Axis1, Axis2]
+    ) -> Array[Axis2, Axis1]: ...
 
-        def transpose(self) -> Tensor[Axis2, Axis1]: ...
+    @overload
+    def transpose(
+        self: Array[Axis1, Axis2, Axis3]
+    ) -> Array[Axis3, Axis2, Axis1]: ...
 
-    @overload
-    class Tensor(Generic[Axis1, Axis2, Axis3]):
+(For array shape operations in particular, having to specify
+overloads for each possible rank is, of course, a rather cumbersome
+solution. However, it's the best we can do without additional type
+manipulation mechanisms, which are beyond the scope of this PEP.)
 
-        def transpose(self) -> Tensor[Axis3, Axis2, Axis1]: ...
-
-Concatenating Other Types to a Type Tuple Variable
+Concatenating Other Types to a Type Variable Tuple
 --------------------------------------------------
 
-If a type tuple variable appears with other types in the same type parameter
-list, the effect is to concatenate those types with the types
-in the type tuple variable:
+If an unpacked type variable tuple appears with other types in the same type parameter
+list, the effect is to concatenate those types with the types in the type variable
+tuple. For example, concatenation in a function return type:
 
 ::
 
-    Shape = TypeTupleVar('Shape')
-    class Batch: pass
-    class Height: pass
-    class Width: pass
+    Batch = NewType('Batch', int)
+    Height = NewType('Height', int)
+    Width = NewType('Width', int)
 
-    class Tensor(Generic[*Shape]): ...
+    class Array(Generic[*Shape]): ...
 
-    def add_batch(x: Tensor[*Shape]) -> Tensor[Batch, *Shape]: ...
+    def add_batch(x: Array[*Shape]) -> Array[Batch, *Shape]: ...
 
-    x: Tensor[Height, Width]
-    add_batch(x)  # Inferred type is Tensor[Batch, Height, Width]
+    x: Array[Height, Width]
+    y = add_batch(x)  # Inferred type is Array[Batch, Height, Width]
 
-Type tuple variables can also be combined with regular ``TypeVar`` instances:
+In function argument types:
 
 ::
 
-    T1 = TypeVar('T1')
-    T2 = TypeVar('T2')
+    def batch_sum(x: Array[Batch, *Shape]) -> Array[*Shape]: ...
 
-    class Foo(Generic[T1, T2, *Ts]): ...
+    x: Array[Batch, Height, Width]
+    y = batch_sum(x)  # Inferred type is Array[Height, Width]
 
-    foo: Foo[int, str, bool, float]  # T1=int, T2=str, Ts=Tuple[bool, float]
+And in class type parameters:
 
-Concatenating Multiple Type Tuple Variables
+::
+
+    class BatchArray(Generic[Batch, *Shape]):
+        def sum(self) -> Array[*Shape]: ...
+
+    x: BatchArray[Batch, Height, Width]
+    y = x.sum()  # Inferred type is Array[Height, Width]
+
+Concatenation can involve both prefixing and suffixing, and
+can include an arbitrary number of types:
+
+::
+
+    def foo(x: Tuple[*Ts]) -> Tuple[int, str, *Ts, bool]: ...
+
+It is also possible to concatenate type variable tuples with regular
+type variables:
+
+::
+
+    T = TypeVar('T')
+
+    def first_axis_sum(x: Array[T, *Shape]) -> Array[*Shape]: ...
+
+    x: Array[Time, Height, Width]
+    y = first_axis_sum(x)  # Inferred type is Array[Height, Width]
+
+Finally, concatenation can also occur in the argument list to ``Callable``:
+
+::
+
+    def f(func: Callable[[int, *Ts], Any]) -> Tuple[*Ts]: ...
+
+    def foo(a: int, b: str, c: float): ...
+    def bar(a: str, b: int, c: float): ...
+
+    f(foo)  # Valid; inferred type is Tuple[str, float]
+    f(bar)  # Not valid
+
+And in ``Union``:
+
+::
+
+    def f(*args: *Ts) -> Union[*Ts, float]: ...
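+    # For the call below, Ts is bound to Tuple[int, str], so the
+    # return type concatenates to Union[int, str, float]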
+
+    f(0, 'spam')  # Inferred type is Union[int, str, float]
+
+Concatenating Multiple Type Variable Tuples
 -------------------------------------------
 
-If multiple type tuple variables appear in a parameter list, in order
-to prevent ambiguity about which types would be bound to which type
-tuple variables, the type tuple variables must not be expanded:
+We can also concatenate *multiple* type variable tuples, but only in cases
+where the types bound to each type variable tuple can be inferred
+unambiguously. Note that this is not always the case:
 
 ::
 
-    # NOT allowed
-    class Bar(Generic[*Ts1, *Ts2]): ...
-    # How would we decide which types are bound to Ts1
-    # and which are bound to Ts2?
-    bar: Bar[int, str, bool]
+    # Type checker should raise an error on definition of func;
+    # how would we know which types are bound to Ts1, and which
+    # are bound to Ts2?
+    def func(ham: Tuple[*Ts1, *Ts2]): ...
 
-    # The right way
-    class Bar(Generic[Ts1, Ts2]): ...
-    bar: Bar[Tuple[int], Tuple[str, bool]]
+    # Ts1 = Tuple[int, str], Ts2 = Tuple[bool]?
+    # Or Ts1 = Tuple[int], Ts2 = Tuple[str, bool]?
+    ham: Tuple[int, str, bool]
+    func(ham)
+
+In general, some kind of extra constraint is necessary in order
+for the ambiguity to be resolved. This is usually provided by
+an un-concatenated usage of the type variable tuple elsewhere in
+the same signature.
+
+For example, resolving ambiguity in an argument:
+
+::
+
+    def func(ham: Tuple[*Ts1, *Ts2], spam: Ts2): ...
+
+    # Ts1 is bound to Tuple[int], Ts2 to Tuple[str, bool]
+    ham: Tuple[int, str, bool]
+    spam: Tuple[str, bool]
+    func(ham, spam)
+
+In a return type:
+
+::
+
+    def func(ham: Ts1, spam: Ts2) -> Tuple[*Ts1, *Ts2]: ...
+
+    ham: Tuple[int]
+    spam: Tuple[str, bool]
+    # Return type is Tuple[int, str, bool]
+    func(ham, spam)
+
+Note, however, that the same cannot be done with generic classes:
+
+::
+
+    # No way to add extra constraints about Ts1 and Ts2,
+    # so this is not valid
+    class C(Generic[*Ts1, *Ts2]): ...
+
+Generics in Multiple Type Variable Tuples
+-----------------------------------------
+
+If we *do* wish to use multiple type variable tuples in a type signature
+that would otherwise not resolve the ambiguity, it is also possible
+to make the type bindings explicit by using a type variable tuple directly,
+without unpacking it. When the class in question is then instantiated,
+for example, the types corresponding to each type variable tuple must
+be wrapped in a ``Tuple``:
+
+::
+
+    class C(Generic[Ts1, Ts2]): ...
+
+    # Ts1 = Tuple[int, str]
+    # Ts2 = Tuple[bool]
+    c: C[Tuple[int, str], Tuple[bool]] = C()
+
+Similarly for functions:
+
+::
+
+    def foo(x: Tuple[Ts1, Ts2]): ...
+
+    # Ts1 = Tuple[int, float]
+    # Ts2 = Tuple[bool]
+    x: Tuple[Tuple[int, float], Tuple[bool]]
+    foo(x)
+
+Aliases
+-------
+
+Generic aliases can be created using a type variable tuple in
+a similar way to regular type variables:
+
+::
+
+    IntTuple = Tuple[int, *Ts]
+    IntTuple[float, bool]  # Equivalent to Tuple[int, float, bool]
+
+As this example shows, all type arguments passed to the alias are
+bound to the type variable tuple. If no type arguments are given,
+the type variable tuple holds no types:
+
+::
+
+    IntTuple  # Equivalent to Tuple[int]
+
+Type variable tuples can also be used without unpacking:
+
+::
+
+    IntTuple = Tuple[int, Ts]
+    IntTuple[float, bool]  # Equivalent to Tuple[int, Tuple[float, bool]]
+    IntTuple  # Tuple[int, Tuple[()]]
+
+At most a single distinct type variable tuple can occur in an alias:
+
+::
+
+    # Invalid
+    Foo = Tuple[Ts1, int, Ts2]
+    # Why?
Because there would be no way to decide which types should + # be bound to which type variable tuple: + Foo[float, bool, str] + # Equivalent to Tuple[float, bool, int, str]? + # Or Tuple[float, int, bool, str]? + +The same type variable tuple may be used multiple times, however: + +:: + + Bar = Tuple[*Ts, *Ts] + Bar[int, float] # Equivalent to Tuple[int, float, int, float] + +Finally, type variable tuples can be used in combination with +normal type variables. In this case, the number of type arguments must +be equal to or greater than the number of distinct normal type variables: + +:: + + Baz = Tuple[T1, *Ts, T2, T1] + + # T1 bound to int, T2 bound to bool, Ts empty + # Equivalent to Tuple[int, bool, int] + Baz[int, bool] + + # T1 bound to int + # Ts bound to Tuple[float, bool] + # T2 bound to str + # So equivalent to Tuple[int, float, bool, str, int] + Baz[int, float, bool, str] + + +An Ideal Array Type: One Possible Example +========================================= + +Type variable tuples allow us to make significant progress on the +typing of arrays. However, the array class we have sketched +out in this PEP is still missing some desirable features. [#typing-ideas]_ + +The most crucial feature missing is the ability to specify +the data type (e.g. ``np.float32`` or ``np.uint8``). This is important +because some numerical computing libraries will silently cast +types, which can easily lead to hard-to-diagnose bugs. + +Additionally, it might be useful to be able to specify the rank +instead of the full shape. This could be useful for cases where +axes don't have obvious semantic meaning like 'height' or 'width', +or where the array is very high-dimensional and writing out all +the axes would be too verbose. + +Here is one possible example of how these features might be implemented +in a complete array type. + +:: + + # E.g. Ndim[Literal[3]] + Integer = TypeVar('Integer') + class Ndim(Generic[Integer]): ... + + # E.g. Shape[Height, Width] + # (Where Height and Width are custom types) + Axes = TypeVarTuple('Axes') + class Shape(Generic[*Axes]): ... + + DataType = TypeVar('DataType') + ShapeType = TypeVar('ShapeType', NDim, Shape) + + # The most verbose type + # E.g. Array[np.float32, Ndim[Literal[3]] + # Array[np.uint8, Shape[Height, Width, Channels]] + class Array(Generic[DataType, ShapeType]): ... + + # Type aliases for less verbosity + # E.g. Float32Array[Height, Width, Channels] + Float32Array = Array[np.float32, Shape[*Axes]] + # E.g. Array1D[np.uint8] + Array1D = Array[DataType, Ndim[Literal[1]]] Rationale and Rejected Ideas ============================ @@ -458,8 +704,8 @@ by simply defining aliases for each possible number of type parameters: :: - class Tensor1(Generic[Axis1]): ... - class Tensor2(Generic[Axis1, Axis2]): ... + class Array1(Generic[Axis1]): ... + class Array2(Generic[Axis1, Axis2]): ... However, this seems somewhat clumsy - it requires users to unnecessarily pepper their code with 1s, 2s, and so on for each rank necessary. @@ -478,148 +724,48 @@ considered a number of different options for the name of this operator. In the end, we decided that ``Map`` was good enough. 
-Naming of ``TypeTupleVar`` +Nesting ``Map`` +--------------- + +Since the result of ``Map`` is a parameterised ``Tuple``, it should be +possible to use the output of a ``Map`` as the input to another ``Map``: + +:: + + Map[Tuple, Map[List, Ts]] + +If ``Ts`` here were bound to ``Tuple[int, str]``, the result of the +inner ``Map`` would be ``Tuple[List[int], List[str]]``, so the result +of the outer map would be ``Tuple[Tuple[List[int]], Tuple[List[str]]]``. + +We chose not to highlight this fact because of a) how confusing it is, +and b) lack of a specific use-case. Whether to support nested ``Map`` +is left to the implementation. + +Naming of ``TypeVarTuple`` -------------------------- -``TypeTupleVar`` began as ``ListVariadic``, based on its naming in +``TypeVarTuple`` began as ``ListVariadic``, based on its naming in an early implementation in Pyre. We then changed this to ``TypeVar(list=True)``, on the basis that a) it better emphasises the similarity to ``TypeVar``, and b) the meaning of 'list' is more easily understood than the jargon of 'variadic'. -We finally settled on ``TypeTupleVar`` based on the justification -that c) this emphasises the tuple-like behaviour, and d) type tuple -variables are a sufficiently different kind of thing to regular +We finally settled on ``TypeVarTuple`` based on the justification +that c) this emphasises the tuple-like behaviour, and d) type variable +tuples are a sufficiently different kind of thing to regular type variables that we may later wish to support keyword arguments to its constructor that should not be supported by regular type variables (such as ``arbitrary_len`` [#arbitrary_len]_). -Accessing Individual Types Without Overloads --------------------------------------------- - -We chose to support access to individual types in the type tuple variable -using overloads (see the `Accessing Individual Types`_ section). One -alternative would have been to allow explicit access to arbitrary parts -of the type tuple variable - for example, through indexing: - -:: - - def foo(t: Tuple[*Ts]): - x: Ts[1] = t[1] - -We decided to omit this mechanism from this PEP because a) it adds complexity, -b) we were not aware of any use-cases that need it, and c) if it turns out to be -needed in the future, it can easily be added in a future PEP. - -Integer Generics ----------------- - -Consider a function such as `np.tile`: - -:: - - x = np.zeros((3,)) # A tensor of length 3 - y = np.tile(x, reps=2) # y is now length 6 - -Intuitively, we would specify the signature of such a function as: - -:: - - @overload - def tile(A: Tensor[N], reps: Literal[2]) -> Tensor[2*N]: ... - # ...and other overloads for different values of `reps` - -``N`` is *sort* of like a type variable. However, type variables -stand in for *types*, whereas here we want ``N`` to stand in for a -particular *value*. ``N`` should be some sort of 'integer type variable'. - -(Note that ``N`` could *not* be created as simply ``TypeTupleVar('N', bound=int)``. -This would state that ``N`` could stand for an ``int`` or any *subtype* of ``int``. -For our signature above, we would need ``N`` to stand for any *instance* of -type ``int``.) - -We decided to omit integer type variables for this PEP, postponing it for a future -PEP when necessary. - -Integer Parameterization ------------------------- - -The examples of this PEP have parameterised tensor types -using the semantic meaning of each axes, e.g. ``Tensor[Batch, Time]``. 
-However, we may also wish to parameterize using the actual -integer value of each part of the shape, such as ``Tensor[Literal[64], Literal[64]]``. - -There are two aspects related to such integer parameterization that we decided -to ignore in this PEP: - -**Examples of integer parameterization**. Thought it clearly *is* valid to -parameterize with literal types, we wish to encourage the use of semantic -labelling of tensor axes wherever possible: having each axis labelled serves -as extra protection against mistakes when manipulating axes. - -**Syntactic sugar for integer parameterization**. Typing ``Literal`` is -cumbersome; ideally, we could write ``Tensor[64, 64]`` as syntactic sugar -for ``Tensor[Literal[64], Literal[64]]``. However, this would require an -inconsistency: because of forward referencing, ``Tensor['Batch']`` and -``Tensor[Literal['Batch']]`` mean different things. For this to work, we -would have to stipulate this sugar only applies for integers. We leave -this discussion for a future PEP. (If you do wish to employ such types -in your code currently, we recommend ``import Typing.Literal as L`` -enabling the much shorter ``L[64]``.) - -Checking the Number of Types in a Variadic Generic --------------------------------------------------- - -Consider reduction operations, which behave as: - -:: - - x = np.zeros((2, 3, 5)) - reduce_sum(x, axis=0) # Shape (3, 5) - reduce_sum(x, axis=1) # Shape (2, 5) - -One way to compactly specify the signature of these operations would be -write something like: - -:: - - Shape = TypeTupleVar('Shape') - - # Tensor of rank N goes in, tensor of rank N-1 comes out - def reduce_sum(x: Tensor[Shape[N]], axis: int) -> Tensor[Shape[N-1]]: ... - -``Shape[N]`` here states that number of types in ``Shapes`` is bound to ``N``, -where ``N`` is some object that we can perform arithmetic on. - -Lacking an urgent use-case for this feature, we omit it from this PEP, -leaving it to a future PEP if necessary. - -(Note that reduction operations are only used as an example here. -Reduction functions can in fact be typed without this feature, -using overloads: - -:: - - @overload - def reduce_sum(x: Tensor[A, B], axis: Literal[0]) -> Tensor[B]: ... - - @overload - def reduce_sum(x: Tensor[A, B], axis: Literal[1]) -> Tensor[A]: ... - - ... - -Although more verbose, typing reduction operations this way is superior -to the approach above, since it preserves information about *which* -axis has been removed.) - Backwards Compatibility ======================= TODO * ``Tuple`` needs to be upgraded to support parameterization with a - a type tuple variable. + a type variable tuple. Reference Implementation @@ -645,14 +791,10 @@ Footnotes shape begins with 'time × batch', then ``videos_batch[1][0]`` would select the same frame. -.. [#kwargs] In the case of ``**kwargs``, we mean the Nth argument as - it appears in the function *definition*, *not* the Nth keyword argument - specified in the function *call*. - -.. [#type_aliases] For example, in ``asyncio`` [#asyncio]_, it is convenient to define - a type alias +.. [#type_aliases] For example, in ``asyncio`` [#asyncio]_, it is convenient + to define a type alias ``_FutureT = Union[Future[_T], Generator[Any, None, _T], Awaitable[_T]]``. - We should also be able to apply ``Map`` to alias - e.g. ``Map[_FutureT, Ts]``. + We should be able to apply ``Map`` to such aliases - e.g. ``Map[_FutureT, Ts]``. References ========== @@ -660,9 +802,6 @@ References .. 
[#pep-612] PEP 612, "Parameter Specification Variables": https://www.python.org/dev/peps/pep-0612 -.. [#pep-484] PEP 484, "Type Hints": - https://www.python.org/dev/peps/pep-0484 - .. [#numeric-stack] Static typing of Python numeric stack: https://paper.dropbox.com/doc/Static-typing-of-Python-numeric-stack-summary-6ZQzTkgN6e0oXko8fEWwN @@ -681,13 +820,13 @@ References Acknowledgements ================ -Thank you to **Alfonso Castaño**, **Antoine Pitrou**, **Bas v.B.**, **David Foster**, **Dimitris Vardoulakis**, **Guido van Rossum**, **Jia Chen**, **Lucio Fernandez-Arjona**, -**Nikita Sobolev**, **Peilonrayz**, **Pradeep Kumar Srinivasan**, **Rebecca Chen**, **Sergei Lebedev** and **Vladimir Mikulik** for helpful feedback and suggestions on drafts of this PEP. +Thank you to **Alfonso Castaño**, **Antoine Pitrou**, **Bas v.B.**, **David Foster**, **Dimitris Vardoulakis**, **Eric Traut**, **Guido van Rossum**, **Jia Chen**, +**Lucio Fernandez-Arjona**, **Nikita Sobolev**, **Peilonrayz**, **Rebecca Chen**, +**Sergei Lebedev** and **Vladimir Mikulik** for helpful feedback and suggestions on +drafts of this PEP. -Thank you especially to **Pradeep** for numerous key contributions, including pointing -out that unexpanded type tuples allow for clean concatenation of multiple type tuples, -and to **Lucio**, for suggesting the star syntax, which has made multiple aspects of -this proposal much more concise and intuitive. +Thank you especially to **Lucio**, for suggesting the star syntax, which has made +multiple aspects of this proposal much more concise and intuitive. Resources =========