PEP: 586
Title: Literal Types
Author: Michael Lee, Ivan Levkivskyi, Jukka Lehtosalo
BDFL-Delegate: Guido van Rossum
Discussions-To: Typing-Sig
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-Mar-2018
Python-Version: 3.8
Post-History: 14-Mar-2018

Abstract
========

This PEP proposes adding *Literal types* to the PEP 484 ecosystem.
Literal types indicate that some expression has literally a
specific value. For example, the following function will accept
only expressions that have literally the value "4"::

    from typing import Literal

    def accepts_only_four(x: Literal[4]) -> None:
        pass

    accepts_only_four(4)   # OK
    accepts_only_four(19)  # Rejected

Motivation and Rationale
========================

Python has many APIs that return different types depending on the
value of some argument provided. For example:

- ``open(filename, mode)`` returns either ``IO[bytes]`` or ``IO[Text]``
  depending on whether the second argument is something like ``r`` or
  ``rb``.
- ``subprocess.check_output(...)`` returns either bytes or text
  depending on whether the ``universal_newlines`` keyword argument is
  set to ``True`` or not.

This pattern is also fairly common in many popular 3rd party libraries.
For example, here are just two examples from pandas and numpy respectively:

- ``pandas.concat(...)`` will return either ``Series`` or
  ``DataFrame`` depending on whether the ``axis`` argument is set to
  0 or 1.
- ``numpy.unique`` will return either a single array or a tuple containing
  anywhere from two to four arrays depending on three boolean flag values.

The typing issue tracker contains some
`additional examples and discussion <typing-discussion_>`_.

There is currently no way of expressing the type signatures of these
functions: PEP 484 does not include any mechanism for writing signatures
where the return type varies depending on the value passed in.
Note that this problem persists even if we redesign these APIs to
instead accept enums: ``MyEnum.FOO`` and ``MyEnum.BAR`` are both
considered to be of type ``MyEnum``.

Currently, type checkers work around this limitation by adding ad hoc
extensions for important builtins and standard library functions. For
example mypy comes bundled with a plugin that attempts to infer more
precise types for ``open(...)``. While this approach works for standard
library functions, it’s unsustainable in general: it’s not reasonable to
expect 3rd party library authors to maintain plugins for N different
type checkers.

We propose adding *Literal types* to address these gaps.

Core Semantics
==============

This section outlines the baseline behavior of literal types.

Core behavior
-------------

Literal types indicate that a variable has a specific and
concrete value. For example, if we define some variable ``foo`` to have
type ``Literal[3]``, we are declaring that ``foo`` must be exactly equal
to ``3`` and no other value.

Given some value ``v`` that is a member of type ``T``, the type
``Literal[v]`` shall be treated as a subtype of ``T``. For example,
``Literal[3]`` is a subtype of ``int``.

All methods from the parent type will be directly inherited by the
literal type. So, if we have some variable ``foo`` of type ``Literal[3]``
it’s safe to do things like ``foo + 5`` since ``foo`` inherits int’s
``__add__`` method. The resulting type of ``foo + 5`` is ``int``.

This "inheriting" behavior is identical to how we
`handle NewTypes <newtypes_>`_.

Equivalence of two Literals
---------------------------

Two types ``Literal[v1]`` and ``Literal[v2]`` are equivalent when
both of the following conditions are true:

1. ``type(v1) == type(v2)``
2. ``v1 == v2``

For example, ``Literal[20]`` and ``Literal[0x14]`` are equivalent.
However, ``Literal[0]`` and ``Literal[False]`` are *not* equivalent,
even though ``0 == False`` evaluates to ``True`` at runtime: ``0``
has type ``int`` and ``False`` has type ``bool``.
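The two equivalence conditions above can be sketched as a small runtime
check. This is only an illustration of the rule: the helper name
``literal_values_equivalent`` is hypothetical and not part of ``typing``,
and a real type checker would apply the rule statically rather than by
executing code.

```python
from typing import Any


def literal_values_equivalent(v1: Any, v2: Any) -> bool:
    """Hypothetical helper mirroring the two conditions listed above."""
    # Condition 1: both values have exactly the same runtime type.
    # Condition 2: the values compare equal.
    return type(v1) is type(v2) and v1 == v2


# 20 and 0x14 are the same int, so the corresponding Literal types match.
same_int = literal_values_equivalent(20, 0x14)

# 0 == False holds at runtime, but int and bool are different types.
int_vs_bool = literal_values_equivalent(0, False)
```

Here ``same_int`` is ``True`` and ``int_vs_bool`` is ``False``. The type
comparison is what separates ``Literal[0]`` from ``Literal[False]``.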
Shortening unions of literals
-----------------------------

Literals are parameterized with one or more values. When a Literal is
parameterized with more than one value, it's treated as exactly equivalent
to the union of those types. That is, ``Literal[v1, v2, v3]`` is equivalent
to ``Union[Literal[v1], Literal[v2], Literal[v3]]``.

This shortcut helps make writing signatures for functions that accept
many different literals more ergonomic -- for example, functions like
``open(...)``::

    # Note: this is a simplification of the true type signature.
    _PathType = Union[str, bytes, int]

    @overload
    def open(path: _PathType,
             mode: Literal["r", "w", "a", "x", "r+", "w+", "a+", "x+"],
             ) -> IO[Text]: ...
    @overload
    def open(path: _PathType,
             mode: Literal["rb", "wb", "ab", "xb", "r+b", "w+b", "a+b", "x+b"],
             ) -> IO[bytes]: ...

    # Fallback overload for when the user isn't using literal types
    @overload
    def open(path: _PathType, mode: str) -> IO[Any]: ...

The provided values do not all have to be members of the same type.
For example, ``Literal[42, "foo", True]`` is a legal type.

However, Literal **must** be parameterized with at least one type.
Types like ``Literal[]`` or ``Literal`` are illegal.

Legal and illegal parameterizations
===================================

This section describes what exactly constitutes a legal ``Literal[...]`` type:
what values may and may not be used as parameters.

In short, a ``Literal[...]`` type may be parameterized by one or more literal
expressions, and nothing else.

Legal parameters for ``Literal`` at type check time
---------------------------------------------------

``Literal`` may be parameterized with literal ints, byte and unicode strings,
bools, Enum values and ``None``.
So for example, all of
the following would be legal::

    Literal[26]
    Literal[0x1A]  # Exactly equivalent to Literal[26]
    Literal[-4]
    Literal["hello world"]
    Literal[b"hello world"]
    Literal[u"hello world"]
    Literal[True]
    Literal[Color.RED]  # Assuming Color is some enum
    Literal[None]

**Note:** Since the type ``None`` is inhabited by just a single
value, the types ``None`` and ``Literal[None]`` are exactly equivalent.
Type checkers may simplify ``Literal[None]`` into just ``None``.

``Literal`` may also be parameterized by other literal types, or type aliases
to other literal types. For example, the following is legal::

    ReadOnlyMode         = Literal["r", "r+"]
    WriteAndTruncateMode = Literal["w", "w+", "wt", "w+t"]
    WriteNoTruncateMode  = Literal["r+", "r+t"]
    AppendMode           = Literal["a", "a+", "at", "a+t"]

    AllModes = Literal[ReadOnlyMode, WriteAndTruncateMode,
                       WriteNoTruncateMode, AppendMode]

This feature is again intended to help make using and reusing literal types
more ergonomic.

**Note:** As a consequence of the above rules, type checkers are also expected
to support types that look like the following::

    Literal[Literal[Literal[1, 2, 3], "foo"], 5, None]

This should be exactly equivalent to the following type::

    Literal[1, 2, 3, "foo", 5, None]

...and also to the following type::

    Optional[Literal[1, 2, 3, "foo", 5]]

**Note:** String literal types like ``Literal["foo"]`` should subtype either
bytes or unicode in the same way regular string literals do at runtime.

For example, in Python 3, the type ``Literal["foo"]`` is equivalent to
``Literal[u"foo"]``, since ``"foo"`` is equivalent to ``u"foo"`` in Python 3.

Similarly, in Python 2, the type ``Literal["foo"]`` is equivalent to
``Literal[b"foo"]`` -- unless the file includes a
``from __future__ import unicode_literals`` import, in which case it would be
equivalent to ``Literal[u"foo"]``.
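The nested-literal equivalence described earlier in this section can be
illustrated at runtime. The sketch below assumes Python 3.8 or later,
where ``typing.Literal``, ``typing.get_args`` and ``typing.get_origin``
are available. The helper ``flatten_literal`` is hypothetical. It walks
the type's parameters itself, so it does not depend on whether the
runtime already flattens nested literals.

```python
from typing import Any, List, Literal, get_args, get_origin


def flatten_literal(tp: Any) -> List[Any]:
    """Hypothetical helper: collect the values of a possibly nested Literal."""
    values: List[Any] = []
    for arg in get_args(tp):
        if get_origin(arg) is Literal:
            # A nested Literal[...] contributes each of its own values.
            values.extend(flatten_literal(arg))
        else:
            values.append(arg)
    return values


nested = Literal[Literal[Literal[1, 2, 3], "foo"], 5, None]
flat = Literal[1, 2, 3, "foo", 5, None]

nested_values = flatten_literal(nested)
flat_values = flatten_literal(flat)
```

Both ``nested_values`` and ``flat_values`` hold the same six values in the
same order, matching the equivalence a type checker is expected to
recognize statically.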
Illegal parameters for ``Literal`` at type check time
-----------------------------------------------------

The following parameters are intentionally disallowed by design:

- Arbitrary expressions like ``Literal[3 + 4]`` or
  ``Literal["foo".replace("o", "b")]``.

  - Rationale: Literal types are meant to be a
    minimal extension to the PEP 484 typing ecosystem and requiring type
    checkers to interpret potentially arbitrary expressions inside types
    adds too much complexity. Also see `Rejected or out-of-scope ideas`_.

  - As a consequence, complex numbers like ``Literal[4 + 3j]`` and
    ``Literal[-4 + 2j]`` are also prohibited. For consistency, literals like
    ``Literal[4j]`` that contain just a single complex number are also
    prohibited.

  - The only exception to this rule is the unary ``-`` (minus) for ints: types
    like ``Literal[-5]`` are *accepted*.

- Tuples containing valid literal types like ``Literal[(1, "foo", "bar")]``.
  The user could always express this type as
  ``Tuple[Literal[1], Literal["foo"], Literal["bar"]]`` instead. Also,
  tuples are likely to be confused with the ``Literal[1, 2, 3]``
  shortcut.

- Mutable literal data structures like dict literals, list literals, or
  set literals: literals are always implicitly final and immutable. So,
  ``Literal[{"a": "b", "c": "d"}]`` is illegal.

- Any other types: for example, ``Literal[Path]``, or
  ``Literal[some_object_instance]`` are illegal. This includes typevars: if
  ``T`` is a typevar, ``Literal[T]`` is not allowed. Typevars can vary over
  only types, never over values.

The following are provisionally disallowed for simplicity. We can consider
allowing them in future extensions of this PEP.

- Floats: e.g. ``Literal[3.14]``. Representing Literals of infinity or NaN
  in a clean way is tricky; real-world APIs are unlikely to vary their
  behavior based on a float parameter.

- Any: e.g. ``Literal[Any]``. ``Any`` is a type, and ``Literal[...]`` is
  meant to contain values only.
  It is also unclear what ``Literal[Any]``
  would actually semantically mean.

Parameters at runtime
---------------------

Although the set of parameters ``Literal[...]`` may contain at type check time
is very small, the actual implementation of ``typing.Literal`` will not perform
any checks at runtime. For example::

    def my_function(x: Literal[1 + 2]) -> int:
        return x * 3

    x: Literal = 3
    y: Literal[my_function] = my_function

The type checker should reject this program: all three uses of
``Literal`` are *invalid* according to this spec. However, Python itself
should execute this program with no errors.

This is partly to help us preserve flexibility in case we want to expand the
scope of what ``Literal`` can be used for in the future, and partly because
it is not possible to detect all illegal parameters at runtime to begin with.
For example, it is impossible to distinguish between ``Literal[1 + 2]`` and
``Literal[3]`` at runtime.

Literals, enums, and forward references
---------------------------------------

One potential ambiguity is between literal strings and forward
references to literal enum members. For example, suppose we have the
type ``Literal["Color.RED"]``. Does this literal type
contain a string literal or a forward reference to some ``Color.RED``
enum member?

In cases like these, we always assume the user meant to construct a
literal string. If the user wants a forward reference, they must wrap
the entire literal type in a string -- e.g. ``"Literal[Color.RED]"``.

Type inference
==============

This section describes a few rules regarding type inference and
literals, along with some examples.

Backwards compatibility
-----------------------

When type checkers add support for Literal, it's important they do so
in a way that maximizes backwards-compatibility. Type checkers should
ensure that code that used to type check continues to do so after support
for Literal is added on a best-effort basis.

This is particularly important when performing type inference.
For example, given the statement ``x = "blue"``, should the inferred
type of ``x`` be ``str`` or ``Literal["blue"]``?

One naive strategy would be to always assume expressions are intended
to be Literal types. So, ``x`` would always have an inferred type of
``Literal["blue"]`` in the example above. This naive strategy is almost
certainly too disruptive -- it would cause programs like the following
to start failing when they previously did not::

    # If a type checker infers 'var' has type Literal[3]
    # and my_list has type List[Literal[3]]...
    var = 3
    my_list = [var]

    # ...this call would be a type-error.
    my_list.append(4)

Another example of when this strategy would fail is when setting fields
in objects::

    class MyObject:
        def __init__(self) -> None:
            # If a type checker infers MyObject.field has type Literal[3]...
            self.field = 3

    m = MyObject()

    # ...this assignment would no longer type check
    m.field = 4

An alternative strategy that *does* maintain compatibility in every case would
be to always assume expressions are *not* Literal types unless they are
explicitly annotated otherwise. A type checker using this strategy would
always infer that ``x`` is of type ``str`` in the first example above.

This is not the only viable strategy: type checkers should feel free to
experiment with more sophisticated inference techniques. This PEP does not
mandate any particular strategy; it only emphasizes the importance of
backwards compatibility.

Using non-Literals in Literal contexts
--------------------------------------

Literal types follow the existing rules regarding subtyping with no additional
special-casing. For example, programs like the following are type safe::

    def expects_str(x: str) -> None: ...
    var: Literal["foo"] = "foo"

    # Legal: Literal["foo"] is a subtype of str
    expects_str(var)

This also means non-Literal expressions in general should not automatically
be cast to Literal. For example::

    def expects_literal(x: Literal["foo"]) -> None: ...
    def runner(my_str: str) -> None:
        # ILLEGAL: str is not a subclass of Literal["foo"]
        expects_literal(my_str)

**Note:** If the user wants their API to support accepting both literals
*and* the original type -- perhaps for legacy purposes -- they should
implement a fallback overload. See `Interactions with overloads`_.

Interactions with other types and features
==========================================

This section discusses how Literal types interact with other existing types.

Intelligent indexing of structured data
---------------------------------------

Literals can be used to "intelligently index" into structured types like
tuples, NamedTuple, and classes. (Note: this is not an exhaustive list).

For example, type checkers should infer the correct value type when
indexing into a tuple using an int key that corresponds to a valid index::

    a: Literal[0] = 0
    b: Literal[5] = 5

    some_tuple: Tuple[int, str, List[bool]] = (3, "abc", [True, False])
    reveal_type(some_tuple[a])   # Revealed type is 'int'
    some_tuple[b]                # Error: 5 is not a valid index into the tuple

We expect similar behavior when using functions like getattr::

    class Test:
        def __init__(self, param: int) -> None:
            self.myfield = param

        def mymethod(self, val: int) -> str: ...

    a: Literal["myfield"]  = "myfield"
    b: Literal["mymethod"] = "mymethod"
    c: Literal["blah"]     = "blah"

    t = Test(3)
    reveal_type(getattr(t, a))  # Revealed type is 'int'
    reveal_type(getattr(t, b))  # Revealed type is 'Callable[[int], str]'
    getattr(t, c)               # Error: No attribute named 'blah' in Test

**Note:** See `Interactions with Final`_ for a proposal on how we can
express the variable declarations above in a more compact manner.

Interactions with overloads
---------------------------

Literal types and overloads do not need to interact in a special
way: the existing rules work fine.

However, one important use case type checkers must take care to
support is the ability to use a *fallback* when the user is not using literal
types.
For example, consider ``open``::

    _PathType = Union[str, bytes, int]

    @overload
    def open(path: _PathType,
             mode: Literal["r", "w", "a", "x", "r+", "w+", "a+", "x+"],
             ) -> IO[Text]: ...
    @overload
    def open(path: _PathType,
             mode: Literal["rb", "wb", "ab", "xb", "r+b", "w+b", "a+b", "x+b"],
             ) -> IO[bytes]: ...

    # Fallback overload for when the user isn't using literal types
    @overload
    def open(path: _PathType, mode: str) -> IO[Any]: ...

If we were to change the signature of ``open`` to use just the first two
overloads, we would break any code that does not pass in a literal string
expression. For example, code like this would be broken::

    mode: str = pick_file_mode(...)
    with open(path, mode) as f:
        # f should continue to be of type IO[Any] here

A little more broadly: we propose adding a policy to typeshed that
mandates that whenever we add literal types to some existing API, we also
always include a fallback overload to maintain backwards-compatibility.

Interactions with generics
--------------------------

Types like ``Literal[3]`` are meant to be just plain old subclasses of
``int``. This means you can use types like ``Literal[3]`` anywhere
you could use normal types, such as with generics.

This means that it is legal to parameterize generic functions or
classes using Literal types::

    A = TypeVar('A', bound=int)
    B = TypeVar('B', bound=int)
    C = TypeVar('C', bound=int)

    # A simplified definition for Matrix[row, column]
    class Matrix(Generic[A, B]):
        def __add__(self, other: Matrix[A, B]) -> Matrix[A, B]: ...
        def __matmul__(self, other: Matrix[B, C]) -> Matrix[A, C]: ...
        def transpose(self) -> Matrix[B, A]: ...

    foo: Matrix[Literal[2], Literal[3]] = Matrix(...)
    bar: Matrix[Literal[3], Literal[7]] = Matrix(...)
    baz = foo @ bar
    reveal_type(baz)  # Revealed type is 'Matrix[Literal[2], Literal[7]]'

Similarly, it is legal to construct TypeVars with value restrictions
or bounds involving Literal types::

    T = TypeVar('T', Literal["a"], Literal["b"], Literal["c"])
    S = TypeVar('S', bound=Literal["foo"])

...although it is unclear when it would ever be useful to construct a
TypeVar with a Literal upper bound. For example, the ``S`` TypeVar in
the above example is essentially pointless: we can get equivalent behavior
by using ``S = Literal["foo"]`` instead.

**Note:** Literal types and generics deliberately interact in only very
basic and limited ways. In particular, libraries that want to type check
code containing a heavy amount of numeric or numpy-style manipulation will
almost certainly find Literal types as proposed in this PEP to be
insufficient for their needs.

We considered several different proposals for fixing this, but ultimately
decided to defer the problem of integer generics to a later date. See
`Rejected or out-of-scope ideas`_ for more details.

Interactions with enums and exhaustiveness checks
-------------------------------------------------

Type checkers should be capable of performing exhaustiveness checks when
working with Literal types that have a closed number of variants, such as
enums.
For example, the type checker should be capable of inferring that
the final ``else`` statement must be of type ``str``, since all three
values of the ``Status`` enum have already been exhausted::

    class Status(Enum):
        SUCCESS = 0
        INVALID_DATA = 1
        FATAL_ERROR = 2

    def parse_status(s: Union[str, Status]) -> None:
        if s is Status.SUCCESS:
            print("Success!")
        elif s is Status.INVALID_DATA:
            print("The given data is invalid because...")
        elif s is Status.FATAL_ERROR:
            print("Unexpected fatal error...")
        else:
            # 's' must be of type 'str' since all other options are exhausted
            print("Got custom status: " + s)

The interaction described above is not new: it's
`already codified within PEP 484 <pep-484-enums_>`_. However, many type
checkers (such as mypy) do not yet implement this due to the expected
complexity of the implementation work.

Some of this complexity will be alleviated once Literal types are introduced:
rather than entirely special-casing enums, we can instead treat them as being
approximately equivalent to the union of their values and take advantage of any
existing logic regarding unions, exhaustibility, type narrowing, reachability,
and so forth the type checker might have already implemented.

So here, the ``Status`` enum could be treated as being approximately equivalent
to ``Literal[Status.SUCCESS, Status.INVALID_DATA, Status.FATAL_ERROR]``
and the type of ``s`` narrowed accordingly.

Interactions with narrowing
---------------------------

Type checkers may optionally perform additional analysis for both enum and
non-enum Literal types beyond what is described in the section above.

For example, it may be useful to perform narrowing based on things like
containment or equality checks::

    def parse_status(status: str) -> None:
        if status in ("MALFORMED", "ABORTED"):
            # Type checker could narrow 'status' to type
            # Literal["MALFORMED", "ABORTED"] here.
            return expects_bad_status(status)

        # Similarly, type checker could narrow 'status' to Literal["PENDING"]
        if status == "PENDING":
            expects_pending_status(status)

It may also be useful to perform narrowing taking into account expressions
involving Literal bools. For example, we can combine ``Literal[True]``,
``Literal[False]``, and overloads to construct "custom type guards"::

    @overload
    def is_int_like(x: Union[int, List[int]]) -> Literal[True]: ...
    @overload
    def is_int_like(x: object) -> bool: ...
    def is_int_like(x): ...

    vector: List[int] = [1, 2, 3]
    if is_int_like(vector):
        vector.append(3)
    else:
        vector.append("bad")   # This branch is inferred to be unreachable

    scalar: Union[int, str]
    if is_int_like(scalar):
        scalar += 3      # Type checks: type of 'scalar' is narrowed to 'int'
    else:
        scalar += "foo"  # Type checks: type of 'scalar' is narrowed to 'str'

Interactions with Final
-----------------------

`PEP 591 <pep-591_>`_ proposes adding a "Final" qualifier to the typing
ecosystem. This qualifier can be used to declare that some variable or
attribute cannot be reassigned::

    foo: Final = 3
    foo = 4           # Error: 'foo' is declared to be Final

Note that in the example above, we know that ``foo`` will always be equal to
exactly ``3``. A type checker can use this information to deduce that ``foo``
is valid to use in any context that expects a ``Literal[3]``::

    def expects_three(x: Literal[3]) -> None: ...

    expects_three(foo)  # Type checks, since 'foo' is Final and equal to 3

The ``Final`` qualifier serves as a shorthand for declaring that a variable
is *effectively Literal*.

If both this PEP and PEP 591 are accepted, type checkers are expected to
support this shortcut. Specifically, given a variable or attribute assignment
of the form ``var: Final = value`` where ``value`` is a valid parameter for
``Literal[...]``, type checkers should understand that ``var`` may be used in
any context that expects a ``Literal[value]``.

Type checkers are not obligated to understand any other uses of Final.
For example, whether or not the following program type checks is left
unspecified::

    # Note: The assignment does not exactly match the form 'var: Final = value'.
    bar1: Final[int] = 3
    expects_three(bar1)  # May or may not be accepted by type checkers

    # Note: "Literal[1 + 2]" is not a legal type.
    bar2: Final = 1 + 2
    expects_three(bar2)  # May or may not be accepted by type checkers

Rejected or out-of-scope ideas
==============================

This section outlines some potential features that are explicitly out-of-scope.

True dependent types/integer generics
-------------------------------------

This proposal is essentially describing adding a very simplified
dependent type system to the PEP 484 ecosystem. One obvious extension
would be to implement a full-fledged dependent type system that lets users
predicate types based on their values in arbitrary ways. That would
let us write signatures like the below::

    # A vector has length 'n', containing elements of type 'T'
    class Vector(Generic[N, T]): ...

    # The type checker will statically verify our function genuinely does
    # construct a vector that is equal in length to "len(vec1) + len(vec2)"
    # and will throw an error if it does not.
    def concat(vec1: Vector[A, T], vec2: Vector[B, T]) -> Vector[A + B, T]:
        # ...snip...

At the very least, it would be useful to add some form of integer generics.

Although such a type system would certainly be useful, it’s out of scope
for this PEP: it would require a far more substantial amount of implementation
work, discussion, and research to complete compared to the current proposal.

It's entirely possible we'll circle back and revisit this topic in the future:
we very likely will need some form of dependent typing along with other
extensions like variadic generics to support popular libraries like numpy.

This PEP should be seen as a stepping stone towards this goal,
rather than an attempt at providing a comprehensive solution.
Adding more concise syntax
--------------------------

One objection to this PEP is that having to explicitly write ``Literal[...]``
feels verbose. For example, instead of writing::

    def foobar(arg1: Literal[1], arg2: Literal[True]) -> None:
        pass

...it would be nice to instead write::

    def foobar(arg1: 1, arg2: True) -> None:
        pass

Unfortunately, these abbreviations simply will not work with the
existing implementation of ``typing`` at runtime. For example, the
following snippet crashes when run using Python 3.7::

    from typing import Tuple

    # Supposed to accept tuple containing the literals 1 and 2
    def foo(x: Tuple[1, 2]) -> None:
        pass

Running this yields the following exception::

    TypeError: Tuple[t0, t1, ...]: each t must be a type. Got 1.

We don’t want users to have to memorize exactly when it’s ok to elide
``Literal``, so we require ``Literal`` to always be present.

A little more broadly, we feel overhauling the syntax of types in
Python is not within the scope of this PEP: it would be best to have
that discussion in a separate PEP, instead of attaching it to this one.
So, this PEP deliberately does not try and innovate Python's type syntax.

Backporting the ``Literal`` type
================================

Once this PEP is accepted, the ``Literal`` type will need to be backported for
Python versions that come bundled with older versions of the ``typing`` module.
We plan to do this by adding ``Literal`` to the ``typing_extensions`` 3rd party
module, which contains a variety of other backported types.

Implementation
==============

The mypy type checker currently has implemented a large subset of the behavior
described in this spec, with the exception of enum Literals and some of the
more complex narrowing interactions described above.
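As an illustration of how user code might consume the backport described in
the previous section, the following sketch uses a guarded import. It assumes
that ``typing_extensions`` is installed on interpreters whose ``typing``
module lacks ``Literal``. The function is the one from the Abstract, changed
here to return its argument so that the result can be inspected.

```python
try:
    # Available in the standard library on Python versions that ship Literal.
    from typing import Literal
except ImportError:
    # Assumed fallback: the third-party backport module named in this PEP.
    from typing_extensions import Literal


def accepts_only_four(x: Literal[4]) -> int:
    # No runtime check is performed; only a type checker enforces the
    # Literal[4] annotation.
    return x


result = accepts_only_four(4)
```

Because ``Literal`` performs no runtime validation, a call such as
``accepts_only_four(19)`` would still execute. Only a type checker would
reject it.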
Related work
============

This proposal was written based on the discussion that took place in the
following threads:

- `Check that literals belong to/are excluded from a set of values <typing-discussion_>`_

- `Simple dependent types <mypy-discussion_>`_

- `Typing for multi-dimensional arrays <arrays-discussion_>`_

The overall design of this proposal also ended up converging into
something similar to how
`literal types are handled in TypeScript <typescript-literal-types_>`_.

.. _typing-discussion: https://github.com/python/typing/issues/478

.. _mypy-discussion: https://github.com/python/mypy/issues/3062

.. _arrays-discussion: https://github.com/python/typing/issues/513

.. _typescript-literal-types: https://www.typescriptlang.org/docs/handbook/advanced-types.html#string-literal_types

.. _typescript-index-types: https://www.typescriptlang.org/docs/handbook/advanced-types.html#index-types

.. _newtypes: https://www.python.org/dev/peps/pep-0484/#newtype-helper-function

.. _pep-484-enums: https://www.python.org/dev/peps/pep-0484/#support-for-singleton-types-in-unions

.. _pep-591: https://www.python.org/dev/peps/pep-0591/

Acknowledgements
================

Thanks to Mark Mendoza, Ran Benita, Rebecca Chen, and the other members of
typing-sig for their comments on this PEP.

Additional thanks to the various participants in the mypy and typing issue
trackers, who helped provide a lot of the motivation and reasoning behind
this PEP.

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: