PEP: 647 Title: User-Defined Type Guards Version: $Revision$ Last-Modified: $Date$ Author: Eric Traut Sponsor: Guido van Rossum Discussions-To: Typing-Sig Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 07-Oct-2020 Python-Version: 3.10 Post-History: 28-Dec-2020 Abstract ======== This PEP specifies a way for programs to influence conditional type narrowing employed by a type checker based on runtime checks. Motivation ========== Static type checkers commonly employ a technique called "type narrowing" to determine a more precise type of an expression within a program's code flow. When type narrowing is applied within a block of code based on a conditional code flow statement (such as ``if`` and ``while`` statements), the conditional expression is sometimes referred to as a "type guard". Python type checkers typically support various forms of type guards expressions. :: def func(val: Optional[str]): # "is None" type guard if val is not None: # Type of val is narrowed to str ... else: # Type of val is narrowed to None ... def func(val: Optional[str]): # Truthy type guard if val: # Type of val is narrowed to str ... else: # Type of val remains Optional[str] ... def func(val: Union[str, float]): # "isinstance" type guard if isinstance(val, str): # Type of val is narrowed to str ... else: # Type of val is narrowed to float ... def func(val: Literal[1, 2]): # Comparison type guard if val == 1: # Type of val is narrowed to Literal[1] ... else: # Type of val is narrowed to Literal[2] ... There are cases where type narrowing cannot be applied based on static information only. Consider the following example: :: def is_str_list(val: List[object]) -> bool: """Determines whether all objects in the list are strings""" return all(isinstance(x, str) for x in val) def func1(val: List[object]): if is_str_list(val): print(" ".join(val)) # Error: invalid type This code is correct, but a type checker will report a type error because the value ``val`` passed to the ``join`` method is understood to be of type ``List[object]``. The type checker does not have enough information to statically verify that the type of ``val`` is ``List[str]`` at this point. This PEP introduces a way for a function like ``is_str_list`` to be defined as a "user-defined type guard". This allows code to extend the type guards that are supported by type checkers. Using this new mechanism, the ``is_str_list`` function in the above example would be modified slightly. Its return type would be changed from ``bool`` to ``TypeGuard[List[str]]``. :: from typing import TypeGuard def is_str_list(val: List[object]) -> TypeGuard[List[str]]: """Determines whether all objects in the list are strings""" return all(isinstance(x, str) for x in val) User-defined type guards can also be used to determine whether a dictionary conforms to the type requirements of a TypedDict. :: class Person(TypedDict): name: str age: int def is_person(val: dict) -> "TypeGuard[Person]": try: return isinstance(val["name"], str) and isinstance(val["age"], int) except KeyError: return False def print_age(val: dict): if is_person(val): print(f"Age: {val['age']}") else: print("Not a person!") Specification ============= TypeGuard Type -------------- This PEP introduces the symbol ``TypeGuard`` exported from the ``typing`` module. ``TypeGuard`` is a special form that accepts a single type argument. It is used to annotate the return type of a user-defined type guard function. Return statements within a type guard function should return bool values, and type checkers should verify that all return paths return a bool. In all other respects, TypeGuard is a distinct type from bool. It is not a subtype of bool. Therefore, ``Callable[..., TypeGuard[int]]`` is not assignable to ``Callable[..., bool]``. When ``TypeGuard`` is used to annotate the return type of a function or method that accepts at least one parameter, that function or method is treated by type checkers as a user-defined type guard. The type argument provided for ``TypeGuard`` indicates the type that has been validated by the function. User-defined type guards can be generic functions, as shown in this example: :: _T = TypeVar("_T") def is_two_element_tuple(val: Tuple[_T, ...]) -> TypeGuard[Tuple[_T, _T]]: return len(val) == 2 def func(names: Tuple[str, ...]): if is_two_element_tuple(names): reveal_type(names) # Tuple[str, str] else: reveal_type(names) # Tuple[str, ...] Type checkers should assume that type narrowing should be applied to the expression that is passed as the first positional argument to a user-defined type guard. If the type guard function accepts more than one argument, no type narrowing is applied to those additional argument expressions. If a type guard function is implemented as an instance method or class method, the first positional argument maps to the second parameter (after "self" or "cls"). Here are some examples of user-defined type guard functions that accept more than one argument: :: def is_str_list(val: List[object], allow_empty: bool) -> TypeGuard[List[str]]: if len(val) == 0: return allow_empty return all(isinstance(x, str) for x in val) _T = TypeVar("_T") def is_set_of(val: Set[Any], type: Type[_T]) -> TypeGuard[Set[_T]]: return all(isinstance(x, type) for x in val) The return type of a user-defined type guard function will normally refer to a type that is strictly "narrower" than the type of the first argument (that is, it's a more specific type that can be assigned to the more general type). However, it is not required that the return type be strictly narrower. This allows for cases like the example above where ``List[str]`` is not assignable to ``List[object]``. When a conditional statement includes a call to a user-defined type guard function, the expression passed as the first positional argument to the type guard function should be assumed by a static type checker to take on the type specified in the TypeGuard return type, unless and until it is further narrowed within the conditional code block. Some built-in type guards provide narrowing for both positive and negative tests (in both the ``if`` and ``else`` clauses). For example, consider the type guard for an expression of the form `x is None`. If `x` has a type that is a union of None and some other type, it will be narrowed to `None` in the positive case and the other type in the negative case. User-defined type guards apply narrowing only in the positive case (the ``if`` clause). The type is not narrowed in the negative case. :: OneOrTwoStrs = Union[Tuple[str], Tuple[str, str]] def func(val: OneOrTwoStrs): if is_two_element_tuple(val): reveal_type(val) # Tuple[str, str] ... else: reveal_type(val) # OneOrTwoStrs ... if not is_two_element_tuple(val): reveal_type(val) # OneOrTwoStrs ... else: reveal_type(val) # Tuple[str, str] ... Backwards Compatibility ======================= Existing code that does not use this new functionality will be unaffected. Reference Implementation ======================== The Pyright type checker supports the behavior described in this PEP. Rejected Ideas ============== Decorator Syntax ---------------- The use of a decorator was considered for defining type guards. :: @type_guard(List[str]) def is_str_list(val: List[object]) -> bool: ... The decorator approach is inferior because it requires runtime evaluation of the type, precluding forward references. The proposed approach was also deemed to be easier to understand and simpler to implement. Enforcing Strict Narrowing -------------------------- Strict type narrowing enforcement (requiring that the type specified in the TypeGuard type argument is a narrower form of the type specified for the first parameter) was considered, but this eliminates valuable use cases for this functionality. For instance, the ``is_str_list`` example above would be considered invalid because ``List[str]`` is not a subtype of ``List[object]`` because of invariance rules. One variation that was considered was to require a strict narrowing requirement by default but allow the type guard function to specify some flag to indicate that it is not following this requirement. This was rejected because it was deemed cumbersome and unnecessary. Another consideration was to define some less-strict check that ensures that there is some overlap between the value type and the narrowed type specified in the TypeGuard. The problem with this proposal is that the rules for type compatibility are already very complex when considering unions, protocols, type variables, generics, etc. Defining a variant of these rules that relaxes some of these constraints just for the purpose of this feature would require that we articulate all of the subtle ways in which the rules differ and under what specific circumstances the constrains are relaxed. For this reason, it was decided to omit all checks. It was noted that without enforcing strict narrowing, it would be possible to break type safety. A poorly-written type guard function could produce unsafe or even nonsensical results. For example: :: def f(value: int) -> TypeGuard[str]: return True However, there are many ways a determined or uninformed developer can subvert type safety -- most commonly by using ``cast`` or ``Any``. If a Python developer takes the time to learn about and implement user-defined type guards within their code, it is safe to assume that they are interested in type safety and will not write their type guard functions in a way that will undermine type safety or produce nonsensical results. Conditionally Applying TypeGuard Type ------------------------------------- It was suggested that the expression passed as the first argument to a type guard function should retain its existing type if the type of the expression was a proper subtype of the type specified in the TypeGuard return type. For example, if the type guard function is ```def f(value: object) -> TypeGuard[float]``` and the expression passed to this function is of type ```int```, it would retain the ```int``` type rather than take on the ```float``` type indicated by the TypeGuard return type. This proposal was rejected because it added complexity, inconsistency, and opened up additional questions about the proper behavior if the type of the expression was of composite types like unions or type variables with multiple constraints. It was decided that the added complexity and inconsistency was not justified given that it would provide little or no added value. Narrowing of Arbitrary Parameters --------------------------------- TypeScript's formulation of user-defined type guards allows for any input parameter to be used as the value tested for narrowing. The TypeScript language authors could not recall any real-world examples in TypeScript where the parameter being tested was not the first parameter. For this reason, it was decided unnecessary to burden the Python implementation of user-defined type guards with additional complexity to support a contrived use case. If such use cases are identified in the future, there are ways the TypeGuard mechanism could be extended. This could involve the use of keyword indexing, as proposed in PEP 637. Narrowing of Implicit "self" and "cls" Parameters ------------------------------------------------- The proposal states that the first positional argument is assumed to be the value that is tested for narrowing. If the type guard function is implemented as an instance or class method, an implicit ``self`` or ``cls`` argument will also be passed to the function. A concern was raised that there may be cases where it is desired to apply the narrowing logic on ``self`` and ``cls``. This is an unusual use case, and accommodating it would significantly complicate the implementation of user-defined type guards. It was therefore decided that no special provision would be made for it. If narrowing of ``self`` or ``cls`` is required, the value can be passed as an explicit argument to a type guard function. Copyright ========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.