python-peps/pep-0647.rst

356 lines
13 KiB
ReStructuredText
Raw Normal View History

PEP: 647
Title: User-Defined Type Guards
Version: $Revision$
Last-Modified: $Date$
Author: Eric Traut <erictr at microsoft.com>
Sponsor: Guido van Rossum <guido@python.org>
Discussions-To: Typing-Sig <typing-sig@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 07-Oct-2020
Python-Version: 3.10
Post-History: 28-Dec-2020
Abstract
========
This PEP specifies a way for programs to influence conditional type narrowing
employed by a type checker based on runtime checks.
Motivation
==========
Static type checkers commonly employ a technique called "type narrowing" to
determine a more precise type of an expression within a program's code flow.
When type narrowing is applied within a block of code based on a conditional
code flow statement (such as ``if`` and ``while`` statements), the conditional
expression is sometimes referred to as a "type guard". Python type checkers
typically support various forms of type guards expressions.
::
def func(val: Optional[str]):
# "is None" type guard
if val is not None:
# Type of val is narrowed to str
...
else:
# Type of val is narrowed to None
...
def func(val: Optional[str]):
# Truthy type guard
if val:
# Type of val is narrowed to str
...
else:
# Type of val remains Optional[str]
...
def func(val: Union[str, float]):
# "isinstance" type guard
if isinstance(val, str):
# Type of val is narrowed to str
...
else:
# Type of val is narrowed to float
...
def func(val: Literal[1, 2]):
# Comparison type guard
if val == 1:
# Type of val is narrowed to Literal[1]
...
else:
# Type of val is narrowed to Literal[2]
...
There are cases where type narrowing cannot be applied based on static
information only. Consider the following example:
::
def is_str_list(val: List[object]) -> bool:
"""Determines whether all objects in the list are strings"""
return all(isinstance(x, str) for x in val)
def func1(val: List[object]):
if is_str_list(val):
print(" ".join(val)) # Error: invalid type
This code is correct, but a type checker will report a type error because
the value ``val`` passed to the ``join`` method is understood to be of type
``List[object]``. The type checker does not have enough information to
statically verify that the type of ``val`` is ``List[str]`` at this point.
This PEP introduces a way for a function like ``is_str_list`` to be defined as
a "user-defined type guard". This allows code to extend the type guards that
are supported by type checkers.
Using this new mechanism, the ``is_str_list`` function in the above example
would be modified slightly. Its return type would be changed from ``bool``
to ``TypeGuard[List[str]]``.
::
from typing import TypeGuard
def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
"""Determines whether all objects in the list are strings"""
return all(isinstance(x, str) for x in val)
User-defined type guards can also be used to determine whether a dictionary
conforms to the type requirements of a TypedDict.
::
class Person(TypedDict):
name: str
age: int
def is_person(val: dict) -> "TypeGuard[Person]":
try:
return isinstance(val["name"], str) and isinstance(val["age"], int)
except KeyError:
return False
def print_age(val: dict):
if is_person(val):
print(f"Age: {val['age']}")
else:
print("Not a person!")
Specification
=============
TypeGuard Type
--------------
This PEP introduces the symbol ``TypeGuard`` exported from the ``typing``
module. ``TypeGuard`` is a special form that accepts a single type argument.
It is used to annotate the return type of a user-defined type guard function.
Return statements within a type guard function should return bool values,
and type checkers should verify that all return paths return a bool.
In all other respects, TypeGuard is a distinct type from bool. It is not a
subtype of bool. Therefore, ``Callable[..., TypeGuard[int]]`` is not assignable
to ``Callable[..., bool]``.
When ``TypeGuard`` is used to annotate the return type of a function or
method that accepts at least one parameter, that function or method is
treated by type checkers as a user-defined type guard. The type argument
provided for ``TypeGuard`` indicates the type that has been validated by
the function.
User-defined type guards can be generic functions, as shown in this example:
::
_T = TypeVar("_T")
def is_two_element_tuple(val: Tuple[_T, ...]) -> TypeGuard[Tuple[_T, _T]]:
return len(val) == 2
def func(names: Tuple[str, ...]):
if is_two_element_tuple(names):
reveal_type(names) # Tuple[str, str]
else:
reveal_type(names) # Tuple[str, ...]
Type checkers should assume that type narrowing should be applied to the
expression that is passed as the first positional argument to a user-defined
type guard. If the type guard function accepts more than one argument, no
type narrowing is applied to those additional argument expressions.
If a type guard function is implemented as an instance method or class method,
the first positional argument maps to the second parameter (after "self" or
"cls").
Here are some examples of user-defined type guard functions that accept more
than one argument:
::
def is_str_list(val: List[object], allow_empty: bool) -> TypeGuard[List[str]]:
if len(val) == 0:
return allow_empty
return all(isinstance(x, str) for x in val)
_T = TypeVar("_T")
def is_set_of(val: Set[Any], type: Type[_T]) -> TypeGuard[Set[_T]]:
return all(isinstance(x, type) for x in val)
The return type of a user-defined type guard function will normally refer to
a type that is strictly "narrower" than the type of the first argument (that
is, it's a more specific type that can be assigned to the more general type).
However, it is not required that the return type be strictly narrower. This
allows for cases like the example above where ``List[str]`` is not assignable
to ``List[object]``.
When a conditional statement includes a call to a user-defined type guard
function, the expression passed as the first positional argument to the type
guard function should be assumed by a static type checker to take on the type
specified in the TypeGuard return type, unless and until it is further
narrowed within the conditional code block.
Some built-in type guards provide narrowing for both positive and negative
tests (in both the ``if`` and ``else`` clauses). For example, consider the
type guard for an expression of the form `x is None`. If `x` has a type that
is a union of None and some other type, it will be narrowed to `None` in the
positive case and the other type in the negative case. User-defined type
guards apply narrowing only in the positive case (the ``if`` clause). The type
is not narrowed in the negative case.
::
OneOrTwoStrs = Union[Tuple[str], Tuple[str, str]]
def func(val: OneOrTwoStrs):
if is_two_element_tuple(val):
reveal_type(val) # Tuple[str, str]
...
else:
reveal_type(val) # OneOrTwoStrs
...
if not is_two_element_tuple(val):
reveal_type(val) # OneOrTwoStrs
...
else:
reveal_type(val) # Tuple[str, str]
...
Backwards Compatibility
=======================
Existing code that does not use this new functionality will be unaffected.
Reference Implementation
========================
The Pyright type checker supports the behavior described in this PEP.
Rejected Ideas
==============
Decorator Syntax
----------------
The use of a decorator was considered for defining type guards.
::
@type_guard(List[str])
def is_str_list(val: List[object]) -> bool: ...
The decorator approach is inferior because it requires runtime evaluation of
the type, precluding forward references. The proposed approach was also deemed
to be easier to understand and simpler to implement.
Enforcing Strict Narrowing
--------------------------
Strict type narrowing enforcement (requiring that the type specified
in the TypeGuard type argument is a narrower form of the type specified
for the first parameter) was considered, but this eliminates valuable
use cases for this functionality. For instance, the ``is_str_list`` example
above would be considered invalid because ``List[str]`` is not a subtype of
``List[object]`` because of invariance rules.
One variation that was considered was to require a strict narrowing requirement
by default but allow the type guard function to specify some flag to
indicate that it is not following this requirement. This was rejected because
it was deemed cumbersome and unnecessary.
Another consideration was to define some less-strict check that ensures that
there is some overlap between the value type and the narrowed type specified
in the TypeGuard. The problem with this proposal is that the rules for type
compatibility are already very complex when considering unions, protocols,
type variables, generics, etc. Defining a variant of these rules that relaxes
some of these constraints just for the purpose of this feature would require
that we articulate all of the subtle ways in which the rules differ and under
what specific circumstances the constrains are relaxed. For this reason,
it was decided to omit all checks.
It was noted that without enforcing strict narrowing, it would be possible to
break type safety. A poorly-written type guard function could produce unsafe or
even nonsensical results. For example:
::
def f(value: int) -> TypeGuard[str]:
return True
However, there are many ways a determined or uninformed developer can subvert
type safety -- most commonly by using ``cast`` or ``Any``. If a Python
developer takes the time to learn about and implement user-defined
type guards within their code, it is safe to assume that they are interested
in type safety and will not write their type guard functions in a way that will
undermine type safety or produce nonsensical results.
Conditionally Applying TypeGuard Type
-------------------------------------
It was suggested that the expression passed as the first argument to a type
guard function should retain its existing type if the type of the expression was
a proper subtype of the type specified in the TypeGuard return type.
For example, if the type guard function is ```def f(value: object) ->
TypeGuard[float]``` and the expression passed to this function is of type
```int```, it would retain the ```int``` type rather than take on the
```float``` type indicated by the TypeGuard return type. This proposal was
rejected because it added complexity, inconsistency, and opened up additional
questions about the proper behavior if the type of the expression was of
composite types like unions or type variables with multiple constraints. It was
decided that the added complexity and inconsistency was not justified given
that it would provide little or no added value.
Narrowing of Arbitrary Parameters
---------------------------------
TypeScript's formulation of user-defined type guards allows for any input
parameter to be used as the value tested for narrowing. The TypeScript language
authors could not recall any real-world examples in TypeScript where the
parameter being tested was not the first parameter. For this reason, it was
decided unnecessary to burden the Python implementation of user-defined type
guards with additional complexity to support a contrived use case. If such
use cases are identified in the future, there are ways the TypeGuard mechanism
could be extended. This could involve the use of keyword indexing, as proposed
in PEP 637.
Narrowing of Implicit "self" and "cls" Parameters
-------------------------------------------------
The proposal states that the first positional argument is assumed to be the
value that is tested for narrowing. If the type guard function is implemented
as an instance or class method, an implicit ``self`` or ``cls`` argument will
also be passed to the function. A concern was raised that there may be
cases where it is desired to apply the narrowing logic on ``self`` and ``cls``.
This is an unusual use case, and accommodating it would significantly
complicate the implementation of user-defined type guards. It was therefore
decided that no special provision would be made for it. If narrowing
of ``self`` or ``cls`` is required, the value can be passed as an explicit
argument to a type guard function.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.