A new type hints draft. I've decided to publish more drafts.

Significant changes in this draft:
- Define stubs.
- Define `@overload`.
- Describe `cast()`.
- Fix description of `Any`.
- Describe `Callable[..., t]`.
- Explain why `List[t]` instead of `List<t>`.
- Add section on rejected alternatives.
- Various other edits for clarity.
Guido van Rossum 2015-03-20 09:47:17 -07:00
parent 5817306884
commit 908c2eb563
1 changed files with 480 additions and 50 deletions

@ -8,7 +8,7 @@ Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Sep-2014
Post-History: 16-Jan-2015
Post-History: 16-Jan-2015,20-Mar-2015
Resolution:
@ -16,11 +16,33 @@ Abstract
========
This PEP introduces a standard syntax for type hints using annotations
on function definitions.
(PEP 3107) on function definitions. For example, here is a simple
function whose argument and return type are declared in the
annotations::
The proposal is strongly inspired by mypy [mypy]_.
def greeting(name: str) -> str:
    return 'Hello ' + name
The theory behind type hints and gradual typing is explained in PEP 483.
While these annotations are available at runtime through the usual
``__annotations__`` attribute, *no type checking happens at runtime*.
Instead, the proposal assumes the existence of a separate off-line
type checker which users can run over their source code voluntarily.
Essentially, such a type checker acts as a very powerful linter.
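As a small illustration of the point above, the following sketch (the
function name ``double`` is made up for this example) shows that the
annotations are stored on the function object but are not enforced by
the interpreter when the function is called:

```python
def double(x: int) -> int:
    # The annotations are recorded, but the interpreter never checks them.
    return x * 2

# The hints are available for introspection at runtime.
hints = double.__annotations__

# Passing a str is not rejected at runtime; only an off-line type
# checker would be expected to flag this call.
result = double('ab')
```

Here ``result`` is simply whatever ``'ab' * 2`` evaluates to; the
mismatch with the declared ``int`` is something only a separate
checker would report.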
The proposal is strongly inspired by mypy [mypy]_. For example, the
type "sequence of integers" can be written as ``Sequence[int]``. The
square brackets mean that no new syntax needs to be added to the
language. The example here uses a custom class ``Sequence``, imported
from a pure-Python module ``typing.py``. The ``Sequence[int]``
notation works by implementing ``__getitem__()`` in the metaclass.
The type system supports unions, generic types, and a special type
named ``Any`` which is consistent with (i.e. assignable to and from) all
types. This latter feature is taken from the idea of gradual typing.
Gradual typing and the full type system are explained in PEP 483.
Other approaches from which we have borrowed or to which ours can be
compared and contrasted are described in PEP 482.
Rationale and Goals
@ -36,6 +58,25 @@ up Python code to easier static analysis and refactoring, potential
runtime type checking, and performance optimizations utilizing type
information.
Of these goals, static analysis is the most important. This includes
support for off-line type checkers such as mypy, as well as providing
a standard notation that can be used by IDEs for code completion and
refactoring.
Non-goals
---------
While the proposed typing module will contain some building blocks for
runtime type checking -- in particular a useful ``isinstance()``
implementation -- third party packages would have to be developed to
implement specific runtime type checking functionality, for example
using decorators or metaclasses. Using type hints for performance
optimizations is left as an exercise for the reader.
It should also be emphasized that Python will remain a dynamically
typed language, and the authors have no desire to ever make type hints
mandatory, even by convention.
Type Definition Syntax
======================
@ -70,7 +111,7 @@ Type aliases are also valid type hints::
integer = int
def retry(url: str, retry_count: integer): ...
def retry(url: str, retry_count: integer) -> None: ...
New names that are added to support features described in following
sections are available in the ``typing`` package.
@ -83,17 +124,31 @@ Frameworks expecting callback functions of specific signatures might be
type hinted using ``Callable[[Arg1Type, Arg2Type], ReturnType]``.
Examples::
from typing import Any, AnyArgs, Callable
from typing import Callable
def feeder(get_next_item: Callable[[], Item]): ...
def feeder(get_next_item: Callable[[], str]) -> None:
    # Body
def async_query(on_success: Callable[[int], None], on_error: Callable[[int, Exception], None]): ...
def async_query(on_success: Callable[[int], None],
                on_error: Callable[[int, Exception], None]) -> None:
    # Body
def partial(func: Callable[AnyArgs, Any], *args): ...
It is possible to declare the return type of a callable without
specifying the call signature by substituting a literal ellipsis
(three dots) for the list of arguments::
Since using callbacks with keyword arguments is not perceived as
a common use case, there is currently no support for specifying keyword
arguments with ``Callable``.
def partial(func: Callable[..., str], *args) -> Callable[..., str]:
    # Body
Note that there are no square brackets around the ellipsis. The
arguments of the callback are completely unconstrained in this case
(and keyword arguments are acceptable).
Since using callbacks with keyword arguments is not perceived as a
common use case, there is currently no support for specifying keyword
arguments with ``Callable``. Similarly, there is no support for
specifying callback signatures with a variable number of arguments of a
specific type.
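A minimal sketch of both forms of ``Callable`` follows. The helper
names (``apply_twice``, ``describe``, ``increment``, ``join``) are
invented for illustration and are not part of the proposal:

```python
from typing import Callable

def apply_twice(func: Callable[[int], int], value: int) -> int:
    # func is declared to take exactly one int and return an int.
    return func(func(value))

def describe(func: Callable[..., str], *args, **kwargs) -> str:
    # With the ellipsis form only the return type is constrained, so
    # positional and keyword arguments are passed through unchecked.
    return 'result: ' + func(*args, **kwargs)

def increment(n: int) -> int:
    return n + 1

def join(*parts: str, sep: str = '-') -> str:
    return sep.join(parts)
```

For example, ``apply_twice(increment, 3)`` calls ``increment`` twice,
while ``describe(join, 'a', 'b', sep='+')`` shows a keyword argument
flowing through a ``Callable[..., str]``.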
Generics
@ -106,7 +161,7 @@ elements. Example::
from typing import Mapping, Set
def notify_by_email(employees: Set[Employee], overrides: Mapping[str, str]): ...
def notify_by_email(employees: Set[Employee], overrides: Mapping[str, str]) -> None: ...
Generics can be parametrized by using a new factory available in
``typing`` called ``TypeVar``. Example::
@ -149,13 +204,13 @@ When a type hint contains names that have not been defined yet, that
definition may be expressed as a string, to be resolved later. For
example, instead of writing::
def notify_by_email(employees: Set[Employee]): ...
def notify_by_email(employees: Set[Employee]) -> None: ...
one might write::
def notify_by_email(employees: 'Set[Employee]'): ...
def notify_by_email(employees: 'Set[Employee]') -> None: ...
.. FIXME: Rigorously define this. Defend it, or find an alternative.
.. FIXME: Rigorously define this, and give a motivational example.
Union types
@ -167,7 +222,7 @@ Example::
from typing import Union
def handle_employees(e: Union[Employee, Sequence[Employee]]):
def handle_employees(e: Union[Employee, Sequence[Employee]]) -> None:
    if isinstance(e, Employee):
        e = [e]
    ...
@ -180,14 +235,14 @@ One common case of union types are *optional* types. By default,
``None`` is an invalid value for any type, unless a default value of
``None`` has been provided in the function definition. Examples::
def handle_employee(e: Union[Employee, None]): ...
def handle_employee(e: Union[Employee, None]) -> None: ...
As a shorthand for ``Union[T1, None]`` you can write ``Optional[T1]``;
for example, the above is equivalent to::
from typing import Optional
def handle_employee(e: Optional[Employee]): ...
def handle_employee(e: Optional[Employee]) -> None: ...
An optional type is also automatically assumed when the default value is
``None``, for example::
@ -196,14 +251,21 @@ An optional type is also automatically assumed when the default value is
This is equivalent to::
def handle_employee(e: Optional[Employee] = None): ...
def handle_employee(e: Optional[Employee] = None) -> None: ...
.. FIXME: Is this really a good idea?
The ``Any`` type
----------------
A special kind of union type is ``Any``, a class that responds
``True`` to ``issubclass`` of any class. This lets the user
explicitly state that there are no constraints on the type of a
specific argument or return value.
A special kind of type is ``Any``. Every class is a subclass of
``Any``. This is also true for the builtin class ``object``.
However, to the static type checker these are completely different.
When the type of a value is ``object``, the type checker will reject
almost all operations on it, and assigning it to a variable (or using
it as a return value) of a more specialized type is a type error. On
the other hand, when a value has type ``Any``, the type checker will
allow all operations on it, and a value of type ``Any`` can be assigned
to a variable (or used as a return value) of a more constrained type.
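The difference can be sketched as follows. Both functions are
hypothetical, and the comments describe what a checker following the
rules above would be expected to do, not the behavior of any
particular tool:

```python
from typing import Any

def length_of_object(value: object) -> int:
    # With type object, calling len(value) directly would be expected
    # to be rejected, so the value is narrowed with isinstance() first.
    if isinstance(value, (str, list, tuple)):
        return len(value)
    return 0

def length_of_any(value: Any) -> int:
    # With type Any every operation is allowed by the checker; a
    # mistake here would only surface at runtime.
    return len(value)
```

At runtime the two annotations make no difference; the distinction
exists only for the static checker.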
Platform-specific type checking
@ -227,6 +289,8 @@ differences, simple conditionals can be used::
else:
loop = UnixSelectorEventLoop
.. FIXME: Also define PY3 and POSIX?
Arbitrary literals defined in the form of ``NAME = True`` will also be
accepted by the type checker to differentiate type resolution::
@ -258,12 +322,16 @@ makes it issue warnings when a static analyzer is used.
To mark portions of the program that should not be covered by type
hinting, use the following:
* a ``@no_type_checks`` decorator on classes and functions
* a ``@no_type_check`` decorator on classes and functions
* a ``# type: ignore`` comment on arbitrary lines
.. FIXME: should we have a module-wide comment as well?
.. FIXME: suggest that other uses of annotations be replaced with decorators
.. FIXME: add reference to "rejected alternatives"
Type Hints on Local and Global Variables
========================================
@ -275,7 +343,7 @@ complex cases, a comment of the following format may be used::
x = [] # type: List[Employee]
In the case where type information for a local variable is needed before
if was declared, an ``Undefined`` placeholder might be used::
it is declared, an ``Undefined`` placeholder might be used::
from typing import Undefined
@ -285,17 +353,142 @@ if was declared, an ``Undefined`` placeholder might be used::
If type hinting proves useful in general, a syntax for typing variables
may be provided in a future Python version.
Casts
=====
Explicit raised exceptions
==========================
Occasionally the type checker may need a different kind of hint: the
programmer may know that an expression is of a more constrained type
than the type checker infers. For example::
No support for listing explicitly raised exceptions is being defined by
this PEP. Currently the only known use case for this feature is
documentational, in which case the recommendation is to put this
information in a docstring.
from typing import List, cast
def find_first_str(a: List[object]) -> str:
    index = next(i for i, x in enumerate(a) if isinstance(x, str))
    # We only get here if there's at least one string in a
    return cast(str, a[index])
The type checker infers the type ``object`` for ``a[index]``, but we
know that (if the code gets to that point) it must be a string. The
``cast(t, x)`` call tells the type checker that we are confident that
the type of ``x`` is ``t``. At runtime a cast always returns the
expression unchanged -- it does not check the type, and it does not
convert or coerce the value.
Casts differ from type comments (see the previous section). When
using a type comment, the type checker should still verify that the
inferred type is consistent with the stated type. When using a cast,
the type checker trusts the programmer. Also, casts can be used in
expressions, while type comments only apply to assignments.
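A short sketch of the runtime behavior described above (the names
``first_str`` and ``unchanged`` are made up for this example):

```python
from typing import List, cast

def first_str(items: List[object]) -> str:
    for item in items:
        if isinstance(item, str):
            # The checker sees object; the cast states that it is a str.
            return cast(str, item)
    raise ValueError('no string found')

# cast() neither checks nor converts: the value comes back as-is even
# though it does not match the stated type.
unchanged = cast(int, 'not an int')
```

The second call shows why a cast is a statement of trust rather than a
check: nothing is verified or converted at runtime.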
The ``typing`` package
Stub Files
==========
Stub files are files containing type hints that are only for use by
the type checker, not at runtime. There are several use cases for
stub files:
* Extension modules
* 3rd party modules whose authors have not yet added type hints
* Standard library modules for which type hints have not yet been written
* Modules that must be compatible with Python 2 and 3
* Modules that use annotations for other purposes
Stub files have the same syntax as regular Python modules. There is
one feature of the ``typing`` module that may only be used in stub
files: the ``@overload`` decorator described below.
The type checker should only check function signatures in stub files;
function bodies in stub files should just be a single ``pass`` statement.
The type checker should have a configurable search path for stub
files. If a stub file is found the type checker should not read the
corresponding "real" module.
Stub files may use the ``.py`` extension or alternatively may use the
``.pyi`` extension. The latter makes it possible to maintain stub
files in the same directory as the corresponding real module.
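As a rough sketch, a stub for an imagined module ``shapes`` might look
like this. The module, its functions and the ``Circle`` class are all
hypothetical; only the conventions (signatures only, bodies reduced to
``pass``) come from the text above:

```python
# Hypothetical contents of shapes.pyi, a stub for an imagined module "shapes".
from typing import List, Tuple

def area(width: float, height: float) -> float: pass

def bounding_box(
        points: List[Tuple[float, float]]) -> Tuple[float, float, float, float]: pass

class Circle:
    def __init__(self, radius: float) -> None: pass
    def scale(self, factor: float) -> 'Circle': pass
```

Because every body is just ``pass``, such a file carries no behavior
of its own; it exists only to be read by the type checker.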
Function overloading
--------------------
The ``@overload`` decorator allows describing functions that support
multiple different combinations of argument types. This pattern is
used frequently in builtin modules and types. For example, the
``__getitem__()`` method of the ``bytes`` type can be described as
follows::
from typing import overload
class bytes:
    ...
    @overload
    def __getitem__(self, i: int) -> int: pass
    @overload
    def __getitem__(self, s: slice) -> bytes: pass
This description is more precise than would be possible using unions
(which cannot express the relationship between the argument and return
types)::
from typing import Union
class bytes:
    ...
    def __getitem__(self, a: Union[int, slice]) -> Union[int, bytes]: pass
Another example where ``@overload`` comes in handy is the type of the
builtin ``map()`` function, which takes a different number of
arguments depending on the type of the callable::
from typing import Callable, Iterable, Iterator, Tuple, TypeVar, overload
T1 = TypeVar('T1')
T2 = TypeVar('T2')
S = TypeVar('S')
@overload
def map(func: Callable[[T1], S], iter1: Iterable[T1]) -> Iterator[S]: pass
@overload
def map(func: Callable[[T1, T2], S],
        iter1: Iterable[T1], iter2: Iterable[T2]) -> Iterator[S]: pass
# ... and we could add more items to support more than two iterables
Note that we could also easily add items to support ``map(None, ...)``::
@overload
def map(func: None, iter1: Iterable[T1]) -> Iterable[T1]: pass
@overload
def map(func: None,
        iter1: Iterable[T1],
        iter2: Iterable[T2]) -> Iterable[Tuple[T1, T2]]: pass
The ``@overload`` decorator may only be used in stub files. While it
would be possible to provide a multiple dispatch implementation using
this syntax, its implementation would require using
``sys._getframe()``, which is frowned upon. Also, designing and
implementing an efficient multiple dispatch mechanism is hard, which
is why previous attempts were abandoned in favor of
``functools.singledispatch()``. (See PEP 443, especially its section
"Alternative approaches".) In the future we may come up with a
satisfactory multiple dispatch design, but we don't want such a design
to be constrained by the overloading syntax defined for type hints in
stub files.
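For comparison, runtime dispatch on the type of the first argument is
already available through ``functools.singledispatch()``. The sketch
below uses a made-up function ``kind_of`` and is unrelated to the
``@overload`` syntax for stubs:

```python
from functools import singledispatch

@singledispatch
def kind_of(value) -> str:
    # Fallback used when no more specific implementation is registered.
    return 'something else'

@kind_of.register(int)
def _(value) -> str:
    return 'an integer'

@kind_of.register(bytes)
def _(value) -> str:
    return 'a bytes object'
```

Unlike ``@overload`` in a stub, each registered function here has a
real body that is selected and executed at call time.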
Exceptions
==========
No syntax for listing explicitly raised exceptions is proposed.
Currently the only known use case for this feature is documentational,
in which case the recommendation is to put this information in a
docstring.
The ``typing`` Package
======================
To open the usage of static type checking to Python 3.5 as well as older
@ -312,10 +505,17 @@ holds a set of classes representing builtin types with generics, namely:
* FrozenSet, used as ``FrozenSet[element_type]``
* Tuple, used as ``Tuple[index0_type, index1_type, ...]``.
Arbitrary-length tuples might be expressed using ellipsis, in which
case the following arguments are considered the same type as the last
defined type on the tuple.
* Tuple, used by listing the element types, for example
``Tuple[int, int, str]``.
Arbitrary-length homogeneous tuples can be expressed
using one type and ellipsis, for example ``Tuple[int, ...]``.
(The ``...`` here are part of the syntax.)
The generic versions of concrete collection types (``Dict``, ``List``,
``Set``, ``FrozenSet``, and homogeneous arbitrary-length ``Tuple``)
are mainly useful for annotating return values. For arguments, prefer
the abstract collection types defined below, e.g. ``Mapping``,
``Sequence`` or ``AbstractSet``.
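A small sketch of this recommendation, with made-up function names:
the argument uses the abstract ``Sequence`` so that lists and tuples
are both accepted, while the return value uses the concrete ``List``:

```python
from typing import List, Sequence, Tuple

def squares(values: Sequence[int]) -> List[int]:
    # Any sequence of ints is accepted; a concrete list is promised back.
    return [v * v for v in values]

def total(values: Tuple[int, ...]) -> int:
    # A homogeneous tuple of arbitrary length.
    return sum(values)
```

Both ``squares([1, 2, 3])`` and ``squares((1, 2, 3))`` fit the declared
argument type.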
It also introduces factories and helper members needed to express
generics and union types:
@ -333,8 +533,6 @@ generics and union types:
* Callable, used as ``Callable[[Arg1Type, Arg2Type], ReturnType]``
* AnyArgs, used as ``Callable[AnyArgs, ReturnType]``
* AnyStr, equivalent to ``TypeVar('AnyStr', str, bytes)``
All abstract base classes available in ``collections.abc`` are
@ -385,7 +583,7 @@ The library includes literals for platform-specific type hinting:
* WINDOWS
* UNIXOID, equivalent to ``not WINDOWS``
* POSIX, equivalent to ``not WINDOWS``
The following types are available in the ``typing.io`` module:
@ -408,7 +606,7 @@ modules have two-letter names.").
The place of the ``typing`` module in the standard library
----------------------------------------------------------
.. FIXME: complete this section
.. FIXME: complete this section (or discard?)
Usage Patterns
@ -417,8 +615,8 @@ Usage Patterns
The main use case of type hinting is static analysis using an external
tool without executing the analyzed program. Existing tools used for
that purpose like ``pyflakes`` [pyflakes]_ or ``pylint`` [pylint]_
might be extended to support type checking. New tools, like mypy's
``mypy -S`` mode, can be adopted specifically for this purpose.
might be extended to support type checking. New tools, like mypy [mypy]_,
can be adopted specifically for this purpose.
Type checking based on type hints is understood as a best-effort
mechanism. In other words, whenever types are not annotated and cannot
@ -431,20 +629,246 @@ error, the program will continue running.
The implementation of a type checker, whether linting source files or
enforcing type information during runtime, is out of scope for this PEP.
.. FIXME: Describe stub modules.
.. FIXME: This is somewhat redundant with the updated initial sections.
.. FIXME: Describe run-time behavior of generic types.
Existing Approaches
===================
Rejected Alternatives
=====================
PEP 482 lists existing approaches in Python and other languages.
During discussion of earlier drafts of this PEP, various objections
were raised and alternatives were proposed. We discuss some of these
here and explain why we reject them.
Several main objections were raised.
Which brackets for generic type parameters?
-------------------------------------------
Most people are familiar with the use of angular brackets
(e.g. ``List<int>``) in languages like C++, Java, C# and Swift to
express the parametrization of generic types. The problem with these
is that they are really hard to parse, especially for a simple-minded
parser like Python's. In most languages the ambiguities are usually
dealt with by only allowing angular brackets in specific syntactic
positions, where general expressions aren't allowed. (And also by
using very powerful parsing techniques that can backtrack over an
arbitrary section of code.)
But in Python, we'd like type expressions to be (syntactically) the
same as other expressions, so that we can use e.g. variable assignment
to create type aliases. Consider this simple type expression::
List<int>
From the Python parser's perspective, the expression begins with the
same four tokens (NAME, LESS, NAME, GREATER) as a chained comparison::
a < b > c # I.e., (a < b) and (b > c)
We can even make up an example that could be parsed both ways::
a < b > [ c ]
Assuming we had angular brackets in the language, this could be
interpreted as either of the following two::
(a<b>)[c] # I.e., (a<b>).__getitem__(c)
a < b > ([c]) # I.e., (a < b) and (b > [c])
It would surely be possible to come up with a rule to disambiguate
such cases, but to most users the rules would feel arbitrary and
complex. It would also require us to dramatically change the CPython
parser (and every other parser for Python). It should be noted that
Python's current parser is intentionally "dumb" -- a simple grammar is
easier for users to reason about.
For all these reasons, square brackets (e.g. ``List[int]``) are (and
have long been) the preferred syntax for generic type parameters.
They can be implemented by defining the ``__getitem__()`` method on
the metaclass, and no new syntax is required at all. This option
works in all recent versions of Python (starting with Python 2.2).
Python is not alone in this syntactic choice -- generic classes in
Scala also use square brackets.
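The mechanism can be sketched in a few lines. The names
``GenericMeta`` and ``MyList`` below are hypothetical and this is not
how the ``typing`` module is actually implemented; it only shows that
subscripting a class needs no new syntax:

```python
class GenericMeta(type):
    """Hypothetical metaclass that makes its classes subscriptable."""

    def __getitem__(cls, item):
        # Build a subclass that remembers the type parameter.
        name = '%s[%s]' % (cls.__name__, item.__name__)
        return GenericMeta(name, (cls,), {'__args__': (item,)})

class MyList(metaclass=GenericMeta):
    __args__ = ()

# Square brackets on the class invoke GenericMeta.__getitem__().
IntList = MyList[int]
```

The expression ``MyList[int]`` is an ordinary subscription, so it can
be assigned to a name to create a type alias like any other value.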
What about existing uses of annotations?
----------------------------------------
One line of argument points out that PEP 3107 explicitly supports
the use of arbitrary expressions in function annotations. The new
proposal is then considered incompatible with the specification of PEP
3107.
Our response to this is that, first of all, the current proposal does
not introduce any direct incompatibilities, so programs using
annotations in Python 3.4 will still work correctly and without
prejudice in Python 3.5.
We do hope that type hints will eventually become the sole use for
annotations, but this will require additional discussion and a
deprecation period after the initial roll-out of the typing module
with Python 3.5. The current PEP will have provisional status (see
PEP 411) until Python 3.6 is released. The fastest conceivable scheme
would introduce silent deprecation of non-type-hint annotations in
3.6, full deprecation in 3.7, and declare type hints as the only
allowed use of annotations in Python 3.8. This should give authors of
packages that use annotations plenty of time to devise another
approach, even if type hints become an overnight success.
Another possible outcome would be that type hints will eventually
become the default meaning for annotations, but that there will always
remain an option to disable them. For this purpose the current
proposal defines a decorator ``@no_type_check`` which disables the
default interpretation of annotations as type hints in a given class
or function. It also defines a meta-decorator
``@no_type_check_decorator`` which can be used to decorate a decorator
(!), causing annotations in any function or class decorated with the
latter to be ignored by the type checker.
There are also ``# type: ignore`` comments, and static checkers should
support configuration options to disable type checking in selected
packages.
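A minimal sketch of the decorator in use, assuming a function whose
annotations describe units rather than types (``scale`` and the unit
strings are invented for this example):

```python
from typing import no_type_check

@no_type_check
def scale(value: 'meters', factor: 'unitless') -> 'meters':
    # These annotations are not type hints, so the decorator tells
    # type checkers to leave them alone.
    return value * factor
```

The decorated function behaves exactly as before at runtime; the
decorator only marks it for the benefit of the type checker.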
Despite all these options, proposals have been circulated to allow
type hints and other forms of annotations to coexist for individual
arguments. One proposal suggests that if an annotation for a given
argument is a dictionary literal, each key represents a different form
of annotation, and the key ``'type'`` would be used for type hints.
The problem with this idea and its variants is that the notation
becomes very "noisy" and hard to read. Also, in most cases where
existing libraries use annotations, there would be little need to
combine them with type hints. So the simpler approach of selectively
disabling type hints appears sufficient.
The problem of forward declarations
-----------------------------------
The current proposal is admittedly sub-optimal when type hints must
contain forward references. Python requires all names to be defined
by the time they are used. Apart from circular imports this is rarely
a problem: "use" here means "look up at runtime", and with most
"forward" references there is no problem in ensuring that a name is
defined before the function using it is called.
The problem with type hints is that annotations (per PEP 3107, and
similar to default values) are evaluated at the time a function is
defined, and thus any names used in an annotation must be already
defined when the function is being defined. A common scenario is a
class definition whose methods need to reference the class itself in
their annotations. (More generally, it can also occur with mutually
recursive classes.) This is natural for container types, for
example::
class Node:
    """Binary tree node."""
    def __init__(self, left: Node, right: Node):
        self.left = left
        self.right = right
As written this will not work, because of the peculiarity in Python
that class names become defined once the entire body of the class has
been executed. Our solution, which isn't particularly elegant, but
gets the job done, is to allow using string literals in annotations.
Most of the time you won't have to use this though -- most *uses* of
type hints are expected to reference builtin types or types defined in
other modules.
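With string literals the ``Node`` example can be written as in the
sketch below. The ``None`` defaults are an addition made here so that
leaf nodes can be constructed; they are not part of the example
above:

```python
class Node:
    """Binary tree node."""

    # 'Node' is a string, so the name is not looked up while the class
    # body is still being executed. The None defaults (an assumption of
    # this sketch) make both children optional.
    def __init__(self, left: 'Node' = None, right: 'Node' = None):
        self.left = left
        self.right = right

# At runtime the annotations are still plain strings.
annotations = Node.__init__.__annotations__

leaf = Node()
root = Node(leaf, Node())
```

The interpreter stores the strings untouched; resolving them to the
actual class is left to the type checker.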
A counterproposal would change the semantics of type hints so they
aren't evaluated at runtime at all (after all, type checking happens
off-line, so why would type hints need to be evaluated at runtime at
all). This of course would run afoul of backwards compatibility,
since the Python interpreter doesn't actually know whether a
particular annotation is meant to be a type hint or something else.
The double colon
----------------
A few creative souls have tried to invent solutions for this problem.
For example, it was proposed to use a double colon (``::``) for type
hints, solving two problems at once: disambiguating between type hints
and other annotations, and changing the semantics to preclude runtime
evaluation. There are several things wrong with this idea, however.
* It's ugly. The single colon in Python has many uses, and all of
them look familiar because they resemble the use of the colon in
English text. This is a general rule of thumb by which Python
abides for most forms of punctuation; the exceptions are typically
well known from other programming languages. But this use of ``::``
is unheard of in English, and in other languages (e.g. C++) it is
used as a scoping operator, which is a very different beast. In
contrast, the single colon for type hints reads naturally -- and no
wonder, since it was carefully designed for this purpose (the idea
long predates PEP 3107 [gvr-artima]_). It is also used in the same
fashion in other languages from Pascal to Swift.
* What would you do for return type annotations?
* It's actually a feature that type hints are evaluated at runtime.
* Making type hints available at runtime allows runtime type
checkers to be built on top of type hints.
* It catches mistakes even when the type checker is not run. Since
it is a separate program, users may choose not to run it (or even
install it), but might still want to use type hints as a concise
form of documentation. Broken type hints are no use even for
documentation.
* Because it's new syntax, using the double colon for type hints would
limit them to code that works with Python 3.5 only. By using
existing syntax, the current proposal can easily work for older
versions of Python 3. (And in fact mypy supports Python 3.2 and
newer.)
* If type hints become successful we may well decide to add new syntax
in the future to declare the type for variables, for example
``var age: int = 42``. If we were to use a double colon for
argument type hints, for consistency we'd have to use the same
convention for future syntax, perpetuating the ugliness.
Other forms of new syntax
-------------------------
A few other forms of alternative syntax have been proposed, e.g. the
introduction of a ``where`` keyword [roberge]_, and Cobra-inspired
``requires`` clauses. But these all share a problem with the double
colon: they won't work for earlier versions of Python 3. The same
would apply to a new ``__future__`` import.
Other backwards compatible conventions
--------------------------------------
The ideas put forward include:
* A decorator, e.g. ``@typehints(name=str, returns=str)``. This could
work, but it's pretty verbose (an extra line, and the argument names
must be repeated), and a far cry in elegance from the PEP 3107
notation.
* Stub files. We do want stub files, but they are primarily useful
for adding type hints to existing code that doesn't lend itself to
adding type hints, e.g. 3rd party packages, code that needs to
support both Python 2 and Python 3, and especially extension
modules. For most situations, having the annotations in line with
the function definitions makes them much more useful.
* Docstrings. There is an existing convention for docstrings, based
on the Sphinx notation (``:type arg1: description``). This is
pretty verbose (an extra line per parameter), and not very elegant.
We could also make up something new, but the annotation syntax is
hard to beat (because it was designed for this very purpose).
It's also been proposed to simply wait another release. But what
problem would that solve? It would just be procrastination.
Is type hinting Pythonic?
Is Type Hinting Pythonic?
=========================
.. FIXME: Do we really need this section?
Type annotations provide important documentation for how a unit of code
should be used. Programmers should therefore provide type hints on
public APIs, namely argument and return types on functions and methods
@ -486,6 +910,12 @@ References
.. [pylint]
http://www.pylint.org
.. [gvr-artima]
http://www.artima.com/weblogs/viewpost.jsp?thread=85551
.. [roberge]
http://aroberge.blogspot.com/2015/01/type-hinting-in-python-focus-on.html
Copyright
=========