[pep-585] Focus the PEP on a single issue

Łukasz Langa 2019-09-17 23:54:57 +02:00
parent f5b71ceb4b
commit ec6f1538c5
1 changed files with 236 additions and 87 deletions

@ -1,135 +1,184 @@
PEP: 585
Title: Type Hinting Usability Conventions
Title: Type Hinting Generics In Standard Collections
Version: $Revision$
Last-Modified: $Date$
Author: Łukasz Langa <lukasz@python.org>
Discussions-To: Python-Dev <python-dev@python.org>
Discussions-To: Typing-Sig <typing-sig@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 03-Mar-2019
Python-Version: 3.8
Status of this PEP
==================
The draft of this PEP is not yet complete. I am shamelessly squatting
on the PEP number which provides a cute relation to the original PEP 484.
The draft will be completed in the upcoming days.
Python-Version: 3.9
Abstract
========
Static typing as defined by PEPs 484, 526, 544, 560, and 563 was built
incrementally on top of the existing Python runtime and constrained by
existing syntax and runtime behavior. For this reason, its usability is
lacking and some parts of typing necessarily feel like an afterthought.
existing syntax and runtime behavior. This led to the existence of
a duplicated collection hierarchy in the ``typing`` module due to
generics (for example ``typing.List`` and the built-in ``list``).
This PEP addresses some of the major complaints of typing users, namely:
This PEP proposes to enable support for the generics syntax in all
standard collections currently available in the ``typing`` module.
* the necessity for programmers to perform import book-keeping of names
only used in static typing contexts;
* the surprising placement of runtime collections in the typing module
(ABCs and ``NamedTuple``);
* the surprising dichotomy between ``List`` and ``list``, and so on;
* parts of static typing still performed at runtime (aliasing, cast,
``NewType``, ``TypeVar``).
Rationale and Goals
===================
The overarching goal of this PEP is to make static typing fully free of
runtime side effects. In other words, no operations related to the
process of annotating arguments, return values, and variables with types
should generate runtime behavior which is otherwise useless at runtime.
This change removes the necessity for a parallel type hierarchy in the
``typing`` module, making it easier for users to annotate their programs
and easier for teachers to teach Python.
Terminology
===========
Generic (n.) - a type that can be parametrized, typically a container.
Also known as a *parametric type* or a *generic type*. For example:
``dict``.
Parametrized generic - a specific instance of a generic with the
expected types for container elements provided. For example:
``dict[str, int]``.
Backwards compatibility
=======================
This PEP is fully backwards compatible. Code written in previous ways
might trigger some deprecations but will ultimately work as intended.
The newly described functionality requires Python 3.7 (for uses of
the "annotations" future-import) or Python 3.8 (for refactorings of the
``typing`` module).
The newly described functionality requires Python 3.9. For use cases
restricted to type annotations, Python files with the "annotations"
future-import (available since Python 3.7) can use generics in
combination with standard library collections.
Tooling, including type checkers and linters, will have to be adapted to
enable the new functionality.
recognize such generics usage as valid.
Implementation
==============
Syntactic support for generics on builtin types within annotations
------------------------------------------------------------------
Starting with Python 3.7, when ``from __future__ import annotations`` is
used, function and variable annotations can specify generics directly on
builtin types. Example::
builtin types. Example::
    from __future__ import annotations
    def find(haystack: dict[str, list[int]]) -> int:
        ...
This new way is preferred; the names ``List``, ``Dict``, ``FrozenSet``,
and ``Set`` are deprecated. They won't be removed from the ``typing``
module, for backwards compatibility, but type checkers may warn about
them in future versions when used in conjunction with the "annotations"
future-import.
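A minimal sketch of what the future-import does to the example above,
assuming Python 3.7 or newer. Annotations are stored as strings and are
not evaluated, which is why ``dict[str, list[int]]`` is accepted inside
an annotation even on interpreters where builtin types cannot be
subscripted at runtime. The function body is an illustrative assumption.

```python
from __future__ import annotations


def find(haystack: dict[str, list[int]]) -> int:
    # Hypothetical body, only here to make the example callable.
    return sum(len(values) for values in haystack.values())


# With the future-import the annotations are kept as plain strings,
# so the subscript expression is never executed.
annotations = find.__annotations__
print(annotations["haystack"])
print(find({"a": [1, 2], "b": [3]}))
```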
Certain features of typing, like type aliases or casting, require
putting types in a runtime context, outside of annotations. While these
are relatively less common than type annotations, it's important to
allow using the same type syntax in all contexts. This is why, starting
with Python 3.9, the following collections gain ``__class_getitem__()``
support for generics:
Note: no runtime component is added to builtin collections to facilitate
generics in any sense. This syntax is only supported in an annotation.
* ``tuple`` # typing.Tuple
* ``list`` # typing.List
* ``dict`` # typing.Dict
* ``set`` # typing.Set
* ``frozenset`` # typing.FrozenSet
* ``type`` # typing.Type
* ``collections.deque``
* ``collections.defaultdict``
* ``collections.OrderedDict``
* ``collections.Counter``
* ``collections.ChainMap``
* ``collections.abc.Awaitable``
* ``collections.abc.Coroutine``
* ``collections.abc.AsyncIterable``
* ``collections.abc.AsyncIterator``
* ``collections.abc.AsyncGenerator``
* ``collections.abc.Iterable``
* ``collections.abc.Iterator``
* ``collections.abc.Generator``
* ``collections.abc.Reversible``
* ``collections.abc.Container``
* ``collections.abc.Collection``
* ``collections.abc.Callable``
* ``collections.abc.Set`` # typing.AbstractSet
* ``collections.abc.MutableSet``
* ``collections.abc.Mapping``
* ``collections.abc.MutableMapping``
* ``collections.abc.Sequence``
* ``collections.abc.MutableSequence``
* ``collections.abc.ByteString``
* ``collections.abc.MappingView``
* ``collections.abc.KeysView``
* ``collections.abc.ItemsView``
* ``collections.abc.ValuesView``
* ``contextlib.AbstractContextManager`` # typing.ContextManager
* ``contextlib.AbstractAsyncContextManager`` # typing.AsyncContextManager
Importing of typing
-------------------
Importing those from ``typing`` is deprecated. Type checkers may warn
about such deprecated usage when the target version of the checked
program is signalled to be Python 3.9 or newer.
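The runtime contexts mentioned above can be sketched as follows,
assuming an interpreter that implements this proposal (Python 3.9 or
newer). The names ``Inbox`` and ``Recent`` are illustrative only.

```python
from collections import deque
from typing import cast

# A type alias is an ordinary assignment, so the subscript runs at runtime.
Inbox = dict[int, list[str]]

# cast() also receives the type as a regular runtime argument and
# returns its second argument unchanged.
raw = {1: ["hello"]}
inbox = cast(Inbox, raw)

# Standard library collections outside of builtins work the same way.
Recent = deque[str]
recent = Recent(["a", "b"], maxlen=2)
print(inbox, recent)
```

None of these lines needs an import of a lookalike collection from
``typing``; only ``cast`` itself still comes from there.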
Starting with Python 3.7, when ``from __future__ import annotations`` is
used, function and variable annotations can use special names from the
``typing`` module without the relevant explicit imports being present
in the module.
Example::
Parameters to generics are available at runtime
-----------------------------------------------
    from __future__ import annotations
Preserving the generic type at runtime enables introspection of the type
which can be used for API generation or runtime type checking. Such
usage is already present in the wild.
    def loads(
        input: Union[str, bytes], *, encoding: Optional[str] = None
    ) -> dict[str, Any]:
        ...
Just like with the ``typing`` module today, the parametrized generic
types listed in the previous section all preserve their type parameters
at runtime::
Runtime collections in typing
-----------------------------
    >>> list[str]
    list[str]
    >>> tuple[int, ...]
    tuple[int, ...]
    >>> ChainMap[str, list[str]]
    collections.ChainMap[str, list[str]]
All abstract base classes redefined in the typing module are being moved
back to ``collections.abc`` including all additional functionality they
gained in the typing module (in particular, generics support). The
``Generic`` type is also moved to ``collections.abc``.
This is implemented using a thin proxy type that forwards all method
calls and attribute accesses to the bare origin type with the following
exceptions:
``typing.NamedTuple`` is also moved to ``collections``.
* the ``__repr__`` shows the parametrized type;
* the ``__origin__`` attribute points at the non-parametrized
generic class;
* the ``__parameters__`` attribute is a tuple (possibly of length
1) of generic types passed to the original ``__class_getitem__``;
* the ``__class_getitem__`` raises an exception to disallow mistakes
like ``dict[str][str]``.
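A hedged sketch of runtime introspection on a parametrized generic,
assuming Python 3.9 or newer. It uses the ``typing.get_origin()`` and
``typing.get_args()`` helpers rather than reading attributes directly,
because attribute names in a released interpreter may differ from the
ones listed in this draft.

```python
from typing import get_args, get_origin

alias = dict[str, list[int]]

# The proxy remembers both the bare class and the type parameters.
origin = get_origin(alias)
args = get_args(alias)
print(origin)
print(args)

# Subscripting an already parametrized generic again is rejected.
try:
    dict[str, int][str]
except TypeError as exc:
    double_subscript_error = exc
else:
    double_subscript_error = None
print(type(double_subscript_error).__name__)
```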
Aliases for all moved names will remain in the ``typing`` module for
backwards compatibility. Using them directly becomes deprecated.
This design means that it is possible to create instances of
parametrized collections, like::
Moving the remaining runtime syntax for typing-related functionality to annotations
-----------------------------------------------------------------------------------
    >>> l = list[str]()
    >>> l
    []
    >>> isinstance([1, 2, 3], list[str])
    True
    >>> list is list[str]
    False
    >>> list == list[str]
    True
Aliasing, cast, ``NewType``, and ``TypeVar`` require definitions which
have a runtime effect. This means they require importing names from
typing, cannot support forward references, and have negative (even if
minimal) effect on runtime performance.
Objects created with bare types and parametrized types are exactly the
same. The generic parameters are not preserved in instances created
with parametrized types; in other words, generic types erase type
parameters during object creation.
New syntax for those looks like this::
One important consequence of this is that the interpreter does **not**
attempt to type check operations on the collection created with
a parametrized type. This provides symmetry between::
    FBID: NewType[int]
    some_fbid: Cast[FBID] = some_int_from_db
    Inbox: Alias[dict[FBID, list[Message]]]
    T: TypeVar
XXX: How to bind in TypeVar?
    l: list[str] = []
and::
    l = list[str]()
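A small sketch of the erasure described above, assuming Python 3.9 or
newer. Both spellings produce an ordinary ``list``, and the interpreter
does not reject an element of the "wrong" type. The variable names are
illustrative.

```python
# Annotated, created with the literal syntax.
plain: list[str] = []

# Created by calling the parametrized generic.
built = list[str]()

# The type parameter is erased: both objects are plain lists.
print(type(plain) is type(built) is list)

# No runtime type check happens, so appending an int succeeds.
built.append(42)
print(built)
```

Catching the ``append(42)`` mistake remains the job of a static type
checker.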
Forward compatibility
---------------------
Future standard collections must implement the same behavior.
All of the above use the variable annotation syntax, removing the
runtime component from the functionality. In the case of NewType and
TypeVar, they additionally remove the necessity to repeat yourself with
the name of the type variable.
Rejected alternatives
=====================
@ -137,11 +186,111 @@ Rejected alternatives
Do nothing
----------
The usability issues described in the abstract are increasingly visible
when a codebase adopts type hinting holistically. The need to jump
between the type the programmer is just describing and imports needed to
describe the type breaks the flow of thought. The need to import
lookalike built-in collections for generics within annotations is a
kludge which makes it harder to teach Python and looks inelegant. The
remaining runtime component, even with use of the "annotations"
future-import, impacts startup performance of annotated applications.
Keeping the status quo forces Python programmers to perform book-keeping
of imports from the ``typing`` module for standard collections, making
all but the simplest annotations cumbersome to maintain. The existence
of parallel types is confusing to newcomers (why is there both ``list``
and ``List``?).
The above problems also don't exist in user-built generic classes, which
share runtime functionality and the ability to use them as generic type
annotations. Making standard collections harder to use in type hinting
than user classes hindered typing adoption and usability.
Generics erasure
----------------
It would be easier to implement ``__class_getitem__`` on the listed
standard collections in a way that doesn't preserve the generic type,
in other words::
    >>> list[str]
    <class 'list'>
    >>> tuple[int, ...]
    <class 'tuple'>
    >>> collections.ChainMap[str, list[str]]
    <class 'collections.ChainMap'>
This is problematic as it breaks backwards compatibility: current
equivalents of those types in the ``typing`` module **do** preserve
the generic type::
    >>> from typing import List, Tuple, ChainMap
    >>> List[str]
    typing.List[str]
    >>> Tuple[int, ...]
    typing.Tuple[int, ...]
    >>> ChainMap[str, List[str]]
    typing.ChainMap[str, typing.List[str]]
As mentioned in the "Implementation" section, preserving the generic
type at runtime enables runtime introspection of the type which can be
used for API generation or runtime type checking. Such usage is already
present in the wild.
Additionally, implementing subscripts as identity functions would make
Python less friendly to beginners. Let's demonstrate this with an
example. If a user is passing a list type instead of a list object to
a function, and that function is using indexing, the code would no
longer raise an error.
Today::
    >>> l = list
    >>> l[-1]
    TypeError: 'type' object is not subscriptable
With ``__class_getitem__`` as an identity function::
    >>> l = list
    >>> l[-1]
    <class 'list'>
The indexing being successful here would likely end up raising an
exception at a distance and with a confusing error message to the user.
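The failure mode can be reproduced today with a user-defined class. The
sketch below uses a hypothetical ``ErasingList`` subclass whose
``__class_getitem__`` is an identity function, and a hypothetical
``last_item()`` helper that receives the class by mistake.

```python
class ErasingList(list):
    """Hypothetical list subclass whose subscript returns the class itself."""

    def __class_getitem__(cls, item):
        # Identity function: the subscript is accepted and ignored.
        return cls


def last_item(items):
    # Intended for list instances.
    return items[-1]


# Bug: the class is passed instead of an instance, yet nothing fails here.
result = last_item(ErasingList)
print(result is ErasingList)
```

The caller now holds a class where an element was expected, so the
eventual error surfaces far from the actual mistake.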
Disallowing instantiation of parametrized types
-----------------------------------------------
Given that the proxy type which preserves ``__origin__`` and
``__parameters__`` is mostly useful for static analysis or runtime
introspection purposes, we might have disallowed instantiation of
parametrized types.
In fact, this is what the ``typing`` module does today for the parallels
of builtin collections only. Instantiation of other parametrized types
is allowed.
The original reason for this decision was to discourage spurious
parametrization which made object creation up to two orders of magnitude
slower compared to the special syntax available for builtin types.
This rationale is not strong enough to allow the exceptional treatment
of builtins. All other parametrized types can still be instantiated,
including parallels of collections in the standard library. Moreover,
Python allows for instantiation of lists using ``list()`` and some
builtin collections don't provide special syntax for instantiation.
Making ``isinstance(obj, list[str])`` perform a runtime type check
------------------------------------------------------------------
This functionality requires iterating over the collection, which is
a destructive operation for some of them. This functionality would have
been useful; however, implementing a type checker within Python that
would deal with complex types, nested type checking, type variables,
string forward references, and so on is out of scope for this PEP. This
can be revisited in the future.
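To illustrate why such a check can be destructive, here is a sketch
with a hypothetical ``all_items_are()`` helper that inspects every
element. It is not part of this proposal; it only shows what a deep
check would have to do to a one-shot iterator.

```python
from collections.abc import Iterable


def all_items_are(obj: Iterable, item_type: type) -> bool:
    """Hypothetical deep check: inspect every element of ``obj``."""
    return all(isinstance(item, item_type) for item in obj)


# An iterator can only be traversed once.
words = iter(["a", "b", "c"])
check_passed = all_items_are(words, str)

# The check consumed the iterator, so nothing is left for the caller.
remaining = list(words)
print(check_passed, remaining)
```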
Note on the initial draft
=========================
An early version of this PEP discussed matters beyond generics in
standard collections. Those unrelated topics were removed for clarity.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.