PEP 677: Readability and organizational edits (#2193)

* Address Guido's comments from initial PR

* Update functions-as-types based on Pradeep's feedback

* Fix some markup, address comments on endpoint example

* Rearrange Motivation and Rationale

* Move the language comparison subsection to Rationale

* Fix more markup errors

* Rework "Motivation" section again

Going with Pradeep's suggested order:
- untyped example
- correct types
- partially-typed code
- our proposal

* Fix more lint errors

* Add a section highlighting a concern about -> readability

* Update the add/wrap/flatmap example

* Wording change suggested by AlexWaygood

* Add missing return annotation

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

* Break up long sentence

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>

* Break up long sentence (thanks Jelle)

* Reword awkward sentence (thanks Pradeep)

* Rearrange paragraphs in Motivation

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Steven Troxler 2021-12-15 21:23:34 -08:00 committed by GitHub
parent 581d21d947
commit c8a66e38e6
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 340 additions and 228 deletions


@ -27,116 +27,130 @@ work as a drop-in replacement.
Motivation
==========
Describing Callable Signatures with ``typing.Callable``
-------------------------------------------------------
One way to make code safer and easier to analyze is by making sure
that functions and classes are well-typed. In Python we have type
annotations defined by PEP 484 to provide type hints that can find
bugs as well as helping with editor tooling like tab completion,
static analysis tooling, and code review.
that functions and classes are well-typed. In Python we have type
annotations, the framework for which is defined in PEP 484, to provide
type hints that can find bugs as well as helping with editor tooling
like tab completion, static analysis tooling, and code review.
One of the types `defined by PEP 484
<https://www.python.org/dev/peps/pep-0484/#callable>`_ is
``typing.Callable``, which describes a callable value such as a
function or a method. It takes two parameters as inputs but with the
first parameter being either a placeholder like ``...`` or a list of
types. For example:
Consider the following untyped code::
- ``Callable[..., int]`` indicates a function with arbitrary
parameters returning an integer.
- ``Callable[[str, int], bool]`` indicates a function taking two
positional parameters of types ``str`` and ``int`` and returning a
``bool``.
Of the types defined by PEP 484, ``typing.Callable`` is the most
complex because it is the only one that requires two levels of
brackets in the same type. PEP 612 added ``typing.ParamSpec`` and
``typing.Concatenate`` to help describe functions that pass ``*args``
and ``**kwargs`` to callbacks, which is very common with
decorators. This made ``typing.Callable`` more powerful but also more
complicated.
Problems with ``typing.Callable``
---------------------------------
Empirically `we have found
<https://github.com/pradeep90/annotation_collector#typed-projects---callable-type>`_
that it is common for library authors to make use of untyped or
partially-typed callables (e.g. ``Callable[..., Any]`` or a bare
``Callable``) which we believe is partially a result of the existing
types being hard to use. But using this kind of partially-typed
callable can negate the benefits of static typing. For example,
consider the following code::
from typing import Any, Callable
def with_retries(
f: Callable[..., Any]
) -> Callable[..., Any]:
def wrapper(retry_once, *args, **kwargs):
if retry_once:
try: return f(*args, **kwargs)
except Exception: pass
return f(*args, **kwargs)
return wrapper
@with_retries
def f(x: int) -> int:
return x
def flat_map(l, func):
out = []
for element in l:
out.extend(func(element))
return out
f(True, z=15)
def wrap(x: int) -> list[int]:
return [x]
The decorator above is clearly not intended to modify the type of the
function it wraps, but because it uses ``Callable[..., Any]`` it
accidentally eliminates the annotations on ``f``, and type checkers
will accept the code above even though it is sure to crash at
runtime. A correct version of this code would look like this::
def add(x: int, y: int) -> int:
return x + y
from typing import Any, Callable, Concatenate, ParamSpec, TypeVar
flat_map([1, 2, 3], wrap) # no runtime error, output is [1, 2, 3]
flat_map([1, 2, 3], add) # runtime error: `add` expects 2 arguments, got 1
R = TypeVar("R")
P = ParamSpec("P")
def with_retries(
f: Callable[P, R]
) -> Callable[Concatenate[bool, P], R]:
def wrapper(retry_once: bool, *args: P.args, **kwargs: P.kwargs):
...
return wrapper
We can add types to this example to detect the runtime error::
from typing import Callable
def flat_map(
l: list[int],
func: Callable[[int], list[int]]
) -> list[int]:
...
...
With these changes, the incorrect attempt to pass ``z`` to ``f``
produces a typecheck error as we would like.
Four usability problems with the way ``typing.Callable`` is
represented may explain why library authors often do not use its full
power:
flat_map([1, 2, 3], wrap) # type checks okay, output is [1, 2, 3]
flat_map([1, 2, 3], add) # type check error
There are a few usability challenges with ``Callable`` we can see here:
- It is verbose, particularly for more complex function signatures.
- It does not visually represent the way function headers are written,
which can make it harder to learn and use.
- It requires an explicit import, something we no longer require for
most of the other very common types after PEP 604 (``|`` for
``Union`` types) and PEP 585 (generic collections).
- It relies on two levels of nested square brackets. This can be quite
hard to read, especially when the function arguments themselves have
square brackets.
- It relies on two levels of nested brackets, unlike any other generic
type. This can be especially hard to read when some of the type
parameters are themselves generic types.
- The bracket structure is not visually similar to how function signatures
are written.
- It requires an explicit import, unlike many of the other most common
types like ``list``.
With our proposed syntax, the properly-typed decorator example becomes
concise and the type representations are visually descriptive::
Possibly as a result, `programmers often fail to write complete
Callable types
<https://github.com/pradeep90/annotation_collector#typed-projects---callable-type>`_.
Such untyped or partially-typed callable types do not check the
parameter types or return types of the given callable and thus negate
the benefits of static typing. For example, they might write this::
from typing import Any, ParamSpec, TypeVar
R = TypeVar("R")
P = ParamSpec("P")
from typing import Any, Callable
def with_retries(
f: (**P) -> R
) -> (bool, **P) -> R:
...
def flat_map(
l: list[int],
func: Callable[..., Any]
) -> list[int]:
...
...
flat_map([1, 2, 3], add) # oops, no type check error!
There's some partial type information here - we at least know that ``func``
needs to be callable. But we've dropped too much type information to catch
the mistake.
With our proposal, the example looks like this::
def flat_map(l: list[int], func: (int) -> list[int]) -> list[int]:
out = []
for element in l:
out.extend(func(element))
return out
...
The type ``(int) -> list[int]`` is more concise, uses an arrow similar
to the one indicating a return type in a function header, avoids
nested brackets, and does not require an import.
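Because the arrow syntax is only a proposal, it cannot be executed on current interpreters. As a minimal runnable sketch, the same example can be written with today's ``typing.Callable`` (assuming Python 3.9 or later for ``list[int]``). A type checker should accept the first call and reject the second, which is left commented out because it would also fail at runtime:

```python
from typing import Callable


def flat_map(l: list[int], func: Callable[[int], list[int]]) -> list[int]:
    # Apply `func` to every element and concatenate the resulting lists.
    out: list[int] = []
    for element in l:
        out.extend(func(element))
    return out


def wrap(x: int) -> list[int]:
    return [x]


def add(x: int, y: int) -> int:
    return x + y


result = flat_map([1, 2, 3], wrap)  # type checks; evaluates to [1, 2, 3]
# flat_map([1, 2, 3], add)  # rejected by a type checker; TypeError at runtime
```

Under the proposal, only the annotation of ``func`` would change, to ``(int) -> list[int]``, and the import would no longer be needed.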
Rationale
=========
The ``Callable`` type is widely used. For example, `as of October 2021
it was
<https://github.com/pradeep90/annotation_collector#overall-stats-in-typeshed>`_
the fifth most common complex type in typeshed, after ``Optional``,
``Tuple``, ``Union``, and ``List``.
The others have had their syntax improved and the need for imports
eliminated by either PEP 604 or PEP 585:
- ``typing.Optional[int]`` is written ``int | None``
- ``typing.Union[int, str]`` is written ``int | str``
- ``typing.List[int]`` is written ``list[int]``
- ``typing.Tuple[int, str]`` is written ``tuple[int, str]``
The ``typing.Callable`` type is used almost as often as these other
types, is more complicated to read and write, and still requires an
import and bracket-based syntax.
In this proposal, we chose to support all the existing semantics of
``typing.Callable``, without adding support for new features. We took
this decision after examining how frequently each feature might be
used in existing typed and untyped open-source code. We determined
that the vast majority of use cases are covered.
We considered adding support for named, optional, and variadic
arguments. However, we decided against including these features, as
our analysis showed they are infrequently used. When they are really
needed, it is possible to type these using `callback protocols
<https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols>`_.
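As an illustration of the callback-protocol fallback mentioned above, here is a small sketch. The names ``Formatter``, ``format_fixed``, and ``render`` are hypothetical and chosen only for this example; the point is that a keyword-only argument with a default cannot be expressed by ``Callable[[float], str]`` but can be expressed by a ``Protocol`` with a ``__call__`` method:

```python
from typing import Protocol


class Formatter(Protocol):
    # Hypothetical callback type: one positional argument plus an optional
    # keyword-only argument, which a plain Callable type cannot describe.
    def __call__(self, value: float, *, precision: int = 2) -> str: ...


def format_fixed(value: float, *, precision: int = 2) -> str:
    return f"{value:.{precision}f}"


def render(values: list[float], fmt: Formatter) -> list[str]:
    # The keyword argument is visible to a type checker through the protocol.
    return [fmt(v, precision=1) for v in values]


rendered = render([1.5, 2.0], format_fixed)
```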
An Arrow Syntax for Callable Types
----------------------------------
@ -164,7 +178,7 @@ using the existing ``typing.Callable``::
from app_logic import Response, UserSetting
async def customize_response_for_settings(
def customize_response(
response: Response,
customizer: Callable[[Response, list[UserSetting]], Awaitable[Response]]
) -> Response:
@ -174,7 +188,7 @@ With our proposal, this code can be abbreviated to::
from app_logic import Response, UserSetting
def make_endpoint(
def customize_response(
response: Response,
customizer: async (Response, list[UserSetting]) -> Response,
) -> Response:
@ -184,33 +198,171 @@ This is shorter and requires fewer imports. It also has far less
nesting of square brackets - only one level, as opposed to three in
the original code.
Rationale
=========
Compact Syntax for ``ParamSpec``
--------------------------------
The ``Callable`` type is widely used. For example, `as of October 2021
it was
<https://github.com/pradeep90/annotation_collector#overall-stats-in-typeshed>`_
the fifth most common complex type in typeshed, after ``Optional``,
``Tuple``, ``Union``, and ``List``.
A particularly common case where library authors leave off type information
for callables is when defining decorators. Consider the following::
Most of the other commonly used types have had their syntax improved
via either PEP 604 or PEP 585. ``Callable`` is used heavily enough to
similarly justify a more usable syntax.
In this proposal, we chose to support all the existing semantics of
``typing.Callable``, without adding support for new features. We took
this decision after examining how frequently each feature might be
used in existing typed and untyped open-source code. We determined
that the vast majority of use cases are covered.
from typing import Any, Callable
We considered adding support for named, optional, and variadic
arguments. However, we decided against including these features, as
our analysis showed they are infrequently used. When they are really
needed, it is possible to type these using `Callback Protocols
<https://mypy.readthedocs.io/en/stable/protocols.html#callback-protocols>`_.
def with_retries(
f: Callable[..., Any]
) -> Callable[..., Any]:
def wrapper(retry_once, *args, **kwargs):
if retry_once:
try: return f(*args, **kwargs)
except Exception: pass
return f(*args, **kwargs)
return wrapper
See the Rejected Alternatives section for more detailed discussion
about omitted features.
@with_retries
def f(x: int) -> int:
return x
f(y=10) # oops - no type error!
In the code above, it is clear that the decorator should produce a
function whose signature is like that of the argument ``f`` other
than an additional ``retry_once`` argument. But the use of ``...``
prevents a type checker from seeing this and alerting a user that
``f(y=10)`` is invalid.
With PEP 612 it is possible to type decorators like this correctly
as follows::
from typing import Any, Callable, Concatenate, ParamSpec, TypeVar
R = TypeVar("R")
P = ParamSpec("P")
def with_retries(
f: Callable[P, R]
) -> Callable[Concatenate[bool, P], R]:
def wrapper(retry_once: bool, *args: P.args, **kwargs: P.kwargs) -> R:
...
return wrapper
...
With our proposed syntax, the properly-typed decorator example becomes
concise and the type representations are visually descriptive::
from typing import Any, ParamSpec, TypeVar
R = TypeVar("R")
P = ParamSpec("P")
def with_retries(
f: (**P) -> R
) -> (bool, **P) -> R:
...
Comparing to Other Languages
----------------------------
Many popular programming languages use an arrow syntax similar
to the one we are proposing here.
TypeScript
~~~~~~~~~~
In `TypeScript
<https://basarat.gitbook.io/typescript/type-system/callable#arrow-syntax>`_,
function types are expressed in a syntax almost the same as the one we
are proposing, but the arrow token is ``=>`` and arguments have names::
(x: int, y: str) => bool
The names of the arguments are not actually relevant to the type. So,
for example, this is the same callable type::
(a: int, b: str) => bool
Kotlin
~~~~~~
Function types in `Kotlin <https://kotlinlang.org/docs/lambdas.html>`_ permit
an identical syntax to the one we are proposing, for example::
(Int, String) -> Bool
It also optionally allows adding names to the arguments, for example::
(x: Int, y: String) -> Bool
As in TypeScript, the argument names if provided are just there for documentation
and are not part of the type itself.
Scala
~~~~~
`Scala <https://docs.scala-lang.org/tour/higher-order-functions.html>`_
uses the ``=>`` arrow for function types. Other than that, their syntax is
the same as the one we are proposing, for example::
(Int, String) => Bool
Scala, like Python, has the ability to provide function arguments by name.
Function types can optionally include names, for example::
(x: Int, y: String) => Bool
Unlike in TypeScript and Kotlin, these names are part of the type if
provided - any function implementing the type must use the same names.
This is similar to the extended syntax proposal we described in our
`Rejected Alternatives`_ section.
Function Definition vs Callable Type Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In all of the languages listed above, type annotations for function
definitions use a ``:`` rather than a ``->``. For example, in TypeScript
a simple higher-order function looks like this::
function higher_order(fn: (a: string) => string): string {
return fn("Hello, World");
}
Scala and Kotlin use essentially the same ``:`` syntax for return
annotations. The ``:`` makes sense in these languages because they
all use ``:`` for type annotations of
parameters and variables, and the use for function return types is
similar.
In Python we use ``:`` to denote the start of a function body and
``->`` for return annotations. As a result, even though our proposal
is superficially the same as these other languages, the context is
different. There is potential for more confusion in Python when
reading function definitions that include callable types.
This is a key concern for which we are seeking feedback with our draft
PEP; one idea we have floated is to use ``=>`` instead to make it easier
to differentiate.
The ML Language Family
~~~~~~~~~~~~~~~~~~~~~~
Languages in the ML family, including `F#
<https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/fsharp-types#syntax-for-types>`_,
`OCaml
<https://www2.ocaml.org/learn/tutorials/basics.html#Defining-a-function>`_,
and `Haskell <https://wiki.haskell.org/Type_signature>`_, all use
``->`` to represent function types. All of them use a parentheses-free
syntax with multiple arrows, for example in Haskell::
Integer -> String -> Bool
The use of multiple arrows, which differs from our proposal, makes
sense for languages in this family because they use automatic
`currying <https://en.wikipedia.org/wiki/Currying>`_ of function arguments,
which means that a multi-argument function behaves like a single-argument
function returning a function.
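Python does not curry automatically, so the two shapes are distinct types. The following sketch, written with today's ``typing.Callable`` and using hypothetical function names, contrasts a two-argument callable with its curried counterpart, which is roughly what the Haskell type ``Integer -> String -> Bool`` describes:

```python
from typing import Callable


def describe(n: int, s: str) -> bool:
    # Uncurried: both arguments are supplied in a single call.
    return len(s) == n


def describe_curried(n: int) -> Callable[[str], bool]:
    # Curried: takes the first argument and returns a function
    # that is still waiting for the second one.
    def inner(s: str) -> bool:
        return len(s) == n

    return inner


uncurried: Callable[[int, str], bool] = describe
curried: Callable[[int], Callable[[str], bool]] = describe_curried
```

In the proposed syntax these would be written ``(int, str) -> bool`` and ``(int) -> (str) -> bool`` respectively, which is why the parenthesized parameter list carries information in Python that ML-family languages do not need.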
Specification
=============
@ -257,7 +409,8 @@ same::
Grammar and AST
---------------
The proposed new syntax can be described by these AST changes ::
The proposed new syntax can be described by these AST changes to `Parser/Python.asdl
<https://github.com/python/cpython/blob/main/Parser/Python.asdl>`_::
expr = <preexisting_expr_kinds>
| AsyncCallableType(callable_type_arguments args, expr returns)
@ -301,8 +454,8 @@ types by modifying the grammar for
``callable_type_positional_argument`` as follows::
callable_type_positional_argument:
| expression ','
| expression &')'
| !'...' expression ','
| !'...' expression &')'
| '*' expression ','
| '*' expression &')'
@ -340,8 +493,8 @@ Because operators bind more tightly than ``->``, parentheses are
required whenever an arrow type is intended to be inside an argument
to an operator like ``|``::
(int) -> bool | () -> bool # syntax error!
(int) -> bool | (() -> bool) # okay
(int) -> () -> int | () -> bool # syntax error!
(int) -> (() -> int) | (() -> bool) # okay
We discussed each of these behaviors and believe they are desirable:
@ -400,21 +553,30 @@ want a type annotation and ``...`` is a valid expression. This is
never semantically valid and all type checkers would reject it, but
the grammar would allow it if we did not explicitly prevent this.
We decided that there were compelling reasons to prevent it: - The
semantics of ``(...) -> bool`` are different from ``(T) -> bool`` for
any valid type T: ``(...)`` is a special form indicating
``AnyArguments`` whereas ``T`` is a type parameter in the arguments
list. - ``...`` is used as a placeholder default value to indicate an
optional argument in stubs and Callback Protocols. Allowing it in the
position of a type could easily lead to confusion and possibly bugs
due to typos.
Since ``...`` is meaningless as a type and there are usability
concerns, our grammar rules it out and the following is a syntax
error::
(int, ...) -> bool
We decided that there were compelling reasons to do this:
- The semantics of ``(...) -> bool`` are different from ``(T) -> bool``
for any valid type T: ``(...)`` is a special form indicating
``AnyArguments`` whereas ``T`` is a type parameter in the arguments
list.
- ``...`` is used as a placeholder default value to indicate an
optional argument in stubs and callback protocols. Allowing it in
the position of a type could easily lead to confusion and possibly
bugs due to typos.
- In the ``tuple`` generic type, we special-case ``...`` to mean
"more of the same", e.g. a ``tuple[int, ...]`` means a tuple with
zero or more integers. We do not use ``...`` in a similar way
in callable types, so to prevent misunderstandings it makes sense
to prevent this.
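The two existing meanings of ``...`` can be inspected at runtime with ``typing.get_args``. This is only a small sketch using the current ``typing.Callable`` spelling (Python 3.9 or later), not the proposed arrow syntax:

```python
from typing import Callable, get_args

# In ``tuple``, ``...`` follows an element type and means "repeat that type".
homogeneous = tuple[int, ...]

# In ``Callable``, ``...`` replaces the whole parameter list ("any arguments").
any_args = Callable[..., bool]

tuple_args = get_args(homogeneous)
callable_args = get_args(any_args)
```

In the tuple, the ellipsis sits next to a type it modifies; in the callable, it stands alone in place of the parameters. A form such as ``(int, ...) -> bool`` would mix the two readings, which is the confusion the grammar restriction avoids.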
Incompatibility with other possible uses of ``*`` and ``**``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -508,7 +670,7 @@ described by the existing ``typing.Callable`` semantics:
Features that other, more complicated proposals would support account
for fewer than 2% of the use cases we found. These are already
expressible using Callback Protocols, and since they are uncommon we
expressible using callback protocols, and since they are uncommon we
decided that it made more sense to move forward with a simpler syntax.
Extended Syntax Supporting Named and Optional Arguments
@ -551,16 +713,34 @@ We decided against proposing it for the following reasons:
community decides after more experience and discussion that we want
the additional features, they should be straightforward to propose
in the future.
- We realized that because of overloads, it is not possible to replace
all need for Callback Protocols even with an extended syntax. This
makes us prefer proposing a simple solution that handles most use
cases well.
- Even a full extended syntax cannot replace the use of callback
protocols for overloads. For example, no closed form of callable type
could express a function that maps bools to bools and ints to floats,
like this callback protocol::
from typing import overload, Protocol
class OverloadedCallback(Protocol):
@overload
def __call__(self, x: bool) -> bool: ...
@overload
def __call__(self, x: int) -> float: ...
def __call__(self, x: int | bool) -> float | bool: ...
f: OverloadedCallback = ...
f(True) # bool
f(3) # float
We confirmed that the current proposal is forward-compatible with
extended syntax by
`implementing <https://github.com/stroxler/cpython/tree/callable-type-syntax--extended>`_
a quick-and-dirty grammar and AST on top of the grammar and AST for
the current proposal.
a quick-and-dirty grammar and AST on top of the grammar and AST for this proposal.
Syntax Closer to Function Signatures
@ -605,26 +785,35 @@ Other Proposals Considered
Functions-as-Types
~~~~~~~~~~~~~~~~~~
An idea we looked at very early on was to `allow using functions as
types
An idea we looked at very early on was to `allow using functions as types
<https://docs.google.com/document/d/1rv6CCDnmLIeDrYlXe-QcyT0xNPSYAuO1EBYjU3imU5s/edit?usp=sharing>`_.
The idea is allowing a function to stand in for its own call
signature, with roughly the same semantics as the ``__call__`` method
of Callback Protocols. We think this may be a great idea and worth its
own PEP, but that it is not a good alternative to improving the
usability of callable types:
of callback protocols::
- Using functions as types would not give us a new way of describing
function types as first class values. Instead, they would require a
function definition statement that effectively defines a type alias
(much as a Callable Protocol class statement does).
- Functions-as-types would support almost exactly the same features
that Callable Protocols do today: named, optional, and variadic args
as well as the ability to define overloads.
def CallableType(
positional_only: int,
/,
named: str,
*args: float,
keyword_only: int = ...,
**kwargs: str
) -> bool: ...
Another reason we don't view functions-as-types as a good alternative
is that it would be difficult to handle ``ParamSpec``, which we
consider a critical feature to support.
f: CallableType = ...
f(5, "name", 6.6, 6.7, keyword_only=6, x="hello", y="world") # typechecks as bool
This may be a good idea, but we do not consider it a viable
replacement for callable types:
- It would be difficult to handle ``ParamSpec``, which we consider a
critical feature to support.
- When using functions as types, the callable types are not first-class
values. Instead, they require a separate, out-of-line function
definition to define a type alias.
- It would not support more features than callback protocols, and seems
more like a shorter way to write them than a replacement for
``Callable``.
Parenthesis-Free Syntax
~~~~~~~~~~~~~~~~~~~~~~~
@ -748,83 +937,6 @@ led us to the current proposal.
<https://mail.python.org/archives/list/python-dev@python.org/thread/VBHJOS3LOXGVU6I4FABM6DKHH65GGCUB>`_
for feedback.
Other Languages
---------------
Many popular programming languages use an arrow syntax similar
to the one we are proposing here.
TypeScript
~~~~~~~~~~
In `TypeScript
<https://basarat.gitbook.io/typescript/type-system/callable#arrow-syntax>`_,
function types are expressed in a syntax almost the same as the one we
are proposing, but the arrow token is ``=>`` and arguments have names::
(x: int, y: str) => bool
The names of the arguments are not actually relevant to the type. So,
for example, this is the same callable type::
(a: int, b: str) => bool
Kotlin
~~~~~~
Function types in `Kotlin <https://kotlinlang.org/docs/lambdas.html>`_ permit
an identical syntax to the one we are proposing, for example::
(Int, String) -> Bool
It also optionally allows adding names to the arguments, for example::
(x: Int, y: String) -> Bool
As in TypeScript, the argument names if provided are just there for documentation
and are not part of the type itself.
Scala
~~~~~
`Scala <https://docs.scala-lang.org/tour/higher-order-functions.html>`_
uses the ``=>`` arrow for function types. Other than that, their syntax is
the same as the one we are proposing, for example::
(Int, String) => Bool
Scala, like Python, has the ability to provide function arguments by name.
Function types can optionally include names, for example::
(x: Int, y: String) => Bool
Unlike in TypeScript and Kotlin, these names are part of the type if
provided - any function implementing the type must use the same names.
This is similar to the extended syntax proposal we described in our
`Rejected Alternatives`_ section.
The ML Language Family
~~~~~~~~~~~~~~~~~~~~~~
Languages in the ML family, including `F#
<https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/fsharp-types#syntax-for-types>`_,
`OCaml
<https://www2.ocaml.org/learn/tutorials/basics.html#Defining-a-function>`_,
and `Haskell <https://wiki.haskell.org/Type_signature>`_, all use
``->`` to represent function types. All of them use a parentheses-free
syntax with multiple arrows, for example in Haskell::
Integer -> String -> Bool
The use of multiple arrows, which differs from our proposal, makes
sense for languages in this family because they use automatic
`currying <https://en.wikipedia.org/wiki/Currying>`_ of function arguments,
which means that a multi-argument function behaves like a single-argument
function returning a function.
Acknowledgments
---------------