1206 lines
42 KiB
Plaintext
1206 lines
42 KiB
Plaintext
PEP: 544
|
||
Title: Protocols
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Ivan Levkivskyi <levkivskyi@gmail.com>, Jukka Lehtosalo <jukka.lehtosalo@iki.fi>, Łukasz Langa <lukasz@langa.pl>
|
||
Discussions-To: Python-Dev <python-dev@python.org>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 05-Mar-2017
|
||
Python-Version: 3.7
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
Type hints introduced in PEP 484 can be used to specify type metadata
|
||
for static type checkers and other third party tools. However, PEP 484
|
||
only specifies the semantics of *nominal* subtyping. In this PEP we specify
|
||
static and runtime semantics of protocol classes that will provide a support
|
||
for *structural* subtyping (static duck typing).
|
||
|
||
|
||
.. _rationale:
|
||
|
||
Rationale and Goals
|
||
===================
|
||
|
||
Currently, PEP 484 and the ``typing`` module [typing]_ define abstract
|
||
base classes for several common Python protocols such as ``Iterable`` and
|
||
``Sized``. The problem with them is that a class has to be explicitly marked
|
||
to support them, which is unpythonic and unlike what one would
|
||
normally do in idiomatic dynamically typed Python code. For example,
|
||
this conforms to PEP 484::
|
||
|
||
from typing import Sized, Iterable, Iterator
|
||
|
||
class Bucket(Sized, Iterable[int]):
|
||
...
|
||
def __len__(self) -> int: ...
|
||
def __iter__(self) -> Iterator[int]: ...
|
||
|
||
The same problem appears with user-defined ABCs: they must be explicitly
|
||
subclassed or registered. This is particularly difficult to do with library
|
||
types as the type objects may be hidden deep in the implementation
|
||
of the library. Also, extensive use of ABCs might impose additional
|
||
runtime costs.
|
||
|
||
The intention of this PEP is to solve all these problems
|
||
by allowing users to write the above code without explicit base classes in
|
||
the class definition, allowing ``Bucket`` to be implicitly considered
|
||
a subtype of both ``Sized`` and ``Iterable[int]`` by static type checkers
|
||
using structural [wiki-structural]_ subtyping::
|
||
|
||
from typing import Iterator, Iterable
|
||
|
||
class Bucket:
|
||
...
|
||
def __len__(self) -> int: ...
|
||
def __iter__(self) -> Iterator[int]: ...
|
||
|
||
def collect(items: Iterable[int]) -> int: ...
|
||
result: int = collect(Bucket()) # Passes type check
|
||
|
||
Note that ABCs in ``typing`` module already provide structural behavior
|
||
at runtime, ``isinstance(Bucket(), Iterable)`` returns ``True``.
|
||
The main goal of this proposal is to support such behavior statically.
|
||
The same functionality will be provided for user-defined protocols, as
|
||
specified below. The above code with a protocol class matches common Python
|
||
conventions much better. It is also automatically extensible and works
|
||
with additional, unrelated classes that happen to implement
|
||
the required protocol.
|
||
|
||
|
||
Nominal vs structural subtyping
|
||
-------------------------------
|
||
|
||
Structural subtyping is natural for Python programmers since it matches
|
||
the runtime semantics of duck typing: an object that has certain properties
|
||
is treated independently of its actual runtime class.
|
||
However, as discussed in PEP 483, both nominal and structural
|
||
subtyping have their strengths and weaknesses. Therefore, in this PEP we
|
||
*do not propose* to replace the nominal subtyping described by PEP 484 with
|
||
structural subtyping completely. Instead, protocol classes as specified in
|
||
this PEP complement normal classes, and users are free to choose
|
||
where to apply a particular solution. See section on `rejected`_ ideas at the
|
||
end of this PEP for additional motivation.
|
||
|
||
|
||
Non-goals
|
||
---------
|
||
|
||
At runtime, protocol classes will be simple ABCs. There is no intent to
|
||
provide sophisticated runtime instance and class checks against protocol
|
||
classes. This would be difficult and error-prone and will contradict the logic
|
||
of PEP 484. As well, following PEP 484 and PEP 526 we state that protocols are
|
||
**completely optional**:
|
||
|
||
* No runtime semantics will be imposed for variables or parameters annotated
|
||
with a protocol class.
|
||
* Any checks will be performed only by third-party type checkers and
|
||
other tools.
|
||
* Programmers are free to not use them even if they use type annotations.
|
||
* There is no intent to make protocols non-optional in the future.
|
||
|
||
|
||
Existing Approaches to Structural Subtyping
|
||
===========================================
|
||
|
||
Before describing the actual specification, we review and comment on existing
|
||
approaches related to structural subtyping in Python and other languages:
|
||
|
||
* ``zope.interface`` [zope-interfaces]_ was one of the first widely used
|
||
approaches to structural subtyping in Python. It is implemented by providing
|
||
special classes to distinguish interface classes from normal classes,
|
||
to mark interface attributes, and to explicitly declare implementation.
|
||
For example::
|
||
|
||
from zope.interface import Interface, Attribute, implementer
|
||
|
||
class IEmployee(Interface):
|
||
|
||
name = Attribute("Name of employee")
|
||
|
||
def do(work):
|
||
"""Do some work"""
|
||
|
||
@implementer(IEmployee)
|
||
class Employee:
|
||
|
||
name = 'Anonymous'
|
||
|
||
def do(self, work):
|
||
return work.start()
|
||
|
||
Zope interfaces support various contracts and constraints for interface
|
||
classes. For example::
|
||
|
||
from zope.interface import invariant
|
||
|
||
def required_contact(obj):
|
||
if not (obj.email or obj.phone):
|
||
raise Exception("At least one contact info is required")
|
||
|
||
class IPerson(Interface):
|
||
|
||
name = Attribute("Name")
|
||
email = Attribute("Email Address")
|
||
phone = Attribute("Phone Number")
|
||
|
||
invariant(required_contact)
|
||
|
||
Even more detailed invariants are supported. However, Zope interfaces rely
|
||
entirely on runtime validation. Such focus on runtime properties goes
|
||
beyond the scope of the current proposal, and static support for invariants
|
||
might be difficult to implement. However, the idea of marking an interface
|
||
class with a special base class is reasonable and easy to implement both
|
||
statically and at runtime.
|
||
|
||
* Python abstract base classes [abstract-classes]_ are the standard
|
||
library tool to provide some functionality similar to structural subtyping.
|
||
The drawback of this approach is the necessity to either subclass
|
||
the abstract class or register an implementation explicitly::
|
||
|
||
from abc import ABC
|
||
|
||
class MyTuple(ABC):
|
||
pass
|
||
|
||
MyTuple.register(tuple)
|
||
|
||
assert issubclass(tuple, MyTuple)
|
||
assert isinstance((), MyTuple)
|
||
|
||
As mentioned in the `rationale`_, we want to avoid such necessity, especially
|
||
in static context. However, in a runtime context, ABCs are good candidates for
|
||
protocol classes and they are already used extensively in
|
||
the ``typing`` module.
|
||
|
||
* Abstract classes defined in ``collections.abc`` module [collections-abc]_
|
||
are slightly more advanced since they implement a custom
|
||
``__subclasshook__()`` method that allows runtime structural checks without
|
||
explicit registration::
|
||
|
||
from collections.abc import Iterable
|
||
|
||
class MyIterable:
|
||
def __iter__(self):
|
||
return []
|
||
|
||
assert isinstance(MyIterable(), Iterable)
|
||
|
||
Such behavior seems to be a perfect fit for both runtime and static behavior
|
||
of protocols. As discussed in `rationale`_, we propose to add static support
|
||
for such behavior. In addition, to allow users to achieve such runtime
|
||
behavior for *user-defined* protocols a special ``@runtime`` decorator will
|
||
be provided, see detailed `discussion`_ below.
|
||
|
||
* TypeScript [typescript]_ provides support for user-defined classes and
|
||
interfaces. Explicit implementation declaration is not required and
|
||
structural subtyping is verified statically. For example::
|
||
|
||
interface LabeledItem {
|
||
label: string;
|
||
size?: int;
|
||
}
|
||
|
||
function printLabel(obj: LabeledValue) {
|
||
console.log(obj.label);
|
||
}
|
||
|
||
let myObj = {size: 10, label: "Size 10 Object"};
|
||
printLabel(myObj);
|
||
|
||
Note that optional interface members are supported. Also, TypeScript
|
||
prohibits redundant members in implementations. While the idea of
|
||
optional members looks interesting, it would complicate this proposal and
|
||
it is not clear how useful it will be. Therefore it is proposed to postpone
|
||
this; see `rejected`_ ideas. In general, the idea of static protocol
|
||
checking without runtime implications looks reasonable, and basically
|
||
this proposal follows the same line.
|
||
|
||
* Go [golang]_ uses a more radical approach and makes interfaces the primary
|
||
way to provide type information. Also, assignments are used to explicitly
|
||
ensure implementation::
|
||
|
||
type SomeInterface interface {
|
||
SomeMethod() ([]byte, error)
|
||
}
|
||
|
||
if _, ok := someval.(SomeInterface); ok {
|
||
fmt.Printf("value implements some interface")
|
||
}
|
||
|
||
Both these ideas are questionable in the context of this proposal. See
|
||
the section on `rejected`_ ideas.
|
||
|
||
|
||
.. _specification:
|
||
|
||
Specification
|
||
=============
|
||
|
||
Terminology
|
||
-----------
|
||
|
||
We propose to use the term *protocols* for types supporting structural
|
||
subtyping. The reason is that the term *iterator protocol*,
|
||
for example, is widely understood in the community, and coming up with
|
||
a new term for this concept in a statically typed context would just create
|
||
confusion.
|
||
|
||
This has the drawback that the term *protocol* becomes overloaded with
|
||
two subtly different meanings: the first is the traditional, well-known but
|
||
slightly fuzzy concept of protocols such as iterator; the second is the more
|
||
explicitly defined concept of protocols in statically typed code.
|
||
The distinction is not important most of the time, and in other
|
||
cases we propose to just add a qualifier such as *protocol classes*
|
||
when referring to the static type concept.
|
||
|
||
If a class includes a protocol in its MRO, the class is called
|
||
an *explicit* subclass of the protocol. If a class is a structural subtype
|
||
of a protocol, it is said to implement the protocol and to be compatible
|
||
with a protocol. If a class is compatible with a protocol but the protocol
|
||
is not included in the MRO, the class is an *implicit* subtype
|
||
of the protocol. (Note that one can explicitly subclass a protocol and
|
||
still not implement it if a protocol attribute is set to ``None``
|
||
in the subclass, see Python [data-model]_ for details.)
|
||
|
||
The attributes (variables and methods) of a protocol that are mandatory
|
||
for other class in order to be considered a structural subtype are called
|
||
protocol members.
|
||
|
||
|
||
.. _definition:
|
||
|
||
Defining a protocol
|
||
-------------------
|
||
|
||
Protocols are defined by including a special new class ``typing.Protocol``
|
||
(an instance of ``abc.ABCMeta``) in the base classes list, typically
|
||
at the end of the list. Here is a simple example::
|
||
|
||
from typing import Protocol
|
||
|
||
class SupportsClose(Protocol):
|
||
def close(self) -> None:
|
||
...
|
||
|
||
Now if one defines a class ``Resource`` with a ``close()`` method that has
|
||
a compatible signature, it would implicitly be a subtype of
|
||
``SupportsClose``, since the structural subtyping is used for
|
||
protocol types::
|
||
|
||
class Resource:
|
||
...
|
||
def close(self) -> None:
|
||
self.file.close()
|
||
self.lock.release()
|
||
|
||
Apart from few restrictions explicitly mentioned below, protocol types can
|
||
be used in every context where a normal types can::
|
||
|
||
def close_all(things: Iterable[SupportsClose]) -> None:
|
||
for t in things:
|
||
t.close()
|
||
|
||
f = open('foo.txt')
|
||
r = Resource()
|
||
close_all([f, r]) # OK!
|
||
close_all([1]) # Error: 'int' has no 'close' method
|
||
|
||
Note that both the user-defined class ``Resource`` and the built-in
|
||
``IO`` type (the return type of ``open()``) are considered subtypes of
|
||
``SupportsClose``, because they provide a ``close()`` method with
|
||
a compatible type signature.
|
||
|
||
|
||
Protocol members
|
||
----------------
|
||
|
||
All methods defined in the protocol class body are protocol members, both
|
||
normal and decorated with ``@abstractmethod``. If any parameters of a
|
||
protocol method are not annotated, then their types are assumed to be ``Any``
|
||
(see PEP 484). Bodies of protocol methods are type checked.
|
||
An abstract method that should not be called via ``super()`` ought to raise
|
||
``NotImplementedError``. Example::
|
||
|
||
from typing import Protocol
|
||
from abc import abstractmethod
|
||
|
||
class Example(Protocol):
|
||
def first(self) -> int: # This is a protocol member
|
||
return 42
|
||
|
||
@abstractmethod
|
||
def second(self) -> int: # Method without a default implementation
|
||
raise NotImplementedError
|
||
|
||
Static methods, class methods, and properties are equally allowed
|
||
in protocols.
|
||
|
||
To define a protocol variable, one can use PEP 526 variable
|
||
annotations in the class body. Additional attributes *only* defined in
|
||
the body of a method by assignment via ``self`` are not allowed. The rationale
|
||
for this is that the protocol class implementation is often not shared by
|
||
subtypes, so the interface should not depend on the default implementation.
|
||
Examples::
|
||
|
||
from typing import Protocol, List
|
||
|
||
class Template(Protocol):
|
||
name: str # This is a protocol member
|
||
value: int = 0 # This one too (with default)
|
||
|
||
def method(self) -> None:
|
||
self.temp: List[int] = [] # Error in type checker
|
||
|
||
class Concrete:
|
||
def __init__(self, name: str, value: int) -> None:
|
||
self.name = name
|
||
self.value = value
|
||
|
||
var: Template = Concrete('value', 42) # OK
|
||
|
||
To distinguish between protocol class variables and protocol instance
|
||
variables, the special ``ClassVar`` annotation should be used as specified
|
||
by PEP 526. By default, protocol variables as defined above are considered
|
||
readable and writable. To define a read-only protocol variable, one can use
|
||
an (abstract) property.
|
||
|
||
|
||
Explicitly declaring implementation
|
||
-----------------------------------
|
||
|
||
To explicitly declare that a certain class implements a given protocol,
|
||
it can be used as a regular base class. In this case a class could use
|
||
default implementations of protocol members. ``typing.Sequence`` is a good
|
||
example of a protocol with useful default methods. Static analysis tools are
|
||
expected to automatically detect that a class implements a given protocol.
|
||
So while it's possible to subclass a protocol explicitly, it's *not necessary*
|
||
to do so for the sake of type-checking.
|
||
|
||
The default implementations cannot be used if
|
||
the subtype relationship is implicit and only via structural
|
||
subtyping -- the semantics of inheritance is not changed. Examples::
|
||
|
||
class PColor(Protocol):
|
||
@abstractmethod
|
||
def draw(self) -> str:
|
||
...
|
||
def complex_method(self) -> int:
|
||
# some complex code here
|
||
|
||
class NiceColor(PColor):
|
||
def draw(self) -> str:
|
||
return "deep blue"
|
||
|
||
class BadColor(PColor):
|
||
def draw(self) -> str:
|
||
return super().draw() # Error, no default implementation
|
||
|
||
class ImplicitColor: # Note no 'PColor' base here
|
||
def draw(self) -> str:
|
||
return "probably gray"
|
||
def comlex_method(self) -> int:
|
||
# class needs to implement this
|
||
|
||
nice: NiceColor
|
||
another: ImplicitColor
|
||
|
||
def represent(c: PColor) -> None:
|
||
print(c.draw(), c.complex_method())
|
||
|
||
represent(nice) # OK
|
||
represent(another) # Also OK
|
||
|
||
Note that there is little difference between explicit and implicit
|
||
subtypes, the main benefit of explicit subclassing is to get some protocol
|
||
methods "for free". In addition, type checkers can statically verify that
|
||
the class actually implements the protocol correctly::
|
||
|
||
class RGB(Protocol):
|
||
rgb: Tuple[int, int, int]
|
||
|
||
@abstractmethod
|
||
def intensity(self) -> int:
|
||
return 0
|
||
|
||
class Point(RGB):
|
||
def __init__(self, red: int, green: int, blue: str) -> None:
|
||
self.rgb = red, green, blue # Error, 'blue' must be 'int'
|
||
|
||
# Type checker might warn that 'intensity' is not defined
|
||
|
||
A class can explicitly inherit from multiple protocols and also form normal
|
||
classes. In this case methods are resolved using normal MRO and a type checker
|
||
verifies that all subtyping are correct. The semantics of ``@abstractmethod``
|
||
is not changed, all of them must be implemented by an explicit subclass
|
||
before it can be instantiated.
|
||
|
||
|
||
Merging and extending protocols
|
||
-------------------------------
|
||
|
||
The general philosophy is that protocols are mostly like regular ABCs,
|
||
but a static type checker will handle them specially. Subclassing a protocol
|
||
class would not turn the subclass into a protocol unless it also has
|
||
``typing.Protocol`` as an explicit base class. Without this base, the class
|
||
is "downgraded" to a regular ABC that cannot be used with structural
|
||
subtyping. The rationale for this rule is that we don't want to accidentally
|
||
have some class act as a protocol just because one of its base classes
|
||
happens to be one. We still slightly prefer nominal subtyping over structural
|
||
subtyping in the static typing world.
|
||
|
||
A subprotocol can be defined by having *both* one or more protocols as
|
||
immediate base classes and also having ``typing.Protocol`` as an immediate
|
||
base class::
|
||
|
||
from typing import Sized, Protocol
|
||
|
||
class SizedAndClosable(Sized, Protocol):
|
||
def close(self) -> None:
|
||
...
|
||
|
||
Now the protocol ``SizedAndClosable`` is a protocol with two methods,
|
||
``__len__`` and ``close``. If one omits ``Protocol`` in the base class list,
|
||
this would be a regular (non-protocol) class that must implement ``Sized``.
|
||
Alternatively, one can implement ``SizedAndClosable`` protocol by merging
|
||
the ``SupportsClose`` protocol from the example in the `definition`_ section
|
||
with ``typing.Sized``::
|
||
|
||
from typing import Sized
|
||
|
||
class SupportsClose(Protocol):
|
||
def close(self) -> None:
|
||
...
|
||
|
||
class SizedAndClosable(Sized, SupportsClose, Protocol):
|
||
pass
|
||
|
||
The two definitions of ``SizedAndClosable`` are equivalent.
|
||
Subclass relationships between protocols are not meaningful when
|
||
considering subtyping, since structural compatibility is
|
||
the criterion, not the MRO.
|
||
|
||
If ``Protocol`` is included in the base class list, all the other base classes
|
||
must be protocols. A protocol can't extend a regular class, see `rejected`_
|
||
ideas for rationale. Note that rules around explicit subclassing are different
|
||
from regular ABCs, where abstractness is simply defined by having at least one
|
||
abstract method being unimplemented. Protocol classes must be marked
|
||
*explicitly*.
|
||
|
||
|
||
Generic protocols
|
||
-----------------
|
||
|
||
Generic protocols are important. For example, ``SupportsAbs``, ``Iterable``
|
||
and ``Iterator`` are generic protocols. They are defined similar to normal
|
||
non-protocol generic types::
|
||
|
||
class Iterable(Protocol[T]):
|
||
@abstractmethod
|
||
def __iter__(self) -> Iterator[T]:
|
||
...
|
||
|
||
``Protocol[T, S, ...]`` is allowed as a shorthand for
|
||
``Protocol, Generic[T, S, ...]``.
|
||
|
||
Declaring variance is not necessary for protocol classes, since it can be
|
||
inferred from a protocol definition. Examples::
|
||
|
||
class Box(Protocol[T]):
|
||
def content(self) -> T:
|
||
...
|
||
|
||
box: Box[float]
|
||
second_box: Box[int]
|
||
box = second_box # This is OK due to the inferred covariance of 'Box'.
|
||
|
||
class Sender(Protocol[T]):
|
||
def send(self, data: T) -> int:
|
||
...
|
||
|
||
sender: Sender[float]
|
||
new_sender: Sender[int]
|
||
new_sender = sender # OK, type checker finds that 'Sender' is contravariant.
|
||
|
||
class Proto(Protocol[T]):
|
||
attr: T
|
||
|
||
var: Proto[float]
|
||
another_var: Proto[int]
|
||
var = another_var # Error! 'Proto[float]' is incompatible with 'Proto[int]'.
|
||
|
||
|
||
Recursive protocols
|
||
-------------------
|
||
|
||
Recursive protocols are also supported. Forward references to the protocol
|
||
class names can be given as strings as specified by PEP 484. Recursive
|
||
protocols are useful for representing self-referential data structures
|
||
like trees in an abstract fashion::
|
||
|
||
class Traversable(Protocol):
|
||
def leaves(self) -> Iterable['Traversable']:
|
||
...
|
||
|
||
Note that for recursive protocols, a class is considered a subtype of
|
||
the protocol in situations where the decision depends on itself.
|
||
Continuing the previous example::
|
||
|
||
class SimpleTree:
|
||
def leaves(self) -> List['SimpleTree']:
|
||
...
|
||
|
||
root: Traversable = SimpleTree() # OK
|
||
|
||
class Tree(Generic[T]):
|
||
def leaves(self) -> List['Tree[T]']:
|
||
...
|
||
|
||
def walk(graph: Traversable) -> None:
|
||
...
|
||
tree: Tree[float] = Tree(0, [])
|
||
walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable'
|
||
|
||
|
||
Using Protocols
|
||
===============
|
||
|
||
Subtyping relationships with other types
|
||
----------------------------------------
|
||
|
||
Protocols cannot be instantiated, so there are no values whose
|
||
runtime type is a protocol. For variables and parameters with protocol types,
|
||
subtyping relationships are subject to the following rules:
|
||
|
||
* A protocol is never a subtype of a concrete type.
|
||
* A concrete type ``X`` is a subtype of protocol ``P``
|
||
if and only if ``X`` implements all protocol members of ``P`` with
|
||
compatible types. In other words, subtyping with respect to a protocol is
|
||
always structural.
|
||
* A protocol ``P1`` is a subtype of another protocol ``P2`` if ``P1`` defines
|
||
all protocol members of ``P2`` with compatible types.
|
||
|
||
Generic protocol types follow the same rules of variance as non-protocol
|
||
types. Protocol types can be used in all contexts where any other types
|
||
can be used, such as in ``Union``, ``ClassVar``, type variables bounds, etc.
|
||
Generic protocols follow the rules for generic abstract classes, except for
|
||
using structural compatibility instead of compatibility defined by
|
||
inheritance relationships.
|
||
|
||
|
||
Unions and intersections of protocols
|
||
-------------------------------------
|
||
|
||
``Union`` of protocol classes behaves the same way as for non-protocol
|
||
classes. For example::
|
||
|
||
from typing import Union, Optional, Protocol
|
||
|
||
class Exitable(Protocol):
|
||
def exit(self) -> int:
|
||
...
|
||
class Quittable(Protocol):
|
||
def quit(self) -> Optional[int]:
|
||
...
|
||
|
||
def finish(task: Union[Exitable, Quittable]) -> int:
|
||
...
|
||
class DefaultJob:
|
||
...
|
||
def quit(self) -> int:
|
||
return 0
|
||
finish(DefaultJob()) # OK
|
||
|
||
One can use multiple inheritance to define an intersection of protocols.
|
||
Example::
|
||
|
||
from typing import Sequence, Hashable
|
||
|
||
class HashableFloats(Sequence[float], Hashable, Protocol):
|
||
pass
|
||
|
||
def cached_func(args: HashableFloats) -> float:
|
||
...
|
||
cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence
|
||
|
||
If this will prove to be a widely used scenario, then a special
|
||
intersection type construct could be added in future as specified by PEP 483,
|
||
see `rejected`_ ideas for more details.
|
||
|
||
|
||
``Type[]`` with protocols
|
||
-------------------------
|
||
|
||
Variables and parameters annotated with ``Type[Proto]`` accept only concrete
|
||
(non-protocol) subtypes of ``Proto``. The main reason for this is to allow
|
||
instantiation of parameters with such type. For example::
|
||
|
||
class Proto(Protocol):
|
||
@abstractmethod
|
||
def meth(self) -> int:
|
||
...
|
||
class Concrete:
|
||
def meth(self) -> int:
|
||
return 42
|
||
|
||
def fun(cls: Type[Proto]) -> int:
|
||
return cls().meth() # OK
|
||
fun(Proto) # Error
|
||
fun(Concrete) # OK
|
||
|
||
The same rule applies to variables::
|
||
|
||
var: Type[Proto]
|
||
var = Proto # Error
|
||
var = Concrete # OK
|
||
var().meth() # OK
|
||
|
||
Assigning an ABC or a protocol class to a variable is allowed if it is
|
||
not explicitly typed, and such assignment creates a type alias.
|
||
For normal (non-abstract) classes, the behavior of ``Type[]`` is
|
||
not changed.
|
||
|
||
|
||
``NewType()`` and type aliases
|
||
------------------------------
|
||
|
||
Protocols are essentially anonymous. To emphasize this point, static type
|
||
checkers might refuse protocol classes inside ``NewType()`` to avoid an
|
||
illusion that a distinct type is provided::
|
||
|
||
form typing import NewType , Protocol, Iterator
|
||
|
||
class Id(Protocol):
|
||
code: int
|
||
secrets: Iterator[bytes]
|
||
|
||
UserId = NewType('UserId', Id) # Error, can't provide distinct type
|
||
|
||
In contrast, type aliases are fully supported, including generic type
|
||
aliases::
|
||
|
||
from typing import TypeVar, Reversible, Iterable, Sized
|
||
|
||
T = TypeVar('T')
|
||
class SizedIterable(Iterable[T], Sized, Protocol):
|
||
pass
|
||
CompatReversible = Union[Reversible[T], SizedIterable[T]]
|
||
|
||
|
||
.. _discussion:
|
||
|
||
``@runtime`` decorator and narrowing types by ``isinstance()``
|
||
--------------------------------------------------------------
|
||
|
||
The default semantics is that ``isinstance()`` and ``issubclass()`` fail
|
||
for protocol types. This is in the spirit of duck typing -- protocols
|
||
basically would be used to model duck typing statically, not explicitly
|
||
at runtime.
|
||
|
||
However, it should be possible for protocol types to implement custom
|
||
instance and class checks when this makes sense, similar to how ``Iterable``
|
||
and other ABCs in ``collections.abc`` and ``typing`` already do it,
|
||
but this is limited to non-generic and unsubscripted generic protocols
|
||
(``Iterable`` is statically equivalent to ``Iterable[Any]`).
|
||
The ``typing`` module will define a special ``@runtime`` class decorator
|
||
that provides the same semantics for class and instance checks as for
|
||
``collections.abc`` classes, essentially making them "runtime protocols"::
|
||
|
||
from typing import runtime, Protocol
|
||
|
||
@runtime
|
||
class Closable(Protocol):
|
||
def close(self):
|
||
...
|
||
|
||
assert isinstance(open('some/file'), Closable)
|
||
|
||
Static type checkers will understand ``isinstance(x, Proto)`` and
|
||
``issubclass(C, Proto)`` for protocols defined with this decorator (as they
|
||
already do for ``Iterable`` etc.). Static type checkers will narrow types
|
||
after such checks by the type erased ``Proto`` (i.e. with all variables
|
||
having type ``Any`` and all methods having type ``Callable[..., Any]``).
|
||
Note that ``isinstance(x, Proto[int])`` etc. will always fail in agreement
|
||
with PEP 484. Examples::
|
||
|
||
from typing import Iterable, Iterator, Sequence
|
||
|
||
def process(items: Iterable[int]) -> None:
|
||
if isinstance(items, Iterator):
|
||
# 'items' have type 'Iterator[int]' here
|
||
elif isinstance(items, Sequence[int]):
|
||
# Error! Can't use 'isinstance()' with subscripted protocols
|
||
|
||
Note that instance checks are not 100% reliable statically, this is why
|
||
this behavior is opt-in, see section on `rejected`_ ideas for examples.
|
||
|
||
|
||
Using Protocols in Python 2.7 - 3.5
|
||
===================================
|
||
|
||
Variable annotation syntax was added in Python 3.6, so that the syntax
|
||
for defining protocol variables proposed in `specification`_ section can't
|
||
be used if support for earlier versions is needed. To define these
|
||
in a manner compatible with older versions of Python one can use properties.
|
||
Properties can be settable and/or abstract if needed::
|
||
|
||
class Foo(Protocol):
|
||
@property
|
||
def c(self) -> int:
|
||
return 42 # Default value can be provided for property...
|
||
|
||
@abstractproperty
|
||
def d(self) -> int: # ... or it can be abstract
|
||
return 0
|
||
|
||
Also function type comments can be used as per PEP 484 (for example
|
||
to provide compatibility with Python 2). The ``typing`` module changes
|
||
proposed in this PEP will also be backported to earlier versions via the
|
||
backport currently available on PyPI.
|
||
|
||
|
||
Runtime Implementation of Protocol Classes
|
||
==========================================
|
||
|
||
Implementation details
|
||
----------------------
|
||
|
||
The runtime implementation could be done in pure Python without any
|
||
effects on the core interpreter and standard library except in the
|
||
``typing`` module:
|
||
|
||
* Define class ``typing.Protocol`` similar to ``typing.Generic``.
|
||
* Implement metaclass functionality to detect whether a class is
|
||
a protocol or not. Add a class attribute ``__protocol__ = True``
|
||
if that is the case. Verify that a protocol class only has protocol
|
||
base classes in the MRO (except for object).
|
||
* Implement ``@runtime`` that adds all attributes to ``__subclasshook__()``.
|
||
* All structural subtyping checks will be performed by static type checkers,
|
||
such as ``mypy`` [mypy]_. No additional support for protocol validation will
|
||
be provided at runtime.
|
||
|
||
|
||
Changes in the typing module
|
||
----------------------------
|
||
|
||
The following classes in ``typing`` module will be protocols:
|
||
|
||
* ``Hashable``
|
||
* ``SupportsAbs`` (and other ``Supports*`` classes)
|
||
* ``Iterable``, ``Iterator``
|
||
* ``Sized``
|
||
* ``Container``
|
||
* ``Collection``
|
||
* ``Reversible``
|
||
* ``Sequence``, ``MutableSequence``
|
||
* ``AbstractSet``, ``MutableSet``
|
||
* ``Mapping``, ``MutableMapping``
|
||
* ``AsyncIterable``, ``AsyncIterator``
|
||
* ``Awaitable``
|
||
* ``Callable``
|
||
* ``ContextManager``, ``AsyncContextManager``
|
||
|
||
Most of these classes are small and conceptually simple. It is easy to see
|
||
what are the methods these protocols implement, and immediately recognize
|
||
the corresponding runtime protocol counterpart.
|
||
Practically, few changes will be needed in ``typing`` since some of these
|
||
classes already behave the necessary way at runtime. Most of these will need
|
||
to be updated only in the corresponding ``typeshed`` stubs [typeshed]_.
|
||
|
||
All other concrete generic classes such as ``List``, ``Set``, ``IO``,
|
||
``Deque``, etc are sufficiently complex that it makes sense to keep
|
||
them non-protocols (i.e. require code to be explicit about them). Also, it is
|
||
too easy to leave some methods unimplemented by accident, and explicitly
|
||
marking the subclass relationship allows type checkers to pinpoint the missing
|
||
implementations.
|
||
|
||
|
||
Introspection
|
||
-------------
|
||
|
||
The existing class introspection machinery (``dir``, ``__annotations__`` etc)
|
||
can be used with protocols. In addition, all introspection tools implemented
|
||
in the ``typing`` module will support protocols. Since all attributes need
|
||
to be defined in the class body based on this proposal, protocol classes will
|
||
have even better perspective for introspection than regular classes where
|
||
attributes can be defined implicitly -- protocol attributes can't be
|
||
initialized in ways that are not visible to introspection
|
||
(using ``setattr()``, assignment via ``self``, etc.). Still, some things like
|
||
types of attributes will not be visible at runtime in Python 3.5 and earlier,
|
||
but this looks like a reasonable limitation.
|
||
|
||
There will be only limited support of ``isinstance()`` and ``issubclass()``
|
||
as discussed above (these will *always* fail with ``TypeError`` for
|
||
subscripted generic protocols, since a reliable answer could not be given
|
||
at runtime in this case). But together with other introspection tools this
|
||
give a reasonable perspective for runtime type checking tools.
|
||
|
||
|
||
.. _rejected:
|
||
|
||
Rejected/Postponed Ideas
|
||
========================
|
||
|
||
The ideas in this section were previously discussed in [several]_
|
||
[discussions]_ [elsewhere]_.
|
||
|
||
Make every class a protocol by default
|
||
--------------------------------------
|
||
|
||
Some languages such as Go make structural subtyping the only or the primary
|
||
form of subtyping. We could achieve a similar result by making all classes
|
||
protocols by default (or even always). However we believe that it is better
|
||
to require classes to be explicitly marked as protocols, for the following
|
||
reasons:
|
||
|
||
* Protocols don't have some properties of regular classes. In particular,
|
||
``isinstance()``, as defined for normal classes, is based on the nominal
|
||
hierarchy. In order to make everything a protocol by default, and have
|
||
``isinstance()`` work would require changing its semantics,
|
||
which won't happen.
|
||
* Protocol classes should generally not have many method implementations,
|
||
as they describe an interface, not an implementation.
|
||
Most classes have many implementations, making them bad protocol classes.
|
||
* Experience suggests that many classes are not practical as protocols anyway,
|
||
mainly because their interfaces are too large, complex or
|
||
implementation-oriented (for example, they may include de facto
|
||
private attributes and methods without a ``__`` prefix).
|
||
* Most actually useful protocols in existing Python code seem to be implicit.
|
||
The ABCs in ``typing`` and ``collections.abc`` are rather an exception, but
|
||
even they are recent additions to Python and most programmers
|
||
do not use them yet.
|
||
* Many built-in functions only accept concrete instances of ``int``
|
||
(and subclass instances), and similarly for other built-in classes. Making
|
||
``int`` a structural type wouldn't be safe without major changes to the
|
||
Python runtime, which won't happen.
|
||
|
||
|
||
Allow protocols subclassing normal classes
|
||
------------------------------------------
|
||
|
||
The main rationale to prohibit this is to preserve transitivity of subtyping,
|
||
consider this example::
|
||
|
||
from typing import Protocol
|
||
|
||
class Base:
|
||
attr: str
|
||
|
||
class Proto(Base, Protocol):
|
||
def meth(self) -> int:
|
||
...
|
||
|
||
class C:
|
||
attr: str
|
||
def meth(self) -> int:
|
||
return 0
|
||
|
||
Now, ``C`` is a subtype of ``Proto``, and ``Proto`` is a subtype of ``Base``.
|
||
But ``C`` cannot be a subtype of ``Base`` (since the latter is not
|
||
a protocol). This situation would be really weird. In addition, there is
|
||
an ambiguity about whether attributes of ``Base`` should become protocol
|
||
members of ``Proto``.
|
||
|
||
|
||
Support optional protocol members
|
||
---------------------------------
|
||
|
||
We can come up with examples where it would be handy to be able to say
|
||
that a method or data attribute does not need to be present in a class
|
||
implementing a protocol, but if it is present, it must conform to a specific
|
||
signature or type. One could use a ``hasattr()`` check to determine whether
|
||
they can use the attribute on a particular instance.
|
||
|
||
Languages such as TypeScript have similar features and
|
||
apparently they are pretty commonly used. The current realistic potential
|
||
use cases for protocols in Python don't require these. In the interest
|
||
of simplicity, we propose to not support optional methods or attributes.
|
||
We can always revisit this later if there is an actual need.
|
||
|
||
|
||
Allow only protocol methods and force use of getters and setters
|
||
----------------------------------------------------------------
|
||
|
||
One could argue that protocols typically only define methods, but not
|
||
variables. However, using getters and setters in cases where only a
|
||
simple variable is needed would be quite unpythonic. Moreover, the widespread
|
||
use of properties (that often act as type validators) in large code bases
|
||
is partially due to previous absence of static type checkers for Python,
|
||
the problem that PEP 484 and this PEP are aiming to solve. For example::
|
||
|
||
# without static types
|
||
|
||
class MyClass:
|
||
@property
|
||
def my_attr(self):
|
||
return self._my_attr
|
||
@my_attr.setter
|
||
def my_attr(self, value):
|
||
if not isinstance(value, int):
|
||
raise ValidationError("An integer expected for my_attr")
|
||
self._my_attr = value
|
||
|
||
# with static types
|
||
|
||
class MyClass:
|
||
my_attr: int
|
||
|
||
|
||
Support non-protocol members
|
||
----------------------------
|
||
|
||
There was an idea to make some methods "non-protocol" (i.e. not necessary
|
||
to implement, and inherited in explicit subclassing), but it was rejected,
|
||
since this complicates things. For example, consider this situation::
|
||
|
||
class Proto(Protocol):
|
||
@abstractmethod
|
||
def first(self) -> int:
|
||
raise NotImplementedError
|
||
def second(self) -> int:
|
||
return self.first() + 1
|
||
|
||
def fun(arg: Proto) -> None:
|
||
arg.second()
|
||
|
||
The question is should this be an error? We think most people would expect
|
||
this to be valid. Therefore, to be on the safe side, we need to require both
|
||
methods to be implemented in implicit subclasses. In addition, if one looks
|
||
at definitions in ``collections.abc``, there are very few methods that could
|
||
be considered "non-protocol". Therefore, it was decided to not introduce
|
||
"non-protocol" methods.
|
||
|
||
There is only one downside to this: it will require some boilerplate for
|
||
implicit subtypes of ``Mapping`` and few other "large" protocols. But, this
|
||
applies to few "built-in" protocols (like ``Mapping`` and ``Sequence``) and
|
||
people are already subclassing them. Also, such style is discouraged for
|
||
user-defined protocols. It is recommended to create compact protocols and
|
||
combine them.
|
||
|
||
|
||
Make protocols interoperable with other approaches
|
||
--------------------------------------------------
|
||
|
||
The protocols as described here are basically a minimal extension to
|
||
the existing concept of ABCs. We argue that this is the way they should
|
||
be understood, instead of as something that *replaces* Zope interfaces,
|
||
for example. Attempting such interoperabilities will significantly
|
||
complicate both the concept and the implementation.
|
||
|
||
On the other hand, Zope interfaces are conceptually a superset of protocols
|
||
defined here, but using an incompatible syntax to define them,
|
||
because before PEP 526 there was no straightforward way to annotate attributes.
|
||
In the 3.6+ world, ``zope.interface`` might potentially adopt the ``Protocol``
|
||
syntax. In this case, type checkers could be taught to recognize interfaces
|
||
as protocols and make simple structural checks with respect to them.
|
||
|
||
|
||
Use assignments to check explicitly that a class implements a protocol
|
||
----------------------------------------------------------------------
|
||
|
||
In the Go language the explicit checks for implementation are performed
|
||
via dummy assignments [golang]_. Such a way is also possible with the
|
||
current proposal. Example::
|
||
|
||
class A:
|
||
def __len__(self) -> float:
|
||
return ...
|
||
|
||
_: Sized = A() # Error: A.__len__ doesn't conform to 'Sized'
|
||
# (Incompatible return type 'float')
|
||
|
||
This approach moves the check away from
|
||
the class definition and it almost requires a comment as otherwise
|
||
the code probably would not make any sense to an average reader
|
||
-- it looks like dead code. Besides, in the simplest form it requires one
|
||
to construct an instance of ``A``, which could be problematic if this requires
|
||
accessing or allocating some resources such as files or sockets.
|
||
We could work around the latter by using a cast, for example, but then
|
||
the code would be ugly. Therefore we discourage the use of this pattern.
|
||
|
||
|
||
Support ``isinstance()`` checks by default
|
||
------------------------------------------
|
||
|
||
The problem with this is instance checks could be unreliable, except for
|
||
situations where there is a common signature convention such as ``Iterable``.
|
||
For example::
|
||
|
||
class P(Protocol):
|
||
def common_method_name(self, x: int) -> int: ...
|
||
|
||
class X:
|
||
<a bunch of methods>
|
||
def common_method_name(self) -> None: ... # Note different signature
|
||
|
||
def do_stuff(o: Union[P, X]) -> int:
|
||
if isinstance(o, P):
|
||
return o.common_method_name(1) # oops, what if it's an X instance?
|
||
|
||
Another potentially problematic case is assignment of attributes
|
||
*after* instantiation::
|
||
|
||
class P(Protocol):
|
||
x: int
|
||
|
||
class C:
|
||
def initialize(self) -> None:
|
||
self.x = 0
|
||
|
||
c = C()
|
||
isinstance(c1, P) # False
|
||
c.initialize()
|
||
isinstance(c, P) # True
|
||
|
||
def f(x: Union[P, int]) -> None:
|
||
if isinstance(x, P):
|
||
# static type of x is P here
|
||
...
|
||
else:
|
||
# type of x is "int" here?
|
||
print(x + 1)
|
||
|
||
f(C()) # oops
|
||
|
||
We argue that requiring an explicit class decorator would be better, since
|
||
one can then attach warnings about problems like this in the documentation.
|
||
The user would be able to evaluate whether the benefits outweigh
|
||
the potential for confusion for each protocol and explicitly opt in -- but
|
||
the default behavior would be safer. Finally, it will be easy to make this
|
||
behavior default if necessary, while it might be problematic to make it opt-in
|
||
after being default.
|
||
|
||
|
||
Provide a special intersection type construct
|
||
---------------------------------------------
|
||
|
||
There was an idea to allow ``Proto = All[Proto1, Proto2, ...]`` as a shorthand
|
||
for::
|
||
|
||
class Proto(Proto1, Proto2, ..., Protocol):
|
||
pass
|
||
|
||
However, it is not yet clear how popular/useful it will be and implementing
|
||
this in type checkers for non-protocol classes could be difficult. Finally, it
|
||
will be very easy to add this later if needed.
|
||
|
||
|
||
Prohibit explicit subclassing of protocols by non-protocols
|
||
-----------------------------------------------------------
|
||
|
||
This was rejected for the following reasons:
|
||
|
||
* Backward compatibility: People are already using ABCs, including generic
|
||
ABCs from ``typing`` module. If we prohibit explicit subclassing of these
|
||
ABCs, then quite a lot of code will break.
|
||
|
||
* Convenience: There are existing protocol-like ABCs (that will be turned
|
||
into protocols) that have many useful "mix-in" (non-abstract) methods.
|
||
For example in the case of ``Sequence`` one only needs to implement
|
||
``__getitem__`` and ``__len__`` in an explicit subclass, and one gets
|
||
``__iter__``, ``__contains__``, ``__reversed__``, ``index``, and ``count``
|
||
for free.
|
||
|
||
* Explicit subclassing makes it explicit that a class implements a particular
|
||
protocol, making subtyping relationships easier to see.
|
||
|
||
* Type checkers can warn about missing protocol members or members with
|
||
incompatible types more easily, without having to use hacks like dummy
|
||
assignments discussed above in this section.
|
||
|
||
* Explicit subclassing makes it possible to force a class to be considered
|
||
a subtype of a protocol (by using ``# type: ignore`` together with an
|
||
explicit base class) when it is not strictly compatible, such as when
|
||
it has an unsafe override.
|
||
|
||
|
||
Backwards Compatibility
|
||
=======================
|
||
|
||
This PEP is almost fully backwards compatible. Few collection classes such as
|
||
``Sequence`` and ``Mapping`` will be turned into runtime protocols, therefore
|
||
results of ``isinstance()`` checks are going to change in some edge cases.
|
||
For example, a class that implements the ``Sequence`` protocol but does not
|
||
explicitly inherit from ``Sequence`` currently returns ``False`` in
|
||
corresponding instance and class checks. With this PEP implemented, such
|
||
checks will return ``True``.
|
||
|
||
|
||
Implementation
|
||
==============
|
||
|
||
A working implementation of this PEP for ``mypy`` type checker is found on
|
||
GitHub repo at https://github.com/ilevkivskyi/mypy/tree/protocols,
|
||
corresponding ``typeshed`` stubs for more flavor are found at
|
||
https://github.com/ilevkivskyi/typeshed/tree/protocols. Installation steps::
|
||
|
||
git clone --recurse-submodules https://github.com/ilevkivskyi/mypy/
|
||
cd mypy && git checkout protocols && cd typeshed
|
||
git remote add proto https://github.com/ilevkivskyi/typeshed
|
||
git fetch proto && git checkout proto/protocols
|
||
cd .. && git add typeshed && sudo python3 -m pip install -U .
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [typing]
|
||
https://docs.python.org/3/library/typing.html
|
||
|
||
.. [wiki-structural]
|
||
https://en.wikipedia.org/wiki/Structural_type_system
|
||
|
||
.. [zope-interfaces]
|
||
https://zopeinterface.readthedocs.io/en/latest/
|
||
|
||
.. [abstract-classes]
|
||
https://docs.python.org/3/library/abc.html
|
||
|
||
.. [collections-abc]
|
||
https://docs.python.org/3/library/collections.abc.html
|
||
|
||
.. [typescript]
|
||
https://www.typescriptlang.org/docs/handbook/interfaces.html
|
||
|
||
.. [golang]
|
||
https://golang.org/doc/effective_go.html#interfaces_and_types
|
||
|
||
.. [data-model]
|
||
https://docs.python.org/3/reference/datamodel.html#special-method-names
|
||
|
||
.. [typeshed]
|
||
https://github.com/python/typeshed/
|
||
|
||
.. [mypy]
|
||
http://github.com/python/mypy/
|
||
|
||
.. [several]
|
||
https://mail.python.org/pipermail/python-ideas/2015-September/thread.html#35859
|
||
|
||
.. [discussions]
|
||
https://github.com/python/typing/issues/11
|
||
|
||
.. [elsewhere]
|
||
https://github.com/python/peps/pull/224
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|