From 6024eea3209da00a2da6795081d81784c3efbff7 Mon Sep 17 00:00:00 2001 From: Ivan Levkivskyi Date: Sat, 22 Apr 2017 16:43:32 +0200 Subject: [PATCH] Updates for PEP 544: Protocols (#243) Note: there's an open question about whether interface variance should be inferred or be user-declared (defaulting to invariant unless declared otherwise). The current draft uses inferred invariance. See e.g. discussion at https://github.com/python/peps/pull/243#issuecomment-295826456. --- pep-0544.txt | 359 +++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 279 insertions(+), 80 deletions(-) diff --git a/pep-0544.txt b/pep-0544.txt index 87b767f74..2eb536cee 100644 --- a/pep-0544.txt +++ b/pep-0544.txt @@ -116,7 +116,7 @@ approaches related to structural subtyping in Python and other languages: to mark interface attributes, and to explicitly declare implementation. For example:: - from zope.interface import Interface, Attribute, implements + from zope.interface import Interface, Attribute, implementer class IEmployee(Interface): @@ -125,8 +125,8 @@ approaches related to structural subtyping in Python and other languages: def do(work): """Do some work""" - class Employee(object): - implements(IEmployee) + @implementer(IEmployee) + class Employee: name = 'Anonymous' @@ -193,10 +193,10 @@ approaches related to structural subtyping in Python and other languages: Such behavior seems to be a perfect fit for both runtime and static behavior of protocols. As discussed in `rationale`_, we propose to add static support for such behavior. In addition, to allow users to achieve such runtime - behavior for *user defined* protocols a special ``@runtime`` decorator will + behavior for *user-defined* protocols a special ``@runtime`` decorator will be provided, see detailed `discussion`_ below. -* TypeScript [typescript]_ provides support for user defined classes and +* TypeScript [typescript]_ provides support for user-defined classes and interfaces. Explicit implementation declaration is not required and structural subtyping is verified statically. For example:: @@ -263,7 +263,9 @@ an *explicit* subclass of the protocol. If a class is a structural subtype of a protocol, it is said to implement the protocol and to be compatible with a protocol. If a class is compatible with a protocol but the protocol is not included in the MRO, the class is an *implicit* subtype -of the protocol. +of the protocol. (Note that one can explicitly subclass a protocol and +still not implement it if a protocol attribute is set to ``None`` +in the subclass, see Python [data-model]_ for details.) The attributes (variables and methods) of a protocol that are mandatory for other class in order to be considered a structural subtype are called @@ -276,7 +278,7 @@ Defining a protocol ------------------- Protocols are defined by including a special new class ``typing.Protocol`` -(an instance of ``abc.ABCMeta``) in the base classes list, preferably +(an instance of ``abc.ABCMeta``) in the base classes list, typically at the end of the list. Here is a simple example:: from typing import Protocol @@ -318,11 +320,11 @@ Protocol members ---------------- All methods defined in the protocol class body are protocol members, both -normal and decorated with ``@abstractmethod``. If some or all parameters of +normal and decorated with ``@abstractmethod``. If any parameters of a protocol method are not annotated, then their types are assumed to be ``Any`` -(see PEP 484). Bodies of protocol methods are type checked, except for methods -decorated with ``@abstractmethod`` with trivial bodies. A trivial body can -contain a docstring. Example:: +(see PEP 484). Bodies of protocol methods are type checked. +An abstract method that should not be called via ``super()`` ought to raise +``NotImplementedError``. Example:: from typing import Protocol from abc import abstractmethod @@ -333,15 +335,12 @@ contain a docstring. Example:: @abstractmethod def second(self) -> int: # Method without a default implementation - """Some method.""" + raise NotImplementedError -Note that although formally the implicit return type of a method with -a trivial body is ``None``, type checker will not warn about above example, -such convention is similar to how methods are defined in stub files. Static methods, class methods, and properties are equally allowed in protocols. -To define a protocol variable, one must use PEP 526 variable +To define a protocol variable, one can use PEP 526 variable annotations in the class body. Additional attributes *only* defined in the body of a method by assignment via ``self`` are not allowed. The rationale for this is that the protocol class implementation is often not shared by @@ -357,22 +356,32 @@ Examples:: def method(self) -> None: self.temp: List[int] = [] # Error in type checker + class Concrete: + def __init__(self, name: str, value: int) -> None: + self.name = name + self.value = value + + var: Template = Concrete('value', 42) # OK + To distinguish between protocol class variables and protocol instance variables, the special ``ClassVar`` annotation should be used as specified -by PEP 526. +by PEP 526. By default, protocol variables as defined above are considered +readable and writable. To define a read-only protocol variable, one can use +an (abstract) property. Explicitly declaring implementation ----------------------------------- -To explicitly declare that a certain class implements the given protocols, -they can be used as regular base classes. In this case a class could use +To explicitly declare that a certain class implements a given protocol, +it can be used as a regular base class. In this case a class could use default implementations of protocol members. ``typing.Sequence`` is a good -example of a protocol with useful default methods. +example of a protocol with useful default methods. Static analysis tools are +expected to automatically detect that a class implements a given protocol. +So while it's possible to subclass a protocol explicitly, it's *not necessary* +to do so for the sake of type-checking. -Abstract methods with trivial bodies are recognized by type checkers as -having no default implementation and can't be used via ``super()`` in -explicit subclasses. The default implementations can not be used if +The default implementations cannot be used if the subtype relationship is implicit and only via structural subtyping -- the semantics of inheritance is not changed. Examples:: @@ -406,7 +415,7 @@ subtyping -- the semantics of inheritance is not changed. Examples:: represent(nice) # OK represent(another) # Also OK -Note that there is no conceptual difference between explicit and implicit +Note that there is little difference between explicit and implicit subtypes, the main benefit of explicit subclassing is to get some protocol methods "for free". In addition, type checkers can statically verify that the class actually implements the protocol correctly:: @@ -428,7 +437,7 @@ A class can explicitly inherit from multiple protocols and also form normal classes. In this case methods are resolved using normal MRO and a type checker verifies that all subtyping are correct. The semantics of ``@abstractmethod`` is not changed, all of them must be implemented by an explicit subclass -before it could be instantiated. +before it can be instantiated. Merging and extending protocols @@ -439,7 +448,10 @@ but a static type checker will handle them specially. Subclassing a protocol class would not turn the subclass into a protocol unless it also has ``typing.Protocol`` as an explicit base class. Without this base, the class is "downgraded" to a regular ABC that cannot be used with structural -subtyping. +subtyping. The rationale for this rule is that we don't want to accidentally +have some class act as a protocol just because one of its base classes +happens to be one. We still slightly prefer nominal subtyping over structural +subtyping in the static typing world. A subprotocol can be defined by having *both* one or more protocols as immediate base classes and also having ``typing.Protocol`` as an immediate @@ -447,24 +459,24 @@ base class:: from typing import Sized, Protocol - class SizedAndCloseable(Sized, Protocol): + class SizedAndClosable(Sized, Protocol): def close(self) -> None: ... -Now the protocol ``SizedAndCloseable`` is a protocol with two methods, +Now the protocol ``SizedAndClosable`` is a protocol with two methods, ``__len__`` and ``close``. If one omits ``Protocol`` in the base class list, this would be a regular (non-protocol) class that must implement ``Sized``. -If ``Protocol`` is included in the base class list, all the other base classes -must be protocols. A protocol can't extend a regular class. - -Alternatively, one can implement ``SizedAndCloseable`` like this, assuming -the existence of ``SupportsClose`` from the example in `definition`_ section:: +Alternatively, one can implement ``SizedAndClosable`` protocol by merging +the ``SupportsClose`` protocol from the example in the `definition`_ section +with ``typing.Sized``:: from typing import Sized - class SupportsClose(...): ... # Like above + class SupportsClose(Protocol): + def close(self) -> None: + ... - class SizedAndCloseable(Sized, SupportsClose, Protocol): + class SizedAndClosable(Sized, SupportsClose, Protocol): pass The two definitions of ``SizedAndClosable`` are equivalent. @@ -472,35 +484,86 @@ Subclass relationships between protocols are not meaningful when considering subtyping, since structural compatibility is the criterion, not the MRO. -Note that rules around explicit subclassing are different from regular ABCs, -where abstractness is simply defined by having at least one abstract method -being unimplemented. Protocol classes must be marked *explicitly*. +If ``Protocol`` is included in the base class list, all the other base classes +must be protocols. A protocol can't extend a regular class, see `rejected`_ +ideas for rationale. Note that rules around explicit subclassing are different +from regular ABCs, where abstractness is simply defined by having at least one +abstract method being unimplemented. Protocol classes must be marked +*explicitly*. -Generic and recursive protocols -------------------------------- +Generic protocols +----------------- Generic protocols are important. For example, ``SupportsAbs``, ``Iterable`` and ``Iterator`` are generic protocols. They are defined similar to normal non-protocol generic types:: - T = TypeVar('T', covariant=True) - class Iterable(Protocol[T]): @abstractmethod def __iter__(self) -> Iterator[T]: ... -Note that ``Protocol[T, S, ...]`` is allowed as a shorthand for +``Protocol[T, S, ...]`` is allowed as a shorthand for ``Protocol, Generic[T, S, ...]``. +Declaring variance is not necessary for protocol classes, since it can be +inferred from a protocol definition. Examples:: + + class Box(Protocol[T]): + def content(self) -> T: + ... + + box: Box[float] + second_box: Box[int] + box = second_box # This is OK due to the inferred covariance of 'Box'. + + class Sender(Protocol[T]): + def send(self, data: T) -> int: + ... + + sender: Sender[float] + new_sender: Sender[int] + new_sender = sender # OK, type checker finds that 'Sender' is contravariant. + + class Proto(Protocol[T]): + attr: T + + var: Proto[float] + another_var: Proto[int] + var = another_var # Error! 'Proto[float]' is incompatible with 'Proto[int]'. + + +Recursive protocols +------------------- + Recursive protocols are also supported. Forward references to the protocol class names can be given as strings as specified by PEP 484. Recursive -protocols will be useful for representing self-referential data structures +protocols are useful for representing self-referential data structures like trees in an abstract fashion:: class Traversable(Protocol): - leaves: Iterable['Traversable'] + def leaves(self) -> Iterable['Traversable']: + ... + +Note that for recursive protocols, a class is considered a subtype of +the protocol in situations where the decision depends on itself. +Continuing the previous example:: + + class SimpleTree: + def leaves(self) -> List['SimpleTree']: + ... + + root: Traversable = SimpleTree() # OK + + class Tree(Generic[T]): + def leaves(self) -> List['Tree[T]']: + ... + + def walk(graph: Traversable) -> None: + ... + tree: Tree[float] = Tree(0, []) + walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable' Using Protocols @@ -509,28 +572,17 @@ Using Protocols Subtyping relationships with other types ---------------------------------------- -Protocols cannot be instantiated, so there are no values with -protocol types. For variables and parameters with protocol types, subtyping -relationships are subject to the following rules: +Protocols cannot be instantiated, so there are no values whose +runtime type is a protocol. For variables and parameters with protocol types, +subtyping relationships are subject to the following rules: * A protocol is never a subtype of a concrete type. -* A concrete type or a protocol ``X`` is a subtype of another protocol ``P`` - if and only if ``X`` implements all protocol members of ``P``. In other - words, subtyping with respect to a protocol is always structural. -* Edge case: for recursive protocols, a class is considered a subtype of - the protocol in situations where such decision depends on itself. - Continuing the previous example:: - - class Tree(Generic[T]): - def __init__(self, value: T, - leaves: 'List[Tree[T]]') -> None: - self.value = value - self.leafs = leafs - - def walk(graph: Traversable) -> None: - ... - tree: Tree[float] = Tree(0, []) - walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable' +* A concrete type ``X`` is a subtype of protocol ``P`` + if and only if ``X`` implements all protocol members of ``P`` with + compatible types. In other words, subtyping with respect to a protocol is + always structural. +* A protocol ``P1`` is a subtype of another protocol ``P2`` if ``P1`` defines + all protocol members of ``P2`` with compatible types. Generic protocol types follow the same rules of variance as non-protocol types. Protocol types can be used in all contexts where any other types @@ -551,17 +603,17 @@ classes. For example:: class Exitable(Protocol): def exit(self) -> int: ... - class Quitable(Protocol): + class Quittable(Protocol): def quit(self) -> Optional[int]: ... - def finish(task: Union[Exitable, Quitable]) -> int: + def finish(task: Union[Exitable, Quittable]) -> int: ... - class GoodJob: + class DefaultJob: ... def quit(self) -> int: return 0 - finish(GoodJob()) # OK + finish(DefaultJob()) # OK One can use multiple inheritance to define an intersection of protocols. Example:: @@ -576,7 +628,7 @@ Example:: cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence If this will prove to be a widely used scenario, then a special -intersection type construct may be added in future as specified by PEP 483, +intersection type construct could be added in future as specified by PEP 483, see `rejected`_ ideas for more details. @@ -628,7 +680,7 @@ illusion that a distinct type is provided:: UserId = NewType('UserId', Id) # Error, can't provide distinct type -On the contrary, type aliases are fully supported, including generic type +In contrast, type aliases are fully supported, including generic type aliases:: from typing import TypeVar, Reversible, Iterable, Sized @@ -661,11 +713,11 @@ that provides the same semantics for class and instance checks as for from typing import runtime, Protocol @runtime - class Closeable(Protocol): + class Closable(Protocol): def close(self): ... - assert isinstance(open('some/file'), Closeable) + assert isinstance(open('some/file'), Closable) Static type checkers will understand ``isinstance(x, Proto)`` and ``issubclass(C, Proto)`` for protocols defined with this decorator (as they @@ -692,8 +744,9 @@ Using Protocols in Python 2.7 - 3.5 Variable annotation syntax was added in Python 3.6, so that the syntax for defining protocol variables proposed in `specification`_ section can't -be used in earlier versions. To define these in earlier versions of Python -one can use properties:: +be used if support for earlier versions is needed. To define these +in a manner compatible with older versions of Python one can use properties. +Properties can be settable and/or abstract if needed:: class Foo(Protocol): @property @@ -704,9 +757,10 @@ one can use properties:: def d(self) -> int: # ... or it can be abstract return 0 -In Python 2.7 the function type comments should be used as per PEP 484. -The ``typing`` module changes proposed in this PEP will be also -backported to earlier versions via the backport currently available on PyPI. +Also function type comments can be used as per PEP 484 (for example +to provide compatibility with Python 2). The ``typing`` module changes +proposed in this PEP will also be backported to earlier versions via the +backport currently available on PyPI. Runtime Implementation of Protocol Classes @@ -745,7 +799,6 @@ The following classes in ``typing`` module will be protocols: * ``Sequence``, ``MutableSequence`` * ``AbstractSet``, ``MutableSet`` * ``Mapping``, ``MutableMapping`` -* ``ItemsView`` (and other ``*View`` classes) * ``AsyncIterable``, ``AsyncIterator`` * ``Awaitable`` * ``Callable`` @@ -826,6 +879,33 @@ reasons: Python runtime, which won't happen. +Allow protocols subclassing normal classes +------------------------------------------ + +The main rationale to prohibit this is to preserve transitivity of subtyping, +consider this example:: + + from typing import Protocol + + class Base: + attr: str + + class Proto(Base, Protocol): + def meth(self) -> int: + ... + + class C: + attr: str + def meth(self) -> int: + return 0 + +Now, ``C`` is a subtype of ``Proto``, and ``Proto`` is a subtype of ``Base``. +But ``C`` cannot be a subtype of ``Base`` (since the latter is not +a protocol). This situation would be really weird. In addition, there is +an ambiguity about whether attributes of ``Base`` should become protocol +members of ``Proto``. + + Support optional protocol members --------------------------------- @@ -842,6 +922,66 @@ of simplicity, we propose to not support optional methods or attributes. We can always revisit this later if there is an actual need. +Allow only protocol methods and force use of getters and setters +---------------------------------------------------------------- + +One could argue that protocols typically only define methods, but not +variables. However, using getters and setters in cases where only a +simple variable is needed would be quite unpythonic. Moreover, the widespread +use of properties (that often act as type validators) in large code bases +is partially due to previous absence of static type checkers for Python, +the problem that PEP 484 and this PEP are aiming to solve. For example:: + + # without static types + + class MyClass: + @property + def my_attr(self): + return self._my_attr + @my_attr.setter + def my_attr(self, value): + if not isinstance(value, int): + raise ValidationError("An integer expected for my_attr") + self._my_attr = value + + # with static types + + class MyClass: + my_attr: int + + +Support non-protocol members +---------------------------- + +There was an idea to make some methods "non-protocol" (i.e. not necessary +to implement, and inherited in explicit subclassing), but it was rejected, +since this complicates things. For example, consider this situation:: + + class Proto(Protocol): + @abstractmethod + def first(self) -> int: + raise NotImplementedError + def second(self) -> int: + return self.first() + 1 + + def fun(arg: Proto) -> None: + arg.second() + +The question is should this be an error? We think most people would expect +this to be valid. Therefore, to be on the safe side, we need to require both +methods to be implemented in implicit subclasses. In addition, if one looks +at definitions in ``collections.abc``, there are very few methods that could +be considered "non-protocol". Therefore, it was decided to not introduce +"non-protocol" methods. + +There is only one downside to this: it will require some boilerplate for +implicit subtypes of ``Mapping`` and few other "large" protocols. But, this +applies to few "built-in" protocols (like ``Mapping`` and ``Sequence``) and +people are already subclassing them. Also, such style is discouraged for +user-defined protocols. It is recommended to create compact protocols and +combine them. + + Make protocols interoperable with other approaches -------------------------------------------------- @@ -862,7 +1002,7 @@ as protocols and make simple structural checks with respect to them. Use assignments to check explicitly that a class implements a protocol ---------------------------------------------------------------------- -In Go language the explicit checks for implementation are performed +In the Go language the explicit checks for implementation are performed via dummy assignments [golang]_. Such a way is also possible with the current proposal. Example:: @@ -949,6 +1089,62 @@ this in type checkers for non-protocol classes could be difficult. Finally, it will be very easy to add this later if needed. +Prohibit explicit subclassing of protocols by non-protocols +----------------------------------------------------------- + +This was rejected for the following reasons: + +* Backward compatibility: People are already using ABCs, including generic + ABCs from ``typing`` module. If we prohibit explicit subclassing of these + ABCs, then quite a lot of code will break. + +* Convenience: There are existing protocol-like ABCs (that will be turned + into protocols) that have many useful "mix-in" (non-abstract) methods. + For example in the case of ``Sequence`` one only needs to implement + ``__getitem__`` and ``__len__`` in an explicit subclass, and one gets + ``__iter__``, ``__contains__``, ``__reversed__``, ``index``, and ``count`` + for free. + +* Explicit subclassing makes it explicit that a class implements a particular + protocol, making subtyping relationships easier to see. + +* Type checkers can warn about missing protocol members or members with + incompatible types more easily, without having to use hacks like dummy + assignments discussed above in this section. + +* Explicit subclassing makes it possible to force a class to be considered + a subtype of a protocol (by using ``# type: ignore`` together with an + explicit base class) when it is not strictly compatible, such as when + it has an unsafe override. + + +Backwards Compatibility +======================= + +This PEP is almost fully backwards compatible. Few collection classes such as +``Sequence`` and ``Mapping`` will be turned into runtime protocols, therefore +results of ``isinstance()`` checks are going to change in some edge cases. +For example, a class that implements the ``Sequence`` protocol but does not +explicitly inherit from ``Sequence`` currently returns ``False`` in +corresponding instance and class checks. With this PEP implemented, such +checks will return ``True``. + + +Implementation +============== + +A working implementation of this PEP for ``mypy`` type checker is found on +GitHub repo at https://github.com/ilevkivskyi/mypy/tree/protocols, +corresponding ``typeshed`` stubs for more flavor are found at +https://github.com/ilevkivskyi/typeshed/tree/protocols. Installation steps:: + + git clone --recurse-submodules https://github.com/ilevkivskyi/mypy/ + cd mypy && git checkout protocols && cd typeshed + git remote add proto https://github.com/ilevkivskyi/typeshed + git fetch proto && git checkout proto/protocols + cd .. && git add typeshed && sudo python3 -m pip install -U . + + References ========== @@ -973,6 +1169,9 @@ References .. [golang] https://golang.org/doc/effective_go.html#interfaces_and_types +.. [data-model] + https://docs.python.org/3/reference/datamodel.html#special-method-names + .. [typeshed] https://github.com/python/typeshed/