416 lines
15 KiB
Plaintext
416 lines
15 KiB
Plaintext
PEP: 443
|
||
Title: Single-dispatch generic functions
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Łukasz Langa <lukasz@langa.pl>
|
||
Discussions-To: Python-Dev <python-dev@python.org>
|
||
Status: Final
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 22-May-2013
|
||
Post-History: 22-May-2013, 25-May-2013, 31-May-2013
|
||
Replaces: 245, 246, 3124
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes a new mechanism in the ``functools`` standard library
|
||
module that provides a simple form of generic programming known as
|
||
single-dispatch generic functions.
|
||
|
||
A **generic function** is composed of multiple functions implementing
|
||
the same operation for different types. Which implementation should be
|
||
used during a call is determined by the dispatch algorithm. When the
|
||
implementation is chosen based on the type of a single argument, this is
|
||
known as **single dispatch**.
|
||
|
||
|
||
Rationale and Goals
|
||
===================
|
||
|
||
Python has always provided a variety of built-in and standard-library
|
||
generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``,
|
||
``copy.copy()``, and most of the functions in the ``operator`` module.
|
||
However, it currently:
|
||
|
||
1. does not have a simple or straightforward way for developers to
|
||
create new generic functions,
|
||
|
||
2. does not have a standard way for methods to be added to existing
|
||
generic functions (i.e., some are added using registration
|
||
functions, others require defining ``__special__`` methods, possibly
|
||
by monkeypatching).
|
||
|
||
In addition, it is currently a common anti-pattern for Python code to
|
||
inspect the types of received arguments, in order to decide what to do
|
||
with the objects.
|
||
|
||
For example, code may wish to accept either an object
|
||
of some type, or a sequence of objects of that type.
|
||
Currently, the "obvious way" to do this is by type inspection, but this
|
||
is brittle and closed to extension.
|
||
|
||
Abstract Base Classes make it easier
|
||
to discover present behaviour, but don't help adding new behaviour.
|
||
A developer using an already-written library may be unable to change how
|
||
their objects are treated by such code, especially if the objects they
|
||
are using were created by a third party.
|
||
|
||
Therefore, this PEP proposes a uniform API to address dynamic
|
||
overloading using decorators.
|
||
|
||
|
||
User API
|
||
========
|
||
|
||
To define a generic function, decorate it with the ``@singledispatch``
|
||
decorator. Note that the dispatch happens on the type of the first
|
||
argument. Create your function accordingly::
|
||
|
||
>>> from functools import singledispatch
|
||
>>> @singledispatch
|
||
... def fun(arg, verbose=False):
|
||
... if verbose:
|
||
... print("Let me just say,", end=" ")
|
||
... print(arg)
|
||
|
||
To add overloaded implementations to the function, use the
|
||
``register()`` attribute of the generic function. This is a decorator,
|
||
taking a type parameter and decorating a function implementing the
|
||
operation for that type::
|
||
|
||
>>> @fun.register(int)
|
||
... def _(arg, verbose=False):
|
||
... if verbose:
|
||
... print("Strength in numbers, eh?", end=" ")
|
||
... print(arg)
|
||
...
|
||
>>> @fun.register(list)
|
||
... def _(arg, verbose=False):
|
||
... if verbose:
|
||
... print("Enumerate this:")
|
||
... for i, elem in enumerate(arg):
|
||
... print(i, elem)
|
||
|
||
To enable registering lambdas and pre-existing functions, the
|
||
``register()`` attribute can be used in a functional form::
|
||
|
||
>>> def nothing(arg, verbose=False):
|
||
... print("Nothing.")
|
||
...
|
||
>>> fun.register(type(None), nothing)
|
||
|
||
The ``register()`` attribute returns the undecorated function. This
|
||
enables decorator stacking, pickling, as well as creating unit tests for
|
||
each variant independently::
|
||
|
||
>>> @fun.register(float)
|
||
... @fun.register(Decimal)
|
||
... def fun_num(arg, verbose=False):
|
||
... if verbose:
|
||
... print("Half of your number:", end=" ")
|
||
... print(arg / 2)
|
||
...
|
||
>>> fun_num is fun
|
||
False
|
||
|
||
When called, the generic function dispatches on the type of the first
|
||
argument::
|
||
|
||
>>> fun("Hello, world.")
|
||
Hello, world.
|
||
>>> fun("test.", verbose=True)
|
||
Let me just say, test.
|
||
>>> fun(42, verbose=True)
|
||
Strength in numbers, eh? 42
|
||
>>> fun(['spam', 'spam', 'eggs', 'spam'], verbose=True)
|
||
Enumerate this:
|
||
0 spam
|
||
1 spam
|
||
2 eggs
|
||
3 spam
|
||
>>> fun(None)
|
||
Nothing.
|
||
>>> fun(1.23)
|
||
0.615
|
||
|
||
Where there is no registered implementation for a specific type, its
|
||
method resolution order is used to find a more generic implementation.
|
||
The original function decorated with ``@singledispatch`` is registered
|
||
for the base ``object`` type, which means it is used if no better
|
||
implementation is found.
|
||
|
||
To check which implementation will the generic function choose for
|
||
a given type, use the ``dispatch()`` attribute::
|
||
|
||
>>> fun.dispatch(float)
|
||
<function fun_num at 0x104319058>
|
||
>>> fun.dispatch(dict) # note: default implementation
|
||
<function fun at 0x103fe0000>
|
||
|
||
To access all registered implementations, use the read-only ``registry``
|
||
attribute::
|
||
|
||
>>> fun.registry.keys()
|
||
dict_keys([<class 'NoneType'>, <class 'int'>, <class 'object'>,
|
||
<class 'decimal.Decimal'>, <class 'list'>,
|
||
<class 'float'>])
|
||
>>> fun.registry[float]
|
||
<function fun_num at 0x1035a2840>
|
||
>>> fun.registry[object]
|
||
<function fun at 0x103fe0000>
|
||
|
||
The proposed API is intentionally limited and opinionated, as to ensure
|
||
it is easy to explain and use, as well as to maintain consistency with
|
||
existing members in the ``functools`` module.
|
||
|
||
|
||
Implementation Notes
|
||
====================
|
||
|
||
The functionality described in this PEP is already implemented in the
|
||
``pkgutil`` standard library module as ``simplegeneric``. Because this
|
||
implementation is mature, the goal is to move it largely as-is. The
|
||
reference implementation is available on hg.python.org [#ref-impl]_.
|
||
|
||
The dispatch type is specified as a decorator argument. An alternative
|
||
form using function annotations was considered but its inclusion
|
||
has been rejected. As of May 2013, this usage pattern is out of scope
|
||
for the standard library [#pep-0008]_, and the best practices for
|
||
annotation usage are still debated.
|
||
|
||
Based on the current ``pkgutil.simplegeneric`` implementation, and
|
||
following the convention on registering virtual subclasses on Abstract
|
||
Base Classes, the dispatch registry will not be thread-safe.
|
||
|
||
Abstract Base Classes
|
||
---------------------
|
||
|
||
The ``pkgutil.simplegeneric`` implementation relied on several forms of
|
||
method resultion order (MRO). ``@singledispatch`` removes special
|
||
handling of old-style classes and Zope's ExtensionClasses. More
|
||
importantly, it introduces support for Abstract Base Classes (ABC).
|
||
|
||
When a generic function implementation is registered for an ABC, the
|
||
dispatch algorithm switches to a mode of MRO calculation for the
|
||
provided argument which includes the relevant ABCs. The algorithm is as
|
||
follows::
|
||
|
||
def _compose_mro(cls, haystack):
|
||
"""Calculates the MRO for a given class `cls`, including relevant
|
||
abstract base classes from `haystack`."""
|
||
bases = set(cls.__mro__)
|
||
mro = list(cls.__mro__)
|
||
for regcls in haystack:
|
||
if regcls in bases or not issubclass(cls, regcls):
|
||
continue # either present in the __mro__ or unrelated
|
||
for index, base in enumerate(mro):
|
||
if not issubclass(base, regcls):
|
||
break
|
||
if base in bases and not issubclass(regcls, base):
|
||
# Conflict resolution: put classes present in __mro__
|
||
# and their subclasses first.
|
||
index += 1
|
||
mro.insert(index, regcls)
|
||
return mro
|
||
|
||
In its most basic form, it returns the MRO for the given type::
|
||
|
||
>>> _compose_mro(dict, [])
|
||
[<class 'dict'>, <class 'object'>]
|
||
|
||
When the haystack consists of ABCs that the specified type is a subclass
|
||
of, they are inserted in a predictable order::
|
||
|
||
>>> _compose_mro(dict, [Sized, MutableMapping, str,
|
||
... Sequence, Iterable])
|
||
[<class 'dict'>, <class 'collections.abc.MutableMapping'>,
|
||
<class 'collections.abc.Iterable'>, <class 'collections.abc.Sized'>,
|
||
<class 'object'>]
|
||
|
||
While this mode of operation is significantly slower, all dispatch
|
||
decisions are cached. The cache is invalidated on registering new
|
||
implementations on the generic function or when user code calls
|
||
``register()`` on an ABC to register a new virtual subclass. In the
|
||
latter case, it is possible to create a situation with ambiguous
|
||
dispatch, for instance::
|
||
|
||
>>> from collections import Iterable, Container
|
||
>>> class P:
|
||
... pass
|
||
>>> Iterable.register(P)
|
||
<class '__main__.P'>
|
||
>>> Container.register(P)
|
||
<class '__main__.P'>
|
||
|
||
Faced with ambiguity, ``@singledispatch`` refuses the temptation to
|
||
guess::
|
||
|
||
>>> @singledispatch
|
||
... def g(arg):
|
||
... return "base"
|
||
...
|
||
>>> g.register(Iterable, lambda arg: "iterable")
|
||
<function <lambda> at 0x108b49110>
|
||
>>> g.register(Container, lambda arg: "container")
|
||
<function <lambda> at 0x108b491c8>
|
||
>>> g(P())
|
||
Traceback (most recent call last):
|
||
...
|
||
RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
|
||
or <class 'collections.abc.Iterable'>
|
||
|
||
Note that this exception would not be raised if ``Iterable`` and
|
||
``Container`` had been provided as base classes during class definition.
|
||
In this case dispatch happens in the MRO order::
|
||
|
||
>>> class Ten(Iterable, Container):
|
||
... def __iter__(self):
|
||
... for i in range(10):
|
||
... yield i
|
||
... def __contains__(self, value):
|
||
... return value in range(10)
|
||
...
|
||
>>> g(Ten())
|
||
'iterable'
|
||
|
||
|
||
Usage Patterns
|
||
==============
|
||
|
||
This PEP proposes extending behaviour only of functions specifically
|
||
marked as generic. Just as a base class method may be overridden by
|
||
a subclass, so too a function may be overloaded to provide custom
|
||
functionality for a given type.
|
||
|
||
Universal overloading does not equal *arbitrary* overloading, in the
|
||
sense that we need not expect people to randomly redefine the behavior
|
||
of existing functions in unpredictable ways. To the contrary, generic
|
||
function usage in actual programs tends to follow very predictable
|
||
patterns and registered implementations are highly-discoverable in the
|
||
common case.
|
||
|
||
If a module is defining a new generic operation, it will usually also
|
||
define any required implementations for existing types in the same
|
||
place. Likewise, if a module is defining a new type, then it will
|
||
usually define implementations there for any generic functions that it
|
||
knows or cares about. As a result, the vast majority of registered
|
||
implementations can be found adjacent to either the function being
|
||
overloaded, or to a newly-defined type for which the implementation is
|
||
adding support.
|
||
|
||
It is only in rather infrequent cases that one will have implementations
|
||
registered in a module that contains neither the function nor the
|
||
type(s) for which the implementation is added. In the absence of
|
||
incompetence or deliberate intention to be obscure, the few
|
||
implementations that are not registered adjacent to the relevant type(s)
|
||
or function(s), will generally not need to be understood or known about
|
||
outside the scope where those implementations are defined. (Except in
|
||
the "support modules" case, where best practice suggests naming them
|
||
accordingly.)
|
||
|
||
As mentioned earlier, single-dispatch generics are already prolific
|
||
throughout the standard library. A clean, standard way of doing them
|
||
provides a way forward to refactor those custom implementations to use
|
||
a common one, opening them up for user extensibility at the same time.
|
||
|
||
|
||
Alternative approaches
|
||
======================
|
||
|
||
In PEP 3124 [#pep-3124]_ Phillip J. Eby proposes a full-grown solution
|
||
with overloading based on arbitrary rule sets (with the default
|
||
implementation dispatching on argument types), as well as interfaces,
|
||
adaptation and method combining. PEAK-Rules [#peak-rules]_ is
|
||
a reference implementation of the concepts described in PJE's PEP.
|
||
|
||
Such a broad approach is inherently complex, which makes reaching
|
||
a consensus hard. In contrast, this PEP focuses on a single piece of
|
||
functionality that is simple to reason about. It's important to note
|
||
this does not preclude the use of other approaches now or in the future.
|
||
|
||
In a 2005 article on Artima [#artima2005]_ Guido van Rossum presents
|
||
a generic function implementation that dispatches on types of all
|
||
arguments on a function. The same approach was chosen in Andrey Popp's
|
||
``generic`` package available on PyPI [#pypi-generic]_, as well as David
|
||
Mertz's ``gnosis.magic.multimethods`` [#gnosis-multimethods]_.
|
||
|
||
While this seems desirable at first, I agree with Fredrik Lundh's
|
||
comment that "if you design APIs with pages of logic just to sort out
|
||
what code a function should execute, you should probably hand over the
|
||
API design to someone else". In other words, the single argument
|
||
approach proposed in this PEP is not only easier to implement but also
|
||
clearly communicates that dispatching on a more complex state is an
|
||
anti-pattern. It also has the virtue of corresponding directly with the
|
||
familiar method dispatch mechanism in object oriented programming. The
|
||
only difference is whether the custom implementation is associated more
|
||
closely with the data (object-oriented methods) or the algorithm
|
||
(single-dispatch overloading).
|
||
|
||
PyPy's RPython offers ``extendabletype`` [#pairtype]_, a metaclass which
|
||
enables classes to be externally extended. In combination with
|
||
``pairtype()`` and ``pair()`` factories, this offers a form of
|
||
single-dispatch generics.
|
||
|
||
|
||
Acknowledgements
|
||
================
|
||
|
||
Apart from Phillip J. Eby's work on PEP 3124 [#pep-3124]_ and
|
||
PEAK-Rules, influences include Paul Moore's original issue
|
||
[#issue-5135]_ that proposed exposing ``pkgutil.simplegeneric`` as part
|
||
of the ``functools`` API, Guido van Rossum's article on multimethods
|
||
[#artima2005]_, and discussions with Raymond Hettinger on a general
|
||
pprint rewrite. Huge thanks to Nick Coghlan for encouraging me to create
|
||
this PEP and providing initial feedback.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [#ref-impl]
|
||
http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l359
|
||
|
||
.. [#pep-0008] PEP 8 states in the "Programming Recommendations"
|
||
section that "the Python standard library will not use function
|
||
annotations as that would result in a premature commitment to
|
||
a particular annotation style".
|
||
(http://www.python.org/dev/peps/pep-0008)
|
||
|
||
.. [#pep-3124] http://www.python.org/dev/peps/pep-3124/
|
||
|
||
.. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules
|
||
|
||
.. [#artima2005]
|
||
http://www.artima.com/weblogs/viewpost.jsp?thread=101605
|
||
|
||
.. [#pypi-generic] http://pypi.python.org/pypi/generic
|
||
|
||
.. [#gnosis-multimethods]
|
||
http://gnosis.cx/publish/programming/charming_python_b12.html
|
||
|
||
.. [#pairtype]
|
||
https://bitbucket.org/pypy/pypy/raw/default/rpython/tool/pairtype.py
|
||
|
||
.. [#issue-5135] http://bugs.python.org/issue5135
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|
||
|
||
|