PEP 622: Drop __match__ from the spec (#1484)

Brandt Bucher 2020-07-01 08:37:47 -07:00 committed by GitHub
parent 31d5ffe9ca
commit 74c81defd0
1 changed file with 43 additions and 115 deletions

@ -100,12 +100,11 @@ the proposed code will work without any modifications to the definition of
See the `syntax`_ sections below for a more detailed specification. See the `syntax`_ sections below for a more detailed specification.
Similarly to how constructing objects can be customized by a user-defined We propose that destructuring
``__init__()`` method, we propose that destructuring objects can be customized objects can be customized by a new special ``__match_args__``
by a new special ``__match__()`` method. As part of this PEP we specify the attribute. As part of this PEP we specify the general API and its
general ``__match__()`` API, its implementation for ``object.__match__()``, implementation for some standard library classes (including named
and for some standard library classes (including PEP 557 dataclasses). See tuples and dataclasses). See the `runtime`_ section below.
`runtime`_ section below.
Finally, we aim to provide comprehensive support for static type checkers Finally, we aim to provide comprehensive support for static type checkers
and similar tools. For this purpose we propose to introduce a and similar tools. For this purpose we propose to introduce a
@ -415,13 +414,11 @@ example::
case Rectangle(x0, y0, x1, y1, painted=True): case Rectangle(x0, y0, x1, y1, painted=True):
... ...
Whether a match succeeds or not is determined by calling a special Whether a match succeeds or not is determined by the equivalent of an
``__match__()`` method on the class named in the pattern ``isinstance`` call. If the target (``shape``, in the example) is not
(``Point`` and ``Rectangle`` in the example), an instance of the named class (``Point`` or ``Rectangle``), the match
with the value being matched (``shape``) as the only argument. fails. Otherwise, it continues (see details in the `runtime`_
If the method returns ``None``, the match fails, otherwise the section).
match continues w.r.t. attributes of the returned proxy object, see details
in `runtime`_ section.
The named class must inherit from ``type``. It may be a single name The named class must inherit from ``type``. It may be a single name
or a dotted name (e.g. ``some_mod.SomeClass`` or ``mod.pkg.Class``). or a dotted name (e.g. ``some_mod.SomeClass`` or ``mod.pkg.Class``).
@ -429,12 +426,12 @@ The leading name must not be ``_``, so e.g. ``_(...)`` and
``_.C(...)`` are invalid. Use ``object(foo=_)`` to check whether the ``_.C(...)`` are invalid. Use ``object(foo=_)`` to check whether the
matched object has an attribute ``foo``. matched object has an attribute ``foo``.
This PEP only fully specifies the behavior of ``__match__()`` for ``object`` By default, sub-patterns may only be matched by keyword for
and some builtin and standard library classes, custom classes are only user-defined classes. In order to support positional sub-patterns, a
required to follow the protocol specified in `runtime`_ section. After all, custom ``__match_args__`` attribute is required.
the authors of a class know best how to "revert" the logic of the The runtime allows matching against
``__init__()`` they wrote. The runtime will then chain these calls to allow arbitrarily nested patterns by chaining all of the instance checks and
matching against arbitrarily nested patterns. attribute lookups appropriately.
Combining multiple patterns Combining multiple patterns
@ -544,24 +541,20 @@ equally.
Runtime specification Runtime specification
===================== =====================
The ``__match__()`` protocol The Match Protocol
---------------------------- ------------------
TODO: Show equivalent pseudo code. The equivalent of an ``isinstance`` call is used to decide whether an
object matches a given class pattern and to extract the corresponding
The ``__match__()`` method is used to decide whether an object matches attributes. Classes requiring different matching semantics (such as
a given class pattern and to extract the corresponding attributes. It duck-typing) can do so by defining ``__instancecheck__`` (a
must be a class method or a static method returning an object pre-existing metaclass hook) or by using ``typing.Protocol``.
(typically the same as the argument), or ``None`` to indicate that no
match is possible. (More about the return value in the next section.)
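The two customization routes named in the new text can be sketched with plain ``isinstance`` calls. This is a minimal illustration, not part of the PEP: ``QuackerMeta``, ``Quacker``, ``SupportsQuack`` and ``Duck`` are hypothetical names, and the sketch assumes that a class pattern honors whatever ``isinstance`` reports.

```python
from typing import Protocol, runtime_checkable


class QuackerMeta(type):
    """Hypothetical metaclass: any object with a callable ``quack`` counts as an instance."""

    def __instancecheck__(cls, instance):
        return callable(getattr(instance, "quack", None))


class Quacker(metaclass=QuackerMeta):
    """Hypothetical class that exists only to be named in a class pattern."""


@runtime_checkable
class SupportsQuack(Protocol):
    """Hypothetical structural type; isinstance() checks that ``quack`` is present."""

    def quack(self) -> str: ...


class Duck:
    """Unrelated to both classes above by inheritance."""

    def quack(self) -> str:
        return "Quack!"


# Both checks succeed although Duck inherits from neither class.
duck_is_quacker = isinstance(Duck(), Quacker)
duck_supports_quack = isinstance(Duck(), SupportsQuack)
```

Under that assumption, a pattern such as ``Quacker()`` or ``SupportsQuack()`` would accept any object that passes the corresponding check.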
The procedure is as follows: The procedure is as follows:
* The class object for ``Class`` in ``Class(<sub-patterns>)`` is looked up and * The class object for ``Class`` in ``Class(<sub-patterns>)`` is
``Class.__match__(obj)`` is called where ``obj`` is the value being matched. looked up and ``isinstance(obj, Class)`` is called, where ``obj`` is
the value being matched. If false, the match fails.
* If the result of the call (which we are referring to as "match proxy") is
``None``, the match fails.
* Otherwise, if any sub-patterns are given in the form of positional * Otherwise, if any sub-patterns are given in the form of positional
or keyword arguments, these are matched from left to right, as or keyword arguments, these are matched from left to right, as
@ -585,7 +578,7 @@ The procedure is as following:
refuse the temptation to guess." refuse the temptation to guess."
* If there are any match-by-keyword items the keywords are looked up * If there are any match-by-keyword items the keywords are looked up
as attributes on the proxy. If the lookup succeeds the value is as attributes on the target. If the lookup succeeds the value is
matched against the corresponding sub-pattern. If the lookup fails, matched against the corresponding sub-pattern. If the lookup fails,
the match fails. the match fails.
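The procedure above can be approximated in ordinary Python. The following is a rough sketch, not the specified algorithm: ``match_class`` and ``Point`` are hypothetical, sub-patterns are modeled as simple predicates, and details such as a name given both positionally and by keyword are not handled.

```python
def match_class(obj, cls, *positional, **keyword):
    """Return True if ``obj`` matches ``cls(<sub-patterns>)``.

    Each sub-pattern is modeled as a predicate taking the attribute value.
    """
    # Step 1: the equivalent of an isinstance call decides whether to continue.
    if not isinstance(obj, cls):
        return False

    # Step 2: positional sub-patterns are mapped to attribute names through
    # __match_args__ (assumed here to be a tuple of strings on the class).
    names = getattr(cls, "__match_args__", ())
    if len(positional) > len(names):
        raise TypeError("too many positional sub-patterns")
    subpatterns = dict(zip(names, positional))
    subpatterns.update(keyword)

    # Step 3: each name is looked up on the target; a failed lookup or a
    # failed sub-pattern fails the whole match.
    missing = object()
    for name, predicate in subpatterns.items():
        value = getattr(obj, name, missing)
        if value is missing or not predicate(value):
            return False
    return True


class Point:
    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def is_zero(value):
    return value == 0


def anything(value):
    return True
```

For instance, ``match_class(Point(0, 5), Point, is_zero, anything)`` plays the role of the pattern ``Point(0, _)``, and ``match_class(Point(0, 5), Point, z=anything)`` fails because the target has no attribute ``z``.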
@ -597,7 +590,7 @@ For the most commonly-matched built-in types (``bool``,
``frozenset``, ``int``, ``list``, ``set``, ``str``, and ``tuple``), a ``frozenset``, ``int``, ``list``, ``set``, ``str``, and ``tuple``), a
single positional sub-pattern is allowed to be passed to single positional sub-pattern is allowed to be passed to
the call. Rather than being matched against any particular attribute the call. Rather than being matched against any particular attribute
on the proxy, it is instead matched against the proxy itself. This on the target, it is instead matched against the target itself. This
creates behavior that is useful and intuitive for these objects: creates behavior that is useful and intuitive for these objects:
* ``bool(False)`` matches ``False`` (but not ``0``). * ``bool(False)`` matches ``False`` (but not ``0``).
@ -605,34 +598,6 @@ creates behavior that is useful and intuitive for these objects:
* ``int(i)`` matches any ``int`` and binds it to the name ``i``. * ``int(i)`` matches any ``int`` and binds it to the name ``i``.
Result value of ``__match__()``
-------------------------------
If a match is successful, the ``__match__()`` method should return an object
whose attribute values will then be bound to the corresponding keyword argument
names in the pattern after the match is complete. For each possible name that is
legal in the match pattern, the returned object should have a corresponding attribute
with that name, that can be used to access that value.
(Positional sub-patterns are matched to keyword sub-patterns using
``__match_args__`` as shown in the previous section.)
For most ordinary objects, this returned object can simply be the original object,
unchanged.
However, there may be cases where the internal implementation of a class is
very different than its public representation, for example a ``Point`` class with
`x`, `y` and `z` attributes may be represented internally as a vector; in such cases
a 'proxy object' may be returned whose attributes correspond to the matchable names.
There is no requirement that the attributes on the proxy object be the same type or
value as the attributes of the original object; one envisioned use case is for
expensive-to-compute properties to be computed lazily on the proxy object via
property getters.
In deciding what names should be available for matching, the recommended practice
is that class patterns should be the mirror of construction; that is, the set of
available names and their types should resemble the arguments to ``__init__()``.
Ambiguous matches Ambiguous matches
----------------- -----------------
@ -650,40 +615,19 @@ described in the previous subsection:
Special attribute ``__match_args__`` Special attribute ``__match_args__``
------------------------------------ ------------------------------------
The ``__match_args__`` attribute complements the ``__match__`` method and is The ``__match_args__`` attribute is always looked up on the type
always looked up on the same class as the ``__match__`` method. object named in the pattern. If present, it must be a list or tuple
``__match_args__``, if it is present, must be a list or of strings naming the allowed positional arguments.
tuple of strings naming the allowed positional arguments.
In deciding what names should be available for matching, the
recommended practice is that class patterns should be the mirror of
construction; that is, the set of available names and their types
should resemble the arguments to ``__init__()``.
Default ``object.__match__()`` Only match-by-name will work by default, and classes should define
------------------------------ ``__match_args__`` as a class attribute if they would like to support
match-by-position. Additionally, dataclasses and named tuples will
The default implementation aims at providing a basic, useful (but still safe) support match-by-position out of the box. See below for more details.
experience with pattern matching out of the box. For this purpose the default
``__match__()`` method follows this logic (pseudo-code)::
class object:
@classmethod
def __match__(cls, instance):
if isinstance(instance, cls):
return instance
This means that pattern matching is allowed by default for every class. If
a class wants to disallow pattern matching against itself, it should define
``__match__ = None``. This will cause an exception when trying to match
against such a class.
The above implementation means that by default only match-by-name will
work,
and classes should define ``__match_args__`` (e.g. as a class
attribute) if they would like to support match-by-position. Additionally,
dataclasses and named tuples will support match-by-position out of the box. See below for more
details.
Finally, all attributes are exposed for matching, if a class wants to hide
some attributes from matching against them, a custom ``__match__()`` method is
required.
The standard library The standard library
@ -699,9 +643,9 @@ the standard library:
``__init__()`` method. This includes the situations where attributes are ``__init__()`` method. This includes the situations where attributes are
inherited from a superclass. inherited from a superclass.
In addition, a systematic effort will be put into going through existing In addition, a systematic effort will be put into going through
standard library classes and adding custom ``__match__()`` and/or existing standard library classes and adding ``__match_args__`` where
``__match_args__`` where it looks beneficial. it looks beneficial.
.. _static checkers: .. _static checkers:
@ -931,18 +875,6 @@ productivity at the expense of additional CPU cycles, it would be
unfortunate if the benefits of ``match`` were counter-balanced by a significant unfortunate if the benefits of ``match`` were counter-balanced by a significant
overall decrease in runtime performance. overall decrease in runtime performance.
That being said, because of the flexibility of ``match``, and the fact that
it can be customized via the ``__match__`` callback, there is some overhead
involved with calling these methods. Exactly how much cost this will entail
will be implementation-dependent.
In this design, an attempt has been made to avoid putting too much of a
computational burden on the ``__match__`` method. In particular, earlier
versions of the design required a custom matcher to completely re-implement
most of the pattern-matching logic that would have been performed by the VM.
The current design eschews this flexibility in favor of a simpler, faster
custom match protocol.
Although this PEP does not specify any particular implementation strategy, Although this PEP does not specify any particular implementation strategy,
a few words about the prototype implementation and how it attempts to a few words about the prototype implementation and how it attempts to
maximize performance are in order. maximize performance are in order.
@ -954,16 +886,12 @@ logic for testing instance types, sequence lengths, mapping keys and
so on are inlined in place of the ``match``. so on are inlined in place of the ``match``.
This is not the only possible strategy, nor is it necessarily the best. This is not the only possible strategy, nor is it necessarily the best.
For example, the call to ``__match__`` could be memoized, especially For example, the instance checks could be memoized, especially
if there are multiple instances of the same class type but with different if there are multiple instances of the same class type but with different
arguments in a single match statement. It is also theoretically arguments in a single match statement. It is also theoretically
possible for a future implementation to process the case clauses in possible for a future implementation to process the case clauses in
parallel using a decision tree rather than testing them one by one. parallel using a decision tree rather than testing them one by one.
For this reason, implementers of ``__match__`` should not make any
assumptions about the number of times or the order in which ``__match__``
is called.
Backwards Compatibility Backwards Compatibility
======================= =======================