PEP 622: Drop __match__ from the spec (#1484)

Brandt Bucher 2020-07-01 08:37:47 -07:00 committed by GitHub
parent 31d5ffe9ca
commit 74c81defd0
1 changed files with 43 additions and 115 deletions


@ -100,12 +100,11 @@ the proposed code will work without any modifications to the definition of
See the `syntax`_ sections below for a more detailed specification.
Similarly to how constructing objects can be customized by a user-defined
``__init__()`` method, we propose that destructuring objects can be customized
by a new special ``__match__()`` method. As part of this PEP we specify the
general ``__match__()`` API, its implementation for ``object.__match__()``,
and for some standard library classes (including PEP 557 dataclasses). See
`runtime`_ section below.
We propose that destructuring
objects can be customized by a new special ``__match_args__``
attribute. As part of this PEP we specify the general API and its
implementation for some standard library classes (including named
tuples and dataclasses). See the `runtime`_ section below.
Finally, we aim to provide comprehensive support for static type checkers
and similar tools. For this purpose we propose to introduce a
@ -415,13 +414,11 @@ example::
case Rectangle(x0, y0, x1, y1, painted=True):
...
Whether a match succeeds or not is determined by calling a special
``__match__()`` method on the class named in the pattern
(``Point`` and ``Rectangle`` in the example),
with the value being matched (``shape``) as the only argument.
If the method returns ``None``, the match fails, otherwise the
match continues w.r.t. attributes of the returned proxy object, see details
in `runtime`_ section.
Whether a match succeeds or not is determined by the equivalent of an
``isinstance`` call. If the target (``shape``, in the example) is not
an instance of the named class (``Point`` or ``Rectangle``), the match
fails. Otherwise, it continues (see details in the `runtime`_
section).
The named class must be an instance of ``type``. It may be a single name
or a dotted name (e.g. ``some_mod.SomeClass`` or ``mod.pkg.Class``).
@ -429,12 +426,12 @@ The leading name must not be ``_``, so e.g. ``_(...)`` and
``_.C(...)`` are invalid. Use ``object(foo=_)`` to check whether the
matched object has an attribute ``foo``.
This PEP only fully specifies the behavior of ``__match__()`` for ``object``
and some builtin and standard library classes; custom classes are only
required to follow the protocol specified in `runtime`_ section. After all,
the authors of a class know best how to "revert" the logic of the
``__init__()`` they wrote. The runtime will then chain these calls to allow
matching against arbitrarily nested patterns.
By default, sub-patterns may only be matched by keyword for
user-defined classes. In order to support positional sub-patterns, a
custom ``__match_args__`` attribute is required.
The runtime allows matching against
arbitrarily nested patterns by chaining all of the instance checks and
attribute lookups appropriately.
Combining multiple patterns
@ -544,24 +541,20 @@ equally.
Runtime specification
=====================
The ``__match__()`` protocol
----------------------------
The Match Protocol
------------------
TODO: Show equivalent pseudo code.
The ``__match__()`` method is used to decide whether an object matches
a given class pattern and to extract the corresponding attributes. It
must be a class method or a static method returning an object
(typically the same as the argument), or ``None`` to indicate that no
match is possible. (More about the return value in the next section.)
The equivalent of an ``isinstance`` call is used to decide whether an
object matches a given class pattern; the corresponding attributes are
then looked up on the object itself. Classes requiring different
matching semantics (such as duck-typing) can provide them by defining
``__instancecheck__`` (a pre-existing metaclass hook) or by using
``typing.Protocol``.
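
As an illustration of the ``__instancecheck__`` route, the following sketch shows a duck-typed instance check. ``QuackerMeta``, ``Quacker``, ``Duck`` and ``Rock`` are hypothetical names, and the rule "has a callable ``quack`` attribute" is only an assumed example of custom semantics:

```python
class QuackerMeta(type):
    """Hypothetical metaclass: anything with a callable quack() counts."""

    def __instancecheck__(cls, obj):
        return callable(getattr(obj, "quack", None))


class Quacker(metaclass=QuackerMeta):
    """Never subclassed below; used purely as a matching target."""


class Duck:
    def quack(self):
        return "Quack!"


class Rock:
    pass


# isinstance() consults QuackerMeta.__instancecheck__ for both objects.
print(isinstance(Duck(), Quacker))
print(isinstance(Rock(), Quacker))
```

Because a class pattern is specified in terms of the equivalent of ``isinstance``, a pattern such as ``Quacker()`` would be expected to accept a ``Duck`` instance under this sketch, even though ``Duck`` does not inherit from ``Quacker``.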
The procedure is as follows:
* The class object for ``Class`` in ``Class(<sub-patterns>)`` is looked up and
``Class.__match__(obj)`` is called where ``obj`` is the value being matched.
* If the result of the call (which we are referring to as "match proxy") is
``None``, the match fails.
* The class object for ``Class`` in ``Class(<sub-patterns>)`` is
looked up and ``isinstance(obj, Class)`` is called, where ``obj`` is
the value being matched. If false, the match fails.
* Otherwise, if any sub-patterns are given in the form of positional
or keyword arguments, these are matched from left to right, as
@ -585,7 +578,7 @@ The procedure is as following:
refuse the temptation to guess."
* If there are any match-by-keyword items the keywords are looked up
as attributes on the proxy. If the lookup succeeds the value is
as attributes on the target. If the lookup succeeds the value is
matched against the corresponding sub-pattern. If the lookup fails,
the match fails.
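
The steps above can be approximated in plain Python. This is a rough sketch, not the reference implementation: ``class_pattern_values`` and ``Point`` are hypothetical names, the function only collects the values that would be handed to the sub-patterns, and raising ``TypeError`` for surplus positional sub-patterns is an assumption, since that part of the procedure is elided in this diff. The special handling of built-in types is not modelled:

```python
_MISSING = object()  # sentinel so an attribute whose value is None still counts


def class_pattern_values(obj, cls, n_positional=0, keywords=()):
    """Return {attribute name: value} for the sub-patterns, or None on failure."""
    # Step 1: the equivalent of isinstance(obj, cls); if false, the match fails.
    if not isinstance(obj, cls):
        return None
    # Step 2: positional sub-patterns are translated to attribute names
    # through __match_args__ (assumed: too many of them is a TypeError).
    match_args = tuple(getattr(cls, "__match_args__", ()))
    if n_positional > len(match_args):
        raise TypeError(
            f"{cls.__name__} accepts at most {len(match_args)} positional sub-patterns"
        )
    names = list(match_args[:n_positional]) + list(keywords)
    # Step 3: each name is looked up as an attribute on the target itself;
    # a failed lookup means the match fails.
    values = {}
    for name in names:
        value = getattr(obj, name, _MISSING)
        if value is _MISSING:
            return None
        values[name] = value
    return values


class Point:
    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y


# Roughly what ``Point(a, y=b)`` would inspect on the target.
print(class_pattern_values(Point(1, 2), Point, n_positional=1, keywords=["y"]))
```

A full matcher would then match each collected value against its sub-pattern, from left to right.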
@ -597,7 +590,7 @@ For the most commonly-matched built-in types (``bool``,
``frozenset``, ``int``, ``list``, ``set``, ``str``, and ``tuple``), a
single positional sub-pattern is allowed to be passed to
the call. Rather than being matched against any particular attribute
on the proxy, it is instead matched against the proxy itself. This
on the target, it is instead matched against the target itself. This
creates behavior that is useful and intuitive for these objects:
* ``bool(False)`` matches ``False`` (but not ``0``).
@ -605,34 +598,6 @@ creates behavior that is useful and intuitive for these objects:
* ``int(i)`` matches any ``int`` and binds it to the name ``i``.
Result value of ``__match__()``
-------------------------------
If a match is successful, the ``__match__()`` method should return an object
whose attribute values will then be bound to the corresponding keyword argument
names in the pattern after the match is complete. For each possible name that is
legal in the match pattern, the returned object should have a corresponding attribute
with that name, that can be used to access that value.
(Positional sub-patterns are matched to keyword sub-patterns using
``__match_args__`` as shown in the previous section.)
For most ordinary objects, this returned object can simply be the original object,
unchanged.
However, there may be cases where the internal implementation of a class is
very different than its public representation, for example a ``Point`` class with
``x``, ``y`` and ``z`` attributes may be represented internally as a vector; in such cases
a 'proxy object' may be returned whose attributes correspond to the matchable names.
There is no requirement that the attributes on the proxy object be the same type or
value as the attributes of the original object; one envisioned use case is for
expensive-to-compute properties to be computed lazily on the proxy object via
property getters.
In deciding what names should be available for matching, the recommended practice
is that class patterns should be the mirror of construction; that is, the set of
available names and their types should resemble the arguments to ``__init__()``.
Ambiguous matches
-----------------
@ -650,40 +615,19 @@ described in the previous subsection:
Special attribute ``__match_args__``
------------------------------------
The ``__match_args__`` attribute complements the ``__match__`` method and is
always looked up on the same class as the ``__match__`` method.
``__match_args__``, if it is present, must be a list or
tuple of strings naming the allowed positional arguments.
The ``__match_args__`` attribute is always looked up on the type
object named in the pattern. If present, it must be a list or tuple
of strings naming the allowed positional arguments.
In deciding what names should be available for matching, the
recommended practice is that class patterns should be the mirror of
construction; that is, the set of available names and their types
should resemble the arguments to ``__init__()``.
Default ``object.__match__()``
------------------------------
The default implementation aims at providing a basic, useful (but still safe)
experience with pattern matching out of the box. For this purpose the default
``__match__()`` method follows this logic (pseudo-code)::
class object:
@classmethod
def __match__(cls, instance):
if isinstance(instance, cls):
return instance
This means that pattern matching is allowed by default for every class. If
a class wants to disallow pattern matching against itself, it should define
``__match__ = None``. This will cause an exception when trying to match
against such a class.
The above implementation means that by default only match-by-name will
work,
and classes should define ``__match_args__`` (e.g. as a class
attribute) if they would like to support match-by-position. Additionally,
dataclasses and named tuples will support match-by-position out of the box. See below for more
details.
Finally, all attributes are exposed for matching; if a class wants to hide
some attributes from matching against them, a custom ``__match__()`` method is
required.
Only match-by-name will work by default, and classes should define
``__match_args__`` as a class attribute if they would like to support
match-by-position. Additionally, dataclasses and named tuples will
support match-by-position out of the box. See below for more details.
The standard library
@ -699,9 +643,9 @@ the standard library:
``__init__()`` method. This includes the situations where attributes are
inherited from a superclass.
In addition, a systematic effort will be put into going through existing
standard library classes and adding custom ``__match__()`` and/or
``__match_args__`` where it looks beneficial.
In addition, a systematic effort will be put into going through
existing standard library classes and adding ``__match_args__`` where
it looks beneficial.
.. _static checkers:
@ -931,18 +875,6 @@ productivity at the expense of additional CPU cycles, it would be
unfortunate if the benefits of ``match`` were counter-balanced by a significant
overall decrease in runtime performance.
That being said, because of the flexibility of ``match``, and the fact that
it can be customized via the ``__match__`` callback, there is some overhead
involved with calling these methods. Exactly how much cost this will entail
will be implementation-dependent.
In this design, an attempt has been made to avoid putting too much of a
computational burden on the ``__match__`` method. In particular, earlier
versions of the design required a custom matcher to completely re-implement
most of the pattern-matching logic that would have been performed by the VM.
The current design eschews this flexibility in favor of a simpler, faster
custom match protocol.
Although this PEP does not specify any particular implementation strategy,
a few words about the prototype implementation and how it attempts to
maximize performance are in order.
@ -954,16 +886,12 @@ logic for testing instance types, sequence lengths, mapping keys and
so on are inlined in place of the ``match``.
This is not the only possible strategy, nor is it necessarily the best.
For example, the call to ``__match__`` could be memoized, especially
For example, the instance checks could be memoized, especially
if there are multiple instances of the same class type but with different
arguments in a single match statement. It is also theoretically
possible for a future implementation to process the case clauses in
parallel using a decision tree rather than testing them one by one.
For this reason, implementers of ``__match__`` should not make any
assumptions about the number of times or the order in which ``__match__``
is called.
Backwards Compatibility
=======================