874 lines
25 KiB
ReStructuredText
874 lines
25 KiB
ReStructuredText
PEP: 653
|
|
Title: Precise Semantics for Pattern Matching
|
|
Author: Mark Shannon <mark@hotpy.org>
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 09-Feb-2021
|
|
Post-History: 18-Feb-2021
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes a semantics for pattern matching that respects the general concept of :pep:`634`,
|
|
but is more precise, easier to reason about, and should be faster.
|
|
|
|
The object model will be extended with two special (dunder) attributes, ``__match_container__`` and
|
|
``__match_class__``, in addition to the ``__match_args__`` attribute from :pep:`634`, to support pattern matching.
|
|
Both of these new attributes must be integers and ``__match_args__`` is required to be a tuple of unique strings.
|
|
|
|
With this PEP:
|
|
|
|
* The semantics of pattern matching will be clearer, so that patterns are easier to reason about.
|
|
* It will be possible to implement pattern matching in a more efficient fashion.
|
|
* Pattern matching will be more usable for complex classes, by allowing classes some more control over which patterns they match.
|
|
|
|
Motivation
|
|
==========
|
|
|
|
Pattern matching in Python, as described in :pep:`634`, is to be added to Python 3.10.
|
|
Unfortunately, :pep:`634` is not as precise about the semantics as it could be,
|
|
nor does it allow classes sufficient control over how they match patterns.
|
|
|
|
Precise semantics
|
|
-----------------
|
|
|
|
:pep:`634` explicitly includes a section on undefined behavior.
|
|
Large amounts of undefined behavior may be acceptable in a language like C,
|
|
but in Python it should be kept to a minimum.
|
|
Pattern matching in Python can be defined more precisely without losing expressiveness or performance.
|
|
|
|
Improved control over class matching
|
|
------------------------------------
|
|
|
|
:pep:`634` delegates the decision over whether a class is a sequence or mapping to ``collections.abc``.
|
|
Not all classes that could be considered sequences are registered as subclasses of ``collections.abc.Sequence``.
|
|
This PEP allows them to match sequence patterns, without the full ``collections.abc.Sequence`` machinery.
|
|
|
|
:pep:`634` privileges some builtin classes with a special form of matching, the "self" match.
|
|
For example the pattern ``list(x)`` matches a list and assigns the list to ``x``.
|
|
By allowing classes to choose which kinds of pattern they match, other classes can use this form as well.
|
|
|
|
For example, using ``sympy``, we might want to write::
|
|
|
|
# a*a == a**2
|
|
case Mul(args=[Symbol(a), Symbol(b)]) if a == b:
|
|
return Pow(a, 2)
|
|
|
|
Which requires the sympy class ``Symbol`` to "self" match.
|
|
For ``sympy`` to support this pattern with :pep:`634` is possible, but a bit tricky.
|
|
With this PEP it can be implemented very easily [1]_.
|
|
|
|
Robustness
|
|
----------
|
|
|
|
With this PEP, access to attributes during pattern matching becomes well defined and deterministic.
|
|
This makes pattern matching less error prone when matching objects with hidden side effects, such as object-relational mappers.
|
|
Objects will have more control over their own deconstruction, which can help prevent unintended consequences should attribute access have side-effects.
|
|
|
|
:pep:`634` relies on the ``collections.abc`` module when determining which patterns a value can match, implicitly importing it if necessary.
|
|
This PEP will eliminate surprising import errors and misleading audit events from those imports.
|
|
|
|
|
|
Efficient implementation
|
|
------------------------
|
|
|
|
The semantics proposed in this PEP will allow efficient implementation, partly as a result of having precise semantics
|
|
and partly from using the object model.
|
|
|
|
With precise semantics, it is possible to reason about what code transformations are correct,
|
|
and thus apply optimizations effectively.
|
|
|
|
Because the object model is a core part of Python, implementations already handle special attribute lookup efficiently.
|
|
Looking up a special attribute is much faster than performing a subclass test on an abstract base class.
|
|
|
|
Rationale
|
|
=========
|
|
|
|
The object model and special methods are at the core of the Python language. Consequently,
|
|
implementations support them well.
|
|
Using special attributes for pattern matching allows pattern matching to be implemented in a way that
|
|
integrates well with the rest of the implementation, and is thus easier to maintain and is likely to perform better.
|
|
|
|
A match statement performs a sequence of pattern matches. In general, matching a pattern has three parts:
|
|
|
|
1. Can the value match this kind of pattern?
|
|
2. When deconstructed, does the value match this particular pattern?
|
|
3. Is the guard true?
|
|
|
|
To determine whether a value can match a particular kind of pattern, we add the ``__match_container__``
|
|
and ``__match_class__`` attributes.
|
|
This allows the kind of a value to be determined in a efficient fashion.
|
|
|
|
Specification
|
|
=============
|
|
|
|
Additions to the object model
|
|
-----------------------------
|
|
|
|
The ``__match_container__`` and ``__match_class__`` attributes will be added to ``object``.
|
|
``__match_container__`` should be overridden by classes that want to match mapping or sequence patterns.
|
|
``__match_class__`` should be overridden by classes that want to change the default behavior when matching class patterns.
|
|
|
|
``__match_container__`` must be an integer and should be exactly one of these::
|
|
|
|
0
|
|
MATCH_SEQUENCE = 1
|
|
MATCH_MAPPING = 2
|
|
|
|
``MATCH_SEQUENCE`` is used to indicate that instances of the class can match sequence patterns.
|
|
|
|
``MATCH_MAPPING`` is used to indicate that instances of the class can match mapping patterns.
|
|
|
|
``__match_class__`` must be an integer and should be exactly one of these::
|
|
|
|
0
|
|
MATCH_SELF = 8
|
|
|
|
``MATCH_SELF`` is used to indicate that for a single positional argument class pattern, the subject will be used and not deconstructed.
|
|
|
|
.. note::
|
|
In the rest of this document, we will refer to the above values by name only.
|
|
Symbolic constants will be provided both for Python and C, and the values will
|
|
never be changed.
|
|
|
|
``object`` will have the following values for the special attributes::
|
|
|
|
__match_container__ = 0
|
|
__match_class__= 0
|
|
__match_args__ = ()
|
|
|
|
These special attributes will be inherited as normal.
|
|
|
|
If ``__match_args__`` is overridden, then it is required to hold a tuple of unique strings. It may be empty.
|
|
|
|
.. note::
|
|
``__match_args__`` will be automatically generated for dataclasses and named tuples, as specified in :pep:`634`.
|
|
|
|
The pattern matching implementation is *not* required to check that any of these attributes behave as specified.
|
|
If the value of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is not as specified, then
|
|
the implementation may raise any exception, or match the wrong pattern.
|
|
Of course, implementations are free to check these properties and provide meaningful error messages if they can do so efficiently.
|
|
|
|
Semantics of the matching process
|
|
---------------------------------
|
|
|
|
In the following, all variables of the form ``$var`` are temporary variables and are not visible to the Python program.
|
|
They may be visible via introspection, but that is an implementation detail and should not be relied on.
|
|
The pseudo-statement ``FAIL`` is used to signify that matching failed for this pattern and that matching should move to the next pattern.
|
|
If control reaches the end of the translation without reaching a ``FAIL``, then it has matched, and following patterns are ignored.
|
|
|
|
Variables of the form ``$ALL_CAPS`` are meta-variables holding a syntactic element, they are not normal variables.
|
|
So, ``$VARS = $items`` is not an assignment of ``$items`` to ``$VARS``,
|
|
but an unpacking of ``$items`` into the variables that ``$VARS`` holds.
|
|
For example, with the abstract syntax ``case [$VARS]:``, and the concrete syntax ``case[a, b]:`` then ``$VARS`` would hold the variables ``(a, b)``,
|
|
not the values of those variables.
|
|
|
|
The pseudo-function ``QUOTE`` takes a variable and returns the name of that variable.
|
|
For example, if the meta-variable ``$VAR`` held the variable ``foo`` then ``QUOTE($VAR) == "foo"``.
|
|
|
|
All additional code listed below that is not present in the original source will not trigger line events, conforming to :pep:`626`.
|
|
|
|
|
|
Preamble
|
|
''''''''
|
|
|
|
Before any patterns are matched, the expression being matched is evaluated::
|
|
|
|
match expr:
|
|
|
|
translates to::
|
|
|
|
$value = expr
|
|
|
|
Capture patterns
|
|
''''''''''''''''
|
|
|
|
Capture patterns always match, so the irrefutable match::
|
|
|
|
case capture_var:
|
|
|
|
translates to::
|
|
|
|
capture_var = $value
|
|
|
|
Wildcard patterns
|
|
'''''''''''''''''
|
|
|
|
Wildcard patterns always match, so::
|
|
|
|
case _:
|
|
|
|
translates to::
|
|
|
|
# No code -- Automatically matches
|
|
|
|
|
|
Literal Patterns
|
|
''''''''''''''''
|
|
|
|
The literal pattern::
|
|
|
|
case LITERAL:
|
|
|
|
translates to::
|
|
|
|
if $value != LITERAL:
|
|
FAIL
|
|
|
|
except when the literal is one of ``None``, ``True`` or ``False`` ,
|
|
when it translates to::
|
|
|
|
if $value is not LITERAL:
|
|
FAIL
|
|
|
|
Value Patterns
|
|
''''''''''''''
|
|
|
|
The value pattern::
|
|
|
|
case value.pattern:
|
|
|
|
translates to::
|
|
|
|
if $value != value.pattern:
|
|
FAIL
|
|
|
|
Sequence Patterns
|
|
'''''''''''''''''
|
|
|
|
A pattern not including a star pattern::
|
|
|
|
case [$VARS]:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_SEQUENCE:
|
|
FAIL
|
|
if len($value) != len($VARS):
|
|
FAIL
|
|
$VARS = $value
|
|
|
|
Example: [2]_
|
|
|
|
A pattern including a star pattern::
|
|
|
|
case [$VARS]
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_SEQUENCE:
|
|
FAIL
|
|
if len($value) < len($VARS):
|
|
FAIL
|
|
$VARS = $value # Note that $VARS includes a star expression.
|
|
|
|
Example: [3]_
|
|
|
|
Mapping Patterns
|
|
''''''''''''''''
|
|
|
|
A pattern not including a double-star pattern::
|
|
|
|
case {$KEYWORD_PATTERNS}:
|
|
|
|
translates to::
|
|
|
|
$sentinel = object()
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_MAPPING:
|
|
FAIL
|
|
# $KEYWORD_PATTERNS is a meta-variable mapping names to variables.
|
|
for $KEYWORD in $KEYWORD_PATTERNS:
|
|
$tmp = $value.get(QUOTE($KEYWORD), $sentinel)
|
|
if $tmp is $sentinel:
|
|
FAIL
|
|
$KEYWORD_PATTERNS[$KEYWORD] = $tmp
|
|
|
|
Example: [4]_
|
|
|
|
A pattern including a double-star pattern::
|
|
|
|
case {$KEYWORD_PATTERNS, **$DOUBLE_STARRED_PATTERN}:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_MAPPING:
|
|
FAIL
|
|
# $KEYWORD_PATTERNS is a meta-variable mapping names to variables.
|
|
$tmp = dict($value)
|
|
if not $tmp.keys() >= $KEYWORD_PATTERNS.keys():
|
|
FAIL:
|
|
for $KEYWORD in $KEYWORD_PATTERNS:
|
|
$KEYWORD_PATTERNS[$KEYWORD] = $tmp.pop(QUOTE($KEYWORD))
|
|
$DOUBLE_STARRED_PATTERN = $tmp
|
|
|
|
Example: [5]_
|
|
|
|
Class Patterns
|
|
''''''''''''''
|
|
|
|
Class pattern with no arguments::
|
|
|
|
case ClsName():
|
|
|
|
translates to::
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
|
|
Class pattern with a single positional pattern::
|
|
|
|
case ClsName($VAR):
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_class__
|
|
if $kind == MATCH_SELF:
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
$VAR = $value
|
|
else:
|
|
As other positional-only class pattern
|
|
|
|
|
|
Positional-only class pattern::
|
|
|
|
case ClsName($VARS):
|
|
|
|
translates to::
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
$attrs = ClsName.__match_args__
|
|
if len($attr) < len($VARS):
|
|
raise TypeError(...)
|
|
try:
|
|
for i, $VAR in enumerate($VARS):
|
|
$VAR = getattr($value, $attrs[i])
|
|
except AttributeError:
|
|
FAIL
|
|
|
|
Example: [6]_
|
|
|
|
Class patterns with all keyword patterns::
|
|
|
|
case ClsName($KEYWORD_PATTERNS):
|
|
|
|
translates to::
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
try:
|
|
for $KEYWORD in $KEYWORD_PATTERNS:
|
|
$tmp = getattr($value, QUOTE($KEYWORD))
|
|
$KEYWORD_PATTERNS[$KEYWORD] = $tmp
|
|
except AttributeError:
|
|
FAIL
|
|
|
|
Example: [7]_
|
|
|
|
Class patterns with positional and keyword patterns::
|
|
|
|
case ClsName($VARS, $KEYWORD_PATTERNS):
|
|
|
|
translates to::
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
$attrs = ClsName.__match_args__
|
|
if len($attr) < len($VARS):
|
|
raise TypeError(...)
|
|
$pos_attrs = $attrs[:len($VARS)]
|
|
try:
|
|
for i, $VAR in enumerate($VARS):
|
|
$VAR = getattr($value, $attrs[i])
|
|
for $KEYWORD in $KEYWORD_PATTERNS:
|
|
$name = QUOTE($KEYWORD)
|
|
if $name in pos_attrs:
|
|
raise TypeError(...)
|
|
$KEYWORD_PATTERNS[$KEYWORD] = getattr($value, $name)
|
|
except AttributeError:
|
|
FAIL
|
|
|
|
Example: [8]_
|
|
|
|
|
|
Nested patterns
|
|
'''''''''''''''
|
|
|
|
The above specification assumes that patterns are not nested. For nested patterns
|
|
the above translations are applied recursively by introducing temporary capture patterns.
|
|
|
|
For example, the pattern::
|
|
|
|
case [int(), str()]:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_class__
|
|
if $kind != MATCH_SEQUENCE:
|
|
FAIL
|
|
if len($value) != 2:
|
|
FAIL
|
|
$value_0, $value_1 = $value
|
|
#Now match on temporary values
|
|
if not isinstance($value_0, int):
|
|
FAIL
|
|
if not isinstance($value_1, str):
|
|
FAIL
|
|
|
|
Guards
|
|
''''''
|
|
|
|
Guards translate to a test following the rest of the translation::
|
|
|
|
case pattern if guard:
|
|
|
|
translates to::
|
|
|
|
[translation for pattern]
|
|
if not guard:
|
|
FAIL
|
|
|
|
|
|
Non-conforming special attributes
|
|
'''''''''''''''''''''''''''''''''
|
|
|
|
All classes should ensure that the the values of ``__match_container__``, ``__match_class__``
|
|
and ``__match_args__`` follow the specification.
|
|
Therefore, implementations can assume, without checking, that the following are true::
|
|
|
|
__match_container__ == 0 or __match_container__ == MATCH_SEQUENCE or __match_container__ == MATCH_MAPPING
|
|
__match_class__ == 0 or __match_class__ == MATCH_SELF
|
|
|
|
and that ``__match_args__`` is a tuple of unique strings.
|
|
|
|
Values of the special attributes for classes in the standard library
|
|
--------------------------------------------------------------------
|
|
|
|
For the core builtin container classes ``__match_container__`` will be:
|
|
|
|
* ``list``: ``MATCH_SEQUENCE``
|
|
* ``tuple``: ``MATCH_SEQUENCE``
|
|
* ``dict``: ``MATCH_MAPPING``
|
|
* ``bytearray``: 0
|
|
* ``bytes``: 0
|
|
* ``str``: 0
|
|
|
|
Named tuples will have ``__match_container__`` set to ``MATCH_SEQUENCE``.
|
|
|
|
* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_container__`` set to ``MATCH_MAPPING``.
|
|
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_container__`` set to ``MATCH_SEQUENCE``.
|
|
|
|
For the following builtin classes ``__match_class__`` will be set to ``MATCH_SELF``:
|
|
|
|
* ``bool``
|
|
* ``bytearray``
|
|
* ``bytes``
|
|
* ``float``
|
|
* ``frozenset``
|
|
* ``int``
|
|
* ``set``
|
|
* ``str``
|
|
* ``list``
|
|
* ``tuple``
|
|
* ``dict``
|
|
|
|
Legal optimizations
|
|
-------------------
|
|
|
|
The above semantics implies a lot of redundant effort and copying in the implementation.
|
|
However, it is possible to implement the above semantics efficiently by employing semantic preserving transformations
|
|
on the naive implementation.
|
|
|
|
When performing matching, implementations are allowed
|
|
to treat the following functions and methods as pure:
|
|
|
|
For any class supporting ``MATCH_SEQUENCE``::
|
|
|
|
* ``cls.__len__()``
|
|
* ``cls.__getitem__()``
|
|
|
|
For any class supporting ``MATCH_MAPPING``::
|
|
|
|
* ``cls.get()`` (Two argument form only)
|
|
|
|
Implementations are allowed to make the following assumptions:
|
|
|
|
* ``isinstance(obj, cls)`` can be freely replaced with ``issubclass(type(obj), cls)`` and vice-versa.
|
|
* ``isinstance(obj, cls)`` will always return the same result for any ``(obj, cls)`` pair and repeated calls can thus be elided.
|
|
* Reading any of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is a pure operation, and may be cached.
|
|
* Sequences, that is any class for which ``__match_container__ == MATCH_SEQUENCE`` is not zero, are not modified by iteration, subscripting or calls to ``len()``.
|
|
Consequently, those operations can be freely substituted for each other where they would be equivalent when applied to an immutable sequence.
|
|
* Mappings, that is any class for which ``__match_container__ == MATCH_MAPPING`` is not zero, will not capture the second argument of the ``get()`` method.
|
|
So, the ``$sentinel`` value may be freely re-used.
|
|
|
|
In fact, implementations are encouraged to make these assumptions, as it is likely to result in significantly better performance.
|
|
|
|
|
|
Security Implications
|
|
=====================
|
|
|
|
None.
|
|
|
|
Implementation
|
|
==============
|
|
|
|
The naive implementation that follows from the specification will not be very efficient.
|
|
Fortunately, there are some reasonably straightforward transformations that can be used to improve performance.
|
|
Performance should be comparable to the implementation of :pep:`634` (at time of writing) by the release of 3.10.
|
|
Further performance improvements may have to wait for the 3.11 release.
|
|
|
|
Possible optimizations
|
|
----------------------
|
|
|
|
The following is not part of the specification,
|
|
but guidelines to help developers create an efficient implementation.
|
|
|
|
Splitting evaluation into lanes
|
|
'''''''''''''''''''''''''''''''
|
|
|
|
Since the first step in matching each pattern is check to against the kind, it is possible to combine all the checks against kind into a single multi-way branch at the beginning
|
|
of the match. The list of cases can then be duplicated into several "lanes" each corresponding to one kind.
|
|
It is then trivial to remove unmatchable cases from each lane.
|
|
Depending on the kind, different optimization strategies are possible for each lane.
|
|
Note that the body of the match clause does not need to be duplicated, just the pattern.
|
|
|
|
Sequence patterns
|
|
'''''''''''''''''
|
|
|
|
This is probably the most complex to optimize and the most profitable in terms of performance.
|
|
Since each pattern can only match a range of lengths, often only a single length,
|
|
the sequence of tests can be rewritten in as an explicit iteration over the sequence,
|
|
attempting to match only those patterns that apply to that sequence length.
|
|
|
|
For example:
|
|
|
|
::
|
|
|
|
case []:
|
|
A
|
|
case [x]:
|
|
B
|
|
case [x, y]:
|
|
C
|
|
case other:
|
|
D
|
|
|
|
Can be compiled roughly as:
|
|
|
|
::
|
|
|
|
# Choose lane
|
|
$i = iter($value)
|
|
for $0 in $i:
|
|
break
|
|
else:
|
|
A
|
|
goto done
|
|
for $1 in $i:
|
|
break
|
|
else:
|
|
x = $0
|
|
B
|
|
goto done
|
|
for $2 in $i:
|
|
del $0, $1, $2
|
|
break
|
|
else:
|
|
x = $0
|
|
y = $1
|
|
C
|
|
goto done
|
|
other = $value
|
|
D
|
|
done:
|
|
|
|
|
|
Mapping patterns
|
|
''''''''''''''''
|
|
|
|
The best strategy here is probably to form a decision tree based on the size of the mapping and which keys are present.
|
|
There is no point repeatedly testing for the presence of a key.
|
|
For example::
|
|
|
|
match obj:
|
|
case {a:x, b:y}:
|
|
W
|
|
case {a:x, c:y}:
|
|
X
|
|
case {a:x, b:_, c:y}:
|
|
Y
|
|
case other:
|
|
Z
|
|
|
|
If the key ``"a"`` is not present when checking for case X, there is no need to check it again for Y.
|
|
|
|
The mapping lane can be implemented, roughly as:
|
|
|
|
::
|
|
|
|
# Choose lane
|
|
if len($value) == 2:
|
|
if "a" in $value:
|
|
if "b" in $value:
|
|
x = $value["a"]
|
|
y = $value["b"]
|
|
goto W
|
|
if "c" in $value:
|
|
x = $value["a"]
|
|
y = $value["c"]
|
|
goto X
|
|
elif len($value) == 3:
|
|
if "a" in $value and "b" in $value:
|
|
x = $value["a"]
|
|
y = $value["c"]
|
|
goto Y
|
|
other = $value
|
|
goto Z
|
|
|
|
Summary of differences between this PEP and PEP 634
|
|
===================================================
|
|
|
|
|
|
The changes to the semantics can be summarized as:
|
|
|
|
* Requires ``__match_args__`` to be a *tuple* of strings, not just a sequence.
|
|
This make pattern matching a bit more robust and optimizable as ``__match_args__`` can be assumed to be immutable.
|
|
* Selecting the kind of container patterns that can be matched uses ``cls.__match_container__`` instead of
|
|
``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``.
|
|
* Allows classes to opt out of deconstruction altogether, if necessary, but setting ``__match_class__ = 0``.
|
|
* The behavior when matching patterns is more precisely defined, but is otherwise unchanged.
|
|
|
|
There are no changes to syntax. All examples given in the :pep:`636` tutorial should continue to work as they do now.
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
Using attributes from the instance's dictionary
|
|
-----------------------------------------------
|
|
|
|
An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``__match_class__`` was the default value.
|
|
The intent was to avoid capturing bound-methods and other synthetic attributes. However, this also mean that properties were ignored.
|
|
|
|
For the class::
|
|
|
|
class C:
|
|
def __init__(self):
|
|
self.a = "a"
|
|
@property
|
|
def p(self):
|
|
...
|
|
def m(self):
|
|
...
|
|
|
|
Ideally we would match the attributes "a" and "p", but not "m".
|
|
However, there is no general way to do that, so this PEP now follows the semantics of :pep:`634`.
|
|
|
|
Lookup of ``__match_args__`` on the subject not the pattern
|
|
-----------------------------------------------------------
|
|
|
|
An earlier version of this PEP looked up ``__match_args__`` on the class of the subject and
|
|
not the class specified in the pattern.
|
|
This has been rejected for a few reasons::
|
|
|
|
* Using the class specified in the pattern is more amenable to optimization and can offer better performance.
|
|
* Using the class specified in the pattern has the potential to provide better error reporting is some cases.
|
|
* Neither approach is perfect, both have odd corner cases. Keeping the status quo minimizes disruption.
|
|
|
|
Combining ``__match_class__`` and ``__match_container__`` into a single value
|
|
-----------------------------------------------------------------------------
|
|
|
|
An earlier version of this PEP combined ``__match_class__`` and ``__match_container__`` into a single value, ``__match_kind__``.
|
|
Using a single value has a small advantage in terms of performance,
|
|
but is likely to result in unintended changes to container matching when overriding class matching behavior, and vice versa.
|
|
|
|
|
|
Deferred Ideas
|
|
==============
|
|
|
|
The original version of this PEP included the match kind ``MATCH_POSITIONAL`` and special method
|
|
``__deconstruct__`` which would allow classes full control over their matching. This is important
|
|
for libraries like ``sympy``.
|
|
|
|
For example, using ``sympy``, we might want to write::
|
|
|
|
# sin(x)**2 + cos(x)**2 == 1
|
|
case Add(Pow(sin(a), 2), Pow(cos(b), 2)) if a == b:
|
|
return 1
|
|
|
|
For ``sympy`` to support the positional patterns with current pattern matching is possible,
|
|
but is tricky. With these additional features it can be implemented easily [9]_.
|
|
|
|
This idea will feature in a future PEP for 3.11.
|
|
However, it is too late in the 3.10 development cycle for such a change.
|
|
|
|
Having a separate value to reject all class matches
|
|
---------------------------------------------------
|
|
|
|
In an earlier version of this PEP, there was a distinct value for ``__match_class__`` that allowed classes to not match any class
|
|
pattern that would have required deconstruction. However, this would become redundant once ``MATCH_POSITIONAL`` is introduced, and
|
|
complicates the specification for an extremely rare case.
|
|
|
|
|
|
Code examples
|
|
=============
|
|
|
|
.. [1]
|
|
|
|
::
|
|
|
|
class Symbol:
|
|
__match_class__ = MATCH_SELF
|
|
|
|
.. [2]
|
|
|
|
This::
|
|
|
|
case [a, b] if a is b:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_SEQUENCE:
|
|
FAIL
|
|
if len($value) != 2:
|
|
FAIL
|
|
a, b = $value
|
|
if not a is b:
|
|
FAIL
|
|
|
|
.. [3]
|
|
|
|
This::
|
|
|
|
case [a, *b, c]:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_SEQUENCE:
|
|
FAIL
|
|
if len($value) < 2:
|
|
FAIL
|
|
a, *b, c = $value
|
|
|
|
.. [4]
|
|
|
|
This::
|
|
|
|
case {"x": x, "y": y} if x > 2:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_MAPPING:
|
|
FAIL
|
|
$tmp = $value.get("x", $sentinel)
|
|
if $tmp is $sentinel:
|
|
FAIL
|
|
x = $tmp
|
|
$tmp = $value.get("y", $sentinel)
|
|
if $tmp is $sentinel:
|
|
FAIL
|
|
y = $tmp
|
|
if not x > 2:
|
|
FAIL
|
|
|
|
.. [5]
|
|
|
|
This::
|
|
|
|
case {"x": x, "y": y, **z}:
|
|
|
|
translates to::
|
|
|
|
$kind = type($value).__match_container__
|
|
if $kind != MATCH_MAPPING:
|
|
FAIL
|
|
$tmp = dict($value)
|
|
if not $tmp.keys() >= {"x", "y"}:
|
|
FAIL
|
|
x = $tmp.pop("x")
|
|
y = $tmp.pop("y")
|
|
z = $tmp
|
|
|
|
.. [6]
|
|
|
|
This::
|
|
|
|
match ClsName(x, y):
|
|
|
|
translates to::
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
$attrs = ClsName.__match_args__
|
|
if len($attr) < 2:
|
|
FAIL
|
|
try:
|
|
x = getattr($value, $attrs[0])
|
|
y = getattr($value, $attrs[1])
|
|
except AttributeError:
|
|
FAIL
|
|
|
|
.. [7]
|
|
|
|
This::
|
|
|
|
match ClsName(a=x, b=y):
|
|
|
|
translates to::
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
try:
|
|
x = $value.a
|
|
y = $value.b
|
|
except AttributeError:
|
|
FAIL
|
|
|
|
.. [8]
|
|
|
|
This::
|
|
|
|
match ClsName(x, a=y):
|
|
|
|
translates to::
|
|
|
|
|
|
if not isinstance($value, ClsName):
|
|
FAIL
|
|
$attrs = ClsName.__match_args__
|
|
if len($attr) < 1:
|
|
raise TypeError(...)
|
|
$positional_names = $attrs[:1]
|
|
try:
|
|
x = getattr($value, $attrs[0])
|
|
if "a" in $positional_names:
|
|
raise TypeError(...)
|
|
y = $value.a
|
|
except AttributeError:
|
|
FAIL
|
|
|
|
.. [9]
|
|
|
|
::
|
|
|
|
class Basic:
|
|
__match_class__ = MATCH_POSITIONAL
|
|
def __deconstruct__(self):
|
|
return self._args
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|