Classes which define ``__match_kind__ & MATCH_CLASS`` to be non-zero must
implement one additional special attribute, and one special method:
* ``__attributes__``: should hold a tuple of strings indicating the names of attributes that are to be considered for matching; it may be empty for positional-only matches.
* ``__deconstruct__()``: should return a sequence which contains the parts of the deconstructed object.
.. note::
   ``__attributes__`` and ``__deconstruct__`` will be automatically generated for dataclasses and named tuples.
The pattern matching implementation is *not* required to check that ``__attributes__`` and ``__deconstruct__`` behave as specified.
If the value of ``__attributes__`` or the result of ``__deconstruct__()`` is not as specified, then
the implementation may raise any exception, or match the wrong pattern.
Of course, implementations are free to check these properties and provide meaningful error messages if they can do so efficiently.
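As an illustration, a hand-written class that opts in to class matching might look like the following sketch. The ``Point`` class is hypothetical, the numeric value given to ``MATCH_CLASS`` is a placeholder chosen for the example, and the sketch assumes that ``__deconstruct__()`` returns its parts in the same order as the names in ``__attributes__``.

```python
# Placeholder flag value, assumed for illustration only.
MATCH_CLASS = 0b0100


class Point:
    """Hypothetical class that opts in to class-pattern matching."""

    __match_kind__ = MATCH_CLASS
    # Names of the attributes considered for matching.
    __attributes__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __deconstruct__(self):
        # Assumed ordering: one part per name in __attributes__.
        return (self.x, self.y)


p = Point(1, 2)
parts = p.__deconstruct__()
```

Nothing here requires interpreter support: the attributes are ordinary class attributes, so the class remains importable on implementations that ignore them.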
Semantics of the matching process
---------------------------------
In the following, all variables of the form ``$var`` are temporary variables and are not visible to the Python program.
They may be visible via introspection, but that is an implementation detail and should not be relied on.
The pseudo-statement ``DONE`` is used to signify that matching is complete and that following patterns should be ignored.
All the translations below include guards. If no guard is present, simply substitute the guard ``if True`` when translating.
Variables of the form ``$ALL_CAPS`` are meta-variables holding a syntactic element; they are not normal variables.
So, ``$VARS = $items`` is not an assignment of ``$items`` to ``$VARS``,
but an unpacking of ``$items`` into the variables that ``$VARS`` holds.
For example, with the abstract syntax ``case [$VARS]:`` and the concrete syntax ``case [a, b]:``, ``$VARS`` would hold the variables ``(a, b)``,
not the values of those variables.
The pseudo-function ``QUOTE`` takes a variable and returns the name of that variable.
For example, if the meta-variable ``$VAR`` held the variable ``foo`` then ``QUOTE($VAR) == "foo"``.
All additional code listed below that is not present in the original source will not trigger line events, conforming to PEP 626.
Preamble
''''''''
Before any patterns are matched, the expression being matched is evaluated and its kind is determined::

    match expr:

translates to::

    $value = expr
    $kind = type($value).__match_kind__

In addition some helper variables are initialized::

    $list = None
    $dict = None
    $attrs = None
    $items = None
Capture patterns
''''''''''''''''
Capture patterns always match, so::

    case capture_var if guard:

translates to::

    capture_var = $value
    if guard:
        DONE
Wildcard patterns
'''''''''''''''''
Wildcard patterns always match, so::

    case _ if guard:

translates to::

    if guard:
        DONE
Literal Patterns
''''''''''''''''
The literal pattern::

    case LITERAL if guard:

translates to::

    if $value == LITERAL and guard:
        DONE

except when the literal is one of ``None``, ``True`` or ``False``,
in which case it translates to::

    if $value is LITERAL and guard:
        DONE
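The difference between the two translations can be observed in ordinary Python. The sketch below is illustrative only: ``literal_matches`` is a hypothetical helper that models the test without the guard.

```python
def literal_matches(value, literal):
    """Model the literal-pattern test, ignoring the guard."""
    if literal is None or literal is True or literal is False:
        # The three singletons are compared by identity.
        return value is literal
    # Every other literal is compared by equality.
    return value == literal


results = [
    literal_matches(1, 1),        # equal ints
    literal_matches(1.0, 1),      # equality holds across float and int
    literal_matches(1, True),     # 1 == True, but 1 is not the object True
    literal_matches(True, True),  # same object
    literal_matches(0, False),    # 0 == False, but 0 is not the object False
]
```

Using identity for the singletons means that ``case True:`` does not accept ``1``, even though ``1 == True`` holds.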
Value Patterns
''''''''''''''
The value pattern::

    case value.pattern if guard:

translates to::

    if $value == value.pattern and guard:
        DONE
Sequence Patterns
'''''''''''''''''
Before matching the first sequence pattern, but after checking that ``$value`` is a sequence,
``$value`` is converted to a list.
A pattern not including a star pattern::

    case [$VARS] if guard:

translates to::

    if $kind & MATCH_SEQUENCE:
        if $list is None:
            $list = list($value)
        if len($list) == len($VARS):
            $VARS = $list
            if guard:
                DONE
Example: [2]_
A pattern including a star pattern::

    case [$VARS] if guard:

translates to::

    if $kind & MATCH_SEQUENCE:
        if $list is None:
            $list = list($value)
        if len($list) >= len($VARS):
            $VARS = $list # Note that $VARS includes a star expression.
            if guard:
                DONE
Example: [3]_
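The two sequence translations can be modelled in plain Python as below. This is a sketch under stated assumptions: the value of ``MATCH_SEQUENCE`` is a placeholder, ``match_fixed`` and ``match_star`` are hypothetical helpers, the guard and the caching of ``$list`` are omitted, and the star variant counts only the non-star variables toward the minimum length, which is how Python's own star unpacking behaves.

```python
# Placeholder flag value, assumed for illustration only.
MATCH_SEQUENCE = 0b0001


def match_fixed(value, kind, count):
    """Return the captured items, or None if the pattern does not match."""
    if kind & MATCH_SEQUENCE:
        items = list(value)
        if len(items) == count:
            return items
    return None


def match_star(value, kind, before, after):
    """Match `before` leading and `after` trailing items around a star."""
    if kind & MATCH_SEQUENCE:
        items = list(value)
        if len(items) >= before + after:
            split = len(items) - after
            # head, starred middle, tail
            return items[:before], items[before:split], items[split:]
    return None


fixed = match_fixed((1, 2), MATCH_SEQUENCE, 2)
starred = match_star((1, 2, 3, 4), MATCH_SEQUENCE, 1, 1)
```

A value whose kind lacks the ``MATCH_SEQUENCE`` bit is rejected before any conversion to a list takes place.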
Mapping Patterns
''''''''''''''''
Before matching the first mapping pattern, but after checking that ``$value`` is a mapping,
``$value`` is converted to a ``dict``.
A pattern not including a double-star pattern::

    case {$KEYWORD_PATTERNS} if guard:

translates to::

    if $kind & MATCH_MAPPING:
        if $dict is None:
            $dict = dict($value)
        if $dict.keys() == $KEYWORD_PATTERNS.keys():
            # $KEYWORD_PATTERNS is a meta-variable mapping names to variables.
``object.__match_kind__`` will be ``MATCH_DEFAULT``.
For common builtin classes ``__match_kind__`` will be:
* ``bool``: ``MATCH_SELF``
* ``bytearray``: ``MATCH_SELF``
* ``bytes``: ``MATCH_SELF``
* ``float``: ``MATCH_SELF``
* ``frozenset``: ``MATCH_SELF``
* ``int``: ``MATCH_SELF``
* ``set``: ``MATCH_SELF``
* ``str``: ``MATCH_SELF``
* ``list``: ``MATCH_SEQUENCE | MATCH_SELF``
* ``tuple``: ``MATCH_SEQUENCE | MATCH_SELF``
* ``dict``: ``MATCH_MAPPING | MATCH_SELF``
Named tuples will have ``__match_kind__`` set to ``MATCH_SEQUENCE | MATCH_CLASS``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_kind__`` set to ``MATCH_MAPPING``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_kind__`` set to ``MATCH_SEQUENCE``.
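The rules above can be summarised by a small lookup function. The sketch below is not part of the specification: the flag values are placeholders, ``default_match_kind`` is a hypothetical helper, named tuples are detected with the ``_fields`` heuristic, and inheritance of ``__match_kind__`` by user-defined subclasses is not modelled.

```python
import collections.abc
from collections import OrderedDict, namedtuple

# Placeholder flag values, assumed for illustration only.
MATCH_SEQUENCE = 0b00001
MATCH_MAPPING = 0b00010
MATCH_CLASS = 0b00100
MATCH_SELF = 0b01000
MATCH_DEFAULT = 0b10000

_BUILTIN_KINDS = {
    bool: MATCH_SELF, bytearray: MATCH_SELF, bytes: MATCH_SELF,
    float: MATCH_SELF, frozenset: MATCH_SELF, int: MATCH_SELF,
    set: MATCH_SELF, str: MATCH_SELF,
    list: MATCH_SEQUENCE | MATCH_SELF,
    tuple: MATCH_SEQUENCE | MATCH_SELF,
    dict: MATCH_MAPPING | MATCH_SELF,
}


def default_match_kind(cls):
    """Return the kind the list above assigns to `cls` (exact classes only)."""
    if cls in _BUILTIN_KINDS:
        return _BUILTIN_KINDS[cls]
    if issubclass(cls, tuple) and hasattr(cls, "_fields"):
        # Heuristic for classes created by collections.namedtuple.
        return MATCH_SEQUENCE | MATCH_CLASS
    if issubclass(cls, collections.abc.Mapping):
        return MATCH_MAPPING
    if issubclass(cls, collections.abc.Sequence):
        return MATCH_SEQUENCE
    return MATCH_DEFAULT


Pair = namedtuple("Pair", "left right")
kinds = {cls.__name__: default_match_kind(cls)
         for cls in (str, list, Pair, OrderedDict, range, object)}
```

The table is consulted first because ``str``, ``bytes`` and ``bytearray`` are also ``collections.abc.Sequence`` subclasses, yet are listed as ``MATCH_SELF`` only.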
Legal optimizations
-------------------
The above semantics imply a lot of redundant effort and copying in the implementation.
However, it is possible to implement the above semantics efficiently by applying semantics-preserving transformations
to the naive implementation.
When performing matching, implementations are allowed
to treat the following functions and methods as pure:
* ``cls.__len__()`` for any class supporting ``MATCH_SEQUENCE``
* ``dict.keys()``
* ``dict.__contains__()``
* ``dict.__getitem__()``
Implementations are also allowed to freely replace ``isinstance(obj, cls)`` with ``issubclass(type(obj), cls)`` and vice-versa.
The naive implementation that follows from the specification will not be very efficient.
Fortunately, there are some reasonably straightforward transformations that can be used to improve performance.
Performance should be comparable to the implementation of PEP 634 (at time of writing) by the release of 3.10.
Further performance improvements may have to wait for the 3.11 release.
Possible optimizations
----------------------
The following is not part of the specification,
but guidelines to help developers create an efficient implementation.
Splitting evaluation into lanes
'''''''''''''''''''''''''''''''
Since the first step in matching each pattern is to check against the kind, it is possible to combine all the checks against kind into a single multi-way branch at the beginning
of the match. The list of cases can then be duplicated into several "lanes" each corresponding to one kind.
It is then trivial to remove unmatchable cases from each lane.
Depending on the kind, different optimization strategies are possible for each lane.
Note that the body of the match clause does not need to be duplicated, just the pattern.
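The lane construction can be sketched as a simple filter over the list of cases. The flag values, the ``CASES`` table and ``build_lanes`` are all hypothetical, and the sketch simplifies by giving each pattern at most one required kind; capture and wildcard patterns are marked ``None`` because they match any kind.

```python
# Placeholder flag values, assumed for illustration only.
MATCH_SEQUENCE = 0b0001
MATCH_MAPPING = 0b0010

# (case label, kind the pattern requires, or None if it matches any kind)
CASES = [
    ("A", MATCH_SEQUENCE),
    ("B", MATCH_MAPPING),
    ("C", MATCH_SEQUENCE),
    ("D", None),
]


def build_lanes(cases, kinds):
    """Map each kind to the ordered labels of the cases it could match."""
    lanes = {}
    for kind in kinds:
        lanes[kind] = [
            label
            for label, required in cases
            if required is None or required & kind
        ]
    return lanes


# Kind 0 stands for a value that is neither a sequence nor a mapping.
lanes = build_lanes(CASES, [MATCH_SEQUENCE, MATCH_MAPPING, 0])
```

Only the labels are copied into each lane, which mirrors the point that the case bodies themselves need not be duplicated.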
Sequence patterns
'''''''''''''''''
This is probably the most complex to optimize and the most profitable in terms of performance.
Since each pattern can only match a range of lengths, often only a single length,
the sequence of tests can be rewritten as an explicit iteration over the sequence,
attempting to match only those patterns that apply to that sequence length.
For example:

::

    case []:
        A
    case [x]:
        B
    case [x, y]:
        C
    case other:
        D
Can be compiled roughly as:

::

    # Choose lane
    $i = iter($value)
    for $0 in $i:
        break
    else:
        A
        goto done
    for $1 in $i:
        break
    else:
        x = $0
        B
        goto done
    for $2 in $i:
        del $0, $1, $2
        break
    else:
        x = $0
        y = $1
        C
        goto done
    other = $value
    D
    done:
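Since Python has no ``goto``, the same strategy can be approximated in runnable form with ``next()`` and a sentinel. The sketch assumes the sequence lane has already been chosen; ``dispatch`` and ``_MISSING`` are hypothetical names, and the function returns the label of the selected case together with its bindings instead of executing a body.

```python
_MISSING = object()  # sentinel marking an exhausted iterator


def dispatch(value):
    """Return (case label, bindings) for the four example cases."""
    it = iter(value)
    first = next(it, _MISSING)
    if first is _MISSING:
        return "A", {}                          # case []
    second = next(it, _MISSING)
    if second is _MISSING:
        return "B", {"x": first}                # case [x]
    third = next(it, _MISSING)
    if third is _MISSING:
        return "C", {"x": first, "y": second}   # case [x, y]
    return "D", {"other": value}                # case other


outcomes = [dispatch(v)[0] for v in ([], [1], (1, 2), [1, 2, 3])]
```

At most three items are pulled from the iterator, so a long sequence is never copied or fully traversed.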
Mapping patterns
''''''''''''''''
The best strategy here is probably to form a decision tree based on the size of the mapping and which keys are present.
There is no point repeatedly testing for the presence of a key.
For example::

    match obj:
        case {"a": x, "b": y}:
            W
        case {"a": x, "c": y}:
            X
        case {"a": x, "b": _, "c": y}:
            Y
        case other:
            Z
If the key ``"a"`` is not present when checking for case X, there is no need to check it again for Y.
The mapping lane can be implemented roughly as:

::

    # Choose lane
    if len($dict) == 2:
        if "a" in $dict:
            if "b" in $dict:
                x = $dict["a"]
                y = $dict["b"]
                goto W
            if "c" in $dict:
                x = $dict["a"]
                y = $dict["c"]
                goto X
    elif len($dict) == 3:
        if "a" in $dict and "b" in $dict and "c" in $dict:
            x = $dict["a"]
            y = $dict["c"]
            goto Y
    other = $value
    goto Z
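The decision tree can be written as an ordinary Python function for experimentation. This is a sketch, not a prescribed implementation: ``mapping_lane`` is a hypothetical name, the function returns the label of the selected case and its bindings in place of the ``goto``, and the three-key branch tests all three keys, as exact key matching requires.

```python
def mapping_lane(d):
    """Return (case label, bindings) for the four example cases."""
    if len(d) == 2:
        if "a" in d:  # tested once, shared by cases W and X
            if "b" in d:
                return "W", {"x": d["a"], "y": d["b"]}
            if "c" in d:
                return "X", {"x": d["a"], "y": d["c"]}
    elif len(d) == 3:
        if "a" in d and "b" in d and "c" in d:
            return "Y", {"x": d["a"], "y": d["c"]}
    # No mapping pattern matched: fall through to the capture pattern.
    return "Z", {"other": d}


selected = [
    mapping_lane({"a": 1, "b": 2})[0],
    mapping_lane({"a": 1, "c": 3})[0],
    mapping_lane({"a": 1, "b": 2, "c": 3})[0],
    mapping_lane({"a": 1, "b": 2, "d": 4})[0],
]
```

Branching on the length first means that each key is tested at most once on any path through the tree.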
Summary of differences between this PEP and PEP 634