959 lines
28 KiB
ReStructuredText
959 lines
28 KiB
ReStructuredText
PEP: 653
|
||
Title: Precise Semantics for Pattern Matching
|
||
Author: Mark Shannon <mark@hotpy.org>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 9-Feb-2021
|
||
Post-History: 18-Feb-2021
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes a semantics for pattern matching that respects the general concept of PEP 634,
|
||
but is more precise, easier to reason about, and should be faster.
|
||
|
||
The object model will be extended with two special (dunder) attributes,
|
||
in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching.
|
||
|
||
* A ``__match_kind__`` attribute. Must be an integer.
|
||
* A ``__match_args__`` attribute. Only needed for those classes wanting to customize matching the class pattern.
|
||
If present, it must be a tuple of strings.
|
||
* A ``__deconstruct__()`` method. Only needed for customizing matching of class patterns with positional arguments.
|
||
Returns an iterable over the components of the deconstructed object.
|
||
|
||
|
||
With this PEP:
|
||
|
||
* The semantics of pattern matching will be clearer, so that patterns are easier to reason about.
|
||
* It will be possible to implement pattern matching in a more efficient fashion.
|
||
* Pattern matching will be more usable for complex classes, by allowing classes more control over which patterns they match.
|
||
|
||
Motivation
|
||
==========
|
||
|
||
Pattern matching in Python, as described in PEP 634, is to be added to Python 3.10.
|
||
Unfortunately, PEP 634 is not as precise about the semantics as it could be,
|
||
nor does it allow classes sufficient control over how they match patterns.
|
||
|
||
Precise semantics
|
||
-----------------
|
||
|
||
PEP 634 explicitly includes a section on undefined behavior.
|
||
Large amounts of undefined behavior may be acceptable in a language like C,
|
||
but in Python it should be kept to a minimum.
|
||
Pattern matching in Python can be defined more precisely without loosing expressiveness or performance.
|
||
|
||
Improved control over class matching
|
||
------------------------------------
|
||
|
||
PEP 634 assumes that class instances are simply a collection of their attributes,
|
||
and that deconstruction by attribute access is the dual of construction. That is not true, as
|
||
many classes have a more complex relation between their constructor and internal attributes.
|
||
Those classes need to be able to define their own deconstruction.
|
||
|
||
For example, using ``sympy``, we might want to write::
|
||
|
||
# sin(x)**2 + cos(x)**2 == 1
|
||
case Add(Pow(sin(a), 2), Pow(cos(b), 2)) if a == b:
|
||
return 1
|
||
|
||
For ``sympy`` to support this pattern for PEP 634 would be possible, but tricky and cumbersome.
|
||
With this PEP it can be implemented easily [1]_.
|
||
|
||
PEP 634 also privileges some builtin classes with a special form of matching, the "self" match.
|
||
For example the pattern ``list(x)`` matches a list and assigns the list to ``x``.
|
||
By allowing classes to choose which kinds of pattern they match, other classes can use this form as well.
|
||
|
||
|
||
Robustness
|
||
----------
|
||
|
||
With this PEP, access to attributes during pattern matching becomes well defined and deterministic.
|
||
This makes pattern matching less error prone when matching objects with hidden side effects, such as object-relational mappers.
|
||
Objects will have control over their own deconstruction, which can help prevent unintended consequences should attribute access have side-effects.
|
||
|
||
PEP 634 relies on the ``collections.abc`` module when determining which patterns a value can match, implicitly importing it if necessary.
|
||
This PEP will eliminate surprising import errors and misleading audit events from those imports.
|
||
|
||
Efficient implementation
|
||
------------------------
|
||
|
||
The semantics proposed in this PEP will allow efficient implementation, partly as a result of having precise semantics
|
||
and partly from using the object model.
|
||
|
||
With precise semantics, it is possible to reason about what code transformations are correct,
|
||
and thus apply optimizations effectively.
|
||
|
||
Because the object model is a core part of Python, implementations already handle special attribute lookup efficiently.
|
||
Looking up a special attribute is much faster than performing a subclass test on an abstract base class.
|
||
|
||
Rationale
|
||
=========
|
||
|
||
The object model and special methods are at the core of the Python language. Consequently,
|
||
implementations support them well.
|
||
Using special attributes for pattern matching allows pattern matching to be implemented in a way that
|
||
integrates well with the rest of the implementation, and is thus easier to maintain and is likely to perform better.
|
||
|
||
A match statement performs a sequence of pattern matches. In general, matching a pattern has three parts:
|
||
|
||
1. Can the value match this kind of pattern?
|
||
2. When deconstructed, does the value match this particular pattern?
|
||
3. Is the guard true?
|
||
|
||
To determine whether a value can match a particular kind of pattern, we add the ``__match_kind__`` attribute.
|
||
This allows the kind of a value to be determined once and in a efficient fashion.
|
||
|
||
To deconstruct an object, pre-existing special methods can be used for sequence and mapping patterns, but something new is needed for class patterns.
|
||
PEP 634 proposes using ad-hoc attribute access, disregarding the possibility of side-effects.
|
||
This could be problematic should the attributes of the object be dynamically created or consume resources.
|
||
By adding the ``__deconstruct__()`` method, objects can control how they are deconstructed,
|
||
and patterns with a different set of attributes can be efficiently rejected.
|
||
Should deconstruction of an object make no sense, then classes can define ``__match_kind__`` to reject class patterns completely.
|
||
|
||
Specification
|
||
=============
|
||
|
||
|
||
Additions to the object model
|
||
-----------------------------
|
||
|
||
A ``__match_kind__`` attribute will be added to ``object``.
|
||
It should be overridden by classes that want to match mapping or sequence patterns,
|
||
or want change the default behavior when matching class patterns.
|
||
It must be an integer and should be exactly one of these::
|
||
|
||
0
|
||
MATCH_SEQUENCE
|
||
MATCH_MAPPING
|
||
|
||
bitwise ``or``\ ed with exactly one of these::
|
||
|
||
0
|
||
MATCH_DEFAULT
|
||
MATCH_POSITIONAL
|
||
MATCH_SELF
|
||
|
||
.. note::
|
||
It does not matter what the actual values are. We will refer to them by name only.
|
||
Symbolic constants will be provided both for Python and C, and once defined they will
|
||
never be changed.
|
||
|
||
Classes inheriting from ``object`` will inherit ``__match_kind__ = MATCH_DEFAULT`` and ``__match_args__ = ()``
|
||
|
||
Classes which define ``__match_kind__ & MATCH_POSITIONAL`` to be non-zero must
|
||
implement ``__deconstruct__()`` and should consider redefining ``__match_args__``.
|
||
|
||
* ``__match_args__``: should hold a tuple of strings indicating the names of attributes that are to be considered for matching; it may be empty for positional-only matches.
|
||
* ``__deconstruct__()``: should return a sequence which contains the parts of the deconstructed object.
|
||
|
||
.. note::
|
||
``__match_args__`` will be automatically generated for dataclasses and named tuples, as specified in PEP 634.
|
||
|
||
The pattern matching implementation is *not* required to check that ``__match_args__`` and ``__deconstruct__`` behave as specified.
|
||
If the value of ``__match_args__`` or the result of ``__deconstruct__()`` is not as specified, then
|
||
the implementation may raise any exception, or match the wrong pattern.
|
||
Of course, implementations are free to check these properties and provide meaningful error messages if they can do so efficiently.
|
||
|
||
Semantics of the matching process
|
||
---------------------------------
|
||
|
||
In the following, all variables of the form ``$var`` are temporary variables and are not visible to the Python program.
|
||
They may be visible via introspection, but that is an implementation detail and should not be relied on.
|
||
The psuedo-statement ``FAIL`` is used to signify that matching failed for this pattern and that matching should move to the next pattern.
|
||
If control reaches the end of the translation without reaching a ``FAIL``, then it has matched, and following patterns are ignored.
|
||
All the translations below include guards. If no guard is present, simply substitute the guard ``if True`` when translating.
|
||
|
||
Variables of the form ``$ALL_CAPS`` are meta-variables holding a syntactic element, they are not normal variables.
|
||
So, ``$VARS = $items`` is not an assignment of ``$items`` to ``$VARS``,
|
||
but an unpacking of ``$items`` into the variables that ``$VARS`` holds.
|
||
For example, with the abstract syntax ``case [$VARS]:``, and the concrete syntax ``case[a, b]:`` then ``$VARS`` would hold the variables ``(a, b)``,
|
||
not the values of those variables.
|
||
|
||
The psuedo-function ``QUOTE`` takes a variable and returns the name of that variable.
|
||
For example, if the meta-variable ``$VAR`` held the variable ``foo`` then ``QUOTE($VAR) == "foo"``.
|
||
|
||
All additional code listed below that is not present in the original source will not trigger line events, conforming to PEP 626.
|
||
|
||
|
||
Preamble
|
||
''''''''
|
||
|
||
Before any patterns are matched, the expression being matched is evaluated and its kind is determined::
|
||
|
||
match expr:
|
||
|
||
translates to::
|
||
|
||
$value = expr
|
||
$kind = type($value).__match_kind__
|
||
|
||
In addition some helper variables are initialized::
|
||
|
||
$list = None
|
||
$dict = None
|
||
$attrs = None
|
||
$items = None
|
||
|
||
Capture patterns
|
||
''''''''''''''''
|
||
|
||
Capture patterns always match, so the irrefutable match::
|
||
|
||
case capture_var:
|
||
|
||
translates to::
|
||
|
||
capture_var = $value
|
||
|
||
Wildcard patterns
|
||
'''''''''''''''''
|
||
|
||
Wildcard patterns always match, so::
|
||
|
||
case _:
|
||
|
||
translates to::
|
||
|
||
# No code -- Automatically matches
|
||
|
||
|
||
Literal Patterns
|
||
''''''''''''''''
|
||
|
||
The literal pattern::
|
||
|
||
case LITERAL:
|
||
|
||
translates to::
|
||
|
||
if $value != LITERAL:
|
||
FAIL
|
||
|
||
except when the literal is one of ``None``, ``True`` or ``False`` ,
|
||
when it translates to::
|
||
|
||
if $value is not LITERAL:
|
||
FAIL
|
||
|
||
Value Patterns
|
||
''''''''''''''
|
||
|
||
The value pattern::
|
||
|
||
case value.pattern:
|
||
|
||
translates to::
|
||
|
||
if $value != value.pattern:
|
||
FAIL
|
||
|
||
Sequence Patterns
|
||
'''''''''''''''''
|
||
|
||
Before matching the first sequence pattern, but after checking that ``$value`` is a sequence,
|
||
``$value`` is converted to a list.
|
||
|
||
A pattern not including a star pattern::
|
||
|
||
case [$VARS]:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_SEQUENCE == 0:
|
||
FAIL
|
||
if $list is None:
|
||
$list = list($value)
|
||
if len($list) != len($VARS):
|
||
FAIL
|
||
$VARS = $list
|
||
|
||
Example: [2]_
|
||
|
||
A pattern including a star pattern::
|
||
|
||
case [$VARS]
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_SEQUENCE == 0:
|
||
FAIL
|
||
if $list is None:
|
||
$list = list($value)
|
||
if len($list) < len($VARS):
|
||
FAIL
|
||
$VARS = $list # Note that $VARS includes a star expression.
|
||
|
||
Example: [3]_
|
||
|
||
Mapping Patterns
|
||
''''''''''''''''
|
||
|
||
Before matching the first mapping pattern, but after checking that ``$value`` is a mapping,
|
||
``$value`` is converted to a ``dict``.
|
||
|
||
A pattern not including a double-star pattern::
|
||
|
||
case {$KEYWORD_PATTERNS}:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_MAPPING == 0:
|
||
FAIL
|
||
if $dict is None:
|
||
$dict = dict($value)
|
||
if $dict.keys() != $KEYWORD_PATTERNS.keys():
|
||
FAIL
|
||
# $KEYWORD_PATTERNS is a meta-variable mapping names to variables.
|
||
for $KEYWORD in $KEYWORD_PATTERNS:
|
||
$KEYWORD_PATTERNS[$KEYWORD] = $dict[QUOTE($KEYWORD)]
|
||
|
||
Example: [4]_
|
||
|
||
A pattern including a double-star pattern::
|
||
|
||
case {$KEYWORD_PATTERNS, **$DOUBLE_STARRED_PATTERN}:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_MAPPING == 0:
|
||
FAIL
|
||
if $dict is None:
|
||
$dict = dict($value)
|
||
if $dict.keys() not >= $KEYWORD_PATTERNS.keys():
|
||
FAIL:
|
||
# $KEYWORD_PATTERNS is a meta-variable mapping names to variables.
|
||
$tmp = dict($dict)
|
||
for $KEYWORD in $KEYWORD_PATTERNS:
|
||
$KEYWORD_PATTERNS[$KEYWORD] = $tmp.pop(QUOTE($KEYWORD))
|
||
$DOUBLE_STARRED_PATTERN = $tmp
|
||
|
||
Example: [5]_
|
||
|
||
Class Patterns
|
||
''''''''''''''
|
||
|
||
Class pattern with no arguments::
|
||
|
||
case ClsName():
|
||
|
||
translates to::
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
|
||
Class pattern with a single positional pattern::
|
||
|
||
case ClsName($VAR):
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_SELF:
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
$VAR = $value
|
||
else:
|
||
As other positional-only class pattern
|
||
|
||
|
||
Positional-only class pattern::
|
||
|
||
case ClsName($VARS):
|
||
|
||
translates to::
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
if $kind & MATCH_POSITIONAL:
|
||
if $items is None:
|
||
$items = type($value).__deconstruct__($value)
|
||
# $VARS is a meta-variable.
|
||
if len($items) < len($VARS):
|
||
FAIL
|
||
$VARS = $items
|
||
elif $kind & MATCH_DEFAULT:
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
if len($attr) < len($VARS):
|
||
FAIL
|
||
try:
|
||
for i, $VAR in enumerate($VARS):
|
||
$VAR = getattr($value, $attrs[i])
|
||
except AttributeError:
|
||
FAIL
|
||
else:
|
||
FAIL
|
||
|
||
Example: [6]_
|
||
|
||
.. note::
|
||
|
||
``__match_args__`` is not checked when matching positional-only class patterns,
|
||
this allows classes to match only positional-only patterns by leaving ``__match_args__`` set to the default value of ``()``.
|
||
|
||
Class patterns with all keyword patterns::
|
||
|
||
case ClsName($KEYWORD_PATTERNS):
|
||
|
||
translates to::
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
if $kind & MATCH_POSITIONAL:
|
||
if $items is None:
|
||
$items = type($value).__deconstruct__($value)
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
$kwname_tuple = tuple(QUOTE($KEYWORD) for $KEYWORD in $KEYWORD_PATTERNS)
|
||
$indices = multi_index($attrs, $kwname_tuple, 0)
|
||
if $indices is None:
|
||
raise TypeError(...)
|
||
try:
|
||
for $KEYWORD, $index in zip($KEYWORD_PATTERNS, indices):
|
||
$KEYWORD_PATTERNS[$KEYWORD] = $items[$index]
|
||
except IndexError:
|
||
raise TypeError(...)
|
||
elif $kind & MATCH_DEFAULT:
|
||
try:
|
||
for $KEYWORD in $KEYWORD_PATTERNS:
|
||
$tmp = getattr($value, QUOTE($KEYWORD))
|
||
$KEYWORD_PATTERNS[$KEYWORD] = $tmp
|
||
except AttributeError:
|
||
FAIL
|
||
else:
|
||
FAIL
|
||
|
||
Where the helper function ``multi_index(t, values, min)`` returns a tuple of indices of ``values`` into ``t``,
|
||
or ``None`` if any value is not present in ``t`` or the index of the value is less than ``min``.
|
||
|
||
Examples::
|
||
|
||
multi_index(("a", "b", "c"), ("a", "c"), 0) == (0,2)
|
||
multi_index(("a", "b", "c"), ("a", "c"), 1) is None
|
||
multi_index(("a", "b", "c"), ("a", "d"), 0) is None
|
||
|
||
Example: [7]_
|
||
|
||
Class patterns with positional and keyword patterns::
|
||
|
||
case ClsName($VARS, $KEYWORD_PATTERNS):
|
||
|
||
translates to::
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
if $kind & MATCH_POSITIONAL:
|
||
if $items is None:
|
||
$items = type($value).__deconstruct__($value)
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
if len($items) < len($VARS):
|
||
FAIL
|
||
$VARS = $items[:len($VARS)]
|
||
$kwname_tuple = tuple(QUOTE($KEYWORD) for $KEYWORD in $KEYWORD_PATTERNS)
|
||
$indices = multi_index($attrs, $kwname_tuple, len($VARS))
|
||
if $indices is None:
|
||
raise TypeError(...)
|
||
try:
|
||
for $KEYWORD, $index in zip($KEYWORD_PATTERNS, indices):
|
||
$KEYWORD_PATTERNS[$KEYWORD] = $items[$index]
|
||
except IndexError:
|
||
raise TypeError(...)
|
||
elif $kind & MATCH_DEFAULT:
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
if len($attr) < len($VARS):
|
||
raise TypeError(...)
|
||
$positional_names = $attrs[:len($VARS)]
|
||
try:
|
||
for i, $VAR in enumerate($VARS):
|
||
$VAR = getattr($value, $attrs[i])
|
||
for $KEYWORD in $KEYWORD_PATTERNS:
|
||
$name = QUOTE($KEYWORD)
|
||
if $name in $positional_names:
|
||
raise TypeError(...)
|
||
$KEYWORD_PATTERNS[$KEYWORD] = getattr($value, $name)
|
||
except AttributeError:
|
||
FAIL
|
||
else:
|
||
FAIL
|
||
|
||
Example: [8]_
|
||
|
||
|
||
Nested patterns
|
||
'''''''''''''''
|
||
|
||
The above specification assumes that patterns are not nested. For nested patterns
|
||
the above translations are applied recursively by introducing temporary capture patterns.
|
||
|
||
For example, the pattern::
|
||
|
||
case [int(), str()]:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_SEQUENCE == 0:
|
||
FAIL
|
||
if $list is None:
|
||
$list = list($value)
|
||
if len($list) != 2:
|
||
FAIL
|
||
$value_0, $value_1 = $list
|
||
#Now match on temporary values
|
||
if not isinstance($value_0, int):
|
||
FAIL
|
||
if not isinstance($value_1, str):
|
||
FAIL
|
||
|
||
Guards
|
||
''''''
|
||
|
||
Guards translate to a test following the rest of the translation::
|
||
|
||
case pattern if guard:
|
||
|
||
translates to::
|
||
|
||
[translation for pattern]
|
||
if not guard:
|
||
FAIL
|
||
|
||
|
||
Non-conforming ``__match_kind__``
|
||
'''''''''''''''''''''''''''''''''
|
||
|
||
All classes should ensure that the the value of ``__match_kind__`` follows the specification.
|
||
Therefore, implementations can assume, without checking, that all the following are true::
|
||
|
||
(__match_kind__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING)
|
||
(__match_kind__ & (MATCH_SELF | MATCH_POSITIONAL)) != (MATCH_SELF | MATCH_POSITIONAL)
|
||
(__match_kind__ & (MATCH_SELF | MATCH_DEFAULT)) != (MATCH_SELF | MATCH_DEFAULT)
|
||
(__match_kind__ & (MATCH_DEFAULT | MATCH_POSITIONAL)) != (MATCH_DEFAULT | MATCH_POSITIONAL)
|
||
|
||
Thus, implementations can assume that ``__match_kind__ & MATCH_SEQUENCE`` implies ``(__match_kind__ & MATCH_MAPPING) == 0``, and vice-versa.
|
||
Likewise for ``MATCH_SELF``, ``MATCH_POSITIONAL`` and ``MATCH_DEFAULT``.
|
||
|
||
If ``__match_kind__`` does not follow the specification,
|
||
then implementations may treat any of the expressions of the form ``$kind & MATCH_...`` above as having any value.
|
||
|
||
Implementation of ``__match_kind__`` in the standard library
|
||
------------------------------------------------------------
|
||
|
||
``object.__match_kind__`` will be ``MATCH_DEFAULT``.
|
||
|
||
For common builtin classes ``__match_kind__`` will be:
|
||
|
||
* ``bool``: ``MATCH_SELF``
|
||
* ``bytearray``: ``MATCH_SELF``
|
||
* ``bytes``: ``MATCH_SELF``
|
||
* ``float``: ``MATCH_SELF``
|
||
* ``frozenset``: ``MATCH_SELF``
|
||
* ``int``: ``MATCH_SELF``
|
||
* ``set``: ``MATCH_SELF``
|
||
* ``str``: ``MATCH_SELF``
|
||
* ``list``: ``MATCH_SEQUENCE | MATCH_SELF``
|
||
* ``tuple``: ``MATCH_SEQUENCE | MATCH_SELF``
|
||
* ``dict``: ``MATCH_MAPPING | MATCH_SELF``
|
||
|
||
Named tuples will have ``__match_kind__`` set to ``MATCH_SEQUENCE | MATCH_POSITIONAL``.
|
||
|
||
* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_kind__`` set to ``MATCH_MAPPING``.
|
||
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_kind__`` set to ``MATCH_SEQUENCE``.
|
||
|
||
|
||
Legal optimizations
|
||
-------------------
|
||
|
||
The above semantics implies a lot of redundant effort and copying in the implementation.
|
||
However, it is possible to implement the above semantics efficiently by employing semantic preserving transformations
|
||
on the naive implementation.
|
||
|
||
When performing matching, implementations are allowed
|
||
to treat the following functions and methods as pure:
|
||
|
||
* ``cls.__len__()`` for any class supporting ``MATCH_SEQUENCE``
|
||
* ``dict.keys()``
|
||
* ``dict.__contains__()``
|
||
* ``dict.__getitem__()``
|
||
|
||
Implementations are allowed to freely replace ``isinstance(obj, cls)`` with ``issubclass(type(obj), cls)`` and vice-versa.
|
||
Implementations are also allowed to elide repeated tests of ``isinstance(obj, cls)``.
|
||
|
||
Security Implications
|
||
=====================
|
||
|
||
None.
|
||
|
||
Implementation
|
||
==============
|
||
|
||
The naive implementation that follows from the specification will not be very efficient.
|
||
Fortunately, there are some reasonably straightforward transformations that can be used to improve performance.
|
||
Performance should be comparable to the implementation of PEP 634 (at time of writing) by the release of 3.10.
|
||
Further performance improvements may have to wait for the 3.11 release.
|
||
|
||
Possible optimizations
|
||
----------------------
|
||
|
||
The following is not part of the specification,
|
||
but guidelines to help developers create an efficient implementation.
|
||
|
||
Splitting evaluation into lanes
|
||
'''''''''''''''''''''''''''''''
|
||
|
||
Since the first step in matching each pattern is check to against the kind, it is possible to combine all the checks against kind into a single multi-way branch at the beginning
|
||
of the match. The list of cases can then be duplicated into several "lanes" each corresponding to one kind.
|
||
It is then trivial to remove unmatchable cases from each lane.
|
||
Depending on the kind, different optimization strategies are possible for each lane.
|
||
Note that the body of the match clause does not need to be duplicated, just the pattern.
|
||
|
||
Sequence patterns
|
||
'''''''''''''''''
|
||
|
||
This is probably the most complex to optimize and the most profitable in terms of performance.
|
||
Since each pattern can only match a range of lengths, often only a single length,
|
||
the sequence of tests can be rewitten in as an explicit iteration over the sequence,
|
||
attempting to match only those patterns that apply to that sequence length.
|
||
|
||
For example:
|
||
|
||
::
|
||
|
||
case []:
|
||
A
|
||
case [x]:
|
||
B
|
||
case [x, y]:
|
||
C
|
||
case other:
|
||
D
|
||
|
||
Can be compiled roughly as:
|
||
|
||
::
|
||
|
||
# Choose lane
|
||
$i = iter($value)
|
||
for $0 in $i:
|
||
break
|
||
else:
|
||
A
|
||
goto done
|
||
for $1 in $i:
|
||
break
|
||
else:
|
||
x = $0
|
||
B
|
||
goto done
|
||
for $2 in $i:
|
||
del $0, $1, $2
|
||
break
|
||
else:
|
||
x = $0
|
||
y = $1
|
||
C
|
||
goto done
|
||
other = $value
|
||
D
|
||
done:
|
||
|
||
|
||
Mapping patterns
|
||
''''''''''''''''
|
||
|
||
The best stategy here is probably to form a decision tree based on the size of the mapping and which keys are present.
|
||
There is no point repeatedly testing for the presence of a key.
|
||
For example::
|
||
|
||
match obj:
|
||
case {a:x, b:y}:
|
||
W
|
||
case {a:x, c:y}:
|
||
X
|
||
case {a:x, b:_, c:y}:
|
||
Y
|
||
case other:
|
||
Z
|
||
|
||
If the key ``"a"`` is not present when checking for case X, there is no need to check it again for Y.
|
||
|
||
The mapping lane can be implemented, roughly as:
|
||
|
||
::
|
||
|
||
# Choose lane
|
||
if len($dict) == 2:
|
||
if "a" in $dict:
|
||
if "b" in $dict:
|
||
x = $dict["a"]
|
||
y = $dict["b"]
|
||
goto W
|
||
if "c" in $dict:
|
||
x = $dict["a"]
|
||
y = $dict["c"]
|
||
goto X
|
||
elif len(dict) == 3:
|
||
if "a" in $dict and "b" in $dict:
|
||
x = $dict["a"]
|
||
y = $dict["c"]
|
||
goto Y
|
||
other = $value
|
||
goto Z
|
||
|
||
Summary of differences between this PEP and PEP 634
|
||
===================================================
|
||
|
||
|
||
The changes to the semantics can be summarized as:
|
||
|
||
* Selecting the kind of pattern uses ``cls.__match_kind__`` instead of
|
||
``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``
|
||
and allows classes control over which kinds of pattern they match.
|
||
* Class matching is controlled by the ``__match_kind__`` attribute,
|
||
and the ``__deconstruct__`` method allows classes more control over how they are deconstructed.
|
||
* The default behavior when matching a class pattern with keyword patterns is more precisely defined,
|
||
but is broadly unchanged.
|
||
|
||
There are no changes to syntax. All examples given in the PEP 636 tutorial should continue to work as they do now.
|
||
|
||
Rejected Ideas
|
||
==============
|
||
|
||
An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``__match_kind__ == MATCH_DEFAULT``.
|
||
The intent was to avoid capturing bound-methods and other synthetic attributes. However, this also mean that properties were ignored.
|
||
|
||
For the class::
|
||
|
||
class C:
|
||
def __init__(self):
|
||
self.a = "a"
|
||
@property
|
||
def p(self):
|
||
...
|
||
def m(self):
|
||
...
|
||
|
||
Ideally we would match the attributes "a" and "p", but not "m".
|
||
However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_DEFAULT``.
|
||
Classes may override this behavior if needed by using ``__match_kind__ == MATCH_POSITIONAL`` or ``__match_args__``.
|
||
|
||
Open Issues
|
||
===========
|
||
|
||
None, as yet.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
PEP 634
|
||
https://www.python.org/dev/peps/pep-0634
|
||
|
||
Code examples
|
||
=============
|
||
|
||
.. [1]
|
||
|
||
::
|
||
|
||
class Basic:
|
||
__match_kind__ = MATCH_POSITIONAL
|
||
def __deconstruct__(self):
|
||
return self._args
|
||
|
||
.. [2]
|
||
|
||
This::
|
||
|
||
case [a, b] if a is b:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_SEQUENCE == 0:
|
||
FAIL
|
||
if $list is None:
|
||
$list = list($value)
|
||
if len($list) != 2:
|
||
FAIL
|
||
a, b = $list
|
||
if not a is b:
|
||
FAIL
|
||
|
||
.. [3]
|
||
|
||
This::
|
||
|
||
case [a, *b, c]:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_SEQUENCE == 0:
|
||
FAIL
|
||
if $list is None:
|
||
$list = list($value)
|
||
if len($list) < 2:
|
||
FAIL
|
||
a, *b, c = $list
|
||
|
||
.. [4]
|
||
|
||
This::
|
||
|
||
case {"x": x, "y": y} if x > 2:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_MAPPING == 0:
|
||
FAIL
|
||
if $dict is None:
|
||
$dict = dict($value)
|
||
if $dict.keys() != {"x", "y"}:
|
||
FAIL
|
||
x = $dict["x"]
|
||
y = $dict["y"]
|
||
if not x > 2:
|
||
FAIL
|
||
|
||
.. [5]
|
||
|
||
This::
|
||
|
||
case {"x": x, "y": y, **: z}:
|
||
|
||
translates to::
|
||
|
||
if $kind & MATCH_MAPPING == 0:
|
||
FAIL
|
||
if $dict is None:
|
||
$dict = dict($value)
|
||
if not $dict.keys() >= {"x", "y"}:
|
||
FAIL
|
||
$tmp = dict($dict)
|
||
x = $tmp.pop("x")
|
||
y = $tmp.pop("y")
|
||
z = $tmp
|
||
|
||
.. [6]
|
||
|
||
This::
|
||
|
||
match ClsName(x, y):
|
||
|
||
translates to::
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
if $kind & MATCH_POSITIONAL:
|
||
if $items is None:
|
||
$items = type($value).__deconstruct__($value)
|
||
if len($items) < 2:
|
||
FAIL
|
||
x, y = $items
|
||
elif $kind & MATCH_DEFAULT:
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
if len($attr) < 2:
|
||
FAIL
|
||
try:
|
||
x = getattr($value, $attrs[0])
|
||
y = getattr($value, $attrs[1])
|
||
except AttributeError:
|
||
FAIL
|
||
else:
|
||
FAIL
|
||
|
||
.. [7]
|
||
|
||
This::
|
||
|
||
match ClsName(a=x, b=y):
|
||
|
||
translates to::
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
if $kind & MATCH_POSITIONAL:
|
||
if $items is None:
|
||
$items = type($value).__deconstruct__($value)
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
$indices = multi_index($attrs, ("a", "b"), 0)
|
||
if $indices is None:
|
||
raise TypeError(...)
|
||
try:
|
||
x = $items[$indices[0]]
|
||
y = $items[$indices[1]]
|
||
except IndexError:
|
||
raise TypeError(...)
|
||
elif $kind & MATCH_DEFAULT:
|
||
try:
|
||
x = $value.a
|
||
y = $value.b
|
||
except AttributeError:
|
||
FAIL
|
||
else:
|
||
FAIL
|
||
|
||
.. [8]
|
||
|
||
This::
|
||
|
||
match ClsName(x, a=y):
|
||
|
||
translates to::
|
||
|
||
|
||
if not isinstance($value, ClsName):
|
||
FAIL
|
||
if $kind & MATCH_POSITIONAL:
|
||
if $items is None:
|
||
$items = type($value).__deconstruct__($value)
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
if len($items) < 1:
|
||
FAIL
|
||
x = $items[0]
|
||
$indices = multi_index($attrs, ("a",), 1)
|
||
if $indices is None:
|
||
raise TypeError(...)
|
||
$index = $indices[0]
|
||
try:
|
||
y = $items[$index]
|
||
except IndexError:
|
||
raise TypeError(...)
|
||
elif $kind & MATCH_DEFAULT:
|
||
if $attrs is None:
|
||
$attrs = type($value).__match_args__
|
||
if len($attr) < 1:
|
||
raise TypeError(...)
|
||
$positional_names = $attrs[:1]
|
||
try:
|
||
x = getattr($value, $attrs[0])
|
||
if "a" in $positional_names:
|
||
raise TypeError(...)
|
||
y = $value.a
|
||
except AttributeError:
|
||
FAIL
|
||
else:
|
||
FAIL
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document is placed in the public domain or under the
|
||
CC0-1.0-Universal license, whichever is more permissive.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|