PEP 653: Split __match_kind__ into __match_container__ and __match_class__ (#1901)

Mark Shannon 2021-03-30 11:47:14 +01:00 committed by GitHub
parent 31e30aebe1
commit 0a0e7a3045
1 changed files with 96 additions and 61 deletions


@ -14,9 +14,9 @@ Abstract
This PEP proposes a semantics for pattern matching that respects the general concept of PEP 634,
but is more precise, easier to reason about, and should be faster.
The object model will be extended with a special (dunder) attribute, ``__match_kind__``,
in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching.
The ``__match_kind__`` attribute must be an integer.
The object model will be extended with two special (dunder) attributes, ``__match_container__`` and
``__match_class__``, in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching.
Both of these new attributes must be integers and ``__match_args__`` is required to be a tuple.
With this PEP:
@ -97,29 +97,30 @@ A match statement performs a sequence of pattern matches. In general, matching a
2. When deconstructed, does the value match this particular pattern?
3. Is the guard true?
To determine whether a value can match a particular kind of pattern, we add the ``__match_kind__`` attribute.
This allows the kind of a value to be determined once and in an efficient fashion.
To determine whether a value can match a particular kind of pattern, we add the ``__match_container__``
and ``__match_class__`` attributes.
This allows the kind of a value to be determined in an efficient fashion.
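As a rough illustration of the first two questions for a two-element sequence pattern, the sketch below emulates ``case [a, b]:`` in plain Python. The numeric values of the constants, the ``match_pair`` helper and the ``Pair`` class are assumptions made for this example; they are not defined by the PEP text shown here.

```python
# Placeholder values: the numeric values are assumptions for this sketch.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2


def match_pair(value):
    """Emulate ``case [a, b]:``; return (a, b) on success, None on failure."""
    # Question 1: can the value match a sequence pattern at all?
    kind = getattr(type(value), "__match_container__", 0)
    if (kind & MATCH_SEQUENCE) == 0:
        return None  # FAIL
    # Question 2: when deconstructed, does it match this particular pattern?
    if len(value) != 2:
        return None  # FAIL
    return value[0], value[1]


class Pair:
    """Hypothetical class that opts in to sequence patterns."""

    __match_container__ = MATCH_SEQUENCE

    def __init__(self, first, second):
        self._items = (first, second)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


class Plain:
    """Hypothetical class that does not opt in."""
```

The explicit parentheses in ``(kind & MATCH_SEQUENCE) == 0`` matter in real Python, where ``==`` binds more tightly than ``&``.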
Specification
=============
Additions to the object model
-----------------------------
A ``__match_kind__`` attribute will be added to ``object``.
It should be overridden by classes that want to match mapping or sequence patterns,
or want to change the default behavior when matching class patterns.
It must be an integer and should be exactly one of these::
The ``__match_container__`` and ``__match_class__`` attributes will be added to ``object``.
``__match_container__`` should be overridden by classes that want to match mapping or sequence patterns.
``__match_class__`` should be overridden by classes that want to change the default behavior when matching class patterns.
``__match_container__`` must be an integer and should be exactly one of these::
0
MATCH_SEQUENCE
MATCH_MAPPING
bitwise ``or``\ ed with exactly one of these::
``__match_class__`` must be an integer and should be exactly one of these::
0
MATCH_DEFAULT
MATCH_ATTRIBUTES
MATCH_SELF
.. note::
@ -127,15 +128,21 @@ bitwise ``or``\ ed with exactly one of these::
Symbolic constants will be provided both for Python and C, and once defined they will
never be changed.
Classes inheriting from ``object`` will inherit ``__match_kind__ = MATCH_DEFAULT`` and ``__match_args__ = ()``
``object`` will have the following values for the special attributes::
__match_container__ = 0
__match_class__ = MATCH_ATTRIBUTES
__match_args__ = ()
These special attributes will be inherited as normal.
If ``__match_args__`` is overridden, then it is required to hold a tuple of strings. It may be empty.
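A minimal sketch of how these defaults and overrides might look in practice follows. The constant values are placeholders, ``Base`` stands in for ``object`` as it would behave under this PEP, and ``Point`` and ``Stack`` are hypothetical classes introduced only for illustration.

```python
# Placeholder values: assumptions for illustration only.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2
MATCH_SELF = 8
MATCH_ATTRIBUTES = 16


class Base:
    """Stands in for ``object`` with the proposed default values."""

    __match_container__ = 0
    __match_class__ = MATCH_ATTRIBUTES
    __match_args__ = ()


class Point(Base):
    """Overrides only ``__match_args__``; it must be a tuple of strings."""

    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Stack(Base):
    """Opts in to sequence patterns; class matching is inherited unchanged."""

    __match_container__ = MATCH_SEQUENCE

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]
```

Because the attributes are inherited as normal, ``Stack`` keeps ``__match_class__ = MATCH_ATTRIBUTES`` and ``Point`` keeps ``__match_container__ = 0`` without restating them.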
.. note::
``__match_args__`` will be automatically generated for dataclasses and named tuples, as specified in PEP 634.
The pattern matching implementation is *not* required to check that ``__match_args__`` behaves as specified.
If the value of ``__match_args__`` is not as specified, then
The pattern matching implementation is *not* required to check that any of these attributes behave as specified.
If the value of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is not as specified, then
the implementation may raise any exception, or match the wrong pattern.
Of course, implementations are free to check these properties and provide meaningful error messages if they can do so efficiently.
@ -163,14 +170,13 @@ All additional code listed below that is not present in the original source will
Preamble
''''''''
Before any patterns are matched, the expression being matched is evaluated and its kind is determined::
Before any patterns are matched, the expression being matched is evaluated::
match expr:
translates to::
$value = expr
$kind = type($value).__match_kind__
Capture patterns
''''''''''''''''
@ -234,6 +240,7 @@ A pattern not including a star pattern::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) != len($VARS):
@ -248,6 +255,7 @@ A pattern including a star pattern::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) < len($VARS):
@ -265,6 +273,7 @@ A pattern not including a double-star pattern::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if not $value.keys() >= $KEYWORD_PATTERNS.keys():
@ -281,6 +290,7 @@ A pattern including a double-star pattern::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if not $value.keys() >= $KEYWORD_PATTERNS.keys():
@ -308,7 +318,7 @@ translates to::
.. note::
``case ClsName():`` is the only class pattern that can succeed if
``($kind & (MATCH_SELF|MATCH_DEFAULT)) == 0``
``($kind & (MATCH_SELF|MATCH_ATTRIBUTES)) == 0``
Class pattern with a single positional pattern::
@ -317,6 +327,7 @@ Class pattern with a single positional pattern::
translates to::
$kind = type($value).__match_class__
if $kind & MATCH_SELF:
if not isinstance($value, ClsName):
FAIL
@ -333,7 +344,8 @@ translates to::
if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < len($VARS):
raise TypeError(...)
@ -355,7 +367,8 @@ translates to::
if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
try:
for $KEYWORD in $KEYWORD_PATTERNS:
$tmp = getattr($value, QUOTE($KEYWORD))
@ -375,7 +388,8 @@ translates to::
if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < len($VARS):
raise TypeError(...)
@ -408,6 +422,7 @@ For example, the pattern::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) != 2:
@ -433,45 +448,49 @@ translates to::
FAIL
Non-conforming ``__match_kind__``
Non-conforming special attributes
'''''''''''''''''''''''''''''''''
All classes should ensure that the value of ``__match_kind__`` follows the specification.
All classes should ensure that the values of ``__match_container__``, ``__match_class__``
and ``__match_args__`` follow the specification.
Therefore, implementations can assume, without checking, that the following are true::
(__match_kind__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING)
(__match_kind__ & (MATCH_SELF | MATCH_DEFAULT)) != (MATCH_SELF | MATCH_DEFAULT)
(__match_container__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING)
(__match_class__ & (MATCH_SELF | MATCH_ATTRIBUTES)) != (MATCH_SELF | MATCH_ATTRIBUTES)
Thus, implementations can assume that ``__match_kind__ & MATCH_SEQUENCE`` implies ``(__match_kind__ & MATCH_MAPPING) == 0``, and vice-versa.
Likewise for ``MATCH_SELF`` and ``MATCH_DEFAULT``.
Thus, implementations can assume that ``__match_container__ & MATCH_SEQUENCE`` implies ``(__match_container__ & MATCH_MAPPING) == 0``, and vice-versa.
Likewise for ``__match_class__``, ``MATCH_SELF`` and ``MATCH_ATTRIBUTES``.
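The conformance conditions above could be checked with something like the following sketch. Implementations are not required to perform such a check; the ``conforms`` helper, the constant values and the example classes are all assumptions made for illustration.

```python
# Placeholder values: assumptions for illustration only.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2
MATCH_SELF = 8
MATCH_ATTRIBUTES = 16


def conforms(cls):
    """Return True if the special attributes of ``cls`` follow the rules."""
    container = getattr(cls, "__match_container__", 0)
    class_kind = getattr(cls, "__match_class__", MATCH_ATTRIBUTES)
    args = getattr(cls, "__match_args__", ())
    if not isinstance(container, int) or not isinstance(class_kind, int):
        return False
    # A class must not claim to be both a sequence and a mapping.
    both_containers = MATCH_SEQUENCE | MATCH_MAPPING
    if (container & both_containers) == both_containers:
        return False
    # Likewise, MATCH_SELF and MATCH_ATTRIBUTES are mutually exclusive.
    both_classes = MATCH_SELF | MATCH_ATTRIBUTES
    if (class_kind & both_classes) == both_classes:
        return False
    # __match_args__ must be a tuple of strings (possibly empty).
    return isinstance(args, tuple) and all(isinstance(n, str) for n in args)


class Good:
    __match_container__ = MATCH_SEQUENCE
    __match_class__ = MATCH_SELF


class BothContainers:
    __match_container__ = MATCH_SEQUENCE | MATCH_MAPPING


class ListArgs:
    __match_args__ = ["x", "y"]  # a list, not a tuple
```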
If ``__match_kind__`` does not follow the specification,
then implementations may treat any of the expressions of the form ``$kind & MATCH_...`` above as having any value.
Values of the special attributes for classes in the standard library
--------------------------------------------------------------------
Implementation of ``__match_kind__`` in the standard library
------------------------------------------------------------
For the core builtin container classes ``__match_container__`` will be:
``object.__match_kind__`` will be ``MATCH_DEFAULT``.
* ``list``: ``MATCH_SEQUENCE``
* ``tuple``: ``MATCH_SEQUENCE``
* ``dict``: ``MATCH_MAPPING``
* ``bytearray``: 0
* ``bytes``: 0
* ``str``: 0
For common builtin classes ``__match_kind__`` will be:
Named tuples will have ``__match_container__`` set to ``MATCH_SEQUENCE``.
* ``bool``: ``MATCH_SELF``
* ``bytearray``: ``MATCH_SELF``
* ``bytes``: ``MATCH_SELF``
* ``float``: ``MATCH_SELF``
* ``frozenset``: ``MATCH_SELF``
* ``int``: ``MATCH_SELF``
* ``set``: ``MATCH_SELF``
* ``str``: ``MATCH_SELF``
* ``list``: ``MATCH_SEQUENCE | MATCH_SELF``
* ``tuple``: ``MATCH_SEQUENCE | MATCH_SELF``
* ``dict``: ``MATCH_MAPPING | MATCH_SELF``
* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_container__`` set to ``MATCH_MAPPING``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_container__`` set to ``MATCH_SEQUENCE``.
Named tuples will have ``__match_kind__`` set to ``MATCH_SEQUENCE | MATCH_DEFAULT``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_kind__`` set to ``MATCH_MAPPING``.
* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_kind__`` set to ``MATCH_SEQUENCE``.
For the following builtin classes ``__match_class__`` will be set to ``MATCH_SELF``:
* ``bool``
* ``bytearray``
* ``bytes``
* ``float``
* ``frozenset``
* ``int``
* ``set``
* ``str``
* ``list``
* ``tuple``
* ``dict``
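The container values listed above can be approximated for existing classes with a small helper. This is only a sketch: ``expected_match_container`` is a hypothetical function, the constant values are placeholders, and the special-casing of ``str``, ``bytes`` and ``bytearray`` simply mirrors the list above.

```python
from collections import namedtuple
from collections.abc import Mapping, Sequence

# Placeholder values: assumptions for illustration only.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2


def expected_match_container(cls):
    """Approximate the ``__match_container__`` value described for ``cls``."""
    # These are sequences according to the ABCs, but are listed with 0.
    if issubclass(cls, (str, bytes, bytearray)):
        return 0
    if issubclass(cls, Mapping):
        return MATCH_MAPPING
    if issubclass(cls, Sequence):
        return MATCH_SEQUENCE
    return 0


# A hypothetical named tuple, used only to exercise the helper.
Point = namedtuple("Point", ["x", "y"])
```

For example, ``expected_match_container(Point)`` gives ``MATCH_SEQUENCE`` because named tuples are tuple subclasses, while ``expected_match_container(set)`` gives ``0``.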
Legal optimizations
-------------------
@ -497,9 +516,9 @@ Implementations are allowed to make the following assumptions:
* ``isinstance(obj, cls)`` can be freely replaced with ``issubclass(type(obj), cls)`` and vice-versa.
* ``isinstance(obj, cls)`` will always return the same result for any ``(obj, cls)`` pair and repeated calls can thus be elided.
* Reading ``__match_args__`` and calling ``__deconstruct__`` are pure operations, and may be cached.
* Sequences, that is any class for which ``MATCH_SEQUENCE`` is true, are not modified by iteration, subscripting or calls to ``len()``,
and thus those operations can be freely substituted for each other where they would be equivalent when applied to an immutable sequence.
* Reading any of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is a pure operation, and may be cached.
* Sequences, that is any class for which ``__match_container__&MATCH_SEQUENCE`` is not zero, are not modified by iteration, subscripting or calls to ``len()``.
Consequently, those operations can be freely substituted for each other where they would be equivalent when applied to an immutable sequence.
In fact, implementations are encouraged to make these assumptions, as doing so is likely to result in significantly better performance.
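One way an implementation written in Python might exploit the purity assumption is to cache the attribute lookup per type, as in the sketch below. The ``container_kind`` and ``is_sequence_subject`` helpers, the ``Seq`` class and the constant values are hypothetical.

```python
from functools import lru_cache

# Placeholder values: assumptions for illustration only.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2


@lru_cache(maxsize=None)
def container_kind(cls):
    """Read ``__match_container__`` once per class and reuse the result."""
    return getattr(cls, "__match_container__", 0)


def is_sequence_subject(value):
    """Return True if ``value`` may match a sequence pattern."""
    return (container_kind(type(value)) & MATCH_SEQUENCE) != 0


class Seq:
    """Hypothetical class that opts in to sequence patterns."""

    __match_container__ = MATCH_SEQUENCE


first = is_sequence_subject(Seq())
# Changing the attribute afterwards is not observed by the cached lookup,
# which is exactly the behaviour the assumption above permits.
Seq.__match_container__ = 0
second = is_sequence_subject(Seq())
```

Note that ``lru_cache`` keeps a strong reference to every class it has seen; a real implementation would more likely store the value alongside the type itself.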
@ -631,9 +650,11 @@ Summary of differences between this PEP and PEP 634
The changes to the semantics can be summarized as:
* Selecting the kind of pattern uses ``cls.__match_kind__`` instead of
``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``
and allows classes a bit more control over which kinds of pattern they match.
* Requires ``__match_args__`` to be a *tuple* of strings, not just a sequence.
This makes pattern matching a bit more robust and optimizable as ``__match_args__`` can be assumed to be immutable.
* Selecting the kind of container patterns that can be matched uses ``cls.__match_container__`` instead of
``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``.
* Allows classes to opt out of deconstruction altogether, if necessary, by setting ``__match_class__ = 0``.
* The behavior when matching patterns is more precisely defined, but is otherwise unchanged.
There are no changes to syntax. All examples given in the PEP 636 tutorial should continue to work as they do now.
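The opt-out mentioned in the list above can be illustrated with a small emulation of a keyword class pattern such as ``case ClsName(x=x):``. This is a sketch under stated assumptions: ``match_class_keywords``, the ``Open`` and ``Opaque`` classes and the constant values are hypothetical, and the helper returns a dict of captures or ``None`` for failure.

```python
# Placeholder values: assumptions for illustration only.
MATCH_SELF = 8
MATCH_ATTRIBUTES = 16


def match_class_keywords(value, cls, names):
    """Emulate ``case cls(name=name, ...)``; None means the match failed."""
    if not isinstance(value, cls):
        return None
    if not names:
        # ``case ClsName():`` can succeed whatever __match_class__ is.
        return {}
    kind = getattr(type(value), "__match_class__", MATCH_ATTRIBUTES)
    if kind & MATCH_ATTRIBUTES:
        try:
            return {name: getattr(value, name) for name in names}
        except AttributeError:
            return None
    return None


class Open:
    """Uses the default behaviour: attributes can be deconstructed."""

    def __init__(self):
        self.x = 1


class Opaque:
    """Opts out of deconstruction altogether."""

    __match_class__ = 0

    def __init__(self):
        self.x = 1
```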
@ -644,7 +665,7 @@ Rejected Ideas
Using attributes from the instance's dictionary
-----------------------------------------------
An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``__match_kind__ == MATCH_DEFAULT``.
An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``MATCH_ATTRIBUTES``.
The intent was to avoid capturing bound methods and other synthetic attributes. However, this also meant that properties were ignored.
For the class::
@ -659,7 +680,7 @@ For the class::
...
Ideally we would match the attributes "a" and "p", but not "m".
However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_DEFAULT``.
However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_ATTRIBUTES``.
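The difficulty can be seen with ordinary Python. The class below is a hypothetical stand-in with an instance attribute ``a``, a property ``p`` and a method ``m``, written for this illustration rather than copied from the PEP.

```python
class C:
    """Hypothetical class with an attribute, a property and a method."""

    def __init__(self):
        self.a = "a"

    @property
    def p(self):
        return "p"

    def m(self):
        return "m"


c = C()

# Looking only in the instance dictionary finds "a" but misses the property.
instance_dict_names = set(vars(c))

# Ordinary attribute lookup finds the property, but also the bound method.
visible_names = {name for name in ("a", "p", "m") if hasattr(c, name)}
```

Neither lookup strategy yields exactly ``{"a", "p"}``: the instance dictionary omits ``p``, and ``getattr`` also exposes ``m``.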
Lookup of ``__match_args__`` on the subject not the pattern
-----------------------------------------------------------
@ -672,6 +693,13 @@ This has been rejected for a few reasons::
* Using the class specified in the pattern has the potential to provide better error reporting in some cases.
* Neither approach is perfect; both have odd corner cases. Keeping the status quo minimizes disruption.
Combining ``__match_class__`` and ``__match_container__`` into a single value
-----------------------------------------------------------------------------
An earlier version of this PEP combined ``__match_class__`` and ``__match_container__`` into a single value, ``__match_kind__``.
Using a single value has a small advantage in terms of performance,
but is likely to result in unintended changes to container matching when overriding class matching behavior, and vice versa.
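The kind of unintended change described above can be sketched as follows. The constant values and all four classes are hypothetical; the ``Combined*`` classes model the earlier single-attribute design and the ``Split*`` classes model the two-attribute design.

```python
# Placeholder values: assumptions for illustration only.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2
MATCH_SELF = 8
MATCH_ATTRIBUTES = 16


class CombinedBase:
    __match_kind__ = MATCH_SEQUENCE | MATCH_ATTRIBUTES


class CombinedChild(CombinedBase):
    # Intended to change only class matching, but this assignment also
    # silently drops MATCH_SEQUENCE.
    __match_kind__ = MATCH_SELF


class SplitBase:
    __match_container__ = MATCH_SEQUENCE
    __match_class__ = MATCH_ATTRIBUTES


class SplitChild(SplitBase):
    # Only class matching changes; container matching is inherited.
    __match_class__ = MATCH_SELF


combined_still_sequence = bool(CombinedChild.__match_kind__ & MATCH_SEQUENCE)
split_still_sequence = bool(SplitChild.__match_container__ & MATCH_SEQUENCE)
```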
Deferred Ideas
==============
@ -706,7 +734,7 @@ Code examples
::
class Symbol:
__match_kind__ = MATCH_SELF
__match_class__ = MATCH_SELF
.. [2]
@ -716,6 +744,7 @@ This::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) != 2:
@ -732,6 +761,7 @@ This::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_SEQUENCE == 0:
FAIL
if len($value) < 2:
@ -746,6 +776,7 @@ This::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if $value.keys() != {"x", "y"}:
@ -763,6 +794,7 @@ This::
translates to::
$kind = type($value).__match_container__
if $kind & MATCH_MAPPING == 0:
FAIL
if not $value.keys() >= {"x", "y"}:
@ -782,7 +814,8 @@ translates to::
if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < 2:
FAIL
@ -804,7 +837,8 @@ translates to::
if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
try:
x = $value.a
y = $value.b
@ -824,7 +858,8 @@ translates to::
if not isinstance($value, ClsName):
FAIL
if $kind & MATCH_DEFAULT:
$kind = type($value).__match_class__
if $kind & MATCH_ATTRIBUTES:
$attrs = ClsName.__match_args__
if len($attrs) < 1:
raise TypeError(...)
@ -844,7 +879,7 @@ translates to::
::
class Basic:
__match_kind__ = MATCH_POSITIONAL
__match_class__ = MATCH_POSITIONAL
def __deconstruct__(self):
return self._args