PEP 622: Include overview to different patterns and summarised syntax (#1501)
Also some other improvements, and added Daniel as author (that was Guido's work).

Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
parent e43a11d86e
commit 26ac4b3d3e

163 pep-0622.rst

@ -3,6 +3,7 @@ Title: Structural Pattern Matching
Version: $Revision$
Last-Modified: $Date$
Author: Brandt Bucher <brandtbucher@gmail.com>,
        Daniel F Moisset <dfmoisset@gmail.com>,
        Tobias Kohn <kohnt@tobiaskohn.ch>,
        Ivan Levkivskyi <levkivskyi@gmail.com>,
        Guido van Rossum <guido@python.org>,

@ -14,7 +15,7 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 23-Jun-2020
Python-Version: 3.10
Post-History: 23-Jun-2020
Post-History: 23-Jun-2020, 8-Jul-2020
Resolution:

@ -280,8 +281,67 @@ section for more details.
Syntax and Semantics
====================

Case clauses
------------
Patterns
--------

The **pattern** is a new syntactical construct that could be considered a loose
generalization of assignment targets. The key properties of a pattern are what
types and shapes of subjects it accepts, what variables it captures and how
it extracts them from the subject. For example, the pattern ``[a, b]`` matches
only sequences of exactly 2 elements, extracting the first element into ``a``
and the second one into ``b``.

This PEP defines several types of patterns. These are certainly not the
only possible ones, so the design decision was made to choose a subset of
functionality that is useful now but conservative. More patterns can be added
later as this feature gets more widespread use. See the `rejected ideas`_
and `deferred ideas`_ sections for more details.

The patterns listed here are described in more detail below, but summarized
together in this section for simplicity:

- A **literal pattern** is useful to filter constant values in a structure.
  It looks like a Python literal (including some values like ``True``,
  ``False`` and ``None``). It only matches objects equal to the literal, and
  never binds.
- A **capture pattern** looks like ``x`` and is equivalent to an identical
  assignment target: it always matches and binds the variable
  with the given name.
- The **wildcard pattern** is a single underscore: ``_``. It always matches,
  but does not capture any variable (which prevents interference with other
  uses for ``_`` and allows for some optimizations).
- A **constant value pattern** works like the literal but for certain named
  constants. Note that it must be a qualified (dotted) name, given the possible
  ambiguity with a capture pattern. It looks like ``Color.RED`` and
  only matches values equal to the corresponding value. It never binds.
- A **sequence pattern** looks like ``[a, *rest, b]`` and is similar to
  a list unpacking. An important difference is that the elements nested
  within it can be any kind of patterns, not just names or sequences.
  It matches only sequences of appropriate length, as long as all the sub-patterns
  also match. It makes all the bindings of its sub-patterns.
- A **mapping pattern** looks like ``{"user": u, "emails": [*es]}``. It matches
  mappings with at least the set of provided keys, and if all the
  sub-patterns match their corresponding values. It binds whatever the
  sub-patterns bind while matching with the values corresponding to the keys.
  Adding ``**rest`` at the end of the pattern to capture extra items is allowed.
- A **class pattern** is similar to the above but matches attributes instead
  of keys. It looks like ``datetime.date(year=y, day=d)``. It matches
  instances of the given type, having at least the specified
  attributes, as long as the attributes match with the corresponding
  sub-patterns. It binds whatever the sub-patterns bind when matching with the
  values of
  the given attributes. An optional protocol also allows matching positional
  arguments.
- An **OR pattern** looks like ``[*x] | {"elems": [*x]}``. It matches if any
  of its sub-patterns match. It uses the binding for the leftmost pattern
  that matched.
- A **walrus pattern** looks like ``d := datetime(year=2020, month=m)``. It
  matches only
  if its sub-pattern also matches. It binds whatever the sub-pattern match does, and
  also binds the named variable to the entire object.

The ``match`` statement
-----------------------

A simplified, approximate grammar for the proposed syntax is::

@ -299,14 +359,17 @@ A simplified, approximate grammar for the proposed syntax is::
    closed_pattern:
        | literal_pattern
        | capture_pattern
        | wildcard_pattern
        | constant_pattern
        | sequence_pattern
        | mapping_pattern
        | class_pattern

(See `Appendix A`_ for the full, unabridged grammar.)
See `Appendix A`_ for the full, unabridged grammar. The simplified grammars in
this section are there for helping the reader, not as a full specification.

We propose the match syntax to be a statement, not an expression. Although in
We propose that the match operation should be a statement, not an expression.
Although in
many languages it is an expression, being a statement better suits the general
logic of Python syntax. See `rejected ideas`_ for more discussion. The list of
allowed patterns is specified below in the `patterns`_ subsection.

@ -384,6 +447,16 @@ building blocks. The following patterns are supported:
Literal Patterns
~~~~~~~~~~~~~~~~

Simplified syntax::

    literal_pattern:
        | number
        | string
        | 'None'
        | 'True'
        | 'False'


A literal pattern consists of a simple literal like a string, a number,
a Boolean literal (``True`` or ``False``), or ``None``::

@ -427,6 +500,10 @@ really literals).
Capture Patterns
~~~~~~~~~~~~~~~~

Simplified syntax::

    capture_pattern: NAME

A capture pattern serves as an assignment target for the matched expression::

    match greeting:

@ -449,30 +526,51 @@ the ``""`` case clause was taken::
        ... # but works fine if greeting was not empty

While matching against each case clause, a name may be bound at most
once, having two capture patterns with coinciding names is an error. An
exception is made for the special single underscore (``_``) name; in
patterns, it's a wildcard that *never* binds::
once, having two capture patterns with coinciding names is an error::

    match data:
        case [x, x]: # Error!
            ...
        case [_, _]:
            print("Some pair")
            print(_) # Error!

Note: one can still match on a collection with equal items using `guards`_.
Also, ``[x, y] | Point(x, y)`` is a legal pattern because the two
alternatives are never matched at the same time.

The single underscore (``_``) is not considered a ``NAME`` and is treated
specially as a `wildcard pattern`_.

Reminder: ``None``, ``False`` and ``True`` are keywords denoting
literals, not names.

.. _wildcard_pattern:

Wildcard Pattern
~~~~~~~~~~~~~~~~

Simplified syntax::

    wildcard_pattern: "_"

The single underscore (``_``) name is a special kind of pattern that always
matches but *never* binds::

    match data:
        case [_, _]:
            print("Some pair")
            print(_) # Error!

Given that no binding is made, it can be used as many times as desired, unlike
capture patterns.

.. _constant_value_pattern:

Constant Value Patterns
~~~~~~~~~~~~~~~~~~~~~~~

Simplified syntax::

    constant_pattern: NAME ('.' NAME)+

This is used to match against constants and enum values.
Every dotted name in a pattern is looked up using normal Python name
resolution rules, and the value is used for comparison by equality with

@ -502,6 +600,14 @@ considered for constant value patterns.
Sequence Patterns
~~~~~~~~~~~~~~~~~

Simplified syntax::

    sequence_pattern:
        | '[' [values_pattern] ']'
        | '(' [value_pattern ',' [values_pattern]] ')'
    values_pattern: ','.value_pattern+ ','?
    value_pattern: '*' capture_pattern | pattern

A sequence pattern follows the same semantics as unpacking assignment.
Like unpacking assignment, both tuple-like and list-like syntax can be
used, with identical semantics. Each element can be an arbitrary

@ -533,6 +639,15 @@ example:
Mapping Patterns
~~~~~~~~~~~~~~~~

Simplified syntax::

    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | (literal_pattern | constant_pattern) ':' or_pattern
        | '**' capture_pattern


Mapping pattern is a generalization of iterable unpacking to mappings.
Its syntax is similar to dictionary display but each key and value are
patterns ``"{" (pattern ":" pattern)+ "}"``. A ``**name`` pattern is also

@ -568,6 +683,16 @@ were already present when the ``match`` block was entered.
Class Patterns
~~~~~~~~~~~~~~

Simplified syntax::

    class_pattern:
        | name_or_attr '(' ')'
        | name_or_attr '(' ','.pattern+ ','? ')'
        | name_or_attr '(' ','.keyword_pattern+ ','? ')'
        | name_or_attr '(' ','.pattern+ ',' ','.keyword_pattern+ ','? ')'
    keyword_pattern: NAME '=' or_pattern


A class pattern provides support for destructuring arbitrary objects.
There are two possible ways of matching on object attributes: by position
like ``Point(1, 2)``, and by name like ``Point(x=1, y=2)``. These

@ -594,7 +719,7 @@ The leading name must not be ``_``, so e.g. ``_(...)`` and
matched object has an attribute ``foo``.

By default, sub-patterns may only be matched by keyword for
user-defined classes. In order to suport positional sub-patterns, a
user-defined classes. In order to support positional sub-patterns, a
custom ``__match_args__`` attribute is required.
The runtime allows matching against
arbitrarily nested patterns by chaining all of the instance checks and

@ -640,7 +765,7 @@ the same set of variables (excluding ``_``). For example::
Guards
------

Each *top-level* pattern can be followed by a guard of the form
Each *top-level* pattern can be followed by a **guard** of the form
``if expression``. A case clause succeeds if the pattern matches and the guard
evaluates to a true value. For example::

@ -700,7 +825,7 @@ match statements, but this will be less readable and/or will produce less
efficient code. Essentially, most of the arguments in PEP 572 apply here
equally.

``_`` is not a valid name here.
The wildcard ``_`` is not a valid name here.


.. _runtime:

@ -1940,7 +2065,6 @@ We are grateful for the help of the following individuals (among many
others) for helping out during various phases of the writing of this
PEP:

- Daniel F Moisset
- Taine Zhao
- Nate Lust

@ -1959,8 +2083,9 @@ Version History
     - Why we choose ``_`` for wildcard patterns
     - Why we choose ``|`` for OR patterns
     - Why we choose not to use special syntax for capture variables
     - Why this pattern matching operation and not others

   - Clarify exception semantics
   - Clarify exception and side effect semantics
   - Clarify partial binding semantics
   - Drop restriction on use of ``_`` in load contexts
   - Simplify behavior of ``__match_args__``

@ -1968,7 +2093,11 @@ Version History
   - Drop ``ImpossibleMatchError`` exception
   - Drop leading dot for loads (moved to `deferred ideas`_)
   - Reworked the initial sections (everything before `syntax`_)

   - Added an overview of all the types of patterns before the
     detailed description
   - Added simplified syntax next to the description of each pattern
   - Separate description of the wildcard from capture patterns
   - Added Daniel F Moisset as sixth co-author

References
==========