PEP 634: Finish up, ready for SC review (#1633)

Clean up PEP 634 (the new PEP 622).

Renamed Constant Value Pattern to Value Pattern. Renamed a few other grammar rules as well.

Leave open for the SC to decide: whether to change `name := pattern` to `pattern as name`.

Note that the companion PEP 635 (motivation and rationale) and PEP 636 (tutorial) are not yet ready.
Guido van Rossum 2020-10-08 15:09:40 -07:00 committed by GitHub
parent 4058b1cafb
commit ada9ee667f
2 changed files with 81 additions and 78 deletions


@ -19,28 +19,17 @@ Resolution:
Abstract Abstract
======== ========
**NOTE:** This draft is incomplete and not intended for review yet. This PEP provides the technical specification for the match
We're checking it into the peps repo for the convenience of the authors.
This PEP provides the technical specification for the ``match``
statement. It replaces PEP 622, which is hereby split in three parts: statement. It replaces PEP 622, which is hereby split in three parts:
- PEP 634: Specification - PEP 634: Specification
- PEP 635: Motivation and Rationale - PEP 635: Motivation and Rationale
- PEP 636: Tutorial - PEP 636: Tutorial
This PEP is intentionally devoid of commentary; all explanations of This PEP is intentionally devoid of commentary; the motivation and all
design choices are in PEP 635. First-time readers are encouraged to explanations of our design choices are in PEP 635. First-time readers
start with PEP 636, which provides a gentler introduction to the are encouraged to start with PEP 636, which provides a gentler
concepts, syntax and semantics of patterns. introduction to the concepts, syntax and semantics of patterns.
TODO: Maybe we should add simple examples back to each section?
There's no rule saying a spec can't include examples, and currently
it's *very* dry.
TODO: Go over the feedback from the SC and make sure everything's
somehow incorporated (either here or in PEP 635, which has to answer
why we didn't budge on most of the SC's initial requests).
Syntax and Semantics Syntax and Semantics
@ -48,7 +37,7 @@ Syntax and Semantics
See `Appendix A`_ for the complete grammar. See `Appendix A`_ for the complete grammar.
Overview and terminology Overview and Terminology
------------------------ ------------------------
The pattern matching process takes as input a pattern (following The pattern matching process takes as input a pattern (following
@ -78,13 +67,13 @@ failure, the bindings are merged. Several more rules, explained
below, apply to these cases. below, apply to these cases.
The ``match`` statement The Match Statement
----------------------- -------------------
Syntax:: Syntax::
match_stmt: "match" match_expr ':' NEWLINE INDENT case_block+ DEDENT match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
match_expr: subject_expr:
| star_named_expression ',' star_named_expressions? | star_named_expression ',' star_named_expressions?
| named_expression | named_expression
case_block: "case" patterns [guard] ':' block case_block: "case" patterns [guard] ':' block
@ -108,30 +97,33 @@ For context, ``match_stmt`` is a new alternative for
The ``match`` and ``case`` keywords are soft keywords, i.e. they are The ``match`` and ``case`` keywords are soft keywords, i.e. they are
not reserved words in other grammatical contexts (including at the not reserved words in other grammatical contexts (including at the
start of a line if there is no colon where expected). This implies start of a line if there is no colon where expected). This implies
that they are recognized as keywords when part of a ``match`` that they are recognized as keywords when part of a match
statement or ``case`` block only, and are allowed to be used in all statement or case block only, and are allowed to be used in all
other context as variable or argument names. other contexts as variable or argument names.
Match semantics Match Semantics
^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^
TODO: Make the language about choosing a block more precise. The match statement first evaluates the subject expression. If a
comma is present a tuple is constructed using the standard rules.
The overall semantics for choosing the match is to choose the first The resulting subject value is then used to select the first case
matching pattern (including guard) and execute the corresponding block whose patterns succeeds matching it *and* whose guard condition
block. The remaining patterns are not tried. If there are no (if present) is "truthy". If no case blocks qualify the match
matching patterns, execution continues at the following statement. statement is complete; otherwise, the block of the selected case block
is executed. The usual rules for executing a block nested inside a
compound statement apply (e.g. an ``if`` statement).
Name bindings made during a successful pattern match outlive the Name bindings made during a successful pattern match outlive the
executed block and can be used after the ``match`` statement. executed block and can be used after the match statement.
During failed pattern matches, some subpatterns may succeed. For During failed pattern matches, some subpatterns may succeed. For
example, while matching the pattern ``(0, x, 1)`` with the value ``[0, example, while matching the pattern ``(0, x, 1)`` with the value ``[0,
1, 2]``, the subpattern ``x`` may succeed if the list elements are 1, 2]``, the subpattern ``x`` may succeed if the list elements are
matched from left to right. The implementation may choose to either matched from left to right. The implementation may choose to either
make persistent bindings for those partial matches or not. User code make persistent bindings for those partial matches or not. User code
including a ``match`` statement should not rely on the bindings being including a match statement should not rely on the bindings being
made for a failed match, but also shouldn't assume that variables are made for a failed match, but also shouldn't assume that variables are
unchanged by a failed match. This part of the behavior is left unchanged by a failed match. This part of the behavior is left
intentionally unspecified so different implementations can add intentionally unspecified so different implementations can add
@ -147,16 +139,19 @@ specified below.
Guards Guards
^^^^^^ ^^^^^^
Syntax:: If a guard is present on a case block, once the pattern or patterns in
the case block succeed, the expression in the guard is evaluated. If
this raises an exception, the exception bubbles up. Otherwise, if the
condition is "truthy" the case block is selected; if it is "falsy" the
case block is not selected.
case_block: "case" patterns [guard] ':' block Since guards are expressions they are allowed to have side effects.
guard: 'if' named_expression Guard evaluation must proceed from the first to the last case block,
one at a time, skipping case blocks whose pattern(s) don't all
succeed. (I.e., even if determining whether those patterns succeed
may happen out of order, guard evaluation must happen in order.)
Guard evaluation must stop once a case block is selected.
If a guard is present on a case block, once all patterns succeed,
the expression in the guard is evaluated.
If this raises an exception, the exception bubbles up.
Otherwise, if the condition is "truthy" the block is selected;
if it is "falsy" the next case block (if any) is tried.
.. _patterns: .. _patterns:
@ -174,18 +169,16 @@ The top-level syntax for patterns is as follows::
| literal_pattern | literal_pattern
| capture_pattern | capture_pattern
| wildcard_pattern | wildcard_pattern
| constant_pattern | value_pattern
| group_pattern | group_pattern
| sequence_pattern | sequence_pattern
| mapping_pattern | mapping_pattern
| class_pattern | class_pattern
Walrus patterns Walrus Patterns
^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^
TODO: Change to ``or_pattern 'as' capture_pattern`` (and rename)?
Syntax:: Syntax::
walrus_pattern: capture_pattern ':=' or_pattern walrus_pattern: capture_pattern ':=' or_pattern
@ -197,8 +190,23 @@ operator against the subject. If this fails, the walrus pattern fails.
Otherwise, the walrus pattern binds the subject to the name on the left Otherwise, the walrus pattern binds the subject to the name on the left
of the ``:=`` operator and succeeds. of the ``:=`` operator and succeeds.
Open Issue
~~~~~~~~~~
OR patterns An alternate syntax for this construct has been put forward, whose
syntax would be::
walrus_pattern: or_pattern 'as' capture_pattern
The semantics would be the same: it matches the OR pattern against the
subject and on success binds the subject to the name in the capture
pattern.
We leave it to the Steering Council to decide which form to prefer (we
would rename "walrus pattern" to "AS pattern").
OR Patterns
^^^^^^^^^^^ ^^^^^^^^^^^
Syntax:: Syntax::
@ -284,26 +292,22 @@ Syntax::
A wildcard pattern always succeeds. It binds no name. A wildcard pattern always succeeds. It binds no name.
.. _constant_value_pattern:
Constant Value Patterns Value Patterns
^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
TODO: Rename to Value Patterns? (But ``value[s]_pattern`` is already
a grammatical rule.)
Syntax:: Syntax::
constant_pattern: attr value_pattern: attr
attr: name_or_attr '.' NAME attr: name_or_attr '.' NAME
name_or_attr: attr | NAME name_or_attr: attr | NAME
The dotted name in the pattern is looked up using the standard Python The dotted name in the pattern is looked up using the standard Python
name resolution rules. However, when the same constant pattern occurs name resolution rules. However, when the same value pattern occurs
multiple times in the same ``match`` statement, the interpreter may cache multiple times in the same match statement, the interpreter may cache
the first value found and reuse it, rather than repeat the same the first value found and reuse it, rather than repeat the same
lookup. (To clarify, this cache is strictly tied to a given execution lookup. (To clarify, this cache is strictly tied to a given execution
of a given ``match`` statement.) of a given match statement.)
The pattern succeeds if the value found thus compares equal to the The pattern succeeds if the value found thus compares equal to the
subject value (using the ``==`` operator). subject value (using the ``==`` operator).
@ -332,11 +336,11 @@ Sequence Patterns
Syntax:: Syntax::
sequence_pattern: sequence_pattern:
| '[' [values_pattern] ']' | '[' [maybe_sequence_pattern] ']'
| '(' [open_sequence_pattern] ')' | '(' [open_sequence_pattern] ')'
open_sequence_pattern: value_pattern ',' [values_pattern] open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
values_pattern: ','.value_pattern+ ','? maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
value_pattern: star_pattern | pattern maybe_star_pattern: star_pattern | pattern
star_pattern: '*' (capture_pattern | wildcard_pattern) star_pattern: '*' (capture_pattern | wildcard_pattern)
(Note that a single parenthesized pattern without a trailing comma is (Note that a single parenthesized pattern without a trailing comma is
@ -365,7 +369,7 @@ sequence is less than the number of non-star subpatterns.
The length of the subject sequence is obtained using the builtin The length of the subject sequence is obtained using the builtin
``len()`` function (i.e., via the ``__len__`` protocol). However, the ``len()`` function (i.e., via the ``__len__`` protocol). However, the
interpreter may cache this value in a similar manner as described for interpreter may cache this value in a similar manner as described for
constant value patterns. value patterns.
A fixed-length sequence pattern matches the subpatterns to A fixed-length sequence pattern matches the subpatterns to
corresponding items of the subject sequence, from left to right. corresponding items of the subject sequence, from left to right.
@ -393,7 +397,7 @@ Syntax::
mapping_pattern: '{' [items_pattern] '}' mapping_pattern: '{' [items_pattern] '}'
items_pattern: ','.key_value_pattern+ ','? items_pattern: ','.key_value_pattern+ ','?
key_value_pattern: key_value_pattern:
| (literal_pattern | constant_pattern) ':' or_pattern | (literal_pattern | value_pattern) ':' or_pattern
| double_star_pattern | double_star_pattern
double_star_pattern: '**' capture_pattern double_star_pattern: '**' capture_pattern
@ -423,7 +427,7 @@ subject's ``get()`` method. As a consequence, matched key-value pairs
must already be present in the mapping, and not created on-the-fly by must already be present in the mapping, and not created on-the-fly by
``__missing__`` or ``__getitem__``. For example, ``__missing__`` or ``__getitem__``. For example,
``collections.defaultdict`` instances will only be matched by patterns ``collections.defaultdict`` instances will only be matched by patterns
with keys that were already present when the ``match`` block was with keys that were already present when the match statement was
entered. entered.
@ -513,13 +517,13 @@ This behavior is roughly equivalent to the following::
return self return self
Side effects Side Effects
============ ============
The only side-effect produced explicitly by the matching process is The only side-effect produced explicitly by the matching process is
the binding of names. However, the process relies on attribute the binding of names. However, the process relies on attribute
access, instance checks, ``len()``, equality and item access on the access, instance checks, ``len()``, equality and item access on the
subject and some of its components. It also evaluates constant value subject and some of its components. It also evaluates value
patterns and the class name of class patterns. While none of those patterns and the class name of class patterns. While none of those
typically create any side-effects, in theory they could. This typically create any side-effects, in theory they could. This
proposal intentionally leaves out any specification of what methods proposal intentionally leaves out any specification of what methods
@ -527,7 +531,7 @@ are called or how many times. This behavior is therefore undefined
and user code should not rely on it. and user code should not rely on it.
The standard library The Standard Library
==================== ====================
To facilitate the use of pattern matching, several changes will be To facilitate the use of pattern matching, several changes will be
@ -552,10 +556,6 @@ it looks beneficial.
Appendix A -- Full Grammar Appendix A -- Full Grammar
========================== ==========================
TODO: Go over the differences with the reference implementation and
resolve them (either by fixing the PEP or by fixing the reference
implementation).
Here is the full grammar for ``match_stmt``. This is an additional Here is the full grammar for ``match_stmt``. This is an additional
alternative for ``compound_stmt``. Remember that ``match`` and alternative for ``compound_stmt``. Remember that ``match`` and
``case`` are soft keywords, i.e. they are not reserved words in other ``case`` are soft keywords, i.e. they are not reserved words in other
@ -570,8 +570,8 @@ Other notation used beyond standard EBNF:
:: ::
match_stmt: "match" match_expr ':' NEWLINE INDENT case_block+ DEDENT match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
match_expr: subject_expr:
| star_named_expression ',' [star_named_expressions] | star_named_expression ',' [star_named_expressions]
| named_expression | named_expression
case_block: "case" patterns [guard] ':' block case_block: "case" patterns [guard] ':' block
@ -585,7 +585,7 @@ Other notation used beyond standard EBNF:
| literal_pattern | literal_pattern
| capture_pattern | capture_pattern
| wildcard_pattern | wildcard_pattern
| constant_pattern | value_pattern
| group_pattern | group_pattern
| sequence_pattern | sequence_pattern
| mapping_pattern | mapping_pattern
@ -605,24 +605,24 @@ Other notation used beyond standard EBNF:
wildcard_pattern: "_" wildcard_pattern: "_"
constant_pattern: attr !('.' | '(' | '=') value_pattern: attr !('.' | '(' | '=')
attr: name_or_attr '.' NAME attr: name_or_attr '.' NAME
name_or_attr: attr | NAME name_or_attr: attr | NAME
group_pattern: '(' pattern ')' group_pattern: '(' pattern ')'
sequence_pattern: sequence_pattern:
| '[' [values_pattern] ']' | '[' [maybe_sequence_pattern] ']'
| '(' [open_sequence_pattern] ')' | '(' [open_sequence_pattern] ')'
open_sequence_pattern: value_pattern ',' [values_pattern] open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
values_pattern: ','.value_pattern+ ','? maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
value_pattern: star_pattern | pattern maybe_star_pattern: star_pattern | pattern
star_pattern: '*' (capture_pattern | wildcard_pattern) star_pattern: '*' (capture_pattern | wildcard_pattern)
mapping_pattern: '{' [items_pattern] '}' mapping_pattern: '{' [items_pattern] '}'
items_pattern: ','.key_value_pattern+ ','? items_pattern: ','.key_value_pattern+ ','?
key_value_pattern: key_value_pattern:
| (literal_pattern | constant_pattern) ':' or_pattern | (literal_pattern | value_pattern) ':' or_pattern
| double_star_pattern | double_star_pattern
double_star_pattern: '**' capture_pattern double_star_pattern: '**' capture_pattern


@ -27,6 +27,9 @@ This PEP provides the motivation and rationale for PEP 634
are encouraged to start with PEP 636, which provides a gentler are encouraged to start with PEP 636, which provides a gentler
introduction to the concepts, syntax and semantics of patterns. introduction to the concepts, syntax and semantics of patterns.
TODO: Go over the feedback from the SC and make sure everything's
somehow addressed.
Motivation Motivation