PEP: 642
Title: Constraint Pattern Syntax for Structural Pattern Matching
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
BDFL-Delegate:
Discussions-To: Python-Dev <python-dev@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 634
Created: 26-Sep-2020
Python-Version: 3.10
Post-History: 31-Oct-2020, 8-Nov-2020
Resolution:

Abstract
========

This PEP covers an alternative syntax proposal for PEP 634's structural pattern
matching that explicitly anchors match patterns in the existing syntax for
assignment targets, while retaining most semantic aspects of the existing
proposal.

Specifically, this PEP adopts an additional design restriction that PEP 634's
authors considered unreasonable: that any syntax that is common to both
assignment targets and match patterns must have a comparable semantic effect,
while any novel match pattern semantics must offer syntax which emits a syntax
error when used in an assignment target. It is still considered acceptable to
offer syntactic sugar that is specific to match patterns, as long as there is
an underlying more explicit form that is compatible with assignment targets.

As a consequence, this PEP proposes the following changes to the proposed match
pattern syntax:

* a new pattern type is introduced: "constraint patterns"
* constraint patterns are either equality constraints or identity constraints
* equality constraints use ``==`` as a prefix marker on an otherwise
  arbitrary primary expression: ``== EXPR``
* identity constraints use ``is`` as a prefix marker on an otherwise
  arbitrary primary expression: ``is EXPR``
* value patterns and literal patterns (with some exceptions) are redefined as
  "inferred equality constraints", and become a syntactic shorthand for an
  equality constraint
* ``None`` and ``...`` are defined as "inferred identity constraints" and become
  a syntactic shorthand for an identity constraint
* due to ambiguity of intent, neither ``True`` nor ``False`` are accepted as
  implying an inferred constraint (instead requiring the use of an explicit
  constraint, a class pattern, or a capture pattern with a guard expression)
* inferred constraints are *not* defined in the Abstract Syntax Tree. Instead,
  inferred constraints are converted to explicit constraints by the parser
* ``_`` remains the wildcard pattern, but gains a dedicated ``SkippedBinding``
  AST node to distinguish it from the use of ``_`` as an identifier
* Mapping patterns change to allow arbitrary primary expressions as keys


Relationship with other PEPs
============================

This PEP both depends on and competes with PEP 634 - the PEP author agrees that
match statements would be a sufficiently valuable addition to the language to
be worth the additional complexity that they add to the learning process, but
disagrees with the idea that "simple name vs literal or attribute lookup"
really offers an adequate syntactic distinction between name binding and value
lookup operations in match patterns. (Even though this PEP ultimately retained
that shorthand to reduce the verbosity of common use cases, it still redefines
it in terms of a more explicit underlying construct).

By dropping its own proposal to switch the wildcard pattern to ``?`` (and
instead retaining PEP 634's ``_``), this PEP now effectively votes against
the proposal in PEP 640 to allow the use of ``?`` as a general purpose wildcard
marker in name binding operations.


Motivation
==========

The original PEP 622 (which was later split into PEP 634, PEP 635, and PEP 636)
incorporated an unstated but essential assumption in its syntax design: that
neither ordinary expressions *nor* the existing assignment target syntax provide
an adequate foundation for the syntax used in match patterns.

While the PEP didn't explicitly state this assumption, one of the PEP authors
explained it clearly on python-dev [1_]:

    The actual problem that I see is that we have different cultures/intuitions
    fundamentally clashing here. In particular, so many programmers welcome
    pattern matching as an "extended switch statement" and find it therefore
    strange that names are binding and not expressions for comparison. Others
    argue that it is at odds with current assignment statements, say, and
    question why dotted names are _/not/_ binding. What all groups seem to
    have in common, though, is that they refer to _/their/_ understanding and
    interpretation of the new match statement as 'consistent' or 'intuitive'
    --- naturally pointing out where we as PEP authors went wrong with our
    design.

    But here is the catch: at least in the Python world, pattern matching as
    proposed by this PEP is an unprecedented and new way of approaching a common
    problem. It is not simply an extension of something already there. Even
    worse: while designing the PEP we found that no matter from which angle you
    approach it, you will run into issues of seeming 'inconsistencies' (which is
    to say that pattern matching cannot be reduced to a 'linear' extension of
    existing features in a meaningful way): there is always something that goes
    fundamentally beyond what is already there in Python. That's why I argue
    that arguments based on what is 'intuitive' or 'consistent' just do not
    make sense _/in this case/_.

PEP 635 (and PEP 622 before it) makes a strong case that treating capture
patterns as the default usage for simple names in match patterns is the right
approach, and provides a number of examples where having names express value
constraints by default would be confusing (this difference from C/C++ switch
statement semantics is also a key reason it makes sense to use ``match`` as the
introductory keyword for the new statement rather than ``switch``).

However, PEP 635 doesn't even *try* to make the case for the second assertion,
that treating match patterns as a variation on assignment targets also leads to
inherent contradictions. Even a PR submitted to explicitly list this option in
the "Rejected Ideas" section of the original PEP 622 was declined [2_].

This PEP instead starts from the assumption that it *is* possible to treat match
patterns as a variation on assignment targets, and the only essential
differences that emerge relative to the syntactic proposal in PEP 634 are:

* a requirement to offer an explicit marker prefix for value lookups rather than
  only allowing them to be inferred from the use of dotted names or literals; and
* a requirement to use a non-binding wildcard marker other than ``_``.

This PEP concedes the second point in the name of cross-language consistency
(and for lack of a compelling alternative wildcard marker), but proposes
constraint expressions as a way of addressing the first point.

PEP 634 also proposes special casing the literals ``None``, ``True``, and
``False`` so that they're compared by identity when written directly as a
literal pattern, but by equality when referenced by a value pattern. This PEP
eliminates the need for those special cases by proposing distinct syntax for
matching by identity and matching by equality (but does accept the convenience
and consistency argument in allowing ``None`` as a shorthand for ``is None``).


Specification
=============

This PEP retains the overall ``match``/``case`` statement syntax from PEP 634, and
retains both the syntax and semantics for the following match pattern variants:

* capture patterns
* class patterns
* group patterns
* sequence patterns

Pattern combination (both OR and AS patterns) and guard expressions also remain
the same as they are in PEP 634.

Constraint patterns are added, offering equality constraints and identity
constraints.

Literal patterns and value patterns are replaced by inferred constraint
patterns, offering inferred equality constraints for strings, numbers and
attribute lookups, and inferred identity constraints for ``None`` and ``...``.

Mapping patterns change to allow arbitrary primary expressions for keys, rather
than being restricted to literal patterns or value patterns.

Wildcard patterns remain the same in the proposed surface syntax, but are
explicitly distinguished from the use of ``_`` as an identifier in the Abstract
Syntax Tree produced by the parser.


Constraint patterns
-------------------

Constraint patterns use the following simplified syntax::

    constraint_pattern: id_constraint | eq_constraint
    eq_constraint: '==' primary
    id_constraint: 'is' primary

The constraint expression is an arbitrary primary expression - it can be a
simple name, a dotted name lookup, a literal, a function call, or any other
primary expression.

If this PEP were to be adopted in preference to PEP 634, then all literal and
value patterns could instead be written more explicitly as constraint patterns::

    # Literal patterns
    match number:
        case == 0:
            print("Nothing")
        case == 1:
            print("Just one")
        case == 2:
            print("A couple")
        case == (-1):
            print("One less than nothing")
        case == (1-1j):
            print("Good luck with that...")

    # Additional literal patterns
    match value:
        case == True:
            print("True or 1")
        case == False:
            print("False or 0")
        case == None:
            print("None")
        case == "Hello":
            print("Text 'Hello'")
        case == b"World!":
            print("Binary 'World!'")
        case == ...:
            print("May be useful when writing __getitem__ methods?")

    # Matching by identity rather than equality
    SENTINEL = object()
    match value:
        case is True:
            print("True, not 1")
        case is False:
            print("False, not 0")
        case is None:
            print("None, following PEP 8 comparison guidelines")
        case is SENTINEL:
            print("Matches the sentinel by identity, not just value")

    # Constant value patterns
    from enum import Enum
    class Sides(str, Enum):
        SPAM = "Spam"
        EGGS = "eggs"
        ...

    preferred_side = Sides.EGGS
    match entree[-1]:
        case == Sides.SPAM: # Compares entree[-1] == Sides.SPAM.
            response = "Have you got anything without Spam?"
        case == preferred_side: # Compares entree[-1] == preferred_side
            response = f"Oh, I love {preferred_side}!"
        case side: # Assigns side = entree[-1].
            response = f"Well, could I have their Spam instead of the {side} then?"

Note the ``== preferred_side`` example: using an explicit prefix marker on
constraint expressions removes the restriction to only working with attributes
or literals for value lookups. The ``== (-1)`` and ``== (1-1j)`` examples
illustrate the use of parentheses to turn any subexpression into an atomic one.

This PEP retains the caching property specified for value patterns in PEP 634:
if a particular constraint pattern occurs more than once in a given match
statement, language implementations are explicitly permitted to cache the first
calculation on any given match statement execution and re-use it in other
clauses. (This implicit caching is less necessary in this PEP, given that
explicit local variable caching becomes a valid option, but it still seems a
useful property to preserve.)


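For comparison, the closest spelling that PEP 634's own syntax offers for the
``== preferred_side`` clause is a capture pattern combined with a guard
expression. The following is a minimal sketch only, assuming an interpreter
that implements PEP 634 (Python 3.10 or later); the ``respond`` function and
its return labels are hypothetical names introduced for illustration::

    from enum import Enum

    class Sides(str, Enum):
        SPAM = "Spam"
        EGGS = "eggs"

    def respond(side, preferred_side):
        """Hypothetical helper: classify a side dish using PEP 634 syntax."""
        match side:
            case Sides.SPAM:
                # Dotted name: PEP 634 treats this as an equality check
                return "no spam"
            case candidate if candidate == preferred_side:
                # A bare name would capture, so the equality check
                # against a local variable has to move into a guard
                return "preferred"
            case _:
                return "other"

The guard form works, but it needs an extra name binding for every value
lookup, which is the verbosity that an explicit ``==`` prefix is meant to avoid.
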
Inferred constraint patterns
----------------------------

Inferred constraint patterns use the syntax proposed for literal and value
patterns in PEP 634, but arrange them differently in the proposed grammar to
allow for a straightforward transformation by the parser into explicit
constraints in the AST output::

    inferred_constraint_pattern:
        | inferred_id_constraint # Emits same parser output as id_constraint
        | inferred_eq_constraint # Emits same parser output as eq_constraint

    inferred_id_constraint:
        | 'None'
        | '...'

    inferred_eq_constraint:
        | attr_constraint
        | numeric_constraint
        | strings

    attr_constraint: attr !('.' | '(' | '=')
    attr: name_or_attr '.' NAME
    name_or_attr: attr | NAME

    numeric_constraint:
        | signed_number !('+' | '-')
        | signed_number '+' NUMBER
        | signed_number '-' NUMBER
    signed_number: NUMBER | '-' NUMBER

The terminology changes slightly to refer to them as a kind of constraint
rather than as a kind of pattern, clearly separating the subelements inside
patterns into "patterns", which define structures and name binding targets to
match against, and "constraints", which look up existing values to compare
against.

In practice, the key differences between this PEP's inferred constraint patterns
and PEP 634's value patterns and literal patterns are that

* inferred constraint patterns won't actually exist in the AST definition.
  Instead, they'll be replaced by an explicit constraint node, exactly as if
  they had been written with the explicit ``==`` or ``is`` prefix
* ``None`` and ``...`` are handled as part of a separate grammar rule, rather
  than needing to be handled as a special case of literal patterns in the parser
* equality constraints are inferred for f-strings in addition to being inferred
  for string literals
* inferred constraints for ``True`` and ``False`` are dropped entirely on
  grounds of ambiguity
* Numeric constraints don't enforce the restriction that they be limited to
  complex literals (only that they be limited to single numbers, or the
  addition or subtraction of two such numbers)

Note: even with inferred constraints handled entirely at the parser level, it
would still be possible to limit the inference of equality constraints to
complex numbers if the tokeniser was amended to emit a different token type
(e.g. ``INUMBER``) for imaginary numbers. The PEP doesn't currently propose
making that change (in line with its generally permissive approach), but it
could be amended to do so if desired.


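The inference rules above can be approximated with the standard ``ast``
module. This is an illustrative sketch only, not the reference
implementation: ``infer_constraint`` and its helpers are hypothetical names,
and the sketch classifies an already-isolated pattern fragment instead of
hooking into the real parser::

    import ast

    def _is_dotted_name(node):
        # attr: name_or_attr '.' NAME
        while isinstance(node, ast.Attribute):
            node = node.value
        return isinstance(node, ast.Name)

    def _is_number(node):
        # NUMBER tokens only: bool is excluded even though it subclasses int
        return (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float, complex))
                and not isinstance(node.value, bool))

    def _is_signed_number(node):
        # signed_number: NUMBER | '-' NUMBER
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            node = node.operand
        return _is_number(node)

    def infer_constraint(source):
        """Return (kind, detail) for a pattern fragment, or raise SyntaxError."""
        node = ast.parse(source, mode="eval").body
        if isinstance(node, ast.Name):
            # Bare names bind; only "_" is the wildcard
            return ("wildcard", None) if node.id == "_" else ("capture", node.id)
        if isinstance(node, ast.Constant):
            if node.value is None or node.value is Ellipsis:
                return ("is", source)  # inferred identity constraint
            if isinstance(node.value, bool):
                raise SyntaxError("True/False need an explicit '==' or 'is'")
            return ("==", source)  # numbers, strings, bytes
        if isinstance(node, ast.JoinedStr):
            return ("==", source)  # f-strings are treated like other strings
        if isinstance(node, ast.Attribute) and _is_dotted_name(node):
            return ("==", source)  # attr_constraint
        if isinstance(node, ast.UnaryOp) and _is_signed_number(node):
            return ("==", source)  # negative number
        if (isinstance(node, ast.BinOp)
                and isinstance(node.op, (ast.Add, ast.Sub))
                and _is_signed_number(node.left)
                and _is_number(node.right)):
            return ("==", source)  # signed_number ('+' | '-') NUMBER
        raise SyntaxError("not an inferred constraint; use an explicit prefix")

Under these assumptions, ``infer_constraint("None")`` reports an identity
constraint, ``infer_constraint("Sides.SPAM")`` and ``infer_constraint("1-1j")``
report equality constraints, and ``infer_constraint("True")`` is rejected.
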
Mapping patterns
----------------

Mapping patterns inherit the change to replace literal patterns and
value patterns with constraint patterns that allow arbitrary primary
expressions::

    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | primary ':' or_pattern
        | '**' capture_pattern

However, the constraint marker prefix is not needed in this case, as the fact
this is a key to be looked up rather than a name to be bound can already be
inferred from its position within a mapping pattern.

This means that in simple cases, mapping patterns look exactly as they do in
PEP 634::

    import constants

    match config:
        case {"route": route}:
            process_route(route)
        case {constants.DEFAULT_PORT: sub_config, **rest}:
            process_config(sub_config, rest)

Unlike PEP 634, however, ordinary local and global variables can also be used
to match mapping keys::

    ROUTE_KEY = "route"
    ADDRESS_KEY = "local_address"
    PORT_KEY = "port"
    match config:
        case {ROUTE_KEY: route}:
            process_route(route)
        case {ADDRESS_KEY: address, PORT_KEY: port}:
            process_address(address, port)

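Under PEP 634's rules, by contrast, mapping keys are restricted to literals
and dotted names, so the same lookup keys have to live in a namespace. A
minimal sketch, assuming an interpreter that implements PEP 634 (Python 3.10
or later); ``Keys`` and ``dispatch`` are hypothetical names introduced for
illustration::

    class Keys:
        """Hypothetical namespace so the keys can be written as dotted names."""
        ROUTE = "route"
        ADDRESS = "local_address"
        PORT = "port"

    def dispatch(config):
        match config:
            case {Keys.ROUTE: route}:
                return ("route", route)
            case {Keys.ADDRESS: address, Keys.PORT: port}:
                return ("address", address, port)
            case _:
                return ("unknown",)

The behaviour is the same as in the variable-keyed example; only the place
where the constants are defined has to change.
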
Note: as complex literals are written as binary operations that are evaluated
at compile time, this PEP nominally requires that they be written in parentheses
when used as a key in a mapping pattern. This requirement could be relaxed to
match PEP 634's handling of complex numbers by also accepting
``numeric_constraint`` as defining a valid key expression, and this is how
the draft reference implementation currently works (so the affected PEP 634
test cases will compile and run as expected).


Wildcard patterns
-----------------

Wildcard patterns retain the same ``_`` syntax in this PEP as they have in PEP
634. However, this PEP explicitly requires that they be represented in the
Abstract Syntax Tree as something *other than* a regular ``Name`` node.

The draft reference implementation uses the node name ``SkippedBinding`` to
indicate that the node appears where a simple name binding would ordinarily
occur, and that nothing should actually be bound, but the exact name of
the node is more an implementation decision than a design one. The key design
requirement is to limit the special casing of ``_`` to the parser and allow the
rest of the compiler to distinguish wildcard patterns from capture patterns
based entirely on information contained within the node itself.


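As a point of comparison only, the same design requirement can be checked
against an interpreter that implements PEP 634 (Python 3.10 or later). That
implementation does not use ``SkippedBinding``; it represents both capture and
wildcard patterns as ``ast.MatchAs`` nodes, with the wildcard carrying no
name. The ``pattern_nodes`` helper below is a hypothetical name introduced for
this sketch::

    import ast

    def pattern_nodes(source):
        """Return the sub-patterns of the first case in a match statement."""
        tree = ast.parse(source)
        return tree.body[0].cases[0].pattern.patterns

    capture, wildcard = pattern_nodes(
        "match value:\n"
        "    case [name, _]:\n"
        "        pass\n"
    )

    # Neither node is an ast.Name, so later compiler stages never need to
    # special-case the identifier "_" themselves.
    is_capture = isinstance(capture, ast.MatchAs) and capture.name == "name"
    is_wildcard = isinstance(wildcard, ast.MatchAs) and wildcard.name is None
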
Design Discussion
=================

Treating match pattern syntax as an extension of assignment target syntax
-------------------------------------------------------------------------

PEP 634 already draws inspiration from assignment target syntax in the design
of its sequence pattern matching - while being restricted to sequences for
performance and runtime correctness reasons, sequence patterns are otherwise
very similar to the existing iterable unpacking and tuple packing features seen
in regular assignment statements and function signature declarations.

By requiring that any new semantics introduced by match patterns be given new
syntax that is currently disallowed in assignment targets, one of the goals of
this PEP is to explicitly leave the door open to one or more future PEPs that
enhance assignment target syntax to support some of the new features introduced
by match patterns.

In particular, being able to easily deconstruct mappings into local variables
seems likely to be generally useful, even when there's only one mapping variant
to be matched::

    {"host": host, "port": port, "mode": =="TCP"} = settings

While such code could already be written using a match statement (assuming
either this PEP or PEP 634 were to be accepted into the language), an
assignment statement level variant should be able to provide standardised
exceptions for cases where the right hand side either wasn't a mapping (throwing
``TypeError``), didn't have the specified keys (throwing ``KeyError``), or didn't
have the specific values for the given keys (throwing ``ValueError``), avoiding
the need to write out that exception raising logic in every case.

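The exception behaviour described above can be sketched as an ordinary
function. This is only an approximation of what a hypothetical assignment
statement variant might do; ``unpack_mapping`` and its parameters are names
invented for this example and are not part of any proposal::

    from collections.abc import Mapping

    def unpack_mapping(value, capture_keys, required_values):
        """Return the values for capture_keys, checking required_values.

        Raises TypeError if value is not a mapping, KeyError if a key is
        missing, and ValueError if a required value does not compare equal.
        """
        if not isinstance(value, Mapping):
            raise TypeError(f"expected a mapping, got {type(value).__name__}")
        for key in (*capture_keys, *required_values):
            if key not in value:
                raise KeyError(key)
        for key, expected in required_values.items():
            if value[key] != expected:
                raise ValueError(
                    f"{key!r}: expected {expected!r}, got {value[key]!r}"
                )
        return tuple(value[key] for key in capture_keys)

    settings = {"host": "example.org", "port": 8080, "mode": "TCP"}
    host, port = unpack_mapping(settings, ("host", "port"), {"mode": "TCP"})

A match statement can express the same checks, but the caller then has to
write the three ``raise`` statements by hand in the fallback clause.
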
PEP 635 raises the concern that enough aspects of pattern matching semantics
will differ from assignment target semantics that pursuing syntactic parallels
will end up creating confusion rather than reducing it. However, the primary
examples cited as potentially causing confusion are exactly those where the
PEP 634 syntax is *already* the same as that for assignment targets: the fact
that case patterns use iterable unpacking syntax, but only match on sequences
(and specifically exclude strings and byte-strings) rather than consuming
arbitrary iterables is an aspect of PEP 634 that this PEP leaves unchanged.

These semantic differences are intrinsic to the nature of pattern matching:
whereas it is reasonable for a one-shot assignment statement to consume a
one-shot iterator, it isn't reasonable to do that in a construct that's
explicitly about matching a given value against multiple potential targets,
making full use of the available runtime type information to ensure those checks
are as side effect free as possible.

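The difference between iterable unpacking and sequence matching can be seen
directly in an interpreter that implements PEP 634 (Python 3.10 or later).
This sketch uses a hypothetical ``describe`` function for illustration::

    def describe(value):
        match value:
            case [first, second]:
                return ("pair", first, second)
            case _:
                return ("no match",)

    # Assignment consumes an arbitrary iterable, including a one-shot iterator
    left, right = iter([1, 2])

    # Sequence patterns only accept real sequences, and never strings
    from_list = describe([1, 2])
    from_iterator = describe(iter([1, 2]))
    from_string = describe("ab")

Only the list matches the sequence pattern; the iterator and the string fall
through to the wildcard clause without being consumed or unpacked.
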
It's an entirely orthogonal question to how the distinction is drawn between
capture patterns and patterns that check for expected values (constraint
patterns in this PEP, literal and value patterns in PEP 634), and it's a big
logical leap to take from "these specific semantic differences between iterable
unpacking and sequence matching are needed in order to handle checking against
multiple potential targets" to "we can reuse attribute binding syntax to mean
equality constraints instead and nobody is going to get confused by that".


Interaction with caching of attribute lookups in local variables
----------------------------------------------------------------

The major change between this PEP and PEP 634 is to offer ``== EXPR`` for value
constraint lookups, rather than only offering ``NAME.ATTR``. The main motivation
for this is to avoid the semantic conflict with regular assignment targets, where
``NAME.ATTR`` is already used in assignment statements to set attributes, so if
``NAME.ATTR`` were the *only* syntax for symbolic value matching, then
we would be pre-emptively ruling out any future attempts to allow matching against
single patterns using the existing assignment statement syntax. We'd also be
failing to provide users with suitable scaffolding to help build correct mental
models of what the shorthand forms mean in match patterns (as compared to what
they mean in assignment targets).

However, even within match statements themselves, the ``name.attr`` syntax for
value patterns has an undesirable interaction with local variable assignment,
where routine refactorings that would be semantically neutral for any other
Python statement introduce a major semantic change when applied to a match
statement.

Consider the following code::

    while value < self.limit:
        ... # Some code that adjusts "value"

The attribute lookup can be safely lifted out of the loop and only performed
once::

    _limit = self.limit
    while value < _limit:
        ... # Some code that adjusts "value"

With the marker prefix based syntax proposal in this PEP, constraint patterns
would be similarly tolerant of match patterns being refactored to use a local
variable instead of an attribute lookup, with the following two statements
being functionally equivalent::

    match expr:
        case {"key": == self.target}:
            ... # Handle the case where 'expr["key"] == self.target'
        case _:
            ... # Handle the non-matching case

    _target = self.target
    match expr:
        case {"key": == _target}:
            ... # Handle the case where 'expr["key"] == self.target'
        case _:
            ... # Handle the non-matching case

By contrast, when using the syntactic shorthand that omits the marker prefix,
the following two statements wouldn't be equivalent at all::

    # PEP 634's value pattern syntax / this PEP's attribute constraint syntax
    match expr:
        case {"key": self.target}:
            ... # Handle the case where 'expr["key"] == self.target'
        case _:
            ... # Handle the non-matching case

    _target = self.target
    match expr:
        case {"key": _target}:
            ... # Matches any mapping with "key", binding its value to _target
        case _:
            ... # Handle the non-matching case

This PEP offers a straightforward way to retain the original semantics under
this style of simplistic refactoring: use ``== _target`` to force interpretation
of the result as a constraint pattern instead of a capture pattern (i.e. drop
the no longer applicable syntactic shorthand, and switch to the explicit form).

PEP 634's proposal to offer only the shorthand syntax, with no explicitly
prefixed form, means that the primary answer on offer is "Well, don't do that,
then, only compare against attributes in namespaces, don't compare against
simple names".

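The refactoring hazard can be reproduced with PEP 634's shorthand syntax. The
following sketch assumes an interpreter that implements PEP 634 (Python 3.10
or later); the ``Checker`` class and its method names are hypothetical and
exist only to put the three spellings side by side::

    class Checker:
        def __init__(self, target):
            self.target = target

        def with_attribute(self, expr):
            match expr:
                case {"key": self.target}:  # Dotted name: equality check
                    return "matched target"
                case _:
                    return "no match"

        def with_local(self, expr):
            _target = self.target
            match expr:
                case {"key": _target}:  # Bare name: rebinds _target instead
                    return f"captured {_target!r}"
                case _:
                    return "no match"

        def with_guard(self, expr):
            _target = self.target
            match expr:
                case {"key": value} if value == _target:  # Explicit check
                    return "matched target"
                case _:
                    return "no match"

    checker = Checker(42)
    attribute_result = checker.with_attribute({"key": 1})
    local_result = checker.with_local({"key": 1})
    guard_result = checker.with_guard({"key": 1})

Only ``with_local`` accepts the non-matching mapping, because the bare name is
treated as a capture target. The guard clause restores the original
behaviour, at the cost of an extra binding.
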
PEP 622's walrus pattern syntax had another odd interaction where it might not
bind the same object as the exact same walrus expression in the body of the
case clause, but PEP 634 fixed that discrepancy by replacing walrus patterns
with AS patterns (where the fact that the value bound to the name on the RHS
might not be the same value as returned by the LHS is a standard feature common
to all uses of the "as" keyword).


Using existing comparison operators as the constraint pattern prefix
--------------------------------------------------------------------

If the need for a dedicated constraint pattern prefix is accepted, then the
next question is to ask exactly what that prefix should be.

The initially published version of this PEP proposed using the previously
unused ``?`` symbol as the prefix for equality constraints, and ``?is`` as the
prefix for identity constraints. When reviewing the PEP, Steven D'Aprano
presented a compelling counterproposal [5_] to use the existing comparison
operators (``==`` and ``is``) instead.

There were a few concerns with ``==`` as a prefix that kept it from being
chosen as the prefix in the initial iteration of the PEP:

* for common use cases, it's even more visually noisy than ``?``, as a lot of
  folks with PEP 8 trained aesthetic sensibilities are going to want to put
  a space between it and the following expression, effectively making it a 3
  character prefix instead of 1
* when used in a class pattern, there needs to be a space between the ``=``
  keyword separator and the ``==`` prefix, or the tokeniser will split them
  up incorrectly (getting ``==`` and ``=`` instead of ``=`` and ``==``)
* when used in a mapping pattern, there needs to be a space between the ``:``
  key/value separator and the ``==`` prefix, or the tokeniser will split them
  up incorrectly (getting ``:=`` and ``=`` instead of ``:`` and ``==``)

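The tokeniser behaviour described in the last two points can be observed with
the standard ``tokenize`` module, which applies the same longest-match rule
to operators. The ``operators`` helper is a hypothetical name for this sketch,
and the inputs are bare fragments, not complete match statements::

    import io
    import tokenize

    def operators(source):
        """Return the operator tokens found in a source fragment."""
        tokens = tokenize.generate_tokens(io.StringIO(source).readline)
        return [tok.string for tok in tokens if tok.type == tokenize.OP]

    # Mapping pattern separator followed by the equality prefix
    mapping_joined = operators("key:==value")    # ':=' then '='
    mapping_spaced = operators("key: ==value")   # ':' then '=='

    # Class pattern keyword separator followed by the equality prefix
    keyword_joined = operators("attr===value")   # '==' then '='
    keyword_spaced = operators("attr= ==value")  # '=' then '=='
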
Rather than introducing a completely new symbol, Steven's proposed resolution to
this verbosity problem was to retain the ability to omit the prefix marker in
syntactically unambiguous cases.

This prompted a review of the PEP's goals and underlying concerns, and the
determination that the author's core concern was with the idea of not even
*offering* users the ability to be explicit when they wanted or needed to be,
and instead telling them they could only express the intent that the compiler
inferred that they wanted - they couldn't be more explicit and override the
compiler's default inference when it turned out to be wrong (as it inevitably
will be in at least some cases).

Given that perspective, PEP 635's arguments against using ``?`` as part of the
pattern matching syntax held for this proposal as well, and so the PEP was
amended accordingly.


Retaining ``_`` as the wildcard pattern marker
----------------------------------------------

PEP 635 makes a solid case that introducing ``?`` *solely* as a wildcard pattern
marker would be a bad idea. With the syntax for constraint patterns now changed
to use existing comparison operations rather than ``?`` and ``?is``, that
argument holds for this PEP as well.

However, this PEP also proposes adopting an implementation technique that limits
the scope of the associated special casing of ``_`` to the parser: defining a
new AST node type (``SkippedBinding``) specifically for wildcard markers.

Within the parser, ``_`` would still mean either a regular name or a wildcard
marker in a match pattern depending on where you were in the parse tree, but
within the rest of the compiler, ``Name("_")`` would always be a regular name,
while ``SkippedBinding()`` would always be a wildcard marker (with it being
the responsibility of the AST validator to disallow the use of
``SkippedBinding`` outside match patterns).

It may also make sense to consider a future proposal that further changes ``_``
to also skip binding when it's used as part of an iterable unpacking target, but
that's entirely out of the scope of the pattern matching discussion (and would
require careful review of how the routine uses of assignment to ``_`` in
internationalisation use cases and Python interactive prompt implementations
are handled).


Keeping inferred equality constraints
-------------------------------------

An early (not widely publicised) draft of this proposal considered keeping
PEP 634's literal patterns, as they don't inherently conflict with assignment
statement syntax the way that PEP 634's value patterns do (trying to assign
to a literal is already a syntax error, whereas assigning to a dotted name
sets the attribute).

They were removed in the initially published version due to the fact that they
have the same syntax sensitivity problem as attribute constraints do, where
naively attempting to move the literal pattern out to a local variable for
naming clarity turns the value checking literal pattern into a name binding
capture pattern::

    # PEP 634's literal pattern syntax / this PEP's literal constraint syntax
    match expr:
        case {"port": 443}:
            ... # Handle the case where 'expr["port"] == 443'
        case _:
            ... # Handle the non-matching case

    HTTPS_PORT = 443
    match expr:
        case {"port": HTTPS_PORT}:
            ... # Matches any mapping with "port", binding its value to HTTPS_PORT
        case _:
            ... # Handle the non-matching case

With explicit equality constraints, this style of refactoring keeps the original
semantics (just as it would for a value lookup in any other statement)::

    # This PEP's equality constraints
    match expr:
        case {"port": == 443}:
            ... # Handle the case where 'expr["port"] == 443'
        case _:
            ... # Handle the non-matching case

    HTTPS_PORT = 443
    match expr:
        case {"port": == HTTPS_PORT}:
            ... # Handle the case where 'expr["port"] == 443'
        case _:
            ... # Handle the non-matching case

As noted above, both literal patterns and value patterns made their return (in
the form of inferred equality constraints) as a way to address the verbosity
problem of offering explicit ``==`` prefixed equality constraints as the *only*
way to express equality checks.

However, the presence of the explicit constraint nodes in the AST means that
these special cases can be limited to the parser, with the implicit forms
emitting the same AST nodes as their explicit counterparts.


Inferring equality constraints for f-strings
|
||
--------------------------------------------
|
||
|
||
This is less a design decision in its own right, and more a consequence of
|
||
other design decisions:
|
||
|
||
* the tokeniser and parser don't distinquish f-strings from other kinds of
|
||
strings, so inferring an explicit equality constraint for f-strings happens
|
||
by default when defining the match pattern parser rule for string literals
|
||
* the rest of the compiler then treats that output like any other explicit
|
||
equality constraint in an AST pattern node (i.e. allowing arbitary
|
||
expressions)
|
||
|
||
This combination of factors makes it awkward to implement a special case that
|
||
disallows inferring equality constraints for f-strings while accepting them for
|
||
string literals, so the PEP instead opts to just allow them (as they're just as
|
||
syntactically unambiguous as any other string in a match pattern).
|
||
|
||
|
Keeping inferred identity constraints
-------------------------------------

PEP 635 makes a reasonable case that interpreting a check against ``None``
as ``== None`` would almost always be incorrect, whereas interpreting it as
``is None`` (as advised in PEP 8) would almost always be what the user intended.

Similar reasoning applies to checking against ``...``.

Accordingly, this PEP defines the use of either of these tokens as implying an
identity constraint.

However, as with inferred equality constraints, inferred identity constraints
become explicit identity constraints in the parser output.


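The difference between the two interpretations only shows up for objects with
an unusual ``__eq__``, which is why it is easy to overlook. A small
illustration in plain Python (the ``AlwaysEqual`` class is hypothetical):

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# An equality check is fooled by the permissive __eq__ ...
equal_to_none = (obj == None)  # noqa: E711 (deliberately not 'is')
# ... whereas an identity check is not
identical_to_none = (obj is None)

print(equal_to_none, identical_to_none)
```
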
Disallowing inferred constraints for ``True`` and ``False``
-----------------------------------------------------------

PEP 635 makes a reasonable case that comparing the ``True`` and ``False``
literals by equality by default is problematic. PEP 8 advises against writing
those comparisons out explicitly in code, so it doesn't make sense for us to
implement a construct that does so implicitly inside the interpreter.

Unlike PEP 635, however, this PEP proposes to resolve the discrepancy by leaving
these two names out of the initial iteration of the inferred constraint syntax
definition entirely, rather than treating them as implying an identity constraint.

This means comparisons against ``True`` and ``False`` in match patterns would
need to be written in one of the following forms:

* comparison by numeric value::

    case 0:
        ...
    case 1:
        ...

* comparison by equality (equivalent to comparison by numeric value)::

    case == False:
        ...
    case == True:
        ...

* comparison by identity::

    case is False:
        ...
    case is True:
        ...

* comparison by value with class check (equivalent to comparison by identity)::

    case bool(False):
        ...
    case bool(True):
        ...

* comparison by boolean coercion::

    case (x, p) if not p:
        ...
    case (x, p) if p:
        ...

The last approach is the one that would most closely follow PEP 8's guidance
for ``if``-``elif`` chains (comparing by boolean coercion), but it's far from
clear at this point how ``True`` and ``False`` literals will end up being used
in pattern matching use cases.

In particular, PEP 635's assessment that users will *probably* mean "comparison
by value with class check", which effectively becomes "comparison by identity"
due to ``True`` and ``False`` being singletons, is a genuinely plausible
suggestion.

However, rather than attempting to guess up front, this PEP proposes that no
shorthand form be offered for these two constants in the initial implementation,
and we instead wait and see if a clearly preferred meaning emerges from actual
usage of the new construct.


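The practical difference between these spellings can be seen by applying the
corresponding plain Python checks to a few candidate values. The helper
functions below are hypothetical stand-ins for the pattern forms listed in this
section, not part of the proposal:

```python
def by_equality(value):
    # Mirrors 'case == True': also accepts 1 and 1.0
    return value == True  # noqa: E712


def by_identity(value):
    # Mirrors 'case is True': accepts only the True singleton
    return value is True


def by_class_and_value(value):
    # Mirrors 'case bool(True)': an isinstance() check plus equality
    return isinstance(value, bool) and value == True  # noqa: E712


def by_coercion(value):
    # Mirrors a guard such as 'if p': accepts any truthy object
    return bool(value)


for candidate in (True, 1, 1.0, "yes"):
    print(repr(candidate), by_equality(candidate), by_identity(candidate),
          by_class_and_value(candidate), by_coercion(candidate))
```

Only the identity check and the class check agree on every input, which is the
equivalence noted in the list above.
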
Inferred constraints rather than implied constraints
----------------------------------------------------

This PEP uses the term "inferred constraint" to make it clear that the parser
is making assumptions about the user's intent when converting an inferred
constraint to an explicit one.

Calling them "implied constraints" instead would also be reasonable, but that
phrasing has a slightly stronger connotation that the inference is always going
to be correct, and one of the motivations of this PEP is that the inference
*isn't* always going to be correct, so we should be offering a way for users to
be explicit when the parser's assumptions don't align with their intent.


Deferred Ideas
==============

Allowing negated constraints in match patterns
----------------------------------------------

The requirement that constraint expressions be primary expressions means that
it isn't permitted to write ``!= expr`` or ``is not expr``.

Both of these forms have clear potential interpretations as a negated equality
constraint (i.e. ``x != expr``) and a negated identity constraint
(i.e. ``x is not expr``).

However, it's far from clear either form would come up often enough to justify
the dedicated syntax, so the extension has been deferred pending further
community experience with match statements.


Allowing containment checks in match patterns
---------------------------------------------

The syntax used for equality and identity constraints would be straightforward
to extend to containment checks: ``in container``.

One downside of the proposals in both this PEP and PEP 634 is that checking
for multiple values in the same case is quite verbose::

    # PEP 634's literal patterns / this PEP's inferred constraints
    match value:
        case 0 | 1 | 2 | 3:
            ...

Explicit equality constraints are even worse::

    match value:
        case == one | == two | == three | == four:
            ...

Containment constraints would provide a more concise way to check if the
match subject was present in a container::

    match value:
        case in {0, 1, 2, 3}:
            ...
        case in {one, two, three, four}:
            ...
        case in range(4): # It would accept any container, not just literal sets
            ...

Such a feature would also be readily extensible to allow all kinds of case
clauses without any further syntax updates, simply by defining ``__contains__``
appropriately on a custom class definition.

However, while this does seem like a useful extension, it isn't essential to
making match statements a valuable addition to the language, so it seems more
appropriate to defer it to a separate proposal, rather than including it here.


Rejected Ideas
==============

Restricting permitted expressions in constraint patterns and mapping pattern keys
---------------------------------------------------------------------------------

While it's entirely technically possible to restrict the kinds of expressions
permitted in constraint patterns and mapping pattern keys to just attribute
lookups and constant literals (as PEP 634 does), there isn't any clear runtime
value in doing so, so this PEP proposes allowing any kind of primary expression
(primary expressions are an existing node type in the grammar that includes
things like literals, names, attribute lookups, function calls, container
subscripts, parenthesised groups, etc).

While PEP 635 does emphasise several times that literal patterns and value
patterns are not full expressions, it doesn't ever articulate a concrete benefit
that is obtained from that restriction (just a theoretical appeal to it being
useful to separate static checks from dynamic checks, which a code style
tool could still enforce, even if the compiler itself is more permissive).

The last time we imposed such a restriction was for decorator expressions and
the primary outcome of that was that users had to put up with years of awkward
syntactic workarounds (like nesting arbitrary expressions inside function calls
that just returned their argument) to express the behaviour they wanted before
the language definition was finally updated to allow arbitrary expressions and
let users make their own decisions about readability.

The situation in PEP 634 that bears a resemblance to the situation with decorator
expressions is that arbitrary expressions are technically supported in value
patterns, they just require awkward workarounds where either all the values to
match need to be specified in a helper class that is placed before the match
statement::

    # Allowing arbitrary match targets with PEP 634's value pattern syntax
    class mt:
        value = func()
    match expr:
        case (_, mt.value):
            ... # Handle the case where 'expr[1] == func()'

Or else they need to be written as a combination of a capture pattern and a
guard expression::

    match expr:
        case (_, _matched) if _matched == func():
            ... # Handle the case where 'expr[1] == func()'

This PEP proposes skipping requiring any such workarounds, and instead
supporting arbitrary value constraints from the start::

    match expr:
        case (_, == func()):
            ... # Handle the case where 'expr[1] == func()'

Whether actually writing that kind of code is a good idea would be a topic for
style guides and code linters, not the language compiler.

In particular, if static analysers can't follow certain kinds of dynamic checks,
then they can limit the permitted expressions at analysis time, rather than the
compiler restricting them at compile time.

There are also some kinds of expressions that are almost certain to give
nonsensical results (e.g. ``yield``, ``yield from``, ``await``) due to the
pattern caching rule, where the number of times the constraint expression
actually gets evaluated will be implementation dependent. Even here, the PEP
takes the view of letting users write nonsense if they really want to.

Aside from the recently updated decorator expressions, another situation where
Python's formal syntax offers full freedom of expression that is almost never
used in practice is in ``except`` clauses: the exceptions to match against
almost always take the form of a simple name, a dotted name, or a tuple of
those, but the language grammar permits arbitrary expressions at that point.
This is a good indication that Python's user base can be trusted to
take responsibility for finding readable ways to use permissive language
features, by avoiding writing hard to read constructs even when they're
permitted by the compiler.

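As an illustration of that existing freedom, the following sketch places a
conditional expression in an ``except`` clause. The ``lookup`` function and its
``tolerate_missing`` parameter are hypothetical. The code is valid, but it is
shown as an example of what the grammar permits, not as recommended style.

```python
def lookup(mapping, key, *, tolerate_missing):
    try:
        return mapping[key]
    # The grammar accepts any expression here, not just names and tuples.
    # An empty tuple matches no exceptions at all.
    except (KeyError if tolerate_missing else ()):
        return None


print(lookup({"a": 1}, "a", tolerate_missing=False))
print(lookup({}, "a", tolerate_missing=True))
```
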
This permissiveness comes with a real concrete benefit on the implementation
side: dozens of lines of match statement specific code in the compiler are
replaced by simple calls to the existing code for compiling expressions
(including in the AST validation pass, the AST optimization pass, the symbol
table analysis pass, and the code generation pass). This implementation
benefit would accrue not just to CPython, but to every other Python
implementation looking to add match statement support.


Requiring the use of constraint prefix markers for mapping pattern keys
-----------------------------------------------------------------------

The initial (unpublished) draft of this proposal suggested requiring mapping
pattern keys be constraint patterns, just as PEP 634 requires that they be valid
literal or value patterns::

    import constants

    match config:
        case {?"route": route}:
            process_route(route)
        case {?constants.DEFAULT_PORT: sub_config, **rest}:
            process_config(sub_config, rest)

However, the extra character was syntactically noisy and unlike its use in
constraint patterns (where it distinguishes them from capture patterns), the
prefix doesn't provide any additional information here that isn't already
conveyed by the expression's position as a key within a mapping pattern.

Accordingly, the proposal was simplified to omit the marker prefix from mapping
pattern keys.

This omission also aligns with the fact that containers may incorporate both
identity and equality checks into their lookup process - they don't purely
rely on equality checks, as would be incorrectly implied by the use of the
equality constraint prefix.


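The built-in ``dict`` type is one such container: a key lookup checks identity
before falling back to equality. A NaN value, which never compares equal to
itself, makes the difference observable:

```python
nan = float("nan")
table = {nan: "found"}

equal_to_itself = (nan == nan)                # NaN is never equal to itself
same_object_found = (nan in table)            # the identity check succeeds
other_nan_found = (float("nan") in table)     # a distinct NaN object fails both

print(equal_to_itself, same_object_found, other_nan_found)
```
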
Providing dedicated syntax for binding matched constraint values
----------------------------------------------------------------

The initial (unpublished) draft of this proposal suggested allowing ``NAME?EXPR``
as a syntactically unambiguous shorthand for PEP 622's ``NAME := BASE.ATTR`` or
PEP 634's ``BASE.ATTR as NAME``.

This idea was dropped as it complicated the grammar for no gain in
expressiveness over just using the general purpose approach to combining
capture patterns with other match patterns (i.e. ``?EXPR as NAME`` at the
time, ``== EXPR as NAME`` now) when the identity of the matching object is
important.

This idea is even less appropriate after the switch to using existing comparison
operators as the marker prefix, as both ``NAME == EXPR`` and ``NAME is EXPR``
would look like ordinary comparison operations, with nothing to suggest that
``NAME`` is being bound by the pattern matching process.


Reference Implementation
========================

A reference implementation for this PEP [3_] has been derived from Brandt
Bucher's reference implementation for PEP 634 [4_].

Relative to the text of this PEP, the draft reference implementation currently
implements the variant of mapping patterns where numeric constraints are
accepted in addition to primary expressions (this allowed the PEP 634 mapping
pattern checks for complex keys to run as written).

All other modified patterns have been updated to follow this PEP rather than
PEP 634.

The AST validator for match patterns has not yet been implemented.

There is an implementation decision still to be made around representing
constraint operators in the AST. The draft implementation adds them as new
cases on the existing ``UnaryOp`` node, but there's an argument to be made that
they would be better implemented as a new ``Constraint`` node, since they're
accepted at different points in the syntax tree than other unary operators.
Making them a new node type would also allow an attribute to be added that
marked them as implicit or explicit nodes, which ``ast.unparse`` could use
to make the unparsed code look more like the original.


Acknowledgments
===============

The PEP 622 and PEP 634/635/636 authors, as the proposal in this PEP is merely
an attempt to improve the readability of an already well-constructed idea by
proposing that reusing the existing attribute binding syntax to mean an
attribute lookup will be more easily understood as syntactic sugar for a more
explicit underlying expression that's compatible with the existing binding
target syntax than it will be as the *only* way to spell such comparisons in
match patterns.

Steven D'Aprano, who made a convincing case that the key goals of this PEP could
be achieved by using existing comparison tokens to add the ability to override
the compiler when our guesses as to "what most users will want most of the time"
are inevitably incorrect for at least some users some of the time, and retaining
some of PEP 634's syntactic sugar (with a slightly different semantic definition)
to obtain the same level of brevity as PEP 634 in most situations. (Paul
Sokolovsky also independently suggested using ``==`` instead of ``?`` as a
more easily understood prefix for equality constraints).


References
==========

.. [1] Post explaining the syntactic novelties in PEP 622
   https://mail.python.org/archives/list/python-dev@python.org/message/2VRPDW4EE243QT3QNNCO7XFZYZGIY6N3/

.. [2] Declined pull request proposing to list this as a Rejected Idea in PEP 622
   https://github.com/python/peps/pull/1564

.. [3] In-progress reference implementation for this PEP
   https://github.com/ncoghlan/cpython/tree/pep-642-constraint-patterns

.. [4] PEP 634 reference implementation
   https://github.com/python/cpython/pull/22917

.. [5] Steven D'Aprano's cogent criticism of the first published iteration of this PEP
   https://mail.python.org/archives/list/python-dev@python.org/message/BTHFWG6MWLHALOD6CHTUFPHAR65YN6BP/


.. _Appendix A:

Appendix A -- Full Grammar
==========================

Here is the full modified grammar for ``match_stmt``, replacing Appendix A
in PEP 634.

Notation used beyond standard EBNF is as per PEP 634:

- ``'KWD'`` denotes a hard keyword
- ``"KWD"`` denotes a soft keyword
- ``SEP.RULE+`` is shorthand for ``RULE (SEP RULE)*``
- ``!RULE`` is a negative lookahead assertion

::

    match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
    subject_expr:
        | star_named_expression ',' [star_named_expressions]
        | named_expression
    case_block: "case" patterns [guard] ':' block
    guard: 'if' named_expression

    patterns: open_sequence_pattern | pattern
    pattern: as_pattern | or_pattern
    as_pattern: or_pattern 'as' capture_pattern
    or_pattern: '|'.closed_pattern+
    closed_pattern:
        | capture_pattern
        | wildcard_pattern
        | constraint_pattern
        | inferred_constraint_pattern
        | group_pattern
        | sequence_pattern
        | mapping_pattern
        | class_pattern

    capture_pattern: !"_" NAME !('.' | '(' | '=')

    wildcard_pattern: "_"

    constraint_pattern:
        | eq_constraint
        | id_constraint
    eq_constraint: '==' primary
    id_constraint: 'is' primary

    inferred_constraint_pattern:
        | inferred_id_constraint
        | inferred_eq_constraint

    inferred_id_constraint:
        | 'None'
        | '...'

    inferred_eq_constraint:
        | attr_constraint
        | numeric_constraint
        | strings

    attr_constraint: attr !('.' | '(' | '=')
    attr: name_or_attr '.' NAME
    name_or_attr: attr | NAME
    numeric_constraint:
        | signed_number !('+' | '-')
        | signed_number '+' NUMBER
        | signed_number '-' NUMBER
    signed_number: NUMBER | '-' NUMBER

    group_pattern: '(' pattern ')'

    sequence_pattern:
        | '[' [maybe_sequence_pattern] ']'
        | '(' [open_sequence_pattern] ')'
    open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
    maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
    maybe_star_pattern: star_pattern | pattern
    star_pattern: '*' (capture_pattern | wildcard_pattern)

    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | primary ':' pattern
        | double_star_pattern
    double_star_pattern: '**' capture_pattern

    class_pattern:
        | name_or_attr '(' [pattern_arguments ','?] ')'
    pattern_arguments:
        | positional_patterns [',' keyword_patterns]
        | keyword_patterns
    positional_patterns: ','.pattern+
    keyword_patterns: ','.keyword_pattern+
    keyword_pattern: NAME '=' pattern


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: