991 lines
41 KiB
ReStructuredText
991 lines
41 KiB
ReStructuredText
PEP: 642
|
||
Title: Constraint Pattern Syntax for Structural Pattern Matching
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Nick Coghlan <ncoghlan@gmail.com>
|
||
BDFL-Delegate:
|
||
Discussions-To: Python-Dev <python-dev@python.org>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Requires: 634
|
||
Created: 26-Sep-2020
|
||
Python-Version: 3.10
|
||
Post-History: 31-Oct-2020
|
||
Resolution:
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP covers an alternative syntax proposal for PEP 634's structural pattern
|
||
matching that explicitly anchors match expressions in the existing syntax for
|
||
assignment targets, while retaining most semantic aspects of the existing
|
||
proposal.
|
||
|
||
Specifically, this PEP adopts an additional design restriction that PEP 634's
|
||
authors considered unreasonable: that any syntax that is common to both
|
||
assignment targets and match patterns must have a comparable semantic effect,
|
||
while any novel match pattern semantics must use syntax which emits a syntax
|
||
error when used in an assignment target.
|
||
|
||
As a consequence, this PEP proposes the following changes to the proposed match
|
||
pattern syntax:
|
||
|
||
* Literal patterns and value patterns are combined into a single new
|
||
pattern type: "constraint patterns"
|
||
* Constraint patterns are either equality constraints or identity constraints
|
||
* Equality constraints use ``?`` as a prefix marker on an otherwise
|
||
arbitrary primary expression: ``?EXPR``
|
||
* Identity constraints use `?is` as a prefix marker on an otherwise
|
||
arbitrary primary expression: ``?is EXPR``
|
||
* There is no special casing of the ``None``, ``True``, or ``False`` literals
|
||
* The constraint expression in an equality constraint may be omitted to give a
|
||
non-binding wildcard pattern
|
||
* Mapping patterns change to allow arbitrary primary expressions as keys
|
||
* Attempting to use a dotted name as a match pattern is a syntax error rather
|
||
than implying an equality constraint
|
||
* Attempting to use a literal as a match pattern is a syntax error rather
|
||
than implying an equality or identity constraint
|
||
* The ``_`` identifier is no longer syntactically special (it is a normal
|
||
capture pattern, just as it is an ordinary assignment target)
|
||
|
||
|
||
Relationship with other PEPs
|
||
============================
|
||
|
||
This PEP both depends on and competes with PEP 634 - the PEP author agrees that
|
||
match statements would be a sufficiently valuable addition to the language to
|
||
be worth the additional complexity that they add to the learning process, but
|
||
disagrees with the idea that "simple name vs literal or attribute lookup" offers
|
||
an adequate syntactic distinction between name binding and value lookup
|
||
operations in match patterns.
|
||
|
||
By switching the wildcard pattern to "?", this PEP complements the proposal in
|
||
PEP 640 to allow the use of wildcard patterns in other contexts where a name
|
||
binding is syntactically required, but the application doesn't actually need
|
||
the value.
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
The original PEP 622 (which was later split into PEPs 634, 635, and 636)
|
||
incorporated an unstated but essential assumption in its syntax design: that
|
||
neither ordinary expressions *nor* the existing assignment target syntax provide
|
||
an adequate foundation for the syntax used in match patterns.
|
||
|
||
While the PEP didn't explicitly state this assumption, one of the PEP authors
|
||
explained it clearly on python-dev [1_]:
|
||
|
||
The actual problem that I see is that we have different cultures/intuitions
|
||
fundamentally clashing here. In particular, so many programmers welcome
|
||
pattern matching as an "extended switch statement" and find it therefore
|
||
strange that names are binding and not expressions for comparison. Others
|
||
argue that it is at odds with current assignment statements, say, and
|
||
question why dotted names are _/not/_ binding. What all groups seem to
|
||
have in common, though, is that they refer to _/their/_ understanding and
|
||
interpretation of the new match statement as 'consistent' or 'intuitive'
|
||
--- naturally pointing out where we as PEP authors went wrong with our
|
||
design.
|
||
|
||
But here is the catch: at least in the Python world, pattern matching as
|
||
proposed by this PEP is an unprecedented and new way of approaching a common
|
||
problem. It is not simply an extension of something already there. Even
|
||
worse: while designing the PEP we found that no matter from which angle you
|
||
approach it, you will run into issues of seeming 'inconsistencies' (which is
|
||
to say that pattern matching cannot be reduced to a 'linear' extension of
|
||
existing features in a meaningful way): there is always something that goes
|
||
fundamentally beyond what is already there in Python. That's why I argue
|
||
that arguments based on what is 'intuitive' or 'consistent' just do not
|
||
make sense _/in this case/_.
|
||
|
||
PEP 635 (and PEP 622 before it) makes a strong case that treating capture
|
||
patterns as the default usage for simple names in match patterns is the right
|
||
approach, and provides a number of examples where having names express value
|
||
constraints by default would be confusing (this difference from C/C++ switch
|
||
statement semantics is also a key reason it makes sense to use `match` as the
|
||
introductory keyword for the new statement rather than `switch`).
|
||
|
||
However, PEP 635 doesn't even *try* to make the case for the second assertion,
|
||
that treating match patterns as a variation on assignment targets also leads to
|
||
inherent contradictions. Even a PR submitted to explicitly list this option in
|
||
the "Rejected Ideas" section of the original PEP 622 was declined [2_].
|
||
|
||
This PEP instead starts from the assumption that it *is* possible to treat match
|
||
patterns as a variation on assignment targets, and the only essential
|
||
differences that emerge relative to the syntactic proposal in PEP 634 are:
|
||
|
||
* a requirement to use an explicit marker prefix on value lookups rather than
|
||
allowing them to be implied by the use of dotted names; and
|
||
* a requirement to use a non-binding wildcard marker other than ``_``.
|
||
|
||
PEP 634 also proposes special casing the literals ``None``, ``True``, and
|
||
``False`` so that they're compared by identity when written directly as a
|
||
literal pattern, but by equality when referenced by a value pattern. This PEP
|
||
eliminates those special cases by proposing distinct syntax for matching by
|
||
identity and matching by equality.
|
||
|
||
|
||
Specification
|
||
=============
|
||
|
||
This PEP retains the overall `match`/`case` statement syntax from PEP 634, and
|
||
retains both the syntax and semantics for the following match pattern variants:
|
||
|
||
* capture patterns
|
||
* class patterns
|
||
* group patterns
|
||
* sequence patterns
|
||
|
||
Pattern combination (both OR and AS patterns) and guard expressions also remain
|
||
the same as they are in PEP 634.
|
||
|
||
Wildcard patterns change their syntactic marker from `_` to `?`.
|
||
|
||
Literal patterns and value patterns are replaced by constraint
|
||
patterns.
|
||
|
||
Mapping patterns change to allow arbitrary primary expressions for keys, rather
|
||
than being restricted to literal patterns or value patterns.
|
||
|
||
|
||
Wildcard patterns
|
||
-----------------
|
||
|
||
Wildcard patterns change their syntactic marker from `_` to `?`::
|
||
|
||
# Wildcard pattern
|
||
match data:
|
||
case [?, ?]:
|
||
print("Some pair")
|
||
print(?) # Error!
|
||
|
||
With `?` taking over the role of the non-binding syntactically significant
|
||
wildcard marker, `_` reverts to working the same way it does in other assignment
|
||
contexts: it operates as an ordinary identifier and hence becomes a normal
|
||
capture pattern rather than a special case.
|
||
|
||
|
||
Constraint patterns
|
||
-------------------
|
||
|
||
Constraint patterns use the following simplified syntax::
|
||
|
||
constraint_pattern: id_constraint | eq_constraint
|
||
id_constraint: '?' 'is' primary
|
||
eq_constraint: '?' primary
|
||
|
||
The constraint expression is an arbitrary primary expression - it can be a
|
||
simple name, a dotted name lookup, a literal, a function call, or any other
|
||
primary expression.
|
||
|
||
While the compiler would allow whitespace between ``?`` and ``is`` in
|
||
identity constraints (as they're defined as separate tokens), this PEP
|
||
proposes that PEP 8 be updated to recommend writing them like ``?is``, as if
|
||
they were a combined unary operator.
|
||
|
||
If this PEP were to be adopted in preference to PEP 634, then all literal and
|
||
value patterns would instead be written as constraint patterns::
|
||
|
||
# Literal patterns
|
||
match number:
|
||
case ?0:
|
||
print("Nothing")
|
||
case ?1:
|
||
print("Just one")
|
||
case ?2:
|
||
print("A couple")
|
||
case ?-1:
|
||
print("One less than nothing")
|
||
case ?(1-1j):
|
||
print("Good luck with that...")
|
||
|
||
# Additional literal patterns
|
||
match value:
|
||
case ?True:
|
||
print("True or 1")
|
||
case ?False:
|
||
print("False or 0")
|
||
case ?None:
|
||
print("None")
|
||
case ?"Hello":
|
||
print("Text 'Hello'")
|
||
case ?b"World!":
|
||
print("Binary 'World!'")
|
||
case ?...:
|
||
print("May be useful when writing __getitem__ methods?")
|
||
|
||
# Matching by identity rather than equality
|
||
SENTINEL = object()
|
||
match value:
|
||
case ?is True:
|
||
print("True, not 1")
|
||
case ?is False:
|
||
print("False, not 0")
|
||
case ?is None:
|
||
print("None, following PEP 8 comparison guidelines")
|
||
case ?is SENTINEL:
|
||
print("Matches the sentinel by identity, not just value")
|
||
|
||
# Constant value patterns
|
||
from enum import Enum
|
||
class Sides(str, Enum):
|
||
SPAM = "Spam"
|
||
EGGS = "eggs"
|
||
...
|
||
|
||
preferred_side = Sides.EGGS
|
||
match entree[-1]:
|
||
case ?Sides.SPAM: # Compares entree[-1] == Sides.SPAM.
|
||
response = "Have you got anything without Spam?"
|
||
case ?preferred_side: # Compares entree[-1] == preferred_side
|
||
response = f"Oh, I love {preferred_side}!"
|
||
case side: # Assigns side = entree[-1].
|
||
response = f"Well, could I have their Spam instead of the {side} then?"
|
||
|
||
Note the `?preferred_side` example: using an explicit prefix marker on constraint
|
||
expressions removes the restriction to only working with bound names for value
|
||
lookups. The `?(1-1j)` example illustrates the use of parentheses to turn any
|
||
subexpression into an atomic one.
|
||
|
||
This PEP retains the caching property specified for value patterns in PEP 634:
|
||
if a particular constraint pattern occurs more than once in a given match
|
||
statement, language implementations are explicitly permitted to cache the first
|
||
calculation on any given match statement execution and re-use it in other
|
||
clauses. (This implicit caching is less necessary in this PEP, given that
|
||
explicit local variable caching becomes a valid option, but it still seems a
|
||
useful property to preserve)
|
||
|
||
Mapping patterns
|
||
----------------
|
||
|
||
Mapping patterns inherit the change to replace literal patterns and constant
|
||
value patterns with constraint patterns::
|
||
|
||
mapping_pattern: '{' [items_pattern] '}'
|
||
items_pattern: ','.key_value_pattern+ ','?
|
||
key_value_pattern:
|
||
| primary ':' or_pattern
|
||
| '**' capture_pattern
|
||
|
||
However, the constraint marker prefix is not needed in this case, as the fact
|
||
this is a key to be looked up rather than a name to be bound is already
|
||
implied by its position within a mapping pattern.
|
||
|
||
This means that in simple cases, mapping patterns look exactly as they do in
|
||
PEP 634::
|
||
|
||
import constants
|
||
|
||
match config:
|
||
case {"route": route}:
|
||
process_route(route)
|
||
case {constants.DEFAULT_PORT: sub_config, **rest}:
|
||
process_config(sub_config, rest)
|
||
|
||
Unlike PEP 634, however, ordinary local and global variables can also be used
|
||
to match mapping keys::
|
||
|
||
ROUTE_KEY="route"
|
||
ADDRESS_KEY="local_address"
|
||
PORT_KEY="port"
|
||
match config:
|
||
case {ROUTE_KEY: route}:
|
||
process_route(route)
|
||
case {ADDRESS_KEY: address, PORT_KEY: port}:
|
||
process_address(address, port)
|
||
|
||
Note: as complex literals are written as binary operations that are evaluated
|
||
at compile time, this PEP requires that they be written in parentheses when
|
||
used as a key in a mapping pattern.
|
||
|
||
|
||
Design Discussion
|
||
=================
|
||
|
||
Treating match pattern syntax as an extension of assignment target syntax
|
||
-------------------------------------------------------------------------
|
||
|
||
PEP 634 already draws inspiration from assignment target syntax in the design
|
||
of its sequence pattern matching - while being restricted to sequences for
|
||
performance and runtime correctness reasons, sequence patterns are otherwise
|
||
very similar to the existing iterable unpacking and tuple packing features seen
|
||
in regular assignment statements and function signature declarations.
|
||
|
||
By requiring that any new semantics introduced by match patterns be given new
|
||
syntax that is currently disallowed in assignment targets, one of the goals of
|
||
this PEP is to explicitly leave the door open to one or more future PEPs that
|
||
enhance assignment target syntax to support some of the new features introduced
|
||
by match patterns.
|
||
|
||
In particular, being able to easily deconstruct mappings into local variables
|
||
seems likely to be generally useful, even when there's only one mapping variant
|
||
to be matched::
|
||
|
||
{"host": host, "port": port, "mode": ?"TCP"} = settings
|
||
|
||
While such code could already be written using a match statement (assuming
|
||
either this PEP or PEP 634 were to be accepted into the language), an
|
||
assignment statement level variant should be able to provide standardised
|
||
exceptions for cases where the right hand side either wasn't a mapping (throwing
|
||
`TypeError`), didn't have the specified keys (throwing `KeyError`), or didn't
|
||
have the specific values for the given keys (throwing `ValueError`), avoiding
|
||
the need to write out that exception raising logic in every case.
|
||
|
||
PEP 635 raises the concern that enough aspects of pattern matching semantics
|
||
will differ from assignment target semantics that pursuing syntactic parallels
|
||
will end up creating confusion rather than reducing it. However, the primary
|
||
examples cited as potentially causing confusion are exactly those where the
|
||
PEP 634 syntax is *already* the same as that for assignment targets: the fact
|
||
that case patterns use iterable unpacking syntax, but only match on sequences
|
||
(and specifically exclude strings and byte-strings) rather than consuming
|
||
arbitrary iterables is an aspect of PEP 634 that this PEP leaves unchanged.
|
||
|
||
These semantic differences are intrinsic to the nature of pattern matching:
|
||
whereas it is reasonable for a one-shot assignment statement to consume a
|
||
one-shot iterator, it isn't reasonable to do that in a construct that's
|
||
explicitly about matching a given value against multiple potential targets,
|
||
making full use of the available runtime type information to ensure those checks
|
||
are as side effect free as possible.
|
||
|
||
It's an entirely orthogonal question to how the distinction is drawn between
|
||
capture patterns and patterns that check for expected values (constraint
|
||
patterns in this PEP, literal and value patterns in PEP 634), and it's a big
|
||
logical leap to take from "these specific semantic differences between iterable
|
||
unpacking and sequence matching are needed in order to handle checking against
|
||
multiple potential targets" to "we can reuse attribute binding syntax to mean
|
||
equality constraints instead and nobody is going to get confused by that".
|
||
|
||
|
||
Interaction with caching of attribute lookups in local variables
|
||
----------------------------------------------------------------
|
||
|
||
The major change between this PEP and PEP 634 is the use of `?EXPR` for value
|
||
constraint lookups, rather than ``NAME.ATTR``. The main motivation for this is
|
||
to avoid the semantic conflict with regular assignment targets, where
|
||
``NAME.ATTR`` is already used in assignment statements to set attributes.
|
||
|
||
However, even within match statements themselves, the ``name.attr`` syntax for
|
||
value patterns has an undesirable interaction with local variable assignment,
|
||
where routine refactorings that would be semantically neutral for any other
|
||
Python statement introduce a major semantic change when applied to a match
|
||
statement.
|
||
|
||
Consider the following code::
|
||
|
||
while value < self.limit:
|
||
... # Some code that adjusts "value"
|
||
|
||
The attribute lookup can be safely lifted out of the loop and only performed
|
||
once::
|
||
|
||
_limit = self.limit:
|
||
while value < _limit:
|
||
... # Some code that adjusts "value"
|
||
|
||
With the marker prefix based syntax proposal in this PEP, constraint patterns
|
||
would be similarly tolerant of match patterns being refactored to use a local
|
||
variable instead of an attribute lookup, with the following two statements
|
||
being functionally equivalent::
|
||
|
||
match expr:
|
||
case {"key": ?self.target}:
|
||
... # Handle the case where 'expr["key"] == self.target'
|
||
case ?:
|
||
... # Handle the non-matching case
|
||
|
||
_target = self.target
|
||
match expr:
|
||
case {"key": ?_target}:
|
||
... # Handle the case where 'expr["key"] == self.target'
|
||
case ?:
|
||
... # Handle the non-matching case
|
||
|
||
By contrast, PEP 634's attribution of additional semantic significance to the
|
||
use of attribute lookup notation means that the following two statements
|
||
wouldn't be equivalent at all::
|
||
|
||
|
||
# PEP 634's value pattern syntax
|
||
match expr:
|
||
case {"key": self.target}:
|
||
... # Handle the case where 'expr["key"] == self.target'
|
||
case _:
|
||
... # Handle the non-matching case
|
||
|
||
_target = self.target
|
||
match expr:
|
||
case {"key": _target}:
|
||
... # Matches any mapping with "key", binding its value to _target
|
||
case _:
|
||
... # Handle the non-matching case
|
||
|
||
To be completely clear, the latter statement means the same under this PEP as it
|
||
does under PEP 634. The difference is that PEP 634 is relying entirely on the
|
||
dotted attribute lookup syntax to identify value patterns, so when the attribute
|
||
lookup gets removed, the pattern type immediately changes from a value pattern
|
||
to a capture pattern.
|
||
|
||
By contrast, the explicit marker prefix on constraint patterns in this PEP means
|
||
that switching from a dotted lookup to a local variable lookup has no effect on
|
||
the kind of pattern that the compiler detects - to change it to a capture
|
||
pattern, you have to explicitly remove the marker prefix (which will result in
|
||
a syntax error if the binding target isn't a simple name).
|
||
|
||
PEP 622's walrus pattern syntax had another odd interaction where it might not
|
||
bind the same object as the exact same walrus expression in the body of the
|
||
case clause, but PEP 634 fixed that discrepancy by replacing walrus patterns
|
||
with AS patterns (where the fact that the value bound to the name on the RHS
|
||
might not be the same value as returned by the LHS is a standard feature common
|
||
to all uses of the "as" keyword).
|
||
|
||
|
||
Using "?" as the constraint pattern prefix
|
||
------------------------------------------
|
||
|
||
If the need for a dedicated constraint pattern prefix is accepted, then the
|
||
next question is to ask exactly what that prefix should be.
|
||
|
||
With multiple constraint patterns potentially appearing inside larger
|
||
structural patterns, using a single punctuation character rather than a keyword
|
||
is desirable for brevity.
|
||
|
||
Most potential candidates are already used in Python for another unrelated
|
||
purpose, or would integrate poorly with other aspects of the pattern matching
|
||
syntax (e.g. ``=`` or ``==`` have multiple problems along those lines, in particular
|
||
in the way they would combine with ``=`` as a keyword separator in class
|
||
patterns, or ``:`` as a key/value separate in mapping patterns).
|
||
|
||
This PEP proposes ``?`` as the prefix marker as it isn't currently used in Python's
|
||
core syntax, the proposed usage as a prefix marker won't conflict with its
|
||
use in other Python related contexts (e.g. looking up object help information in
|
||
IPython), and there are plausible mnemonics that may help users to *remember*
|
||
what the syntax means even if they can't guess the semantics if exposed to it
|
||
without any explanation (mostly that it's a shorthand for the question "Is the
|
||
unpacked value at this position equivalent to the value given by the expression?
|
||
If not, don't match")).
|
||
|
||
PEP 635 has a good discussion of the problems with this choice in the context
|
||
of using it as the wildcard pattern marker:
|
||
|
||
An alternative that does not suggest an arbitrary number of items would
|
||
be ``?``. This is even being proposed independently from pattern matching in
|
||
PEP 640. We feel however that using ``?`` as a special "assignment" target is
|
||
likely more confusing to Python users than using ``_``. It violates Python's
|
||
(admittedly vague) principle of using punctuation characters only in ways
|
||
similar to how they are used in common English usage or in high school math,
|
||
unless the usage is very well established in other programming languages
|
||
(like, e.g., using a dot for member access).
|
||
|
||
The question mark fails on both counts: its use in other programming
|
||
languages is a grab-bag of usages only vaguely suggested by the idea of a
|
||
"question". For example, it means "any character" in shell globbing,
|
||
"maybe" in regular expressions, "conditional expression" in C and many
|
||
C-derived languages, "predicate function" in Scheme,
|
||
"modify error handling" in Rust, "optional argument" and "optional chaining"
|
||
in TypeScript (the latter meaning has also been proposed for Python by
|
||
PEP 505). An as yet unnamed PEP proposes it to mark optional types,
|
||
e.g. int?.
|
||
|
||
Another common use of ``?`` in programming systems is "help", for example, in
|
||
IPython and Jupyter Notebooks and many interactive command-line utilities.
|
||
|
||
This PEP takes the view that *not* requiring a marker prefix on value lookups
|
||
in match patterns results in a cure that is worse than the disease: Python's
|
||
first ever syntax-sensitive value lookup where you can't transparently
|
||
replace an attribute lookup with a local variable lookup and maintain semantic
|
||
equivalence aside from the exact relative timing of the attribute lookup.
|
||
|
||
Assuming the requirement for a marker prefix is accepted on those grounds, then
|
||
the syntactic bar to meet isn't "Can users *guess* what the chosen symbol means
|
||
without anyone ever explaining it to them?" but instead the lower standard
|
||
applied when choosing the ``@`` symbol for both decorator expressions and matrix
|
||
multiplication and the ``:=`` character combination for assignment expressions:
|
||
"Can users *remember* what it means once they've had it explained to them at
|
||
least once?".
|
||
|
||
This PEP contends that ``?`` will be able to pass that lower standard, and would
|
||
pass it even more readily if PEP 640 were also subsequently adopted to allow it
|
||
as a general purpose non-binding wildcard marker that doesn't conflict with the
|
||
use of ``_`` in application internationalisation use cases.
|
||
|
||
PEPs proposing additional meanings for this character would need to take the
|
||
pattern matching meaning into account, but wouldn't necessarily fail purely on
|
||
that account (e.g. ``@`` was adopted as a binary operator for matrix
|
||
multiplication well after its original adoption as a decorator expression
|
||
prefix). "Value checking" related use cases such as PEP 505's None-aware
|
||
operators would likely fare especially well on that front, but each such
|
||
proposal would continue to be judged on a case-by-case basis.
|
||
|
||
|
||
Using ``?`` as the wildcard pattern
|
||
-----------------------------------
|
||
|
||
PEP 635 makes a solid case that introducing ``?`` *solely* as a wildcard pattern
|
||
marker would be a bad idea. Continuing on from the text already quoted in the
|
||
previous section:
|
||
|
||
In addition, this would put Python in a rather unique position: The
|
||
underscore is used as a wildcard pattern in every programming language
|
||
with pattern matching that we could find (including C#, Elixir, Erlang,
|
||
F#, Grace, Haskell, Mathematica, OCaml, Ruby, Rust, Scala, Swift, and
|
||
Thorn). Keeping in mind that many users of Python also work with other
|
||
programming languages, have prior experience when learning Python, and
|
||
may move on to other languages after having learned Python, we find that
|
||
such well-established standards are important and relevant with respect
|
||
to readability and learnability. In our view, concerns that this wildcard
|
||
means that a regular name received special treatment are not strong enough
|
||
to introduce syntax that would make Python special.
|
||
|
||
Other languages with pattern matching don't use ``?`` as the wildcard pattern
|
||
(they all use ``_``), and without any other usage in Python's syntax, there
|
||
wouldn't be any useful prompts to help users remember what ``?`` means when
|
||
they encounter it in a match pattern.
|
||
|
||
In this PEP, the adoption of ``?`` as the wildcard pattern marker instead comes
|
||
from asking the question "What does it mean to omit the constraint expression
|
||
from a constraint pattern?", and concluding that "match any value" is a more
|
||
useful definition in most situations than reporting a syntax error.
|
||
|
||
That said, one possible modification to consider in the name of making code and
|
||
concepts easier to share with other languages would be to exempt ``_`` from the
|
||
"no repeated names" compiler check.
|
||
|
||
With that change, using ``_`` as a wildcard marker would *work* - it would just
|
||
also bind the ``_`` name, the same as it does in any other Python assignment
|
||
context.
|
||
|
||
|
||
No special casing for ``?None``, ``?True``, and ``?False``
|
||
----------------------------------------------------------
|
||
|
||
This PEP follows PEP 622 in treating ``None``, ``True`` and ``False`` like any other
|
||
value constraint, and comparing them by equality, rather than following PEP
|
||
634 in proposing that these literals (and only these literals) be handled specially
|
||
and compared via identity.
|
||
|
||
While writing ``x is None`` is a common (and PEP 8 recommended) practice, nobody
|
||
litters their ``if``-``elif`` chains with ``x is True`` or ``x is False`` expressions,
|
||
they write ``x`` and ``not x``, both of which compare by value, not identity.
|
||
Indeed, PEP 8 explicitly disallows the use ``if x is True:`` and ``if x is False:``,
|
||
preferring the forms without any comparison operator at all.
|
||
|
||
The key problem with special casing is that it doesn't interact properly with
|
||
Python's historical practice where "a reference is just a reference, it doesn't
|
||
matter how it is spelled in the code".
|
||
|
||
Instead, with the special casing proposed in PEP 634, checking against one of
|
||
these values directly would behave differently from checking against it when
|
||
saved in a variable or attribute::
|
||
|
||
# PEP 634's literal pattern syntax
|
||
match expr:
|
||
case True:
|
||
... # Only handles the case where "expr is True"
|
||
|
||
# PEP 634's value pattern syntax
|
||
match expr:
|
||
case self.expected_match: # Set to 'True' somewhere else
|
||
... # Handles the case where "expr == True"
|
||
|
||
By contrast, the explicit prefix syntax proposed in this PEP makes it
|
||
straightforward to include both equality constraints and identity constraints,
|
||
allowing users to specify directly in their case clauses whether they want to
|
||
match by identity or by value.
|
||
|
||
This distinction means that case clauses can even be used to provide a dedicated
|
||
code path for exact identity matches on arbitrary objects::
|
||
|
||
match value:
|
||
case ?is obj:
|
||
... # Handle being given the exact same object
|
||
case ?obj:
|
||
... # Handle being given an equivalent object
|
||
case ?:
|
||
... # Handle the non-matching case
|
||
|
||
|
||
Deferred Ideas
|
||
==============
|
||
|
||
Allowing negated constraints in match patterns
|
||
----------------------------------------------
|
||
|
||
The requirement that constraint expressions be primary expressions means that
|
||
it isn't permitted to write ``?not expr`` or ``?is not expr``.
|
||
|
||
Both of these forms have reasonably clear potential interpretions as a
|
||
negated equality constraint (i.e. ``x != expr``) and a negated identity
|
||
constraint (i.e. ``x is not expr``).
|
||
|
||
However, it's far from clear either form would come up often enough to justify
|
||
the dedicated syntax, so the extension has been deferred pending further
|
||
community experience with match statements.
|
||
|
||
Note: the compiler can't enforce the primary expression restriction when asked
|
||
to compile an AST tree directly, as parentheses used purely for grouping are
|
||
lost in the AST generation process. This means the permitted ``?(not expr)``
|
||
generates the same AST as the syntactically disallowed ``?not expr`` would.
|
||
That isn't a problem though, as in the hypothetical future where this feature
|
||
was implemented, ``?not expr`` wouldn't generate the same AST as ``?(not expr)``,
|
||
it would generate a new AST node that indicated the use of a negated eqaulity
|
||
constraint pattern.
|
||
|
||
|
||
Allowing containment checks in match patterns
|
||
---------------------------------------------
|
||
|
||
The syntax used for identity constraints would be straightforward to extend to
|
||
containment checks: ``?in container``.
|
||
|
||
One downside of the proposal in this PEP relative to PEP 634 is that checking
|
||
against multiple possible values becomes noticably more verbose, especially
|
||
for literal value checks::
|
||
|
||
# PEP 634 literal pattern
|
||
match value:
|
||
case 0 | 1 | 2 | 3:
|
||
...
|
||
|
||
# This PEP's equality constraints
|
||
match value:
|
||
case ?0 | ?1 | ?2 | ?3:
|
||
...
|
||
|
||
Containment constraints would provide a more concise way to check if the
|
||
match subject was present in a container::
|
||
|
||
match value:
|
||
case ?in {0, 1, 2, 3}:
|
||
...
|
||
case ?in range(4): # It would accept any container, not just literal sets
|
||
...
|
||
|
||
Such a feature would also be readily extensible to allow all kinds of case
|
||
clauses without any further syntax updates, simply by defining ``__contains__``
|
||
appropriately on a custom class definition.
|
||
|
||
However, while this does seem like a useful extension, it isn't essential, so
|
||
it seems more appropriate to defer it to a separate proposal, rather than
|
||
including it here.
|
||
|
||
|
||
Rejected Ideas
|
||
==============
|
||
|
||
Restricting permitted expressions in constraint patterns and mapping pattern keys
|
||
---------------------------------------------------------------------------------
|
||
|
||
While it's entirely technically possible to restrict the kinds of expressions
|
||
permitted in constraint patterns and mapping pattern keys to just attribute
|
||
lookups and constant literals (as PEP 634 does), there isn't any clear runtime
|
||
value in doing so, so this PEP proposes allowing any kind of primary expression
|
||
(primary expressions are an existing node type in the grammar that includes
|
||
things like literals, names, attribute lookups, function calls, container
|
||
subscripts, parenthesised groups, etc).
|
||
|
||
While PEP 635 does emphasise several times that literal patterns and value
|
||
patterns are not full expressions, it doesn't ever articulate a concrete benefit
|
||
that is obtained from that restriction (just a theoretical appeal to it being
|
||
useful to separate static checks from dynamic checks, which a code style
|
||
tool could still enforce, even if the compiler itself is more permissive).
|
||
|
||
The last time we imposed such a restriction was for decorator expressions and
|
||
the primary outcome of that was that users had to put up with years of awkward
|
||
syntactic workarounds (like nesting arbitrary expressions inside function calls
|
||
that just returned their argument) to express the behaviour they wanted before
|
||
the language definition was finally updated to allow arbitrary expressions and
|
||
let users make their own decisions about readability.
|
||
|
||
The situation in PEP 634 that bears a resemblance to the situation with decorator
|
||
expressions is that arbitrary expressions are technically supported in value
|
||
patterns, they just require awkward workarounds where either all the values to
|
||
match need to be specified in a helper class that is placed before the match
|
||
statement::
|
||
|
||
# Allowing arbitrary match targets with PEP 634's value pattern syntax
|
||
class mt:
|
||
value = func()
|
||
match expr:
|
||
case (?, mt.value):
|
||
... # Handle the case where 'expr[1] == func()'
|
||
|
||
Or else they need to be written as a combination of a capture pattern and a
|
||
guard expression::
|
||
|
||
match expr:
|
||
case (?, _matched) if _matched == func():
|
||
... # Handle the case where 'expr[1] == func()'
|
||
|
||
This PEP proposes skipping requiring any such workarounds, and instead
|
||
supporting arbitrary value constraints from the start::
|
||
|
||
match expr:
|
||
case ?func():
|
||
... # Handle the case where 'expr == func()'
|
||
|
||
Whether actually writing that kind of code is a good idea would be a topic for
|
||
style guides and code linters, not the language compiler.
|
||
|
||
In particular, if static analysers can't follow certain kinds of dynamic checks,
|
||
then they can limit the permitted expressions at analysis time, rather than the
|
||
compiler restricting them at compile time.
|
||
|
||
There are also some kinds of expressions that are almost certain to give
|
||
nonsensical results (e.g. ``yield``, ``yield from``, ``await``) due to the
|
||
pattern caching rule, where the number of times the constraint expression
|
||
actually gets evaluated will be implementation dependent. Even here, the PEP
|
||
takes the view of letting users write nonsense if they really want to.
|
||
|
||
Aside from the recenty updated decorator expressions, another situation where
|
||
Python's formal syntax offers full freedom of expression that is almost never
|
||
used in practice is in ``except`` clauses: the exceptions to match against
|
||
almost always take the form of a simple name, a dotted name, or a tuple of
|
||
those, but the language grammar permits arbitrary expressions at that point.
|
||
This is a good indication that Python's user base can be trusted to
|
||
take responsibility for finding readable ways to use permissive language
|
||
features, by avoiding writing hard to read constructs even when they're
|
||
permitted by the compiler.
|
||
|
||
This permissiveness comes with a real concrete benefit on the implementation
|
||
side: dozens of lines of match statement specific code in the compiler is
|
||
replaced by simple calls to the existing code for compiling expressions. This
|
||
implementation benefit would accrue not just to CPython, but to every other
|
||
Python implementation looking to add match statement support.
|
||
|
||
|
||
Keeping literal patterns
|
||
------------------------
|
||
|
||
An early (not widely publicised) draft of this proposal considered keeping
|
||
PEP 634's literal patterns, as they don't inherently conflict with assignment
|
||
statement syntax the way that PEP 634's value patterns do (trying to assign
|
||
to a literal is already a syntax error, whereas assigning to a dotted name
|
||
sets the attribute).
|
||
|
||
They were subsequently removed (replaced by the combination of equality and
|
||
identity constraints) due to the fact that they have the same syntax
|
||
sensitivity problem as value patterns do, where attempting to move the
|
||
literal pattern out to a local variable for naming clarity would turn the
|
||
value checking literal pattern into a name binding capture pattern::
|
||
|
||
# PEP 634's literal pattern syntax
|
||
match expr:
|
||
case {"port": 443}:
|
||
... # Handle the case where 'expr["port"] == 443'
|
||
case _:
|
||
... # Handle the non-matching case
|
||
|
||
HTTPS_PORT = 443
|
||
match expr:
|
||
case {"port": HTTPS_PORT}:
|
||
... # Matches any mapping with "port", binding its value to HTTPS_PORT
|
||
case _:
|
||
... # Handle the non-matching case
|
||
|
||
With equality constraints, this style of refactoring keeps the original
|
||
semantics (just as it would for a value lookup in any other statement)::
|
||
|
||
# This PEP's equality constraints
|
||
match expr:
|
||
case {"port": ?443}:
|
||
... # Handle the case where 'expr["port"] == 443'
|
||
case _:
|
||
... # Handle the non-matching case
|
||
|
||
HTTPS_PORT = 443
|
||
match expr:
|
||
case {"port": ?HTTPS_PORT}:
|
||
... # Handle the case where 'expr["port"] == 443'
|
||
case _:
|
||
... # Handle the non-matching case
|
||
|
||
|
||
Requiring the use of constraint prefix markers for mapping pattern keys
|
||
-----------------------------------------------------------------------
|
||
|
||
The initial (unpublished) draft of this proposal suggested requiring mapping
|
||
pattern keys be constraint patterns, just as PEP 634 requires that they be valid
|
||
literal or value patterns::
|
||
|
||
import constants
|
||
|
||
match config:
|
||
case {?"route": route}:
|
||
process_route(route)
|
||
case {?constants.DEFAULT_PORT: sub_config, **rest}:
|
||
process_config(sub_config, rest)
|
||
|
||
However, the extra character is syntactically noisy and unlike its use in
|
||
constraint patterns (where it distinguishes them from capture patterns), the
|
||
prefix doesn't provide any additional information here that isn't already
|
||
conveyed by the expression's position as a key within a mapping pattern.
|
||
|
||
Accordingly, the proposal was simplified to omit the marker prefix from mapping
|
||
pattern keys.
|
||
|
||
This omission also aligns with the fact that containers may incorporate both
|
||
identity and equality checks into their lookup process - they don't purely
|
||
rely on equality checks, as would be incorrectly implied by the use of the
|
||
equality constraint prefix.
|
||
|
||
|
||
Providing dedicated syntax for binding matched constraint values
|
||
----------------------------------------------------------------
|
||
|
||
The initial (unpublished) draft of this proposal suggested allowing ``NAME?EXPR``
|
||
as a syntactically unambiguous shorthand for PEP 622's ``NAME := BASE.ATTR`` or
|
||
PEP 634's ``BASE.ATTR as NAME``.
|
||
|
||
This idea was dropped as it complicated the grammar for no gain in
|
||
expressiveness over just using the general purpose approach to combining
|
||
capture patterns with other match patterns (i.e. ``?EXPR as NAME``) when the
|
||
identity of the matching object is important.
|
||
|
||
|
||
Reference Implementation
|
||
========================
|
||
|
||
A reference implementation for this PEP [3_] has been derived from Brandt
|
||
Bucher's reference implementation for PEP 634 [4_].
|
||
|
||
Relative to the text of this PEP, the draft reference implementation currently
|
||
retains literal patterns mostly as implemented for PEP 634, except that the
|
||
special casing of ``None``, ``True``, and ``False`` has been removed (with
|
||
``PEP 642 TODO`` notes added to the code that can be deleted once these patterns
|
||
are dropped entirely).
|
||
|
||
Value patterns, wildcard patterns, and mapping patterns have been updated
|
||
to follow this PEP rather than PEP 634.
|
||
|
||
Removing literal patterns will be a matter of deleting the code out of the
|
||
compiler, and then adding either ``?`` or ``?is`` as necessary to the test
|
||
cases that no longer compile. This removal isn't necessary to show that the
|
||
PEP's syntax proposal is feasible, so that work has been deferred for now.
|
||
|
||
There will also be an implementation decision to be made around representing
|
||
constraint operators in the AST. The draft implementation adds them as new
|
||
cases on the existing ``UnaryOp`` node, but it would potentially be better to
|
||
implement them as a new ``Constraint`` node, since they're accepted at
|
||
different points in the syntax tree than other unary operators.
|
||
|
||
|
||
Acknowledgments
|
||
===============
|
||
|
||
The PEP 622 and PEP 634/635/636 authors, as the proposal in this PEP is merely
|
||
an attempt to improve the readability of an already well-constructed idea by
|
||
proposing that one of the key new concepts in that proposal (the ability to
|
||
express value constraints in a name binding target) is sufficiently notable
|
||
to be worthy of using up one of the few remaining unused ASCII punctuation
|
||
characters in Python's syntax instead of reusing the existing attribute binding
|
||
syntax to mean an attribute lookup.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Post explaining the syntactic novelties in PEP 622
|
||
https://mail.python.org/archives/list/python-dev@python.org/message/2VRPDW4EE243QT3QNNCO7XFZYZGIY6N3/>
|
||
|
||
.. [2] Declined pull request proposing to list this as a Rejected Idea in PEP 622
|
||
https://github.com/python/peps/pull/1564
|
||
|
||
.. [3] In-progress reference implementation for this PEP
|
||
https://github.com/ncoghlan/cpython/tree/pep-642-constraint-patterns
|
||
|
||
.. [4] PEP 634 reference implementation
|
||
https://github.com/python/cpython/pull/22917
|
||
|
||
|
||
.. _Appendix A:
|
||
|
||
Appendix A -- Full Grammar
|
||
==========================
|
||
|
||
Here is the full modified grammar for ``match_stmt``, replacing Appendix A
|
||
in PEP 634.
|
||
|
||
Notation used beyond standard EBNF is as per PEP 534:
|
||
|
||
- ``'KWD'`` denotes a hard keyword
|
||
- ``"KWD"`` denotes a soft keyword
|
||
- ``SEP.RULE+`` is shorthand for ``RULE (SEP RULE)*``
|
||
- ``!RULE`` is a negative lookahead assertion
|
||
|
||
::
|
||
|
||
match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
|
||
subject_expr:
|
||
| star_named_expression ',' [star_named_expressions]
|
||
| named_expression
|
||
case_block: "case" patterns [guard] ':' block
|
||
guard: 'if' named_expression
|
||
|
||
patterns: open_sequence_pattern | pattern
|
||
pattern: as_pattern | or_pattern
|
||
as_pattern: or_pattern 'as' capture_pattern
|
||
or_pattern: '|'.closed_pattern+
|
||
closed_pattern:
|
||
| capture_pattern
|
||
| constraint_pattern
|
||
| wildcard_pattern
|
||
| group_pattern
|
||
| sequence_pattern
|
||
| mapping_pattern
|
||
| class_pattern
|
||
|
||
capture_pattern: NAME !('.' | '(' | '=')
|
||
|
||
constraint_pattern: eq_constraint | id_constraint
|
||
id_constraint: '?' 'is' primary
|
||
eq_constraint: '?' primary
|
||
|
||
wildcard_pattern: '?'
|
||
|
||
group_pattern: '(' pattern ')'
|
||
|
||
sequence_pattern:
|
||
| '[' [maybe_sequence_pattern] ']'
|
||
| '(' [open_sequence_pattern] ')'
|
||
open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
|
||
maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
|
||
maybe_star_pattern: star_pattern | pattern
|
||
star_pattern: '*' (capture_pattern | wildcard_pattern)
|
||
|
||
mapping_pattern: '{' [items_pattern] '}'
|
||
items_pattern: ','.key_value_pattern+ ','?
|
||
key_value_pattern:
|
||
| primary ':' pattern
|
||
| double_star_pattern
|
||
double_star_pattern: '**' capture_pattern
|
||
|
||
class_pattern:
|
||
| name_or_attr '(' [pattern_arguments ','?] ')'
|
||
attr: name_or_attr '.' NAME
|
||
name_or_attr: attr | NAME
|
||
pattern_arguments:
|
||
| positional_patterns [',' keyword_patterns]
|
||
| keyword_patterns
|
||
positional_patterns: ','.pattern+
|
||
keyword_patterns: ','.keyword_pattern+
|
||
keyword_pattern: NAME '=' pattern
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document is placed in the public domain or under the
|
||
CC0-1.0-Universal license, whichever is more permissive.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|