PEP 642: Constraint Pattern Syntax for Structural Pattern Matching (#1685)
A counter-proposal that mostly endorses PEP 634, but proposes an explicit marker prefix based alternative to PEP 634's approach where attribute lookups are processed as value constraints while simple names are capture patterns.
parent 657677f3c5
commit 5e04c595b4
@ -0,0 +1,742 @@
PEP: 642
Title: Constraint Pattern Syntax for Structural Pattern Matching
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
BDFL-Delegate:
Discussions-To: Python-Dev <python-dev@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 634
Created: 26-Sep-2020
Python-Version: 3.10
Post-History: 25-Oct-2020
Resolution:

Abstract
========

This PEP covers an alternative syntax proposal for PEP 634's structural pattern
matching that explicitly anchors match expressions in the existing syntax for
assignment targets, while retaining the semantic aspects of the existing proposal.

Specifically, this PEP adopts an additional design restriction that PEP 634's
authors considered unreasonable: that any syntax that is common to both
assignment targets and match patterns must have a comparable semantic effect,
while any novel match pattern semantics must use syntax which emits a syntax
error when used in an assignment target.

As a consequence, this PEP proposes the following changes to the proposed match
pattern syntax:

* Literal patterns and value patterns are combined into a single new
  pattern type: "constraint patterns"
* Constraint patterns use `?` as a prefix marker on an otherwise arbitrary
  atomic expression: `?EXPR`
* There is no special casing of the `None`, `True`, or `False` literals
* The constraint expression may be omitted to give a non-binding wildcard pattern
* Mapping patterns change to allow arbitrary atoms as keys
* Attempting to use a dotted name as a match pattern is a syntax error rather
  than implying a value constraint
* Attempting to use a literal as a match pattern is a syntax error rather
  than implying a value constraint
* The `_` identifier is no longer syntactically special (it is a normal capture
  pattern, just as it is an ordinary assignment target)


Relationship with other PEPs
============================

This PEP both depends on and competes with PEP 634 - the PEP author agrees that
match statements would be a sufficiently valuable addition to the language to
be worth the additional complexity that they add to the learning process, but
disagrees with the idea that "simple name vs attribute lookup" offers an
adequate syntactic distinction between name binding and value lookup operations
in match patterns.

By switching the wildcard pattern to "?", this PEP complements the proposal in
PEP 640 to allow the use of wildcard patterns in other contexts where a name
binding is syntactically required, but the application doesn't actually need
the value.


Motivation
==========

The original PEP 622 (which was later split into PEPs 634, 635, and 636)
incorporated an unstated but essential assumption in its syntax design: that
neither ordinary expressions *nor* the existing assignment target syntax provide
an adequate foundation for the syntax used in match patterns.

While the PEP doesn't explicitly state this assumption, one of the PEP authors
explained it clearly on python-dev [1_]:

    The actual problem that I see is that we have different cultures/intuitions
    fundamentally clashing here. In particular, so many programmers welcome
    pattern matching as an "extended switch statement" and find it therefore
    strange that names are binding and not expressions for comparison. Others
    argue that it is at odds with current assignment statements, say, and
    question why dotted names are _/not/_ binding. What all groups seem to
    have in common, though, is that they refer to _/their/_ understanding and
    interpretation of the new match statement as 'consistent' or 'intuitive'
    --- naturally pointing out where we as PEP authors went wrong with our
    design.

    But here is the catch: at least in the Python world, pattern matching as
    proposed by this PEP is an unprecedented and new way of approaching a common
    problem. It is not simply an extension of something already there. Even
    worse: while designing the PEP we found that no matter from which angle you
    approach it, you will run into issues of seeming 'inconsistencies' (which is
    to say that pattern matching cannot be reduced to a 'linear' extension of
    existing features in a meaningful way): there is always something that goes
    fundamentally beyond what is already there in Python. That's why I argue
    that arguments based on what is 'intuitive' or 'consistent' just do not
    make sense _/in this case/_.

PEP 635 (and PEP 622 before it) makes a strong case that treating capture
patterns as the default usage for simple names in match patterns is the right
approach, and provides a number of examples where having names express value
constraints by default would be confusing (this difference from C/C++ switch
statement semantics is also a key reason it makes sense to use `match` as the
introductory keyword for the new statement rather than `switch`).

However, PEP 635 doesn't even *try* to make the case for the second assertion,
that treating match patterns as a variation on assignment targets also leads to
inherent contradictions. Even a PR submitted to explicitly list this option in
the "Rejected Ideas" section of the original PEP 622 was declined [2_].

This PEP instead starts from the assumption that it *is* possible to treat match
patterns as a variation on assignment targets, and the only essential
differences that emerge relative to the syntactic proposal in PEP 634 are:

* a requirement to use an explicit marker prefix on value lookups rather than
  allowing them to be implied by the use of dotted names; and
* a requirement to use a non-binding wildcard marker other than `_`.


Specification
=============

This PEP retains the overall `match`/`case` statement syntax from PEP 634, and
retains both the syntax and semantics for the following match pattern variants:

* capture patterns
* class patterns
* group patterns
* sequence patterns

Pattern combination (both OR and AS patterns) and guard expressions also remain
the same as they are in PEP 634.

Wildcard patterns change their syntactic marker from `_` to `?`.

Literal patterns and value patterns are replaced by constraint
patterns.

Mapping patterns change to allow arbitrary atoms for keys, rather than being
restricted to literal patterns or value patterns.


Wildcard patterns
-----------------

Wildcard patterns change their syntactic marker from `_` to `?`::

    # Wildcard pattern
    match data:
        case [?, ?]:
            print("Some pair")
            print(?)  # Error!

With `?` taking over the role of the non-binding syntactically significant
wildcard marker, `_` reverts to working the same way it does in other assignment
contexts: it operates as an ordinary identifier and hence becomes a normal
capture pattern rather than a special case.


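The behaviour of `_` in existing assignment contexts can be checked in current
Python without any match statement. The snippet below is only a minimal
illustration of that existing behaviour; the function name `unpack_pair` is
hypothetical and not part of this proposal:

```python
def unpack_pair(pair):
    # "_" is an ordinary assignment target here: it is bound to the
    # first item of the pair, just like any other name would be.
    _, second = pair
    return _, second


first, second = unpack_pair(("usually ignored", "kept"))
print(first, second)
```

Under this PEP, `case [_, second]:` would bind `_` in the same way, while
`case [?, second]:` would match the first item without binding anything.
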
Constraint patterns
-------------------

Constraint patterns use the following simplified syntax::

    constraint_pattern: '?' EXPR

The constraint expression is an arbitrary atomic expression - it can be a
simple name, a dotted name lookup, a literal, a function call, or any other
atomic expression.

If this PEP were to be adopted in preference to PEP 634, then all literal and
value patterns would instead be written as constraint patterns::

    # Literal patterns
    match number:
        case ?0:
            print("Nothing")
        case ?1:
            print("Just one")
        case ?2:
            print("A couple")
        case ?-1:
            print("One less than nothing")
        case ?(1-1j):
            print("Good luck with that...")

    # Additional literal patterns
    match value:
        case ?True:
            print("True or 1")
        case ?False:
            print("False or 0")
        case ?None:
            print("None")
        case ?"Hello":
            print("Text 'Hello'")
        case ?b"World!":
            print("Binary 'World!'")
        case ?...:
            print("May be useful when writing __getitem__ methods?")

    # Constant value patterns
    from enum import Enum
    class Sides(str, Enum):
        SPAM = "Spam"
        EGGS = "eggs"
        ...

    preferred_side = Sides.EGGS
    match entree[-1]:
        case ?Sides.SPAM:  # Compares entree[-1] == Sides.SPAM.
            response = "Have you got anything without Spam?"
        case ?preferred_side:  # Compares entree[-1] == preferred_side
            response = f"Oh, I love {preferred_side}!"
        case side:  # Assigns side = entree[-1].
            response = f"Well, could I have their Spam instead of the {side} then?"

Note the `?preferred_side` example: using an explicit prefix marker on constraint
expressions removes the restriction to only working with bound names for value
lookups. The `?(1-1j)` example illustrates the use of parentheses to turn any
subexpression into an atomic one.

This PEP retains the caching property specified for value patterns in PEP 634:
if a particular constraint pattern occurs more than once in a given match
statement, language implementations are explicitly permitted to cache the first
calculation on any given match statement execution and re-use it in other
clauses. (This implicit caching is less necessary in this PEP, given that
explicit local variable caching becomes a valid option, but it still seems a
useful property to preserve)


Mapping patterns
----------------

Mapping patterns inherit the change to replace literal patterns and constant
value patterns with constraint patterns::

    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | atom ':' or_pattern
        | '**' capture_pattern

However, the constraint marker prefix is not needed in this case, as the fact
this is a key to be looked up rather than a name to be bound is already
implied by its position within a mapping pattern.

This means that in simple cases, mapping patterns look exactly as they do in
PEP 634::

    import constants

    match config:
        case {"route": route}:
            process_route(route)
        case {constants.DEFAULT_PORT: sub_config, **rest}:
            process_config(sub_config, rest)

Unlike PEP 634, however, ordinary local and global variables can also be used
to match mapping keys::

    ROUTE_KEY="route"
    ADDRESS_KEY="local_address"
    PORT_KEY="port"
    match config:
        case {ROUTE_KEY: route}:
            process_route(route)
        case {ADDRESS_KEY: address, PORT_KEY: port}:
            process_address(address, port)


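As a rough illustration of what matching on variable keys amounts to, the
following helper emulates a mapping pattern in ordinary Python. It is only a
sketch, not the proposed implementation: the names `match_mapping_keys` and
`_MISSING` are hypothetical, and it assumes the key lookup rules described for
mapping patterns in PEP 634 (the subject must be a mapping, and every listed
key must be present):

```python
from collections.abc import Mapping

_MISSING = object()  # hypothetical sentinel meaning "key not present"


def match_mapping_keys(subject, *keys):
    """Return a tuple of the values stored under ``keys`` if ``subject`` is
    a mapping containing all of them, otherwise return None (no match)."""
    if not isinstance(subject, Mapping):
        return None
    values = [subject.get(key, _MISSING) for key in keys]
    if any(value is _MISSING for value in values):
        return None
    return tuple(values)


ROUTE_KEY = "route"
ADDRESS_KEY = "local_address"
PORT_KEY = "port"

config = {"local_address": "127.0.0.1", "port": 8080}

# Emulates: case {ROUTE_KEY: route}
print(match_mapping_keys(config, ROUTE_KEY))
# Emulates: case {ADDRESS_KEY: address, PORT_KEY: port}
print(match_mapping_keys(config, ADDRESS_KEY, PORT_KEY))
```

Callers of this sketch should compare the result against `None` rather than
testing its truthiness, since a successful match on zero keys returns an empty
tuple.
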
Design Discussion
=================

Treating match pattern syntax as an extension of assignment target syntax
-------------------------------------------------------------------------

PEP 634 already draws inspiration from assignment target syntax in the design
of its sequence pattern matching - while being restricted to sequences for
performance and runtime correctness reasons, sequence patterns are otherwise
very similar to the existing iterable unpacking and tuple packing features seen
in regular assignment statements and function signature declarations.

By requiring that any new semantics introduced by match patterns be given new
syntax that is currently disallowed in assignment targets, one of the goals of
this PEP is to explicitly leave the door open to one or more future PEPs that
enhance assignment target syntax to support some of the new features introduced
by match patterns.

In particular, being able to easily deconstruct mappings into local variables
seems likely to be generally useful, even when there's only one mapping variant
to be matched::

    {"host": host, "port": port, "mode": ?"TCP"} = settings

While such code could already be written using a match statement (assuming
either this PEP or PEP 634 were to be accepted into the language), an
assignment statement level variant should be able to provide standardised
exceptions for cases where the right hand side either wasn't a mapping (raising
`TypeError`), didn't have the specified keys (raising `KeyError`), or didn't
have the specific values for the given keys (raising `ValueError`), avoiding
the need to write out that exception raising logic in every case.


Interaction with caching of attribute lookups in local variables
----------------------------------------------------------------

The major change between this PEP and PEP 634 is the use of `?EXPR` for value
constraint lookups, rather than `NAME.ATTR`. The main motivation for this is
to avoid the semantic conflict with regular assignment targets, where
`NAME.ATTR` is already used in assignment statements to set attributes.

However, even within match statements themselves, the `name.attr` syntax for
value patterns has an undesirable interaction with local variable assignment,
where routine refactorings that would be semantically neutral for any other
Python statement introduce a major semantic change when applied to a match
statement.

Consider the following code::

    while value < self.limit:
        ... # Some code that adjusts "value"

The attribute lookup can be safely lifted out of the loop and only performed
once::

    _limit = self.limit
    while value < _limit:
        ... # Some code that adjusts "value"

With the marker prefix based syntax proposal in this PEP, constraint patterns
would be similarly tolerant of match patterns being refactored to use a local
variable instead of an attribute lookup, with the following two statements
being functionally equivalent::

    match expr:
        case {"key": ?self.target}:
            ... # Handle the case where 'expr["key"] == self.target'
        case ?:
            ... # Handle the non-matching case

    _target = self.target
    match expr:
        case {"key": ?_target}:
            ... # Handle the case where 'expr["key"] == self.target'
        case ?:
            ... # Handle the non-matching case

By contrast, PEP 634's attribution of additional semantic significance to the
use of attribute lookup notation means that the following two statements
wouldn't be equivalent at all::

    # PEP 634's value pattern syntax
    match expr:
        case {"key": self.target}:
            ... # Handle the case where 'expr["key"] == self.target'
        case _:
            ... # Handle the non-matching case

    _target = self.target
    match expr:
        case {"key": _target}:
            ... # Matches any mapping with "key", binding its value to _target
        case _:
            ... # Handle the non-matching case

To be completely clear, the latter statement means the same under this PEP as it
does under PEP 634. The difference is that PEP 634 is relying entirely on the
dotted attribute lookup syntax to identify value patterns, so when the attribute
lookup gets removed, the pattern type immediately changes from a value pattern
to a capture pattern.

By contrast, the explicit marker prefix on constraint patterns in this PEP means
that switching from a dotted lookup to a local variable lookup has no effect on
the kind of pattern that the compiler detects - to change it to a capture
pattern, you have to explicitly remove the marker prefix (which will result in
a syntax error if the binding target isn't a simple name).

PEP 622's walrus pattern syntax had another odd interaction where it might not
bind the same object as the exact same walrus expression in the body of the
case clause, but PEP 634 fixed that discrepancy by replacing walrus patterns
with AS patterns (where the fact that the value bound to the name on the RHS
might not be the same value as returned by the LHS is a standard feature common
to all uses of the "as" keyword).


Using "?" as the constraint pattern prefix
------------------------------------------

If the need for a dedicated constraint pattern prefix is accepted, then the
next question is to ask exactly what that prefix should be.

With multiple constraint patterns potentially appearing inside larger
structural patterns, using a single punctuation character rather than a keyword
is desirable for brevity.

Most potential candidates are already used in Python for another unrelated
purpose, or would integrate poorly with other aspects of the pattern matching
syntax (e.g. `=` or `==` have multiple problems along those lines, in particular
in the way they would combine with `=` as a keyword separator in class
patterns, or `:` as a key/value separator in mapping patterns).

This PEP proposes `?` as the prefix marker as it isn't currently used in Python's
core syntax, the proposed usage as a prefix marker won't conflict with its
use in other Python related contexts (e.g. looking up object help information in
IPython), and there are plausible mnemonics that may help users to *remember*
what the syntax means even if they can't guess the semantics if exposed to it
without any explanation (mostly that it's a shorthand for the question "Is the
unpacked value at this position equivalent to the value given by the expression?
If not, don't match").

PEP 635 has a good discussion of the problems with this choice in the context
of using it as the wildcard pattern marker:

    An alternative that does not suggest an arbitrary number of items would
    be `?`. This is even being proposed independently from pattern matching in
    PEP 640. We feel however that using `?` as a special "assignment" target is
    likely more confusing to Python users than using `_`. It violates Python's
    (admittedly vague) principle of using punctuation characters only in ways
    similar to how they are used in common English usage or in high school math,
    unless the usage is very well established in other programming languages
    (like, e.g., using a dot for member access).

    The question mark fails on both counts: its use in other programming
    languages is a grab-bag of usages only vaguely suggested by the idea of a
    "question". For example, it means "any character" in shell globbing,
    "maybe" in regular expressions, "conditional expression" in C and many
    C-derived languages, "predicate function" in Scheme,
    "modify error handling" in Rust, "optional argument" and "optional chaining"
    in TypeScript (the latter meaning has also been proposed for Python by
    PEP 505). An as yet unnamed PEP proposes it to mark optional types,
    e.g. int?.

    Another common use of ? in programming systems is "help", for example, in
    IPython and Jupyter Notebooks and many interactive command-line utilities.

This PEP takes the view that *not* requiring a marker prefix on value lookups
in match patterns results in a cure that is worse than the disease: Python's
first ever syntax-sensitive value lookup where you can't transparently
replace an attribute lookup with a local variable lookup and maintain semantic
equivalence aside from the exact relative timing of the attribute lookup.

Assuming the requirement for a marker prefix is accepted on those grounds, then
the syntactic bar to meet isn't "Can users *guess* what the chosen symbol means
without anyone ever explaining it to them?" but instead the lower standard
applied when choosing the `@` symbol for both decorator expressions and matrix
multiplication and the `:=` character combination for assignment expressions:
"Can users *remember* what it means once they've had it explained to them at
least once?".

This PEP contends that `?` will be able to pass that lower standard, and would
pass it even more readily if PEP 640 were also subsequently adopted to allow it
as a general purpose non-binding wildcard marker that doesn't conflict with the
use of `_` in application internationalisation use cases.

PEPs proposing additional meanings for this character would need to take the
pattern matching meaning into account, but wouldn't necessarily fail purely on
that account (e.g. `@` was adopted as a binary operator for matrix
multiplication well after its original adoption as a decorator expression
prefix). "Value checking" related use cases such as PEP 505's None-aware
operators would likely fare especially well on that front, but each such
proposal would continue to be judged on a case-by-case basis.


Using "?" as the wildcard pattern
---------------------------------

PEP 635 makes a solid case that introducing "?" *solely* as a wildcard pattern
marker would be a bad idea. Continuing on from the text already quoted in the
previous section:

    In addition, this would put Python in a rather unique position: The
    underscore is used as a wildcard pattern in every programming language
    with pattern matching that we could find (including C#, Elixir, Erlang,
    F#, Grace, Haskell, Mathematica, OCaml, Ruby, Rust, Scala, Swift, and
    Thorn). Keeping in mind that many users of Python also work with other
    programming languages, have prior experience when learning Python, and
    may move on to other languages after having learned Python, we find that
    such well-established standards are important and relevant with respect
    to readability and learnability. In our view, concerns that this wildcard
    means that a regular name received special treatment are not strong enough
    to introduce syntax that would make Python special.

Other languages with pattern matching don't use `?` as the wildcard pattern
(they all use `_`), and without any other usage in Python's syntax, there
wouldn't be any useful prompts to help users remember what `?` means when
they encounter it in a match pattern.

In this PEP, the adoption of "?" as the wildcard pattern marker instead comes
from asking the question "What does it mean to omit the constraint expression
from a constraint pattern?", and concluding that "match any value" is a more
useful definition than reporting a syntax error.

While making code and concept sharing with other languages easier is a laudable
goal, it isn't like using `_` as a wildcard marker won't *work* - it will just
bind the `_` name, the same as it does in any other Python assignment context.


No special casing for `?None`, `?True`, and `?False`
----------------------------------------------------

This PEP follows PEP 622 in treating `None`, `True` and `False` like any other
value constraint, and comparing them by equality, rather than following PEP
634 in proposing that these values (and only these values) be handled specially
and compared via identity.

While writing `x is None` is a common (and PEP 8 recommended) practice, nobody
litters their `if-elif` chains with `x is True` or `x is False` expressions;
they write `x` and `not x`, both of which compare by value, not identity.
Indeed, PEP 8 explicitly disallows the use of "if x is True:" and
"if x is False:", preferring the forms without any comparison operator at all.

The key problem with special casing is that it doesn't interact properly with
Python's historical practice where "a reference is just a reference, it doesn't
matter how it is spelled in the code".

Instead, with the special casing proposed in PEP 634, checking against one of
these values directly would behave differently from checking against it when
saved in a variable or attribute::

    # PEP 634's literal pattern syntax
    match expr:
        case True:
            ... # Only handles the case where "expr is True"

    # PEP 634's value pattern syntax
    match expr:
        case self.expected_match: # Set to 'True' somewhere else
            ... # Handles the case where "expr == True"

However, the explicit prefix syntax proposed in this PEP leaves the door open
to future proposals that would allow for more exact comparisons when desired:

* A version of literal pattern syntax could be reintroduced, such that
  `True` checked for `is True` while `?True` checked for `== True` (presumably
  accompanied by a PEP 8 update to remove the advice against writing such code
  in the first place)
* Constraint expressions could be enhanced such that `==` was just the *default*
  comparison operator, and others could be selectively introduced based on
  specific use cases (e.g. `case ?is True:`)

It's also the case that the `bool(True)` and `bool(False)` class patterns would
already exclude truthy-but-not-boolean values, so it isn't at all clear that
any significant expressiveness is gained through these special cases.


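The practical difference between the two comparison rules can be seen with
plain expressions, without any pattern matching syntax. The two helper
functions below are hypothetical stand-ins for "compare by equality" (this
PEP's rule for every constraint pattern) and "compare by identity" (PEP 634's
rule for the `None`, `True` and `False` literals):

```python
def matches_by_equality(subject, expected):
    # Stand-in for this PEP's rule: every constraint uses "==".
    return subject == expected


def matches_by_identity(subject, expected):
    # Stand-in for PEP 634's special case for None, True and False.
    return subject is expected


for subject in (True, 1, 1.0, "yes"):
    print(
        repr(subject),
        matches_by_equality(subject, True),
        matches_by_identity(subject, True),
    )
```

Only `True` itself passes the identity check, while `1` and `1.0` also pass the
equality check. That is the same set of values that would match a value pattern
referring to a variable or attribute holding `True`.
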
Rejected Ideas
==============

Providing dedicated syntax for binding matched constraint values
----------------------------------------------------------------

The initial (unpublished) draft of this proposal suggested allowing `NAME?EXPR`
as a syntactically unambiguous shorthand for PEP 622's `NAME := BASE.ATTR` or
PEP 634's `BASE.ATTR as NAME`.

This idea was dropped as it complicated the grammar for no gain in
expressiveness over just using the general purpose approach to combining
capture patterns with other match patterns (i.e. `?EXPR as NAME`) when the
identity of the matched object is important.


Requiring the use of constraint prefix markers for mapping pattern keys
-----------------------------------------------------------------------

The initial (unpublished) draft of this proposal suggested requiring mapping
pattern keys be constraint patterns, just as PEP 634 requires that they be valid
literal or value patterns::

    import constants

    match config:
        case {?"route": route}:
            process_route(route)
        case {?constants.DEFAULT_PORT: sub_config, **rest}:
            process_config(sub_config, rest)

However, the extra character is syntactically noisy and, unlike its use in
constraint patterns (where it distinguishes them from capture patterns), the
prefix doesn't provide any additional information here that isn't already
conveyed by the expression's position as a key within a mapping pattern.

Accordingly, the proposal was simplified to omit the marker prefix from mapping
pattern keys.


Restricting the permitted atoms in constraint patterns and mapping pattern keys
|
||||||
|
-------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
While it's entirely technical possible to restrict the kinds of expressions
|
||||||
|
permitted in constraint patterns and mapping pattern keys to just attribute
|
||||||
|
lookups (as PEP 634 does), there isn't any clear runtime value in doing so,
|
||||||
|
so the PEP proposes allowing any kind of atom.
|
||||||
|
|
||||||
|
While PEP 635 does emphasise several times that literal patterns and value
|
||||||
|
patterns are not full expressions, it doesn't ever articulate a concrete benefit
|
||||||
|
that is obtained from that restriction.
|
||||||
|
|
||||||
|
The last time we imposed such a restriction was for decorator expressions and
|
||||||
|
the primary outcome of that was that users had to put up with years of awkward
|
||||||
|
syntactic workarounds (like nesting arbitrary expressions inside function calls
|
||||||
|
that just returned their argument) to express the behaviour they wanted before
|
||||||
|
the language definition was finally updated to allow arbitrary expressions and
|
||||||
|
let users make their own decisions about readability.
|
||||||
|
|
||||||
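The decorator workaround mentioned above can be sketched as follows. The names
``identity``, ``tag`` and ``handler`` are hypothetical. The restriction
described in the comment applied to Python versions before 3.9, when PEP 614
relaxed the decorator grammar.

```python
def identity(obj):
    """Return the argument unchanged (the workaround helper)."""
    return obj


def tag(label):
    """Build a decorator that records a label on the decorated function."""
    def decorator(func):
        func.label = label
        return func
    return decorator


decorators = [tag("first"), tag("second")]


# Before Python 3.9, ``@decorators[0]`` was a syntax error, because decorators
# were limited to a dotted name optionally followed by a single call. Wrapping
# the subscript expression in a call to ``identity`` satisfied that grammar.
@identity(decorators[0])
def handler():
    return "handled"
```
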

The comparable situation for PEP 634 is that arbitrary expressions are
technically supported in value patterns; they just require an awkward workaround
where all the values to match need to be specified in a helper class that is
placed before the match statement::

    # Allowing arbitrary match targets with PEP 634's value pattern syntax
    class mt:
        value = func()
    match expr:
        case mt.value:
            ... # Handle the case where 'expr == func()'

This PEP proposes dispensing with any such workarounds, and instead
supporting arbitrary value constraints from the start::

    match expr:
        case ?func():
            ... # Handle the case where 'expr == func()'

Whether doing so is actually a good idea would be a topic for style
guides and code linters, not the language compiler.

In particular, if static analysers can't follow certain kinds of dynamic checks,
then they can limit the permitted expressions at analysis time, rather than the
compiler restricting them at compile time.


Acknowledgments
===============

Thanks to the PEP 622 and PEP 634/635/636 authors, as the proposal in this PEP
is merely an attempt to improve the readability of an already well-constructed
idea by proposing that one of the key new concepts in that proposal (the
ability to express value constraints in a name binding target) is sufficiently
notable to be worthy of using up one of the few remaining unused ASCII
punctuation characters in Python's syntax.


References
==========

.. [1] Post explaining the syntactic novelties in PEP 622
   https://mail.python.org/archives/list/python-dev@python.org/message/2VRPDW4EE243QT3QNNCO7XFZYZGIY6N3/

.. [2] Declined pull request proposing to list this as a Rejected Idea in PEP 622
   https://github.com/python/peps/pull/1564


.. _Appendix A:

Appendix A -- Full Grammar
==========================

Here is the full modified grammar for ``match_stmt``, replacing Appendix A
in PEP 634.

Notation used beyond standard EBNF is as per PEP 634:

- ``'KWD'`` denotes a hard keyword
- ``"KWD"`` denotes a soft keyword
- ``SEP.RULE+`` is shorthand for ``RULE (SEP RULE)*``
- ``!RULE`` is a negative lookahead assertion

Note: this still needs a reference implementation to confirm that the proposed
grammar is correct (in particular, ensuring the new wildcard and
constraint pattern definitions aren't considered ambiguous by the parser).

::

    match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
    subject_expr:
        | star_named_expression ',' [star_named_expressions]
        | named_expression
    case_block: "case" patterns [guard] ':' block
    guard: 'if' named_expression

    patterns: open_sequence_pattern | pattern
    pattern: as_pattern | or_pattern
    as_pattern: or_pattern 'as' capture_pattern
    or_pattern: '|'.closed_pattern+
    closed_pattern:
        | capture_pattern
        | constraint_pattern
        | wildcard_pattern
        | group_pattern
        | sequence_pattern
        | mapping_pattern
        | class_pattern

    capture_pattern: NAME !('.' | '(' | '=')

    wildcard_pattern: '?'

    constraint_pattern: '?' atom

    group_pattern: '(' pattern ')'

    sequence_pattern:
        | '[' [maybe_sequence_pattern] ']'
        | '(' [open_sequence_pattern] ')'
    open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
    maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
    maybe_star_pattern: star_pattern | pattern
    star_pattern: '*' (capture_pattern | wildcard_pattern)

    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | atom ':' pattern
        | double_star_pattern
    double_star_pattern: '**' capture_pattern

    class_pattern:
        | name_or_attr '(' [pattern_arguments ','?] ')'
    pattern_arguments:
        | positional_patterns [',' keyword_patterns]
        | keyword_patterns
    positional_patterns: ','.pattern+
    keyword_patterns: ','.keyword_pattern+
    keyword_pattern: NAME '=' pattern


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: