PEP 635: many improvements (#1663)
* PEP 635: Tweaks markup
  Consistently Capitalize Headings. Remove extra blank lines (two is enough). Add a few TODOs. Fix a few typos.
* Went over much of PEP 635 with a fine comb
  I got as far as capture patterns.
* Tweak wildcard patterns (adding '?'); muse on 'else'
* Reviewed up to and including sequence patterns
* Checkpoint -- got halfway through Class Patterns
* Changed Walrus to AS and added rationales (Tobias)
* Fix AS-pattern example

Co-authored-by: Tobias Kohn <webmaster@tobiaskohn.ch>
parent 0181d5c214
commit a4502e04d6
440
pep-0635.rst
|
@@ -15,7 +15,6 @@ Post-History:
|
||||||
Resolution:
|
Resolution:
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
========
|
========
|
||||||
|
|
||||||
|
@@ -31,7 +30,6 @@ TODO: Go over the feedback from the SC and make sure everything's
|
||||||
somehow addressed.
|
somehow addressed.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Motivation
|
Motivation
|
||||||
==========
|
==========
|
||||||
|
|
||||||
|
@@ -88,7 +86,7 @@ We believe that adding pattern matching to Python will enable Python
|
||||||
users to write cleaner, more readable code for examples like those
|
users to write cleaner, more readable code for examples like those
|
||||||
above, and many others.
|
above, and many others.
|
||||||
|
|
||||||
Pattern matching and OO
|
Pattern Matching and OO
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
Pattern matching is complementary to the object-oriented paradigm.
|
Pattern matching is complementary to the object-oriented paradigm.
|
||||||
|
@@ -111,21 +109,31 @@ Like the Visitor pattern, pattern matching allows for a strict separation
|
||||||
of concerns: specific actions or data processing is independent of the
|
of concerns: specific actions or data processing is independent of the
|
||||||
class hierarchy or manipulated objects. When dealing with predefined or
|
class hierarchy or manipulated objects. When dealing with predefined or
|
||||||
even built-in classes, in particular, it is often impossible to add further
|
even built-in classes, in particular, it is often impossible to add further
|
||||||
methods to the individual classes. Pattern matching not only releaves the
|
methods to the individual classes. Pattern matching not only relieves the
|
||||||
programmer or class designer from the burden of the boilerplate code needed
|
programmer or class designer from the burden of the boilerplate code needed
|
||||||
for the Visitor pattern, but is also flexible enough to directly work with
|
for the Visitor pattern, but is also flexible enough to directly work with
|
||||||
built-in types. It naturally distinguishes between sequences of different
|
built-in types. It naturally distinguishes between sequences of different
|
||||||
lengths, who might all share the same class despite obviously differing
|
lengths, which might all share the same class despite obviously differing
|
||||||
structures. Moreover, pattern matching automatically takes inheritance
|
structures. Moreover, pattern matching automatically takes inheritance
|
||||||
into account: a class *D* inheriting from *C* will be handled by a pattern
|
into account: a class *D* inheriting from *C* will be handled by a pattern
|
||||||
that targets *C* by default.
|
that targets *C* by default.
|
||||||
|
|
||||||
|
Object oriented programming is geared towards single-dispatch: it is a
|
||||||
|
single instance (or the type thereof) that determines which method is to
|
||||||
|
be called. This leads to a somewhat artificial situation in case of binary
|
||||||
|
operators where both objects might play an equal role in deciding which
|
||||||
|
implementation to use (Python addresses this through the use of reversed
|
||||||
|
binary methods). Pattern matching is structurally better suited to handle
|
||||||
|
such situations of multi-dispatch, where the action to be taken depends on
|
||||||
|
the types of several objects to equal parts.
|
||||||
|
|
||||||
TODO: Could we say more here?
|
TODO: Could we say more here?
|
||||||
|
|
||||||
Pattern and functional style
|
|
||||||
----------------------------
|
|
||||||
|
|
||||||
Most Python applications and libraries are not written in a consistent
|
Patterns and Functional Style
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
Many Python applications and libraries are not written in a consistent
|
||||||
OO style -- unlike Java, Python encourages defining functions at the
|
OO style -- unlike Java, Python encourages defining functions at the
|
||||||
top-level of a module, and for simple data structures, tuples (or
|
top-level of a module, and for simple data structures, tuples (or
|
||||||
named tuples or lists) and dictionaries are often used exclusively or
|
named tuples or lists) and dictionaries are often used exclusively or
|
||||||
|
@@ -146,33 +154,51 @@ programming style.
|
||||||
Rationale
|
Rationale
|
||||||
=========
|
=========
|
||||||
|
|
||||||
TBD.
|
This section provides the rationale for individual design decisions.
|
||||||
|
|
||||||
This section should provide the rationale for individual design decisions.
|
|
||||||
It takes the place of "Rejected ideas" in the standard PEP format.
|
It takes the place of "Rejected ideas" in the standard PEP format.
|
||||||
It is organized in sections corresponding to the specification (PEP 634).
|
It is organized in sections corresponding to the specification (PEP 634).
|
||||||
|
|
||||||
|
TODO: Cross-check against PEP 622 as well as (private) SC feedback.
|
||||||
|
|
||||||
Overview and terminology
|
|
||||||
|
Overview and Terminology
|
||||||
------------------------
|
------------------------
|
||||||
|
|
||||||
|
TODO: What to put here?
|
||||||
|
|
||||||
|
Much of the power of pattern matching comes from the nesting of subpatterns.
|
||||||
|
That the success of a pattern match depends directly on the success of
|
||||||
|
its subpatterns is thus a cornerstone of the design. However, although a
|
||||||
|
pattern like ``P(Q(), R())`` succeeds only if both subpatterns ``Q()``
|
||||||
|
and ``R()`` succeed (i.e. the success of pattern ``P`` depends on ``Q``
|
||||||
|
and ``R``), the pattern ``P`` is checked first. If ``P`` fails, neither
|
||||||
|
``Q()`` nor ``R()`` will be tried (this is a direct consequence of the
|
||||||
|
fact that if ``P`` fails, there are no subjects to match against ``Q()``
|
||||||
|
and ``R()`` in the first place).
|
||||||
|
|
||||||
|
Also note that patterns bind names to values rather than performing an
|
||||||
|
assignment. This reflects the fact that patterns aim to not have side
|
||||||
|
effects, which also means that Capture or AS patterns cannot assign a
|
||||||
|
value to an attribute or subscript. We thus consistently use the term
|
||||||
|
'bind' instead of 'assign' to emphasise this subtle difference between
|
||||||
|
traditional assignments and name binding in patterns.
|
||||||
|
|
||||||
|
|
||||||
The ``match`` statement
|
The Match Statement
|
||||||
-----------------------
|
-------------------
|
||||||
|
|
||||||
The match statement evaluates an expression to produce a subject, finds the
|
The match statement evaluates an expression to produce a subject, finds the
|
||||||
first pattern that matches the subject and executes the associated block
|
first pattern that matches the subject, and executes the associated block
|
||||||
of code. Syntactically, the match statement thus takes an expression and
|
of code. Syntactically, the match statement thus takes an expression and
|
||||||
a sequence of case clauses, where each case clause comprises a pattern and
|
a sequence of case clauses, where each case clause comprises a pattern and
|
||||||
a block of code.
|
a block of code.
|
||||||
|
|
||||||
Since case clauses comprise a block of code, they adhere to the existing
|
Since case clauses comprise a block of code, they adhere to the existing
|
||||||
indentation scheme with the syntactic structure of
|
indentation scheme with the syntactic structure of
|
||||||
``<keyword> ...: <(indented) block>``, which in turn makes it a (compound)
|
``<keyword> ...: <(indented) block>``, which resembles a compound
|
||||||
statement. The chosen keyword ``case`` reflects its widespread use in
|
statement. The keyword ``case`` reflects its widespread use in
|
||||||
pattern matching languages, ignoring those languages that use other
|
pattern matching languages, ignoring those languages that use other
|
||||||
syntactic means such as a symbol like ``|`` because it would not fit
|
syntactic means such as a symbol like ``|``, because it would not fit
|
||||||
established Python structures. The syntax of patterns following the
|
established Python structures. The syntax of patterns following the
|
||||||
keyword is discussed below.
|
keyword is discussed below.
|
||||||
|
|
||||||
|
@@ -203,7 +229,7 @@ rules:
|
||||||
...
|
...
|
||||||
|
|
||||||
This may look awkward to the eye of a Python programmer, because
|
This may look awkward to the eye of a Python programmer, because
|
||||||
everywhere else colon is followed by an indent. The ``match`` would
|
everywhere else a colon is followed by an indent. The ``match`` would
|
||||||
neither follow the syntactic scheme of simple nor composite statements
|
neither follow the syntactic scheme of simple nor composite statements
|
||||||
but rather establish a category of its own.
|
but rather establish a category of its own.
|
||||||
|
|
||||||
|
@@ -229,9 +255,9 @@ Although flat indentation would save some horizontal space, the cost of
|
||||||
increased complexity or unusual rules is too high. It would also complicate
|
increased complexity or unusual rules is too high. It would also complicate
|
||||||
life for simple-minded code editors. Finally, the horizontal space issue can
|
life for simple-minded code editors. Finally, the horizontal space issue can
|
||||||
be alleviated by allowing "half-indent" (i.e. two spaces instead of four)
|
be alleviated by allowing "half-indent" (i.e. two spaces instead of four)
|
||||||
for match statements.
|
for match statements (though we do not recommend this).
|
||||||
|
|
||||||
In sample programs using match, written as part of the development of this
|
In sample programs using ``match``, written as part of the development of this
|
||||||
PEP, a noticeable improvement in code brevity is observed, more than making
|
PEP, a noticeable improvement in code brevity is observed, more than making
|
||||||
up for the additional indentation level.
|
up for the additional indentation level.
|
||||||
|
|
||||||
|
@@ -239,7 +265,7 @@ up for the additional indentation level.
|
||||||
*Statement vs. Expression.* Some suggestions centered around the idea of
|
*Statement vs. Expression.* Some suggestions centered around the idea of
|
||||||
making ``match`` an expression rather than a statement. However, this
|
making ``match`` an expression rather than a statement. However, this
|
||||||
would fit poorly with Python's statement-oriented nature and lead to
|
would fit poorly with Python's statement-oriented nature and lead to
|
||||||
unusually long and complex expressions with the need to invent new
|
unusually long and complex expressions and the need to invent new
|
||||||
syntactic constructs or break well established syntactic rules. An
|
syntactic constructs or break well established syntactic rules. An
|
||||||
obvious consequence of ``match`` as an expression would be that case
|
obvious consequence of ``match`` as an expression would be that case
|
||||||
clauses could no longer have arbitrary blocks of code attached, but only
|
clauses could no longer have arbitrary blocks of code attached, but only
|
||||||
|
@@ -247,8 +273,46 @@ a single expression. Overall, the strong limitations could in no way
|
||||||
offset the slight simplification in some special use cases.
|
offset the slight simplification in some special use cases.
|
||||||
|
|
||||||
|
|
||||||
|
*Hard vs. Soft Keyword.* There were options to make ``match`` a hard keyword,
|
||||||
|
or choose a different keyword. Although using a hard keyword would simplify
|
||||||
|
life for simple-minded syntax highlighters, we decided not to use a hard
|
||||||
|
keyword for several reasons:
|
||||||
|
|
||||||
Match semantics
|
- Most importantly, the new parser doesn't require us to do this. Unlike
|
||||||
|
with ``async``, which caused hardships as a soft keyword for a few
|
||||||
|
releases, here we can make ``match`` a permanent soft keyword.
|
||||||
|
|
||||||
|
- ``match`` is so commonly used in existing code, that it would break
|
||||||
|
almost every existing program and would put the burden of fixing code on many
|
||||||
|
people who may not even benefit from the new syntax.
|
||||||
|
|
||||||
|
- It is hard to find an alternative keyword that would not be commonly used
|
||||||
|
in existing programs as an identifier, and would still clearly reflect the
|
||||||
|
meaning of the statement.
|
||||||
|
|
||||||
|
|
||||||
|
**Use "as" or "|" instead of "case" for case clauses.**
|
||||||
|
The pattern matching proposed here is a combination of multi-branch control
|
||||||
|
flow (in line with ``switch`` in Algol-derived languages or ``cond`` in Lisp)
|
||||||
|
and object-deconstruction as found in functional languages. While the proposed
|
||||||
|
keyword ``case`` highlights the multi-branch aspect, alternative keywords such
|
||||||
|
as ``as`` would equally be possible, highlighting the deconstruction aspect.
|
||||||
|
``as`` or ``with``, for instance, also have the advantage of already being
|
||||||
|
keywords in Python. However, since ``case`` as a keyword can only occur as a
|
||||||
|
leading keyword inside a ``match`` statement, it is easy for a parser to
|
||||||
|
distinguish between its use as a keyword or as a variable.
|
||||||
|
|
||||||
|
Other variants would use a symbol like ``|`` or ``=>``, or go entirely without
|
||||||
|
any special marker.
|
||||||
|
|
||||||
|
Since Python is a statement-oriented language in the tradition of Algol, and as
|
||||||
|
each composite statement starts with an identifying keyword, ``case`` seemed to
|
||||||
|
be most in line with Python's style and traditions.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Match Semantics
|
||||||
~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The patterns of different case clauses might overlap in that more than
|
The patterns of different case clauses might overlap in that more than
|
||||||
|
@@ -290,8 +354,8 @@ unintuitive and surprising behavior.
|
||||||
A direct consequence of this is that any variable bindings outlive the
|
A direct consequence of this is that any variable bindings outlive the
|
||||||
respective case or match statements. Even patterns that only match a
|
respective case or match statements. Even patterns that only match a
|
||||||
subject partially might bind local variables (this is, in fact, necessary
|
subject partially might bind local variables (this is, in fact, necessary
|
||||||
for guards to function properly). However, this escaping of variable
|
for guards to function properly). However, these semantics for variable
|
||||||
bindings is in line with existing Python structures such as for loops and
|
binding are in line with existing Python structures such as for loops and
|
||||||
with statements.
|
with statements.
|
||||||
|
|
||||||
|
|
||||||
|
@@ -301,9 +365,9 @@ Guards
|
||||||
Some constraints cannot be adequately expressed through patterns alone.
|
Some constraints cannot be adequately expressed through patterns alone.
|
||||||
For instance, a 'less' or 'greater than' relationship defies the usual
|
For instance, a 'less' or 'greater than' relationship defies the usual
|
||||||
'equal' semantics of patterns. Moreover, different subpatterns are
|
'equal' semantics of patterns. Moreover, different subpatterns are
|
||||||
independent and cannot refer to each other. The addition of _guards_
|
independent and cannot refer to each other. The addition of *guards*
|
||||||
addresses these restrictions: a guard is an arbitrary expression attached
|
addresses these restrictions: a guard is an arbitrary expression attached
|
||||||
to a pattern and that must evaluate to ``True`` for the pattern to succeed.
|
to a pattern and that must evaluate to a "truthy" value for the pattern to succeed.
|
||||||
|
|
||||||
For example, ``case [x, y] if x < y:`` uses a guard (``if x < y``) to
|
For example, ``case [x, y] if x < y:`` uses a guard (``if x < y``) to
|
||||||
express a 'less than' relationship between two otherwise disjoint capture
|
express a 'less than' relationship between two otherwise disjoint capture
|
||||||
|
@@ -312,15 +376,15 @@ patterns ``x`` and ``y``.
|
||||||
From a conceptual point of view, patterns describe structural constraints
|
From a conceptual point of view, patterns describe structural constraints
|
||||||
on the subject in a declarative style, ideally without any side-effects.
|
on the subject in a declarative style, ideally without any side-effects.
|
||||||
Recall, in particular, that patterns are clearly distinct from expressions,
|
Recall, in particular, that patterns are clearly distinct from expressions,
|
||||||
following different objectives and semantics. Guards then enhance the
|
following different objectives and semantics. Guards then enhance case
|
||||||
patterns in a highly controlled way with arbitrary expressions (that might
|
blocks in a highly controlled way with arbitrary expressions (that might
|
||||||
have side effects). Splitting the overal pattern into a static structural
|
have side effects). Splitting the overall functionality into a static structural
|
||||||
and a dynamic 'evaluative' part not only helps with readability, but can
|
and a dynamically evaluated part not only helps with readability, but can
|
||||||
also introduce dramatic potential for compiler optimizations. To keep this
|
also introduce dramatic potential for compiler optimizations. To keep this
|
||||||
clear separation, guards are only supported on the level of case clauses
|
clear separation, guards are only supported on the level of case clauses
|
||||||
and not for individual patterns.
|
and not for individual patterns.
|
||||||
|
|
||||||
Example using guards::
|
**Example** using guards::
|
||||||
|
|
||||||
def sort(seq):
|
def sort(seq):
|
||||||
match seq:
|
match seq:
|
||||||
|
@@ -354,64 +418,84 @@ seen as a prototype to pattern matching in Python, there is only one
|
||||||
Full pattern matching differs from this in that there is more variety
|
Full pattern matching differs from this in that there is more variety
|
||||||
in structural patterns but only a minimum of binding patterns.
|
in structural patterns but only a minimum of binding patterns.
|
||||||
|
|
||||||
Patterns differ from assignment targets (as in iterable unpacking) in that
|
Patterns differ from assignment targets (as in iterable unpacking) in two ways:
|
||||||
they impose additional constraints on the structure of the subject and in
|
they impose additional constraints on the structure of the subject, and
|
||||||
that a subject might safely fail to match a specific pattern at any point
|
a subject may safely fail to match a specific pattern at any point
|
||||||
(in iterable unpacking, this constitutes an error). The latter means that
|
(in iterable unpacking, this constitutes an error). The latter means that
|
||||||
patterns should avoid side effects wherever possible, including binding
|
patterns should avoid side effects wherever possible.
|
||||||
values to attributes or subscripts.
|
|
||||||
|
This desire to avoid side effects is one reason why capture patterns
|
||||||
|
don't allow binding values to attributes or subscripts: if the
|
||||||
|
containing pattern were to fail in a later step, it would be hard to
|
||||||
|
revert such bindings.
|
||||||
|
|
||||||
A cornerstone of pattern matching is the possibility of arbitrarily
|
A cornerstone of pattern matching is the possibility of arbitrarily
|
||||||
*nesting patterns*. The nesting allows for expressing deep
|
*nesting patterns*. The nesting allows expressing deep
|
||||||
tree structures (for an example of nested class patterns, see the motivation
|
tree structures (for an example of nested class patterns, see the motivation
|
||||||
section above) as well as alternatives.
|
section above) as well as alternatives.
|
||||||
|
|
||||||
Although the structural patterns might superficially look like expressions,
|
Although patterns might superficially look like expressions,
|
||||||
it is important to keep in mind that there is a clear distinction. In fact,
|
it is important to keep in mind that there is a clear distinction. In fact,
|
||||||
no pattern is or contains an expression. It is more productive to think of
|
no pattern is or contains an expression. It is more productive to think of
|
||||||
patterns as declarative elements similar to the formal parameters in a
|
patterns as declarative elements similar to the formal parameters in a
|
||||||
function definition.
|
function definition.
|
||||||
|
|
||||||
|
|
||||||
Walrus/AS patterns
|
AS Patterns
|
||||||
~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~
|
||||||
|
|
||||||
Patterns fall into two categories: most patterns impose a (structural)
|
Patterns fall into two categories: most patterns impose a (structural)
|
||||||
constraint that the subject needs to fulfill, whereas the capture pattern
|
constraint that the subject needs to fulfill, whereas the capture pattern
|
||||||
binds the subject to a name without regard for the subject's structure or
|
binds the subject to a name without regard for the subject's structure or
|
||||||
actual value. Consequently, a pattern can either express a constraint or
|
actual value. Consequently, a pattern can either express a constraint or
|
||||||
bind a value, but not both. Walrus/AS patterns fill this gap in that they
|
bind a value, but not both. AS patterns fill this gap in that they
|
||||||
allow the user to specify a general pattern as well as capture the subject
|
allow the user to specify a general pattern as well as capture the subject
|
||||||
in a variable.
|
in a variable.
|
||||||
|
|
||||||
Typical use cases for the Walrus/AS pattern include OR and Class patterns
|
Typical use cases for the AS pattern include OR and Class patterns
|
||||||
together with a binding name as in, e.g., ``case BinOp(op := '+'|'-', ...):``
|
together with a binding name as in, e.g., ``case BinOp('+'|'-' as op, ...):``
|
||||||
or ``case [first := int(), second := int()]:``. The latter could be
|
or ``case [int() as first, int() as second]:``. The latter could be
|
||||||
understood as saying that the subject must fulfil two distinct patterns:
|
understood as saying that the subject must fulfil two distinct patterns:
|
||||||
``[first, second]`` as well as ``[int(), int()]``. The Walrus/AS pattern
|
``[first, second]`` as well as ``[int(), int()]``. The AS pattern
|
||||||
can thus be seen as a special case of an 'and' pattern (see OR patterns
|
can thus be seen as a special case of an 'and' pattern (see OR patterns
|
||||||
below for an additional discussion of 'and' patterns).
|
below for an additional discussion of 'and' patterns).
|
||||||
|
|
||||||
Example using the Walrus/AS pattern::
|
In an earlier version, the AS pattern was devised as a 'Walrus pattern',
|
||||||
|
written as ``case [first:=int(), second:=int()]``. However, using ``as``
|
||||||
|
offers some advantages over ``:=``:
|
||||||
|
|
||||||
|
- The walrus operator ``:=`` is used to capture the result of an expression
|
||||||
|
on the right hand side, whereas ``as`` generally indicates some form of
|
||||||
|
'processing' as in ``import foo as bar`` or ``except E as err:``. Indeed,
|
||||||
|
the pattern ``P as x`` does not assign the pattern ``P`` to ``x``, but
|
||||||
|
rather the subject that successfully matches ``P``.
|
||||||
|
|
||||||
|
- ``as`` allows for a more consistent data flow from left to right (the
|
||||||
|
attributes in Class patterns also follow a left-to-right data flow).
|
||||||
|
|
||||||
|
- The walrus operator is very close to the attribute syntax in the Class pattern,
|
||||||
|
potentially leading to some confusion.
|
||||||
|
|
||||||
|
**Example** using the AS pattern::
|
||||||
|
|
||||||
def simplify_expr(tokens):
|
def simplify_expr(tokens):
|
||||||
match tokens:
|
match tokens:
|
||||||
case [l:=('('|'['), *expr, r:=(')'|']')] if (l+r) in ('()', '[]'):
|
case [('('|'[') as l, *expr, (')'|']') as r] if (l+r) in ('()', '[]'):
|
||||||
return simplify_expr(expr)
|
return simplify_expr(expr)
|
||||||
case [0, op:=('+'|'-'), right]:
|
case [0, ('+'|'-') as op, right]:
|
||||||
return UnaryOp(op, right)
|
return UnaryOp(op, right)
|
||||||
case [left:=(int() | float()) | Num(left), '+', right:=(int() | float()) | Num(right)]:
|
case [(int() | float() as left) | Num(left), '+', (int() | float() as right) | Num(right)]:
|
||||||
return Num(left + right)
|
return Num(left + right)
|
||||||
case [value:=(int() | float())]
|
case [(int() | float()) as value]:
|
||||||
return Num(value)
|
return Num(value)
|
||||||
|
|
||||||
|
|
||||||
OR patterns
|
OR Patterns
|
||||||
~~~~~~~~~~~
|
~~~~~~~~~~~
|
||||||
|
|
||||||
The OR pattern allows you to combine 'structurally equivalent' alternatives
|
The OR pattern allows you to combine 'structurally equivalent' alternatives
|
||||||
into a new pattern, i.e. several patterns can share a common handler. If any
|
into a new pattern, i.e. several patterns can share a common handler. If any
|
||||||
one of an OR pattern's subpatterns matches the given subject, the entire OR
|
of an OR pattern's subpatterns matches the subject, the entire OR
|
||||||
pattern succeeds.
|
pattern succeeds.
|
||||||
|
|
||||||
Statically typed languages prohibit the binding of names (capture patterns)
|
Statically typed languages prohibit the binding of names (capture patterns)
|
||||||
|
@@ -422,13 +506,16 @@ must bind the same set of variables so as not to leave potentially undefined
|
||||||
names. With two alternatives ``P | Q``, this means that if *P* binds the
|
names. With two alternatives ``P | Q``, this means that if *P* binds the
|
||||||
variables *u* and *v*, *Q* must bind exactly the same variables *u* and *v*.
|
variables *u* and *v*, *Q* must bind exactly the same variables *u* and *v*.
|
||||||
|
|
||||||
There was some discussion on whether to use the bar ``|`` or the keyword
|
There was some discussion on whether to use the bar symbol ``|`` or the ``or``
|
||||||
``or`` in order to separate alternatives. The OR pattern does not fully fit
|
keyword to separate alternatives. The OR pattern does not fully fit
|
||||||
the existing semantics and usage of either of these two symbols. However,
|
the existing semantics and usage of either of these two symbols. However,
|
||||||
``|`` is the symbol of choice in all programming languages with support of
|
``|`` is the symbol of choice in all programming languages with support of
|
||||||
the OR pattern and is even used in that capacity for regular expressions in
|
the OR pattern and is used in that capacity for regular expressions in
|
||||||
Python as well. Moreover, ``|`` is not only used for bitwise OR, but also
|
Python as well. It is also the traditional separator between alternatives
|
||||||
|
in formal grammars (including Python's).
|
||||||
|
Moreover, ``|`` is not only used for bitwise OR, but also
|
||||||
for set unions and dict merging (:pep:`584`).
|
for set unions and dict merging (:pep:`584`).
|
||||||
|
|
||||||
Other alternatives were considered as well, but none of these would allow
|
Other alternatives were considered as well, but none of these would allow
|
||||||
OR-patterns to be nested inside other patterns:
|
OR-patterns to be nested inside other patterns:
|
||||||
|
|
||||||
|
@@ -468,8 +555,9 @@ OR-patterns to be nested inside other patterns:
|
||||||
print("A corner of the unit square")
|
print("A corner of the unit square")
|
||||||
|
|
||||||
|
|
||||||
*AND and NOT patterns.*
|
**AND and NOT Patterns**
|
||||||
This proposal defines an OR-pattern (|) to match one of several alternates;
|
|
||||||
|
Since this proposal defines an OR-pattern (``|``) to match one of several alternates,
|
||||||
why not also an AND-pattern (``&``) or even a NOT-pattern (``!``)?
|
why not also an AND-pattern (``&``) or even a NOT-pattern (``!``)?
|
||||||
Especially given that some other languages (``F#`` for example) support
|
Especially given that some other languages (``F#`` for example) support
|
||||||
AND-patterns.
|
AND-patterns.
|
||||||
|
@@ -480,27 +568,28 @@ all attributes and elements mentioned must be present for the match to
|
||||||
succeed. Guard conditions can also support many of the use cases that a
|
succeed. Guard conditions can also support many of the use cases that a
|
||||||
hypothetical 'and' operator would be used for.
|
hypothetical 'and' operator would be used for.
|
||||||
|
|
||||||
A negation of a match pattern using the operator ``!`` as a prefix would match
|
A negation of a match pattern using the operator ``!`` as a prefix
|
||||||
exactly if the pattern itself does not match. For instance, ``!(3 | 4)``
|
would match exactly if the pattern itself does not match. For
|
||||||
would match anything except ``3`` or ``4``. However, there is evidence from
|
instance, ``!(3 | 4)`` would match anything except ``3`` or ``4``.
|
||||||
other languages that this is rarely useful and primarily used as double
|
However, there is `evidence from other languages
|
||||||
negation ``!!`` to control variable scopes and prevent variable bindings
|
<https://dl.acm.org/doi/abs/10.1145/2480360.2384582>`_ that this is
|
||||||
(which does not apply to Python). Other use cases are better expressed using
|
rarely useful, and primarily used as double negation ``!!`` to control
|
||||||
guards.
|
variable scopes and prevent variable bindings (which does not apply to
|
||||||
|
Python). Other use cases are better expressed using guards.
|
||||||
|
|
||||||
In the end, it was decided that this would make the syntax more complex
|
In the end, it was decided that this would make the syntax more complex
|
||||||
without adding a significant benefit.
|
without adding a significant benefit. It can always be added later.
|
||||||
|
|
||||||
|
|
||||||
Example using the OR pattern::
|
**Example** using the OR pattern::
|
||||||
|
|
||||||
def simplify(expr):
|
def simplify(expr):
|
||||||
match expr:
|
match expr:
|
||||||
case ('/', 0, 0):
|
case ('/', 0, 0):
|
||||||
return expr
|
return expr
|
||||||
case ('*' | '/', 0, _):
|
case ('*'|'/', 0, _):
|
||||||
return 0
|
return 0
|
||||||
case ('+' | '-', x, 0) | ('+', 0, x) | ('*', 1, x) | ('*' | '/', x, 1):
|
case ('+'|'-', x, 0) | ('+', 0, x) | ('*', 1, x) | ('*'|'/', x, 1):
|
||||||
return x
|
return x
|
||||||
return expr
|
return expr
|
||||||
|
|
||||||
|
@@ -511,8 +600,8 @@ Literal Patterns
|
||||||
~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Literal patterns are a convenient way for imposing constraints on the
|
Literal patterns are a convenient way for imposing constraints on the
|
||||||
value of a subject, rather than its type or structure. Literal patterns
|
value of a subject, rather than its type or structure. They also
|
||||||
even allow you to emulate a switch statement using pattern matching.
|
allow you to emulate a switch statement using pattern matching.
|
||||||
|
|
||||||
Generally, the subject is compared to a literal pattern by means of standard
|
Generally, the subject is compared to a literal pattern by means of standard
|
||||||
equality (``x == y`` in Python syntax). Consequently, the literal patterns
|
equality (``x == y`` in Python syntax). Consequently, the literal patterns
|
||||||
|
@@ -522,7 +611,7 @@ match the same set of objects because ``True == 1`` holds. However, we
|
||||||
believe that many users would be surprised finding that ``case True:``
|
believe that many users would be surprised finding that ``case True:``
|
||||||
matched the subject ``1.0``, resulting in some subtle bugs and convoluted
|
matched the subject ``1.0``, resulting in some subtle bugs and convoluted
|
||||||
workarounds. We therefore adopted the rule that the three singleton
|
workarounds. We therefore adopted the rule that the three singleton
|
||||||
objects ``None``, ``False`` and ``True`` match by identity (``x is y`` in
|
patterns ``None``, ``False`` and ``True`` match by identity (``x is y`` in
|
||||||
Python syntax) rather than equality. Hence, ``case True:`` will match only
|
Python syntax) rather than equality. Hence, ``case True:`` will match only
|
||||||
``True`` and nothing else. Note that ``case 1:`` would still match ``True``,
|
``True`` and nothing else. Note that ``case 1:`` would still match ``True``,
|
||||||
though, because the literal pattern ``1`` works by equality and not identity.
|
though, because the literal pattern ``1`` works by equality and not identity.
|
||||||
|
@ -530,20 +619,22 @@ though, because the literal pattern ``1`` works by equality and not identity.
|
||||||
Early ideas to induce a hierarchy on numbers so that ``case 1.0`` would
|
Early ideas to induce a hierarchy on numbers so that ``case 1.0`` would
|
||||||
match both the integer ``1`` and the floating point number ``1.0``, whereas
|
match both the integer ``1`` and the floating point number ``1.0``, whereas
|
||||||
``case 1:`` would only match the integer ``1`` were eventually dropped in
|
``case 1:`` would only match the integer ``1`` were eventually dropped in
|
||||||
favor of the simpler and consistent rule based on equality. Moreover, any
|
favor of the simpler and more consistent rule based on equality. Moreover, any
|
||||||
additional checks whether the subject is an instance of ``numbers.Integral``
|
additional checks whether the subject is an instance of ``numbers.Integral``
|
||||||
would come at a high runtime cost to introduce what would essentially be
|
would come at a high runtime cost to introduce what would essentially be
|
||||||
novel in Python. When needed, the explicit syntax ``case int(1):`` might
|
a novel idea in Python. When needed, the explicit syntax ``case int(1):`` can
|
||||||
be used.
|
be used.
|
||||||
|
|
||||||
Recall that literal patterns are *not* expressions, but directly denote a
|
Recall that literal patterns are *not* expressions, but directly
|
||||||
specific value or object. From a syntactical point of view, we have to
|
denote a specific value. From a pragmatic point of view, we want to
|
||||||
ensure that negative and complex numbers can equally be used as patterns,
|
allow using negative and even complex values as literal patterns, but
|
||||||
although they are not atomic literal values (i.e. the seeming literal value
|
they are not atomic literals (only unsigned real and imaginary numbers
|
||||||
``-3+4j`` would syntactically be an expression of the form
|
are). E.g., ``-3+4j`` is syntactically an expression of the form
|
||||||
``BinOp(UnaryOp('-', 3), '+', 4j)``, but as expressions are not part of
|
``BinOp(UnaryOp('-', 3), '+', 4j)``. Since expressions are not part
|
||||||
patterns, we added syntactic support for such complex value literals without
|
of patterns, we had to add explicit syntactic support for such values
|
||||||
having to resort to full expressions). Interpolated *f*-strings, on the
|
without having to resort to full expressions.
|
||||||
|
|
||||||
|
Interpolated *f*-strings, on the
|
||||||
other hand, are not literal values, despite their appearance and can
|
other hand, are not literal values, despite their appearance and can
|
||||||
therefore not be used as literal patterns (string concatenation, however,
|
therefore not be used as literal patterns (string concatenation, however,
|
||||||
is supported).
|
is supported).
|
||||||
|
@ -551,7 +642,27 @@ is supported).
|
||||||
Literal patterns not only occur as patterns in their own right, but also
|
Literal patterns not only occur as patterns in their own right, but also
|
||||||
as keys in *mapping patterns*.
|
as keys in *mapping patterns*.
|
||||||
|
|
||||||
Example using Literal patterns::
|
|
||||||
|
**Range matching patterns.**
|
||||||
|
This would allow patterns such as ``1...6``. However, there are a host of
|
||||||
|
ambiguities:
|
||||||
|
|
||||||
|
* Is the range open, half-open, or closed? (I.e. is ``6`` included in the
|
||||||
|
above example or not?)
|
||||||
|
* Does the range match a single number, or a range object?
|
||||||
|
* Range matching is often used for character ranges ('a'...'z') but that
|
||||||
|
won't work in Python since there's no character data type, just strings.
|
||||||
|
* Range matching can be a significant performance optimization if you can
|
||||||
|
pre-build a jump table, but that's not generally possible in Python due
|
||||||
|
to the fact that names can be dynamically rebound.
|
||||||
|
|
||||||
|
Rather than creating a special-case syntax for ranges, it was decided
|
||||||
|
that allowing custom pattern objects (``InRange(0, 6)``) would be more flexible
|
||||||
|
and less ambiguous; however those ideas have been postponed for the time
|
||||||
|
being.
|
||||||
|
|
||||||
|
|
||||||
|
**Example** using Literal patterns::
|
||||||
|
|
||||||
def simplify(expr):
|
def simplify(expr):
|
||||||
match expr:
|
match expr:
|
||||||
|
@ -579,7 +690,7 @@ Capture Patterns
|
||||||
|
|
||||||
Capture patterns take on the form of a name that accepts any value and binds
|
Capture patterns take on the form of a name that accepts any value and binds
|
||||||
it to a (local) variable (unless the name is declared as ``nonlocal`` or
|
it to a (local) variable (unless the name is declared as ``nonlocal`` or
|
||||||
``global``). In that sense, a simple capture pattern is basically equivalent
|
``global``). In that sense, a capture pattern is similar
|
||||||
to a parameter in a function definition (when the function is called, each
|
to a parameter in a function definition (when the function is called, each
|
||||||
parameter binds the respective argument to a local variable in the function's
|
parameter binds the respective argument to a local variable in the function's
|
||||||
scope).
|
scope).
|
||||||
|
@ -599,7 +710,7 @@ repeated use of names later on.
|
||||||
There were calls to explicitly mark capture patterns and thus identify them
|
There were calls to explicitly mark capture patterns and thus identify them
|
||||||
as binding targets. According to that idea, a capture pattern would be
|
as binding targets. According to that idea, a capture pattern would be
|
||||||
written as, e.g. ``?x``, ``$x`` or ``=x``. The aim of such explicit capture
|
written as, e.g. ``?x``, ``$x`` or ``=x``. The aim of such explicit capture
|
||||||
markers is to let an unmarked name be a constant value pattern (see below).
|
markers is to let an unmarked name be a value pattern (see below).
|
||||||
However, this is based on the misconception that pattern matching was an
|
However, this is based on the misconception that pattern matching was an
|
||||||
extension of *switch* statements, placing the emphasis on fast switching based
|
extension of *switch* statements, placing the emphasis on fast switching based
|
||||||
on (ordinal) values. Such a *switch* statement has indeed been proposed for
|
on (ordinal) values. Such a *switch* statement has indeed been proposed for
|
||||||
|
@ -611,7 +722,13 @@ betray the objective of the proposed pattern matching syntax and simplify
|
||||||
a secondary use case at the expense of additional syntactic clutter for
|
a secondary use case at the expense of additional syntactic clutter for
|
||||||
core cases.
|
core cases.
|
||||||
|
|
||||||
Example using Capture patterns::
|
It has been proposed that capture patterns are not needed at all,
|
||||||
|
since the equivalent effect can be obtained by combining an AS
|
||||||
|
pattern with a wildcard pattern (e.g., ``case _ as x`` is equivalent
|
||||||
|
to ``case x``). However, this would be unpleasantly verbose,
|
||||||
|
especially given that we expect capture patterns to be very common.
|
||||||
|
|
||||||
|
**Example** using Capture patterns::
|
||||||
|
|
||||||
def average(*args):
|
def average(*args):
|
||||||
match args:
|
match args:
|
||||||
|
@ -621,8 +738,8 @@ Example using Capture patterns::
|
||||||
return x
|
return x
|
||||||
case []:
|
case []:
|
||||||
return 0
|
return 0
|
||||||
case x: # captures the entire sequence
|
case a: # captures the entire sequence
|
||||||
return sum(x) / len(x)
|
return sum(a) / len(a)
|
||||||
|
|
||||||
|
|
||||||
.. _wildcard_pattern:
|
.. _wildcard_pattern:
|
||||||
|
@ -660,27 +777,38 @@ of items is omitted::
|
||||||
case [a, ..., z]: ...
|
case [a, ..., z]: ...
|
||||||
case [a, *, z]: ...
|
case [a, *, z]: ...
|
||||||
|
|
||||||
Both examples look like the would match a sequence of at two or more items,
|
Either example looks like it would match a sequence of two or more
|
||||||
capturing the first and last values.
|
items, capturing the first and last values. While that may be the
|
||||||
|
ultimate "wildcard", it does not convey the desired semantics.
|
||||||
|
|
||||||
A single wildcard clause (i.e. ``case _:``) is semantically equivalent to
|
An alternative that does not suggest an arbitrary number of items
|
||||||
an ``else:``. It accepts any subject without binding it to a variable or
|
would be ``?``. However, this would require changes in the tokenizer,
|
||||||
performing any other operation. However, the wildcard pattern is in
|
and it would put Python in a rather unique position:
|
||||||
contrast to ``else`` usable as a subpattern in nested patterns.
|
|
||||||
|
|
||||||
Finally note that the underscore is used as a wildcard pattern in *every*
|
The underscore is used as a wildcard pattern in *every*
|
||||||
programming language with pattern matching that we could find
|
programming language with pattern matching that we could find
|
||||||
(including *C#*, *Elixir*, *Erlang*, *F#*, *Grace*, *Haskell*,
|
(including *C#*, *Elixir*, *Erlang*, *F#*, *Grace*, *Haskell*,
|
||||||
*Mathematica*, *OCaml*, *Ruby*, *Rust*, *Scala*, *Swift*, and *Thorn*).
|
*Mathematica*, *OCaml*, *Ruby*, *Rust*, *Scala*, *Swift*, and *Thorn*).
|
||||||
Keeping in mind that many users of Python also work with other programming
|
Keeping in mind that many users of Python also work with other programming
|
||||||
languages, have prior experience when learning Python, or moving on to
|
languages, have prior experience when learning Python, and may move on to
|
||||||
other languages after having learnt Python, we find that such well
|
other languages after having learned Python, we find that such
|
||||||
established standards are important and relevant with respect to
|
well-established standards are important and relevant with respect to
|
||||||
readability and learnability. In our view, concerns that this wildcard
|
readability and learnability. In our view, concerns that this wildcard
|
||||||
means that a regular name received special treatment are not strong
|
means that a regular name received special treatment are not strong
|
||||||
enough to introduce syntax that would make Python special.
|
enough to introduce syntax that would make Python special.
|
||||||
|
|
||||||
Example using the Wildcard pattern::
|
*Else blocks.* A case block without a guard whose pattern is a single
|
||||||
|
wildcard (i.e., ``case _:``) accepts any subject without binding it to
|
||||||
|
a variable or performing any other operation. It is thus semantically
|
||||||
|
equivalent to ``else:``, if it were supported. However, adding such
|
||||||
|
an else block to the match statement syntax would not remove the need
|
||||||
|
for the wildcard pattern in other contexts. Another argument against
|
||||||
|
this is that there would be two plausible indentation levels for an
|
||||||
|
else block: aligned with ``case`` or aligned with ``match``. The
|
||||||
|
authors have found it quite contentious which indentation level to
|
||||||
|
prefer.
|
||||||
|
|
||||||
|
**Example** using the Wildcard pattern::
|
||||||
|
|
||||||
def is_closed(sequence):
|
def is_closed(sequence):
|
||||||
match sequence:
|
match sequence:
|
||||||
|
@ -692,45 +820,43 @@ Example using the Wildcard pattern::
|
||||||
return False
|
return False
|
||||||
|
|
||||||
|
|
||||||
.. _constant_value_pattern:
|
.. _value_pattern:
|
||||||
|
|
||||||
Value Patterns
|
Value Patterns
|
||||||
~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~
|
||||||
|
|
||||||
It is good programming style to use named constants for parametric values or
|
It is good programming style to use named constants for parametric values or
|
||||||
to clarify the meaning of particular values. Clearly, it would be desirable
|
to clarify the meaning of particular values. Clearly, it would be preferable
|
||||||
to write ``case (HttpStatus.OK, body):`` rather than
|
to write ``case (HttpStatus.OK, body):`` over
|
||||||
``case (200, body):``, for example. The main issue that arises here is how to
|
``case (200, body):``, for example. The main issue that arises here is how to
|
||||||
distinguish capture patterns (variables) from constant value patterns. The
|
distinguish capture patterns (variable bindings) from value patterns. The
|
||||||
general discussion surrounding this issue has brought forward a plethora of
|
general discussion surrounding this issue has brought forward a plethora of
|
||||||
options, which we cannot all fully list here.
|
options, which we cannot all fully list here.
|
||||||
|
|
||||||
Strictly speaking, constant value patterns are not really necessary, but
|
Strictly speaking, value patterns are not really necessary, but
|
||||||
could be implemented using guards, i.e.
|
could be implemented using guards, i.e.
|
||||||
``case (status, body) if status == HttpStatus.OK:``. Nonetheless, the
|
``case (status, body) if status == HttpStatus.OK:``. Nonetheless, the
|
||||||
convenience of constant value patterns is unquestioned and obvious.
|
convenience of value patterns is unquestioned and obvious.
|
||||||
|
|
||||||
The observation that constants tend to be written in uppercase letters or
|
The observation that constants tend to be written in uppercase letters or
|
||||||
collected in enumeration-like namespaces suggests possible rules to discern
|
collected in enumeration-like namespaces suggests possible rules to discern
|
||||||
constants syntactically. However, the idea of using upper vs. lower case as
|
constants syntactically. However, the idea of using upper- vs. lowercase as
|
||||||
a marker has been met with scepticism since there is no similar precedent
|
a marker has been met with scepticism since there is no similar precedent
|
||||||
in core Python (although it is common in other languages). We therefore only
|
in core Python (although it is common in other languages). We therefore only
|
||||||
adopted the rule that any dotted name (i.e. attribute access) is to be
|
adopted the rule that any dotted name (i.e., attribute access) is to be
|
||||||
interpreted as a constant value pattern like ``HttpStatus.OK``
|
interpreted as a value pattern, for example ``HttpStatus.OK``
|
||||||
above. This precludes, in particular, local variables from acting as
|
above. This precludes, in particular, local variables and global
|
||||||
constants.
|
variables defined in the current module from acting as constants.
|
||||||
|
|
||||||
Global variables can only be directly used as constant when defined in other
|
A proposed rule to use a leading dot (e.g.
|
||||||
modules, although there are workarounds to access the current module as a
|
|
||||||
namespace as well. A proposed rule to use a leading dot (e.g.
|
|
||||||
``.CONSTANT``) for that purpose was criticised because it was felt that the
|
``.CONSTANT``) for that purpose was criticised because it was felt that the
|
||||||
dot would not be a visible-enough marker for that purpose. Partly inspired
|
dot would not be a visible-enough marker for that purpose. Partly inspired
|
||||||
by use cases in other programming languages, a number of different
|
by forms found in other programming languages, a number of different
|
||||||
markers/sigils were proposed (such as ``^CONSTANT``, ``$CONSTANT``,
|
markers/sigils were proposed (such as ``^CONSTANT``, ``$CONSTANT``,
|
||||||
``==CONSTANT``, ``CONSTANT?``, or the word enclosed in backticks), although
|
``==CONSTANT``, ``CONSTANT?``, or the word enclosed in backticks), although
|
||||||
there was no obvious or natural choice. The current proposal therefore
|
there was no obvious or natural choice. The current proposal therefore
|
||||||
leaves the discussion and possible introduction of such a 'constant' marker
|
leaves the discussion and possible introduction of such a 'constant' marker
|
||||||
for future PEPs.
|
for a future PEP.
|
||||||
|
|
||||||
Distinguishing the semantics of names based on whether it is a global
|
Distinguishing the semantics of names based on whether it is a global
|
||||||
variable (i.e. the compiler would treat global variables as constants rather
|
variable (i.e. the compiler would treat global variables as constants rather
|
||||||
|
@ -740,7 +866,7 @@ patterns. Moreover, pattern matching could not be used directly inside a
|
||||||
module's scope because all variables would be global, making capture
|
module's scope because all variables would be global, making capture
|
||||||
patterns impossible.
|
patterns impossible.
|
||||||
|
|
||||||
Example using the Value pattern::
|
**Example** using the Value pattern::
|
||||||
|
|
||||||
def handle_reply(reply):
|
def handle_reply(reply):
|
||||||
match reply:
|
match reply:
|
||||||
|
@ -782,7 +908,7 @@ iterable.
|
||||||
possible.
|
possible.
|
||||||
|
|
||||||
- A starred pattern will capture a sub-sequence of arbitrary length,
|
- A starred pattern will capture a sub-sequence of arbitrary length,
|
||||||
mirroring iterable unpacking as well. Only one starred item may be
|
again mirroring iterable unpacking. Only one starred item may be
|
||||||
present in any sequence pattern. In theory, patterns such as ``(*_, 3, *_)``
|
present in any sequence pattern. In theory, patterns such as ``(*_, 3, *_)``
|
||||||
could be understood as expressing any sequence containing the value ``3``.
|
could be understood as expressing any sequence containing the value ``3``.
|
||||||
In practice, however, this would only work for a very narrow set of use
|
In practice, however, this would only work for a very narrow set of use
|
||||||
|
@ -790,31 +916,36 @@ iterable.
|
||||||
|
|
||||||
- The sequence pattern does *not* iterate through an iterable subject. All
|
- The sequence pattern does *not* iterate through an iterable subject. All
|
||||||
elements are accessed through subscripting and slicing, and the subject must
|
elements are accessed through subscripting and slicing, and the subject must
|
||||||
be an instance of ``collections.abc.Sequence`` (including, in particular,
|
be an instance of ``collections.abc.Sequence``. This includes, of course,
|
||||||
lists and tuples, but excluding strings and bytes, as well as sets and
|
lists and tuples, but excludes e.g. sets and dictionaries. While it would
|
||||||
dictionaries).
|
include strings and bytes, we make an exception for these (see below).
|
||||||
|
|
||||||
A sequence pattern cannot just iterate through any iterable object. The
|
A sequence pattern cannot just iterate through any iterable object. The
|
||||||
consumption of elements from the iteration would have to be undone if the
|
consumption of elements from the iteration would have to be undone if the
|
||||||
overall pattern fails, which is not possible.
|
overall pattern fails, which is not feasible.
|
||||||
|
|
||||||
Relying on ``len()`` and subscripting and slicing alone does not work to
|
To identify sequences we cannot rely on ``len()`` and subscripting and
|
||||||
identify sequences because sequences share the protocol with more general
|
slicing alone, because sequences share these protocols with mappings
|
||||||
maps (dictionaries) in this regard. It would be surprising if a sequence
|
(e.g. ``dict``) in this regard. It would be surprising if a sequence
|
||||||
pattern also matched dictionaries or other custom objects that implement
|
pattern also matched dictionaries or other objects implementing
|
||||||
the mapping protocol (i.e. ``__getitem__``). The interpreter therefore
|
the mapping protocol (i.e. ``__getitem__``). The interpreter therefore
|
||||||
performs an instance check to ensure that the subject in question really
|
performs an instance check to ensure that the subject in question really
|
||||||
is a sequence (of known type).
|
is a sequence (of known type). (As an optimization of the most common
|
||||||
|
case, if the subject is exactly a list or a tuple, the instance check
|
||||||
|
can be skipped.)
|
||||||
|
|
||||||
String and bytes objects have a dual nature: they are both 'atomic' objects
|
String and bytes objects have a dual nature: they are both 'atomic' objects
|
||||||
in their own right, as well as sequences (with a strongly recursive nature
|
in their own right, as well as sequences (with a strongly recursive nature
|
||||||
in that a string is a sequence of strings). The typical behavior and use
|
in that a string is a sequence of strings). The typical behavior and use
|
||||||
cases for strings and bytes are different enough from that of tuples and
|
cases for strings and bytes are different enough from those of tuples and
|
||||||
lists to warrant a clear distinction. It is in fact often unintuitive and
|
lists to warrant a clear distinction. It is in fact often unintuitive and
|
||||||
unintended that strings pass for sequences as evidenced by regular questions
|
unintended that strings pass for sequences, as evidenced by regular questions
|
||||||
and complaints. Strings and bytes are therefore not matched by a sequence
|
and complaints. Strings and bytes are therefore not matched by a sequence
|
||||||
pattern, limiting the sequence pattern to a very specific understanding of
|
pattern, limiting the sequence pattern to a very specific understanding of
|
||||||
'sequence'.
|
'sequence'. The built-in ``bytearray`` type, being a mutable version of
|
||||||
|
``bytes``, also deserves an exception; but we don't intend to
|
||||||
|
enumerate all other types that may be used to represent bytes
|
||||||
|
(e.g. some, but not all, instances of ``memoryview`` and ``array.array``).
|
||||||
|
|
||||||
|
|
||||||
.. _mapping_pattern:
|
.. _mapping_pattern:
|
||||||
|
@ -823,9 +954,9 @@ Mapping Patterns
|
||||||
~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Dictionaries or mappings in general are one of the most important and most
|
Dictionaries or mappings in general are one of the most important and most
|
||||||
widely used data structures in Python. In contrast to sequences mappings
|
widely used data structures in Python. In contrast to sequences, mappings
|
||||||
are built for fast direct access to arbitrary elements (identified by a key).
|
are built for fast direct access to arbitrary elements identified by a key.
|
||||||
In most use cases an element is retrieved from a dictionary by a known key
|
In most cases an element is retrieved from a dictionary by a known key
|
||||||
without regard for any ordering or other key-value pairs stored in the same
|
without regard for any ordering or other key-value pairs stored in the same
|
||||||
dictionary. Particularly common are string keys.
|
dictionary. Particularly common are string keys.
|
||||||
|
|
||||||
|
@ -836,13 +967,13 @@ pattern does not check for the presence of additional keys. Should it be
|
||||||
necessary to impose an upper bound on the mapping and ensure that no
|
necessary to impose an upper bound on the mapping and ensure that no
|
||||||
additional keys are present, then the usual double-star-pattern ``**rest``
|
additional keys are present, then the usual double-star-pattern ``**rest``
|
||||||
can be used. The special case ``**_`` with a wildcard, however, is not
|
can be used. The special case ``**_`` with a wildcard, however, is not
|
||||||
supported as it would not have any effect, but might lead to a wrong
|
supported as it would not have any effect, but might lead to an incorrect
|
||||||
understanding of the mapping pattern's semantics.
|
understanding of the mapping pattern's semantics.
|
||||||
|
|
||||||
To avoid overly expensive matching algorithms, keys must be literals or
|
To avoid overly expensive matching algorithms, keys must be literals or
|
||||||
constant values.
|
value patterns.
|
||||||
|
|
||||||
Example using the Mapping pattern::
|
**Example** using the Mapping pattern::
|
||||||
|
|
||||||
def change_red_to_blue(json_obj):
|
def change_red_to_blue(json_obj):
|
||||||
match json_obj:
|
match json_obj:
|
||||||
|
@ -858,10 +989,10 @@ Example using the Mapping pattern::
|
||||||
Class Patterns
|
Class Patterns
|
||||||
~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Class patterns fulfil two purposes: checking whether a given subject is
|
Class patterns fulfill two purposes: checking whether a given subject is
|
||||||
indeed an instance of a specific class and extracting data from specific
|
indeed an instance of a specific class, and extracting data from specific
|
||||||
attributes of the subject. A quick survey revealed that ``isinstance()``
|
attributes of the subject. Anecdotal evidence revealed that ``isinstance()``
|
||||||
is indeed one of the most often used functions in Python in terms of
|
is one of the most often used functions in Python in terms of
|
||||||
static occurrences in programs. Such instance checks typically precede
|
static occurrences in programs. Such instance checks typically precede
|
||||||
a subsequent access to information stored in the object, or a possible
|
a subsequent access to information stored in the object, or a possible
|
||||||
manipulation thereof. A typical pattern might be along the lines of::
|
manipulation thereof. A typical pattern might be along the lines of::
|
||||||
|
@ -873,7 +1004,7 @@ manipulation thereof. A typical pattern might be along the lines of::
|
||||||
elif isinstance(node, Leaf):
|
elif isinstance(node, Leaf):
|
||||||
print(node.value)
|
print(node.value)
|
||||||
|
|
||||||
In many cases, however, class patterns occur nested as in the example
|
In many cases class patterns occur nested, as in the example
|
||||||
given in the motivation::
|
given in the motivation::
|
||||||
|
|
||||||
if (isinstance(node, BinOp) and node.op == "+"
|
if (isinstance(node, BinOp) and node.op == "+"
|
||||||
|
@ -881,8 +1012,8 @@ given in the motivation::
|
||||||
a, b, c = node.left, node.right.left, node.right.right
|
a, b, c = node.left, node.right.left, node.right.right
|
||||||
# Handle a + b*c
|
# Handle a + b*c
|
||||||
|
|
||||||
The class pattern lets you concisely specify both an instance-check as
|
The class pattern lets you concisely specify both an instance check
|
||||||
well as relevant attributes (with possible further constraints). It is
|
and relevant attributes (with possible further constraints). It is
|
||||||
thereby very tempting to write, e.g., ``case Node(left, right):`` in the
|
thereby very tempting to write, e.g., ``case Node(left, right):`` in the
|
||||||
first case above and ``case Leaf(value):`` in the second. While this
|
first case above and ``case Leaf(value):`` in the second. While this
|
||||||
indeed works well for languages with strict algebraic data types, it is
|
indeed works well for languages with strict algebraic data types, it is
|
||||||
|
@ -890,14 +1021,14 @@ problematic with the structure of Python objects.
|
||||||
|
|
||||||
When dealing with general Python objects, we face a potentially very large
|
When dealing with general Python objects, we face a potentially very large
|
||||||
number of unordered attributes: an instance of ``Node`` contains a large
|
number of unordered attributes: an instance of ``Node`` contains a large
|
||||||
number of attributes (most of which are 'private methods' such as, e.g.,
|
number of attributes (most of which are 'special methods' such as
|
||||||
``__repr__``). Moreover, the interpreter cannot reliably deduce which of
|
``__repr__``). Moreover, the interpreter cannot reliably deduce the
|
||||||
the attributes comes first and which comes second. For an object that
|
ordering of attributes. For an object that
|
||||||
represents a circle, say, there is no inherently obvious ordering of the
|
represents a circle, say, there is no inherently obvious ordering of the
|
||||||
attributes ``x``, ``y`` and ``radius``.
|
attributes ``x``, ``y`` and ``radius``.
|
||||||
|
|
||||||
We envision two possibilities for dealing with this issue: either explicitly
|
We envision two possibilities for dealing with this issue: either explicitly
|
||||||
name the attributes of interest or provide an additional mapping that tells
|
name the attributes of interest, or provide an additional mapping that tells
|
||||||
the interpreter which attributes to extract and in which order. Both
|
the interpreter which attributes to extract and in which order. Both
|
||||||
approaches are supported. Moreover, explicitly naming the attributes of
|
approaches are supported. Moreover, explicitly naming the attributes of
|
||||||
interest lets you further specify the required structure of an object; if
|
interest lets you further specify the required structure of an object; if
|
||||||
|
@ -948,6 +1079,20 @@ the explicit construction of instances, where class patterns ``c(p, q)``
|
||||||
deliberately mirror the syntax of creating instances.
|
deliberately mirror the syntax of creating instances.
|
||||||
|
|
||||||
|
|
||||||
|
**Type annotations for pattern variables.**
|
||||||
|
The proposal was to combine patterns with type annotations::
|
||||||
|
|
||||||
|
match x:
|
||||||
|
case [a: int, b: str]: print(f"An int {a} and a string {b}:")
|
||||||
|
case [a: int, b: int, c: int]: print(f"Three ints", a, b, c)
|
||||||
|
...
|
||||||
|
|
||||||
|
This idea has a lot of problems. For one, the colon can only
|
||||||
|
be used inside of brackets or parens, otherwise the syntax becomes
|
||||||
|
ambiguous. And because Python disallows ``isinstance()`` checks
|
||||||
|
on generic types, type annotations containing generics will not
|
||||||
|
work as expected.
|
||||||
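The limitation around generics can be observed directly. The sketch below assumes Python 3.9 or later, where ``list[int]`` is a valid expression; ``safe_isinstance`` is a hypothetical helper.

```python
def safe_isinstance(value, annotation):
    """Return the isinstance() result, or the error text if the check is refused."""
    try:
        return isinstance(value, annotation)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(safe_isinstance([1, 2], list))       # a plain class works
print(safe_isinstance([1, 2], list[int]))  # a parameterized generic is rejected
```

A pattern such as ``[a: list[int]]`` would therefore have no straightforward runtime check to fall back on.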
|
|
||||||
|
|
||||||
History and Context
|
History and Context
|
||||||
===================
|
===================
|
||||||
|
@ -1052,7 +1197,6 @@ Or you would combine these ideas to write ``Node(right=y)`` so as to require
|
||||||
an instance of ``Node`` but only extract the value of the `right` attribute.
|
an instance of ``Node`` but only extract the value of the `right` attribute.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
=========
|
=========
|
||||||
|
|
||||||
|
|