Added guards and walrus patterns (#1644)
parent
a0a8919aff
commit
1dc309632a
278
pep-0635.rst
|
@ -107,6 +107,19 @@ this: it can only select based on the class.
|
|||
For a complete example, see
https://github.com/gvanrossum/patma/blob/master/examples/expr.py#L231

Like the Visitor pattern, pattern matching allows for a strict separation
of concerns: specific actions or data processing is independent of the
class hierarchy or manipulated objects. When dealing with predefined or
even built-in classes, in particular, it is often impossible to add further
methods to the individual classes. Pattern matching not only relieves the
programmer or class designer from the burden of the boilerplate code needed
for the Visitor pattern, but is also flexible enough to directly work with
built-in types. It naturally distinguishes between sequences of different
lengths, which might all share the same class despite obviously differing
structures. Moreover, pattern matching automatically takes inheritance
into account: a class *D* inheriting from *C* will be handled by a pattern
that targets *C* by default.
|
||||
|
||||
TODO: Could we say more here?
|
||||
|
||||
Pattern and functional style
|
||||
|
@ -124,7 +137,10 @@ a JSON data structure using ``match``.
|
|||
|
||||
TODO: Example code.
|
||||
|
||||
|
||||
Functional programming generally prefers a declarative style with a focus
on relationships in data. Side effects are avoided whenever possible.
Pattern matching thus fits naturally into, and strongly supports, a
functional programming style.
|
||||
|
||||
|
||||
Rationale
|
||||
|
@ -174,7 +190,7 @@ semantic meaning.
|
|||
Various suggestions have sought to eliminate or avoid the naturally arising
|
||||
"double indentation" of a case clause's code block. Unfortunately, all such
|
||||
proposals of *flat indentation schemes* come at the expense of violating
|
||||
Python's establish structural paradigm, leading to additional syntactic
|
||||
Python's established structural paradigm, leading to additional syntactic
|
||||
rules:
|
||||
|
||||
- *Unindented case clauses.*
|
||||
|
@ -191,7 +207,7 @@ rules:
|
|||
neither follow the syntactic scheme of simple nor composite statements
|
||||
but rather establish a category of its own.
|
||||
|
||||
- *Putting the expression on a separate line after ``match``.*
|
||||
- *Putting the expression on a separate line after "match".*
|
||||
The idea is to use the expression yielding the subject as a statement
|
||||
to avoid the singularity of ``match`` having no actual block despite
|
||||
the colons::
|
||||
|
@ -220,7 +236,7 @@ PEP, a noticeable improvement in code brevity is observed, more than making
|
|||
up for the additional indentation level.
|
||||
|
||||
|
||||
*Statement v Expression.* Some suggestions centered around the idea of
|
||||
*Statement vs. Expression.* Some suggestions centered around the idea of
|
||||
making ``match`` an expression rather than a statement. However, this
|
||||
would fit poorly with Python's statement-oriented nature and lead to
|
||||
unusually long and complex expressions with the need to invent new
|
||||
|
@ -239,7 +255,7 @@ The patterns of different case clauses might overlap in that more than
|
|||
one case clause would match a given subject. The first-to-match rule
|
||||
ensures that the selection of a case clause for a given subject is
|
||||
unambiguous. Furthermore, case clauses can have increasingly general
|
||||
patterns matching wider classes of subjects. The first-to-match rule
|
||||
patterns matching wider sets of subjects. The first-to-match rule
|
||||
then ensures that the most precise pattern can be chosen (although it
|
||||
is the programmer's responsibility to order the case clauses correctly).
|
||||
|
||||
|
@ -249,7 +265,7 @@ This would, however, require that all patterns be purely declarative and
|
|||
static, running against the established dynamic semantics of Python. The
|
||||
proposed semantics thus represent a path incorporating the best of both
|
||||
worlds: patterns are tried in a strictly sequential order so that each
|
||||
case clause constitutes an actual stement. At the same time, we allow
|
||||
case clause constitutes an actual statement. At the same time, we allow
|
||||
the interpreter to cache any information about the subject or change the
|
||||
order in which subpatterns are tried. In other words: if the interpreter
|
||||
has found that the subject is not an instance of a class ``C``, it can
|
||||
|
@ -268,7 +284,7 @@ would essentially mean that each case clause is a separate function without
|
|||
direct access to the variables in the surrounding scope (without having to
|
||||
resort to ``nonlocal`` that is). Moreover, a case clause could no longer
|
||||
influence any surrounding control flow through standard statement such as
|
||||
``return`` or ``break``. Hence, such script scoping would lead to
|
||||
``return`` or ``break``. Hence, such strict scoping would lead to
|
||||
unintuitive and surprising behavior.
|
||||
|
||||
A direct consequence of this is that any variable bindings outlive the
|
||||
|
@ -279,6 +295,51 @@ bindings is in line with existing Python structures such as for loops and
|
|||
with statements.
|
||||
|
||||
|
||||
Guards
|
||||
~~~~~~
|
||||
|
||||
Some constraints cannot be adequately expressed through patterns alone.
For instance, a 'less' or 'greater than' relationship defies the usual
'equal' semantics of patterns. Moreover, different subpatterns are
independent and cannot refer to each other. The addition of *guards*
addresses these restrictions: a guard is an arbitrary expression that is
attached to a pattern and must evaluate to a true ("truthy") value for the
pattern to succeed.
|
||||
|
||||
For example, ``case [x, y] if x < y:`` uses a guard (``if x < y``) to
express a 'less than' relationship between two otherwise disjoint capture
patterns ``x`` and ``y``.
|
||||
|
||||
From a conceptual point of view, patterns describe structural constraints
on the subject in a declarative style, ideally without any side effects.
Recall, in particular, that patterns are clearly distinct from expressions,
following different objectives and semantics. Guards then enhance the
patterns in a highly controlled way with arbitrary expressions (that might
have side effects). Splitting the overall pattern into a static structural
and a dynamic 'evaluative' part not only helps with readability, but can
also open up considerable potential for compiler optimizations. To keep this
clear separation, guards are only supported on the level of case clauses
and not for individual patterns.
|
||||
|
||||
Example using guards::

    def sort(seq):
        match seq:
            case [] | [_]:
                return seq
            case [x, y] if x <= y:
                return seq
            case [x, y]:
                return [y, x]
            case [x, y, z] if x <= y <= z:
                return seq
            case [x, y, z] if x >= y >= z:
                return [z, y, x]
            case [p, *rest]:
                a = sort([x for x in rest if x <= p])
                b = sort([x for x in rest if p < x])
                return a + [p] + b
|
||||
|
||||
|
||||
.. _patterns:
|
||||
|
||||
Patterns
|
||||
|
@ -312,9 +373,37 @@ patterns as declarative elements similar to the formal parameters in a
|
|||
function definition.
|
||||
|
||||
|
||||
Walrus patterns
|
||||
~~~~~~~~~~~~~~~
|
||||
Walrus/AS patterns
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Patterns fall into two categories: most patterns impose a (structural)
constraint that the subject needs to fulfill, whereas the capture pattern
binds the subject to a name without regard for the subject's structure or
actual value. Consequently, a pattern can either express a constraint or
bind a value, but not both. Walrus/AS patterns fill this gap in that they
allow the user to specify a general pattern as well as capture the subject
in a variable.
|
||||
|
||||
Typical use cases for the Walrus/AS pattern include OR and Class patterns
together with a binding name as in, e.g., ``case BinOp(op := '+'|'-', ...):``
or ``case [first := int(), second := int()]:``. The latter could be
understood as saying that the subject must fulfill two distinct patterns:
``[first, second]`` as well as ``[int(), int()]``. The Walrus/AS pattern
can thus be seen as a special case of an 'and' pattern (see OR patterns
below for an additional discussion of 'and' patterns).
|
||||
|
||||
Example using the Walrus/AS pattern::

    def simplify_expr(tokens):
        match tokens:
            case [l:=('('|'['), *expr, r:=(')'|']')] if (l+r) in ('()', '[]'):
                return simplify_expr(expr)
            case [0, op:=('+'|'-'), right]:
                return UnaryOp(op, right)
            case [left:=(int() | float()) | Num(left), '+', right:=(int() | float()) | Num(right)]:
                return Num(left + right)
            case [value:=(int() | float())]:
                return Num(value)
|
||||
|
||||
|
||||
OR patterns
|
||||
|
@ -366,9 +455,9 @@ OR-patterns to be nested inside other patterns:
|
|||
in *C*). Also, this would be a novel indentation pattern, which might make
|
||||
it harder to support in IDEs and such (it would break the simple rule "add
|
||||
an indentation level after a line ending in a colon"). Finally, this
|
||||
would not support OR patterns nested inside other patterns.
|
||||
would not support OR patterns nested inside other patterns, either.
|
||||
|
||||
- *Using ``case in`` followed by a comma-separated list*::
|
||||
- *Using "case in" followed by a comma-separated list*::
|
||||
|
||||
case in 401, 403, 404:
|
||||
print("Some HTTP error")
|
||||
|
@ -396,13 +485,14 @@ exactly if the pattern itself does not match. For instance, ``!(3 | 4)``
|
|||
would match anything except ``3`` or ``4``. However, there is evidence from
|
||||
other languages that this is rarely useful and primarily used as double
|
||||
negation ``!!`` to control variable scopes and prevent variable bindings
|
||||
(which does not apply to Python).
|
||||
(which does not apply to Python). Other use cases are better expressed using
|
||||
guards.
|
||||
|
||||
In the end, it was decided that this would make the syntax more complex
|
||||
without adding a significant benefit.
|
||||
|
||||
|
||||
Example::
|
||||
Example using the OR pattern::
|
||||
|
||||
def simplify(expr):
|
||||
match expr:
|
||||
|
@ -415,6 +505,73 @@ Example::
|
|||
return expr
|
||||
|
||||
|
||||
.. _literal_pattern:
|
||||
|
||||
Literal Patterns
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
Literal patterns are a convenient way of imposing constraints on the
value of a subject, rather than its type or structure. Literal patterns
even allow you to emulate a switch statement using pattern matching.
|
||||
|
||||
Generally, the subject is compared to a literal pattern by means of standard
equality (``x == y`` in Python syntax). Consequently, the literal patterns
``1.0`` and ``1`` match exactly the same set of objects, i.e. ``case 1.0:``
and ``case 1:`` are fully interchangeable. In principle, ``True`` would also
match the same set of objects because ``True == 1`` holds. However, we
believe that many users would be surprised to find that ``case True:``
matched the subject ``1.0``, resulting in some subtle bugs and convoluted
workarounds. We therefore adopted the rule that the three singleton
objects ``None``, ``False`` and ``True`` match by identity (``x is y`` in
Python syntax) rather than equality. Hence, ``case True:`` will match only
``True`` and nothing else. Note that ``case 1:`` would still match ``True``,
though, because the literal pattern ``1`` works by equality and not identity.
|
||||
|
||||
Early ideas to induce a hierarchy on numbers so that ``case 1.0:`` would
match both the integer ``1`` and the floating point number ``1.0``, whereas
``case 1:`` would only match the integer ``1``, were eventually dropped in
favor of the simpler and consistent rule based on equality. Moreover, any
additional checks whether the subject is an instance of ``numbers.Integral``
would come at a high runtime cost to introduce what would essentially be
a novel concept in Python. When needed, the explicit syntax ``case int(1):``
might be used.
|
||||
|
||||
Recall that literal patterns are *not* expressions, but directly denote a
specific value or object. From a syntactical point of view, we have to
ensure that negative and complex numbers can equally be used as patterns,
although they are not atomic literal values (i.e. the seeming literal value
``-3+4j`` would syntactically be an expression of the form
``BinOp(UnaryOp('-', 3), '+', 4j)``, but as expressions are not part of
patterns, we added syntactic support for such complex value literals without
having to resort to full expressions). Interpolated *f*-strings, on the
other hand, are not literal values, despite their appearance, and therefore
cannot be used as literal patterns (string concatenation, however,
is supported).
|
||||
|
||||
Literal patterns not only occur as patterns in their own right, but also
as keys in *mapping patterns*.
|
||||
|
||||
Example using Literal patterns::

    def simplify(expr):
        match expr:
            case ('+', 0, x):
                return x
            case ('+' | '-', x, 0):
                return x
            case ('and', True, x):
                return x
            case ('and', False, x):
                return False
            case ('or', False, x):
                return x
            case ('or', True, x):
                return True
            case ('not', ('not', x)):
                return x
        return expr
|
||||
|
||||
|
||||
.. _capture_pattern:
|
||||
|
||||
Capture Patterns
|
||||
|
@ -441,11 +598,11 @@ repeated use of names later on.
|
|||
|
||||
There were calls to explicitly mark capture patterns and thus identify them
|
||||
as binding targets. According to that idea, a capture pattern would be
|
||||
written as, e.g. ``?x`` or ``$x``. The aim of such explicit capture markers
|
||||
is to let an unmarked name be a constant value pattern (see below). However,
|
||||
this is based on the misconception that pattern matching was an extension of
|
||||
*switch* statements, placing the emphasis on fast switching based on
|
||||
(ordinal) values. Such a *switch* statement has indeed been proposed for
|
||||
written as, e.g. ``?x``, ``$x`` or ``=x``. The aim of such explicit capture
|
||||
markers is to let an unmarked name be a constant value pattern (see below).
|
||||
However, this is based on the misconception that pattern matching was an
|
||||
extension of *switch* statements, placing the emphasis on fast switching based
|
||||
on (ordinal) values. Such a *switch* statement has indeed been proposed for
|
||||
Python before (see :pep:`275` and :pep:`3103`). Pattern matching, on the other
|
||||
hand, builds a generalized concept of iterable unpacking. Binding values
|
||||
extracted from a data structure is at the very core of the concept and hence
|
||||
|
@ -454,7 +611,7 @@ betray the objective of the proposed pattern matching syntax and simplify
|
|||
a secondary use case at the expense of additional syntactic clutter for
|
||||
core cases.
|
||||
|
||||
Example::
|
||||
Example using Capture patterns::
|
||||
|
||||
def average(*args):
|
||||
match args:
|
||||
|
@ -503,7 +660,7 @@ of items is omitted::
|
|||
case [a, ..., z]: ...
|
||||
case [a, *, z]: ...
|
||||
|
||||
Both look like the would match a sequence of at two or more items,
|
||||
Both examples look like they would match a sequence of two or more items,
|
||||
capturing the first and last values.
|
||||
|
||||
A single wildcard clause (i.e. ``case _:``) is semantically equivalent to
|
||||
|
@ -523,7 +680,7 @@ readability and learnability. In our view, concerns that this wildcard
|
|||
means that a regular name received special treatment are not strong
|
||||
enough to introduce syntax that would make Python special.
|
||||
|
||||
Example::
|
||||
Example using the Wildcard pattern::
|
||||
|
||||
def is_closed(sequence):
|
||||
match sequence:
|
||||
|
@ -535,81 +692,14 @@ Example::
|
|||
return False
|
||||
|
||||
|
||||
.. _literal_pattern:
|
||||
|
||||
Literal Patterns
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
Literal patterns are a convenient way for imposing constraints on the
|
||||
value of a subject, rather than its type or structure. Literal patterns
|
||||
even allow you to emulate a switch statement using pattern matching.
|
||||
|
||||
Generally, the subject is compared to a literal pattern by means of standard
|
||||
equality (``x == y`` in Python syntax). Consequently, the literal patterns
|
||||
``1.0`` and ``1`` match exactly the same set of objects, i.e. ``case 1.0:``
|
||||
and ``case 1:`` are fully interchangable. In principle, ``True`` would also
|
||||
match the same set of objects because ``True == 1`` holds. However, we
|
||||
believe that many users would be surprised finding that ``case True:``
|
||||
matched the object ``1.0``, resulting in some subtle bugs and convoluted
|
||||
workarounds. We therefore adopted the rule that the three singleton
|
||||
objects ``None``, ``False`` and ``True`` match by identity (``x is y`` in
|
||||
Python syntax) rather than equality. Hence, ``case True:`` will match only
|
||||
``True`` and nothing else. Note that ``case 1:`` would still match ``True``,
|
||||
though, because the literal pattern ``1`` works by equality and not identity.
|
||||
|
||||
Early ideas to induce a hierarchy on numbers so that ``case 1.0`` would
|
||||
match both the integer ``1`` and the floating point number ``1.0``, whereas
|
||||
``case 1:`` would only match the integer ``1`` were eventually dropped in
|
||||
favor of the simpler and consistent rule based on equality. Moreover, any
|
||||
additional checks whether the subject is an instance of ``numbers.Integral``
|
||||
would come at a high runtime cost to introduce what would essentially be
|
||||
novel in Python. When needed, the explicit syntax ``case int(1):`` might
|
||||
be used.
|
||||
|
||||
Recall that literal patterns are *not* expressions, but directly denote a
|
||||
specific value or object. From a syntactical point of view, we have to
|
||||
ensure that negative and complex numbers can equally be used as patterns,
|
||||
although they are not atomic literal values (i.e. the seeming literal value
|
||||
``-3+4j`` would syntactically be an expression of the form
|
||||
``BinOp(UnaryOp('-', 3), '+', 4j)``, but as expressions are not part of
|
||||
patterns, we added syntactic support for such complex value literals without
|
||||
having to resort to full expressions). Interpolated *f*-strings, on the
|
||||
other hand, are not literal values, despite their appearance and can
|
||||
therefore not be used as literal patterns (string concatenation, however,
|
||||
is supported).
|
||||
|
||||
Literal patterns not only occur as patterns in their own right, but also
|
||||
as keys in *mapping patterns*.
|
||||
|
||||
Example::
|
||||
|
||||
def simplify(expr):
|
||||
match expr:
|
||||
case ('+', 0, x):
|
||||
return x
|
||||
case ('+' | '-', x, 0):
|
||||
return x
|
||||
case ('and', True, x):
|
||||
return x
|
||||
case ('and', False, x):
|
||||
return False
|
||||
case ('or', False, x):
|
||||
return x
|
||||
case ('or', True, x):
|
||||
return True
|
||||
case ('not', ('not', x)):
|
||||
return x
|
||||
return expr
|
||||
|
||||
|
||||
.. _constant_value_pattern:
|
||||
|
||||
Constant Value Patterns
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Value Patterns
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
It is good programming style to use named constants for parametric values or
|
||||
to clarify the meaning of particular values. Clearly, it would be desirable
|
||||
to also write ``case (HttpStatus.OK, body):`` rather than
|
||||
to write ``case (HttpStatus.OK, body):`` rather than
|
||||
``case (200, body):``, for example. The main issue that arises here is how to
|
||||
distinguish capture patterns (variables) from constant value patterns. The
|
||||
general discussion surrounding this issue has brought forward a plethora of
|
||||
|
@ -650,7 +740,7 @@ patterns. Moreover, pattern matching could not be used directly inside a
|
|||
module's scope because all variables would be global, making capture
|
||||
patterns impossible.
|
||||
|
||||
Example::
|
||||
Example using the Value pattern::
|
||||
|
||||
def handle_reply(reply):
|
||||
match reply:
|
||||
|
@ -752,7 +842,7 @@ understanding of the mapping pattern's semantics.
|
|||
To avoid overly expensive matching algorithms, keys must be literals or
|
||||
constant values.
|
||||
|
||||
Example::
|
||||
Example using the Mapping pattern::
|
||||
|
||||
def change_red_to_blue(json_obj):
|
||||
match json_obj:
|
||||
|
@ -791,7 +881,7 @@ given in the motivation::
|
|||
a, b, c = node.left, node.right.left, node.right.right
|
||||
# Handle a + b*c
|
||||
|
||||
The class pattern lets you to concisely specify both an instance-check as
|
||||
The class pattern lets you concisely specify both an instance-check as
|
||||
well as relevant attributes (with possible further constraints). It is
|
||||
thereby very tempting to write, e.g., ``case Node(left, right):`` in the
|
||||
first case above and ``case Leaf(value):`` in the second. While this
|
||||
|
|