PEP-636: complete missing sections and update TLDR appendix (#1658)

Daniel F Moisset 2020-10-21 00:51:36 +01:00 committed by GitHub
parent 35bd60204b
commit 5f9a50d3d4
1 changed files with 180 additions and 165 deletions


@ -2,8 +2,7 @@ PEP: 636
Title: Structural Pattern Matching: Tutorial
Version: $Revision$
Last-Modified: $Date$
Author: Daniel F Moisset <dfmoisset@gmail.com>,
Tobias Kohn <kohnt@tobiaskohn.ch>
Author: Daniel F Moisset <dfmoisset@gmail.com>
Sponsor: Guido van Rossum <guido@python.org>
BDFL-Delegate:
Discussions-To: Python-Dev <python-dev@python.org>
@ -37,47 +36,10 @@ matching and design considerations).
For readers who are looking more for a quick review than for a tutorial,
see `Appendix A`_.
Meta
====
This section is intended to get in sync about style and language with
co-authors. It should be removed from the released PEP
The following are design decisions I made while writing this:
1. Who is the target audience?
I'm considering "People with general Python experience" (i.e. who shouldn't be surprised
at anything in the Python tutorial), but not necessarily involved with the
design/development or Python. I'm assuming someone who hasn't been exposed to pattern
matching in other languages.
2. How detailed should this document be?
I considered a range from "very superficial" (like the detail level you might find about
statements in the Python tutorial) to "terse but complete" like
https://github.com/gvanrossum/patma/#tutorial
to "long and detailed". I chose the later, we can always trim down from that.
3. What kind of examples to use?
I tried to write examples that are could that I might write using pattern matching. I
avoided going
for a full application (because the examples I have in mind are too large for a PEP) but
I tried to follow ideas related to a single project to thread the story-telling more
easily. This is probably the most controversial thing here, and if the rest of
the authors dislike it, we can change to a more formal explanatory style.
Other rules I'm following (let me know if I forgot to):
* I'm not going to reference/compare with other languages
* I'm not trying to convince the reader that this is a good idea (that's the job of
PEP 635) just explain how to use it
* I'm not trying to cover every corner case (that's the job of PEP 634), just cover
how to use the full functionality in the "normal" cases.
* I talk to the learner in second person
Tutorial
========
As an example to motivate this tutorial, you will be writing a text-adventure. That is
As an example to motivate this tutorial, you will be writing a text adventure. That is
a form of interactive fiction where the user enters text commands to interact with a
fictional world and receives text descriptions of what happens. Commands will be
simplified forms of natural language like ``get sword``, ``attack dragon``, ``go north``,
@ -108,22 +70,24 @@ You can use a matching statement instead::
case [action, obj]:
... # interpret action, obj
The ``match`` statement evaluates the **subject** after the ``match`` keyword, and checks
it against the **pattern** next to ``case``. A pattern is able to do two different
things:
The match statement evaluates the **"subject"** (the value after the ``match``
keyword), and checks it against the **pattern** (the code next to ``case``). A pattern
is able to do two different things:
* Verify that the subject has a certain structure. In your case, the ``[action, obj]``
pattern matches any sequence of exactly two elements. This is called **matching**.
* It will bind some names in the pattern to component elements of your subject. In
this case, if the list has two elements, it will bind ``action = subject[0]`` and
``obj = subject[1]``. This is called **destructuring**
``obj = subject[1]``.
If there's a match, the statements inside the ``case`` clause will be executed with the
bound variables. If there's no match, nothing happens and the next statement after
``match`` keeps running.
If there's a match, the statements inside the case block will be executed with the
bound variables. If there's no match, nothing happens and the statement after
``match`` is executed next.
TODO: discuss other sequences, tuples. Discuss syntax with parenthesis. discuss
iterators? discuss [x, x] possibly later on?
Note that, in a similar way to unpacking assignments, you can use either parentheses,
brackets, or just comma separation as synonyms. So you could write ``case action, obj``
or ``case (action, obj)`` with the same meaning. All forms will match any sequence (for
example lists or tuples).
Matching multiple patterns
--------------------------
@ -139,16 +103,16 @@ of different lengths. For example you might want to add single verbs with no obj
case [action, obj]:
... # interpret action, obj
The ``match`` statement will check patterns from top to bottom. If the pattern doesn't
The match statement will check patterns from top to bottom. If the pattern doesn't
match the subject, the next pattern will be tried. However, once the *first*
matching ``case`` clause is found, the body of that clause is executed, and all further
``case`` clauses are ignored. This is similar to the way that an ``if/elif/elif/...``
matching pattern is found, the body of that case is executed, and all further
cases are ignored. This is similar to the way that an ``if/elif/elif/...``
statement works.
Matching specific values
------------------------
Your code still needs to look at the specific actions and conditionally run
Your code still needs to look at the specific actions and conditionally execute
different logic depending on the specific action (e.g., ``quit``, ``attack``, or ``buy``).
You could do that using a chain of ``if/elif/elif/...``, or using a dictionary of
functions, but here we'll leverage pattern matching to solve that task. Instead of a
@ -168,18 +132,15 @@ This allows you to write::
# The rest of your commands go here
A pattern like ``["get", obj]`` will match only 2-element sequences that have a first
element equal to ``"get"``. When destructuring, it will bind ``obj = subject[1]``.
element equal to ``"get"``. It will also bind ``obj = subject[1]``.
As you can see in the ``go`` case, we also can use different variable names in
different patterns.
FIXME: This *might* be the place to explain a bit that when I say "literal" I mean it
literally, and a "soft constant" will not work :)
Matching multiple values
------------------------
Matching slices
---------------
A player may be able to drop multiple objects by using a series of commands
A player may be able to drop multiple items by using a series of commands
``drop key``, ``drop sword``, ``drop cheese``. This interface might be cumbersome, and
you might like to allow dropping multiple items in a single command, like
``drop key sword cheese``. In this case you don't know beforehand how many words will
@ -199,46 +160,30 @@ variable.
This syntax has similar restrictions as sequence unpacking: you can not have more than one
starred name in a pattern.
Adding a catch-all
Adding a wildcard
------------------
You may want to print an error message saying that the command wasn't recognized when
all the patterns fail. You could use the feature we just learned and write the
following::
all the patterns fail. You could use the feature we just learned and write
``case [*ignored_words]`` as your last pattern. There's however a much simpler way::
match command.split():
case ["quit"]: ... # Code omitted for brevity
case ["go", direction]: ...
case ["drop", *objects]: ...
... # Other case clauses
case [*ignored_words]:
print(f"Sorry, I couldn't understand {command!r}")
Note that you must add this last pattern at the end, otherwise it will match before other
possible patterns that could be considered. This works but it's a bit verbose and
somewhat wasteful: this will make a full copy of the word list, which will be bound to
``ignored_words`` even if it's never used.
You can use an special pattern which is written ``_``, which always matches but it
doesn't bind anything. which would allow you to rewrite::
match command.split():
... # Other case clauses
case [*_]:
print(f"Sorry, I couldn't understand {command!r}")
This pattern will match for any sequence. In this case we can simplify even more and
match any object::
match command.split():
... # Other case clauses
... # Other cases
case _:
print(f"Sorry, I couldn't understand {command!r}")
TODO: Explain about syntaxerror when having an irrefutable pattern above others?
This special pattern, written ``_`` and called the wildcard, always matches but
doesn't bind any variables.
How patterns are composed
-------------------------
Note that this will match any object, not just sequences. As such, it only makes
sense to have it by itself as the last pattern (to prevent errors, Python will stop
you from placing it before other cases).
Composing patterns
------------------
This is a good moment to step back from the examples and understand how the patterns
that you have been using are built. Patterns can be nested within each other, and we
@ -247,25 +192,22 @@ have being doing that implicitly in the examples above.
There are some "simple" patterns ("simple" here meaning that they do not contain other
patterns) that we've seen:
* **Capture patterns** (stand-alone names like ``direction``, ``action``, ``objects``). We
never discussed these separately, but used them as part of other patterns.
* **Literal patterns** (string literals, number literals, ``True``, ``False``, and
``None``)
* The **wildcard pattern** ``_``
* **Capture patterns** (stand-alone names like ``direction``, ``action``, ``objects``). We
never discussed these separately, but used them as part of other patterns. Note that
a capture pattern by itself will always match, and usually makes sense only
as a catch-all at the end of your ``match`` if you desire to bind the name to the
subject.
Until now, the only non-simple pattern we have experimented with is the sequence pattern.
Each element in a sequence pattern can in fact be
any other pattern. This means that you could write a pattern like
``["first", (left, right), *rest]``. This will match subjects which are a sequence of at
least two elements, where the first one is equal to ``"first"`` and the second one is
``["first", (left, right), _, *rest]``. This will match subjects which are a sequence of at
least three elements, where the first one is equal to ``"first"`` and the second one is
in turn a sequence of two elements. It will also bind ``left=subject[1][0]``,
``right=subject[1][1]``, and ``rest = subject[2:]``
``right=subject[1][1]``, and ``rest = subject[3:]``.
Alternate patterns
------------------
Or patterns
-----------
Going back to the adventure game example, you may find that you'd like to have several
patterns resulting in the same outcome. For example, you might want the commands
@ -275,14 +217,14 @@ patterns resulting in the same outcome. For example, you might want the commands
The ``|`` symbol in patterns combines them as alternatives. You could for example write::
match command.split():
... # Other case clauses
... # Other cases
case ["north"] | ["go", "north"]:
current_room = current_room.neighbor("north")
case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
... # Code for picking up the given object
This is called an **or pattern** and will produce the expected result. Patterns are
attempted from left to right; this may be relevant to know what is bound if more than
tried from left to right; this may be relevant to know what is bound if more than
one alternative matches. An important restriction when writing or patterns is that all
alternatives should bind the same variables. So a pattern ``[1, x] | [2, y]`` is not
allowed because it would make unclear which variable would be bound after a successful
@ -298,7 +240,7 @@ has some benefits but also some drawbacks in comparison: the latest version allo
alias, but also has the direction hardcoded, which will force us to actually have
separate patterns for north/south/east/west. This leads to some code duplication, but at
the same time we get better input validation, and we will not be getting into that
branch if the command entered by the user is ``"go figure!"`` instead of an direction.
branch if the command entered by the user is ``"go figure!"`` instead of a direction.
We could try to get the best of both worlds doing the following (I'll omit the aliased
version without "go" for brevity)::
@ -311,15 +253,14 @@ version without "go" for brevity)::
This code is a single branch, and it verifies that the word after "go" is really a
direction. But the code moving the player around needs to know which one was chosen and
has no way to do so. What we need is a pattern that behaves like the or pattern but at
the same time does a capture. We can do so with a **walrus pattern**::
the same time does a capture. We can do so with an **as pattern**::
match command.split():
case ["go", direction := ("north" | "south" | "east" | "west")]:
case ["go", ("north" | "south" | "east" | "west") as direction]:
current_room = current_room.neighbor(direction)
The walrus pattern (named like that because the ``:=`` operator looks like a sideways
walrus) matches whatever pattern is on its right hand side, but also binds the value to
a name.
The as-pattern matches whatever pattern is on its left-hand side, but also binds the
value to a name.
Adding conditions to patterns
-----------------------------
@ -328,7 +269,7 @@ The patterns we have explored above can do some powerful data filtering, but som
you may wish for the full power of a boolean expression. Let's say that you would actually
like to allow a "go" command only in a restricted set of directions based on the possible
exits from the current_room. We can achieve that by adding a **guard** to our
case-clause. Guards consist of the ``if`` keyword followed by any expression::
case. Guards consist of the ``if`` keyword followed by any expression::
match command.split():
case ["go", direction] if direction in current_room.exits:
@ -336,50 +277,145 @@ case-clause. Guards consist of the ``if`` keyword followed by any expression::
case ["go", _]:
print("Sorry, you can't go that way")
The guard is not part of the pattern, it's part of the case clause. It's only checked if
The guard is not part of the pattern, it's part of the case. It's only checked if
the pattern matches, and after all the pattern variables have been bound (that's why the
condition can use the ``direction`` variable in the example above). If the pattern
matches and the condition is truthy, the body of the case clause runs normally. If the
matches and the condition is truthy, the body of the case executes normally. If the
pattern matches but the condition is falsy, the match statement proceeds to check the
next ``case`` clause as if the pattern hadn't matched (with the possible side-effect of
next case as if the pattern hadn't matched (with the possible side-effect of
having already bound some variables).
The sequence of these steps must be considered carefully when combining or-patterns and
guards. If you have ``case [x, 100] | [0, x] if x > 10`` and your subject is
``[0, 100]``, the clause will be skipped. This happens because:
* The or-pattern finds the first alternative that matches the subject, which happens to
be ``[x, 100]``
* ``x`` is bound to 0
* The condition x > 10 is checked. Given that it's false, the whole case clause is
skipped. The ``[0, x]`` pattern is never attempted.
Going to the cloud: Mappings
----------------------------
TODO: Give the motivating example of network requests, describe JSON based "protocol"
You have decided to make an online version of your game with a richer interface. All
of your logic will live on a server, and the UI in a client that communicates with it
using JSON messages. Via the ``json`` module, those messages will be mapped to Python
dictionaries, lists and other built-in objects.
TODO: partial matches, double stars
Our client will receive a list of dictionaries (parsed from JSON) of actions to take,
each element looking, for example, like one of these:
* To display text ``{"text": "The shop keeper says 'Ah! We have Camembert, yes sir'", "color": "blue"}``
* If the client should make a pause ``{"sleep": 3}``
* To play a sound ``{"sound": "filename.ogg", "format": "ogg"}``
Until now, our patterns have processed sequences, but there are also patterns that match
mappings based on the keys they contain. In this case you could use::
for action in actions:
match action:
case {"text": message, "color": c}:
ui.set_text_color(c)
ui.display(message)
case {"sleep": duration}:
ui.wait(duration)
case {"sound": url, "format": "ogg"}:
ui.play(url)
case {"sound": _, "format": _}:
warning("Unsupported audio format")
The keys in your mapping pattern need to be literals, but the values can be any
pattern. As in sequence patterns, all subpatterns have to match for the general
pattern to match.
You can use ``**rest`` within a mapping pattern to capture additional keys in
the subject. Note that if you omit this, extra keys in the subject will be
ignored while matching, i.e. the message
``{"text": "foo", "color": "red", "style": "bold"}`` will match the first pattern
in the example above.
Matching objects
----------------
UI events motivations. describe events in dataclasses. inspiration for event objects
can be taken from https://www.pygame.org/docs/ref/event.html
Our adventure is a success and we have been asked to implement a graphical
interface. Our UI toolkit of choice allows us to write an event loop where we can get a new
event object by calling ``event.get()``. The resulting object can have different types and
attributes according to the user action, for example:
example of getting constants from module (like key names for keyboard events)
* A ``KeyPress`` object is generated when the user presses a key. It has a ``key_name``
attribute with the name of the key pressed, and some other attributes regarding modifiers.
* A ``Click`` object is generated when the user clicks the mouse. It has an attribute
``position`` with the coordinates of the pointer.
* A ``Quit`` object is generated when the user clicks on the close button for the game
window.
customizing match_args?
Rather than writing multiple ``isinstance()`` checks, we can use patterns to recognize
different kinds of objects, and also apply patterns to their attributes::
match event.get():
case Click(position=(x, y)):
handle_click_at(x, y)
case KeyPress(key_name="Q") | Quit():
game.quit()
case KeyPress(key_name="up arrow"):
game.go_north()
...
case KeyPress():
pass # Ignore other keystrokes
case other_event:
raise ValueError(f"Unrecognized event: {other_event}")
A pattern like ``Click(position=(x, y))`` only matches if the actual event is an instance of
the ``Click`` class (or of one of its subclasses). It also requires that the event has a
``position`` attribute that matches the ``(x, y)`` pattern. If there's a match, the locals
``x`` and ``y`` will get the expected values.
A pattern like ``KeyPress()``, with no arguments, will match any object which is an
instance of the ``KeyPress`` class. Only the attributes you specify in the pattern are
matched, and any other attributes are ignored.
Matching positional attributes
------------------------------
The previous section described how to match named attributes when doing an object match.
For some objects it could be convenient to describe the matched arguments by position
(especially if there are only a few attributes and they have a "standard" ordering).
If the classes that you are using are named tuples or dataclasses, you can do that by
following the same order that you'd use when constructing an object. For example, if
the UI framework above defines its class like this::
from dataclasses import dataclass
@dataclass
class Click:
position: tuple
button: str
then you can rewrite your match statement above as::
match event.get():
case Click((x, y)):
handle_click_at(x, y)
The ``(x, y)`` pattern will be automatically matched against the ``position``
attribute, because the first argument in the pattern corresponds to the first
attribute in your dataclass definition.
Other classes don't have a natural ordering of their attributes so you're required to
use explicit names in your pattern to match with their attributes. However, it's possible
to manually specify the ordering of the attributes allowing positional matching, like in
this alternative definition::
class Click:
__match_args__ = ("position", "button")
def __init__(self, position, button):
...
The ``__match_args__`` special attribute defines an explicit order for your attributes
that can be used in patterns like ``case Click((x, y))``.
# TODO: special rules for builtin classes
# TODO: matching foo.bar as a constant
.. _Appendix A:
Appendix A -- Quick Intro
=========================
A ``match`` statement takes an expression and compares it to successive
patterns given as one or more ``case`` blocks. This is superficially
similar to a ``switch`` statement in C, Java or JavaScript (and many
A match statement takes an expression and compares its value to successive
patterns given as one or more case blocks. This is superficially
similar to a switch statement in C, Java or JavaScript (and many
other languages), but much more powerful.
The simplest form compares a subject value against one or more literals::
@ -388,10 +424,6 @@ The simplest form compares a subject value against one or more literals::
match status:
case 400:
return "Bad request"
case 401:
return "Unauthorized"
case 403:
return "Forbidden"
case 404:
return "Not found"
case 418:
@ -410,7 +442,7 @@ You can combine several literals in a single pattern using ``|`` ("or")::
Patterns can look like unpacking assignments, and can be used to bind
variables::
# The subject is an (x, y) tuple
# point is an (x, y) tuple
match point:
case (0, 0):
print("Origin")
@ -426,35 +458,35 @@ variables::
Study that one carefully! The first pattern has two literals, and can
be thought of as an extension of the literal pattern shown above. But
the next two patterns combine a literal and a variable, and the
variable *captures* a value from the subject (``point``). The fourth
variable *binds* a value from the subject (``point``). The fourth
pattern captures two values, which makes it conceptually similar to
the unpacking assignment ``(x, y) = point``.
If you are using classes to structure your data (e.g. data classes)
If you are using classes to structure your data
you can use the class name followed by an argument list resembling a
constructor, but with the ability to capture variables::
constructor, but with the ability to capture attributes into variables::
from dataclasses import dataclass
@dataclass
class Point:
x: int
y: int
def whereis(point):
def where_is(point):
match point:
case Point(0, 0):
case Point(x=0, y=0):
print("Origin")
case Point(0, y):
case Point(x=0, y=y):
print(f"Y={y}")
case Point(x, 0):
case Point(x=x, y=0):
print(f"X={x}")
case Point():
print("Somewhere else")
case _:
print("Not a point")
We can use keyword parameters too. The following patterns are all
You can use positional parameters with some builtin classes that provide an
ordering for their attributes (e.g. dataclasses). You can also define a specific
position for attributes in patterns by setting the ``__match_args__`` special
attribute in your classes. If it's set to ``("x", "y")``, the following patterns are all
equivalent (and all bind the ``y`` attribute to the ``var`` variable)::
Point(1, var)
@ -478,7 +510,7 @@ list of points, we could match it like this::
print("Something else")
We can add an ``if`` clause to a pattern, known as a "guard". If the
guard is false, ``match`` goes on to try the next ``case`` block. Note
guard is false, ``match`` goes on to try the next case block. Note
that value capture happens before the guard is evaluated::
match point:
@ -505,9 +537,9 @@ Several other key features:
patterns, extra keys are ignored. A wildcard ``**rest`` is also
supported. (But ``**_`` would be redundant, so it is not allowed.)
- Subpatterns may be captured using the walrus (``:=``) operator::
- Subpatterns may be captured using the ``as`` keyword::
case (Point(x1, y1), p2 := Point(x2, y2)): ...
case (Point(x1, y1), Point(x2, y2) as p2): ...
- Patterns may use named constants. These must be dotted names
to prevent them from being interpreted as capture variables::
@ -526,23 +558,6 @@ Several other key features:
case Color.BLUE:
print("I'm feeling the blues :(")
- The literals ``None``, ``False`` and ``True`` are treated specially:
comparisons to the subject are done using ``is``. This::
match b:
case True:
print("Yes!")
is exactly equivalent to this::
if b is True:
print("Yes!")
- Classes may override the mapping from positional arguments to
attributes by setting a class variable ``__match_args__``.
Read about it in PEP 634.
Copyright
=========