386 lines
16 KiB
ReStructuredText
386 lines
16 KiB
ReStructuredText
|
PEP: 636
|
||
|
Title: Structural Pattern Matching: Tutorial
|
||
|
Version: $Revision$
|
||
|
Last-Modified: $Date$
|
||
|
Author: Daniel F Moisset <dfmoisset@gmail.com>,
|
||
|
Tobias Kohn <kohnt@tobiaskohn.ch>
|
||
|
Sponsor: Guido van Rossum <guido@python.org>
|
||
|
BDFL-Delegate:
|
||
|
Discussions-To: Python-Dev <python-dev@python.org>
|
||
|
Status: Draft
|
||
|
Type: Informational
|
||
|
Content-Type: text/x-rst
|
||
|
Created: 12-Sep-2020
|
||
|
Python-Version: 3.10
|
||
|
Post-History:
|
||
|
Resolution:
|
||
|
|
||
|
|
||
|
Abstract
|
||
|
========
|
||
|
|
||
|
**NOTE:** This draft is incomplete and not intended for review yet.
|
||
|
We're checking it into the peps repo for the convenience of the authors.
|
||
|
|
||
|
This PEP is a tutorial for the pattern matching introduced by PEP 634.
|
||
|
|
||
|
PEP 622 proposed syntax for pattern matching, which received detailed discussion
|
||
|
both from the community and the Steering Council. A frequent concern was
|
||
|
about how easy it would be to explain (and learn) this feature. This PEP
|
||
|
addresses that concern providing the kind of document which developers could use
|
||
|
to learn about pattern matching in Python.
|
||
|
|
||
|
This is considered supporting material for PEP 634 (the technical specification
|
||
|
for pattern matching) and PEP 635 (the motivation and rationale for having pattern
|
||
|
matching and design considerations).
|
||
|
|
||
|
Meta
|
||
|
====
|
||
|
|
||
|
This section is intended to get in sync about style and language with
|
||
|
co-authors. It should be removed from the released PEP
|
||
|
|
||
|
The following are design decisions I made while writing this:
|
||
|
|
||
|
1. Who is the target audience?
|
||
|
I'm considering "People with general Python experience" (i.e. who shouldn't be surprised
|
||
|
at anything in the Python tutorial), but not necessarily involved with the
|
||
|
design/development or Python. I'm assuming someone who hasn't been exposed to pattern
|
||
|
matching in other languages.
|
||
|
|
||
|
2. How detailed should this document be?
|
||
|
I considered a range from "very superficial" (like the detail level you might find about
|
||
|
statements in the Python tutorial) to "terse but complete" like
|
||
|
https://github.com/gvanrossum/patma/#tutorial
|
||
|
to "long and detailed". I chose the later, we can always trim down from that.
|
||
|
|
||
|
3. What kind of examples to use?
|
||
|
I tried to write examples that are could that I might write using pattern matching. I
|
||
|
avoided going
|
||
|
for a full application (because the examples I have in mind are too large for a PEP) but
|
||
|
I tried to follow ideas related to a single project to thread the story-telling more
|
||
|
easily. This is probably the most controversial thing here, and if the rest of
|
||
|
the authors dislike it, we can change to a more formal explanatory style.
|
||
|
|
||
|
Other rules I'm following (let me know if I forgot to):
|
||
|
|
||
|
* I'm not going to reference/compare with other languages
|
||
|
* I'm not trying to convince the reader that this is a good idea (that's the job of
|
||
|
PEP 635) just explain how to use it
|
||
|
* I'm not trying to cover every corner case (that's the job of PEP 634), just cover
|
||
|
how to use the full functionality in the "normal" cases.
|
||
|
* I talk to the learner in second person
|
||
|
|
||
|
Tutorial
|
||
|
========
|
||
|
|
||
|
As an example to motivate this tutorial, you will be writing a text-adventure. That is
|
||
|
a form of interactive fiction where the user enters text commands to interact with a
|
||
|
fictional world and receives text descriptions of what happens. Commands will be
|
||
|
simplified forms of natural language like ``get sword``, ``attack dragon``, ``go north``,
|
||
|
``enter shop`` or ``buy cheese``.
|
||
|
|
||
|
Matching sequences
|
||
|
------------------
|
||
|
|
||
|
Your main loop will need to get input from the user and split it into words, let's say
|
||
|
a list of strings like this::
|
||
|
|
||
|
command = input("What are you doing next? ")
|
||
|
# analyze the result of command.split()
|
||
|
|
||
|
The next step is to interpret the words. Most of our commands will have two words: an
|
||
|
action and an object. So you may be tempted to do the following::
|
||
|
|
||
|
[action, obj] = command.split()
|
||
|
... # interpret action, obj
|
||
|
|
||
|
The problem with that line of code is that it's missing something: what if the user
|
||
|
types more or fewer than 2 words? To prevent this problem you can either check the length
|
||
|
of the list of words, or capture the ``ValueError`` that the statement above would raise.
|
||
|
|
||
|
You can use a matching statement instead::
|
||
|
|
||
|
match command.split():
|
||
|
case [action, obj]:
|
||
|
... # interpret action, obj
|
||
|
|
||
|
The ``match`` statement evaluates the **subject** after the ``match`` keyword, and checks
|
||
|
it against the **pattern** next to ``case``. A pattern is able to do two different
|
||
|
things:
|
||
|
|
||
|
* Verify that the subject has certain structure. In your case, the ``[action, obj]``
|
||
|
pattern matches any sequence of exactly two elements. This is called **matching**
|
||
|
* It will bind some names in the pattern to component elements of your subject. In
|
||
|
this case, if the list has two elements, it will bind ``action = subject[0]`` and
|
||
|
``obj = subject[1]``. This is called **destructuring**
|
||
|
|
||
|
If there's a match, the statements inside the ``case`` clause will be executed with the
|
||
|
bound variables. If there's no match, nothing happens and the next statement after
|
||
|
``match`` keeps running.
|
||
|
|
||
|
TODO: discuss other sequences, tuples. Discuss syntax with parenthesis. discuss
|
||
|
iterators? discuss [x, x] possibly later on?
|
||
|
|
||
|
Matching multiple patterns
|
||
|
--------------------------
|
||
|
|
||
|
Even if most commands have the action/object form, you might want to have user commands
|
||
|
of different lengths. For example you might want to add single verbs with no object like
|
||
|
``look`` or ``quit``. A match statement can (and is likely to) have more than one
|
||
|
``case``::
|
||
|
|
||
|
match command.split():
|
||
|
case [action]:
|
||
|
... # interpret single-verb action
|
||
|
case [action, obj]:
|
||
|
... # interpret action, obj
|
||
|
|
||
|
The ``match`` statement will check patterns from top to bottom. If the pattern doesn't
|
||
|
match the subject, the next pattern will be tried. However, once the *first*
|
||
|
matching ``case`` clause is found, the body of that clause is executed, and all further
|
||
|
``case`` clauses are ignored. This is similar to the way that an ``if/elif/elif/...``
|
||
|
statement works.
|
||
|
|
||
|
Matching specific values
|
||
|
------------------------
|
||
|
|
||
|
Your code still needs to look at the specific actions and conditionally run
|
||
|
different logic depending on the specific action (e.g., ``quit``, ``attack``, or ``buy``).
|
||
|
You could do that using a chain of ``if/elif/elif/...``, or using a dictionary of
|
||
|
functions, but here we'll leverage pattern matching to solve that task. Instead of a
|
||
|
variable, you can use literal values in patterns (like ``"quit"``, ``42``, or ``None``).
|
||
|
This allows you to write::
|
||
|
|
||
|
match command.split():
|
||
|
case ["quit"]:
|
||
|
print("Goodbye!")
|
||
|
quit_game()
|
||
|
case ["look"]:
|
||
|
current_room.describe()
|
||
|
case ["get", obj]:
|
||
|
character.get(obj, current_room)
|
||
|
case ["go", direction]:
|
||
|
current_room = current_room.neighbor(direction)
|
||
|
# The rest of your commands go here
|
||
|
|
||
|
A pattern like ``["get", obj]`` will match only 2-element sequences that have a first
|
||
|
element equal to ``"get"``. When destructuring, it will bind ``obj = subject[1]``.
|
||
|
|
||
|
As you can see in the ``go`` case, we also can use different variable names in
|
||
|
different patterns.
|
||
|
|
||
|
FIXME: This *might* be the place to explain a bit that when I say "literal" I mean it
|
||
|
literally, and a "soft constant" will not work :)
|
||
|
|
||
|
Matching slices
|
||
|
---------------
|
||
|
|
||
|
A player may be able to drop multiple objects by using a series of commands
|
||
|
``drop key``, ``drop sword``, ``drop cheese``. This interface might be cumbersome, and
|
||
|
you might like to allow dropping multiple items in a single command, like
|
||
|
``drop key sword cheese``. In this case you don't know beforehand how many words will
|
||
|
be in the command, but you can use extended unpacking in patterns in the same way that
|
||
|
they are allowed in assignments::
|
||
|
|
||
|
match command.split():
|
||
|
case ["drop", *objects]:
|
||
|
for obj in objects:
|
||
|
character.drop(obj, current_room)
|
||
|
# The rest of your commands go here
|
||
|
|
||
|
This will match any sequences having "drop" as its first elements. All remaining
|
||
|
elements will be captured in a ``list`` object which will be bound to the ``objects``
|
||
|
variable.
|
||
|
|
||
|
This syntax has similar restrictions as sequence unpacking: you can not have more than one
|
||
|
starred name in a pattern.
|
||
|
|
||
|
Adding a catch-all
|
||
|
------------------
|
||
|
|
||
|
You may want to print an error message saying that the command wasn't recognized when
|
||
|
all the patterns fail. You could use the feature we just learned and write the
|
||
|
following::
|
||
|
|
||
|
match command.split():
|
||
|
case ["quit"]: ... # Code omitted for brevity
|
||
|
case ["go", direction]: ...
|
||
|
case ["drop", *objects]: ...
|
||
|
... # Other case clauses
|
||
|
case [*ignored_words]:
|
||
|
print(f"Sorry, I couldn't understand {command!r}")
|
||
|
|
||
|
Note that you must add this last pattern at the end, otherwise it will match before other
|
||
|
possible patterns that could be considered. This works but it's a bit verbose and
|
||
|
somewhat wasteful: this will make a full copy of the word list, which will be bound to
|
||
|
``ignored_words`` even if it's never used.
|
||
|
|
||
|
You can use an special pattern which is written ``_``, which always matches but it
|
||
|
doesn't bind anything. which would allow you to rewrite::
|
||
|
|
||
|
match command.split():
|
||
|
... # Other case clauses
|
||
|
case [*_]:
|
||
|
print(f"Sorry, I couldn't understand {command!r}")
|
||
|
|
||
|
This pattern will match for any sequence. In this case we can simplify even more and
|
||
|
match any object::
|
||
|
|
||
|
match command.split():
|
||
|
... # Other case clauses
|
||
|
case _:
|
||
|
print(f"Sorry, I couldn't understand {command!r}")
|
||
|
|
||
|
TODO: Explain about syntaxerror when having an irrefutable pattern above others?
|
||
|
|
||
|
How patterns are composed
|
||
|
-------------------------
|
||
|
|
||
|
This is a good moment to step back from the examples and understand how the patterns
|
||
|
that you have been using are built. Patterns can be nested within each other, and we
|
||
|
have being doing that implicitly in the examples above.
|
||
|
|
||
|
There are some "simple" patterns ("simple" here meaning that they do not contain other
|
||
|
patterns) that we've seen:
|
||
|
|
||
|
* **Literal patterns** (string literals, number literals, ``True``, ``False``, and
|
||
|
``None``)
|
||
|
* The **wildcard pattern** ``_``
|
||
|
* **Capture patterns** (stand-alone names like ``direction``, ``action``, ``objects``). We
|
||
|
never discussed these separately, but used them as part of other patterns. Note that
|
||
|
a capture pattern by itself will always match, and usually makes sense only
|
||
|
as a catch-all at the end of your ``match`` if you desire to bind the name to the
|
||
|
subject.
|
||
|
|
||
|
Until now, the only non-simple pattern we have experimented with is the sequence pattern.
|
||
|
Each element in a sequence pattern can in fact be
|
||
|
any other pattern. This means that you could write a pattern like
|
||
|
``["first", (left, right), *rest]``. This will match subjects which are a sequence of at
|
||
|
least two elements, where the first one is equal to ``"first"`` and the second one is
|
||
|
in turn a sequence of two elements. It will also bind ``left=subject[1][0]``,
|
||
|
``right=subject[1][1]``, and ``rest = subject[2:]``
|
||
|
|
||
|
Alternate patterns
|
||
|
------------------
|
||
|
|
||
|
Going back to the adventure game example, you may find that you'd like to have several
|
||
|
patterns resulting in the same outcome. For example, you might want the commands
|
||
|
``north`` and ``go north`` be equivalent. You may also desire to have aliases for
|
||
|
``get X``, ``pick up X`` and ``pick X up`` for any X.
|
||
|
|
||
|
The ``|`` symbol in patterns combines them as alternatives. You could for example write::
|
||
|
|
||
|
match command.split():
|
||
|
... # Other case clauses
|
||
|
case ["north"] | ["go", "north"]:
|
||
|
current_room = current_room.neighbor("north")
|
||
|
case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
|
||
|
... # Code for picking up the given object
|
||
|
|
||
|
This is called an **or pattern** and will produce the expected result. Patterns are
|
||
|
attempted from left to right; this may be relevant to know what is bound if more than
|
||
|
one alternative matches. An important restriction when writing or patterns is that all
|
||
|
alternatives should bind the same variables. So a pattern ``[1, x] | [2, y]`` is not
|
||
|
allowed because it would make unclear which variable would be bound after a successful
|
||
|
match. ``[1, x] | [2, x]`` is perfectly fine and will always bind ``x`` if successful.
|
||
|
|
||
|
|
||
|
Capturing matched sub-patterns
|
||
|
------------------------------
|
||
|
|
||
|
The first version of our "go" command was written with a ``["go", direction]`` pattern.
|
||
|
The change we did in our last version using the pattern ``["north"] | ["go", "north"]``
|
||
|
has some benefits but also some drawbacks in comparison: the latest version allows the
|
||
|
alias, but also has the direction hardcoded, which will force us to actually have
|
||
|
separate patterns for north/south/east/west. This leads to some code duplication, but at
|
||
|
the same time we get better input validation, and we will not be getting into that
|
||
|
branch if the command entered by the user is ``"go figure!"`` instead of an direction.
|
||
|
|
||
|
We could try to get the best of both worlds doing the following (I'll omit the aliased
|
||
|
version without "go" for brevity)::
|
||
|
|
||
|
match command.split():
|
||
|
case ["go", ("north" | "south" | "east" | "west")]:
|
||
|
current_room = current_room.neighbor(...)
|
||
|
# how do I know which direction to go?
|
||
|
|
||
|
This code is a single branch, and it verifies that the word after "go" is really a
|
||
|
direction. But the code moving the player around needs to know which one was chosen and
|
||
|
has no way to do so. What we need is a pattern that behaves like the or pattern but at
|
||
|
the same time does a capture. We can do so with a **walrus pattern**::
|
||
|
|
||
|
match command.split():
|
||
|
case ["go", direction := ("north" | "south" | "east" | "west")]:
|
||
|
current_room = current_room.neighbor(direction)
|
||
|
|
||
|
The walrus pattern (named like that because the ``:=`` operator looks like a sideways
|
||
|
walrus) matches whatever pattern is on its right hand side, but also binds the value to
|
||
|
a name.
|
||
|
|
||
|
Adding conditions to patterns
|
||
|
-----------------------------
|
||
|
|
||
|
The patterns we have explored above can do some powerful data filtering, but sometimes
|
||
|
you may wish for the full power of a boolean expression. Let's say that you would actually
|
||
|
like to allow a "go" command only in a restricted set of directions based on the possible
|
||
|
exits from the current_room. We can achieve that by adding a **guard** to our
|
||
|
case-clause. Guards consist of the ``if`` keyword followed by any expression::
|
||
|
|
||
|
match command.split():
|
||
|
case ["go", direction] if direction in current_room.exits:
|
||
|
current_room = current_room.neighbor(direction)
|
||
|
case ["go", _]:
|
||
|
print("Sorry, you can't go that way")
|
||
|
|
||
|
The guard is not part of the pattern, it's part of the case clause. It's only checked if
|
||
|
the pattern matches, and after all the pattern variables have been bound (that's why the
|
||
|
condition can use the ``direction`` variable in the example above). If the pattern
|
||
|
matches and the condition is truthy, the body of the case clause runs normally. If the
|
||
|
pattern matches but the condition is falsy, the match statement proceeds to check the
|
||
|
next ``case`` clause as if the pattern hadn't matched (with the possible side-effect of
|
||
|
having already bound some variables).
|
||
|
|
||
|
The sequence of these steps must be considered carefully when combining or-patterns and
|
||
|
guards. If you have ``case [x, 100] | [0, x] if x > 10`` and your subject is
|
||
|
``[0, 100]``, the clause will be skipped. This happens because:
|
||
|
|
||
|
* The or-pattern finds the first alternative that matches the subject, which happens to
|
||
|
be ``[x, 100]``
|
||
|
* ``x`` is bound to 0
|
||
|
* The condition x > 10 is checked. Given that it's false, the whole case clause is
|
||
|
skipped. The ``[0, x]`` pattern is never attempted.
|
||
|
|
||
|
Going to the cloud: Mappings
|
||
|
----------------------------
|
||
|
|
||
|
TODO: Give the motivating example of netowrk requests, describe JSON based "protocol"
|
||
|
|
||
|
TODO: partial matches, double stars
|
||
|
|
||
|
Matching objects
|
||
|
----------------
|
||
|
|
||
|
UI events motivations. describe events in dataclasses. inspiration for event objects
|
||
|
can be taken from https://www.pygame.org/docs/ref/event.html
|
||
|
|
||
|
example of getting constants from module (like key names for keyboard events)
|
||
|
|
||
|
customizing match_args?
|
||
|
|
||
|
Copyright
|
||
|
=========
|
||
|
|
||
|
This document is placed in the public domain or under the
|
||
|
CC0-1.0-Universal license, whichever is more permissive.
|
||
|
|
||
|
|
||
|
..
|
||
|
Local Variables:
|
||
|
mode: indented-text
|
||
|
indent-tabs-mode: nil
|
||
|
sentence-end-double-space: t
|
||
|
fill-column: 70
|
||
|
coding: utf-8
|
||
|
End:
|