PEP 642: Additional edits for 2nd posting (#1709)

* Switch to `__` as the proposed wildcard marker
* OR patterns also need a disambiguating space before `==`
* Misc wording tweaks and other edits
Nick Coghlan 2020-11-08 15:52:50 +10:00 committed by GitHub
parent f2213d72e4
commit ed214e8955
1 changed files with 105 additions and 48 deletions


@@ -23,12 +23,11 @@ assignment targets, while retaining most semantic aspects of the existing
 proposal.

 Specifically, this PEP adopts an additional design restriction that PEP 634's
-authors considered unreasonable: that any syntax that is common to both
-assignment targets and match patterns must have a comparable semantic effect,
-while any novel match pattern semantics must offer syntax which emits a syntax
-error when used in an assignment target. It is still considered acceptable to
-offer syntactic sugar that is specific to match patterns, as long as there is
-an underlying more explicit form that is compatible with assignment targets.
+authors considered unreasonable: that any novel match pattern semantics must
+offer syntax that future PEPs could plausibly propose for adoption in assignment
+targets. It is (reluctantly) considered acceptable to offer syntactic sugar that
+is specific to match patterns, as long as there is an underlying more explicit
+form that is compatible (or potentially compatible) with assignment targets.

 As a consequence, this PEP proposes the following changes to the proposed match
 pattern syntax:
@@ -49,8 +48,8 @@ pattern syntax:
   constraint, a class pattern, or a capture pattern with a guard expression)
 * inferred constraints are *not* defined in the Abstract Syntax Tree. Instead,
   inferred constraints are converted to explicit constraints by the parser
-* ``_`` remains the wildcard pattern, but gains a dedicated ``SkippedBinding``
-  AST node to distinguish it from the use of ``_`` as an identifier
+* The wildcard pattern changes from ``_`` (single underscore) to ``__`` (double
+  underscore), and gains a dedicated ``SkippedBinding`` node in the AST
 * Mapping patterns change to allow arbitrary primary expressions as keys
@@ -66,10 +65,13 @@ lookup operations in match patterns. (Even though this PEP ultimately retained
 that shorthand to reduce the verbosity of common use cases, it still redefines
 it in terms of a more explicit underlying construct).

-By dropping its own proposal to switch the wildcard pattern to ``?`` (and
-instead retaining PEP 634's ``_``), this PEP now effectively votes against
-the proposal in PEP 640 to allow the use of ``?`` as a general purpose wildcard
-marker in name binding operations.
+This PEP agrees with the spirit of PEP 640 (that the chosen wildcard pattern to
+skip a name binding should be supported everywhere, not just in match patterns),
+but is now proposing a different spelling for the wildcard syntax (``__`` rather
+than ``?``). As such, it competes with PEP 640 as written, but would complement
+a proposal to deprecate the use of ``__`` as an ordinary identifier and instead
+turn it into a general purpose wildcard marker that always skips making a new
+local variable binding.


 Motivation
@@ -125,16 +127,16 @@ differences that emerge relative to the syntactic proposal in PEP 634 are:
   only allowing them to be inferred from the use of dotted names or literals; and
 * a requirement to use a non-binding wildcard marker other than ``_``.

-This PEP concedes the second point in the name of cross-language consistency
-(and for lack of a compelling alternative wildcard marker), but proposes
-constraint expressions as a way of addressing the first point.
+This PEP proposes constraint expressions as a way of addressing the first point,
+and changes the proposed non-binding wildcard marker to a double-underscore to
+address the latter.

 PEP 634 also proposes special casing the literals ``None``, ``True``, and
 ``False`` so that they're compared by identity when written directly as a
 literal pattern, but by equality when referenced by a value pattern. This PEP
 eliminates the need for those special cases by proposing distinct syntax for
-matching by identity and matching by equality (but does accept the convenience
-and consistency argument in allowing ``None`` as a shorthand for ``is None``).
+matching by identity and matching by equality, but does accept the convenience
+and consistency argument in allowing ``None`` as a shorthand for ``is None``.
 Specification
@@ -143,7 +145,6 @@ Specification
 This PEP retains the overall `match`/`case` statement syntax from PEP 634, and
 retains both the syntax and semantics for the following match pattern variants:

-* capture patterns
 * class patterns
 * group patterns
 * sequence patterns
@@ -151,6 +152,9 @@ retains both the syntax and semantics for the following match pattern variants:
 Pattern combination (both OR and AS patterns) and guard expressions also remain
 the same as they are in PEP 634.

+Capture patterns are essentially unchanged, except that ``_`` becomes a regular
+capture pattern, due to the wildcard pattern marker changing to ``__``.
+
 Constraint patterns are added, offering equality constraints and identity
 constraints.
@@ -161,9 +165,9 @@ attribute lookups, and inferred identity constraints for ``None`` and ``...``.
 Mapping patterns change to allow arbitrary primary expressions for keys, rather
 than being restricted to literal patterns or value patterns.

-Wildcard patterns remain the same in the proposed surface syntax, but are
-explicitly distinguished from the use of ``_`` as an identifier in the Abstract
-Syntax Tree produced by the parser.
+Wildcard patterns are changed to use ``__`` (double underscore) rather than
+``_`` (single underscore), and are also given a new dedicated node in the
+Abstract Syntax Tree produced by the parser.


 Constraint patterns
@@ -365,17 +369,28 @@ test cases will compile and run as expected).
 Wildcard patterns
 -----------------

-Wildcard patterns retain the same ``_`` syntax in this PEP as they have in PEP
-634. However, this PEP explicitly requires that they be represented in the
+Wildcard patterns are changed to use ``__`` (double underscore) rather than
+the ``_`` (single underscore) syntax proposed in PEP 634::
+
+    match sequence:
+        case [__]:  # any sequence with a single element
+            return True
+        case [start, *__, end]:  # a sequence with at least two elements
+            return start == end
+        case __:  # anything
+            return False
+
+This PEP explicitly requires that wildcard patterns be represented in the
 Abstract Syntax Tree as something *other than* a regular ``Name`` node.

 The draft reference implementation uses the node name ``SkippedBinding`` to
 indicate that the node appears where a simple name binding would ordinarily
 occur to indicate that nothing should actually be bound, but the exact name of
 the node is more an implementation decision than a design one. The key design
-requirement is to limit the special casing of ``_`` to the parser and allow the
+requirement is to limit the special casing of ``__`` to the parser and allow the
 rest of the compiler to distinguish wildcard patterns from capture patterns
-based entirely on information contained within the node itself.
+based entirely on the kind of the AST node, rather than needing to inspect the
+identifier used in ``Name`` nodes.
 Design Discussion
@@ -545,6 +560,9 @@ chosen as the prefix in the initial iteration of the PEP:
 * when used in a mapping pattern, there needs to be a space between the ``:``
   key/value separator and the ``==`` prefix, or the tokeniser will split them
   up incorrectly (getting ``:=`` and ``=`` instead of ``:`` and ``==``)
+* when used in an OR pattern, there needs to be a space between the ``|``
+  pattern separator and the ``==`` prefix, or the tokeniser will split them
+  up incorrectly (getting ``|=`` and ``=`` instead of ``|`` and ``==``)
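The tokeniser behaviour described in these bullets can be observed with the standard library ``tokenize`` module, which splits operators greedily in the same way. This is only a small demonstration: the helper name ``operator_tokens`` is made up for this example, and the input strings are arbitrary fragments rather than valid match statements.

```python
import io
import tokenize


def operator_tokens(source):
    """Return the operator tokens found in a line of source text."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP
    ]


# Without a space, the longest operator wins and ``==`` is never seen.
print(operator_tokens("a |== b\n"))   # ['|=', '=']
print(operator_tokens("k :== v\n"))   # [':=', '=']

# With a disambiguating space, the intended tokens are produced.
print(operator_tokens("a | == b\n"))  # ['|', '==']
print(operator_tokens("k : == v\n"))  # [':', '==']
```

The ``:=`` case requires Python 3.8 or later, since that is when the assignment expression operator was added to the tokeniser.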

 Rather than introducing a completely new symbol, Steven's proposed resolution to
 this verbosity problem was to retain the ability to omit the prefix marker in
@@ -563,31 +581,58 @@ pattern matching syntax held for this proposal as well, and so the PEP was
 amended accordingly.


-Retaining ``_`` as the wildcard pattern marker
-----------------------------------------------
+Using ``__`` as the wildcard pattern marker
+-------------------------------------------

 PEP 635 makes a solid case that introducing ``?`` *solely* as a wildcard pattern
 marker would be a bad idea. With the syntax for constraint patterns now changed
 to use existing comparison operations rather than ``?`` and ``?is``, that
 argument holds for this PEP as well.

-However, this PEP also proposes adopting an implementation technique that limits
-the scope of the associated special casing of ``_`` to the parser: defining a
+However, as noted by Thomas Wouters in [6_], PEP 634's choice of ``_`` remains
+problematic as it would likely mean that match patterns would have a *permanent*
+difference from all other parts of Python - the use of ``_`` in software
+internationalisation and at the interactive prompt means that there isn't really
+a plausible path towards using it as a general purpose "skipped binding" marker.
+
+``__`` is an alternative "this value is not needed" marker drawn from a Stack
+Overflow answer [7_] (originally posted by the author of this PEP) on the
+various meanings of ``_`` in existing Python code.
+
+This PEP also proposes adopting an implementation technique that limits
+the scope of the associated special casing of ``__`` to the parser: defining a
 new AST node type (``SkippedBinding``) specifically for wildcard markers.

-Within the parser, ``_`` would still mean either a regular name or a wildcard
+Within the parser, ``__`` would still mean either a regular name or a wildcard
 marker in a match pattern depending on where you were in the parse tree, but
-within the rest of the compiler, ``Name("_")`` would always be a regular name,
-while ``SkippedBinding()`` would always be a wildcard marker (with it being
-the responsibility of the AST validator to disallow the use of
-``SkippedBinding`` outside match patterns).
+within the rest of the compiler, ``Name("__")`` would still be a regular name,
+while ``SkippedBinding()`` would always be a wildcard marker.

-It may also make sense to consider a future proposal that further changes ``_``
-to also skip binding when it's used as part of an iterable unpacking target, but
-that's entirely out of the scope of the pattern matching discussion (and would
-require careful review of how the routine uses of assignment to ``_`` in
-internationalisation use cases and Python interactive prompt implementations
-are handled).
+Unlike ``_``, the lack of other use cases for ``__`` means that there would be
+a plausible path towards restoring identifier handling consistency with the rest
+of the language by making it mean "skip this name binding" everywhere in Python:
+
+* in the interpreter itself, deprecate loading variables with the name ``__``.
+  This would make reading from ``__`` emit a deprecation warning, while writing
+  to it would initially be unchanged. To avoid slowing down all name loads, this
+  could be handled by having the compiler emit additional code for the
+  deprecated name, rather than using a runtime check in the standard name
+  loading opcodes.
+* after a suitable number of releases, change the parser to emit
+  ``SkippedBinding`` for all uses of ``__`` as an assignment target, not just
+  those appearing inside match patterns
+* consider making ``__`` a true hard keyword rather than a soft keyword
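The first step relies on reads of ``__`` being distinguishable from writes at compile time. As a minimal sketch of why that is feasible, the standard ``ast`` module already records whether each name is loaded or stored. The helper ``find_dunder_loads`` below is hypothetical and only reports the affected lines; it does not implement the proposed deprecation warning or any compiler change.

```python
import ast


def find_dunder_loads(source):
    """Return the line numbers where ``__`` is read as a variable."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Name)
        and node.id == "__"
        and isinstance(node.ctx, ast.Load)
    )


sample = (
    "for __ in range(3):\n"   # line 1: write (loop target)
    "    pass\n"
    "__ = 1\n"                # line 3: write (assignment target)
    "print(__)\n"             # line 4: read
)
print(find_dunder_loads(sample))
```

Only the read on the last line is reported, while using ``__`` as a loop or assignment target is left alone. That matches the intent of deprecating loads first and leaving stores unchanged.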
+
+This deprecation path couldn't be followed for ``_``, as there's no way for the
+interpreter to distinguish between attempts to read back ``_`` when nominally
+used as a "don't care" marker, and legitimate reads of ``_`` as either an
+i18n text translation function or as the last statement result at the
+interactive prompt.
+
+Names starting with double-underscores are also already reserved for use by the
+language, whether that is for compile time constants (e.g. ``__debug__``),
+special methods, or class attribute name mangling, so using ``__`` here would
+be consistent with that existing approach.


 Keeping inferred equality constraints
@@ -786,14 +831,16 @@ The syntax used for equality and identity constraints would be straightforward
 to extend to containment checks: ``in container``.

 One downside of the proposals in both this PEP and PEP 634 is that checking
-for multiple values in the same case is quite verbose::
+for multiple values in the same case doesn't look like any existing set
+membership check in Python::

     # PEP 634's literal patterns / this PEP's inferred constraints
     match value:
         case 0 | 1 | 2 | 3:
             ...

-Explicit equality constraints are even worse::
+Explicit equality constraints also become quite verbose if they need to be
+repeated::

     match value:
         case == one | == two | == three | == four:
@@ -870,7 +917,7 @@ This PEP proposes skipping requiring any such workarounds, and instead
 supporting arbitrary value constraints from the start::

     match expr:
-        case (_, == func()):
+        case (__, == func()):
             ... # Handle the case where 'expr == func()'

 Whether actually writing that kind of code is a good idea would be a topic for
@@ -974,9 +1021,6 @@ constraint operators in the AST. The draft implementation adds them as new
 cases on the existing ``UnaryOp`` node, but there's an argument to be made that
 they would be better implemented as a new ``Constraint`` node, since they're
 accepted at different points in the syntax tree than other unary operators.
-Making them a new node type would also allow an attribute to be added that
-marked them as implicit or explicit nodes, which ``ast.unparse`` could use
-to make the unparsed code look more like original.


 Acknowledgments
@@ -999,6 +1043,13 @@ to obtain the same level of brevity as PEP 634 in most situations. (Paul
 Sokolovsky also independently suggested using ``==`` instead of ``?`` as a
 more easily understood prefix for equality constraints).

+Thomas Wouters, whose publication of PEP 640 and public review of the structured
+pattern matching proposals persuaded the author of this PEP to continue
+advocating for a wildcard pattern syntax that a future PEP could plausibly turn
+into a hard keyword that always skips binding a reference in any location a
+simple name is expected, rather than continuing indefinitely as the match
+pattern specific soft keyword that is proposed here.
+

 References
 ==========
@@ -1018,6 +1069,12 @@ References
 .. [5] Steven D'Aprano's cogent criticism of the first published iteration of this PEP
    https://mail.python.org/archives/list/python-dev@python.org/message/BTHFWG6MWLHALOD6CHTUFPHAR65YN6BP/

+.. [6] Thomas Wouters' initial review of the structured pattern matching proposals
+   https://mail.python.org/archives/list/python-dev@python.org/thread/4SBR3J5IQUYE752KR7C6432HNBSYKC5X/
+
+.. [7] Stack Overflow answer regarding the use cases for ``_`` as an identifier
+   https://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-single-underscore-variable-in-python/5893946#5893946
+

 .. _Appendix A:
@@ -1057,9 +1114,9 @@ Notation used beyond standard EBNF is as per PEP 534:
         | mapping_pattern
         | class_pattern

-    capture_pattern: !"_" NAME !('.' | '(' | '=')
+    capture_pattern: !"__" NAME !('.' | '(' | '=')

-    wildcard_pattern: "_"
+    wildcard_pattern: "__"

     constraint_pattern:
         | eq_constraint