PEP 642: Additional edits for 2nd posting (#1709)

* Switch to `__` as the proposed wildcard marker
* OR patterns also need a disambiguating space before `==`
* Misc wording tweaks and other edits
Nick Coghlan 2020-11-08 15:52:50 +10:00 committed by GitHub
parent f2213d72e4
commit ed214e8955
1 changed files with 105 additions and 48 deletions

@ -23,12 +23,11 @@ assignment targets, while retaining most semantic aspects of the existing
proposal.
Specifically, this PEP adopts an additional design restriction that PEP 634's
authors considered unreasonable: that any syntax that is common to both
assignment targets and match patterns must have a comparable semantic effect,
while any novel match pattern semantics must offer syntax which emits a syntax
error when used in an assignment target. It is still considered acceptable to
offer syntactic sugar that is specific to match patterns, as long as there is
an underlying more explicit form that is compatible with assignment targets.
authors considered unreasonable: that any novel match pattern semantics must
offer syntax that future PEPs could plausibly propose for adoption in assignment
targets. It is (reluctantly) considered acceptable to offer syntactic sugar that
is specific to match patterns, as long as there is an underlying more explicit
form that is compatible (or potentially compatible) with assignment targets.
As a consequence, this PEP proposes the following changes to the proposed match
pattern syntax:
@ -49,8 +48,8 @@ pattern syntax:
constraint, a class pattern, or a capture pattern with a guard expression)
* inferred constraints are *not* defined in the Abstract Syntax Tree. Instead,
inferred constraints are converted to explicit constraints by the parser
* ``_`` remains the wildcard pattern, but gains a dedicated ``SkippedBinding``
AST node to distinguish it from the use of ``_`` as an identifier
* The wildcard pattern changes from ``_`` (single underscore) to ``__`` (double
underscore), and gains a dedicated ``SkippedBinding`` node in the AST
* Mapping patterns change to allow arbitrary primary expressions as keys
@ -66,10 +65,13 @@ lookup operations in match patterns. (Even though this PEP ultimately retained
that shorthand to reduce the verbosity of common use cases, it still redefines
it in terms of a more explicit underlying construct).
By dropping its own proposal to switch the wildcard pattern to ``?`` (and
instead retaining PEP 634's ``_``), this PEP now effectively votes against
the proposal in PEP 640 to allow the use of ``?`` as a general purpose wildcard
marker in name binding operations.
This PEP agrees with the spirit of PEP 640 (that the chosen wildcard pattern to
skip a name binding should be supported everywhere, not just in match patterns),
but is now proposing a different spelling for the wildcard syntax (``__`` rather
than ``?``). As such, it competes with PEP 640 as written, but would complement
a proposal to deprecate the use of ``__`` as an ordinary identifier and instead
turn it into a general purpose wildcard marker that always skips making a new
local variable binding.
Motivation
@ -125,16 +127,16 @@ differences that emerge relative to the syntactic proposal in PEP 634 are:
only allowing them to be inferred from the use of dotted names or literals; and
* a requirement to use a non-binding wildcard marker other than ``_``.
This PEP concedes the second point in the name of cross-language consistency
(and for lack of a compelling alternative wildcard marker), but proposes
constraint expressions as a way of addressing the first point.
This PEP proposes constraint expressions as a way of addressing the first point,
and changes the proposed non-binding wildcard marker to a double-underscore to
address the latter.
PEP 634 also proposes special casing the literals ``None``, ``True``, and
``False`` so that they're compared by identity when written directly as a
literal pattern, but by equality when referenced by a value pattern. This PEP
eliminates the need for those special cases by proposing distinct syntax for
matching by identity and matching by equality (but does accept the convenience
and consistency argument in allowing ``None`` as a shorthand for ``is None``).
matching by identity and matching by equality, but does accept the convenience
and consistency argument in allowing ``None`` as a shorthand for ``is None``.
Specification
@ -143,7 +145,6 @@ Specification
This PEP retains the overall ``match``/``case`` statement syntax from PEP 634, and
retains both the syntax and semantics for the following match pattern variants:
* capture patterns
* class patterns
* group patterns
* sequence patterns
@ -151,6 +152,9 @@ retains both the syntax and semantics for the following match pattern variants:
Pattern combination (both OR and AS patterns) and guard expressions also remain
the same as they are in PEP 634.
Capture patterns are essentially unchanged, except that ``_`` becomes a regular
capture pattern, due to the wildcard pattern marker changing to ``__``.
Constraint patterns are added, offering equality constraints and identity
constraints.
@ -161,9 +165,9 @@ attribute lookups, and inferred identity constraints for ``None`` and ``...``.
Mapping patterns change to allow arbitrary primary expressions for keys, rather
than being restricted to literal patterns or value patterns.
Wildcard patterns remain the same in the proposed surface syntax, but are
explicitly distinguished from the use of ``_`` as an identifier in the Abstract
Syntax Tree produced by the parser.
Wildcard patterns are changed to use ``__`` (double underscore) rather than
``_`` (single underscore), and are also given a new dedicated node in the
Abstract Syntax Tree produced by the parser.
Constraint patterns
@ -365,17 +369,28 @@ test cases will compile and run as expected).
Wildcard patterns
-----------------
Wildcard patterns retain the same ``_`` syntax in this PEP as they have in PEP
634. However, this PEP explicitly requires that they be represented in the
Wildcard patterns are changed to use ``__`` (double underscore) rather than
the ``_`` (single underscore) syntax proposed in PEP 634::
match sequence:
case [__]: # any sequence with a single element
return True
case [start, *__, end]: # a sequence with at least two elements
return start == end
case __: # anything
return False
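For readers without a build of the draft implementation, the behaviour of the
example above can be approximated in current Python without a ``match``
statement. This is only a sketch: the function name ``ends_match`` is
hypothetical, and the sequence check assumes PEP 634's rule that ``str``,
``bytes`` and ``bytearray`` subjects are never treated as sequences.

```python
from collections.abc import Sequence


def ends_match(subject):
    """Approximate the wildcard example using only pre-match Python."""
    # Sequence patterns never match str, bytes, or bytearray subjects.
    is_sequence = isinstance(subject, Sequence) and not isinstance(
        subject, (str, bytes, bytearray)
    )
    if is_sequence and len(subject) == 1:
        # case [__]: one element, which is matched but never bound
        return True
    if is_sequence and len(subject) >= 2:
        # case [start, *__, end]: only the first and last items are bound
        return subject[0] == subject[-1]
    # case __: anything else
    return False
```

In each branch the positions written as ``__`` are the values the function
never needs to name, which is the situation the wildcard marker is meant to
express.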
This PEP explicitly requires that wildcard patterns be represented in the
Abstract Syntax Tree as something *other than* a regular ``Name`` node.
The draft reference implementation uses the node name ``SkippedBinding`` to
indicate that the node appears where a simple name binding would ordinarily
occur to indicate that nothing should actually be bound, but the exact name of
the node is more an implementation decision than a design one. The key design
requirement is to limit the special casing of ``_`` to the parser and allow the
requirement is to limit the special casing of ``__`` to the parser and allow the
rest of the compiler to distinguish wildcard patterns from capture patterns
based entirely on information contained within the node itself.
based entirely on the kind of the AST node, rather than needing to inspect the
identifier used in ``Name`` nodes.
Design Discussion
@ -545,6 +560,9 @@ chosen as the prefix in the initial iteration of the PEP:
* when used in a mapping pattern, there needs to be a space between the ``:``
key/value separator and the ``==`` prefix, or the tokeniser will split them
up incorrectly (getting ``:=`` and ``=`` instead of ``:`` and ``==``)
* when used in an OR pattern, there needs to be a space between the ``|``
pattern separator and the ``==`` prefix, or the tokeniser will split them
up incorrectly (getting ``|=`` and ``=`` instead of ``|`` and ``==``)
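Both splitting problems can be observed with the standard library ``tokenize``
module, which applies the same longest-match rule as the compiler's tokeniser.
The helper name ``operator_tokens`` below is hypothetical and the inputs are
just token streams, not valid statements; this is a sketch for illustration.

```python
import io
import tokenize


def operator_tokens(source):
    """Return the operator tokens the standard tokeniser finds in source."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP
    ]


# Without a space, the longest-match rule produces ':=' and '|='.
print(operator_tokens("key:== value"))
print(operator_tokens("a |== b"))
# With a space, the intended ':' or '|' is followed by a separate '=='.
print(operator_tokens("key: == value"))
print(operator_tokens("a | == b"))
```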
Rather than introducing a completely new symbol, Steven's proposed resolution to
this verbosity problem was to retain the ability to omit the prefix marker in
@ -563,31 +581,58 @@ pattern matching syntax held for this proposal as well, and so the PEP was
amended accordingly.
Retaining ``_`` as the wildcard pattern marker
----------------------------------------------
Using ``__`` as the wildcard pattern marker
-------------------------------------------
PEP 635 makes a solid case that introducing ``?`` *solely* as a wildcard pattern
marker would be a bad idea. With the syntax for constraint patterns now changed
to use existing comparison operations rather than ``?`` and ``?is``, that
argument holds for this PEP as well.
However, this PEP also proposes adopting an implementation technique that limits
the scope of the associated special casing of ``_`` to the parser: defining a
However, as noted by Thomas Wouters in [6]_, PEP 634's choice of ``_`` remains
problematic as it would likely mean that match patterns would have a *permanent*
difference from all other parts of Python - the use of ``_`` in software
internationalisation and at the interactive prompt means that there isn't really
a plausible path towards using it as a general purpose "skipped binding" marker.
``__`` is an alternative "this value is not needed" marker drawn from a Stack
Overflow answer [7]_ (originally posted by the author of this PEP) on the
various meanings of ``_`` in existing Python code.
This PEP also proposes adopting an implementation technique that limits
the scope of the associated special casing of ``__`` to the parser: defining a
new AST node type (``SkippedBinding``) specifically for wildcard markers.
Within the parser, ``_`` would still mean either a regular name or a wildcard
Within the parser, ``__`` would still mean either a regular name or a wildcard
marker in a match pattern depending on where you were in the parse tree, but
within the rest of the compiler, ``Name("_")`` would always be a regular name,
while ``SkippedBinding()`` would always be a wildcard marker (with it being
the responsibility of the AST validator to disallow the use of
``SkippedBinding`` outside match patterns).
within the rest of the compiler, ``Name("__")`` would still be a regular name,
while ``SkippedBinding()`` would always be a wildcard marker.
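The value of a dedicated node type can be illustrated with the standard ``ast``
module. The following is a sketch rather than the draft implementation: the
``SkippedBinding`` class is a local stand-in (the real ``ast`` module has no
such node), and the rewrite is performed by a hypothetical ``NodeTransformer``
on an ordinary assignment target instead of by the parser on a match pattern.

```python
import ast


class SkippedBinding(ast.expr):
    """Local stand-in for the proposed node; not part of the real ast module."""

    _fields = ()


class MarkWildcards(ast.NodeTransformer):
    """Swap binding uses of ``__`` for SkippedBinding nodes (hypothetical pass)."""

    def visit_Name(self, node):
        if node.id == "__" and isinstance(node.ctx, ast.Store):
            return ast.copy_location(SkippedBinding(), node)
        return node


def binding_kinds(source):
    """Return the node class name of each element in a tuple assignment target."""
    tree = MarkWildcards().visit(ast.parse(source))
    target = tree.body[0].targets[0]
    return [type(elt).__name__ for elt in target.elts]


print(binding_kinds("first, __, last = triple"))
```

Once the rewrite has happened, later passes can tell a wildcard from a capture
with a plain ``isinstance`` check and never need to compare identifier strings.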
It may also make sense to consider a future proposal that further changes ``_``
to also skip binding when it's used as part of an iterable unpacking target, but
that's entirely out of the scope of the pattern matching discussion (and would
require careful review of how the routine uses of assignment to ``_`` in
internationalisation use cases and Python interactive prompt implementations
are handled).
Unlike ``_``, the lack of other use cases for ``__`` means that there would be
a plausible path towards restoring identifier handling consistency with the rest
of the language by making it mean "skip this name binding" everywhere in Python:
* in the interpreter itself, deprecate loading variables with the name ``__``.
This would make reading from ``__`` emit a deprecation warning, while writing
to it would initially be unchanged. To avoid slowing down all name loads, this
could be handled by having the compiler emit additional code for the
deprecated name, rather than using a runtime check in the standard name
loading opcodes.
* after a suitable number of releases, change the parser to emit
``SkippedBinding`` for all uses of ``__`` as an assignment target, not just
those appearing inside match patterns
* consider making ``__`` a true hard keyword rather than a soft keyword
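The first step relies on reads of ``__`` being identifiable at compile time,
without any runtime check. A minimal sketch of that detection using the
standard ``ast`` module follows; the function name ``deprecated_reads`` is
hypothetical, and a real implementation would emit warning code from the
compiler rather than return line numbers.

```python
import ast


def deprecated_reads(source):
    """Return the line numbers where ``__`` is read rather than bound."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Name)
        and node.id == "__"
        and isinstance(node.ctx, ast.Load)
    )


example = "__, last = pair\nprint(__)\n"
print(deprecated_reads(example))
```

Only the read on the second line is reported; the write on the first line is
left alone, matching the proposal that writes would initially be unchanged.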
This deprecation path couldn't be followed for ``_``, as there's no way for the
interpreter to distinguish between attempts to read back ``_`` when nominally
used as a "don't care" marker, and legitimate reads of ``_`` as either an
i18n text translation function or as the last statement result at the
interactive prompt.
Names starting with double-underscores are also already reserved for use by the
language, whether that is for compile time constants (e.g. ``__debug__``),
special methods, or class attribute name mangling, so using ``__`` here would
be consistent with that existing approach.
Keeping inferred equality constraints
@ -786,14 +831,16 @@ The syntax used for equality and identity constraints would be straightforward
to extend to containment checks: ``in container``.
One downside of the proposals in both this PEP and PEP 634 is that checking
for multiple values in the same case is quite verbose::
for multiple values in the same case doesn't look like any existing set
membership check in Python::
# PEP 634's literal patterns / this PEP's inferred constraints
match value:
case 0 | 1 | 2 | 3:
...
Explicit equality constraints are even worse::
Explicit equality constraints also become quite verbose if they need to be
repeated::
match value:
case == one | == two | == three | == four:
@ -870,7 +917,7 @@ This PEP proposes skipping requiring any such workarounds, and instead
supporting arbitrary value constraints from the start::
match expr:
case (_, == func()):
case (__, == func()):
... # Handle the case where 'expr == func()'
Whether actually writing that kind of code is a good idea would be a topic for
@ -974,9 +1021,6 @@ constraint operators in the AST. The draft implementation adds them as new
cases on the existing ``UnaryOp`` node, but there's an argument to be made that
they would be better implemented as a new ``Constraint`` node, since they're
accepted at different points in the syntax tree than other unary operators.
Making them a new node type would also allow an attribute to be added that
marked them as implicit or explicit nodes, which ``ast.unparse`` could use
to make the unparsed code look more like original.
Acknowledgments
@ -999,6 +1043,13 @@ to obtain the same level of brevity as PEP 634 in most situations. (Paul
Sokolovsky also independently suggested using ``==`` instead of ``?`` as a
more easily understood prefix for equality constraints).
Thomas Wouters, whose publication of PEP 640 and public review of the structured
pattern matching proposals persuaded the author of this PEP to continue
advocating for a wildcard pattern syntax that a future PEP could plausibly turn
into a hard keyword that always skips binding a reference in any location a
simple name is expected, rather than continuing indefinitely as the match
pattern specific soft keyword that is proposed here.
References
==========
@ -1018,6 +1069,12 @@ References
.. [5] Steven D'Aprano's cogent criticism of the first published iteration of this PEP
https://mail.python.org/archives/list/python-dev@python.org/message/BTHFWG6MWLHALOD6CHTUFPHAR65YN6BP/
.. [6] Thomas Wouters' initial review of the structured pattern matching proposals
https://mail.python.org/archives/list/python-dev@python.org/thread/4SBR3J5IQUYE752KR7C6432HNBSYKC5X/
.. [7] Stack Overflow answer regarding the use cases for ``_`` as an identifier
https://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-single-underscore-variable-in-python/5893946#5893946
.. _Appendix A:
@ -1057,9 +1114,9 @@ Notation used beyond standard EBNF is as per PEP 534:
| mapping_pattern
| class_pattern
capture_pattern: !"_" NAME !('.' | '(' | '=')
capture_pattern: !"__" NAME !('.' | '(' | '=')
wildcard_pattern: "_"
wildcard_pattern: "__"
constraint_pattern:
| eq_constraint