PEP 642: Prepare for initial publication (#1698)

* Set initial post date
* Remove most of the caveats on the reference implementation
  (the PEP 634 test cases have been adjusted as necessary to
  pass, and new PEP 642 specific test cases added)
* Additional subsection in Deferred Ideas for negated constraints
* Add notes on PEP 635's concerns about potential semantic confusion
  between assignment targets and pattern matching
* Assorted final pre-publication tweaks
Nick Coghlan 2020-10-31 16:14:04 +10:00 committed by GitHub
parent 9128bcd1b3
commit e772923bb0
1 changed files with 123 additions and 33 deletions


@ -11,7 +11,7 @@ Content-Type: text/x-rst
Requires: 634
Created: 26-Sep-2020
Python-Version: 3.10
Post-History:
Post-History: 31-Oct-2020
Resolution:
Abstract
@ -49,10 +49,6 @@ pattern syntax:
* The ``_`` identifier is no longer syntactically special (it is a normal
capture pattern, just as it is an ordinary assignment target)
Note: the reference implementation for this PEP is being built on the reference
implementation for PEP 634. Once the implementation reaches a usable state,
the PEP will be published to python-dev and discuss.python.org.
Relationship with other PEPs
============================
@ -336,16 +332,40 @@ exceptions for cases where the right hand side either wasn't a mapping (throwing
have the specific values for the given keys (throwing `ValueError`), avoiding
the need to write out that exception raising logic in every case.
PEP 635 raises the concern that enough aspects of pattern matching semantics
will differ from assignment target semantics that pursuing syntactic parallels
will end up creating confusion rather than reducing it. However, the primary
examples cited as potentially causing confusion are exactly those where the
PEP 634 syntax is *already* the same as that for assignment targets: the fact
that case patterns use iterable unpacking syntax, but only match on sequences
(and specifically exclude strings and byte-strings) rather than consuming
arbitrary iterables is an aspect of PEP 634 that this PEP leaves unchanged.
These semantic differences are intrinsic to the nature of pattern matching:
whereas it is reasonable for a one-shot assignment statement to consume a
one-shot iterator, it isn't reasonable to do that in a construct that's
explicitly about matching a given value against multiple potential targets,
making full use of the available runtime type information to ensure those checks
are as side effect free as possible.
It's an entirely orthogonal question to how the distinction is drawn between
capture patterns and patterns that check for expected values (constraint
patterns in this PEP, literal and value patterns in PEP 634), and it's a big
logical leap to take from "these specific semantic differences between iterable
unpacking and sequence matching are needed in order to handle checking against
multiple potential targets" to "we can reuse attribute binding syntax to mean
equality constraints instead and nobody is going to get confused by that".
Interaction with caching of attribute lookups in local variables
----------------------------------------------------------------
The major change between this PEP and PEP 634 is the use of `?EXPR` for value
constraint lookups, rather than `NAME.ATTR`. The main motivation for this is
constraint lookups, rather than ``NAME.ATTR``. The main motivation for this is
to avoid the semantic conflict with regular assignment targets, where
`NAME.ATTR` is already used in assignment statements to set attributes.
``NAME.ATTR`` is already used in assignment statements to set attributes.
However, even within match statements themselves, the `name.attr` syntax for
However, even within match statements themselves, the ``name.attr`` syntax for
value patterns has an undesirable interaction with local variable assignment,
where routine refactorings that would be semantically neutral for any other
Python statement introduce a major semantic change when applied to a match
@ -541,7 +561,7 @@ No special casing for ``?None``, ``?True``, and ``?False``
This PEP follows PEP 622 in treating ``None``, ``True`` and ``False`` like any other
value constraint, and comparing them by equality, rather than following PEP
634 in proposing that these values (and only these values) be handled specially
634 in proposing that these literals (and only these literals) be handled specially
and compared via identity.
While writing ``x is None`` is a common (and PEP 8 recommended) practice, nobody
@ -588,6 +608,30 @@ code path for exact identity matches on arbitrary objects::
Deferred Ideas
==============
Allowing negated constraints in match patterns
----------------------------------------------
The requirement that constraint expressions be primary expressions means that
it isn't permitted to write ``?not expr`` or ``?is not expr``.
Both of these forms have reasonably clear potential interpretations as a
negated equality constraint (i.e. ``x != expr``) and a negated identity
constraint (i.e. ``x is not expr``).
However, it's far from clear either form would come up often enough to justify
the dedicated syntax, so the extension has been deferred pending further
community experience with match statements.
Note: the compiler can't enforce the primary expression restriction when asked
to compile an AST directly, as parentheses used purely for grouping are
lost in the AST generation process. This means the permitted ``?(not expr)``
generates the same AST as the syntactically disallowed ``?not expr`` would.
That isn't a problem though, as in the hypothetical future where this feature
was implemented, ``?not expr`` wouldn't generate the same AST as ``?(not expr)``,
it would generate a new AST node that indicated the use of a negated equality
constraint pattern.
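The loss of grouping parentheses can be checked with the standard ``ast`` module on ordinary expressions. This sketch uses plain ``not expr`` rather than the proposed ``?`` syntax, which existing parsers don't accept.

```python
import ast

# Parse the same expression with and without grouping parentheses.
grouped = ast.parse("(not expr)", mode="eval")
bare = ast.parse("not expr", mode="eval")

# ast.dump() omits position attributes by default, so identical trees
# produce identical dumps.
same_tree = ast.dump(grouped) == ast.dump(bare)

# In both cases the root expression is a UnaryOp wrapping a Not operator.
root_is_unary_not = isinstance(grouped.body, ast.UnaryOp) and isinstance(
    grouped.body.op, ast.Not
)
```

Both parses produce the same tree, which is why a restriction based purely on the presence of parentheses can only be enforced by the parser and not by a later compilation step.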
Allowing containment checks in match patterns
---------------------------------------------
@ -632,16 +676,19 @@ Rejected Ideas
Restricting permitted expressions in constraint patterns and mapping pattern keys
---------------------------------------------------------------------------------
While it's entirely technical possible to restrict the kinds of expressions
While it's entirely technically possible to restrict the kinds of expressions
permitted in constraint patterns and mapping pattern keys to just attribute
lookups (as PEP 634 does), there isn't any clear runtime value in doing so,
so the PEP proposes allowing any kind of primary expression (primary
expressions are an existing node type in the grammar that includes things like
literals, names, attribute lookups, function calls, container subscripts, etc).
lookups and constant literals (as PEP 634 does), there isn't any clear runtime
value in doing so, so this PEP proposes allowing any kind of primary expression
(primary expressions are an existing node type in the grammar that includes
things like literals, names, attribute lookups, function calls, container
subscripts, parenthesised groups, etc).
While PEP 635 does emphasise several times that literal patterns and value
patterns are not full expressions, it doesn't ever articulate a concrete benefit
that is obtained from that restriction.
that is obtained from that restriction (just a theoretical appeal to it being
useful to separate static checks from dynamic checks, which a code style
tool could still enforce, even if the compiler itself is more permissive).
The last time we imposed such a restriction was for decorator expressions and
the primary outcome of that was that users had to put up with years of awkward
@ -652,7 +699,7 @@ let users make their own decisions about readability.
The situation in PEP 634 that bears a resemblance to the situation with decorator
expressions is that arbitrary expressions are technically supported in value
patterns, they just require an awkward workaround where all the values to
patterns, they just require awkward workarounds where either all the values to
match need to be specified in a helper class that is placed before the match
statement::
@ -660,8 +707,15 @@ statement::
class mt:
value = func()
match expr:
case mt.value:
... # Handle the case where 'expr == func()'
case (?, mt.value):
... # Handle the case where 'expr[1] == func()'
Or else they need to be written as a combination of a capture pattern and a
guard expression::
match expr:
case (?, _matched) if _matched == func():
... # Handle the case where 'expr[1] == func()'
This PEP proposes skipping requiring any such workarounds, and instead
supporting arbitrary value constraints from the start::
@ -677,6 +731,28 @@ In particular, if static analysers can't follow certain kinds of dynamic checks,
then they can limit the permitted expressions at analysis time, rather than the
compiler restricting them at compile time.
There are also some kinds of expressions that are almost certain to give
nonsensical results (e.g. ``yield``, ``yield from``, ``await``) due to the
pattern caching rule, where the number of times the constraint expression
actually gets evaluated will be implementation dependent. Even here, the PEP
takes the view of letting users write nonsense if they really want to.
Aside from the recently updated decorator expressions, another situation where
Python's formal syntax offers full freedom of expression that is almost never
used in practice is in ``except`` clauses: the exceptions to match against
almost always take the form of a simple name, a dotted name, or a tuple of
those, but the language grammar permits arbitrary expressions at that point.
This is a good indication that Python's user base can be trusted to
take responsibility for finding readable ways to use permissive language
features, by avoiding writing hard to read constructs even when they're
permitted by the compiler.
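As a small demonstration of that grammar freedom, the sketch below computes the exceptions to catch with a function call. The ``retryable_errors`` and ``lookup`` helpers are hypothetical names used only for illustration.

```python
def retryable_errors():
    """Hypothetical helper that computes the exceptions to catch."""
    return (KeyError, IndexError)


def lookup(container, key):
    try:
        return container[key]
    except retryable_errors():  # an arbitrary expression, evaluated at runtime
        return None
```

This runs without complaint, yet code like it is rare in practice, which is the point being made about users choosing readable forms on their own.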
This permissiveness comes with a real concrete benefit on the implementation
side: dozens of lines of match statement specific code in the compiler are
replaced by simple calls to the existing code for compiling expressions. This
implementation benefit would accrue not just to CPython, but to every other
Python implementation looking to add match statement support.
Keeping literal patterns
------------------------
@ -684,12 +760,14 @@ Keeping literal patterns
An early (not widely publicised) draft of this proposal considered keeping
PEP 634's literal patterns, as they don't inherently conflict with assignment
statement syntax the way that PEP 634's value patterns do (trying to assign
to a literal is already a syntax error).
to a literal is already a syntax error, whereas assigning to a dotted name
sets the attribute).
They were subsequently removed (and replaced by identity constraints) due to
the fact that they have the same syntax sensitivity problem as value patterns
do, where attempting to move the literal pattern out to a local variable for
naming clarity would turn the match pattern into a capture pattern::
They were subsequently removed (replaced by the combination of equality and
identity constraints) due to the fact that they have the same syntax
sensitivity problem as value patterns do, where attempting to move the
literal pattern out to a local variable for naming clarity would turn the
value checking literal pattern into a name binding capture pattern::
# PEP 634's literal pattern syntax
match expr:
@ -705,7 +783,8 @@ naming clarity would turn the match pattern into a capture pattern::
case _:
... # Handle the non-matching case
With equality constraints, this refactoring keeps the original semantics::
With equality constraints, this style of refactoring keeps the original
semantics (just as it would for a value lookup in any other statement)::
# This PEP's equality constraints
match expr:
@ -761,25 +840,35 @@ PEP 634's ``BASE.ATTR as NAME``.
This idea was dropped as it complicated the grammar for no gain in
expressiveness over just using the general purpose approach to combining
capture patterns with other match patterns (i.e. ``?EXPR as NAME``) when the
identity of the matched object is important.
identity of the matching object is important.
Reference Implementation
========================
A reference implementation for this PEP [3_] is being derived from Brandt
A reference implementation for this PEP [3_] has been derived from Brandt
Bucher's reference implementation for PEP 634 [4_].
Relative to the text of this PEP, the draft reference implementation currently
retains literal patterns as implemented for PEP 634. Removing them will be
a matter of deleting the code out of the compiler, and then adding either
``?`` or ``?is`` as necessary to the test cases that no longer compile. This
removal isn't necessary to show that the PEP's proposal is feasible, so that
work has been deferred for now.
retains literal patterns mostly as implemented for PEP 634, except that the
special casing of ``None``, ``True``, and ``False`` has been removed (with
``PEP 642 TODO`` notes added to the code that can be deleted once these patterns
are dropped entirely).
Value patterns, wildcard patterns, and mapping patterns are all being updated
Value patterns, wildcard patterns, and mapping patterns have been updated
to follow this PEP rather than PEP 634.
Removing literal patterns will be a matter of deleting the code out of the
compiler, and then adding either ``?`` or ``?is`` as necessary to the test
cases that no longer compile. This removal isn't necessary to show that the
PEP's syntax proposal is feasible, so that work has been deferred for now.
There will also be an implementation decision to be made around representing
constraint operators in the AST. The draft implementation adds them as new
cases on the existing ``UnaryOp`` node, but it would potentially be better to
implement them as a new ``Constraint`` node, since they're accepted at
different points in the syntax tree than other unary operators.
Acknowledgments
===============
@ -789,7 +878,8 @@ an attempt to improve the readability of an already well-constructed idea by
proposing that one of the key new concepts in that proposal (the ability to
express value constraints in a name binding target) is sufficiently notable
to be worthy of using up one of the few remaining unused ASCII punctuation
characters in Python's syntax.
characters in Python's syntax instead of reusing the existing attribute binding
syntax to mean an attribute lookup.
References