PEP 642: Prepare for initial publication (#1698)

* Set initial post date
* Remove most of the caveats on the reference implementation (the PEP 634 test cases have been adjusted as necessary to pass, and new PEP 642 specific test cases added)
* Additional subsection in Deferred Ideas for negated constraints
* Add notes on PEP 635's concerns about potential semantic confusion between assignment targets and pattern matching
* Assorted final pre-publication tweaks
2020-10-31 16:14:04 +10:00
parent 9128bcd1b3
commit e772923bb0
1 changed file with 123 additions and 33 deletions
--- a/pep-0642.rst
+++ b/pep-0642.rst
@@ -11,7 +11,7 @@ Content-Type: text/x-rst
 Requires: 634
 Created: 26-Sep-2020
 Python-Version: 3.10
-Post-History:
+Post-History: 31-Oct-2020
 Resolution:

 Abstract
@@ -49,10 +49,6 @@ pattern syntax:
 * The ``_`` identifier is no longer syntactically special (it is a normal
  capture pattern, just as it is an ordinary assignment target)

-Note: the reference implementation for this PEP is being built on the reference
-implementation for PEP 634. Once the implementation reaches a usable state,
-the PEP will be published to python-dev and discuss.python.org.
-

 Relationship with other PEPs
 ============================
@@ -336,16 +332,40 @@ exceptions for cases where the right hand side either wasn't a mapping (throwing
 have the specific values for the given keys (throwing `ValueError`), avoiding
 the need to write out that exception raising logic in every case.

+PEP 635 raises the concern that enough aspects of pattern matching semantics
+will differ from assignment target semantics that pursuing syntactic parallels
+will end up creating confusion rather than reducing it. However, the primary
+examples cited as potentially causing confusion are exactly those where the
+PEP 634 syntax is *already* the same as that for assignment targets: the fact
+that case patterns use iterable unpacking syntax, but only match on sequences
+(and specifically exclude strings and byte-strings) rather than consuming
+arbitrary iterables is an aspect of PEP 634 that this PEP leaves unchanged.
+
+These semantic differences are intrinsic to the nature of pattern matching:
+whereas it is reasonable for a one-shot assignment statement to consume a
+one-shot iterator, it isn't reasonable to do that in a construct that's
+explicitly about matching a given value against multiple potential targets,
+making full use of the available runtime type information to ensure those checks
+are as side effect free as possible.
+
+It's an entirely orthogonal question to how the distinction is drawn between
+capture patterns and patterns that check for expected values (constraint
+patterns in this PEP, literal and value patterns in PEP 634), and it's a big
+logical leap to take from "these specific semantic differences between iterable
+unpacking and sequence matching are needed in order to handle checking against
+multiple potential targets" to "we can reuse attribute binding syntax to mean
+equality constraints instead and nobody is going to get confused by that".
+

 Interaction with caching of attribute lookups in local variables
 ----------------------------------------------------------------

 The major change between this PEP and PEP 634 is the use of `?EXPR` for value
-constraint lookups, rather than `NAME.ATTR`. The main motivation for this is
+constraint lookups, rather than ``NAME.ATTR``. The main motivation for this is
 to avoid the semantic conflict with regular assignment targets, where
-`NAME.ATTR` is already used in assignment statements to set attributes.
+``NAME.ATTR`` is already used in assignment statements to set attributes.

-However, even within match statements themselves, the `name.attr` syntax for
+However, even within match statements themselves, the ``name.attr`` syntax for
 value patterns has an undesirable interaction with local variable assignment,
 where routine refactorings that would be semantically neutral for any other
 Python statement introduce a major semantic change when applied to a match
@@ -541,7 +561,7 @@ No special casing for ``?None``, ``?True``, and ``?False``

 This PEP follows PEP 622 in treating ``None``, ``True`` and ``False`` like any other
 value constraint, and comparing them by equality, rather than following PEP
-634 in proposing that these values (and only these values) be handled specially
+634 in proposing that these literals (and only these literals) be handled specially
 and compared via identity.

 While writing ``x is None`` is a common (and PEP 8 recommended) practice, nobody
@@ -588,6 +608,30 @@ code path for exact identity matches on arbitrary objects::
 Deferred Ideas
 ==============

+Allowing negated constraints in match patterns
+----------------------------------------------
+
+The requirement that constraint expressions be primary expressions means that
+it isn't permitted to write ``?not expr`` or ``?is not expr``.
+
+Both of these forms have reasonably clear potential interpretations as a
+negated equality constraint (i.e. ``x != expr``) and a negated identity
+constraint (i.e. ``x is not expr``).
+
+However, it's far from clear either form would come up often enough to justify
+the dedicated syntax, so the extension has been deferred pending further
+community experience with match statements.
+
+Note: the compiler can't enforce the primary expression restriction when asked
+to compile an AST directly, as parentheses used purely for grouping are
+lost in the AST generation process. This means the permitted ``?(not expr)``
+generates the same AST as the syntactically disallowed ``?not expr`` would.
+That isn't a problem though, as in the hypothetical future where this feature
+was implemented, ``?not expr`` wouldn't generate the same AST as ``?(not expr)``;
+it would instead generate a new AST node that indicated the use of a negated
+equality constraint pattern.
+
+
 Allowing containment checks in match patterns
 ---------------------------------------------

@@ -632,16 +676,19 @@ Rejected Ideas
 Restricting permitted expressions in constraint patterns and mapping pattern keys
 ---------------------------------------------------------------------------------

-While it's entirely technical possible to restrict the kinds of expressions
+While it's entirely technically possible to restrict the kinds of expressions
 permitted in constraint patterns and mapping pattern keys to just attribute
-lookups (as PEP 634 does), there isn't any clear runtime value in doing so,
-so the PEP proposes allowing any kind of primary expression (primary
-expressions are an existing node type in the grammar that includes things like
-literals, names, attribute lookups, function calls, container subscripts, etc).
+lookups and constant literals (as PEP 634 does), there isn't any clear runtime
+value in doing so, so this PEP proposes allowing any kind of primary expression
+(primary expressions are an existing node type in the grammar that includes
+things like literals, names, attribute lookups, function calls, container
+subscripts, parenthesised groups, etc).

 While PEP 635 does emphasise several times that literal patterns and value
 patterns are not full expressions, it doesn't ever articulate a concrete benefit
-that is obtained from that restriction.
+that is obtained from that restriction (just a theoretical appeal to it being
+useful to separate static checks from dynamic checks, which a code style
+tool could still enforce, even if the compiler itself is more permissive).

 The last time we imposed such a restriction was for decorator expressions and
 the primary outcome of that was that users had to put up with years of awkward
@@ -652,7 +699,7 @@ let users make their own decisions about readability.

 The situation in PEP 634 that bears a resemblance to the situation with decorator
 expressions is that arbitrary expressions are technically supported in value
-patterns, they just require an awkward workaround where all the values to
+patterns, they just require awkward workarounds where either all the values to
 match need to be specified in a helper class that is placed before the match
 statement::

@@ -660,8 +707,15 @@ statement::
    class mt:
        value = func()
    match expr:
-        case mt.value:
-            ... # Handle the case where 'expr == func()'
+        case (?, mt.value):
+            ... # Handle the case where 'expr[1] == func()'
+
+Or else they need to be written as a combination of a capture pattern and a
+guard expression::
+
+    match expr:
+        case (?, _matched) if _matched == func():
+            ... # Handle the case where 'expr[1] == func()'

 This PEP proposes skipping requiring any such workarounds, and instead
 supporting arbitrary value constraints from the start::
@@ -677,6 +731,28 @@ In particular, if static analysers can't follow certain kinds of dynamic checks,
 then they can limit the permitted expressions at analysis time, rather than the
 compiler restricting them at compile time.

+There are also some kinds of expressions that are almost certain to give
+nonsensical results (e.g. ``yield``, ``yield from``, ``await``) due to the
+pattern caching rule, where the number of times the constraint expression
+actually gets evaluated will be implementation dependent. Even here, the PEP
+takes the view of letting users write nonsense if they really want to.
+
+Aside from the recently updated decorator expressions, another situation where
+Python's formal syntax offers full freedom of expression that is almost never
+used in practice is in ``except`` clauses: the exceptions to match against
+almost always take the form of a simple name, a dotted name, or a tuple of
+those, but the language grammar permits arbitrary expressions at that point.
+This is a good indication that Python's user base can be trusted to
+take responsibility for finding readable ways to use permissive language
+features, by avoiding writing hard to read constructs even when they're
+permitted by the compiler.
+
+This permissiveness comes with a real concrete benefit on the implementation
+side: dozens of lines of match statement specific code in the compiler are
+replaced by simple calls to the existing code for compiling expressions. This
+implementation benefit would accrue not just to CPython, but to every other
+Python implementation looking to add match statement support.
+

 Keeping literal patterns
 ------------------------
@@ -684,12 +760,14 @@ Keeping literal patterns
 An early (not widely publicised) draft of this proposal considered keeping
 PEP 634's literal patterns, as they don't inherently conflict with assignment
 statement syntax the way that PEP 634's value patterns do (trying to assign
-to a literal is already a syntax error).
+to a literal is already a syntax error, whereas assigning to a dotted name
+sets the attribute).

-They were subsequently removed (and replaced by identity constraints) due to
-the fact that they have the same syntax sensitivity problem as value patterns
-do, where attempting to move the literal pattern out to a local variable for
-naming clarity would turn the match pattern into a capture pattern::
+They were subsequently removed (replaced by the combination of equality and
+identity constraints) due to the fact that they have the same syntax
+sensitivity problem as value patterns do, where attempting to move the
+literal pattern out to a local variable for naming clarity would turn the
+value checking literal pattern into a name binding capture pattern::

    # PEP 634's literal pattern syntax
    match expr:
@@ -705,7 +783,8 @@ naming clarity would turn the match pattern into a capture pattern::
        case _:
            ... # Handle the non-matching case

-With equality constraints, this refactoring keeps the original semantics::
+With equality constraints, this style of refactoring keeps the original
+semantics (just as it would for a value lookup in any other statement)::

    # This PEP's equality constraints
    match expr:
@@ -761,25 +840,35 @@ PEP 634's ``BASE.ATTR as NAME``.
 This idea was dropped as it complicated the grammar for no gain in
 expressiveness over just using the general purpose approach to combining
 capture patterns with other match patterns (i.e. ``?EXPR as NAME``) when the
-identity of the matched object is important.
+identity of the matching object is important.


 Reference Implementation
 ========================

-A reference implementation for this PEP [3_] is being derived from Brandt
+A reference implementation for this PEP [3_] has been derived from Brandt
 Bucher's reference implementation for PEP 634 [4_].

 Relative to the text of this PEP, the draft reference implementation currently
-retains literal patterns as implemented for PEP 634. Removing them will be
-a matter of deleting the code out of the compiler, and then adding either
-``?`` or ``?is`` as necessary to the test cases that no longer compile. This
-removal isn't necessary to show that the PEP's proposal is feasible, so that
-work has been deferred for now.
+retains literal patterns mostly as implemented for PEP 634, except that the
+special casing of ``None``, ``True``, and ``False`` has been removed (with
+``PEP 642 TODO`` notes added to the code that can be deleted once these patterns
+are dropped entirely).

-Value patterns, wildcard patterns, and mapping patterns are all being updated
+Value patterns, wildcard patterns, and mapping patterns have been updated
 to follow this PEP rather than PEP 634.

+Removing literal patterns will be a matter of deleting the code out of the
+compiler, and then adding either ``?`` or ``?is`` as necessary to the test
+cases that no longer compile. This removal isn't necessary to show that the
+PEP's syntax proposal is feasible, so that work has been deferred for now.
+
+There will also be an implementation decision to be made around representing
+constraint operators in the AST. The draft implementation adds them as new
+cases on the existing ``UnaryOp`` node, but it would potentially be better to
+implement them as a new ``Constraint`` node, since they're accepted at
+different points in the syntax tree than other unary operators.
+

 Acknowledgments
 ===============
@@ -789,7 +878,8 @@ an attempt to improve the readability of an already well-constructed idea by
 proposing that one of the key new concepts in that proposal (the ability to
 express value constraints in a name binding target) is sufficiently notable
 to be worthy of using up one of the few remaining unused ASCII punctuation
-characters in Python's syntax.
+characters in Python's syntax instead of reusing the existing attribute binding
+syntax to mean an attribute lookup.


 References