diff --git a/pep-0642.rst b/pep-0642.rst index a574cc651..eb6cd76a6 100644 --- a/pep-0642.rst +++ b/pep-0642.rst @@ -19,7 +19,8 @@ Abstract This PEP covers an alternative syntax proposal for PEP 634's structural pattern matching that explicitly anchors match expressions in the existing syntax for -assignment targets, while retaining the semantic aspects of the existing proposal. +assignment targets, while retaining most semantic aspects of the existing +proposal. Specifically, this PEP adopts an additional design restriction that PEP 634's authors considered unreasonable: that any syntax that is common to both @@ -32,17 +33,21 @@ pattern syntax: * Literal patterns and value patterns are combined into a single new pattern type: "constraint patterns" -* Constraint patterns use `?` as a prefix marker on an otherwise arbitrary - primary expression: `?EXPR` -* There is no special casing of the `None`, `True`, or `False` literals -* The constraint expression may be omitted to give a non-binding wildcard pattern +* Constraint patterns are either equality constraints or identity constraints +* Equality constraints use ``?`` as a prefix marker on an otherwise + arbitrary primary expression: ``?EXPR`` +* Identity constraints use ``?is`` as a prefix marker on an otherwise + arbitrary primary expression: ``?is EXPR`` +* There is no special casing of the ``None``, ``True``, or ``False`` literals +* The constraint expression in an equality constraint may be omitted to give a + non-binding wildcard pattern * Mapping patterns change to allow arbitrary primary expressions as keys * Attempting to use a dotted name as a match pattern is a syntax error rather - than implying a value constraint + than implying an equality constraint * Attempting to use a literal as a match pattern is a syntax error rather - than implying a value constraint -* The `_` identifier is no longer syntactically special (it is a normal capture - pattern, just as it is an ordinary assignment target) + than
implying an equality or identity constraint +* The ``_`` identifier is no longer syntactically special (it is a normal + capture pattern, just as it is an ordinary assignment target) Note: the reference implementation for this PEP is being built on the reference implementation for PEP 634. Once the implementation reaches a usable state, @@ -55,9 +60,9 @@ Relationship with other PEPs This PEP both depends on and competes with PEP 634 - the PEP author agrees that match statements would be a sufficiently valuable addition to the language to be worth the additional complexity that they add to the learning process, but -disagrees with the idea that "simple name vs attribute lookup" offers an -adequate syntactic distinction between name binding and value lookup operations -in match patterns. +disagrees with the idea that "simple name vs literal or attribute lookup" offers +an adequate syntactic distinction between name binding and value lookup +operations in match patterns. By switching the wildcard pattern to "?", this PEP complements the proposal in PEP 640 to allow the use of wildcard patterns in other contexts where a name @@ -73,7 +78,7 @@ incorporated an unstated but essential assumption in its syntax design: that neither ordinary expressions *nor* the existing assignment target syntax provide an adequate foundation for the syntax used in match patterns. -While the PEP doesn't explicitly state this assumption, one of the PEP authors +While the PEP didn't explicitly state this assumption, one of the PEP authors explained it clearly on python-dev [1_]: The actual problem that I see is that we have different cultures/intuitions @@ -116,7 +121,13 @@ differences that emerge relative to the syntactic proposal in PEP 634 are: * a requirement to use an explicit marker prefix on value lookups rather than allowing them to be implied by the use of dotted names; and -* a requirement to use a non-binding wildcard marker other than `_`. 
+* a requirement to use a non-binding wildcard marker other than ``_``. + +PEP 634 also proposes special casing the literals ``None``, ``True``, and +``False`` so that they're compared by identity when written directly as a +literal pattern, but by equality when referenced by a value pattern. This PEP +eliminates those special cases by proposing distinct syntax for matching by +identity and matching by equality. Specification @@ -164,12 +175,19 @@ Constraint patterns Constraint patterns use the following simplified syntax:: - constraint_pattern: '?' primary + constraint_pattern: id_constraint | eq_constraint + id_constraint: '?' 'is' primary + eq_constraint: '?' primary The constraint expression is an arbitrary primary expression - it can be a simple name, a dotted name lookup, a literal, a function call, or any other primary expression. +While the compiler would allow whitespace between ``?`` and ``is`` in +identity constraints (as they're defined as separate tokens), this PEP +proposes that PEP 8 be updated to recommend writing them like ``?is``, as if +they were a combined unary operator. + If this PEP were to be adopted in preference to PEP 634, then all literal and value patterns would instead be written as constraint patterns:: @@ -201,6 +219,18 @@ value patterns would instead be written as constraint patterns:: case ?...: print("May be useful when writing __getitem__ methods?") + # Matching by identity rather than equality + SENTINEL = object() + match value: + case ?is True: + print("True, not 1") + case ?is False: + print("False, not 0") + case ?is None: + print("None, following PEP 8 comparison guidelines") + case ?is SENTINEL: + print("Matches the sentinel by identity, not just value") + # Constant value patterns from enum import Enum class Sides(str, Enum): @@ -230,7 +260,6 @@ clauses. 
(This implicit caching is less necessary in this PEP, given that explicit local variable caching becomes a valid option, but it still seems a useful property to preserve) - Mapping patterns ---------------- @@ -270,6 +299,10 @@ to match mapping keys:: case {ADDRESS_KEY: address, PORT_KEY: port}: process_address(address, port) +Note: as complex literals are written as binary operations that are evaluated +at compile time, this PEP requires that they be written in parentheses when +used as a key in a mapping pattern. + Design Discussion ================= @@ -381,7 +414,7 @@ a syntax error if the binding target isn't a simple name). PEP 622's walrus pattern syntax had another odd interaction where it might not bind the same object as the exact same walrus expression in the body of the -case clause, but PEP 634 fixed that disrepancy by replacing walrus patterns +case clause, but PEP 634 fixed that discrepancy by replacing walrus patterns with AS patterns (where the fact that the value bound to the name on the RHS might not be the same value as returned by the LHS is a standard feature common to all uses of the "as" keyword). @@ -399,11 +432,11 @@ is desirable for brevity. Most potential candidates are already used in Python for another unrelated purpose, or would integrate poorly with other aspects of the pattern matching -syntax (e.g. `=` or `==` have multiple problems along those lines, in particular -in the way they would combine with `=` as a keyword separator in class -patterns, or `:` as a key/value separate in mapping patterns). +syntax (e.g. ``=`` or ``==`` have multiple problems along those lines, in particular +in the way they would combine with ``=`` as a keyword separator in class +patterns, or ``:`` as a key/value separator in mapping patterns).
-This PEP proposes `?` as the prefix marker as it isn't currently used in Python's +This PEP proposes ``?`` as the prefix marker as it isn't currently used in Python's core syntax, the proposed usage as a prefix marker won't conflict with its use in other Python related contexts (e.g. looking up object help information in IPython), and there are plausible mnemonics that may help users to *remember* @@ -416,9 +449,9 @@ PEP 635 has a good discussion of the problems with this choice in the context of using it as the wildcard pattern marker: An alternative that does not suggest an arbitrary number of items would - be `?`. This is even being proposed independently from pattern matching in - PEP 640. We feel however that using `?` as a special "assignment" target is - likely more confusing to Python users than using `_`. It violates Python's + be ``?``. This is even being proposed independently from pattern matching in + PEP 640. We feel however that using ``?`` as a special "assignment" target is + likely more confusing to Python users than using ``_``. It violates Python's (admittedly vague) principle of using punctuation characters only in ways similar to how they are used in common English usage or in high school math, unless the usage is very well established in other programming languages @@ -434,7 +467,7 @@ of using it as the wildcard pattern marker: PEP 505). An as yet unnamed PEP proposes it to mark optional types, e.g. int?. - Another common use of ? in programming systems is "help", for example, in + Another common use of ``?`` in programming systems is "help", for example, in IPython and Jupyter Notebooks and many interactive command-line utilities. This PEP takes the view that *not* requiring a marker prefix on value lookups @@ -446,29 +479,29 @@ equivalence aside from the exact relative timing of the attribute lookup. 
Assuming the requirement for a marker prefix is accepted on those grounds, then the syntactic bar to meet isn't "Can users *guess* what the chosen symbol means without anyone ever explaining it to them?" but instead the lower standard -applied when choosing the `@` symbol for both decorator expressions and matrix -multiplication and the `:=` character combination for assignment expressions: +applied when choosing the ``@`` symbol for both decorator expressions and matrix +multiplication and the ``:=`` character combination for assignment expressions: "Can users *remember* what it means once they've had it explained to them at least once?". -This PEP contends that `?` will be able to pass that lower standard, and would +This PEP contends that ``?`` will be able to pass that lower standard, and would pass it even more readily if PEP 640 were also subsequently adopted to allow it as a general purpose non-binding wildcard marker that doesn't conflict with the -use of `_` in application internationalisation use cases. +use of ``_`` in application internationalisation use cases. PEPs proposing additional meanings for this character would need to take the pattern matching meaning into account, but wouldn't necessarily fail purely on -that account (e.g. `@` was adopted as a binary operator for matrix +that account (e.g. ``@`` was adopted as a binary operator for matrix multiplication well after its original adoption as a decorator expression prefix). "Value checking" related use cases such as PEP 505's None-aware operators would likely fare especially well on that front, but each such proposal would continue to be judged on a case-by-case basis. -Using "?" as the wildcard pattern ---------------------------------- +Using ``?`` as the wildcard pattern +----------------------------------- -PEP 635 makes a solid case that introducing "?" *solely* as a wildcard pattern +PEP 635 makes a solid case that introducing ``?`` *solely* as a wildcard pattern marker would be a bad idea. 
Continuing on from the text already quoted in the previous section: @@ -484,33 +517,37 @@ previous section: means that a regular name received special treatment are not strong enough to introduce syntax that would make Python special. -Other languages with pattern matching don't use `?` as the wildcard pattern -(they all use `_`), and without any other usage in Python's syntax, there -wouldn't be any useful prompts to help users remember what `?` means when +Other languages with pattern matching don't use ``?`` as the wildcard pattern +(they all use ``_``), and without any other usage in Python's syntax, there +wouldn't be any useful prompts to help users remember what ``?`` means when they encounter it in a match pattern. -In this PEP, the adoption of "?" as the wildcard pattern marker instead comes +In this PEP, the adoption of ``?`` as the wildcard pattern marker instead comes from asking the question "What does it mean to omit the constraint expression from a constraint pattern?", and concluding that "match any value" is a more -useful definition than reporting a syntax error. +useful definition in most situations than reporting a syntax error. -While making code and concept sharing with other languages easier is a laudable -goal, it isn't like using `_` as a wildcard marker won't *work* - it will just -bind the `_` name, the same as it does in any other Python assignment context. +That said, one possible modification to consider in the name of making code and +concepts easier to share with other languages would be to exempt ``_`` from the +"no repeated names" compiler check. + +With that change, using ``_`` as a wildcard marker would *work* - it would just +also bind the ``_`` name, the same as it does in any other Python assignment +context. 
-No special casing for `?None`, `?True`, and `?False` ----------------------------------------------------- +No special casing for ``?None``, ``?True``, and ``?False`` +---------------------------------------------------------- -This PEP follows PEP 622 in treating `None`, `True` and `False` like any other +This PEP follows PEP 622 in treating ``None``, ``True`` and ``False`` like any other value constraint, and comparing them by equality, rather than following PEP 634 in proposing that these values (and only these values) be handled specially and compared via identity. -While writing `x is None` is a common (and PEP 8 recommended) practice, nobody -litters their `if-elif` chains with `x is True` or `x is False` expressions, -they write `x` and `not x`, both of which compare by value, not identity. -Indeed, PEP 8 explicitly disallows the use "if x is True:" and "if x is False:", +While writing ``x is None`` is a common (and PEP 8 recommended) practice, nobody +litters their ``if``-``elif`` chains with ``x is True`` or ``x is False`` expressions, +they write ``x`` and ``not x``, both of which compare by value, not identity. +Indeed, PEP 8 explicitly disallows the use of ``if x is True:`` and ``if x is False:``, preferring the forms without any comparison operator at all. The key problem with special casing is that it doesn't interact properly with @@ -531,62 +568,67 @@ saved in a variable or attribute:: case self.expected_match: # Set to 'True' somewhere else ... # Handles the case where "expr == True" -However, the explicit prefix syntax proposed in this PEP leaves the door open -to future proposals that would allow for more exact comparisons when desired: +By contrast, the explicit prefix syntax proposed in this PEP makes it +straightforward to include both equality constraints and identity constraints, +allowing users to specify directly in their case clauses whether they want to +match by identity or by value.
-* A version of literal pattern syntax could be reintroduced, such that - `True` checked for `is True` while `?True` checked for `== True` (presumably - accompanied by a PEP 8 update to remove the advice against writing such code - in the first place) -* Constraint expressions could be enhanced such that `==` was just the *default* - comparison operator, and others could be selectively introduced based on - specific use cases (e.g. `case ?is True:`) +This distinction means that case clauses can even be used to provide a dedicated +code path for exact identity matches on arbitrary objects:: -It's also the case that the `bool(True)` and `bool(False)` class patterns would -already exclude truthy-but-not-boolean values, so it isn't at all clear that -any significant expressiveness is gained through these special cases. + match value: + case ?is obj: + ... # Handle being given the exact same object + case ?obj: + ... # Handle being given an equivalent object + case ?: + ... # Handle the non-matching case + + +Deferred Ideas +============== + +Allowing containment checks in match patterns +--------------------------------------------- + +The syntax used for identity constraints would be straightforward to extend to +containment checks: ``?in container``. + +One downside of the proposal in this PEP relative to PEP 634 is that checking +against multiple possible values becomes noticeably more verbose, especially +for literal value checks:: + + # PEP 634 literal pattern + match value: + case 0 | 1 | 2 | 3: + ... + + # This PEP's equality constraints + match value: + case ?0 | ?1 | ?2 | ?3: + ... + +Containment constraints would provide a more concise way to check if the +match subject was present in a container:: + + match value: + case ?in {0, 1, 2, 3}: + ... + case ?in range(4): # It would accept any container, not just literal sets + ...
+ +Such a feature would also be readily extensible to allow all kinds of case +clauses without any further syntax updates, simply by defining ``__contains__`` +appropriately on a custom class definition. + +However, while this does seem like a useful extension, it isn't essential, so +it seems more appropriate to defer it to a separate proposal, rather than +including it here. Rejected Ideas ============== -Providing dedicated syntax for binding matched constraint values ----------------------------------------------------------------- - -The initial (unpublished) draft of this proposal suggested allowing `NAME?EXPR` -as a syntactically unambiguous shorthand for PEP 622's `NAME := BASE.ATTR` or -PEP 634's `BASE.ATTR as NAME`. - -This idea was dropped as it complicated the grammar for no gain in -expressiveness over just using the general purpose approach to combining -capture patterns with other match patterns (i.e. `?EXPR as NAME`) when the -identity of the matched object is important. - - -Requiring the use of constraint prefix markers for mapping pattern keys ------------------------------------------------------------------------ - -The initial (unpublished) draft of this proposal suggested requiring mapping -pattern keys be constraint patterns, just as PEP 634 requires that they be valid -literal or value value patterns:: - - import constants - - match config: - case {?"route": route}: - process_route(route) - case {?constants.DEFAULT_PORT: sub_config, **rest}: - process_config(sub_config, rest) - -However, the extra character is syntactically noisy and unlike its use in -constraint patterns (where it distinguishes them from capture patterns), the -prefix doesn't provide any additional information here that isn't already -conveyed by the expression's position as a key within a mapping pattern. - -Accordingly, the proposal was simplified to omit the marker prefix from mapping -pattern keys. 
- - Restricting permitted expressions in constraint patterns and mapping pattern keys --------------------------------------------------------------------------------- @@ -636,6 +678,92 @@ then they can limit the permitted expressions at analysis time, rather than the compiler restricting them at compile time. +Keeping literal patterns +------------------------ + +An early (not widely publicised) draft of this proposal considered keeping +PEP 634's literal patterns, as they don't inherently conflict with assignment +statement syntax the way that PEP 634's value patterns do (trying to assign +to a literal is already a syntax error). + +They were subsequently removed (and replaced by identity constraints) due to +the fact that they have the same syntax sensitivity problem as value patterns +do, where attempting to move the literal pattern out to a local variable for +naming clarity would turn the match pattern into a capture pattern:: + + # PEP 634's literal pattern syntax + match expr: + case {"port": 443}: + ... # Handle the case where 'expr["port"] == 443' + case _: + ... # Handle the non-matching case + + HTTPS_PORT = 443 + match expr: + case {"port": HTTPS_PORT}: + ... # Matches any mapping with "port", binding its value to HTTPS_PORT + case _: + ... # Handle the non-matching case + +With equality constraints, this refactoring keeps the original semantics:: + + # This PEP's equality constraints + match expr: + case {"port": ?443}: + ... # Handle the case where 'expr["port"] == 443' + case _: + ... # Handle the non-matching case + + HTTPS_PORT = 443 + match expr: + case {"port": ?HTTPS_PORT}: + ... # Handle the case where 'expr["port"] == 443' + case _: + ... 
# Handle the non-matching case + + +Requiring the use of constraint prefix markers for mapping pattern keys +----------------------------------------------------------------------- + +The initial (unpublished) draft of this proposal suggested requiring mapping +pattern keys be constraint patterns, just as PEP 634 requires that they be valid +literal or value patterns:: + + import constants + + match config: + case {?"route": route}: + process_route(route) + case {?constants.DEFAULT_PORT: sub_config, **rest}: + process_config(sub_config, rest) + +However, the extra character is syntactically noisy and unlike its use in +constraint patterns (where it distinguishes them from capture patterns), the +prefix doesn't provide any additional information here that isn't already +conveyed by the expression's position as a key within a mapping pattern. + +Accordingly, the proposal was simplified to omit the marker prefix from mapping +pattern keys. + +This omission also aligns with the fact that containers may incorporate both +identity and equality checks into their lookup process - they don't purely +rely on equality checks, as would be incorrectly implied by the use of the +equality constraint prefix. + + +Providing dedicated syntax for binding matched constraint values +---------------------------------------------------------------- + +The initial (unpublished) draft of this proposal suggested allowing ``NAME?EXPR`` +as a syntactically unambiguous shorthand for PEP 622's ``NAME := BASE.ATTR`` or +PEP 634's ``BASE.ATTR as NAME``. + +This idea was dropped as it complicated the grammar for no gain in +expressiveness over just using the general purpose approach to combining +capture patterns with other match patterns (i.e. ``?EXPR as NAME``) when the +identity of the matched object is important. 
+ + Reference Implementation ======================== @@ -643,13 +771,13 @@ A reference implementation for this PEP [3_] is being derived from Brandt Bucher's reference implementation for PEP 634 [4_]. Relative to the text of this PEP, the draft reference implementation currently -retains literal patterns as implemented for PEP 634 (This PEP only removes -them as redundant given constraint patterns, it doesn't inherently conflict with -them, and both the tutorial in PEP 636 and the pattern matching test suite -suggest that keeping literal patterns might be worthwhile even if the spelling -of value matching patterns is changed). +retains literal patterns as implemented for PEP 634. Removing them will be +a matter of deleting the code out of the compiler, and then adding either +``?`` or ``?is`` as necessary to the test cases that no longer compile. This +removal isn't necessary to show that the PEP's proposal is feasible, so that +work has been deferred for now. -Value patterns, wildcard patterns, and mapping patterns are being updated +Value patterns, wildcard patterns, and mapping patterns are all being updated to follow this PEP rather than PEP 634. @@ -719,7 +847,9 @@ Notation used beyond standard EBNF is as per PEP 534: capture_pattern: NAME !('.' | '(' | '=') - constraint_pattern: '?' primary + constraint_pattern: eq_constraint | id_constraint + id_constraint: '?' 'is' primary + eq_constraint: '?' primary wildcard_pattern: '?'