From 38371b4ece0b699e99e60544961b54dd0115419a Mon Sep 17 00:00:00 2001
From: Nick Coghlan
Date: Thu, 3 Nov 2016 01:34:50 +1000
Subject: [PATCH] More PEP 532 notes

---
 pep-0532.txt | 124 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 88 insertions(+), 36 deletions(-)

diff --git a/pep-0532.txt b/pep-0532.txt
index 03171625d..647a323f2 100644
--- a/pep-0532.txt
+++ b/pep-0532.txt
@@ -20,6 +20,7 @@ that allows objects to customise the behaviour of the following expressions:
 * the ``and`` logical conjunction operator
 * the ``or`` logical disjunction operator
 * chained comparisons (which implicitly invoke ``and``)
+* the ``not`` logical negation operator

 Each of these expressions is ultimately a variant on the underlying pattern::

@@ -42,19 +43,30 @@ chained comparison operations for matrices, where the result is a matrix of
 boolean values, rather than tautologically returning ``True`` or raising
 ``ValueError``.

-The PEP further proposes the addition of a new ``if_exists`` builtin that allows
-conditional branching based on whether or not an object is ``None``, but returns
-the original object rather than the existence checking wrapper as the
-result of any conditional expressions. This allows existence checking fallback
-operations (aka null-coalescing operations) to be written as::
+To properly support logical negation of conditional result managers, a new
+``__not__`` protocol method would also be introduced allowing objects to control
+the result of ``not obj`` expressions.

-    value = if_exists(expr1) or if_exists(expr2) or expr3
+The PEP further proposes the addition of new ``exists`` and ``missing`` builtins
+that allow conditional branching based on whether or not an object is ``None``,
+but return the original object rather than the existence checking wrapper as
+the result of any conditional expressions. In addition to being usable as
+a simple boolean operator (e.g. as in ``assert all(map(exists, items))``), this
+allows existence checking fallback operations (aka null-coalescing operations)
+to be written as::
+
+    value = exists(expr1) or exists(expr2) or expr3

 and existence checking precondition operations (aka null-propagating
-or null-severing operations) to be written as::
+or null-severing operations) can be written as either::

-    value = if_exists(obj) and obj.field.of.interest
-    value = if_exists(obj) and obj["field"]["of"]["interest"]
+    value = exists(obj) and obj.field.of.interest
+    value = exists(obj) and obj["field"]["of"]["interest"]
+
+or::
+
+    value = missing(obj) or obj.field.of.interest
+    value = missing(obj) or obj["field"]["of"]["interest"]


 Relationship with other PEPs
@@ -62,8 +74,9 @@ Relationship with other PEPs

 This PEP is a direct successor to PEP 531, replacing the existence checking
 protocol and the new ``?then`` and ``?else`` syntactic operators defined there
-with the ability to customise the behaviour of the established ``and`` and
-``or`` operators. The existence checking use case is taken from that PEP.
+with the ability to customise the behaviour of the established ``not``,
+``and`` and ``or`` operators. The existence checking use case is taken from
+that PEP.

 It is also a direct successor to PEP 335, which proposed the ability to
 overload the ``and`` and ``or`` operators directly, rather than indirectly
@@ -72,11 +85,15 @@ expressions. The discussion of the element-wise comparison use case is drawn
 from Guido's rejection of that PEP.

 This PEP competes with a number of aspects of PEP 505, proposing that improved
-support for null-coalescing and null-propagating operations be offered through
-a new protocol and new builtin, rather than through new syntax. It doesn't
-compete specifically with the proposed shorthands for existence checking
-attribute access and subscripting, but instead offers an alternative underlying
-semantic framework for defining them.
+support for null-coalescing operations be offered through a new protocol and
+new builtin, rather than through new syntax. It doesn't compete specifically
+with the proposed shorthands for existence checking attribute access and
+subscripting, but instead offers an alternative underlying semantic framework
+for defining them:
+
+* ``LHS ?? RHS`` would mean ``exists(LHS) or RHS``
+* ``EXPR?.attr`` would mean ``missing(EXPR) or EXPR.attr``
+* ``EXPR?[key]`` would mean ``missing(EXPR) or EXPR[key]``


 Specification
@@ -183,17 +200,20 @@ cases by always returning ``True`` from ``__bool__``.
 Existence checking comparisons
 ------------------------------

-A new builtin implementing the new protocol is proposed to encapsulate the
+Two new builtins implementing the new protocol are proposed to encapsulate the
 notion of "existence checking": seeing if a value is ``None`` and either
 falling back to an alternative value (an operation known as "None-coalescing")
 or passing it through as the result of the overall expression (an operation
 known as "None-severing" or "None-propagating").

-This builtin would be defined as follows::
+These builtins would be defined as follows::

-    class if_exists:
+    class exists:
+        """Conditional result manager for 'EXPR is not None' checks"""
         def __init__(self, value):
             self.value = value
+        def __not__(self):
+            return missing(self.value)
         def __bool__(self):
             return self.value is not None
         def __then__(self, result):
@@ -205,12 +225,33 @@ This builtin would be defined as follows::
                 return result.value
             return result

+    class missing:
+        """Conditional result manager for 'EXPR is None' checks"""
+        def __init__(self, value):
+            self.value = value
+        def __not__(self):
+            return exists(self.value)
+        def __bool__(self):
+            return self.value is None
+        def __then__(self, result):
+            if result is self:
+                return result.value
+            return result
+        def __else__(self, result):
+            if result is self:
+                return result.value
+            return result
+
+
 Aside from changing the definition of ``__bool__`` to be based on
 ``is not None`` rather than normal truth checking, the key characteristic of
-``if_exists`` is that when it is used as a conditional result manager, it is
+``exists`` is that when it is used as a conditional result manager, it is
 *ephemeral*: when it detects that short circuiting has taken place, it returns
 the original value, rather than the existence checking wrapper.

+``missing`` is defined as the logically inverted counterpart of ``exists``:
+``not exists(obj)`` is semantically equivalent to ``missing(obj)``.
+

 Other conditional constructs
 ----------------------------
@@ -219,9 +260,12 @@ No changes are proposed to if statements, while statements, comprehensions,
 or generator expressions, as the boolean clauses they contain are purely used
 for control flow purposes and don't have programmatically accessible "results".

-(While that could technically be changed through the definition of suitable
-``as`` clauses based on the conditional result management protocol, such
-proposals are outside the scope of this PEP)
+However, it's worth noting that while such proposals are outside the scope of
+this PEP, the conditional result management protocol defined here would be
+sufficient to support constructs like::
+
+    while exists(dynamic_query()) as result:
+        ... # Code using result


 Rationale
@@ -351,20 +395,20 @@ potentially missing content is the norm rather than the exception.
 The combined impact of the proposals in this PEP is to allow the above sample
 expressions to instead be written as:

-* ``value1 = if_exists(expr1) and expr1.field.of.interest``
-* ``value2 = if_exists(expr2) and expr2.["field"]["of"]["interest"]``
-* ``value3 = if_exists(expr3) or if_exists(expr4) or expr5``
+* ``value1 = exists(expr1) and expr1.field.of.interest``
+* ``value2 = exists(expr2) and expr2["field"]["of"]["interest"]``
+* ``value3 = exists(expr3) or exists(expr4) or expr5``

 In these forms, significantly more of the text presented to the reader is
 immediately relevant to the question "What does this code do?", while the
 boilerplate code to handle missing data by passing it through to the output or
 falling back to an alternative input, has shrunk to four uses of the new
-``if_exists`` builtin, two uses of the ``and`` keyword, and two uses of the
+``exists`` builtin, two uses of the ``and`` keyword, and two uses of the
 ``or`` keyword.

 In the first two examples, the 31 character boilerplate suffix
 ``if exprN is not None else None`` (minimally 27 characters for a single letter
-variable name) has been replaced by a 20 character `if_exists(expr1) and``
+variable name) has been replaced by a 17 character ``exists(expr1) and``
 prefix (minimally 16 characters with a single letter variable name), somewhat
 improving the signal-to-pattern-noise ratio of the lines (especially if it
 encourages the use of more meaningful variable and field names rather than
@@ -372,7 +416,7 @@ making them shorter purely for the sake of expression brevity).

 In the last example, not only are two instances of the 26 character boilerplate,
 ``if exprN is not None else`` (minimally 22 characters) replaced with the
-14 character function call ``if_exists() or``, with that function call being
+11 character function call ``exists() or``, with that function call being
 placed directly around the original expression, eliminating the need to
 duplicate it in the conditional existence check.

@@ -393,9 +437,9 @@ This history means that the idea of ``and`` and ``or`` suddenly gaining the
 ability to be interpreted differently based on the type of the left-hand
 operand is a potentially controversial one from a readability and
 maintainability perspective, to the point where it may be *less* controversial
-to define new ``?then`` and ``?else`` operators as suggested in PEP 531 than
-it would be to redefine the existing operators (as currently proposed in this
-PEP).
+to define a single new ``??`` operator as proposed in PEP 505, or separate
+``?then`` and ``?else`` operators as suggested in PEP 531 than it would be to
+redefine the existing operators (as currently proposed in this PEP).

 Such an approach would also address one of Guido's key concerns with PEP 335
 [1_] that would also apply to this PEP as currently written:
@@ -407,15 +451,23 @@ Such an approach would also address one of Guido's key concerns with PEP 335
 If the protocol in this PEP was combined with the core syntactic proposals in
 PEP 531, then the end result would look something like:

-* ``value1 = if_exists(expr1) ?then expr1.field.of.interest``
-* ``value2 = if_exists(expr2) ?then expr2["field"]["of"]["interest"]``
-* ``value3 = if_exists(expr3) ?else if_exists(expr4) ?else expr5``
+* ``value1 = exists(expr1) ?then expr1.field.of.interest``
+* ``value2 = exists(expr2) ?then expr2["field"]["of"]["interest"]``
+* ``value3 = exists(expr3) ?else exists(expr4) ?else expr5``

 Rather than indicating use of the existence protocol as suggested in PEP 531,
 the ``?`` here would indicate use of the conditional result management
 protocol, and hence the fact the result may be something other than the LHS as
 written when the short-circuiting path is executed.

+Alternatively, if only a single new operator was added as proposed in PEP
+505, but it used the semantics proposed for ``or`` in this PEP, then the end
+result would look something like:
+
+* ``value1 = missing(expr1) ?? expr1.field.of.interest``
+* ``value2 = missing(expr2) ?? expr2["field"]["of"]["interest"]``
+* ``value3 = exists(expr3) ?? exists(expr4) ?? expr5``
+
 If new operators were added rather than redefining the semantics of ``and``,
 ``or`` and ``if-else``, then it would make sense to *require* that their left
 hand operand be a conditional result manager that defines both ``__then__``
@@ -423,7 +475,7 @@ and ``__else__``, rather than accepting arbitrary objects as ``and`` and
 ``or`` do.

 With that approach, chained comparisons would be conditionally redefined in
-terms of ``?then`` when the left comparison produces a conditional result
+terms of the new protocol when the left comparison produces a conditional result
 manager, while continuing to be defined in terms of ``and`` for any other left
 comparison result.
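
The ``exists`` and ``missing`` managers defined in this patch can be tried
out in current Python with a rough emulation. The following is a minimal
sketch under stated assumptions, not a reference implementation: the helpers
``and_``, ``or_`` and ``not_`` are hypothetical stand-ins for the proposed
operator semantics, their expansions are inferred from the ``if result is
self`` short-circuit checks in the class bodies, and the right operand is
passed as a zero-argument callable so that it is only evaluated when needed.

```python
class exists:
    """Conditional result manager for 'EXPR is not None' checks."""
    def __init__(self, value):
        self.value = value
    def __not__(self):
        return missing(self.value)
    def __bool__(self):
        return self.value is not None
    def __then__(self, result):
        # Receiving ourselves back means the operator short-circuited,
        # so hand out the wrapped value instead of the wrapper.
        return result.value if result is self else result
    def __else__(self, result):
        return result.value if result is self else result


class missing:
    """Conditional result manager for 'EXPR is None' checks."""
    def __init__(self, value):
        self.value = value
    def __not__(self):
        return exists(self.value)
    def __bool__(self):
        return self.value is None
    def __then__(self, result):
        return result.value if result is self else result
    def __else__(self, result):
        return result.value if result is self else result


# Hypothetical helpers (not part of the patch). Each assumes that ``lhs``
# is a conditional result manager and that ``rhs`` is a zero-argument
# callable producing the right operand.
def and_(lhs, rhs):
    """Assumed expansion of ``lhs and rhs()``."""
    return lhs.__then__(rhs()) if lhs else lhs.__else__(lhs)

def or_(lhs, rhs):
    """Assumed expansion of ``lhs or rhs()``."""
    return lhs.__then__(lhs) if lhs else lhs.__else__(rhs())

def not_(operand):
    """Assumed expansion of ``not operand``."""
    return operand.__not__()


record = {"field": 0}
# None-coalescing: 0 is kept because it is not None.
print(or_(exists(record["field"]), lambda: "default"))
# None-propagating: the right operand is never evaluated.
print(and_(exists(None), lambda: 1 / 0))
# The same precondition check written with missing().
print(or_(missing(record), lambda: record["field"]))
# Negation swaps one manager for its counterpart.
print(type(not_(exists(None))).__name__)
```

Longer fallback chains can be emulated by nesting to the right, for example
``or_(exists(a), lambda: or_(exists(b), lambda: c))``. That nesting is a
choice made for this sketch; the patch itself does not specify how chained
operators would be grouped.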