From fc43230308b622d958d78b93c7af1ba4b7cf52d2 Mon Sep 17 00:00:00 2001
From: Stefano Borini
Date: Wed, 23 Dec 2020 18:05:12 +0000
Subject: [PATCH] PEP 637: Feedback and various typo fixes (GH-1741)

---
 pep-0637.rst | 111 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 76 insertions(+), 35 deletions(-)

diff --git a/pep-0637.rst b/pep-0637.rst
index 138310eef..27ad283de 100644
--- a/pep-0637.rst
+++ b/pep-0637.rst
@@ -31,7 +31,10 @@ to indexing operations::
     >>> x[1, 2, a=3, b=4] = val  # setitem
     >>> del x[1, 2, a=3, b=4]  # delitem
 
-and would also provide appropriate semantics.
+and would also provide appropriate semantics. Single- and double-star unpacking of
+arguments is also provided::
+
+    >>> val = x[*(1, 2), **{a=3, b=4}]  # Equivalent to above.
 
 This PEP is a successor to PEP 472, which was rejected due to lack of
 interest in 2019. Since then there's been renewed interest in the feature.
@@ -122,7 +125,7 @@ specification would improve notation and provide additional value:
    arguments) need some way to determine which arguments are destined for
    the target function, and which are used to configure how they run the
    target. This is simple (if non-extensible) for positional parameters,
-   but we need some way to distinguish these for keywords.[#trio-run]_
+   but we need some way to distinguish these for keywords. [#trio-run]_
 
    An indexed notation would afford a Pythonic way to pass keyword
    arguments to these functions without cluttering the caller's code.
@@ -464,7 +467,7 @@ classes with indexing semantics) will remain the same and will continue not to
 accept keyword arguments. In other words, if ``d`` is a ``dict``, the
 statement ``d[1, a=2]`` will raise ``TypeError``, as their implementation will
 not support the use of keyword arguments. The same holds for all other classes
-(list, frozendict, etc.)
+(list, dict, etc.)
 
 Corner case and Gotchas
 -----------------------
@@ -496,7 +499,7 @@ With the introduction of the new notation, a few corner cases need to be analyse
 
 2. a similar case occurs with setter notation::
 
-       # Given type(obj).__setitem__(self, index, value):
+       # Given type(obj).__setitem__(obj, index, value):
        obj[1, value=3] = 5
 
    This poses no issue because the value is passed automatically, and the python interpreter will raise
@@ -515,9 +518,9 @@ With the introduction of the new notation, a few corner cases need to be analyse
 
    they will probably be surprised by the method call::
 
-       # expected type(obj).__getitem__(0, direction='south')
+       # expected type(obj).__getitem__(obj, 0, direction='south')
        # but actually get:
-       obj.__getitem__((0, 'south'), direction='north')
+       type(obj).__getitem__(obj, (0, 'south'), direction='north')
 
    Solution: best practice suggests that keyword subscripts should be flagged as
    keyword-only when possible::
@@ -529,12 +532,13 @@ With the introduction of the new notation, a few corner cases need to be analyse
    about subscript methods which don't use the keyword-only flag.
 
 4. As we saw, a single value followed by a keyword argument will not be changed into a tuple, i.e.:
-   ``d[1, a=3]`` is treated as ``__getitem__(1, a=3)``, NOT ``__getitem__((1,), a=3)``. It would be
+   ``d[1, a=3]`` is treated as ``__getitem__(d, 1, a=3)``, NOT ``__getitem__(d, (1,), a=3)``. It would be
    extremely confusing if adding keyword arguments were to change the type of the passed index.
    In other words, adding a keyword to a single-valued subscript will not change it into a tuple.
    For those cases where an actual tuple needs to be passed, a proper syntax will have to be used::
 
-       obj[(1,), a=3]  # calls __getitem__((1,), a=3)
+       obj[(1,), a=3]
+       # calls type(obj).__getitem__(obj, (1,), a=3)
 
    In this case, the call is passing a single element (which is passed as is, as from rule above),
    only that the single element happens to be a tuple.
@@ -544,23 +548,40 @@ With the introduction of the new notation, a few corner cases need to be analyse
    When keywords are present, the rule that you can omit this outermost pair of parentheses
    is no longer true::
 
-       obj[1]  # calls __getitem__(1)
-       obj[1, a=3]  # calls __getitem__(1, a=3)
-       obj[1,]  # calls __getitem__((1,))
-       obj[(1,), a=3]  # calls __getitem__((1,), a=3)
+       obj[1]
+       # calls type(obj).__getitem__(obj, 1)
+
+       obj[1, a=3]
+       # calls type(obj).__getitem__(obj, 1, a=3)
+
+       obj[1,]
+       # calls type(obj).__getitem__(obj, (1,))
+
+       obj[(1,), a=3]
+       # calls type(obj).__getitem__(obj, (1,), a=3)
 
    This is particularly relevant in the case where two entries are passed::
 
-       obj[1, 2]  # calls __getitem__((1, 2))
-       obj[(1, 2)]  # same as above
-       obj[1, 2, a=3]  # calls __getitem__((1, 2), a=3)
-       obj[(1, 2), a=3]  # calls __getitem__((1, 2), a=3)
+       obj[1, 2]
+       # calls type(obj).__getitem__(obj, (1, 2))
+
+       obj[(1, 2)]
+       # same as above
+
+       obj[1, 2, a=3]
+       # calls type(obj).__getitem__(obj, (1, 2), a=3)
+
+       obj[(1, 2), a=3]
+       # calls type(obj).__getitem__(obj, (1, 2), a=3)
 
    And particularly when the tuple is extracted as a variable::
 
        t = (1, 2)
-       obj[t]  # calls __getitem__((1, 2))
-       obj[t, a=3]  # calls __getitem__((1, 2), a=3)
+       obj[t]
+       # calls type(obj).__getitem__(obj, (1, 2))
+
+       obj[t, a=3]
+       # calls type(obj).__getitem__(obj, (1, 2), a=3)
 
    Why? because in the case ``obj[1, 2, a=3]`` we are passing two elements (which
    are then packed as a tuple and passed as the index). In the case ``obj[(1, 2), a=3]``
@@ -593,6 +614,12 @@ compiler will have to generate these new opcodes. The old C implementations will
 call the extended methods passing ``NULL`` as kwargs.
 
 
+Reference Implementation
+========================
+
+A reference implementation is currently being developed here [#reference-impl]_.
+
+
 Workarounds
 ===========
 
@@ -663,8 +690,7 @@ Previous PEP 472 solutions
 --------------------------
 
 PEP 472 presents a good amount of ideas that are now all to be considered
-Rejected. A personal email from D'Aprano to one of the authors (Stefano Borini)
-specifically said:
+Rejected. A personal email from D'Aprano to the author specifically said:
 
     I have now carefully read through PEP 472 in full, and I am afraid I
     cannot support any of the strategies currently in the PEP.
@@ -727,7 +753,8 @@ The problems with this approach were found to be:
   indexes. This would look awkward because the visual notation does not match
   the signature::
 
-      obj[1, 2] = 3  # calls obj.__setitem_ex__(3, 1, 2)
+      obj[1, 2] = 3
+      # calls type(obj).__setitem_ex__(obj, 3, 1, 2)
 
 - the solution relies on the assumption that all keyword indices necessarily map
   into positional indices, or that they must have a name. This assumption may be
@@ -751,7 +778,8 @@ create a new "kwslice" object
 
 This proposal has already been explored in "New arguments contents" P4 in PEP 472::
 
-    obj[a, b:c, x=1]  # calls __getitem__(a, slice(b, c), key(x=1))
+    obj[a, b:c, x=1]
+    # calls type(obj).__getitem__(obj, a, slice(b, c), key(x=1))
 
 This solution requires everyone who needs keyword arguments to parse the tuple
 and/or key object by hand to extract them. This is painful and opens up to the
@@ -774,7 +802,8 @@ meaning that this::
 
 would result in a call to::
 
-    >>> d.__getitem__(1, 2, z=3)  # instead of d.__getitem__((1, 2), z=3)
+    >>> type(obj).__getitem__(obj, 1, 2, z=3)
+    # instead of type(obj).__getitem__(obj, (1, 2), z=3)
 
 This option has been rejected because it feels odd that a signature of a method
 depends on a specific value of another dunder. It would be confusing for both
@@ -829,7 +858,8 @@ The above consideration makes it impossible to have a keyword only dunder, and
 opens up the question of what entity to pass for the index position when no index
 is passed::
 
-    obj[k=3] = 5  # would call type(obj).__setitem__(???, 5, k=3)
+    obj[k=3] = 5
+    # would call type(obj).__setitem__(obj, ???, 5, k=3)
 
 A proposed hack would be to let the user specify which entity to use when an
 index is not specified, by specifying a default for the ``index``, but this
@@ -837,7 +867,7 @@ forces necessarily to also specify a (never going to be used, as a value is
 always passed by design) default for the ``value``, as we can't have
 non-default arguments after defaulted one::
 
-    def __setitem__(index=SENTINEL, value=NEVERUSED, *, k)
+    def __setitem__(self, index=SENTINEL, value=NEVERUSED, *, k)
 
 which seems ugly, redundant and confusing. We must therefore accept that some
 form of sentinel index must be passed by the python implementation when the
@@ -856,7 +886,7 @@ and a user that wants to pass a keyword ``value``::
 
 expecting a call like::
 
-    obj.__setitem__(SENTINEL, 0, **{"value": 1})
+    type(obj).__setitem__(obj, SENTINEL, 0, **{"value": 1})
 
 will instead accidentally be catched by the named ``value``, producing a
 ``duplicate value error``. The user should not be worried about the actual
@@ -890,7 +920,7 @@ the options were:
 
 For option 1, the call will become::
 
-    type(obj).__getitem__((), k=3)
+    type(obj).__getitem__(obj, (), k=3)
 
 therefore making ``obj[k=3]`` and ``obj[(), k=3]`` degenerate and indistinguishable.
 
@@ -928,8 +958,8 @@ the two degenerate cases.
 
 So, an alternative strategy (option 3) would be to use an existing entity that is
 unlikely to be used as a valid index. One option could be the current built-in constant
-``NotImplemented``, which is currently returned by comparison operators to
-report that they do not implement the comparison, and a different strategy
+``NotImplemented``, which is currently returned by operators methods to
+report that they do not implement a particular operation, and a different strategy
 should be attempted (e.g. to ask the other object). Unfortunately, its name and
 traditional use calls back to a feature that is not available, rather than the
 fact that something was not passed by the user.
@@ -943,14 +973,22 @@ From a quick inquire, it seems that most people on python-ideas seem to believe
 it's not crucial, and the empty tuple is an acceptable option. Hence the
 resulting series will be::
 
-    obj[k=3]  # __getitem__((), k=3). Empty tuple
-    obj[1, k=3]  # __getitem__(1, k=3). Integer
-    obj[1, 2, k=3]  # __getitem__((1, 2), k=3). Tuple
+    obj[k=3]
+    # type(obj).__getitem__(obj, (), k=3). Empty tuple
+
+    obj[1, k=3]
+    # type(obj).__getitem__(obj, 1, k=3). Integer
+
+    obj[1, 2, k=3]
+    # type(obj).__getitem__(obj, (1, 2), k=3). Tuple
 
 and the following two notation will be degenerate::
 
-    obj[(), k=3]  # __getitem__((), k=3)
-    obj[k=3]  # __getitem__((), k=3)
+    obj[(), k=3]
+    # type(obj).__getitem__(obj, (), k=3)
+
+    obj[k=3]
+    # type(obj).__getitem__(obj, (), k=3)
 
 Common objections
 =================
@@ -971,7 +1009,8 @@ Common objections
    but for obvious reasons, call syntax using builtins to create custom type hints
    isn't an option::
 
-       dict(i=float, j=float)  # would create a dictionary, not a type
+       dict(i=float, j=float)
+       # would create a dictionary, not a type
 
    Finally, function calls do not allow for a setitem-like notation, as shown
    in the Overview: operations such as ``f(1, x=3) = 5`` are not allowed, and are
@@ -995,6 +1034,8 @@ References
        (https://github.com/python-trio/trio/issues/470)
 .. [#numpy-ml] "[Numpy-discussion] Request for comments on PEP 637 - Support for indexing with keyword arguments"
        (http://numpy-discussion.10968.n7.nabble.com/Request-for-comments-on-PEP-637-Support-for-indexing-with-keyword-arguments-td48489.html)
+.. [#reference-impl] "Reference implementation"
+       (https://github.com/python/cpython/compare/master...stefanoborini:PEP-637-implementation-attempt-2)
 
 Copyright
 ===========