PEP 637: Feedback and various typo fixes (GH-1741)

Stefano Borini 2020-12-23 18:05:12 +00:00 committed by GitHub
parent c86d1cc89d
commit fc43230308
1 changed files with 76 additions and 35 deletions


@ -31,7 +31,10 @@ to indexing operations::
>>> x[1, 2, a=3, b=4] = val # setitem
>>> del x[1, 2, a=3, b=4] # delitem
and would also provide appropriate semantics.
and would also provide appropriate semantics. Single- and double-star unpacking of
arguments is also provided::
>>> val = x[*(1, 2), **{'a': 3, 'b': 4}] # Equivalent to above.
This PEP is a successor to PEP 472, which was rejected due to lack of
interest in 2019. Since then there's been renewed interest in the feature.
@ -122,7 +125,7 @@ specification would improve notation and provide additional value:
arguments) need some way to determine which arguments are destined for
the target function, and which are used to configure how they run the
target. This is simple (if non-extensible) for positional parameters,
but we need some way to distinguish these for keywords.[#trio-run]_
but we need some way to distinguish these for keywords. [#trio-run]_
An indexed notation would afford a Pythonic way to pass keyword
arguments to these functions without cluttering the caller's code.
@ -464,7 +467,7 @@ classes with indexing semantics) will remain the same and will continue not to
accept keyword arguments. In other words, if ``d`` is a ``dict``, the
statement ``d[1, a=2]`` will raise ``TypeError``, as their implementation will
not support the use of keyword arguments. The same holds for all other classes
(list, frozendict, etc.)
(list, dict, etc.)
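
The behaviour described above can be approximated in current Python by calling the dunder directly, since the keyword subscript syntax itself is not available in released interpreters. The following is a minimal sketch under that assumption; ``emulate_keyword_subscript`` is a hypothetical helper introduced only for illustration, not part of the PEP:

```python
def emulate_keyword_subscript(obj, index, **kwargs):
    """Hypothetical helper: mimic ``obj[index, **kwargs]`` by calling
    the type's ``__getitem__`` directly, as the PEP describes."""
    return type(obj).__getitem__(obj, index, **kwargs)


d = {1: "one"}

# Without keywords, the built-in behaves as usual.
assert emulate_keyword_subscript(d, 1) == "one"

# With a keyword, the built-in implementations reject the call.
rejected = []
for container in (d, [10, 20, 30]):
    try:
        emulate_keyword_subscript(container, 1, a=2)
    except TypeError as exc:
        rejected.append(type(container).__name__)
        print(type(container).__name__, "->", exc)
```

The exact error message depends on the interpreter version, so the sketch only relies on the exception type.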
Corner case and Gotchas
-----------------------
@ -496,7 +499,7 @@ With the introduction of the new notation, a few corner cases need to be analyse
2. a similar case occurs with setter notation::
# Given type(obj).__setitem__(self, index, value):
# Given type(obj).__setitem__(obj, index, value):
obj[1, value=3] = 5
This poses no issue because the value is passed automatically, and the python interpreter will raise
@ -515,9 +518,9 @@ With the introduction of the new notation, a few corner cases need to be analyse
they will probably be surprised by the method call::
# expected type(obj).__getitem__(0, direction='south')
# expected type(obj).__getitem__(obj, 0, direction='south')
# but actually get:
obj.__getitem__((0, 'south'), direction='north')
type(obj).__getitem__(obj, (0, 'south'), direction='north')
Solution: best practice suggests that keyword subscripts should be
flagged as keyword-only when possible::
@ -529,12 +532,13 @@ With the introduction of the new notation, a few corner cases need to be analyse
about subscript methods which don't use the keyword-only flag.
4. As we saw, a single value followed by a keyword argument will not be changed into a tuple, i.e.:
``d[1, a=3]`` is treated as ``__getitem__(1, a=3)``, NOT ``__getitem__((1,), a=3)``. It would be
``d[1, a=3]`` is treated as ``__getitem__(d, 1, a=3)``, NOT ``__getitem__(d, (1,), a=3)``. It would be
extremely confusing if adding keyword arguments were to change the type of the passed index.
In other words, adding a keyword to a single-valued subscript will not change it into a tuple.
For those cases where an actual tuple needs to be passed, a proper syntax will have to be used::
obj[(1,), a=3] # calls __getitem__((1,), a=3)
obj[(1,), a=3]
# calls type(obj).__getitem__(obj, (1,), a=3)
In this case, the call is passing a single element (which is passed as is, as from rule above),
only that the single element happens to be a tuple.
@ -544,23 +548,40 @@ With the introduction of the new notation, a few corner cases need to be analyse
When keywords are present, the rule that you can omit this outermost pair of parentheses is no
longer true::
obj[1] # calls __getitem__(1)
obj[1, a=3] # calls __getitem__(1, a=3)
obj[1,] # calls __getitem__((1,))
obj[(1,), a=3] # calls __getitem__((1,), a=3)
obj[1]
# calls type(obj).__getitem__(obj, 1)
obj[1, a=3]
# calls type(obj).__getitem__(obj, 1, a=3)
obj[1,]
# calls type(obj).__getitem__(obj, (1,))
obj[(1,), a=3]
# calls type(obj).__getitem__(obj, (1,), a=3)
This is particularly relevant in the case where two entries are passed::
obj[1, 2] # calls __getitem__((1, 2))
obj[(1, 2)] # same as above
obj[1, 2, a=3] # calls __getitem__((1, 2), a=3)
obj[(1, 2), a=3] # calls __getitem__((1, 2), a=3)
obj[1, 2]
# calls type(obj).__getitem__(obj, (1, 2))
obj[(1, 2)]
# same as above
obj[1, 2, a=3]
# calls type(obj).__getitem__(obj, (1, 2), a=3)
obj[(1, 2), a=3]
# calls type(obj).__getitem__(obj, (1, 2), a=3)
And particularly when the tuple is extracted as a variable::
t = (1, 2)
obj[t] # calls __getitem__((1, 2))
obj[t, a=3] # calls __getitem__((1, 2), a=3)
obj[t]
# calls type(obj).__getitem__(obj, (1, 2))
obj[t, a=3]
# calls type(obj).__getitem__(obj, (1, 2), a=3)
Why? because in the case ``obj[1, 2, a=3]`` we are passing two elements (which
are then packed as a tuple and passed as the index). In the case ``obj[(1, 2), a=3]``
@ -593,6 +614,12 @@ compiler will have to generate these new opcodes. The
old C implementations will call the extended methods passing ``NULL``
as kwargs.
Reference Implementation
========================
A reference implementation is currently being developed here [#reference-impl]_.
Workarounds
===========
@ -663,8 +690,7 @@ Previous PEP 472 solutions
--------------------------
PEP 472 presents a good amount of ideas that are now all to be considered
Rejected. A personal email from D'Aprano to one of the authors (Stefano Borini)
specifically said:
Rejected. A personal email from D'Aprano to the author specifically said:
I have now carefully read through PEP 472 in full, and I am afraid I
cannot support any of the strategies currently in the PEP.
@ -727,7 +753,8 @@ The problems with this approach were found to be:
indexes. This would look awkward because the visual notation does not match
the signature::
obj[1, 2] = 3 # calls obj.__setitem_ex__(3, 1, 2)
obj[1, 2] = 3
# calls type(obj).__setitem_ex__(obj, 3, 1, 2)
- the solution relies on the assumption that all keyword indices necessarily map
into positional indices, or that they must have a name. This assumption may be
@ -751,7 +778,8 @@ create a new "kwslice" object
This proposal has already been explored in "New arguments contents" P4 in PEP 472::
obj[a, b:c, x=1] # calls __getitem__(a, slice(b, c), key(x=1))
obj[a, b:c, x=1]
# calls type(obj).__getitem__(obj, a, slice(b, c), key(x=1))
This solution requires everyone who needs keyword arguments to parse the tuple
and/or key object by hand to extract them. This is painful and opens up to the
@ -774,7 +802,8 @@ meaning that this::
would result in a call to::
>>> d.__getitem__(1, 2, z=3) # instead of d.__getitem__((1, 2), z=3)
>>> type(obj).__getitem__(obj, 1, 2, z=3)
# instead of type(obj).__getitem__(obj, (1, 2), z=3)
This option has been rejected because it feels odd that a signature of a method
depends on a specific value of another dunder. It would be confusing for both
@ -829,7 +858,8 @@ The above consideration makes it impossible to have a keyword only dunder, and
opens up the question of what entity to pass for the index position when no index
is passed::
obj[k=3] = 5 # would call type(obj).__setitem__(???, 5, k=3)
obj[k=3] = 5
# would call type(obj).__setitem__(obj, ???, 5, k=3)
A proposed hack would be to let the user specify which entity to use when an
index is not specified, by specifying a default for the ``index``, but this
@ -837,7 +867,7 @@ forces necessarily to also specify a (never going to be used, as a value is
always passed by design) default for the ``value``, as we can't have
non-default arguments after defaulted one::
def __setitem__(index=SENTINEL, value=NEVERUSED, *, k)
def __setitem__(self, index=SENTINEL, value=NEVERUSED, *, k)
which seems ugly, redundant and confusing. We must therefore accept that some
form of sentinel index must be passed by the python implementation when the
@ -856,7 +886,7 @@ and a user that wants to pass a keyword ``value``::
expecting a call like::
obj.__setitem__(SENTINEL, 0, **{"value": 1})
type(obj).__setitem__(obj, SENTINEL, 0, **{"value": 1})
will instead accidentally be caught by the named ``value``, producing a
``duplicate value error``. The user should not be worried about the actual
@ -890,7 +920,7 @@ the options were:
For option 1, the call will become::
type(obj).__getitem__((), k=3)
type(obj).__getitem__(obj, (), k=3)
therefore making ``obj[k=3]`` and ``obj[(), k=3]`` degenerate and indistinguishable.
@ -928,8 +958,8 @@ the two degenerate cases.
So, an alternative strategy (option 3) would be to use an existing entity that is
unlikely to be used as a valid index. One option could be the current built-in constant
``NotImplemented``, which is currently returned by comparison operators to
report that they do not implement the comparison, and a different strategy
``NotImplemented``, which is currently returned by operator methods to
report that they do not implement a particular operation, and a different strategy
should be attempted (e.g. to ask the other object). Unfortunately, its name and
traditional use calls back to a feature that is not available, rather than the
fact that something was not passed by the user.
@ -943,14 +973,22 @@ From a quick inquire, it seems that most people on python-ideas seem to believe
it's not crucial, and the empty tuple is an acceptable option. Hence the
resulting series will be::
obj[k=3] # __getitem__((), k=3). Empty tuple
obj[1, k=3] # __getitem__(1, k=3). Integer
obj[1, 2, k=3] # __getitem__((1, 2), k=3). Tuple
obj[k=3]
# type(obj).__getitem__(obj, (), k=3). Empty tuple
obj[1, k=3]
# type(obj).__getitem__(obj, 1, k=3). Integer
obj[1, 2, k=3]
# type(obj).__getitem__(obj, (1, 2), k=3). Tuple
and the following two notations will be degenerate::
obj[(), k=3] # __getitem__((), k=3)
obj[k=3] # __getitem__((), k=3)
obj[(), k=3]
# type(obj).__getitem__(obj, (), k=3)
obj[k=3]
# type(obj).__getitem__(obj, (), k=3)
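
The index packing rules listed above can be sketched in current Python by emulating the subscript with an ordinary function. ``Recorder`` and ``emulate_subscript`` are hypothetical names used only for this illustration, and the helper assumes the packing rule exactly as stated in the text (no index gives ``()``, one index is passed as is, two or more are packed into a tuple):

```python
class Recorder:
    """Toy class (hypothetical) whose __getitem__ returns what it received."""

    def __getitem__(self, index, **kwargs):
        return index, kwargs


def emulate_subscript(obj, *positional, **keywords):
    """Hypothetical stand-in for ``obj[*positional, **keywords]``."""
    if len(positional) == 1:
        index = positional[0]          # single entry: passed as is
    else:
        index = tuple(positional)      # zero entries -> (), many -> tuple
    return type(obj).__getitem__(obj, index, **keywords)


obj = Recorder()

# The resulting series: empty tuple, integer, tuple.
assert emulate_subscript(obj, k=3) == ((), {"k": 3})
assert emulate_subscript(obj, 1, k=3) == (1, {"k": 3})
assert emulate_subscript(obj, 1, 2, k=3) == ((1, 2), {"k": 3})

# The two degenerate notations produce the same call.
assert emulate_subscript(obj, (), k=3) == emulate_subscript(obj, k=3)

# An explicit tuple is a single entry and is passed unchanged.
assert emulate_subscript(obj, (1,), a=3) == ((1,), {"a": 3})

# Existing keyword-free syntax follows the same packing.
assert obj[1] == (1, {})
assert obj[1,] == ((1,), {})
assert obj[1, 2] == obj[(1, 2)] == ((1, 2), {})
```

Because ``obj[k=3]`` and ``obj[(), k=3]`` arrive identically, a class that needs to tell them apart cannot do so from the index alone.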
Common objections
=================
@ -971,7 +1009,8 @@ Common objections
but for obvious reasons, call syntax using builtins to create custom type hints
isn't an option::
dict(i=float, j=float) # would create a dictionary, not a type
dict(i=float, j=float)
# would create a dictionary, not a type
Finally, function calls do not allow for a setitem-like notation, as shown
in the Overview: operations such as ``f(1, x=3) = 5`` are not allowed, and are
@ -995,6 +1034,8 @@ References
(https://github.com/python-trio/trio/issues/470)
.. [#numpy-ml] "[Numpy-discussion] Request for comments on PEP 637 - Support for indexing with keyword arguments"
(http://numpy-discussion.10968.n7.nabble.com/Request-for-comments-on-PEP-637-Support-for-indexing-with-keyword-arguments-td48489.html)
.. [#reference-impl] "Reference implementation"
(https://github.com/python/cpython/compare/master...stefanoborini:PEP-637-implementation-attempt-2)
Copyright
=========