PEP 637: Fix rST errors caused by incorrect indentation (#1617)

Eric Wieser 2020-09-24 22:19:40 +01:00 committed by GitHub
parent 692a9623f7
commit ef4663634e
1 changed file with 85 additions and 85 deletions


@ -11,7 +11,7 @@ Content-Type: text/x-rst
Created: 24-Aug-2020
Python-Version: 3.10
Post-History: 23-Sep-2020
Resolution:
Abstract
========
@ -34,7 +34,7 @@ to indexing operations:
>>> val = x[1, 2, a=3, b=4] # getitem
>>> x[1, 2, a=3, b=4] = val # setitem
>>> del x[1, 2, a=3, b=4] # delitem
and would also provide appropriate semantics.
@ -52,7 +52,7 @@ PEP 472 was opened in 2014. The PEP detailed various use cases and was created b
extracting implementation strategies from a broad discussion on the
python-ideas mailing list, although no clear consensus was reached on which strategy
should be used. Many corner cases have been examined more closely and felt
awkward, backward incompatible or both.
The PEP was eventually rejected in 2019 [#rejection]_ mostly
due to lack of interest for the feature despite its 5 years of existence.
@ -86,11 +86,11 @@ specification would improve notation and provide additional value:
>>> grid_position[x=3, y=5, z=8]
>>> rain_amount[time=0:12, location=location]
>>> matrix[row=20, col=40]
2. To enrich the typing notation with keywords, especially during the use of generics
::
def function(value: MyType[T=int]):
@ -106,15 +106,15 @@ specification would improve notation and provide additional value:
4. Pandas currently uses a notation such as
::
>>> df[df['x'] == 1]
which could be replaced with ``df[x=1]``.
5. xarray has named dimensions. Currently these are handled with functions .isel:
::
>>> data.isel(row=10) # Returns the tenth row
which could also be replaced with ``data[row=10]``. A more complex example:
@ -122,14 +122,14 @@ specification would improve notation and provide additional value:
::
>>> # old syntax
>>> da.isel(space=0, time=slice(None, 2))[...] = spam
>>> # new syntax
>>> da[space=0, time=:2] = spam
Another example:
::
>>> # old syntax
>>> ds["empty"].loc[dict(lon=5, lat=6)] = 10
>>> # new syntax
@ -182,12 +182,12 @@ that, while we cannot prevent abuse, implementors should be aware that the
introduction of keyword arguments to alter the behavior of the lookup may
violate this intrinsic meaning.
The second difference of the indexing notation compared to a function
is that indexing can be used for both getting and setting operations.
In Python, a function cannot be on the left-hand side of an assignment. In
other words, both of these are valid
::
>>> x = a[1, 2]
>>> a[1, 2] = 5
@ -201,8 +201,8 @@ but only the first one of these is valid
This asymmetry is important, and makes one understand that there is a natural
imbalance between the two forms. It is therefore not a given that the two
should behave transparently and symmetrically.
The third difference is that functions have names assigned to their
arguments, unless the passed parameters are captured with \*args, in which case
they end up as entries in the args tuple. In other words, functions already
@ -217,7 +217,7 @@ __(get|set|del)item__ is not always receiving a tuple as the ``index`` argument
def __getitem__(self, index):
print(index)
The index operation basically forwards the content of the square brackets "as is"
in the ``index`` argument:
::
@ -229,7 +229,7 @@ in the ``index`` argument:
(0, 1)
>>> x[(0, 1)]
(0, 1)
>>>
>>> x[()]
()
>>> x[{1, 2, 3}]
@ -247,9 +247,9 @@ colon notations to slices, thanks to support from the parser. This is valid
a[1:3]
this one isn't
::
f(1:3)
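
As an illustration (a minimal probe class, not part of the PEP text, with the
hypothetical name ``Probe``), the parser converts the colon notation into
``slice`` objects before they ever reach ``__getitem__``:

::

    >>> class Probe:
    ...     def __getitem__(self, index):
    ...         return index
    ...
    >>> p = Probe()
    >>> p[1:3]            # the parser builds the slice object
    slice(1, 3, None)
    >>> p[1:3, 5]         # slices mix freely with other index elements
    (slice(1, 3, None), 5)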
The fifth difference is that there's no zero-argument form. This is valid
@ -259,11 +259,11 @@ The fifth difference is that there's no zero-argument form. This is valid
f()
this one isn't
::
a[]
New Proposal
------------
@ -279,7 +279,7 @@ square brackets, the _positional_ index. In other words, what is passed in the
square brackets is trivially used to generate what the code in ``__getitem__`` then uses
for the indexing operation. As we already saw for the dict, ``d[1]`` has a
positional index of ``1`` and also a final index of ``1`` (because it's the element that is
then added to the dictionary) and ``d[1, 2]`` has positional index of ``(1, 2)`` and
final index also of ``(1, 2)`` (because yet again it's the element that is added to the dictionary).
However, the positional index ``d[1,2:3]`` is not accepted by the dictionary, because
there's no way to transform the positional index into a final index, as the slice object is
@ -289,12 +289,12 @@ creates the final index by e.g. converting the positional index to a string.
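
A short, runnable illustration of the positional/final index distinction
described above, using only today's ``dict`` behavior:

::

    >>> d = {}
    >>> d[1] = "a"        # positional index 1, final index 1
    >>> d[1, 2] = "b"     # positional index (1, 2), final index (1, 2)
    >>> d
    {1: 'a', (1, 2): 'b'}
    >>> d[1, 2:3] = "c"   # no final index can be built: slices are unhashable
    Traceback (most recent call last):
      ...
    TypeError: unhashable type: 'slice'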
The new proposal extends the current status quo, and grants more flexibility to
create the *final* index via an enhanced syntax that combines the positional index
and keyword arguments, if passed.
The above brings an important point across. Keyword arguments, in the context of the index
operation, may be used to take indexing decisions to obtain the final index, and therefore
will have to accept values that are unconventional for functions. See for
example use case 1, where a slice is accepted.
The new notation will make all of the following valid notation:
@ -328,13 +328,13 @@ The following old semantics are preserved:
1. As said above, an empty subscript is still illegal, regardless of context.
::
obj[] # SyntaxError
2. A single index value remains a single index value when passed:
::
obj[index]
# calls type(obj).__getitem__(obj, index)
@ -405,7 +405,7 @@ The following old semantics are preserved:
Note that:
- a single positional index will not turn into a tuple
just because one adds a keyword value.
- for ``__setitem__``, the same order is retained for index and value.
The keyword arguments go at the end, as is normal for a function
@ -442,7 +442,7 @@ The following old semantics are preserved:
confused as to what is happening, and it is best if they receive an
immediate syntax error with an informative error message.
This restriction has however been considered arbitrary by some, and it might
be lifted at a later stage for symmetry with kwargs unpacking, see next.
8. Dict unpacking is permitted:
@ -495,12 +495,12 @@ The following old semantics are preserved:
obj[3, spam=False] # Valid. index = 3, spam = False, eggs = 2
obj[spam=False] # Valid. index = (), spam = False, eggs = 2
obj[] # Invalid.
13. The same semantics given above must be extended to ``__class_getitem__``:
Since PEP 560, type hints are dispatched so that for ``x[y]``, if no
``__getitem__`` method is found, and ``x`` is a type (class) object,
and ``x`` has a class method ``__class_getitem__``, that method is
called. The same changes should be applied to this method as well,
so that a form like ``list[T=int]`` can be accepted.
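
To make the call shapes above concrete, here is a sketch (illustrative only;
the class name ``Grid`` is hypothetical). Since the bracket syntax itself does
not exist yet, the dunder is invoked directly; under this proposal ``g[3, x=1]``
would produce the first call and ``g[x=1]`` the second:

::

    >>> class Grid:
    ...     def __getitem__(self, index, **kwargs):
    ...         return index, kwargs
    ...
    >>> g = Grid()
    >>> type(g).__getitem__(g, 3, x=1)    # what g[3, x=1] would do
    (3, {'x': 1})
    >>> type(g).__getitem__(g, (), x=1)   # what g[x=1] would do: empty tuple index
    ((), {'x': 1})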
Existing indexing implementations in standard classes
@ -512,7 +512,7 @@ statement ``d[1, a=2]`` will raise ``TypeError``, as their implementation will
not support the use of keyword arguments. The same holds for all other classes
(list, frozendict, etc.)
Corner case and Gotchas
-----------------------
With the introduction of the new notation, a few corner cases need to be analysed:
@ -530,33 +530,33 @@ With the introduction of the new notation, a few corner cases need to be analyse
obj[3, index=4]
obj[index=1]
The resulting behavior would be an error automatically, since it would be like
attempting to call the method with two values for the ``index`` argument, and
a ``TypeError`` will be raised. In the first case, the ``index`` would be ``3``,
in the second case, it would be the empty tuple ``()``.
Note that this behavior applies for all currently existing classes that rely on
indexing, meaning that there is no way for the new behavior to introduce
backward compatibility issues in this respect.
Classes that wish to stress this behavior explicitly can define their
parameters as positional-only:
::
def __getitem__(self, index, /):
2. a similar case occurs with setter notation
::
# Given type(obj).__getitem__(self, index, value):
obj[1, value=3] = 5
This poses no issue because the value is passed automatically, and the Python interpreter will raise
``TypeError: got multiple values for keyword argument 'value'``
3. If the subscript dunders are declared to use positional-or-keyword
parameters, there may be some surprising cases when arguments are passed
@ -607,9 +607,9 @@ With the introduction of the new notation, a few corner cases need to be analyse
only that the single element happens to be a tuple.
Note that this behavior just reveals the truth that the ``obj[1,]`` notation is shorthand for
``obj[(1,)]`` (and also ``obj[1]`` is shorthand for ``obj[(1)]``, with the expected behavior).
When keywords are present, the rule that you can omit this outermost pair of parentheses is no
longer true.
::
@ -623,43 +623,43 @@ With the introduction of the new notation, a few corner cases need to be analyse
::
obj[1, 2] # calls __getitem__((1, 2))
obj[(1, 2)] # same as above
obj[1, 2, a=3] # calls __getitem__((1, 2), a=3)
obj[(1, 2), a=3] # calls __getitem__((1, 2), a=3)
And particularly when the tuple is extracted as a variable:
::
t = (1, 2)
obj[t] # calls __getitem__((1, 2))
obj[t, a=3] # calls __getitem__((1, 2), a=3)
Why? Because in the case ``obj[1, 2, a=3]`` we are passing two elements (which
are then packed as a tuple and passed as the index). In the case ``obj[(1, 2), a=3]``
we are passing a single element (which is passed as is) which happens to be a tuple.
The final result is that they are the same.
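
These equivalences for the positional part can be verified with current Python
(again a minimal, hypothetical probe class that just returns its index):

::

    >>> class Probe:
    ...     def __getitem__(self, index):
    ...         return index
    ...
    >>> p = Probe()
    >>> p[1, 2]           # two elements, packed into a tuple
    (1, 2)
    >>> p[(1, 2)]         # one element that happens to be a tuple
    (1, 2)
    >>> t = (1, 2)
    >>> p[t]
    (1, 2)
    >>> p[1,]             # trailing comma still makes a tuple
    (1,)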
C Interface
===========
Resolution of the indexing operation is performed through a call to
``PyObject_GetItem(PyObject *o, PyObject *key)`` for the get operation,
``PyObject_SetItem(PyObject *o, PyObject *key, PyObject *value)`` for the set operation, and
``PyObject_DelItem(PyObject *o, PyObject *key)`` for the del operation.
These functions are used extensively within the Python executable, and are
also part of the public C API, as exported by ``Include/abstract.h``. It is clear that
the signatures of these functions cannot be changed, and different C-level functions
need to be implemented to support the extended call. We propose
``PyObject_GetItemEx(PyObject *o, PyObject *key, PyObject *kwargs)``,
``PyObject_SetItemEx(PyObject *o, PyObject *key, PyObject *value, PyObject *kwargs)`` and
``PyObject_DelItemEx(PyObject *o, PyObject *key, PyObject *kwargs)``.
Additionally, new opcodes will be needed for the enhanced call.
Currently, the implementation uses ``BINARY_SUBSCR``, ``STORE_SUBSCR`` and ``DELETE_SUBSCR``
to invoke the old functions. We propose ``BINARY_SUBSCR_EX``,
``STORE_SUBSCR_EX`` and ``DELETE_SUBSCR_EX`` for the extended operation. The parser will
have to generate these new opcodes. The ``PyObject_(Get|Set|Del)Item`` implementations
will call the extended methods passing ``NULL`` as kwargs.
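
Although the changes are at the C level, the intended dispatch can be modeled
in Python. The following sketch is only an approximation of what the proposed
``PyObject_GetItemEx`` would do (the helper name ``getitem_ex`` is made up;
passing ``kwargs=None`` plays the role of the ``NULL`` kwargs used by the
unchanged ``PyObject_GetItem``):

::

    def getitem_ex(obj, key, kwargs=None):
        # Rough Python-level model only; not the actual C implementation.
        kwargs = kwargs or {}
        cls = type(obj)
        if hasattr(cls, "__getitem__"):
            # Existing classes that ignore kwargs will raise TypeError here,
            # matching the behavior described for dict, list, etc.
            return cls.__getitem__(obj, key, **kwargs)
        # PEP 560 fallback: subscripting a class object uses __class_getitem__
        if isinstance(obj, type) and hasattr(obj, "__class_getitem__"):
            return obj.__class_getitem__(key, **kwargs)
        raise TypeError(f"{cls.__name__!r} object is not subscriptable")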
@ -673,7 +673,7 @@ problem that the PEP solves." [#pep-0001]_
Some rough equivalents to the proposed extension, which we call work-arounds,
are already possible. The work-arounds provide an alternative to enabling the
new syntax, while leaving the semantics to be defined elsewhere.
These work-arounds follow. In them the helpers ``H`` and ``P`` are not intended to
be universal. For example, a module or package might require the use of its own
@ -697,20 +697,20 @@ helpers.
2. A helper class, here called ``H``, can be used to swap the container
and parameter roles. In other words, we use
::
H(1, 2, a=3, b=4)[x]
as a substitute for
::
x[1, 2, a=3, b=4]
This method will work for ``getitem``, ``delitem`` and also for
``setitem``. This is because
::
>>> H(1, 2, a=3, b=4)[x] = val
@ -719,7 +719,7 @@ helpers.
3. A helper function, here called ``P``, can be used to store the
arguments in a single object. For example
::
>>> x[P(1, 2, a=3, b=4)] = val
@ -728,13 +728,13 @@ helpers.
4. The ``lo:hi:step`` syntax for slices is sometimes very useful. This
syntax is not directly available in the work-arounds. However
::
s[lo:hi:step]
provides a work-around that is available everywhere, where
::
class S:
def __getitem__(self, key): return key
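
For concreteness, one possible shape of the helpers mentioned above (``P``,
``H``, plus a toy ``Probe`` container). This is an illustration only; the PEP
does not prescribe any particular implementation, and ``Probe`` is purely
hypothetical:

::

    class P:
        # Pack positional and keyword index parts into a single key object.
        def __init__(self, *args, **kwargs):
            self.args, self.kwargs = args, kwargs
        def __repr__(self):
            return f"P(args={self.args}, kwargs={self.kwargs})"

    class H:
        # Swap container and parameter roles: H(1, 2, a=3)[x] instead of x[1, 2, a=3].
        def __init__(self, *args, **kwargs):
            self.key = P(*args, **kwargs)
        def __getitem__(self, container):
            return container[self.key]
        def __setitem__(self, container, value):
            container[self.key] = value
        def __delitem__(self, container):
            del container[self.key]

    class Probe:
        # A toy container that simply echoes whatever key it receives.
        def __getitem__(self, key):
            return key

    x = Probe()
    print(x[P(1, 2, a=3, b=4)])   # P(args=(1, 2), kwargs={'a': 3, 'b': 4})
    print(H(1, 2, a=3, b=4)[x])   # same key reaches x, with the roles swapped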
@ -750,7 +750,7 @@ Previous PEP 472 solutions
--------------------------
PEP 472 presents a good amount of ideas that are now all to be considered
Rejected. A personal email from D'Aprano to one of the authors (Stefano Borini)
specifically said:
I have now carefully read through PEP 472 in full, and I am afraid I
@ -773,7 +773,7 @@ The rationale around this choice is to make the intuition around how to add kwd
arg support to square brackets more obvious and in line with the function
behavior. Given:
::
def __getitem_ex__(self, x, y): ...
@ -793,10 +793,10 @@ The problems with this approach were found to be:
- It will slow down subscripting. For every subscript access, this new dunder
attribute gets investigated on the class, and if it is not present then the
default key translation function is executed.
Different ideas were proposed to handle this, from wrapping the method
only at class instantiation time, to adding a bit flag to signal the availability
of these methods. Regardless of the solution, the new dunder would be effective
only if added at class creation time, not if it's added later. This would
be unusual and would disallow (and behave unexpectedly) monkeypatching of the
methods for whatever reason it might be needed.
@ -819,17 +819,17 @@ The problems with this approach were found to be:
the signature:
::
obj[1, 2] = 3 # calls obj.__setitem_ex__(3, 1, 2)
- the solution relies on the assumption that all keyword indices necessarily map
into positional indices, or that they must have a name. This assumption may be
false: xarray, which is the primary python package for numpy arrays with
labelled dimensions, supports indexing by additional dimensions (so called
"non-dimension coordinates") that don't correspond directly to the dimensions
of the underlying numpy array, and those have no position to match up to.
In other words, anonymous indexes are a plausible use case that this solution
would remove, although it could be argued that using ``*args`` would solve
that issue.
Adding an adapter function
@ -858,7 +858,7 @@ make sense and which ones do not.
Using a single bit to change the behavior
-----------------------------------------
A special class dunder flag
::
@ -879,7 +879,7 @@ would result in a call to
This option has been rejected because it feels odd that a signature of a method
depends on a specific value of another dunder. It would be confusing for both
static type checkers and for humans: a static type checker would have to hard-code
a special case for this, because there really is nothing else in Python
where the signature of a dunder depends on the value of another dunder.
A human that has to implement a ``__getitem__`` dunder would have to look if in the
@ -905,7 +905,7 @@ a commenter stated
dict type could be made to reject it.
This proposal already established that, in case no positional index is given, the
passed value must be the empty tuple. Allowing for the empty index notation would
make the dictionary type accept it automatically, to insert or refer to the value with
the empty tuple as key. Moreover, a typing notation such as ``Tuple[]`` can easily
be written as ``Tuple`` without the indexing notation.
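
For reference, the empty tuple already works as a dictionary key today; the
rejected ``obj[]`` notation would merely have provided a shorthand for it:

::

    >>> d = {}
    >>> d[()] = "stored under the empty tuple"
    >>> d[()]
    'stored under the empty tuple'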
@ -913,7 +913,7 @@ be written as ``Tuple`` without the indexing notation.
Use None instead of the empty tuple when no positional index is given
---------------------------------------------------------------------
The case ``obj[k=3]`` will lead to a call ``__getitem__((), k=3)``.
The alternative ``__getitem__(None, k=3)`` was considered but rejected:
NumPy uses ``None`` to indicate inserting a new axis/dimensions (there's
a ``np.newaxis`` alias as well):
@ -947,7 +947,7 @@ no positional arguments
>>> def foo(*args, **kwargs):
... print(args, kwargs)
...
>>> foo(k=3)
() {'k': 3}
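
The NumPy behavior referred to above can be checked directly (illustration
only; requires ``numpy``):

::

    >>> import numpy as np
    >>> np.newaxis is None
    True
    >>> a = np.arange(6).reshape(2, 3)
    >>> a[None].shape           # None already means "insert a new axis"
    (1, 2, 3)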
@ -974,7 +974,7 @@ Common objections
Without kwdargs inside ``[]``, you would not be able to do this:
::
Vector = dict[i=float, j=float]
but for obvious reasons, call syntax using builtins to create custom type hints
@ -989,7 +989,7 @@ References
.. [#rejection] "Rejection of PEP 472"
(https://mail.python.org/pipermail/python-dev/2019-March/156693.html)
.. [#pep-0484] "PEP 484 -- Type hints"
(https://www.python.org/dev/peps/pep-0484)
.. [#request-1] "Allow kwargs in __{get|set|del}item__"
(https://mail.python.org/archives/list/python-ideas@python.org/thread/EUGDRTRFIY36K4RM3QRR52CKCI7MIR2M/)