2020-09-22 20:14:43 -04:00
|
|
|
|
PEP: 637
|
|
|
|
|
Title: Support for indexing with keyword arguments
|
|
|
|
|
Version: $Revision$
|
|
|
|
|
Last-Modified: $Date$
|
|
|
|
|
Author: Stefano Borini, Jonathan Fine
|
|
|
|
|
Sponsor: Steven D'Aprano
|
|
|
|
|
Discussions-To: python-ideas@python.org
|
|
|
|
|
Status: Draft
|
|
|
|
|
Type: Standards Track
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 24-Aug-2020
|
|
|
|
|
Python-Version: 3.10
|
2020-09-23 17:43:22 -04:00
|
|
|
|
Post-History: 23-Sep-2020
|
2020-09-24 17:19:40 -04:00
|
|
|
|
Resolution:
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
At present keyword arguments are allowed in function calls, but not in
|
|
|
|
|
item access. This PEP proposes that Python be extended to allow keyword
|
|
|
|
|
arguments in item access.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
The following example shows keyword arguments for ordinary function calls::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> val = f(1, 2, a=3, b=4)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The proposal would extend the syntax to allow a similar construct
|
2020-09-25 20:00:29 -04:00
|
|
|
|
to indexing operations::
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> val = x[1, 2, a=3, b=4] # getitem
|
|
|
|
|
>>> x[1, 2, a=3, b=4] = val # setitem
|
|
|
|
|
>>> del x[1, 2, a=3, b=4] # delitem
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
and would also provide appropriate semantics.
|
|
|
|
|
|
|
|
|
|
This PEP is a successor to PEP 472, which was rejected due to lack of
|
|
|
|
|
interest in 2019. Since then there's been renewed interest in the feature.
|
|
|
|
|
|
|
|
|
|
Overview
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
Background
|
|
|
|
|
----------
|
|
|
|
|
|
|
|
|
|
PEP 472 was opened in 2014. The PEP detailed various use cases and was created by
|
|
|
|
|
extracting implementation strategies from a broad discussion on the
|
|
|
|
|
python-ideas mailing list, although no clear consensus was reached on which strategy
|
|
|
|
|
should be used. Many corner cases have been examined more closely and felt
|
2020-09-24 17:19:40 -04:00
|
|
|
|
awkward, backward incompatible or both.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The PEP was eventually rejected in 2019 [#rejection]_ mostly
|
|
|
|
|
due to lack of interest for the feature despite its 5 years of existence.
|
|
|
|
|
|
|
|
|
|
However, with the introduction of type hints in PEP 484 [#pep-0484]_ the
|
|
|
|
|
square bracket notation has been used consistently to enrich the typing
|
|
|
|
|
annotations, e.g. to specify a list of integers as Sequence[int]. Additionally,
|
|
|
|
|
there has been an expanded growth of packages for data analysis such as pandas
|
|
|
|
|
and xarray, which use names to describe columns in a table (pandas) or axis in
|
|
|
|
|
an nd-array (xarray). These packages allow users to access specific data by
|
|
|
|
|
names, but cannot currently use index notation ([]) for this functionality.
|
|
|
|
|
|
|
|
|
|
As a result, a renewed interest in a more flexible syntax that would allow for
|
|
|
|
|
named information has been expressed occasionally in many different threads on
|
|
|
|
|
python-ideas, recently by Caleb Donovick [#request-1]_ in 2019 and Andras
|
|
|
|
|
Tantos [#request-2]_ in 2020. These requests prompted a strong activity on the
|
|
|
|
|
python-ideas mailing list, where the various options have been re-discussed and
|
|
|
|
|
a general consensus on an implementation strategy has now been reached.
|
|
|
|
|
|
|
|
|
|
Use cases
|
|
|
|
|
---------
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
The following practical use cases present different cases where a keyword
|
2020-09-22 20:14:43 -04:00
|
|
|
|
specification would improve notation and provide additional value:
|
|
|
|
|
|
|
|
|
|
1. To provide a more communicative meaning to the index, preventing e.g. accidental
|
2020-09-25 20:00:29 -04:00
|
|
|
|
inversion of indexes::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> grid_position[x=3, y=5, z=8]
|
|
|
|
|
>>> rain_amount[time=0:12, location=location]
|
|
|
|
|
>>> matrix[row=20, col=40]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
2. To enrich the typing notation with keywords, especially during the use of generics::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def function(value: MyType[T=int]):
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
3. In some domain, such as computational physics and chemistry, the use of a
|
|
|
|
|
notation such as ``Basis[Z=5]`` is a Domain Specific Language notation to represent
|
2020-09-25 20:00:29 -04:00
|
|
|
|
a level of accuracy::
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> low_accuracy_energy = computeEnergy(molecule, BasisSet[Z=3])
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
4. Pandas currently uses a notation such as::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> df[df['x'] == 1]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
which could be replaced with ``df[x=1]``.
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
5. xarray has named dimensions. Currently these are handled with functions .isel::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> data.isel(row=10) # Returns the tenth row
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
which could also be replaced with ``data[row=10]``. A more complex example::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> # old syntax
|
|
|
|
|
>>> da.isel(space=0, time=slice(None, 2))[...] = spam
|
|
|
|
|
>>> # new syntax
|
|
|
|
|
>>> da[space=0, time=:2] = spam
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
Another example::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> # old syntax
|
|
|
|
|
>>> ds["empty"].loc[dict(lon=5, lat=6)] = 10
|
|
|
|
|
>>> # new syntax
|
|
|
|
|
>>> ds["empty"][lon=5, lat=6] = 10
|
|
|
|
|
|
|
|
|
|
>>> # old syntax
|
|
|
|
|
>>> ds["empty"].loc[dict(lon=slice(1, 5), lat=slice(3, None))] = 10
|
|
|
|
|
>>> # new syntax
|
|
|
|
|
>>> ds["empty"][lon=1:5, lat=6:] = 10
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
It is important to note that how the notation is interpreted is up to the
|
|
|
|
|
implementation. This PEP only defines and dictates the behavior of python
|
|
|
|
|
regarding passed keyword arguments, not how these arguments should be
|
|
|
|
|
interpreted and used by the implementing class.
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Current status of indexing operation
|
|
|
|
|
------------------------------------
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Before detailing the new syntax and semantics to the indexing notation, it is
|
|
|
|
|
relevant to analyse how the indexing notation works today, in which contexts,
|
|
|
|
|
and how it is different from a function call.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Subscripting ``obj[x]`` is, effectively, an alternate and specialised form of
|
|
|
|
|
function call syntax with a number of differences and restrictions compared to
|
|
|
|
|
``obj(x)``. The current python syntax focuses exclusively on position to express
|
|
|
|
|
the index, and also contains syntactic sugar to refer to non-punctiform
|
2020-09-25 20:00:29 -04:00
|
|
|
|
selection (slices). Some common examples::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> a[3] # returns the fourth element of 'a'
|
|
|
|
|
>>> a[1:10:2] # slice notation (extract a non-trivial data subset)
|
|
|
|
|
>>> a[3, 2] # multiple indexes (for multidimensional arrays)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This translates into a ``__(get|set|del)item__`` dunder call which is passed a single
|
|
|
|
|
parameter containing the index (for ``__getitem__`` and ``__delitem__``) or two parameters
|
|
|
|
|
containing index and value (for ``__setitem__``).
|
|
|
|
|
|
|
|
|
|
The behavior of the indexing call is fundamentally different from a function call
|
|
|
|
|
in various aspects:
|
|
|
|
|
|
|
|
|
|
The first difference is in meaning to the reader. A function call says
|
|
|
|
|
"arbitrary function call potentially with side-effects". An indexing operation
|
|
|
|
|
says "lookup", typically to point at a subset or specific sub-aspect of an
|
|
|
|
|
entity (as in the case of typing notation). This fundamental difference means
|
|
|
|
|
that, while we cannot prevent abuse, implementors should be aware that the
|
|
|
|
|
introduction of keyword arguments to alter the behavior of the lookup may
|
|
|
|
|
violate this intrinsic meaning.
|
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
The second difference of the indexing notation compared to a function
|
2020-09-22 20:14:43 -04:00
|
|
|
|
is that indexing can be used for both getting and setting operations.
|
|
|
|
|
In python, a function cannot be on the left hand side of an assignment. In
|
2020-09-25 20:00:29 -04:00
|
|
|
|
other words, both of these are valid::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x = a[1, 2]
|
|
|
|
|
>>> a[1, 2] = 5
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
but only the first one of these is valid::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x = f(1, 2)
|
|
|
|
|
>>> f(1, 2) = 5 # invalid
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This asymmetry is important, and makes one understand that there is a natural
|
|
|
|
|
imbalance between the two forms. It is therefore not a given that the two
|
2020-09-24 17:19:40 -04:00
|
|
|
|
should behave transparently and symmetrically.
|
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
The third difference is that functions have names assigned to their
|
2020-09-26 16:32:35 -04:00
|
|
|
|
arguments, unless the passed parameters are captured with ``*args``, in which case
|
2020-09-22 20:14:43 -04:00
|
|
|
|
they end up as entries in the args tuple. In other words, functions already
|
|
|
|
|
have anonymous argument semantic, exactly like the indexing operation. However,
|
2020-09-26 16:32:35 -04:00
|
|
|
|
``__(get|set|del)item__`` is not always receiving a tuple as the ``index`` argument
|
|
|
|
|
(to be uniform in behavior with ``*args``). In fact, given a trivial class::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
class X:
|
|
|
|
|
def __getitem__(self, index):
|
|
|
|
|
print(index)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
The index operation basically forwards the content of the square brackets "as is"
|
2020-09-25 20:00:29 -04:00
|
|
|
|
in the ``index`` argument::
|
2020-09-26 16:32:35 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x=X()
|
|
|
|
|
>>> x[0]
|
|
|
|
|
0
|
|
|
|
|
>>> x[0, 1]
|
|
|
|
|
(0, 1)
|
|
|
|
|
>>> x[(0, 1)]
|
|
|
|
|
(0, 1)
|
|
|
|
|
>>>
|
|
|
|
|
>>> x[()]
|
|
|
|
|
()
|
|
|
|
|
>>> x[{1, 2, 3}]
|
|
|
|
|
{1, 2, 3}
|
|
|
|
|
>>> x["hello"]
|
|
|
|
|
hello
|
|
|
|
|
>>> x["hello", "hi"]
|
|
|
|
|
('hello', 'hi')
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The fourth difference is that the indexing operation knows how to convert
|
2020-09-25 20:00:29 -04:00
|
|
|
|
colon notations to slices, thanks to support from the parser. This is valid::
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
a[1:3]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
this one isn't::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
f(1:3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
The fifth difference is that there's no zero-argument form. This is valid::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
f()
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
this one isn't::
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
a[]
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Specification
|
|
|
|
|
=============
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Before describing the specification, it is important to stress the difference in
|
|
|
|
|
nomenclature between *positional index*, *final index* and *keyword argument*, as it is important to
|
|
|
|
|
understand the fundamental asymmetries at play. The ``__(get|set|del)item__``
|
2020-09-22 20:14:43 -04:00
|
|
|
|
is fundamentally an indexing operation, and the way the element is retrieved,
|
2020-09-26 16:32:35 -04:00
|
|
|
|
set, or deleted is through an index, the *final index*.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
The current status quo is to directly build the *final index* from what is passed between
|
|
|
|
|
square brackets, the *positional index*. In other words, what is passed in the
|
2020-09-22 20:14:43 -04:00
|
|
|
|
square brackets is trivially used to generate what the code in ``__getitem__`` then uses
|
|
|
|
|
for the indicisation operation. As we already saw for the dict, ``d[1]`` has a
|
|
|
|
|
positional index of ``1`` and also a final index of ``1`` (because it's the element that is
|
2020-09-24 17:19:40 -04:00
|
|
|
|
then added to the dictionary) and ``d[1, 2]`` has positional index of ``(1, 2)`` and
|
2020-09-22 20:14:43 -04:00
|
|
|
|
final index also of ``(1, 2)`` (because yet again it's the element that is added to the dictionary).
|
|
|
|
|
However, the positional index ``d[1,2:3]`` is not accepted by the dictionary, because
|
|
|
|
|
there's no way to transform the positional index into a final index, as the slice object is
|
|
|
|
|
unhashable. The positional index is what is currently known as the ``index`` parameter in
|
|
|
|
|
``__getitem__``. Nevertheless, nothing prevents to construct a dictionary-like class that
|
|
|
|
|
creates the final index by e.g. converting the positional index to a string.
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
This PEP extends the current status quo, and grants more flexibility to
|
|
|
|
|
create the final index via an enhanced syntax that combines the positional index
|
2020-09-24 17:19:40 -04:00
|
|
|
|
and keyword arguments, if passed.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The above brings an important point across. Keyword arguments, in the context of the index
|
|
|
|
|
operation, may be used to take indexing decisions to obtain the final index, and therefore
|
|
|
|
|
will have to accept values that are unconventional for functions. See for
|
2020-09-24 17:19:40 -04:00
|
|
|
|
example use case 1, where a slice is accepted.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
The successful implementation of this PEP will result in the following behavior:
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
1. An empty subscript is still illegal, regardless of context::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[] # SyntaxError
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
2. A single index value remains a single index value when passed::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[index]
|
|
|
|
|
# calls type(obj).__getitem__(obj, index)
|
|
|
|
|
|
|
|
|
|
obj[index] = value
|
|
|
|
|
# calls type(obj).__setitem__(obj, index, value)
|
|
|
|
|
|
|
|
|
|
del obj[index]
|
|
|
|
|
# calls type(obj).__delitem__(obj, index)
|
2020-09-26 16:32:35 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
This remains the case even if the index is followed by keywords; see point 5 below.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
3. Comma-seperated arguments are still parsed as a tuple and passed as
|
2020-09-25 20:00:29 -04:00
|
|
|
|
a single positional argument::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[spam, eggs]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (spam, eggs))
|
|
|
|
|
|
|
|
|
|
obj[spam, eggs] = value
|
|
|
|
|
# calls type(obj).__setitem__(obj, (spam, eggs), value)
|
|
|
|
|
|
|
|
|
|
del obj[spam, eggs]
|
|
|
|
|
# calls type(obj).__delitem__(obj, (spam, eggs))
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The points above mean that classes which do not want to support keyword
|
|
|
|
|
arguments in subscripts need do nothing at all, and the feature is therefore
|
|
|
|
|
completely backwards compatible.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
4. Keyword arguments, if any, must follow positional arguments::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[1, 2, spam=None, 3] # SyntaxError
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This is like function calls, where intermixing positional and keyword
|
|
|
|
|
arguments give a SyntaxError.
|
|
|
|
|
|
|
|
|
|
5. Keyword subscripts, if any, will be handled like they are in
|
2020-09-25 20:00:29 -04:00
|
|
|
|
function calls. Examples::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
# Single index with keywords:
|
|
|
|
|
|
|
|
|
|
obj[index, spam=1, eggs=2]
|
|
|
|
|
# calls type(obj).__getitem__(obj, index, spam=1, eggs=2)
|
|
|
|
|
|
|
|
|
|
obj[index, spam=1, eggs=2] = value
|
|
|
|
|
# calls type(obj).__setitem__(obj, index, value, spam=1, eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
del obj[index, spam=1, eggs=2]
|
|
|
|
|
# calls type(obj).__delitem__(obj, index, spam=1, eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
# Comma-separated indices with keywords:
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[foo, bar, spam=1, eggs=2]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (foo, bar), spam=1, eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[foo, bar, spam=1, eggs=2] = value
|
|
|
|
|
# calls type(obj).__setitem__(obj, (foo, bar), value, spam=1, eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
del obj[foo, bar, spam=1, eggs=2]
|
|
|
|
|
# calls type(obj).__detitem__(obj, (foo, bar), spam=1, eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Note that:
|
|
|
|
|
|
|
|
|
|
- a single positional index will not turn into a tuple
|
2020-09-24 17:19:40 -04:00
|
|
|
|
just because one adds a keyword value.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
- for ``__setitem__``, the same order is retained for index and value.
|
|
|
|
|
The keyword arguments go at the end, as is normal for a function
|
|
|
|
|
definition.
|
|
|
|
|
|
|
|
|
|
6. The same rules apply with respect to keyword subscripts as for
|
|
|
|
|
keywords in function calls:
|
|
|
|
|
|
|
|
|
|
- the interpeter matches up each keyword subscript to a named parameter
|
|
|
|
|
in the appropriate method;
|
|
|
|
|
|
|
|
|
|
- if a named parameter is used twice, that is an error;
|
|
|
|
|
|
|
|
|
|
- if there are any named parameters left over (without a value) when the
|
|
|
|
|
keywords are all used, they are assigned their default value (if any);
|
|
|
|
|
|
|
|
|
|
- if any such parameter doesn't have a default, that is an error;
|
|
|
|
|
|
|
|
|
|
- if there are any keyword subscripts remaining after all the named
|
|
|
|
|
parameters are filled, and the method has a ``**kwargs`` parameter,
|
|
|
|
|
they are bound to the ``**kwargs`` parameter as a dict;
|
|
|
|
|
|
|
|
|
|
- but if no ``**kwargs`` parameter is defined, it is an error.
|
|
|
|
|
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
7. Sequence unpacking remains a syntax error inside subscripts::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[*items]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Reason: unpacking items would result it being immediately repacked into
|
|
|
|
|
a tuple. Anyone using sequence unpacking in the subscript is probably
|
|
|
|
|
confused as to what is happening, and it is best if they receive an
|
|
|
|
|
immediate syntax error with an informative error message.
|
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
This restriction has however been considered arbitrary by some, and it might
|
2020-09-22 20:14:43 -04:00
|
|
|
|
be lifted at a later stage for symmetry with kwargs unpacking, see next.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
8. Dict unpacking is permitted::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
items = {'spam': 1, 'eggs': 2}
|
|
|
|
|
obj[index, **items]
|
|
|
|
|
# equivalent to obj[index, spam=1, eggs=2]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
9. Keyword-only subscripts are permitted. The positional index will be the empty tuple::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[spam=1, eggs=2]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (), spam=1, eggs=2)
|
|
|
|
|
|
|
|
|
|
obj[spam=1, eggs=2] = 5
|
|
|
|
|
# calls type(obj).__setitem__(obj, (), 5, spam=1, eggs=2)
|
|
|
|
|
|
|
|
|
|
del obj[spam=1, eggs=2]
|
|
|
|
|
# calls type(obj).__delitem__(obj, (), spam=1, eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
10. Keyword arguments must allow slice syntax::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[3:4, spam=1:4, eggs=2]
|
|
|
|
|
# calls type(obj).__getitem__(obj, slice(3, 4, None), spam=slice(1, 4, None), eggs=2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This may open up the possibility to accept the same syntax for general function
|
|
|
|
|
calls, but this is not part of this recommendation.
|
|
|
|
|
|
2020-09-26 12:38:24 -04:00
|
|
|
|
11. Keyword arguments allow for default values::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
# Given type(obj).__getitem__(obj, index, spam=True, eggs=2)
|
|
|
|
|
obj[3] # Valid. index = 3, spam = True, eggs = 2
|
|
|
|
|
obj[3, spam=False] # Valid. index = 3, spam = False, eggs = 2
|
|
|
|
|
obj[spam=False] # Valid. index = (), spam = False, eggs = 2
|
|
|
|
|
obj[] # Invalid.
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-26 12:38:24 -04:00
|
|
|
|
12. The same semantics given above must be extended to ``__class__getitem__``:
|
2020-09-22 20:14:43 -04:00
|
|
|
|
Since PEP 560, type hints are dispatched so that for ``x[y]``, if no
|
|
|
|
|
``__getitem__`` method is found, and ``x`` is a type (class) object,
|
2020-09-24 17:19:40 -04:00
|
|
|
|
and ``x`` has a class method ``__class_getitem__``, that method is
|
|
|
|
|
called. The same changes should be applied to this method as well,
|
2020-09-22 20:14:43 -04:00
|
|
|
|
so that a writing like ``list[T=int]`` can be accepted.
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Indexing behavior in standard classes (dict, list, etc.)
|
|
|
|
|
--------------------------------------------------------
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
None of what is proposed in this PEP will change the behavior of the current
|
|
|
|
|
core classes that use indexing. Adding keywords to the index operation for
|
|
|
|
|
custom classes is not the same as modifying e.g. the standard dict type to
|
|
|
|
|
handle keyword arguments. In fact, dict (as well as list and other stdlib
|
|
|
|
|
classes with indexing semantics) will remain the same and will continue not to
|
|
|
|
|
accept keyword arguments. In other words, if ``d`` is a ``dict``, the
|
2020-09-22 20:14:43 -04:00
|
|
|
|
statement ``d[1, a=2]`` will raise ``TypeError``, as their implementation will
|
|
|
|
|
not support the use of keyword arguments. The same holds for all other classes
|
|
|
|
|
(list, frozendict, etc.)
|
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
Corner case and Gotchas
|
2020-09-22 20:14:43 -04:00
|
|
|
|
-----------------------
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
With the introduction of the new notation, a few corner cases need to be analysed.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
1. Technically, if a class defines their getter like this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem__(self, index):
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
then the caller could call that using keyword syntax, like these two cases::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[3, index=4]
|
|
|
|
|
obj[index=1]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
The resulting behavior would be an error automatically, since it would be like
|
|
|
|
|
attempting to call the method with two values for the ``index`` argument, and
|
|
|
|
|
a ``TypeError`` will be raised. In the first case, the ``index`` would be ``3``,
|
|
|
|
|
in the second case, it would be the empty tuple ``()``.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
Note that this behavior applies for all currently existing classes that rely on
|
|
|
|
|
indexing, meaning that there is no way for the new behavior to introduce
|
|
|
|
|
backward compatibility issues on this respect.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
Classes that wish to stress this behavior explicitly can define their
|
2020-09-25 20:00:29 -04:00
|
|
|
|
parameters as positional-only::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem__(self, index, /):
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
2. a similar case occurs with setter notation::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
# Given type(obj).__getitem__(self, index, value):
|
|
|
|
|
obj[1, value=3] = 5
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
This poses no issue because the value is passed automatically, and the python interpreter will raise
|
2020-09-22 20:14:43 -04:00
|
|
|
|
``TypeError: got multiple values for keyword argument 'value'``
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
3. If the subscript dunders are declared to use positional-or-keyword
|
|
|
|
|
parameters, there may be some surprising cases when arguments are passed
|
2020-09-25 20:00:29 -04:00
|
|
|
|
to the method. Given the signature::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem__(self, index, direction='north')
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
if the caller uses this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[0, 'south']
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
they will probably be surprised by the method call::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
# expected type(obj).__getitem__(0, direction='south')
|
|
|
|
|
# but actually get:
|
|
|
|
|
obj.__getitem__((0, 'south'), direction='north')
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Solution: best practice suggests that keyword subscripts should be
|
2020-09-25 20:00:29 -04:00
|
|
|
|
flagged as keyword-only when possible::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem__(self, index, *, direction='north')
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The interpreter need not enforce this rule, as there could be scenarios
|
|
|
|
|
where this is the desired behaviour. But linters may choose to warn
|
|
|
|
|
about subscript methods which don't use the keyword-only flag.
|
|
|
|
|
|
|
|
|
|
4. As we saw, a single value followed by a keyword argument will not be changed into a tuple, i.e.:
|
|
|
|
|
``d[1, a=3]`` is treated as ``__getitem__(1, a=3)``, NOT ``__getitem__((1,), a=3)``. It would be
|
|
|
|
|
extremely confusing if adding keyword arguments were to change the type of the passed index.
|
|
|
|
|
In other words, adding a keyword to a single-valued subscript will not change it into a tuple.
|
2020-09-25 20:00:29 -04:00
|
|
|
|
For those cases where an actual tuple needs to be passed, a proper syntax will have to be used::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[(1,), a=3] # calls __getitem__((1,), a=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
In this case, the call is passing a single element (which is passed as is, as from rule above),
|
|
|
|
|
only that the single element happens to be a tuple.
|
|
|
|
|
|
|
|
|
|
Note that this behavior just reveals the truth that the ``obj[1,]`` notation is shorthand for
|
2020-09-24 17:19:40 -04:00
|
|
|
|
``obj[(1,)]`` (and also ``obj[1]`` is shorthand for ``obj[(1)]``, with the expected behavior).
|
|
|
|
|
When keywords are present, the rule that you can omit this outermost pair of parentheses is no
|
2020-09-25 20:00:29 -04:00
|
|
|
|
longer true::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[1] # calls __getitem__(1)
|
|
|
|
|
obj[1, a=3] # calls __getitem__(1, a=3)
|
|
|
|
|
obj[1,] # calls __getitem__((1,))
|
|
|
|
|
obj[(1,), a=3] # calls __getitem__((1,), a=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
This is particularly relevant in the case where two entries are passed::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[1, 2] # calls __getitem__((1, 2))
|
|
|
|
|
obj[(1, 2)] # same as above
|
|
|
|
|
obj[1, 2, a=3] # calls __getitem__((1, 2), a=3)
|
|
|
|
|
obj[(1, 2), a=3] # calls __getitem__((1, 2), a=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
And particularly when the tuple is extracted as a variable::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
t = (1, 2)
|
|
|
|
|
obj[t] # calls __getitem__((1, 2))
|
|
|
|
|
obj[t, a=3] # calls __getitem__((1, 2), a=3)
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
Why? because in the case ``obj[1, 2, a=3]`` we are passing two elements (which
|
2020-09-24 17:19:40 -04:00
|
|
|
|
are then packed as a tuple and passed as the index). In the case ``obj[(1, 2), a=3]``
|
|
|
|
|
we are passing a single element (which is passed as is) which happens to be a tuple.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
The final result is that they are the same.
|
|
|
|
|
|
|
|
|
|
C Interface
|
|
|
|
|
===========
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Resolution of the indexing operation is performed through a call to the following functions
|
|
|
|
|
|
|
|
|
|
- ``PyObject_GetItem(PyObject *o, PyObject *key)`` for the get operation
|
|
|
|
|
- ``PyObject_SetItem(PyObject *o, PyObject *key, PyObject *value)`` for the set operation
|
|
|
|
|
- ``PyObject_DelItem(PyObject *o, PyObject *key)`` for the del operation
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
These functions are used extensively within the python executable, and are
|
|
|
|
|
also part of the public C API, as exported by ``Include/abstract.h``. It is clear that
|
2020-09-24 17:19:40 -04:00
|
|
|
|
the signature of this function cannot be changed, and different C level functions
|
2020-09-22 20:14:43 -04:00
|
|
|
|
need to be implemented to support the extended call. We propose
|
2020-09-26 16:32:35 -04:00
|
|
|
|
|
|
|
|
|
- ``PyObject_GetItemWithKeywords(PyObject *o, PyObject *key, PyObject *kwargs)``
|
|
|
|
|
- ``PyObject_SetItemWithKeywords(PyObject *o, PyObject *key, PyObject *value, PyObject *kwargs)``
|
|
|
|
|
- ``PyObject_GetItemWithKeywords(PyObject *o, PyObject *key, PyObject *kwargs)``
|
|
|
|
|
|
|
|
|
|
Additionally, new opcodes will be needed for the enhanced call. Currently, the
|
|
|
|
|
implementation uses ``BINARY_SUBSCR``, ``STORE_SUBSCR`` and ``DELETE_SUBSCR``
|
|
|
|
|
to invoke the old functions. We propose ``BINARY_SUBSCR_KW``,
|
|
|
|
|
``STORE_SUBSCR_KW`` and ``DELETE_SUBSCR_KW`` for the new operations. The
|
|
|
|
|
compiler will have to generate these new opcodes. The
|
|
|
|
|
old C implementations will call the extended methods passing ``NULL``
|
|
|
|
|
as kwargs.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Workarounds
|
|
|
|
|
===========
|
|
|
|
|
|
|
|
|
|
Every PEP that changes the Python language should "clearly explain why
|
|
|
|
|
the existing language specification is inadequate to address the
|
|
|
|
|
problem that the PEP solves." [#pep-0001]_
|
|
|
|
|
|
|
|
|
|
Some rough equivalents to the proposed extension, which we call work-arounds,
|
|
|
|
|
are already possible. The work-arounds provide an alternative to enabling the
|
2020-09-24 17:19:40 -04:00
|
|
|
|
new syntax, while leaving the semantics to be defined elsewhere.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
These work-arounds follow. In them the helpers ``H`` and ``P`` are not intended to
|
|
|
|
|
be universal. For example, a module or package might require the use of its own
|
|
|
|
|
helpers.
|
|
|
|
|
|
|
|
|
|
1. User defined classes can be given ``getitem`` and ``delitem`` methods,
|
2020-09-25 20:00:29 -04:00
|
|
|
|
that respectively get and delete values stored in a container::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> val = x.getitem(1, 2, a=3, b=4)
|
|
|
|
|
>>> x.delitem(1, 2, a=3, b=4)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
The same can't be done for ``setitem``. It's not valid syntax::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x.setitem(1, 2, a=3, b=4) = val
|
|
|
|
|
SyntaxError: can't assign to function call
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
2. A helper class, here called ``H``, can be used to swap the container
|
2020-09-25 20:00:29 -04:00
|
|
|
|
and parameter roles. In other words, we use::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
H(1, 2, a=3, b=4)[x]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
as a substitute for::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
x[1, 2, a=3, b=4]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This method will work for ``getitem``, ``delitem`` and also for
|
2020-09-25 20:00:29 -04:00
|
|
|
|
``setitem``. This is because::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> H(1, 2, a=3, b=4)[x] = val
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
is valid syntax, which can be given the appropriate semantics.
|
|
|
|
|
|
|
|
|
|
3. A helper function, here called ``P``, can be used to store the
|
2020-09-25 20:00:29 -04:00
|
|
|
|
arguments in a single object. For example::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x[P(1, 2, a=3, b=4)] = val
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
is valid syntax, and can be given the appropriate semantics.
|
|
|
|
|
|
|
|
|
|
4. The ``lo:hi:step`` syntax for slices is sometimes very useful. This
|
2020-09-25 20:00:29 -04:00
|
|
|
|
syntax is not directly available in the work-arounds. However::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
s[lo:hi:step]
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
provides a work-around that is available everything, where::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
class S:
|
|
|
|
|
def __getitem__(self, key): return key
|
|
|
|
|
|
|
|
|
|
s = S()
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
defines the helper object ``s``.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Rejected Ideas
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
Previous PEP 472 solutions
|
|
|
|
|
--------------------------
|
|
|
|
|
|
|
|
|
|
PEP 472 presents a good amount of ideas that are now all to be considered
|
2020-09-24 17:19:40 -04:00
|
|
|
|
Rejected. A personal email from D'Aprano to one of the authors (Stefano Borini)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
specifically said:
|
|
|
|
|
|
|
|
|
|
I have now carefully read through PEP 472 in full, and I am afraid I
|
|
|
|
|
cannot support any of the strategies currently in the PEP.
|
|
|
|
|
|
|
|
|
|
We agree that those options are inferior to the currently presented, for one
|
|
|
|
|
reason or another.
|
|
|
|
|
|
|
|
|
|
To keep this document compact, we will not present here the objections for
|
|
|
|
|
all options presented in PEP 472. Suffice to say that they were discussed,
|
|
|
|
|
and each proposed alternative had one or few dealbreakers.
|
|
|
|
|
|
|
|
|
|
Adding new dunders
|
|
|
|
|
------------------
|
|
|
|
|
|
|
|
|
|
It was proposed to introduce new dunders ``__(get|set|del)item_ex__``
|
|
|
|
|
that are invoked over the ``__(get|set|del)item__`` triad, if they are present.
|
|
|
|
|
|
|
|
|
|
The rationale around this choice is to make the intuition around how to add kwd
|
|
|
|
|
arg support to square brackets more obvious and in line with the function
|
2020-09-25 20:00:29 -04:00
|
|
|
|
behavior. Given::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem_ex__(self, x, y): ...
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
These all just work and produce the same result effortlessly::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[1, 2]
|
|
|
|
|
obj[1, y=2]
|
|
|
|
|
obj[y=2, x=1]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
In other words, this solution would unify the behavior of ``__getitem__`` to the traditional
|
|
|
|
|
function signature, but since we can't change ``__getitem__`` and break backward compatibility,
|
|
|
|
|
we would have an extended version that is used preferentially.
|
|
|
|
|
|
|
|
|
|
The problems with this approach were found to be:
|
|
|
|
|
|
|
|
|
|
- It will slow down subscripting. For every subscript access, this new dunder
|
|
|
|
|
attribute gets investigated on the class, and if it is not present then the
|
2020-09-24 17:19:40 -04:00
|
|
|
|
default key translation function is executed.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
Different ideas were proposed to handle this, from wrapping the method
|
2020-09-24 17:19:40 -04:00
|
|
|
|
only at class instantiation time, to add a bit flag to signal the availability
|
|
|
|
|
of these methods. Regardess of the solution, the new dunder would be effective
|
2020-09-22 20:14:43 -04:00
|
|
|
|
only if added at class creation time, not if it's added later. This would
|
|
|
|
|
be unusual and would disallow (and behave unexpectedly) monkeypatching of the
|
|
|
|
|
methods for whatever reason it might be needed.
|
|
|
|
|
|
|
|
|
|
- It adds complexity to the mechanism.
|
|
|
|
|
|
|
|
|
|
- Will require a long and painful transition period during which time
|
|
|
|
|
libraries will have to somehow support both calling conventions, because most
|
|
|
|
|
likely, the extended methods will delegate to the traditional ones when the
|
|
|
|
|
right conditions are matched in the arguments, or some classes will support
|
|
|
|
|
the traditional dunder and others the extended dunder. While this will not
|
|
|
|
|
affect calling code, it will affect development.
|
|
|
|
|
|
|
|
|
|
- it would potentially lead to mixed situations where the extended version is
|
|
|
|
|
defined for the getter, but not for the setter.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
- In the ``__setitem_ex__`` signature, value would have to be made the first
|
2020-09-22 20:14:43 -04:00
|
|
|
|
element, because the index is of arbitrary length depending on the specified
|
|
|
|
|
indexes. This would look awkward because the visual notation does not match
|
2020-09-25 20:00:29 -04:00
|
|
|
|
the signature::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[1, 2] = 3 # calls obj.__setitem_ex__(3, 1, 2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
- the solution relies on the assumption that all keyword indices necessarily map
|
|
|
|
|
into positional indices, or that they must have a name. This assumption may be
|
2020-09-22 20:14:43 -04:00
|
|
|
|
false: xarray, which is the primary python package for numpy arrays with
|
2020-09-24 17:19:40 -04:00
|
|
|
|
labelled dimensions, supports indexing by additional dimensions (so called
|
2020-09-22 20:14:43 -04:00
|
|
|
|
"non-dimension coordinates") that don't correspond directly to the dimensions
|
|
|
|
|
of the underlying numpy array, and those have no position to match up to.
|
|
|
|
|
In other words, anonymous indexes are a plausible use case that this solution
|
2020-09-24 17:19:40 -04:00
|
|
|
|
would remove, although it could be argued that using ``*args`` would solve
|
2020-09-22 20:14:43 -04:00
|
|
|
|
that issue.
|
|
|
|
|
|
|
|
|
|
Adding an adapter function
|
|
|
|
|
--------------------------
|
|
|
|
|
|
|
|
|
|
Similar to the above, in the sense that a pre-function would be called to
|
|
|
|
|
convert the "new style" indexing into "old style indexing" that is then passed.
|
|
|
|
|
Has problems similar to the above.
|
|
|
|
|
|
|
|
|
|
create a new "kwslice" object
|
|
|
|
|
-----------------------------
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
This proposal has already been explored in "New arguments contents" P4 in PEP 472::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[a, b:c, x=1] # calls __getitem__(a, slice(b, c), key(x=1))
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This solution requires everyone who needs keyword arguments to parse the tuple
|
|
|
|
|
and/or key object by hand to extract them. This is painful and opens up to the
|
|
|
|
|
get/set/del function to always accept arbitrary keyword arguments, whether they
|
|
|
|
|
make sense or not. We want the developer to be able to specify which arguments
|
|
|
|
|
make sense and which ones do not.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Using a single bit to change the behavior
|
|
|
|
|
-----------------------------------------
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
A special class dunder flag::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
__keyfn__ = True
|
|
|
|
|
|
|
|
|
|
would change the signature of the ``__get|set|delitem__`` to a "function like" dispatch,
|
2020-09-25 20:00:29 -04:00
|
|
|
|
meaning that this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> d[1, 2, z=3]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
would result in a call to::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> d.__getitem__(1, 2, z=3) # instead of d.__getitem__((1, 2), z=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This option has been rejected because it feels odd that a signature of a method
|
|
|
|
|
depends on a specific value of another dunder. It would be confusing for both
|
2020-09-24 17:19:40 -04:00
|
|
|
|
static type checkers and for humans: a static type checker would have to hard-code
|
2020-09-22 20:14:43 -04:00
|
|
|
|
a special case for this, because there really is nothing else in Python
|
|
|
|
|
where the signature of a dunder depends on the value of another dunder.
|
|
|
|
|
A human that has to implement a ``__getitem__`` dunder would have to look if in the
|
|
|
|
|
class (or in any of its subclasses) for a ``__keyfn__`` before the dunder can be written.
|
|
|
|
|
Moreover, adding a base classes that have the ``__keyfn__`` flag set would break
|
|
|
|
|
the signature of the current methods. This would be even more problematic if the
|
|
|
|
|
flag is changed at runtime, or if the flag is generated by calling a function
|
|
|
|
|
that returns randomly True or something else.
|
|
|
|
|
|
|
|
|
|
Allowing for empty index notation obj[]
|
|
|
|
|
---------------------------------------
|
|
|
|
|
|
|
|
|
|
The current proposal prevents ``obj[]`` from being valid notation. However
|
|
|
|
|
a commenter stated
|
|
|
|
|
|
|
|
|
|
We have ``Tuple[int, int]`` as a tuple of two integers. And we have `Tuple[int]`
|
|
|
|
|
as a tuple of one integer. And occasionally we need to spell a tuple of *no*
|
|
|
|
|
values, since that's the type of ``()``. But we currently are forced to write
|
|
|
|
|
that as ``Tuple[()]``. If we allowed ``Tuple[]`` that odd edge case would be
|
|
|
|
|
removed.
|
|
|
|
|
|
|
|
|
|
So I probably would be okay with allowing ``obj[]`` syntactically, as long as the
|
|
|
|
|
dict type could be made to reject it.
|
|
|
|
|
|
|
|
|
|
This proposal already established that, in case no positional index is given, the
|
2020-09-24 17:19:40 -04:00
|
|
|
|
passed value must be the empty tuple. Allowing for the empty index notation would
|
2020-09-22 20:14:43 -04:00
|
|
|
|
make the dictionary type accept it automatically, to insert or refer to the value with
|
|
|
|
|
the empty tuple as key. Moreover, a typing notation such as ``Tuple[]`` can easily
|
|
|
|
|
be written as ``Tuple`` without the indexing notation.
|
|
|
|
|
|
|
|
|
|
Use None instead of the empty tuple when no positional index is given
|
|
|
|
|
---------------------------------------------------------------------
|
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
The case ``obj[k=3]`` will lead to a call ``__getitem__((), k=3)``.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
The alternative ``__getitem__(None, k=3)`` was considered but rejected:
|
|
|
|
|
NumPy uses `None` to indicate inserting a new axis/dimensions (there's
|
2020-09-25 20:00:29 -04:00
|
|
|
|
a ``np.newaxis`` alias as well)::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
arr = np.array(5)
|
|
|
|
|
arr.ndim == 0
|
|
|
|
|
arr[None].ndim == arr[None,].ndim == 1
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
So the final conclusion is that we favor the following series::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[k=3] # __getitem__((), k=3). Empty tuple
|
|
|
|
|
obj[1, k=3] # __getitem__(1, k=3). Integer
|
|
|
|
|
obj[1, 2, k=3] # __getitem__((1, 2), k=3). Tuple
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
more than this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[k=3] # __getitem__(None, k=3). None
|
|
|
|
|
obj[1, k=3] # __getitem__(1, k=3). Integer
|
|
|
|
|
obj[1, 2, k=3] # __getitem__((1, 2), k=3). Tuple
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
With the first more in line with a \*args semantics for calling a routine with
|
2020-09-25 20:00:29 -04:00
|
|
|
|
no positional arguments::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> def foo(*args, **kwargs):
|
|
|
|
|
... print(args, kwargs)
|
|
|
|
|
...
|
|
|
|
|
>>> foo(k=3)
|
|
|
|
|
() {'k': 3}
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
Although we accept the following asymmetry::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> foo(1, k=3)
|
|
|
|
|
(1,) {'k': 3}
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Common objections
|
|
|
|
|
=================
|
|
|
|
|
|
|
|
|
|
1. Just use a method call.
|
|
|
|
|
|
|
|
|
|
One of the use cases is typing, where the indexing is used exclusively, and
|
|
|
|
|
function calls are out of the question. Moreover, function calls do not handle
|
|
|
|
|
slice notation, which is commonly used in some cases for arrays.
|
|
|
|
|
|
|
|
|
|
One problem is type hint creation has been extended to built-ins in python 3.9,
|
|
|
|
|
so that you do not have to import Dict, List, et al anymore.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
Without kwdargs inside ``[]``, you would not be able to do this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
Vector = dict[i=float, j=float]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
but for obvious reasons, call syntax using builtins to create custom type hints
|
2020-09-25 20:00:29 -04:00
|
|
|
|
isn't an option::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
dict(i=float, j=float) # would create a dictionary, not a type
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
References
|
|
|
|
|
==========
|
|
|
|
|
|
|
|
|
|
.. [#rejection] "Rejection of PEP 472"
|
|
|
|
|
(https://mail.python.org/pipermail/python-dev/2019-March/156693.html)
|
2020-09-24 17:19:40 -04:00
|
|
|
|
.. [#pep-0484] "PEP 484 -- Type hints"
|
2020-09-22 20:14:43 -04:00
|
|
|
|
(https://www.python.org/dev/peps/pep-0484)
|
|
|
|
|
.. [#request-1] "Allow kwargs in __{get|set|del}item__"
|
|
|
|
|
(https://mail.python.org/archives/list/python-ideas@python.org/thread/EUGDRTRFIY36K4RM3QRR52CKCI7MIR2M/)
|
|
|
|
|
.. [#request-2] "PEP 472 -- Support for indexing with keyword arguments"
|
|
|
|
|
(https://mail.python.org/archives/list/python-ideas@python.org/thread/6OGAFDWCXT5QVV23OZWKBY4TXGZBVYZS/)
|
|
|
|
|
.. [#pep-0001] "PEP 1 -- PEP Purpose and Guidelines"
|
|
|
|
|
(https://www.python.org/dev/peps/pep-0001/#what-belongs-in-a-successful-pep)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
End:
|