PEP: 637
Title: Support for indexing with keyword arguments
Version: $Revision$
Last-Modified: $Date$
Author: Stefano Borini
Sponsor: Steven D'Aprano
Discussions-To: python-ideas@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Aug-2020
Python-Version: 3.10
Post-History: 23-Sep-2020
Resolution: https://mail.python.org/archives/list/python-dev@python.org/thread/6TAQ2BEVSJNV4JM2RJYSSYFJUT3INGZD/

.. note::
    This PEP has been rejected. In general, the cost of introducing new syntax
    was not outweighed by the perceived benefits. See the link in the Resolution
    header field for details.

Abstract
========

At present keyword arguments are allowed in function calls, but not in
item access. This PEP proposes that Python be extended to allow keyword
arguments in item access.

The following example shows keyword arguments for ordinary function calls::

    >>> val = f(1, 2, a=3, b=4)

The proposal would extend the syntax to allow a similar construct
in indexing operations::

    >>> val = x[1, 2, a=3, b=4]  # getitem
    >>> x[1, 2, a=3, b=4] = val  # setitem
    >>> del x[1, 2, a=3, b=4]    # delitem

and would also provide appropriate semantics. Single- and double-star unpacking of
arguments is also provided::

    >>> val = x[*(1, 2), **{'a': 3, 'b': 4}]  # Equivalent to above.

This PEP is a successor to PEP 472, which was rejected in 2019 due to lack of
interest. Since then there has been renewed interest in the feature.

Overview
========

Background
----------

PEP 472 was opened in 2014. The PEP detailed various use cases and was created by
extracting implementation strategies from a broad discussion on the
python-ideas mailing list, although no clear consensus was reached on which strategy
should be used. Many corner cases were examined more closely and felt
awkward, backward incompatible, or both.

The PEP was eventually rejected in 2019 [#rejection]_, mostly
due to lack of interest in the feature despite its five years of existence.

However, with the introduction of type hints in PEP 484 [#pep-0484]_ the
square bracket notation has been used consistently to enrich the typing
annotations, e.g. to specify a sequence of integers as ``Sequence[int]``. Additionally,
there has been growth in packages for data analysis such as pandas
and xarray, which use names to describe columns in a table (pandas) or axes in
an nd-array (xarray). These packages allow users to access specific data by
name, but cannot currently use index notation (``[]``) for this functionality.

As a result, renewed interest in a more flexible syntax that would allow for
named information has been expressed occasionally in many different threads on
python-ideas, recently by Caleb Donovick [#request-1]_ in 2019 and Andras
Tantos [#request-2]_ in 2020. These requests prompted considerable activity on the
python-ideas mailing list, where the various options were re-discussed and
a general consensus on an implementation strategy has now been reached.

Use cases
---------

The following practical use cases illustrate situations where a keyword
specification would improve notation and provide additional value:

1. To provide a more communicative meaning to the index, preventing e.g. accidental
   inversion of indexes::

       >>> grid_position[x=3, y=5, z=8]
       >>> rain_amount[time=0:12, location=location]
       >>> matrix[row=20, col=40]

2. To enrich the typing notation with keywords, especially during the use of generics::

       def function(value: MyType[T=int]):

3. In some domains, such as computational physics and chemistry, the use of a
   notation such as ``Basis[Z=5]`` is a Domain Specific Language notation to represent
   a level of accuracy::

       >>> low_accuracy_energy = computeEnergy(molecule, BasisSet[Z=3])

4. Pandas currently uses a notation such as::

       >>> df[df['x'] == 1]

   which could be replaced with ``df[x=1]``.

5. xarray has named dimensions. Currently these are handled with methods such as ``.isel``::

       >>> data.isel(row=10)  # Returns the row at index 10

   which could also be replaced with ``data[row=10]``. A more complex example::

       >>> # old syntax
       >>> da.isel(space=0, time=slice(None, 2))[...] = spam
       >>> # new syntax
       >>> da[space=0, time=:2] = spam

   Another example::

       >>> # old syntax
       >>> ds["empty"].loc[dict(lon=5, lat=6)] = 10
       >>> # new syntax
       >>> ds["empty"][lon=5, lat=6] = 10

       >>> # old syntax
       >>> ds["empty"].loc[dict(lon=slice(1, 5), lat=slice(3, None))] = 10
       >>> # new syntax
       >>> ds["empty"][lon=1:5, lat=3:] = 10

6. Functions/methods whose argument is another function (plus its
   arguments) need some way to determine which arguments are destined for
   the target function, and which are used to configure how they run the
   target. This is simple (if non-extensible) for positional parameters,
   but we need some way to distinguish these for keywords. [#trio-run]_

   An indexed notation would afford a Pythonic way to pass keyword
   arguments to these functions without cluttering the caller's code.

   ::

       >>> # Let's start this example with basic syntax without keywords.
       >>> # The positional values are arguments to `func` while
       >>> # `name=` is processed by `trio.run`.
       >>> trio.run(func, value1, value2, name="func")
       >>> # `trio.run` ends up calling `func(value1, value2)`.

       >>> # If we want/need to pass value2 by keyword (keyword-only argument,
       >>> # additional arguments that won't break backwards compatibility ...),
       >>> # currently we need to resort to functools.partial:
       >>> trio.run(functools.partial(func, param2=value2), value1, name="func")
       >>> trio.run(functools.partial(func, value1, param2=value2), name="func")

       >>> # One possible workaround is to convert `trio.run` to an object
       >>> # with a `__call__` method, and use an "option" helper,
       >>> trio.run.option(name="func")(func, value1, param2=value2)
       >>> # However, foo(bar)(baz) is uncommon and thus disruptive to the reader.
       >>> # Also, you need to remember the name of the `option` method.

       >>> # This PEP allows us to replace `option` with `__getitem__`.
       >>> # The call is now shorter, more mnemonic, and looks+works like typing
       >>> trio.run[name="func"](func, value1, param2=value2)

7. Availability of star arguments would benefit PEP 646 Variadic Generics [#pep-0646]_,
   especially in the forms ``a[*x]`` and ``a[*x, *y, p, q, *z]``. The PEP details
   exactly this notation in its "Unpacking: Star Operator" section.

It is important to note that how the notation is interpreted is up to the
implementation. This PEP only defines and dictates the behavior of Python
regarding passed keyword arguments, not how these arguments should be
interpreted and used by the implementing class.

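The ``option`` workaround mentioned in use case 6 can be sketched with today's syntax. The ``Runner`` class, its ``name`` setting and the sample ``func`` below are hypothetical stand-ins chosen for illustration; they are not part of trio's API.

```python
class Runner:
    """Hypothetical stand-in for a runner such as ``trio.run``."""

    def __init__(self, name=None):
        # Configuration for the runner itself, kept apart from func's arguments.
        self.name = name

    def option(self, **config):
        # Return a new, configured runner. This is the helper whose name the
        # caller has to remember in the ``foo(bar)(baz)`` spelling.
        return Runner(**config)

    def __call__(self, func, *args, **kwargs):
        # Every argument received here is forwarded to the target function.
        return self.name, func(*args, **kwargs)


def func(value1, *, param2):
    # Sample target with a keyword-only parameter.
    return value1 + param2


run = Runner()
result = run.option(name="func")(func, 1, param2=2)
print(result)
```

Under this PEP, the role of ``option`` could be taken by a ``__getitem__`` that accepts keyword parameters, so the same call would be spelled ``run[name="func"](func, 1, param2=2)``.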

Current status of indexing operation
------------------------------------

Before detailing the new syntax and semantics of the indexing notation, it is
relevant to analyse how the indexing notation works today, in which contexts,
and how it is different from a function call.

Subscripting ``obj[x]`` is, effectively, an alternate and specialised form of
function call syntax with a number of differences and restrictions compared to
``obj(x)``. The current Python syntax focuses exclusively on position to express
the index, and also contains syntactic sugar to refer to selections that span
more than a single point (slices). Some common examples::

    >>> a[3]       # returns the fourth element of 'a'
    >>> a[1:10:2]  # slice notation (extract a non-trivial data subset)
    >>> a[3, 2]    # multiple indexes (for multidimensional arrays)

This translates into a ``__(get|set|del)item__`` dunder call which is passed a single
parameter containing the index (for ``__getitem__`` and ``__delitem__``) or two parameters
containing index and value (for ``__setitem__``).

The behavior of the indexing call is fundamentally different from a function call
in various aspects:

The first difference is in meaning to the reader. A function call says
"arbitrary function call potentially with side-effects". An indexing operation
says "lookup", typically to point at a subset or specific sub-aspect of an
entity (as in the case of typing notation). This fundamental difference means
that, while we cannot prevent abuse, implementors should be aware that the
introduction of keyword arguments to alter the behavior of the lookup may
violate this intrinsic meaning.

The second difference of the indexing notation compared to a function call
is that indexing can be used for both getting and setting operations.
In Python, a function call cannot be on the left hand side of an assignment. In
other words, both of these are valid::

    >>> x = a[1, 2]
    >>> a[1, 2] = 5

but only the first one of these is valid::

    >>> x = f(1, 2)
    >>> f(1, 2) = 5  # invalid

This asymmetry is important: it shows that there is a natural
imbalance between the two forms. It is therefore not a given that the two
should behave transparently and symmetrically.

The third difference is that functions have names assigned to their
arguments, unless the passed parameters are captured with ``*args``, in which case
they end up as entries in the args tuple. In other words, functions already
have anonymous argument semantics, exactly like the indexing operation. However,
``__(get|set|del)item__`` does not always receive a tuple as the ``index`` argument
(which would make it uniform in behavior with ``*args``). In fact, given a trivial class::

    class X:
        def __getitem__(self, index):
            print(index)

The index operation basically forwards the content of the square brackets "as is"
in the ``index`` argument::

    >>> x = X()
    >>> x[0]
    0
    >>> x[0, 1]
    (0, 1)
    >>> x[(0, 1)]
    (0, 1)
    >>> x[()]
    ()
    >>> x[{1, 2, 3}]
    {1, 2, 3}
    >>> x["hello"]
    hello
    >>> x["hello", "hi"]
    ('hello', 'hi')

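The contrast with ``*args`` drawn in the third difference can be checked side by side. This is a minimal sketch in current Python; the names ``f`` and ``X`` are only illustrative.

```python
def f(*args):
    # A function always collects its positional arguments into a tuple.
    return args


class X:
    def __getitem__(self, index):
        # The subscript content arrives as one object, tuple or not.
        return index


x = X()

single_call = f(0)        # one argument is still wrapped in a tuple
single_index = x[0]       # one index is passed as is
pair_call = f(0, 1)
pair_index = x[0, 1]      # comma-separated indexes form a tuple
tuple_call = f((0, 1))    # an explicit tuple is one argument among args
tuple_index = x[(0, 1)]   # indistinguishable from x[0, 1]

print(single_call, single_index)
print(tuple_call, tuple_index)
```

The last pair shows why ``x[0, 1]`` and ``x[(0, 1)]`` cannot be told apart by ``__getitem__``, whereas ``f(0, 1)`` and ``f((0, 1))`` can.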

The fourth difference is that the indexing operation knows how to convert
colon notations to slices, thanks to support from the parser. This is valid::

    a[1:3]

this one isn't::

    f(1:3)

The fifth difference is that there's no zero-argument form. This is valid::

    f()

this one isn't::

    a[]

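The fourth and fifth differences can be observed in current Python without running the invalid forms, by asking the compiler whether a snippet parses. The ``Echo`` class and the ``is_valid_expression`` helper are hypothetical names used only for this sketch.

```python
class Echo:
    def __getitem__(self, index):
        # Return whatever the subscript passed in.
        return index


def is_valid_expression(source):
    """Return True if ``source`` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


a = Echo()
plain_slice = a[1:3]   # the parser builds a slice object
mixed = a[1:10:2, 3]   # slices may appear inside a tuple index

print(plain_slice, mixed)
print(is_valid_expression("f(1:3)"), is_valid_expression("a[]"))
```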

Specification
=============

Before describing the specification, it is important to stress the difference in
nomenclature between *positional index*, *final index* and *keyword argument*, as it is needed to
understand the fundamental asymmetries at play. The ``__(get|set|del)item__`` call
is fundamentally an indexing operation, and the way the element is retrieved,
set, or deleted is through an index, the *final index*.

The current status quo is to directly build the *final index* from what is passed between
square brackets, the *positional index*. In other words, what is passed in the
square brackets is trivially used to generate what the code in ``__getitem__`` then uses
for the indexing operation. For a dict, for example, ``d[1]`` has a
positional index of ``1`` and also a final index of ``1`` (because it is the key that is
then used in the dictionary) and ``d[1, 2]`` has a positional index of ``(1, 2)`` and
a final index also of ``(1, 2)`` (because yet again it is the key that is used in the dictionary).
However, the positional index ``d[1,2:3]`` is not accepted by the dictionary, because
there's no way to transform the positional index into a final index, as the slice object is
unhashable. The positional index is what is currently known as the ``index`` parameter in
``__getitem__``. Nevertheless, nothing prevents one from constructing a dictionary-like class that
creates the final index by e.g. converting the positional index to a string.

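The closing remark about a dictionary-like class can be made concrete. The sketch below is one possible reading, assuming the final index is simply ``repr`` of the positional index; the ``StringKeyed`` class and its ``_final_index`` helper are hypothetical.

```python
class StringKeyed:
    """Hypothetical mapping whose final index is a string."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _final_index(index):
        # Derive the final index (a string) from the positional index.
        return repr(index)

    def __setitem__(self, index, value):
        self._data[self._final_index(index)] = value

    def __getitem__(self, index):
        return self._data[self._final_index(index)]


d = StringKeyed()
d[1, 2:3] = "stored"   # positional index is (1, slice(2, 3, None))

print(d[1, 2:3])
print(list(d._data))
```

Here the positional index contains a slice, yet the class can still build a usable final index from it, which is the flexibility the paragraph above describes.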

This PEP extends the current status quo, and grants more flexibility to
create the final index via an enhanced syntax that combines the positional index
and keyword arguments, if passed.

The above brings an important point across. Keyword arguments, in the context of the index
operation, may be used to make indexing decisions to obtain the final index, and therefore
will have to accept values that are unconventional for functions. See for
example use case 1, where a slice is accepted.

The successful implementation of this PEP will result in the following behavior:

1. An empty subscript is still illegal, regardless of context (see Rejected Ideas)::

       obj[]  # SyntaxError

2. A single index value remains a single index value when passed::

       obj[index]
       # calls type(obj).__getitem__(obj, index)

       obj[index] = value
       # calls type(obj).__setitem__(obj, index, value)

       del obj[index]
       # calls type(obj).__delitem__(obj, index)

   This remains the case even if the index is followed by keywords; see point 5 below.

3. Comma-separated arguments are still parsed as a tuple and passed as
   a single positional argument::

       obj[spam, eggs]
       # calls type(obj).__getitem__(obj, (spam, eggs))

       obj[spam, eggs] = value
       # calls type(obj).__setitem__(obj, (spam, eggs), value)

       del obj[spam, eggs]
       # calls type(obj).__delitem__(obj, (spam, eggs))

   The points above mean that classes which do not want to support keyword
   arguments in subscripts need do nothing at all, and the feature is therefore
   completely backwards compatible.

4. Keyword arguments, if any, must follow positional arguments::

       obj[1, 2, spam=None, 3]  # SyntaxError

   This is like function calls, where intermixing positional and keyword
   arguments gives a SyntaxError.

5. Keyword subscripts, if any, will be handled like they are in
   function calls. Examples::

       # Single index with keywords:

       obj[index, spam=1, eggs=2]
       # calls type(obj).__getitem__(obj, index, spam=1, eggs=2)

       obj[index, spam=1, eggs=2] = value
       # calls type(obj).__setitem__(obj, index, value, spam=1, eggs=2)

       del obj[index, spam=1, eggs=2]
       # calls type(obj).__delitem__(obj, index, spam=1, eggs=2)

       # Comma-separated indices with keywords:

       obj[foo, bar, spam=1, eggs=2]
       # calls type(obj).__getitem__(obj, (foo, bar), spam=1, eggs=2)

       obj[foo, bar, spam=1, eggs=2] = value
       # calls type(obj).__setitem__(obj, (foo, bar), value, spam=1, eggs=2)

       del obj[foo, bar, spam=1, eggs=2]
       # calls type(obj).__delitem__(obj, (foo, bar), spam=1, eggs=2)

   Note that:

   - a single positional index will not turn into a tuple
     just because one adds a keyword value.

   - for ``__setitem__``, the same order is retained for index and value.
     The keyword arguments go at the end, as is normal for a function
     definition.

6. The same rules apply with respect to keyword subscripts as for
   keywords in function calls:

   - the interpreter matches up each keyword subscript to a named parameter
     in the appropriate method;

   - if a named parameter is used twice, that is an error;

   - if there are any named parameters left over (without a value) when the
     keywords are all used, they are assigned their default value (if any);

   - if any such parameter doesn't have a default, that is an error;

   - if there are any keyword subscripts remaining after all the named
     parameters are filled, and the method has a ``**kwargs`` parameter,
     they are bound to the ``**kwargs`` parameter as a dict;

   - but if no ``**kwargs`` parameter is defined, it is an error.

7. Sequence unpacking is allowed inside subscripts::

       obj[*items]

   This allows notations such as ``[:, *args, :]``, which could be treated
   as ``[(slice(None), *args, slice(None))]``. Multiple star unpackings are
   allowed::

       obj[1, *(2, 3), *(4, 5), 6, foo=5]
       # Equivalent to obj[(1, 2, 3, 4, 5, 6), foo=5]

   The following notation equivalences must be honored::

       obj[*()]
       # Equivalent to obj[()]

       obj[*(), foo=3]
       # Equivalent to obj[(), foo=3]

       obj[*(x,)]
       # Equivalent to obj[(x,)]

       obj[*(x,),]
       # Equivalent to obj[(x,)]

   Note in particular case 3: sequence unpacking of a single element will
   not behave as if only one single argument was passed. A related case is
   the following example::

       obj[1, *(), foo=5]
       # Equivalent to obj[(1,), foo=5]
       # calls type(obj).__getitem__(obj, (1,), foo=5)

   However, as we saw earlier, for backward compatibility a single index will be passed as is::

       obj[1, foo=5]
       # calls type(obj).__getitem__(obj, 1, foo=5)

   In other words, a single positional index will be passed "as is" only if no sequence
   unpacking is present. If a sequence unpacking is present, then the index will become a tuple,
   regardless of the resulting number of elements in the index after the unpacking has taken place.

8. Dict unpacking is permitted::

       items = {'spam': 1, 'eggs': 2}
       obj[index, **items]
       # equivalent to obj[index, spam=1, eggs=2]

   The following notation equivalences should be honored::

       obj[**{}]
       # Equivalent to obj[()]

       obj[3, **{}]
       # Equivalent to obj[3]

9. Keyword-only subscripts are permitted. The positional index will be the empty tuple::

       obj[spam=1, eggs=2]
       # calls type(obj).__getitem__(obj, (), spam=1, eggs=2)

       obj[spam=1, eggs=2] = 5
       # calls type(obj).__setitem__(obj, (), 5, spam=1, eggs=2)

       del obj[spam=1, eggs=2]
       # calls type(obj).__delitem__(obj, (), spam=1, eggs=2)

   The choice of the empty tuple as a sentinel has been debated. Details are provided in
   the Rejected Ideas section.

10. Keyword arguments must allow slice syntax::

        obj[3:4, spam=1:4, eggs=2]
        # calls type(obj).__getitem__(obj, slice(3, 4, None), spam=slice(1, 4, None), eggs=2)

    This may open up the possibility to accept the same syntax for general function
    calls, but this is not part of this recommendation.

11. Keyword arguments allow for default values::

        # Given type(obj).__getitem__(obj, index, spam=True, eggs=2)
        obj[3]               # Valid. index = 3, spam = True, eggs = 2
        obj[3, spam=False]   # Valid. index = 3, spam = False, eggs = 2
        obj[spam=False]      # Valid. index = (), spam = False, eggs = 2
        obj[]                # Invalid.

12. The same semantics given above must be extended to ``__class_getitem__``:
    Since PEP 560, type hints are dispatched so that for ``x[y]``, if no
    ``__getitem__`` method is found, and ``x`` is a type (class) object,
    and ``x`` has a class method ``__class_getitem__``, that method is
    called. The same changes should be applied to this method as well,
    so that a notation such as ``list[T=int]`` can be accepted.

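Since the proposed syntax does not parse, the dispatch rules in the list above can only be emulated by calling the dunder explicitly. The sketch below is one such emulation under stated assumptions: ``subscript_get`` and ``Recorder`` are hypothetical names, the caller supplies the positional entries already unpacked, and the ``unpacked`` flag stands in for "a star unpacking was written in the subscript" (which also reserves that keyword name in this helper).

```python
def subscript_get(obj, positional, /, *, unpacked=False, **keywords):
    """Emulate the call that a subscript lookup would make under the rules above."""
    if len(positional) == 1 and not unpacked:
        index = positional[0]       # points 2 and 5: a single value is passed as is
    else:
        index = tuple(positional)   # points 3, 7 and 9: tuple, possibly empty
    return type(obj).__getitem__(obj, index, **keywords)


class Recorder:
    def __getitem__(self, index, *, spam=True, eggs=2):
        # Keyword-only parameters with defaults, as in point 11.
        return index, spam, eggs


obj = Recorder()

single = subscript_get(obj, (3,))                          # obj[3]
with_kw = subscript_get(obj, (3,), spam=False)             # obj[3, spam=False]
kw_only = subscript_get(obj, (), spam=False)               # obj[spam=False]
several = subscript_get(obj, (1, 2), eggs=5)               # obj[1, 2, eggs=5]
starred = subscript_get(obj, (1,), unpacked=True, eggs=5)  # obj[1, *(), eggs=5]

try:
    subscript_get(obj, (3,), foo=5)                        # obj[3, foo=5]
    unknown_keyword_rejected = False
except TypeError:
    # Point 6: a leftover keyword with no **kwargs parameter is an error.
    unknown_keyword_rejected = True

print(single, with_kw, kw_only, several, starred, unknown_keyword_rejected)
```

The helper only mirrors the argument binding; it says nothing about how the parser would recognise keywords or slices inside the brackets.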

Indexing behavior in standard classes (dict, list, etc.)
--------------------------------------------------------

None of what is proposed in this PEP will change the behavior of the current
core classes that use indexing. Adding keywords to the index operation for
custom classes is not the same as modifying e.g. the standard dict type to
handle keyword arguments. In fact, dict (as well as list and other stdlib
classes with indexing semantics) will remain the same and will continue not to
accept keyword arguments. In other words, if ``d`` is a ``dict``, the
expression ``d[1, a=2]`` will raise ``TypeError``, as its implementation will
not support the use of keyword arguments. The same holds for all other
built-in classes with indexing semantics, such as ``list``.

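The ``TypeError`` described above can already be observed by making, by hand, the call that ``d[1, a=2]`` would translate to. The ``rejects_keywords`` helper is a hypothetical name introduced for this sketch.

```python
def rejects_keywords(container, index):
    """Return True if the container's ``__getitem__`` refuses a keyword argument."""
    try:
        # The call that container[index, a=2] would make under this PEP.
        type(container).__getitem__(container, index, a=2)
    except TypeError:
        return True
    return False


d = {1: "one"}
items = ["zero", "one"]

print(rejects_keywords(d, 1), rejects_keywords(items, 1))
print(d[1], items[1])   # ordinary lookups are unaffected
```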

Corner cases and Gotchas
------------------------

With the introduction of the new notation, a few corner cases need to be analysed.

1. Technically, if a class defines its getter like this::

       def __getitem__(self, index):

   then the caller could call that using keyword syntax, like these two cases::

       obj[3, index=4]
       obj[index=1]

   The resulting behavior would be an error automatically, since it would be like
   attempting to call the method with two values for the ``index`` argument, and
   a ``TypeError`` will be raised. In the first case, the ``index`` would be ``3``;
   in the second case, it would be the empty tuple ``()``.

   Note that this behavior applies to all currently existing classes that rely on
   indexing, meaning that there is no way for the new behavior to introduce
   backward compatibility issues in this respect.

   Classes that wish to stress this behavior explicitly can define their
   parameters as positional-only::

       def __getitem__(self, index, /):

2. A similar case occurs with setter notation::

       # Given type(obj).__setitem__(obj, index, value):
       obj[1, value=3] = 5

   This poses no issue because the value is passed automatically, and the Python interpreter will raise
   a ``TypeError`` reporting multiple values for the ``value`` argument.

3. If the subscript dunders are declared to use positional-or-keyword
|
|
|
|
|
parameters, there may be some surprising cases when arguments are passed
|
2020-09-25 20:00:29 -04:00
|
|
|
|
to the method. Given the signature::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem__(self, index, direction='north')
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
if the caller uses this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[0, 'south']
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
they will probably be surprised by the method call::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
# expected type(obj).__getitem__(obj, 0, direction='south')
|
2020-09-25 20:00:29 -04:00
|
|
|
|
# but actually get:
|
2020-12-23 13:05:12 -05:00
|
|
|
|
type(obj).__getitem__(obj, (0, 'south'), direction='north')
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Solution: best practice suggests that keyword subscripts should be
|
2020-09-25 20:00:29 -04:00
|
|
|
|
flagged as keyword-only when possible::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem__(self, index, *, direction='north'): ...
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
The interpreter need not enforce this rule, as there could be scenarios
|
|
|
|
|
where this is the desired behaviour. But linters may choose to warn
|
|
|
|
|
about subscript methods which don't use the keyword-only flag.
|
|
|
|
|
|
|
|
|
|
4. As we saw, a single value followed by a keyword argument will not be changed into a tuple:
``d[1, a=3]`` is treated as ``__getitem__(d, 1, a=3)``, NOT ``__getitem__(d, (1,), a=3)``. It would be
extremely confusing if adding keyword arguments were to change the type of the passed index.
Where an actual tuple needs to be passed, it must be written explicitly::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[(1,), a=3]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1,), a=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
In this case, the call passes a single element, which is passed as is (per the rule above);
that element simply happens to be a tuple.
|
|
|
|
|
|
|
|
|
|
Note that this behavior simply reveals that the ``obj[1,]`` notation is shorthand for
``obj[(1,)]`` (and that ``obj[1]`` is shorthand for ``obj[(1)]``, with the expected behavior).
When keywords are present, the outermost pair of parentheses can no longer be
omitted::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[1]
|
|
|
|
|
# calls type(obj).__getitem__(obj, 1)
|
|
|
|
|
|
|
|
|
|
obj[1, a=3]
|
|
|
|
|
# calls type(obj).__getitem__(obj, 1, a=3)
|
|
|
|
|
|
|
|
|
|
obj[1,]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1,))
|
|
|
|
|
|
|
|
|
|
obj[(1,), a=3]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1,), a=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
This is particularly relevant in the case where two entries are passed::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[1, 2]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1, 2))
|
|
|
|
|
|
|
|
|
|
obj[(1, 2)]
|
|
|
|
|
# same as above
|
|
|
|
|
|
|
|
|
|
obj[1, 2, a=3]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1, 2), a=3)
|
|
|
|
|
|
|
|
|
|
obj[(1, 2), a=3]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1, 2), a=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
And particularly when the tuple is extracted as a variable::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
t = (1, 2)
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[t]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1, 2))
|
|
|
|
|
|
|
|
|
|
obj[t, a=3]
|
|
|
|
|
# calls type(obj).__getitem__(obj, (1, 2), a=3)
|
2020-09-24 17:19:40 -04:00
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
Why? Because in the case ``obj[1, 2, a=3]`` we are passing two elements, which
are then packed as a tuple and passed as the index. In the case ``obj[(1, 2), a=3]``
we are passing a single element (passed as is) which happens to be a tuple.
The final result is the same.
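
The corner cases above can be explored in present-day Python by modelling the
proposed translation with an ordinary function. The following is only a sketch
under stated assumptions, not the reference implementation: ``subscript`` is a
hypothetical helper standing in for the ``obj[...]`` syntax, and ``Grid`` is a
made-up class used for illustration.

```python
def subscript(obj, *positional, **keywords):
    """Model ``obj[*positional, **keywords]`` under the proposed rules."""
    if len(positional) == 1:
        index = positional[0]   # a single value is passed as is
    else:
        index = positional      # no values -> (), several values -> tuple
    return type(obj).__getitem__(obj, index, **keywords)


class Grid:
    # Hypothetical class; ``direction`` is keyword-only, as recommended above.
    def __getitem__(self, index, *, direction="north"):
        return index, direction


g = Grid()

# Corner case 3: both positional values end up in the index.
print(subscript(g, 0, "south"))               # ((0, 'south'), 'north')

# Corner case 4: a keyword does not turn a single index into a tuple.
print(subscript(g, 1, direction="south"))     # (1, 'south')
print(subscript(g, (1,), direction="south"))  # ((1,), 'south')

# Two entries and an explicit tuple reach the dunder identically.
print(subscript(g, 1, 2, direction="south") ==
      subscript(g, (1, 2), direction="south"))  # True
```

Because ``direction`` is keyword-only in this sketch, a caller cannot set it by
position, which is exactly the protection the suggested best practice provides.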
|
|
|
|
|
|
|
|
|
|
C Interface
|
|
|
|
|
===========
|
|
|
|
|
|
2020-09-26 16:32:35 -04:00
|
|
|
|
Resolution of the indexing operation is performed through a call to the following functions:
|
|
|
|
|
|
|
|
|
|
- ``PyObject_GetItem(PyObject *o, PyObject *key)`` for the get operation
|
|
|
|
|
- ``PyObject_SetItem(PyObject *o, PyObject *key, PyObject *value)`` for the set operation
|
|
|
|
|
- ``PyObject_DelItem(PyObject *o, PyObject *key)`` for the del operation
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2021-02-03 22:22:25 -05:00
|
|
|
|
These functions are used extensively within the Python executable, and are
also part of the public C API, as exported by ``Include/abstract.h``. It is clear that
the signatures of these functions cannot be changed, and different C level functions
need to be implemented to support the extended call. We propose:
|
2020-09-26 16:32:35 -04:00
|
|
|
|
|
|
|
|
|
- ``PyObject_GetItemWithKeywords(PyObject *o, PyObject *key, PyObject *kwargs)``
|
|
|
|
|
- ``PyObject_SetItemWithKeywords(PyObject *o, PyObject *key, PyObject *value, PyObject *kwargs)``
|
|
|
|
|
- ``PyObject_DelItemWithKeywords(PyObject *o, PyObject *key, PyObject *kwargs)``
|
|
|
|
|
|
2020-12-28 12:38:34 -05:00
|
|
|
|
New opcodes will be needed for the enhanced call. Currently, the
|
2020-09-26 16:32:35 -04:00
|
|
|
|
implementation uses ``BINARY_SUBSCR``, ``STORE_SUBSCR`` and ``DELETE_SUBSCR``
|
|
|
|
|
to invoke the old functions. We propose ``BINARY_SUBSCR_KW``,
|
|
|
|
|
``STORE_SUBSCR_KW`` and ``DELETE_SUBSCR_KW`` for the new operations. The
|
|
|
|
|
compiler will have to generate these new opcodes. The
|
|
|
|
|
old C implementations will call the extended methods passing ``NULL``
|
|
|
|
|
as kwargs.
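
The delegation from the existing entry points to the extended ones can be
modelled in pure Python. This is only an illustrative sketch: the function names
below are hypothetical Python stand-ins for the C functions named above, and
``None`` plays the role of ``NULL``.

```python
def get_item_with_keywords(obj, key, kwargs=None):
    """Stand-in for the proposed PyObject_GetItemWithKeywords."""
    if kwargs:
        return type(obj).__getitem__(obj, key, **kwargs)
    # No keywords: behave exactly like the traditional lookup.
    return type(obj).__getitem__(obj, key)


def get_item(obj, key):
    """Stand-in for PyObject_GetItem: its signature is unchanged."""
    return get_item_with_keywords(obj, key, None)


class Labelled:
    # Made-up class that accepts a keyword index.
    def __getitem__(self, index, *, unit="m"):
        return f"{index}{unit}"


print(get_item({"a": 1}, "a"))                                # 1
print(get_item(Labelled(), 3))                                # 3m
print(get_item_with_keywords(Labelled(), 3, {"unit": "km"}))  # 3km
```

Existing callers of the old function keep working unchanged, which is the
property the proposal relies on.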
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-28 12:38:34 -05:00
|
|
|
|
Finally, the following new slots must be added to the ``PyMappingMethods`` struct:
|
|
|
|
|
|
|
|
|
|
- ``mp_subscript_kw``
|
|
|
|
|
- ``mp_ass_subscript_kw``
|
|
|
|
|
|
|
|
|
|
These slots will have the appropriate signature to handle the dictionary object
|
|
|
|
|
containing the keywords.
|
|
|
|
|
|
2021-02-06 13:18:02 -05:00
|
|
|
|
"How to teach" recommendations
|
|
|
|
|
==============================
|
|
|
|
|
|
|
|
|
|
One request that occurred during feedback sessions was to detail a possible narrative
|
|
|
|
|
for teaching the feature, e.g. to students, data scientists, and similar audiences.
|
|
|
|
|
This section addresses that need.
|
|
|
|
|
|
|
|
|
|
We will only describe indexing from the perspective of use, not of
implementation, because use is the aspect that the above-mentioned audiences will
most likely encounter. Only a subset of users will have to implement their own
dunder functions, which can be considered advanced usage. A proper explanation could be:
|
|
|
|
|
|
|
|
|
|
The indexing operation is generally used to refer to a subset of a larger
dataset by means of an index. In the commonly seen cases, the index is made up of
one or more numbers, strings, slices, etc.
|
|
|
|
|
|
|
|
|
|
Some types may allow indexing to occur not only with the index, but also with
|
|
|
|
|
named values. These named values are given between square brackets using the
|
|
|
|
|
same syntax used for function call keyword arguments. The meaning of the names
|
|
|
|
|
and their use is found in the documentation of the type, as it varies from one
|
|
|
|
|
type to another.
|
|
|
|
|
|
|
|
|
|
The teacher will now show some practical real world examples, explaining the
|
|
|
|
|
semantics of the feature in the shown library. At the time of writing these
|
|
|
|
|
examples do not exist, obviously, but the libraries most likely to implement
|
|
|
|
|
the feature are pandas and numpy, possibly as a method to refer to columns by
|
|
|
|
|
name.
|
2020-12-28 12:38:34 -05:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
Reference Implementation
|
|
|
|
|
========================
|
|
|
|
|
|
|
|
|
|
A reference implementation is currently being developed here [#reference-impl]_.
|
|
|
|
|
|
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
Workarounds
|
|
|
|
|
===========
|
|
|
|
|
|
|
|
|
|
Every PEP that changes the Python language should "clearly explain why
|
|
|
|
|
the existing language specification is inadequate to address the
|
|
|
|
|
problem that the PEP solves." [#pep-0001]_
|
|
|
|
|
|
|
|
|
|
Some rough equivalents to the proposed extension, which we call work-arounds,
|
|
|
|
|
are already possible. The work-arounds provide an alternative to enabling the
|
2020-09-24 17:19:40 -04:00
|
|
|
|
new syntax, while leaving the semantics to be defined elsewhere.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
These work-arounds follow. In them the helpers ``H`` and ``P`` are not intended to
|
|
|
|
|
be universal. For example, a module or package might require the use of its own
|
|
|
|
|
helpers.
|
|
|
|
|
|
|
|
|
|
1. User defined classes can be given ``getitem`` and ``delitem`` methods,
|
2020-09-25 20:00:29 -04:00
|
|
|
|
that respectively get and delete values stored in a container::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> val = x.getitem(1, 2, a=3, b=4)
|
|
|
|
|
>>> x.delitem(1, 2, a=3, b=4)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
The same can't be done for ``setitem``. It's not valid syntax::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x.setitem(1, 2, a=3, b=4) = val
|
|
|
|
|
SyntaxError: can't assign to function call
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
2. A helper class, here called ``H``, can be used to swap the container
|
2020-09-25 20:00:29 -04:00
|
|
|
|
and parameter roles. In other words, we use::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
H(1, 2, a=3, b=4)[x]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
as a substitute for::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
x[1, 2, a=3, b=4]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This method will work for ``getitem``, ``delitem`` and also for
|
2020-09-25 20:00:29 -04:00
|
|
|
|
``setitem``. This is because::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> H(1, 2, a=3, b=4)[x] = val
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
is valid syntax, which can be given the appropriate semantics.
|
|
|
|
|
|
|
|
|
|
3. A helper function, here called ``P``, can be used to store the
|
2020-09-25 20:00:29 -04:00
|
|
|
|
arguments in a single object. For example::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> x[P(1, 2, a=3, b=4)] = val
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
is valid syntax, and can be given the appropriate semantics.
|
|
|
|
|
|
|
|
|
|
4. The ``lo:hi:step`` syntax for slices is sometimes very useful. This
|
2020-09-25 20:00:29 -04:00
|
|
|
|
syntax is not directly available in the work-arounds. However::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
s[lo:hi:step]
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
provides a work-around that is available everywhere, where::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
class S:
|
|
|
|
|
def __getitem__(self, key): return key
|
|
|
|
|
|
|
|
|
|
s = S()
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
defines the helper object ``s``.
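
The work-arounds above can be sketched in present-day Python as follows. This is
one possible reading, not a prescribed design: the bodies of ``H`` and ``P``,
the ``Store`` container and the way it builds its keys are all assumptions made
for illustration.

```python
class P:
    """Hypothetical helper: bundles positional and keyword arguments."""
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class H(P):
    """Hypothetical helper: swaps the container and parameter roles."""
    def __getitem__(self, container):
        return container.getitem(*self.args, **self.kwargs)

    def __setitem__(self, container, value):
        container.setitem(value, *self.args, **self.kwargs)

    def __delitem__(self, container):
        container.delitem(*self.args, **self.kwargs)


class Store:
    """Made-up container exposing the methods of work-around 1."""
    def __init__(self):
        self._data = {}

    @staticmethod
    def _key(args, kwargs):
        return args, tuple(sorted(kwargs.items()))

    def getitem(self, *args, **kwargs):
        return self._data[self._key(args, kwargs)]

    def setitem(self, value, *args, **kwargs):
        self._data[self._key(args, kwargs)] = value

    def delitem(self, *args, **kwargs):
        del self._data[self._key(args, kwargs)]

    # Work-around 3: unpack a ``P`` object passed as the only index.
    def __getitem__(self, p):
        return self.getitem(*p.args, **p.kwargs)

    def __setitem__(self, p, value):
        self.setitem(value, *p.args, **p.kwargs)


class S:
    def __getitem__(self, key):
        return key


s = S()
x = Store()

H(1, 2, a=3, b=4)[x] = "val"       # work-around 2, setitem
print(x.getitem(1, 2, a=3, b=4))   # work-around 1 -> val
print(x[P(1, 2, a=3, b=4)])        # work-around 3 -> val
print(s[1:10:2])                   # work-around 4 -> slice(1, 10, 2)
del H(1, 2, a=3, b=4)[x]           # work-around 2, delitem
```

Note that each helper has to be paired with a container that understands it,
which is why the text stresses that ``H`` and ``P`` are not universal.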
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Rejected Ideas
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
Previous PEP 472 solutions
|
|
|
|
|
--------------------------
|
|
|
|
|
|
|
|
|
|
PEP 472 presents a good number of ideas, all of which are now to be considered
rejected. A personal email from D'Aprano to the author specifically said:
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
I have now carefully read through PEP 472 in full, and I am afraid I
|
|
|
|
|
cannot support any of the strategies currently in the PEP.
|
|
|
|
|
|
|
|
|
|
We agree that those options are inferior to the one currently presented, for one
reason or another.
|
|
|
|
|
|
|
|
|
|
To keep this document compact, we will not present here the objections to
all options presented in PEP 472. Suffice it to say that they were discussed,
and each proposed alternative had one or more dealbreakers.
|
|
|
|
|
|
|
|
|
|
Adding new dunders
|
|
|
|
|
------------------
|
|
|
|
|
|
|
|
|
|
It was proposed to introduce new dunders ``__(get|set|del)item_ex__``
|
|
|
|
|
that are invoked over the ``__(get|set|del)item__`` triad, if they are present.
|
|
|
|
|
|
|
|
|
|
The rationale for this choice is to make adding keyword argument support to
square brackets more intuitive and more in line with function
behavior. Given::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
def __getitem_ex__(self, x, y): ...
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
These all just work and produce the same result effortlessly::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
obj[1, 2]
|
|
|
|
|
obj[1, y=2]
|
|
|
|
|
obj[y=2, x=1]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
In other words, this solution would unify the behavior of ``__getitem__`` to the traditional
|
|
|
|
|
function signature, but since we can't change ``__getitem__`` and break backward compatibility,
|
|
|
|
|
we would have an extended version that is used preferentially.
|
|
|
|
|
|
|
|
|
|
The problems with this approach were found to be:
|
|
|
|
|
|
|
|
|
|
- It will slow down subscripting. For every subscript access, this new dunder
|
|
|
|
|
attribute gets investigated on the class, and if it is not present then the
|
2020-09-24 17:19:40 -04:00
|
|
|
|
default key translation function is executed.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
Different ideas were proposed to handle this, from wrapping the method
only at class instantiation time, to adding a bit flag to signal the availability
of these methods. Regardless of the solution, the new dunder would be effective
only if added at class creation time, not if it's added later. This would
be unusual, and it would prevent monkeypatching of the methods (or make it
behave unexpectedly), for whatever reason it might be needed.
|
|
|
|
|
|
|
|
|
|
- It adds complexity to the mechanism.
|
|
|
|
|
|
|
|
|
|
- It will require a long and painful transition period during which time
|
|
|
|
|
libraries will have to somehow support both calling conventions, because most
|
|
|
|
|
likely, the extended methods will delegate to the traditional ones when the
|
|
|
|
|
right conditions are matched in the arguments, or some classes will support
|
|
|
|
|
the traditional dunder and others the extended dunder. While this will not
|
|
|
|
|
affect calling code, it will affect development.
|
|
|
|
|
|
|
|
|
|
- It would potentially lead to mixed situations where the extended version is
|
|
|
|
|
defined for the getter, but not for the setter.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
- In the ``__setitem_ex__`` signature, value would have to be made the first
|
2020-09-22 20:14:43 -04:00
|
|
|
|
element, because the index is of arbitrary length depending on the specified
|
|
|
|
|
indexes. This would look awkward because the visual notation does not match
|
2020-09-25 20:00:29 -04:00
|
|
|
|
the signature::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[1, 2] = 3
|
|
|
|
|
# calls type(obj).__setitem_ex__(obj, 3, 1, 2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-24 17:19:40 -04:00
|
|
|
|
- The solution relies on the assumption that all keyword indices necessarily map
|
|
|
|
|
into positional indices, or that they must have a name. This assumption may be
|
2021-02-03 22:22:25 -05:00
|
|
|
|
false: xarray, which is the primary Python package for numpy arrays with
|
2020-09-24 17:19:40 -04:00
|
|
|
|
labelled dimensions, supports indexing by additional dimensions (so called
|
2020-09-22 20:14:43 -04:00
|
|
|
|
"non-dimension coordinates") that don't correspond directly to the dimensions
|
|
|
|
|
of the underlying numpy array, and those have no position to match up to.
|
|
|
|
|
In other words, anonymous indexes are a plausible use case that this solution
|
2020-09-24 17:19:40 -04:00
|
|
|
|
would remove, although it could be argued that using ``*args`` would solve
|
2020-09-22 20:14:43 -04:00
|
|
|
|
that issue.
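
The rejected dispatch described in this section can be modelled with a plain
function. This is a sketch under stated assumptions: ``subscript_ex`` is a
hypothetical helper standing in for the interpreter, and ``Matrix`` and
``Legacy`` are made-up classes.

```python
def subscript_ex(obj, *args, **kwargs):
    """Model the rejected rule: prefer ``__getitem_ex__`` when it exists."""
    cls = type(obj)
    # This lookup happens on every access, which is the slowdown noted above.
    if hasattr(cls, "__getitem_ex__"):
        # Function-like dispatch: indices map onto named parameters.
        return cls.__getitem_ex__(obj, *args, **kwargs)
    if kwargs:
        raise TypeError("keyword indices require __getitem_ex__")
    # Fallback: the traditional key translation.
    key = args[0] if len(args) == 1 else args
    return cls.__getitem__(obj, key)


class Matrix:
    def __getitem_ex__(self, x, y):
        return (x, y)


class Legacy:
    def __getitem__(self, key):
        return key


m = Matrix()
print(subscript_ex(m, 1, 2))         # (1, 2)
print(subscript_ex(m, 1, y=2))       # (1, 2)
print(subscript_ex(m, y=2, x=1))     # (1, 2)
print(subscript_ex(Legacy(), 1, 2))  # (1, 2), via the traditional dunder
```

The two classes illustrate the transition problem: they answer the same
subscript through different calling conventions.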
|
|
|
|
|
|
|
|
|
|
Adding an adapter function
|
|
|
|
|
--------------------------
|
|
|
|
|
|
|
|
|
|
This is similar to the above, in the sense that a pre-function would be called to
convert the "new style" indexing into "old style" indexing, which is then passed
to the dunder. It has problems similar to the above.
|
|
|
|
|
|
|
|
|
|
Create a new "kwslice" object
|
|
|
|
|
-----------------------------
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
This proposal has already been explored in "New arguments contents" P4 in PEP 472::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[a, b:c, x=1]
|
|
|
|
|
# calls type(obj).__getitem__(obj, a, slice(b, c), key(x=1))
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This solution requires everyone who needs keyword arguments to parse the tuple
and/or key object by hand to extract them. This is painful, and it forces the
get/set/del functions to always accept arbitrary keyword arguments, whether they
make sense or not. We want the developer to be able to specify which arguments
make sense and which ones do not.
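
The hand parsing this option would require can be sketched in present-day
Python, since a key object can already be passed explicitly inside the
brackets. Everything here is hypothetical: ``key`` is an assumed stand-in for
the "kwslice" object, and ``Table`` is a made-up class that only understands
the keyword ``x``.

```python
class key:
    """Hypothetical stand-in for the "kwslice" object."""
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class Table:
    def __getitem__(self, index):
        # Normalise to a tuple, then separate plain indices from key objects.
        items = index if isinstance(index, tuple) else (index,)
        positional = tuple(i for i in items if not isinstance(i, key))
        keywords = {}
        for item in items:
            if isinstance(item, key):
                keywords.update(item.kwargs)
        # Validation that an ordinary signature would perform for free.
        unexpected = set(keywords) - {"x"}
        if unexpected:
            raise TypeError(f"unexpected keyword indices: {sorted(unexpected)}")
        return positional, keywords


t = Table()
print(t["a", 1:5, key(x=1)])   # (('a', slice(1, 5, None)), {'x': 1})
```

Every class wanting keyword indices would need to repeat this boilerplate,
including its own checks for unknown names.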
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Using a single bit to change the behavior
|
|
|
|
|
-----------------------------------------
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
A special class dunder flag::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
__keyfn__ = True
|
|
|
|
|
|
|
|
|
|
would change the signature of the ``__get|set|delitem__`` to a "function like" dispatch,
|
2020-09-25 20:00:29 -04:00
|
|
|
|
meaning that this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
>>> d[1, 2, z=3]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
would result in a call to::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
>>> type(d).__getitem__(d, 1, 2, z=3)
# instead of type(d).__getitem__(d, (1, 2), z=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
This option has been rejected because it feels odd that a signature of a method
|
|
|
|
|
depends on a specific value of another dunder. It would be confusing for both
|
2020-09-24 17:19:40 -04:00
|
|
|
|
static type checkers and for humans: a static type checker would have to hard-code
|
2020-09-22 20:14:43 -04:00
|
|
|
|
a special case for this, because there really is nothing else in Python
|
|
|
|
|
where the signature of a dunder depends on the value of another dunder.
|
|
|
|
|
A human who has to implement a ``__getitem__`` dunder would have to check whether the
class (or any of its subclasses) defines ``__keyfn__`` before the dunder can be written.
Moreover, adding a base class that has the ``__keyfn__`` flag set would break
the signature of the current methods. This would be even more problematic if the
flag is changed at runtime, or if the flag is generated by calling a function
that randomly returns True or something else.
|
|
|
|
|
|
|
|
|
|
Allowing for empty index notation obj[]
|
|
|
|
|
---------------------------------------
|
|
|
|
|
|
|
|
|
|
The current proposal prevents ``obj[]`` from being valid notation. However,
a commenter stated:
|
|
|
|
|
|
2021-02-05 11:09:26 -05:00
|
|
|
|
We have ``Tuple[int, int]`` as a tuple of two integers. And we have ``Tuple[int]``
|
2020-09-22 20:14:43 -04:00
|
|
|
|
as a tuple of one integer. And occasionally we need to spell a tuple of *no*
|
|
|
|
|
values, since that's the type of ``()``. But we currently are forced to write
|
|
|
|
|
that as ``Tuple[()]``. If we allowed ``Tuple[]`` that odd edge case would be
|
|
|
|
|
removed.
|
|
|
|
|
|
|
|
|
|
So I probably would be okay with allowing ``obj[]`` syntactically, as long as the
|
|
|
|
|
dict type could be made to reject it.
|
|
|
|
|
|
|
|
|
|
This proposal already established that, in case no positional index is given, the
|
2020-09-24 17:19:40 -04:00
|
|
|
|
passed value must be the empty tuple. Allowing for the empty index notation would
|
2020-09-22 20:14:43 -04:00
|
|
|
|
make the dictionary type accept it automatically, to insert or refer to the value with
|
|
|
|
|
the empty tuple as key. Moreover, a typing notation such as ``Tuple[]`` can easily
|
2020-12-23 13:43:35 -05:00
|
|
|
|
be written as ``Tuple`` without the indexing notation.
|
|
|
|
|
|
|
|
|
|
However, subsequent discussion with Brandt Bucher during implementation has revealed
|
|
|
|
|
that the case ``obj[]`` would fit a natural evolution for variadic generics, giving
|
|
|
|
|
more strength to the above comment. In the end, after a discussion between D'Aprano,
|
|
|
|
|
Bucher and the author, we decided to leave the ``obj[]`` notation as a syntax
error for now, and possibly extend the notation with an additional PEP to
establish the equivalence of ``obj[]`` and ``obj[()]``.
|
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
Sentinel value for no given positional index
|
|
|
|
|
--------------------------------------------
|
|
|
|
|
|
|
|
|
|
The topic of which value to pass as the index in the case of::
|
|
|
|
|
|
|
|
|
|
obj[k=3]
|
|
|
|
|
|
|
|
|
|
has been considerably debated.
|
|
|
|
|
|
|
|
|
|
One apparently rational choice would be to pass no value at all, by making use of
the keyword-only argument feature. Unfortunately, this does not work well with
the ``__setitem__`` dunder: a positional element for the value is always
passed, and we can't "skip over" the index unless we introduce a very odd behavior
where the first argument refers to the index when specified, and to the value when
the index is not specified. This is extremely deceptive and error prone.
|
|
|
|
|
|
|
|
|
|
The above consideration makes it impossible to have a keyword only dunder, and
|
|
|
|
|
opens up the question of what entity to pass for the index position when no index
|
|
|
|
|
is passed::
|
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[k=3] = 5
|
|
|
|
|
# would call type(obj).__setitem__(obj, ???, 5, k=3)
|
2020-09-28 16:13:20 -04:00
|
|
|
|
|
|
|
|
|
A proposed hack would be to let the user specify which entity to use when an
index is not specified, by giving ``index`` a default value. This, however,
also forces a default for ``value`` (which will never be used, as a value is
always passed by design), because non-default parameters cannot follow
defaulted ones::
|
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
def __setitem__(self, index=SENTINEL, value=NEVERUSED, *, k): ...
|
2020-09-28 16:13:20 -04:00
|
|
|
|
|
|
|
|
|
which seems ugly, redundant and confusing. We must therefore accept that some
|
2021-02-03 22:22:25 -05:00
|
|
|
|
form of sentinel index must be passed by the Python implementation when the
|
2020-09-28 16:13:20 -04:00
|
|
|
|
``obj[k=3]`` notation is used. This also means that default arguments to those
|
|
|
|
|
parameters are simply never going to be used (but it's already the
|
|
|
|
|
case with the current implementation, so no change there).
|
|
|
|
|
|
|
|
|
|
Additionally, some classes may want to use ``**kwargs``, instead of a keyword-only
|
|
|
|
|
argument, meaning that having a definition like::
|
|
|
|
|
|
|
|
|
|
def __setitem__(self, index, value, **kwargs):
|
|
|
|
|
|
|
|
|
|
and a user that wants to pass a keyword ``value``::
|
|
|
|
|
|
|
|
|
|
obj[value=1] = 0
|
|
|
|
|
|
|
|
|
|
expecting a call like::
|
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
type(obj).__setitem__(obj, SENTINEL, 0, **{"value": 1})
|
2020-09-28 16:13:20 -04:00
|
|
|
|
|
|
|
|
|
will instead accidentally be caught by the named ``value`` parameter, producing a
``TypeError`` for multiple values of ``value``. The user should not have to worry
about the actual local names of those two arguments if they are, for all practical
purposes, positional only. Declaring the parameters as positional-only prevents
this collision, but it still does not remove the need to pass both ``index`` and
``value`` even when the index is not provided. The point is that the user should not
be prevented from using keyword arguments to refer to a column ``index``, ``value``
(or ``self``) just because the class implementor happens to use those names
in the parameter list.
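
The name collision described above can be reproduced today by calling the
dunder directly. This is a sketch under stated assumptions: ``set_with_keywords``
is a hypothetical helper modelling ``obj[**keywords] = value``, ``SENTINEL`` is
a placeholder for whichever sentinel is chosen, and both classes are made up.

```python
SENTINEL = ()   # placeholder; any sentinel works for this illustration


class Clashing:
    def __setitem__(self, index, value, **kwargs):
        self.last = (index, value, kwargs)


class PositionalOnly:
    # ``index`` and ``value`` cannot be targeted by keyword.
    def __setitem__(self, index, value, /, **kwargs):
        self.last = (index, value, kwargs)


def set_with_keywords(obj, value, /, **keywords):
    """Model ``obj[**keywords] = value`` when no positional index is given."""
    type(obj).__setitem__(obj, SENTINEL, value, **keywords)


try:
    set_with_keywords(Clashing(), 0, value=1)
    clash_message = None
except TypeError as exc:
    clash_message = str(exc)   # the keyword collides with the parameter name

p = PositionalOnly()
set_with_keywords(p, 0, value=1)
print(clash_message)
print(p.last)   # ((), 0, {'value': 1})
```

The helper itself has to declare its parameters as positional-only, otherwise
it would suffer from the very collision it is meant to demonstrate.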
|
|
|
|
|
|
|
|
|
|
Moreover, we also require the three dunders to behave in the same way: it would
|
|
|
|
|
be extremely inconvenient if only ``__setitem__`` were to receive this
|
|
|
|
|
sentinel, and ``__get|delitem__`` would not because they can get away with a
|
|
|
|
|
signature that allows for no index specification, thus allowing for a
|
|
|
|
|
user-specified default index.
|
|
|
|
|
|
|
|
|
|
Whatever the choice of the sentinel, it will make the following cases
|
|
|
|
|
degenerate and thus impossible to differentiate in the dunder::
|
|
|
|
|
|
|
|
|
|
obj[k=3]
|
|
|
|
|
obj[SENTINEL, k=3]
|
|
|
|
|
|
|
|
|
|
The question now shifts to which entity should represent the sentinel.
The options were:
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
1. Empty tuple
|
|
|
|
|
2. None
|
|
|
|
|
3. NotImplemented
|
|
|
|
|
4. a new sentinel object (e.g. NoIndex)
|
|
|
|
|
|
|
|
|
|
For option 1, the call will become::
|
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
type(obj).__getitem__(obj, (), k=3)
|
2020-09-28 16:13:20 -04:00
|
|
|
|
|
|
|
|
|
therefore making ``obj[k=3]`` and ``obj[(), k=3]`` degenerate and indistinguishable.
|
|
|
|
|
|
|
|
|
|
This option sounds appealing because:
|
|
|
|
|
|
|
|
|
|
1. The numpy community was consulted [#numpy-ml]_, and the general consensus
of the responses was that the empty tuple felt appropriate.
|
|
|
|
|
2. It shows a parallel with the behavior of ``*args`` in a function, when
|
|
|
|
|
no positional arguments are given::
|
|
|
|
|
|
|
|
|
|
>>> def foo(*args, **kwargs):
|
|
|
|
|
... print(args, kwargs)
|
|
|
|
|
...
|
|
|
|
|
>>> foo(k=3)
|
|
|
|
|
() {'k': 3}
|
|
|
|
|
|
|
|
|
|
We do accept the following asymmetry in behavior compared to functions
when a single value is passed, but that ship has sailed::
|
|
|
|
|
|
|
|
|
|
>>> foo(5, k=3)
|
|
|
|
|
(5,) {'k': 3} # for indexing, a plain 5, not a 1-tuple is passed
|
|
|
|
|
|
|
|
|
|
For option 2, using ``None``, it was objected that NumPy uses it to indicate
inserting a new axis/dimension (there's a ``np.newaxis`` alias as well)::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
arr = np.array(5)
|
|
|
|
|
arr.ndim == 0
|
|
|
|
|
arr[None].ndim == arr[None,].ndim == 1
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
While this is not an insurmountable issue, it certainly will ripple onto numpy.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
The only issue with both of the above is that both the empty tuple and None are
potentially legitimate indexes, and there might be value in being able to differentiate
the two degenerate cases.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
So, an alternative strategy (option 3) would be to use an existing entity that is
unlikely to be used as a valid index. One option could be the current built-in constant
``NotImplemented``, which is currently returned by operator methods to
report that they do not implement a particular operation, and that a different strategy
should be attempted (e.g. asking the other object). Unfortunately, its name and
traditional use evoke a feature that is not available, rather than the
fact that something was not passed by the user.
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
This leaves us with option 4: a new built-in constant. This constant
must be unhashable (so it's never going to be a valid key) and have a clear
name that makes its context obvious: ``NoIndex``. This
would solve all the above issues, but the question is: is it worth it?
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
From a quick inquiry, most people on python-ideas seem to believe
it's not crucial, and the empty tuple is an acceptable option. Hence the
resulting series will be::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[k=3]
|
|
|
|
|
# type(obj).__getitem__(obj, (), k=3). Empty tuple
|
|
|
|
|
|
|
|
|
|
obj[1, k=3]
|
|
|
|
|
# type(obj).__getitem__(obj, 1, k=3). Integer
|
|
|
|
|
|
|
|
|
|
obj[1, 2, k=3]
|
|
|
|
|
# type(obj).__getitem__(obj, (1, 2), k=3). Tuple
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
and the following two notations will be degenerate::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
obj[(), k=3]
|
|
|
|
|
# type(obj).__getitem__(obj, (), k=3)
|
|
|
|
|
|
|
|
|
|
obj[k=3]
|
|
|
|
|
# type(obj).__getitem__(obj, (), k=3)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Common objections
|
|
|
|
|
=================
|
|
|
|
|
|
|
|
|
|
1. Just use a method call.
|
|
|
|
|
|
|
|
|
|
One of the use cases is typing, where the indexing is used exclusively, and
|
|
|
|
|
function calls are out of the question. Moreover, function calls do not handle
|
|
|
|
|
slice notation, which is commonly used in some cases for arrays.
|
|
|
|
|
|
2021-02-03 22:22:25 -05:00
|
|
|
|
One problem is that type hint creation has been extended to built-ins in Python 3.9,
so that you do not have to import Dict, List, et al. anymore.
|
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
Without keyword arguments inside ``[]``, you would not be able to do this::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-25 20:00:29 -04:00
|
|
|
|
Vector = dict[i=float, j=float]
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
but for obvious reasons, call syntax using builtins to create custom type hints
|
2020-09-25 20:00:29 -04:00
|
|
|
|
isn't an option::
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-12-23 13:05:12 -05:00
|
|
|
|
dict(i=float, j=float)
|
|
|
|
|
# would create a dictionary, not a type
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
2020-09-28 16:13:20 -04:00
|
|
|
|
Finally, function calls do not allow for a setitem-like notation, as shown
in the Overview: operations such as ``f(1, x=3) = 5`` are not allowed, whereas
the equivalent indexing operations are.
|
|
|
|
|
|
|
|
|
|
|
2020-09-22 20:14:43 -04:00
|
|
|
|
References
|
|
|
|
|
==========
|
|
|
|
|
|
|
|
|
|
.. [#rejection] "Rejection of PEP 472"
|
|
|
|
|
(https://mail.python.org/pipermail/python-dev/2019-March/156693.html)
|
2020-09-24 17:19:40 -04:00
|
|
|
|
.. [#pep-0484] "PEP 484 -- Type hints"
|
2020-09-22 20:14:43 -04:00
|
|
|
|
(https://www.python.org/dev/peps/pep-0484)
|
|
|
|
|
.. [#request-1] "Allow kwargs in __{get|set|del}item__"
|
|
|
|
|
(https://mail.python.org/archives/list/python-ideas@python.org/thread/EUGDRTRFIY36K4RM3QRR52CKCI7MIR2M/)
|
|
|
|
|
.. [#request-2] "PEP 472 -- Support for indexing with keyword arguments"
|
|
|
|
|
(https://mail.python.org/archives/list/python-ideas@python.org/thread/6OGAFDWCXT5QVV23OZWKBY4TXGZBVYZS/)
|
|
|
|
|
.. [#pep-0001] "PEP 1 -- PEP Purpose and Guidelines"
|
|
|
|
|
(https://www.python.org/dev/peps/pep-0001/#what-belongs-in-a-successful-pep)
|
2020-10-19 10:30:08 -04:00
|
|
|
|
.. [#trio-run] "trio.run() should take \*\*kwargs in addition to \*args"
|
|
|
|
|
(https://github.com/python-trio/trio/issues/470)
|
2021-01-21 18:50:03 -05:00
|
|
|
|
.. [#pep-0646] "PEP 646 -- Variadic Generics"
|
|
|
|
|
(https://www.python.org/dev/peps/pep-0646)
|
2020-09-28 16:13:20 -04:00
|
|
|
|
.. [#numpy-ml] "[Numpy-discussion] Request for comments on PEP 637 - Support for indexing with keyword arguments"
|
|
|
|
|
(http://numpy-discussion.10968.n7.nabble.com/Request-for-comments-on-PEP-637-Support-for-indexing-with-keyword-arguments-td48489.html)
|
2020-12-23 13:05:12 -05:00
|
|
|
|
.. [#reference-impl] "Reference implementation"
|
|
|
|
|
(https://github.com/python/cpython/compare/master...stefanoborini:PEP-637-implementation-attempt-2)
|
2020-09-22 20:14:43 -04:00
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
End:
|