PEP: 472
Title: Support for indexing with keyword arguments
Version: $Revision$
Last-Modified: $Date$
Author: Stefano Borini, Joseph Martinot-Lagarde
Discussions-To: python-ideas@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Jun-2014
Python-Version: 3.6
Post-History: 02-Jul-2014
Resolution: https://mail.python.org/pipermail/python-dev/2019-March/156693.html

Abstract
========

This PEP proposes an extension of the indexing operation to support keyword
arguments. Notations in the form ``a[K=3,R=2]`` would become legal syntax.
For future-proofing considerations, notations such as ``a[1:2, K=3, R=4]``
are also considered and may be allowed as well, depending on the choice of
implementation. In addition to a change in the parser, the index protocol
(``__getitem__``, ``__setitem__`` and ``__delitem__``) will also potentially
require adaptation.

Motivation
==========

The indexing syntax carries a strong semantic content, differentiating it from
a method call: it implies referring to a subset of data. We believe this
semantic association to be important, and wish to expand the strategies allowed
to refer to this data.

As a general observation, the number of indices needed by an indexing operation
depends on the dimensionality of the data: one-dimensional data (e.g. a list)
requires one index (e.g. ``a[3]``), two-dimensional data (e.g. a matrix) requires
two indices (e.g. ``a[2,3]``) and so on. Each index is a selector along one of the
axes of the dimensionality, and the position in the index tuple is the
metainformation needed to associate each index to the corresponding axis.

The current Python syntax focuses exclusively on position to express the
association to the axes, and also contains syntactic sugar for selections that
span more than a single point (slices)

::

    >>> a[3]       # returns the fourth element of a
    >>> a[1:10:2]  # slice notation (extract a non-trivial data subset)
    >>> a[3,2]     # multiple indexes (for multidimensional arrays)

The additional notation proposed in this PEP would allow notations involving
keyword arguments in the indexing operation, e.g.

::

    >>> a[K=3, R=2]

which would allow referring to axes by conventional names.

One must additionally consider the extended form that allows both positional
and keyword specification

::

    >>> a[3,R=3,K=4]

This PEP will explore different strategies to enable the use of these notations.

Use cases
=========

The following practical use cases present two broad categories of usage of a
keyworded specification: indexing and contextual option. For indexing:

1. To provide a more communicative meaning to the index, preventing e.g. accidental
   inversion of indexes

   ::

       >>> gridValues[x=3, y=5, z=8]
       >>> rain[time=0:12, location=location]

2. In some domains, such as computational physics and chemistry, the use of a
   notation such as ``BasisSet[Z=5]`` is a Domain Specific Language notation to
   represent a level of accuracy

   ::

       >>> low_accuracy_energy = computeEnergy(molecule, BasisSet[Z=3])

   In this case, the index operation would return a basis set at the chosen level
   of accuracy (represented by the parameter Z). The reason for using indexing here
   is that the BasisSet object could be internally represented as a numeric table,
   where rows (the "coefficient" axis, hidden to the user in this example) are
   associated to individual elements (e.g. rows 0:5 contain coefficients for
   element 1, rows 5:8 coefficients for element 2) and each column is associated
   to a given degree of accuracy ("accuracy" or "Z" axis) so that the first column
   is low accuracy, the second column is medium accuracy and so on. With that
   indexing, the user would obtain another object representing the contents of the
   column of the internal table for accuracy level 3.

Additionally, the keyword specification can be used as an option contextual to
the indexing. Specifically:

1. A "default" option allows specifying a default return value when the index
   is not present

   ::

       >>> lst = [1, 2, 3]
       >>> value = lst[5, default=0]  # value is 0

2. For a sparse dataset, to specify an interpolation strategy
   to infer a missing point from e.g. its surrounding data.

   ::

       >>> value = array[1, 3, interpolate=spline_interpolator]

3. A unit could be specified with the same mechanism

   ::

       >>> value = array[1, 3, unit="degrees"]

How the notation is interpreted is up to the implementing class.

Current implementation
======================

Currently, the indexing operation is handled by methods ``__getitem__``,
``__setitem__`` and ``__delitem__``. The signatures of these methods accept one
argument for the index (with ``__setitem__`` accepting an additional argument
for the set value). In the following, we will analyze ``__getitem__(self, idx)``
exclusively, with the same considerations implied for the remaining two methods.

When an indexing operation is performed, ``__getitem__(self, idx)`` is called.
Traditionally, the full content between square brackets is turned into a single
object passed to argument ``idx``:

- When a single element is passed, e.g. ``a[2]``, ``idx`` will be ``2``.
- When multiple elements are passed, they must be separated by commas: ``a[2, 3]``.
  In this case, ``idx`` will be a tuple ``(2, 3)``. With ``a[2, 3, "hello", {}]``
  ``idx`` will be ``(2, 3, "hello", {})``.
- A slicing notation e.g. ``a[2:10]`` will produce a slice object, or a tuple
  containing slice objects if multiple values were passed.

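The three cases listed above can be observed directly with a small probe
class. This is only an illustrative sketch: ``Probe`` is a hypothetical helper
that is not part of this PEP, and it simply returns whatever ``__getitem__``
receives.

```python
class Probe:
    """Hypothetical helper: returns whatever __getitem__ receives."""

    def __getitem__(self, idx):
        return idx


a = Probe()

single = a[2]                    # a bare object, not a tuple
multiple = a[2, 3, "hello", {}]  # all entries packed into one tuple
sliced = a[2:10]                 # a slice object
mixed = a[1:10:2, 3]             # a tuple that contains a slice

print(single, multiple, sliced, mixed)
```

In every case ``__getitem__`` receives exactly one object; only its type and
contents vary.
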
Except for its unique ability to handle slice notation, the indexing operation
has similarities to a plain method call: it acts like one when invoked with
only one element; if the number of elements is greater than one, the ``idx``
argument behaves like a ``*args``. However, as stated in the Motivation section,
an indexing operation has the strong semantic implication of extraction of a
subset out of a larger set, which is not automatically associated to a regular
method call unless appropriate naming is chosen. Moreover, its different visual
style is important for readability.

Specifications
==============

The implementation should try to preserve the current signature for
``__getitem__``, or modify it in a backward-compatible way. We will present
different alternatives, taking into account the possible cases that need
to be addressed

::

    C0. a[1]; a[1,2]         # Traditional indexing
    C1. a[Z=3]
    C2. a[Z=3, R=4]
    C3. a[1, Z=3]
    C4. a[1, Z=3, R=4]
    C5. a[1, 2, Z=3]
    C6. a[1, 2, Z=3, R=4]
    C7. a[1, Z=3, 2, R=4]    # Interposed ordering

Strategy "Strict dictionary"
----------------------------

This strategy acknowledges that ``__getitem__`` is special in accepting only
one object, and the nature of that object must be non-ambiguous in its
specification of the axes: it can be either by order, or by name. As a result
of this assumption, in the presence of keyword arguments, the passed entity is a
dictionary and all labels must be specified.

::

    C0. a[1]; a[1,2]      -> idx = 1; idx = (1, 2)
    C1. a[Z=3]            -> idx = {"Z": 3}
    C2. a[Z=3, R=4]       -> idx = {"Z": 3, "R": 4}
    C3. a[1, Z=3]         -> raise SyntaxError
    C4. a[1, Z=3, R=4]    -> raise SyntaxError
    C5. a[1, 2, Z=3]      -> raise SyntaxError
    C6. a[1, 2, Z=3, R=4] -> raise SyntaxError
    C7. a[1, Z=3, 2, R=4] -> raise SyntaxError

Pros
''''

- Strong conceptual similarity between the tuple case and the dictionary case.
  In the first case, we are specifying a tuple, so we are naturally defining
  a plain set of values separated by commas. In the second, we are specifying a
  dictionary, so we are specifying a homogeneous set of key/value pairs, as
  in ``dict(Z=3, R=4)``;
- Simple and easy to parse on the ``__getitem__`` side: if it gets a tuple,
  determine the axes using positioning. If it gets a dictionary, use
  the keywords.
- C interface does not need changes.

Neutral
'''''''

- Degeneracy of ``a[{"Z": 3, "R": 4}]`` with ``a[Z=3, R=4]`` means the notation
  is syntactic sugar.

Cons
''''

- Very strict.
- Destroys ordering of the passed arguments. Preserving the
  order would be possible with an OrderedDict as drafted by :pep:`468`.
- Does not allow use cases with mixed positional/keyword arguments such as
  ``a[1, 2, default=5]``.

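As a rough sketch of what the receiving side could look like under this
strategy, the class below dispatches on the type of ``idx``. Because
``a[Z=3, R=4]`` would be degenerate with ``a[{"Z": 3, "R": 4}]``, the
dictionary form is used here as a stand-in for the proposed syntax. ``Grid``,
its ``axes`` attribute and its storage layout are assumptions made for
illustration only.

```python
class Grid:
    """Hypothetical 3-D container with named axes (illustration only)."""

    axes = ("x", "y", "z")

    def __init__(self, values):
        self._values = values  # maps (x, y, z) tuples to stored values

    def __getitem__(self, idx):
        if isinstance(idx, dict):
            # Keyword form: all axis labels must be specified.
            if set(idx) != set(self.axes):
                raise KeyError(f"expected axes {self.axes}, got {list(idx)}")
            # Reorder the values by axis name, whatever order was given.
            idx = tuple(idx[name] for name in self.axes)
        return self._values[idx]


grid_values = Grid({(3, 5, 8): 42.0})

by_position = grid_values[3, 5, 8]
# Stand-in for the proposed grid_values[y=5, z=8, x=3]:
by_name = grid_values[{"y": 5, "z": 8, "x": 3}]
```

Matching names to axes is left entirely to ``__getitem__``; the interpreter
would only deliver the dictionary.
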
Strategy "mixed dictionary"
---------------------------

This strategy relaxes the above constraint to return a dictionary containing
both numbers and strings as keys.

::

    C0. a[1]; a[1,2]      -> idx = 1; idx = (1, 2)
    C1. a[Z=3]            -> idx = {"Z": 3}
    C2. a[Z=3, R=4]       -> idx = {"Z": 3, "R": 4}
    C3. a[1, Z=3]         -> idx = { 0: 1, "Z": 3}
    C4. a[1, Z=3, R=4]    -> idx = { 0: 1, "Z": 3, "R": 4}
    C5. a[1, 2, Z=3]      -> idx = { 0: 1, 1: 2, "Z": 3}
    C6. a[1, 2, Z=3, R=4] -> idx = { 0: 1, 1: 2, "Z": 3, "R": 4}
    C7. a[1, Z=3, 2, R=4] -> idx = { 0: 1, "Z": 3, 2: 2, "R": 4}

Pros
''''
- Allows mixed cases.

Cons
''''
- Destroys ordering information for string keys. We have no way of saying if
  ``"Z"`` in C7 was in position 1 or 3.
- Implies switching from a tuple to a dict as soon as one specified index
  has a keyword argument. May be confusing to parse.

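To give an idea of the parsing burden this strategy places on
``__getitem__``, the sketch below separates the integer keys from the string
keys of such a dictionary. The helper name ``split_mixed_index`` is
hypothetical and not part of the proposal.

```python
def split_mixed_index(idx):
    """Separate integer (positional) keys from string (keyword) keys."""
    positional = {k: v for k, v in idx.items() if isinstance(k, int)}
    keywords = {k: v for k, v in idx.items() if isinstance(k, str)}
    # Positional entries are recovered in order of their integer key.
    args = tuple(positional[k] for k in sorted(positional))
    return args, keywords


# The dictionaries that cases C6 and C7 would produce:
c6 = {0: 1, 1: 2, "Z": 3, "R": 4}
c7 = {0: 1, "Z": 3, 2: 2, "R": 4}

c6_args, c6_keywords = split_mixed_index(c6)
c7_args, c7_keywords = split_mixed_index(c7)
```

Both cases yield the same positional tuple and the same keywords, so the
receiver can tell the two apart only by inspecting the integer keys
themselves.
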
Strategy "named tuple"
----------------------

Return a named tuple for ``idx`` instead of a tuple. Keyword arguments would
obviously have their stated name as key, and positional arguments would have an
underscore followed by their position:

::

    C0. a[1]; a[1,2]      -> idx = 1; idx = (_0=1, _1=2)
    C1. a[Z=3]            -> idx = (Z=3)
    C2. a[Z=3, R=2]       -> idx = (Z=3, R=2)
    C3. a[1, Z=3]         -> idx = (_0=1, Z=3)
    C4. a[1, Z=3, R=2]    -> idx = (_0=1, Z=3, R=2)
    C5. a[1, 2, Z=3]      -> idx = (_0=1, _1=2, Z=3)
    C6. a[1, 2, Z=3, R=4] -> idx = (_0=1, _1=2, Z=3, R=4)
    C7. a[1, Z=3, 2, R=4] -> idx = (_0=1, Z=3, _1=2, R=4)
                          or idx = (_0=1, Z=3, _2=2, R=4)
                          or raise SyntaxError

The required typename of the namedtuple could be ``Index`` or the name of the
argument in the function definition. The named tuple keeps the ordering and is
easy to analyse through the ``_fields`` attribute. It is backward compatible,
provided that C0 with more than one entry now passes a namedtuple instead of a
plain tuple.

Pros
''''
- Looks nice. namedtuple transparently replaces tuple and gracefully
  degrades to the old behavior.
- Does not require a change in the C interface

Cons
''''
- According to some sources [#namedtuple]_ namedtuple is not well developed.
  To rely on it for such an important object would probably require rework
  and improvement;
- The namedtuple fields, and thus the type, will have to change according
  to the passed arguments. This can be a performance bottleneck, and makes
  it impossible to guarantee that two subsequent index accesses get the same
  Index class;
- The ``_n`` "magic" fields are a bit unusual, but IPython already uses them
  for result history.
- Python currently has no builtin namedtuple. The current one is available
  in the "collections" module in the standard library.
- Differently from a function, the two notations ``gridValues[x=3, y=5, z=8]``
  and ``gridValues[3,5,8]`` would not gracefully match if the order is modified
  at call time (e.g. we ask for ``gridValues[y=5, z=8, x=3]``). In a function,
  we can pre-define argument names so that keyword arguments are properly
  matched. Not so in ``__getitem__``, leaving the task of interpreting and
  matching to ``__getitem__`` itself.

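The index objects described in this strategy can be approximated today with
``collections.namedtuple``. The sketch below relies on its ``rename=True``
option, which replaces invalid field names with ``_<position>``; it therefore
reproduces the second C7 alternative, ``(_0=1, Z=3, _2=2, R=4)``. The helper
``make_index`` and its ``(name, value)`` calling convention are assumptions
made for illustration only.

```python
from collections import namedtuple


def make_index(*entries):
    """Build a hypothetical Index from (name, value) pairs.

    A name of None marks a positional entry; rename=True then gives it
    the automatic field name "_<position>".
    """
    names = ["_" if name is None else name for name, _ in entries]
    index_type = namedtuple("Index", names, rename=True)
    return index_type(*(value for _, value in entries))


c3 = make_index((None, 1), ("Z", 3))                       # a[1, Z=3]
c7 = make_index((None, 1), ("Z", 3), (None, 2), ("R", 4))  # a[1, Z=3, 2, R=4]
other = make_index(("Z", 3), ("R", 4))                     # a[Z=3, R=4]

print(c3._fields, c7._fields, other._fields)
# A new class is created on every call, so the index types differ.
print(type(c3) is type(other))
```

Each result is still a ``tuple`` subclass, which is what lets it stand in for
a plain tuple, but a fresh class is built per call, illustrating the second
item in the Cons list.
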
Strategy "New argument contents"
--------------------------------

In the current implementation, when many arguments are passed to ``__getitem__``,
they are grouped in a tuple and this tuple is passed to ``__getitem__`` as the
single argument ``idx``. This strategy keeps the current signature, but expands the
range of variability in type and contents of ``idx`` to more complex representations.

We identify four possible ways to implement this strategy:

- **P1**: uses a single dictionary for the keyword arguments.
- **P2**: uses individual single-item dictionaries.
- **P3**: similar to **P2**, but replaces single-item dictionaries with a ``(key, value)`` tuple.
- **P4**: similar to **P2**, but uses a special and additional new object: ``keyword()``

Some of these possibilities lead to degenerate notations, i.e. indistinguishable
from an already possible representation. Once again, the proposed notation
becomes syntactic sugar for these representations.

Under this strategy, the old behavior for C0 is unchanged.

::

    C0: a[1]        -> idx = 1                    # integer
        a[1,2]      -> idx = (1,2)                # tuple

In C1, we can use either a dictionary or a tuple to represent the key and value
pair for the specific indexing entry. In C1 the tuple form must be a tuple
containing a tuple, because otherwise we cannot differentiate ``a["Z", 3]``
from ``a[Z=3]``.

::

    C1: a[Z=3]      -> idx = {"Z": 3}             # P1/P2 dictionary with single key
                    or idx = (("Z", 3),)          # P3 tuple of tuples
                    or idx = keyword("Z", 3)      # P4 keyword object

As you can see, notation P1/P2 implies that ``a[Z=3]`` and ``a[{"Z": 3}]`` will
call ``__getitem__`` passing the exact same value, and is therefore syntactic
sugar for the latter. The same situation occurs, although with a different
index, for P3. Using a keyword object as in P4 would remove this degeneracy.

For the C2 case:

::

    C2. a[Z=3, R=4] -> idx = {"Z": 3, "R": 4}         # P1 dictionary/ordereddict
                    or idx = ({"Z": 3}, {"R": 4})     # P2 tuple of two single-key dict
                    or idx = (("Z", 3), ("R", 4))     # P3 tuple of tuples
                    or idx = (keyword("Z", 3),
                              keyword("R", 4) )       # P4 keyword objects

P1 naturally maps to the traditional ``**kwargs`` behavior, however it breaks
the convention that two or more entries for the index produce a tuple. P2
preserves this behavior, and additionally preserves the order. Preserving the
order would also be possible with an OrderedDict as drafted by :pep:`468`.

The remaining cases are shown here:

::

    C3. a[1, Z=3]         -> idx = (1, {"Z": 3})                     # P1/P2
                          or idx = (1, ("Z", 3))                     # P3
                          or idx = (1, keyword("Z", 3))              # P4

    C4. a[1, Z=3, R=4]    -> idx = (1, {"Z": 3, "R": 4})             # P1
                          or idx = (1, {"Z": 3}, {"R": 4})           # P2
                          or idx = (1, ("Z", 3), ("R", 4))           # P3
                          or idx = (1, keyword("Z", 3),
                                       keyword("R", 4))              # P4

    C5. a[1, 2, Z=3]      -> idx = (1, 2, {"Z": 3})                  # P1/P2
                          or idx = (1, 2, ("Z", 3))                  # P3
                          or idx = (1, 2, keyword("Z", 3))           # P4

    C6. a[1, 2, Z=3, R=4] -> idx = (1, 2, {"Z":3, "R": 4})           # P1
                          or idx = (1, 2, {"Z": 3}, {"R": 4})        # P2
                          or idx = (1, 2, ("Z", 3), ("R", 4))        # P3
                          or idx = (1, 2, keyword("Z", 3),
                                          keyword("R", 4))           # P4

    C7. a[1, Z=3, 2, R=4] -> idx = (1, 2, {"Z": 3, "R": 4})          # P1. Pack the keyword arguments. Ugly.
                          or raise SyntaxError                       # P1. Same behavior as in function calls.
                          or idx = (1, {"Z": 3}, 2, {"R": 4})        # P2
                          or idx = (1, ("Z", 3), 2, ("R", 4))        # P3
                          or idx = (1, keyword("Z", 3),
                                       2, keyword("R", 4))           # P4

Pros
''''
- Signature is unchanged;
- P2/P3 can preserve ordering of keyword arguments as specified at indexing,
- P1 needs an OrderedDict, but would destroy interposed ordering if allowed:
  all keyword indexes would be dumped into the dictionary;
- Stays within traditional types: tuples and dicts, possibly OrderedDict;
- Some proposed strategies are similar in behavior to a traditional function call;
- The C interface for ``PyObject_GetItem`` and family would remain unchanged.

Cons
''''
- Apparently complex and wasteful;
- Degeneracy in notation (e.g. ``a[Z=3]`` and ``a[{"Z":3}]`` are equivalent and
  indistinguishable notations at the ``__[get|set|del]item__`` level).
  This behavior may or may not be acceptable.
- For P4, an additional object similar in nature to slice() is needed,
  but only to disambiguate the above degeneracy.
- ``idx`` type and layout seem to change depending on the whims of the caller;
- May be complex to parse what is passed, especially in the case of tuple of tuples;
- P2 creates a lot of single-key dictionaries as members of a tuple. Looks ugly.
  P3 would be lighter and easier to use than the tuple of dicts, and still
  preserves order (unlike the regular dict), but would result in clumsy
  extraction of keywords.

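A minimal sketch of variant P4 may help to show why a dedicated object removes
the degeneracy. The ``keyword`` class below is a hypothetical stand-in for the
proposed object (this PEP does not specify its attributes), and ``parse_index``
is an illustrative helper that a ``__getitem__`` implementation might use.

```python
class keyword:
    """Hypothetical stand-in for the P4 ``keyword()`` object."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __repr__(self):
        return f"keyword({self.name!r}, {self.value!r})"


def parse_index(idx):
    """Split an index into positional entries and keyword entries."""
    entries = idx if isinstance(idx, tuple) else (idx,)
    args = tuple(e for e in entries if not isinstance(e, keyword))
    kwargs = {e.name: e.value for e in entries if isinstance(e, keyword)}
    return args, kwargs


# What a[1, Z=3, 2, R=4] (case C7) would pass under P4:
c7 = (1, keyword("Z", 3), 2, keyword("R", 4))
args, kwargs = parse_index(c7)

# A dictionary passed explicitly stays positional, so a[{"Z": 3}]
# is no longer confused with a[Z=3]:
dict_args, dict_kwargs = parse_index({"Z": 3})
```

The same helper written for P1, P2 or P3 would have to guess whether a
dictionary or a two-item tuple was meant as a keyword or as an ordinary index
value.
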
Strategy "kwargs argument"
--------------------------

``__getitem__`` accepts an optional ``**kwargs`` argument which should be keyword only.
``idx`` also becomes optional to support the case where no positional arguments are passed.
The signature would then be one of

::

    __getitem__(self, idx)
    __getitem__(self, idx, **kwargs)
    __getitem__(self, **kwargs)

Applied to our cases, this would produce:

::

    C0. a[1,2]            -> idx=(1,2);  kwargs={}
    C1. a[Z=3]            -> idx=None ;  kwargs={"Z":3}
    C2. a[Z=3, R=4]       -> idx=None ;  kwargs={"Z":3, "R":4}
    C3. a[1, Z=3]         -> idx=1    ;  kwargs={"Z":3}
    C4. a[1, Z=3, R=4]    -> idx=1    ;  kwargs={"Z":3, "R":4}
    C5. a[1, 2, Z=3]      -> idx=(1,2);  kwargs={"Z":3}
    C6. a[1, 2, Z=3, R=4] -> idx=(1,2);  kwargs={"Z":3, "R":4}
    C7. a[1, Z=3, 2, R=4] -> raise SyntaxError # in agreement to function behavior

Empty indexing ``a[]`` of course remains invalid syntax.

Pros
''''
- Similar to function call, evolves naturally from it;
- Use of keyword indexing with an object whose ``__getitem__``
  doesn't have a kwargs will fail in an obvious way.
  That's not the case for the other strategies.

Cons
''''
- It doesn't preserve order, unless an OrderedDict is used;
- Forbids C7, but is it really needed?
- Requires a change in the C interface to pass an additional
  PyObject for the keyword arguments.

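Since the subscript syntax cannot pass keywords, the behaviour of this
strategy can only be mimicked by calling ``__getitem__`` directly. The sketch
below does so under that assumption; ``Table`` and ``Legacy`` are hypothetical
classes, and the direct method calls stand in for what the interpreter would
do with the bracket notation shown in the comments.

```python
class Table:
    """Hypothetical class using the proposed signature."""

    def __getitem__(self, idx=None, **kwargs):
        return idx, kwargs


class Legacy:
    """A class written before the proposal: no **kwargs."""

    def __getitem__(self, idx):
        return idx


a = Table()
c1 = a.__getitem__(Z=3)               # stands in for a[Z=3]
c6 = a.__getitem__((1, 2), Z=3, R=4)  # stands in for a[1, 2, Z=3, R=4]

try:
    Legacy().__getitem__(1, Z=3)      # stands in for a[1, Z=3] on an old class
except TypeError as exc:
    legacy_error = exc
```

The ``TypeError`` raised by ``Legacy`` is the "obvious" failure mentioned in
the Pros list: a class that does not declare ``**kwargs`` rejects the keyword
instead of silently receiving an unexpected ``idx``.
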
C interface
===========

As briefly introduced in the previous analysis, the C interface would
potentially have to change to allow the new feature. Specifically,
``PyObject_GetItem`` and related routines would have to accept an additional
``PyObject *kw`` argument for Strategy "kwargs argument". The remaining
strategies would not require a change in the C function signatures, but the
different nature of the passed object would potentially require adaptation.

Strategy "named tuple" would behave correctly without any change: the class
returned by the factory function in ``collections`` is a subclass of tuple,
meaning that ``PyTuple_*`` functions can handle the resulting object.

Alternative Solutions
=====================

In this section, we present alternative solutions that would work around the
missing feature and make the proposed enhancement not worth implementing.

Use a method
------------

One could keep the indexing as is, and use a traditional ``get()`` method for those
cases where basic indexing is not enough. This is a good point, but as already
noted in the Motivation section, methods have a different semantic weight from
indexing, and you can't use slices directly in methods. Compare e.g.
``a[1:3, Z=2]`` with ``a.get(slice(1,3), Z=2)``.

The authors however recognize this argument as compelling: the advantage
in semantic expressivity of keyword-based indexing may not be enough to justify
a rarely used feature that brings limited benefit and may see limited adoption.

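The comparison above can be tried with existing syntax. The following is a
minimal sketch; ``Series`` and its ``default`` option are hypothetical and
chosen only to show the two spellings side by side.

```python
class Series:
    """Hypothetical sequence wrapper with a keyword-aware get()."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, idx):
        return self._data[idx]

    def get(self, idx, default=None):
        # Integers and slices are both forwarded to __getitem__.
        try:
            return self[idx]
        except IndexError:
            return default


s = Series([10, 20, 30, 40])

with_brackets = s[1:3]            # slice syntax is available
with_method = s.get(slice(1, 3))  # the slice must be spelled out
missing = s.get(9, default=0)     # the keyword already works in a method
```

The method form is fully functional; what it loses is the ``1:3`` shorthand
and the visual cue that a subset of the data is being selected.
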
Emulate requested behavior by abusing the slice object
------------------------------------------------------

This extremely creative method exploits the slice objects' behavior, provided
that one accepts using strings (or instantiating properly named placeholder
objects for the keys), and using ":" instead of "=".

::

    >>> a["K":3]
    slice('K', 3, None)
    >>> a["K":3, "R":4]
    (slice('K', 3, None), slice('R', 4, None))

While clearly smart, this approach does not allow easy inspection of the key/value
pair, it's too clever and esoteric, and it does not allow passing a slice as in
``a[K=1:10:2]``.

However, Tim Delaney comments

    "I really do think that ``a[b=c, d=e]`` should just be syntax sugar for
    ``a['b':c, 'd':e]``. It's simple to explain, and gives the greatest backwards
    compatibility. In particular, libraries that already abused slices in this
    way will just continue to work with the new syntax."

We think this behavior would produce inconvenient results. The library Pandas uses
strings as labels, allowing notation such as

::

    >>> a[:, "A":"F"]

to extract data from column "A" to column "F". Under the above comment, this notation
would be equally obtained with

::

    >>> a[:, A="F"]

which is weird and collides with the intended meaning of keyword in indexing, that
is, specifying the axis through conventional names rather than positioning.

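The collision described above can be reproduced with a small receiver that
interprets string-started slices as keywords. ``SliceKeywords`` is a
hypothetical class written for this illustration; it is not how Pandas or any
other library behaves.

```python
class SliceKeywords:
    """Hypothetical class reading "name":value slices as keywords."""

    def __getitem__(self, idx):
        entries = idx if isinstance(idx, tuple) else (idx,)
        args, kwargs = [], {}
        for entry in entries:
            if isinstance(entry, slice) and isinstance(entry.start, str):
                # slice("K", 3) is taken to mean K=3.
                kwargs[entry.start] = entry.stop
            else:
                args.append(entry)
        return tuple(args), kwargs


a = SliceKeywords()

keywords_only = a["K":3, "R":4]  # intended as K=3, R=4
ambiguous = a[:, "A":"F"]        # a label range, misread as A="F"
```

The receiver has no way to know that ``"A":"F"`` was meant as a range of
labels rather than as a keyword, which is the ambiguity the authors object to.
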
Pass a dictionary as an additional index
----------------------------------------

::

    >>> a[1, 2, {"K": 3}]

This notation, although less elegant, can already be used and achieves similar
results. It's evident that the proposed Strategy "New argument contents" can be
interpreted as syntactic sugar for this notation.

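As an illustration of how far this existing notation already goes, the sketch
below implements the "default" use case from the Use cases section with a
trailing dictionary. ``DefaultingList`` and the ``"default"`` option name are
assumptions made for this example.

```python
class DefaultingList:
    """Hypothetical list wrapper: a trailing dict carries the options."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, idx):
        options = {}
        if isinstance(idx, tuple) and idx and isinstance(idx[-1], dict):
            # Peel the options dictionary off the end of the index tuple.
            *positional, options = idx
            idx = positional[0] if len(positional) == 1 else tuple(positional)
        try:
            return self._items[idx]
        except IndexError:
            if "default" in options:
                return options["default"]
            raise


lst = DefaultingList([1, 2, 3])

present = lst[1, {"default": 0}]  # index exists, default is ignored
absent = lst[5, {"default": 0}]   # index is missing, default is returned
```

The price is that a dictionary can no longer be used as an ordinary index
value in last position, which is the degeneracy discussed for variants P1 and
P2.
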
Additional Comments
===================

Commenters also expressed the following relevant points:

Relevance of ordering of keyword arguments
------------------------------------------

As part of the discussion of this PEP, it's important to decide if the ordering
information of the keyword arguments is important, and if indexes and keys can
be ordered in an arbitrary way (e.g. ``a[1,Z=3,2,R=4]``). :pep:`468`
tries to address the first point by proposing the use of an ordereddict,
however one would be inclined to accept that keyword arguments in indexing are
equivalent to kwargs in function calls, and therefore as of today equally
unordered, and with the same restrictions.

Need for homogeneity of behavior
--------------------------------

Relative to Strategy "New argument contents", a comment from Ian Cordasco
points out that

    "it would be unreasonable for just one method to behave totally
    differently from the standard behaviour in Python. It would be confusing for
    only ``__getitem__`` (and ostensibly, ``__setitem__``) to take keyword
    arguments but instead of turning them into a dictionary, turn them into
    individual single-item dictionaries."

We agree with his point, however it must be pointed out that ``__getitem__``
is already special in some regards when it comes to passed arguments.

Chris Angelico also states:

    "it seems very odd to start out by saying "here, let's give indexing the
    option to carry keyword args, just like with function calls", and then come
    back and say "oh, but unlike function calls, they're inherently ordered and
    carried very differently"."

Again, we agree on this point. The most straightforward strategy to keep
homogeneity would be Strategy "kwargs argument", opening to a ``**kwargs``
argument on ``__getitem__``.

One of the authors (Stefano Borini) thinks that only the "strict dictionary"
strategy is worth implementing. It is non-ambiguous, simple, does not
force complex parsing, and addresses the problem of referring to axes either
by position or by name. The "options" use case is probably best handled with
a different approach, and may be irrelevant for this PEP. The alternative
"named tuple" is another valid choice.

Having .get() become obsolete for indexing with default fallback
----------------------------------------------------------------

Introducing a "default" keyword could make ``dict.get()`` obsolete, which would be
replaced by ``d["key", default=3]``. Chris Angelico however states:

    "Currently, you need to write ``__getitem__`` (which raises an exception on
    finding a problem) plus something else, e.g. ``get()``, which returns a default
    instead. By your proposal, both branches would go inside ``__getitem__``, which
    means they could share code; but there still need to be two branches."

Additionally, Chris continues:

    "There'll be an ad-hoc and fairly arbitrary puddle of names (some will go
    ``default=``, others will say that's way too long and go ``def=``, except that
    that's a keyword so they'll use ``dflt=`` or something...), unless there's a
    strong force pushing people to one consistent name."

This argument is valid but it's equally valid for any function call, and is
generally fixed by established convention and documentation.

On degeneracy of notation
-------------------------

User Drekin commented: "The case of ``a[Z=3]`` and ``a[{"Z": 3}]`` is similar to
current ``a[1, 2]`` and ``a[(1, 2)]``. Even though one may argue that the parentheses
are actually not part of tuple notation but are just needed because of syntax,
it may look as degeneracy of notation when compared to function call: ``f(1, 2)``
is not the same thing as ``f((1, 2))``."

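The asymmetry pointed out in this comment is easy to verify with current
Python. ``Probe`` and ``f`` below are throwaway helpers defined only for this
check.

```python
class Probe:
    """Hypothetical helper: returns whatever __getitem__ receives."""

    def __getitem__(self, idx):
        return idx


def f(*args):
    """Return the positional arguments exactly as received."""
    return args


a = Probe()

same_index = a[1, 2] == a[(1, 2)]  # both pass the tuple (1, 2)
same_call = f(1, 2) == f((1, 2))   # (1, 2) versus ((1, 2),)

print(same_index, same_call)
```

Indexing already has one degenerate pair of notations, whereas a function call
keeps the two spellings distinct.
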
References
==========

.. [#keyword-1] "keyword-only args in __getitem__"
   (http://article.gmane.org/gmane.comp.python.ideas/27584)

.. [#keyword-2] "Accepting keyword arguments for __getitem__"
   (https://mail.python.org/pipermail/python-ideas/2014-June/028164.html)

.. [#keyword-3] "PEP pre-draft: Support for indexing with keyword arguments"
   (https://mail.python.org/pipermail/python-ideas/2014-July/028250.html)

.. [#namedtuple] "namedtuple is not as good as it should be"
   (https://mail.python.org/pipermail/python-ideas/2013-June/021257.html)

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: