PEP: 357
Title: Allowing Any Object to be Used for Slicing
Version: $Revision$
Last Modified: $Date$
Author: Travis Oliphant
Status: Final
Type: Standards Track
Created: 09-Feb-2006
Python-Version: 2.5

Abstract

    This PEP proposes adding an nb_index slot in PyNumberMethods and an
    __index__ special method so that arbitrary objects can be used
    whenever integers are explicitly needed in Python, such as in slice
    syntax (from which the slot gets its name).

Rationale

    Currently integers and long integers play a special role in slicing
    in that they are the only objects allowed in slice syntax. In other
    words, if X is an object implementing the sequence protocol, then
    X[obj1:obj2] is only valid if obj1 and obj2 are both integers or
    long integers. There is no way for obj1 and obj2 to tell Python
    that they could be reasonably used as indexes into a sequence.
    This is an unnecessary limitation.

    In NumPy, for example, there are 8 different integer scalars
    corresponding to unsigned and signed integers of 8, 16, 32, and 64
    bits. These type-objects could reasonably be used as integers in
    many places where Python expects true integers, but they cannot
    inherit from the Python integer type because of incompatible memory
    layouts. There should be some way to tell Python that an object
    can behave like an integer.

    It is not possible to use the nb_int slot (and __int__ special
    method) for this purpose because that method is used to *coerce*
    objects to integers. It would be inappropriate to allow every
    object that can be coerced to an integer to be used as an integer
    everywhere Python expects a true integer. For example, if __int__
    were used to convert an object to an integer in slicing, then float
    objects would be allowed in slicing and x[3.2:5.8] would not raise
    an error as it should.

Proposal

    Add an nb_index slot to PyNumberMethods, and a corresponding
    __index__ special method.
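    As an illustration of the behavior proposed here, the following is
    a minimal sketch and not part of the reference implementation. It
    assumes a current Python 3 interpreter, where int and long are
    unified and __index__ is consulted for slicing. The class name
    SmallInt is hypothetical and stands in for an integer-like scalar
    that cannot inherit from int.

```python
class SmallInt:
    """Hypothetical integer-like scalar that does not inherit from int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Must return a true Python integer, not another index-like object.
        return self._value


data = list(range(10))

# Slice bounds are converted through __index__.
sliced = data[SmallInt(2):SmallInt(5)]

# Plain item access accepts the same objects.
item = data[SmallInt(3)]

# Floats define __int__ but not __index__, so they are still rejected.
try:
    data[3.2:5.8]
    float_slice_rejected = False
except TypeError:
    float_slice_rejected = True
```

    Here the float slice is refused because float provides only the
    coercing __int__ method, which is the distinction the Rationale
    draws.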
    Objects could define a function to place in the nb_index slot that
    returns an appropriate C-integer (Py_ssize_t after PEP 353). This
    C-integer will be used whenever Python needs one, such as in
    PySequence_GetSlice, PySequence_SetSlice, and PySequence_DelSlice.

Specification:

    1) The nb_index slot will have the signature

       Py_ssize_t index_func (PyObject *self)

    2) The __index__ special method will have the signature

       def __index__(self):
           return obj

       where obj must be either an int or a long.

    3) A new C-API function PyNumber_Index will be added with signature

       Py_ssize_t PyNumber_Index (PyObject *obj)

       which will return obj->ob_type->tp_as_number->nb_index(obj) if
       it is available. On error, -1 will be returned and an exception
       set.

    4) A new operator.index(obj) function will be added that calls the
       equivalent of obj.__index__() and raises an error if obj does
       not implement the special method.

Implementation Plan

    1) Add the nb_index slot in object.h and modify typeobject.c to
       create the __index__ method.

    2) Change the ISINT macro in ceval.c to ISINDEX and alter it to
       accommodate objects with the index slot defined.

    3) Change the _PyEval_SliceIndex function to accommodate objects
       with the index slot defined.

    4) Change all builtin objects (e.g. lists) that use the as_mapping
       slots for subscript access and use a special check for integers
       to check for the slot as well.

    5) Add the nb_index slot to integers and long_integers.

    6) Add the PyNumber_Index C-API to return an integer from any
       Python object that has the nb_index slot.

    7) Add the operator.index(x) function.

Discussion Questions

    Speed:

       The implementation should not slow down Python because integers
       and long integers used as indexes will complete in the same
       number of instructions. The only change will be that what used
       to generate an error will now be acceptable.

    Why not use nb_int which is already there?

       The nb_int method is used for coercion and so means something
       fundamentally different than what is requested here.
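       The difference between coercion and the conversion requested
       here can be sketched as follows. This is an illustration only,
       assuming a current Python 3 interpreter in which operator.index
       is available as specified above. The class name Ordinal is
       hypothetical.

```python
import operator


class Ordinal:
    """Hypothetical type that can already be thought of as an integer."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


# operator.index accepts objects that implement __index__.
as_index = operator.index(Ordinal(7))

# int() coerces: a float is silently truncated.
coerced = int(3.9)

# operator.index does not coerce: a float is refused outright.
try:
    operator.index(3.9)
    float_accepted = True
except TypeError:
    float_accepted = False
```

       The float is usable with int() but not with operator.index,
       which is why nb_int cannot serve as the indexing hook.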
       This PEP proposes a method for something that *can* already be
       thought of as an integer to communicate that information to
       Python when it needs an integer. The biggest example of why
       using nb_int would be a bad thing is that float objects already
       define the nb_int method, but float objects *should not* be used
       as indexes in a sequence.

    Why the name __index__?

       Some questions were raised regarding the name __index__ when
       other interpretations of the slot are possible. For example,
       the slot can be used any time Python requires an integer
       internally (such as in "mystring" * 3). The name was suggested
       by Guido because slicing syntax is the biggest reason for having
       such a slot and in the end no better name emerged. See the
       discussion thread:
       http://mail.python.org/pipermail/python-dev/2006-February/thread.html#60594
       for examples of names that were suggested, such as
       "__discrete__" and "__ordinal__".

    Why return Py_ssize_t from nb_index?

       The nb_index slot is primarily intended to return an integer
       needed by the sequence interface. In Python 2.5 this is
       Py_ssize_t. As this is the primary purpose of the slot, it
       makes sense to return the C-integer directly and not wrapped in
       a Python int object.

    Why can't __index__ return any object with the nb_index method?

       This would allow infinite recursion in many different ways that
       are not easy to check for. This restriction is similar to the
       requirement that __nonzero__ return an int or a bool.

Reference Implementation

    Submitted as patch 1436368 to SourceForge.

Copyright

    This document is placed in the public domain.