294 lines
10 KiB
ReStructuredText
294 lines
10 KiB
ReStructuredText
PEP: 204
|
|
Title: Range Literals
|
|
Author: Thomas Wouters <thomas@python.org>
|
|
Status: Rejected
|
|
Type: Standards Track
|
|
Created: 14-Jul-2000
|
|
Python-Version: 2.0
|
|
Post-History:
|
|
|
|
.. rejected::
|
|
|
|
After careful consideration, and a period of meditation, this
|
|
proposal has been rejected. The open issues, as well as some
|
|
confusion between ranges and slice syntax, raised enough questions
|
|
for Guido not to accept it for Python 2.0, and later to reject the
|
|
proposal altogether. The new syntax and its intentions were deemed
|
|
not obvious enough.
|
|
|
|
[ TBD: Guido, amend/confirm this, please. Preferably both; this
|
|
is a PEP, it should contain *all* the reasons for rejection
|
|
and/or reconsideration, for future reference. ]
|
|
|
|
Introduction
|
|
============
|
|
|
|
This PEP describes the "range literal" proposal for Python 2.0.
|
|
This PEP tracks the status and ownership of this feature, slated
|
|
for introduction in Python 2.0. It contains a description of the
|
|
feature and outlines changes necessary to support the feature.
|
|
This PEP summarizes discussions held in mailing list forums, and
|
|
provides URLs for further information, where appropriate. The CVS
|
|
revision history of this file contains the definitive historical
|
|
record.
|
|
|
|
|
|
List ranges
|
|
===========
|
|
|
|
Ranges are sequences of numbers of a fixed stepping, often used in
|
|
for-loops. The Python for-loop is designed to iterate over a
|
|
sequence directly::
|
|
|
|
>>> l = ['a', 'b', 'c', 'd']
|
|
>>> for item in l:
|
|
... print item
|
|
a
|
|
b
|
|
c
|
|
d
|
|
|
|
However, this solution is not always prudent. Firstly, problems
|
|
arise when altering the sequence in the body of the for-loop,
|
|
resulting in the for-loop skipping items. Secondly, it is not
|
|
possible to iterate over, say, every second element of the
|
|
sequence. And thirdly, it is sometimes necessary to process an
|
|
element based on its index, which is not readily available in the
|
|
above construct.
|
|
|
|
For these instances, and others where a range of numbers is
|
|
desired, Python provides the ``range`` builtin function, which
|
|
creates a list of numbers. The ``range`` function takes three
|
|
arguments, *start*, *end* and *step*. *start* and *step* are
|
|
optional, and default to 0 and 1, respectively.
|
|
|
|
The ``range`` function creates a list of numbers, starting at
|
|
*start*, with a step of *step*, up to, but not including *end*, so
|
|
that ``range(10)`` produces a list that has exactly 10 items, the
|
|
numbers 0 through 9.
|
|
|
|
Using the ``range`` function, the above example would look like
|
|
this::
|
|
|
|
>>> for i in range(len(l)):
|
|
... print l[i]
|
|
a
|
|
b
|
|
c
|
|
d
|
|
|
|
Or, to start at the second element of ``l`` and processing only
|
|
every second element from then on::
|
|
|
|
>>> for i in range(1, len(l), 2):
|
|
... print l[i]
|
|
b
|
|
d
|
|
|
|
There are several disadvantages with this approach:
|
|
|
|
- Clarity of purpose: Adding another function call, possibly with
|
|
extra arithmetic to determine the desired length and step of the
|
|
list, does not improve readability of the code. Also, it is
|
|
possible to "shadow" the builtin ``range`` function by supplying a
|
|
local or global variable with the same name, effectively
|
|
replacing it. This may or may not be a desired effect.
|
|
|
|
- Efficiency: because the ``range`` function can be overridden, the
|
|
Python compiler cannot make assumptions about the for-loop, and
|
|
has to maintain a separate loop counter.
|
|
|
|
- Consistency: There already is a syntax that is used to denote
|
|
ranges, as shown below. This syntax uses the exact same
|
|
arguments, though all optional, in the exact same way. It seems
|
|
logical to extend this syntax to ranges, to form "range
|
|
literals".
|
|
|
|
|
|
Slice Indices
|
|
=============
|
|
|
|
In Python, a sequence can be indexed in one of two ways:
|
|
retrieving a single item, or retrieving a range of items.
|
|
Retrieving a range of items results in a new object of the same
|
|
type as the original sequence, containing zero or more items from
|
|
the original sequence. This is done using a "range notation"::
|
|
|
|
>>> l[2:4]
|
|
['c', 'd']
|
|
|
|
This range notation consists of zero, one or two indices separated
|
|
by a colon. The first index is the *start* index, the second the
|
|
*end*. When either is left out, they default to respectively the
|
|
start and the end of the sequence.
|
|
|
|
There is also an extended range notation, which incorporates
|
|
*step* as well. Though this notation is not currently supported
|
|
by most builtin types, if it were, it would work as follows::
|
|
|
|
>>> l[1:4:2]
|
|
['b', 'd']
|
|
|
|
The third "argument" to the slice syntax is exactly the same as
|
|
the *step* argument to ``range()``. The underlying mechanisms of the
|
|
standard, and these extended slices, are sufficiently different
|
|
and inconsistent that many classes and extensions outside of
|
|
mathematical packages do not implement support for the extended
|
|
variant. While this should be resolved, it is beyond the scope of
|
|
this PEP.
|
|
|
|
Extended slices do show, however, that there is already a
|
|
perfectly valid and applicable syntax to denote ranges in a way
|
|
that solve all of the earlier stated disadvantages of the use of
|
|
the ``range()`` function:
|
|
|
|
- It is clearer, more concise syntax, which has already proven to
|
|
be both intuitive and easy to learn.
|
|
|
|
- It is consistent with the other use of ranges in Python
|
|
(e.g. slices).
|
|
|
|
- Because it is built-in syntax, instead of a builtin function, it
|
|
cannot be overridden. This means both that a viewer can be
|
|
certain about what the code does, and that an optimizer will not
|
|
have to worry about ``range()`` being "shadowed".
|
|
|
|
|
|
The Proposed Solution
|
|
=====================
|
|
|
|
The proposed implementation of range-literals combines the syntax
|
|
for list literals with the syntax for (extended) slices, to form
|
|
range literals::
|
|
|
|
>>> [1:10]
|
|
[1, 2, 3, 4, 5, 6, 7, 8, 9]
|
|
>>> [:5]
|
|
[0, 1, 2, 3, 4]
|
|
>>> [5:1:-1]
|
|
[5, 4, 3, 2]
|
|
|
|
There is one minor difference between range literals and the slice
|
|
syntax: though it is possible to omit all of *start*, *end* and
|
|
*step* in slices, it does not make sense to omit *end* in range
|
|
literals. In slices, *end* would default to the end of the list,
|
|
but this has no meaning in range literals.
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
The proposed implementation can be found on SourceForge [1]_. It
|
|
adds a new bytecode, ``BUILD_RANGE``, that takes three arguments from
|
|
the stack and builds a list on the bases of those. The list is
|
|
pushed back on the stack.
|
|
|
|
The use of a new bytecode is necessary to be able to build ranges
|
|
based on other calculations, whose outcome is not known at compile
|
|
time.
|
|
|
|
The code introduces two new functions to ``listobject.c``, which are
|
|
currently hovering between private functions and full-fledged API
|
|
calls.
|
|
|
|
``PyList_FromRange()`` builds a list from start, end and step,
|
|
returning NULL if an error occurs. Its prototype is::
|
|
|
|
PyObject * PyList_FromRange(long start, long end, long step)
|
|
|
|
``PyList_GetLenOfRange()`` is a helper function used to determine the
|
|
length of a range. Previously, it was a static function in
|
|
``bltinmodule.c``, but is now necessary in both ``listobject.c`` and
|
|
``bltinmodule.c`` (for ``xrange``). It is made non-static solely to avoid
|
|
code duplication. Its prototype is::
|
|
|
|
long PyList_GetLenOfRange(long start, long end, long step)
|
|
|
|
|
|
Open issues
|
|
===========
|
|
|
|
- One possible solution to the discrepancy of requiring the *end*
|
|
argument in range literals is to allow the range syntax to
|
|
create a "generator", rather than a list, such as the ``xrange``
|
|
builtin function does. However, a generator would not be a
|
|
list, and it would be impossible, for instance, to assign to
|
|
items in the generator, or append to it.
|
|
|
|
The range syntax could conceivably be extended to include tuples
|
|
(i.e. immutable lists), which could then be safely implemented
|
|
as generators. This may be a desirable solution, especially for
|
|
large number arrays: generators require very little in the way
|
|
of storage and initialization, and there is only a small
|
|
performance impact in calculating and creating the appropriate
|
|
number on request. (TBD: is there any at all? Cursory testing
|
|
suggests equal performance even in the case of ranges of length
|
|
1)
|
|
|
|
However, even if idea was adopted, would it be wise to "special
|
|
case" the second argument, making it optional in one instance of
|
|
the syntax, and non-optional in other cases ?
|
|
|
|
- Should it be possible to mix range syntax with normal list
|
|
literals, creating a single list? E.g.::
|
|
|
|
>>> [5, 6, 1:6, 7, 9]
|
|
|
|
to create::
|
|
|
|
[5, 6, 1, 2, 3, 4, 5, 7, 9]
|
|
|
|
- How should range literals interact with another proposed new
|
|
feature, :pep:`"list comprehensions" <202>`? Specifically, should it be
|
|
possible to create lists in list comprehensions? E.g.::
|
|
|
|
>>> [x:y for x in (1, 2) y in (3, 4)]
|
|
|
|
Should this example return a single list with multiple ranges::
|
|
|
|
[1, 2, 1, 2, 3, 2, 2, 3]
|
|
|
|
Or a list of lists, like so::
|
|
|
|
[[1, 2], [1, 2, 3], [2], [2, 3]]
|
|
|
|
However, as the syntax and semantics of list comprehensions are
|
|
still subject of hot debate, these issues are probably best
|
|
addressed by the "list comprehensions" PEP.
|
|
|
|
- Range literals accept objects other than integers: it performs
|
|
``PyInt_AsLong()`` on the objects passed in, so as long as the
|
|
objects can be coerced into integers, they will be accepted.
|
|
The resulting list, however, is always composed of standard
|
|
integers.
|
|
|
|
Should range literals create a list of the passed-in type? It
|
|
might be desirable in the cases of other builtin types, such as
|
|
longs and strings::
|
|
|
|
>>> [ 1L : 2L<<64 : 2<<32L ]
|
|
>>> ["a":"z":"b"]
|
|
>>> ["a":"z":2]
|
|
|
|
However, this might be too much "magic" to be obvious. It might
|
|
also present problems with user-defined classes: even if the
|
|
base class can be found and a new instance created, the instance
|
|
may require additional arguments to ``__init__``, causing the
|
|
creation to fail.
|
|
|
|
- The ``PyList_FromRange()`` and ``PyList_GetLenOfRange()`` functions need
|
|
to be classified: are they part of the API, or should they be
|
|
made private functions?
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document has been placed in the Public Domain.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
.. [1] http://sourceforge.net/patch/?func=detailpatch&patch_id=100902&group_id=5470
|