PEP: 322
Title: Reverse Iteration Methods
Version: $Revision$
Last-Modified: $Date$
Author: Raymond Hettinger <python@rcn.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Sep-2003
Python-Version: 2.4
Post-History: 24-Sep-2003
Abstract
========
This proposal is to extend the API of several sequence types
to include a method for iterating over the sequence in reverse.
Motivation
==========
For indexable objects, current approaches for reverse iteration are
error prone, unnatural, and not especially readable::

    for i in xrange(n-1, -1, -1):
        print seqn[i]

One other current approach involves reversing a list before iterating
over it. That technique wastes computer cycles, memory, and lines of
code::

    rseqn = list(seqn)
    rseqn.reverse()
    for value in rseqn:
        print value

Extended slicing is a third approach that minimizes the code overhead
but does nothing for memory efficiency, beauty, or clarity.
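
To make the memory point concrete, here is a minimal sketch contrasting the slice with an index-driven generator. It is written for a current Python 3 interpreter rather than the 2.4 release this PEP targets, and the helper name ``reverse_by_index`` is hypothetical, introduced only for illustration.

```python
def reverse_by_index(seqn):
    """Yield the items of an indexable sequence from last to first without copying it."""
    for i in range(len(seqn) - 1, -1, -1):
        yield seqn[i]


seqn = ['a', 'b', 'c', 'd']
sliced = seqn[::-1]            # builds a second list of the same length
lazy = reverse_by_index(seqn)  # builds only a small generator object
```

Both forms visit the same values in the same order; only the slice pays for a full copy up front.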
Reverse iteration is much less common than forward iteration, but it
does arise regularly in practice. See `Real World Use Cases`_ below.
Proposal
========
Add a method called *iterreverse()* to sequence objects that can
benefit from it. The above examples then simplify to::

    for i in xrange(n).iterreverse():
        print seqn[i]

::

    for elem in seqn.iterreverse():
        print elem

The new protocol would be applied to lists, tuples, strings, and
xrange objects. It would not apply to unordered collections like
dicts and sets.
No language syntax changes are needed.
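
As a rough illustration of the behavior being proposed, the sketch below adds an ``iterreverse()`` method to a list subclass. The class name ``ReversibleList`` is hypothetical, the code is written for Python 3 so that it runs on a current interpreter, and it is only one possible pure-Python rendering, not the implementation this PEP would ship.

```python
class ReversibleList(list):
    """Hypothetical list subclass illustrating the proposed method."""

    def iterreverse(self):
        # Walk the indices from the last position down to zero,
        # yielding each element without building a reversed copy.
        i = len(self) - 1
        while i >= 0:
            yield self[i]
            i -= 1


seqn = ReversibleList([10, 20, 30])
backwards = [elem for elem in seqn.iterreverse()]
```

The underlying list is left untouched, which is the main difference from calling ``reverse()`` on a copy.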
Alternative Method Names
========================
* *iterbackwards* -- like iteritems() but somewhat long
* *backwards* -- more pithy, less explicit
* *ireverse* -- reminiscent of imap(), izip(), and ifilter()
Other Issues
============
* Should *tuple* objects be included? In the past, they have been
  denied some list-like behaviors such as count() and index(). I
  prefer that they be included.

* Should *file* objects be included? Implementing reverse iteration
  may not be easy though it would be useful on occasion. I think
  this one should be skipped.

* Should *enumerate* objects be included? They can provide reverse
  iteration only when the underlying sequences support *__len__*
  and reverse iteration. I think this can be saved for another
  day if the need arises.
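
The constraint on *enumerate* can be sketched as follows. The function name ``enumerate_reversed`` is hypothetical and the code targets Python 3; it only illustrates why a length is needed before the first pair can be produced.

```python
def enumerate_reversed(seqn):
    """Yield (index, item) pairs from the last position to the first.

    The first index produced is len(seqn) - 1, so the argument must
    support len() and indexing; a bare iterator cannot be used.
    """
    for i in range(len(seqn) - 1, -1, -1):
        yield i, seqn[i]


pairs = list(enumerate_reversed('abc'))
```

Passing an arbitrary iterator fails at the ``len()`` call, which is the limitation the last bullet above describes.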
Real World Use Cases
====================
Here are some instances of reverse iteration taken from the standard
library and comments on why reverse iteration was necessary:
* atexit.exit_handlers() uses::

      while _exithandlers:
          func, targs, kargs = _exithandlers.pop()
          . . .

  The application dictates the need to run exit handlers in the
  reverse order they were built. The ``while alist: alist.pop()``
  form is readable and clean; however, it would be slightly faster
  and clearer with::

      for func, targs, kargs in _exithandlers.iterreverse():
          . . .
      del _exithandlers

  Note, if the order of deletion is important, then the first form
  is still needed.

* difflib.get_close_matches() uses::

      result.sort()
      # Retain only the best n.
      result = result[-n:]
      # Move best-scorer to head of list.
      result.reverse()
      # Strip scores.
      return [x for score, x in result]

  The need for reverse iteration arises from a requirement to return
  a portion of a sort in an order opposite of the sort criterion. The
  list comprehension is incidental (the third step of a Schwartzian
  transform). This particular use case can be met with extended
  slicing, but the code is somewhat unattractive, hard to visually
  verify, and difficult for beginners to construct::

      result.sort()
      return [x for score, x in result[:-n-1:-1]]

  The proposed form is much easier to construct and verify::

      result.sort()
      return [x for score, x in result[-n:].iterreverse()]

* heapq.heapify() uses ``for i in xrange(n//2 - 1, -1, -1)`` because
  higher-level orderings are more easily formed from pairs of
  lower-level orderings. A forward version of this algorithm is
  possible; however, that would complicate the rest of the heap code
  which iterates over the underlying list in the opposite direction.

* mhlib.test() uses::

      testfolders.reverse();
      for t in testfolders:
          do('mh.deletefolder(%s)' % `t`)

  The need for reverse iteration arises because the tail of the
  underlying list is altered during iteration.

* platform._dist_try_harder() uses
  ``for n in range(len(verfiles)-1,-1,-1)`` because the loop deletes
  selected elements from *verfiles* but needs to leave the rest of
  the list intact for further iteration.

* random.shuffle() uses ``for i in xrange(len(x)-1, 0, -1)`` because
  the algorithm is most easily understood as randomly selecting
  elements from an ever diminishing pool. In fact, the algorithm can
  be run in a forward direction but is less intuitive and rarely
  presented that way in literature. The replacement code
  ``for i in xrange(1, len(x)).iterreverse()`` is much easier
  to mentally verify.

* rfc822.Message.__delitem__() uses::

      list.reverse()
      for i in list:
          del self.headers[i]

  The need for reverse iteration arises because the tail of the
  underlying list is altered during iteration.
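
Several of the cases above come down to deleting from a list while walking it. The following sketch shows why the backward direction is the safe one. It is written for Python 3; the function name ``delete_matching`` and the sample data are hypothetical, with only the variable name ``verfiles`` borrowed from the platform example.

```python
def delete_matching(items, predicate):
    """Delete, in place, every element for which predicate(element) is true."""
    # Going from the highest index down means a deletion only shifts
    # elements that have already been visited, so no index is skipped.
    for i in range(len(items) - 1, -1, -1):
        if predicate(items[i]):
            del items[i]


verfiles = ['a.txt', 'skip.tmp', 'b.txt', 'also.tmp', 'last.tmp']
delete_matching(verfiles, lambda name: name.endswith('.tmp'))
```

A forward loop over the same indices would either skip the element that slides into a deleted slot or run past the shortened end of the list.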
Alternative Ideas
=================
* Add a builtin function, *riter()* which calls a magic method,
  *__riter__*. I see this as more overhead for no additional benefit.

* Several variants were submitted that provided fallback behavior
  when *__riter__* is not defined:

  - fallback to: ``for i in xrange(len(obj)-1,-1,-1): yield obj[i]``
  - fallback to: ``for i in itertools.count(): yield obj[-i]``
  - fallback to: ``tmp=list(obj); tmp.reverse(); return iter(tmp)``

  All of these attempt to save implementing some object methods at the
  expense of adding a new builtin function and of creating a new magic
  method name.

  The approaches using *__getitem__()* are slower than using a custom
  method for each object. Also, the *__getitem__()* variants produce
  bizarre results when applied to mappings.

  All of the variants crash when applied to an infinite iterator.

  The last variant can invisibly slip into a low performance mode
  (in terms of time and memory) which could be made more visible with
  an explicit ``ro=list(obj); ro.reverse()``.

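
The mapping problem can be seen with a small sketch of the first and last fallback variants. The function names ``riter_by_index`` and ``riter_by_copy`` are hypothetical, the code is written for Python 3, and it is meant only to illustrate the objection, not to reproduce any submitted patch.

```python
def riter_by_index(obj):
    """First fallback variant: index from len(obj) - 1 down to 0."""
    for i in range(len(obj) - 1, -1, -1):
        yield obj[i]


def riter_by_copy(obj):
    """Last fallback variant: copy into a list, reverse the copy, iterate."""
    tmp = list(obj)
    tmp.reverse()
    return iter(tmp)


seq_result = list(riter_by_index('abc'))
copy_result = list(riter_by_copy('abc'))

# A mapping with small integer keys is accepted silently by the
# __getitem__ variant, which looks values up by key instead of
# walking the mapping's own order backwards.
mapping = {1: 'a', 0: 'b', 2: 'c'}
odd_result = list(riter_by_index(mapping))
```

On a real sequence the two variants agree. On the mapping, the index-based variant returns an ordering unrelated to how the mapping iterates, and with keys outside ``0..len-1`` it would raise ``KeyError`` instead.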
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End: