python-peps/pep-0322.txt

PEP: 322
Title: Reverse Iteration Methods
Version: $Revision$
Last-Modified: $Date$
Author: Raymond Hettinger <python@rcn.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Sep-2003
Python-Version: 2.4
Post-History: 24-Sep-2003


Abstract
========

This proposal is to extend the API of several sequence types
to include methods for iterating over the sequence in reverse.


Motivation
==========

For indexable objects, current approaches for reverse iteration are
error prone, unnatural, and not especially readable::

    for i in xrange(n-1, -1, -1):
        print seqn[i]

One other current approach involves reversing a list before iterating
over it.  That technique wastes computer cycles, memory, and lines of
code.  Also, it only works with lists (strings, for example, do not
define a reverse method)::

    rseqn = list(seqn)
    rseqn.reverse()
    for value in rseqn:
        print value

Extending slicing minimizes the code overhead but does nothing for
memory efficiency, beauty, or clarity.

Reverse iteration is much less common than forward iteration, but it
does arise regularly in practice.  See `Real World Use Cases`_ below.


Proposal
========

Add a method called *iter_backwards()* to sequence objects that can
benefit from it.  The above examples then simplify to::

    for i in xrange(n).iter_backwards():
        print seqn[i]

::

    for elem in seqn.iter_backwards():
        print elem

The new protocol would be applied to lists, strings, xrange objects,
and possibly other sequence objects as well (depending on use cases
and implementation issues).  It would not apply to unordered
collections like dicts and sets.

No language syntax changes are needed.


Alternative Method Names
========================

* *iterbackwards* -- like iteritems() but somewhat long
* *backwards* -- more pithy, less explicit
* *ireverse* -- reminiscent of imap(), izip(), and ifilter()


Open Issues
===========

* Should *tuple* objects be included?  In the past, they have been
  denied some list like behaviors such as count() and index().

* Should *file* objects be included?  Implementing reverse iteration
  may not be easy though it would be useful on occasion.

* Should *enumerate* objects be included?  They can provide reverse
  iteration only when the underlying sequences support *__len__*
  and reverse iteration.


Real World Use Cases
====================

Here are some instances of reverse iteration taken from the standard
library and comments on why reverse iteration was necessary:

* atexit.exit_handlers() uses::

    while _exithandlers:
        func, targs, kargs = _exithandlers.pop()
            . . .

  The application dictates the need to run exit handlers in the
  reverse order they were built.  The ``while alist: alist.pop()``
  form is readable and clean; however, it would be slightly faster
  and clearer with::

    for func, target, kargs in _exithandlers.iter_backwards():
        . . .
    del _exithandlers

* difflib.get_close_matches() uses::

    result.sort()           # Retain only the best n.
    result = result[-n:]    # Move best-scorer to head of list.
    result.reverse()        # Strip scores.
    return [x for score, x in result]

  The need for reverse iteration arises from a requirement to return
  a portion of a sort in an order opposite of the sort criterion.  The
  list comprehension is incidental (the third step of a Schwartzian
  transform).  This particular use case can met with extended slicing,
  but the code is somewhat unattractive, hard to visually verify,
  and difficult for beginners to construct::

      result.sort()
      return [x for score, x in result[:-n-1:-1]]

  The proposed form is much easier to construct and verify::

      result.sort()
      return [x for score, x in result[-n:].iter_backwards()]

* heapq.heapify() uses ``for i in xrange(n//2 - 1, -1, -1)`` because
  higher-level orderings are more easily formed from pairs of
  lower-level orderings.  A forward version of this algorithm is
  possible; however, that would complicate the rest of the heap code
  which iterates over the underlying list in the opposite direction.

* mhlib.test() uses::

    testfolders.reverse();
    for t in testfolders:
        do('mh.deletefolder(%s)' % `t`)

  The need for reverse iteration arises because the tail of the
  underlying list is altered during iteration.

* platform._dist_try_harder() uses
  ``for n in range(len(verfiles)-1,-1,-1)`` because the loop deletes
  selected elements from *verfiles* but needs to leave the rest of
  the list intact for further iteration.  This use case could be
  addressed with *itertools.ifilter()* but would require the
  selection predicate to be in a *lambda* expression.  The net
  result is less clear and readable than the original.  A better
  reformulation is to replace the first line with the proposed
  method.

* random.shuffle() uses ``for i in xrange(len(x)-1, 0, -1)`` because
  the algorithm is most easily understood as randomly selecting
  elements from an ever diminishing pool.  In fact, the algorithm can
  be run in a forward direction but is less intuitive and rarely
  presented that way in literature.

* rfc822.Message.__delitem__() uses::

    list.reverse()
    for i in list:
        del self.headers[i]

  The need for reverse iteration arises because the tail of the
  underlying list is altered during iteration.


Rejected Alternative Ideas
==========================

* Add a builtin function, *reverse()* which calls a magic method,
  __riter__.  I see this as more overhead for no additional benefit.

* Add a builtin function, *reverse()* which does the above, and
  if *__riter__* is not found, constructs its own using
  *__getitem__*, and if *__getitem__* is not found, builds a list
  from *__iter__* and returns a reverse iterator over the new list.
  The advantage is that one function takes care of almost everything
  that is potentially reversible.  A disadvantage is that it can
  invisibility slip in to a low performance mode (in terms of time
  and memory) which would be more visible with an explicit
  ``list(obj).reverse()``.  Another problem is that *__getitem__*
  is also used in mappings as well as sequences and that could lead
  to bizarre results.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
Add a new PEP for reverse iteration methods 2003-09-24 06:30:08 -04:00			`PEP: 322`
			`Title: Reverse Iteration Methods`
			`Version: $Revision$`
			`Last-Modified: $Date$`
			`Author: Raymond Hettinger <python@rcn.com>`
			`Status: Draft`
			`Type: Standards Track`
			`Content-Type: text/x-rst`
			`Created: 24-Sep-2003`
			`Python-Version: 2.4`
			`Post-History: 24-Sep-2003`


			`Abstract`
			`========`

			`This proposal is to extend the API of several sequence types`
			`to include methods for iterating over the sequence in reverse.`


			`Motivation`
			`==========`

			`For indexable objects, current approaches for reverse iteration are`
			`error prone, unnatural, and not especially readable::`

			`for i in xrange(n-1, -1, -1):`
			`print seqn[i]`

			`One other current approach involves reversing a list before iterating`
			`over it. That technique wastes computer cycles, memory, and lines of`
			`code. Also, it only works with lists (strings, for example, do not`
			`define a reverse method)::`

			`rseqn = list(seqn)`
			`rseqn.reverse()`
			`for value in rseqn:`
			`print value`

			`Extending slicing minimizes the code overhead but does nothing for`
			`memory efficiency, beauty, or clarity.`

			`Reverse iteration is much less common than forward iteration, but it`
			does arise regularly in practice. See `Real World Use Cases`_ below.


			`Proposal`
			`========`

			`Add a method called iter_backwards() to sequence objects that can`
			`benefit from it. The above examples then simplify to::`

			`for i in xrange(n).iter_backwards():`
			`print seqn[i]`

			`::`

			`for elem in seqn.iter_backwards():`
			`print elem`

			`The new protocol would be applied to lists, strings, xrange objects,`
			`and possibly other sequence objects as well (depending on use cases`
			`and implementation issues). It would not apply to unordered`
			`collections like dicts and sets.`

			`No language syntax changes are needed.`


			`Alternative Method Names`
			`========================`

			`* iterbackwards -- like iteritems() but somewhat long`
			`* backwards -- more pithy, less explicit`
			`* ireverse -- reminiscent of imap(), izip(), and ifilter()`


			`Open Issues`
			`===========`

			`* Should tuple objects be included? In the past, they have been`
			`denied some list like behaviors such as count() and index().`

			`* Should file objects be included? Implementing reverse iteration`
			`may not be easy though it would be useful on occasion.`

			`* Should enumerate objects be included? They can provide reverse`
			`iteration only when the underlying sequences support __len__`
			`and reverse iteration.`


			`Real World Use Cases`
			`====================`

			`Here are some instances of reverse iteration taken from the standard`
			`library and comments on why reverse iteration was necessary:`

			`* atexit.exit_handlers() uses::`

			`while _exithandlers:`
			`func, targs, kargs = _exithandlers.pop()`
			`. . .`

			`The application dictates the need to run exit handlers in the`
			reverse order they were built. The ``while alist: alist.pop()``
			`form is readable and clean; however, it would be slightly faster`
			`and clearer with::`

			`for func, target, kargs in _exithandlers.iter_backwards():`
			`. . .`
			`del _exithandlers`

			`* difflib.get_close_matches() uses::`

			`result.sort() # Retain only the best n.`
			`result = result[-n:] # Move best-scorer to head of list.`
			`result.reverse() # Strip scores.`
			`return [x for score, x in result]`

			`The need for reverse iteration arises from a requirement to return`
			`a portion of a sort in an order opposite of the sort criterion. The`
			`list comprehension is incidental (the third step of a Schwartzian`
			`transform). This particular use case can met with extended slicing,`
			`but the code is somewhat unattractive, hard to visually verify,`
			`and difficult for beginners to construct::`

			`result.sort()`
			`return [x for score, x in result[:-n-1:-1]]`

			`The proposed form is much easier to construct and verify::`

			`result.sort()`
			`return [x for score, x in result[-n:].iter_backwards()]`

			* heapq.heapify() uses ``for i in xrange(n//2 - 1, -1, -1)`` because
			`higher-level orderings are more easily formed from pairs of`
			`lower-level orderings. A forward version of this algorithm is`
			`possible; however, that would complicate the rest of the heap code`
			`which iterates over the underlying list in the opposite direction.`

			`* mhlib.test() uses::`

			`testfolders.reverse();`
			`for t in testfolders:`
			do('mh.deletefolder(%s)' % `t`)

			`The need for reverse iteration arises because the tail of the`
			`underlying list is altered during iteration.`

			`* platform._dist_try_harder() uses`
			``for n in range(len(verfiles)-1,-1,-1)`` because the loop deletes
			`selected elements from verfiles but needs to leave the rest of`
			`the list intact for further iteration. This use case could be`
			`addressed with itertools.ifilter() but would require the`
			`selection predicate to be in a lambda expression. The net`
			`result is less clear and readable than the original. A better`
			`reformulation is to replace the first line with the proposed`
			`method.`

			* random.shuffle() uses ``for i in xrange(len(x)-1, 0, -1)`` because
			`the algorithm is most easily understood as randomly selecting`
			`elements from an ever diminishing pool. In fact, the algorithm can`
			`be run in a forward direction but is less intuitive and rarely`
			`presented that way in literature.`

			`* rfc822.Message.__delitem__() uses::`

			`list.reverse()`
			`for i in list:`
			`del self.headers[i]`

			`The need for reverse iteration arises because the tail of the`
			`underlying list is altered during iteration.`


			`Rejected Alternative Ideas`
			`==========================`

			`* Add a builtin function, reverse() which calls a magic method,`
			`__riter__. I see this as more overhead for no additional benefit.`

			`* Add a builtin function, reverse() which does the above, and`
			`if __riter__ is not found, constructs its own using`
			`__getitem__, and if __getitem__ is not found, builds a list`
			`from __iter__ and returns a reverse iterator over the new list.`
			`The advantage is that one function takes care of almost everything`
			`that is potentially reversible. A disadvantage is that it can`
			`invisibility slip in to a low performance mode (in terms of time`
			`and memory) which would be more visible with an explicit`
			``list(obj).reverse()``. Another problem is that __getitem__
			`is also used in mappings as well as sequences and that could lead`
			`to bizarre results.`


			`Copyright`
			`=========`

			`This document has been placed in the public domain.`



			`..`
			`Local Variables:`
			`mode: indented-text`
			`indent-tabs-mode: nil`
			`sentence-end-double-space: t`
			`fill-column: 70`
			`End:`