PEP: 284
Title: Integer for-loops
Version: $Revision$
Last-Modified: $Date$
Author: eppstein@ics.uci.edu (David Eppstein),
    greg@cosc.canterbury.ac.nz (Greg Ewing)
Status: Draft
Type: Standards Track
Created: 1-Mar-2002
Python-Version: 2.3
Post-History:


Abstract

    This PEP proposes to simplify iteration over intervals of
    integers, by extending the range of expressions allowed after a
    "for" keyword to allow three-way comparisons such as

        for lower <= var < upper:

    in place of the current

        for item in list:

    syntax.  The resulting loop or list comprehension will iterate
    over all values of var that make the comparison true, starting
    from the left endpoint of the given interval.


Rationale

    One of the most common uses of for-loops in Python is to iterate
    over an interval of integers.  Python provides functions range()
    and xrange() to generate lists and iterators for such intervals,
    which work best for the most frequent case: half-open intervals
    increasing from zero.  However, the range() syntax is more awkward
    for open or closed intervals, and lacks symmetry when reversing
    the order of iteration.  In addition, the call to an unfamiliar
    function makes it difficult for newcomers to Python to understand
    code that uses range() or xrange().
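
    As an illustration of the asymmetry described above, the following
    sketch spells out several kinds of interval between two bounds,
    and two of their reversals, using range().  It assumes present-day
    Python 3, where range() is lazy in the way xrange() was; the names
    lo and hi are chosen here purely for illustration.

```python
lo, hi = 2, 6

# Ascending order: each kind of interval needs a different adjustment.
half_open_right = list(range(lo, hi))          # lo <= i <  hi
closed = list(range(lo, hi + 1))               # lo <= i <= hi
open_interval = list(range(lo + 1, hi))        # lo <  i <  hi
half_open_left = list(range(lo + 1, hi + 1))   # lo <  i <= hi

# Descending order: the bounds swap places, the offsets change sign,
# and an explicit step of -1 is required.
closed_desc = list(range(hi, lo - 1, -1))         # hi >= i >= lo
half_open_desc = list(range(hi - 1, lo - 1, -1))  # hi >  i >= lo

print(closed)       # [2, 3, 4, 5, 6]
print(closed_desc)  # [6, 5, 4, 3, 2]
```

    Only the first form needs no "+ 1" or "- 1" correction, which is
    the sense in which range() favours half-open intervals.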

    The perceived lack of a natural, intuitive integer iteration
    syntax has led to heated debate on python-list, and spawned at
    least four PEPs before this one.  PEP 204 [1] (rejected) proposed
    to re-use Python's slice syntax for integer ranges, leading to a
    terser syntax but not solving the readability problem of
    multi-argument range().  PEP 212 [2] (deferred) proposed several
    syntaxes for directly converting a list to a sequence of integer
    indices, in place of the current idiom

        range(len(list))

    for such conversion, and PEP 281 [3] proposes to simplify the same
    idiom by allowing it to be written as

        range(list).

    PEP 276 [4] proposes to allow automatic conversion of integers to
    iterators, simplifying the most common half-open case but not
    addressing the complexities of other types of interval.
    Additional alternatives have been discussed on python-list.

    The solution described here is to allow a three-way comparison
    after a "for" keyword, both in the context of a for-loop and of a
    list comprehension:

        for lower <= var < upper:

    This would cause iteration over an interval of consecutive
    integers, beginning at the left bound in the comparison and ending
    at the right bound.  The exact comparison operations used would
    determine whether the interval is open or closed at either end and
    whether the integers are considered in ascending or descending
    order.

    This syntax closely matches standard mathematical notation, so it is
    likely to be more familiar to Python novices than the current
    range() syntax.  Open and closed interval endpoints are equally
    easy to express, and the reversal of an integer interval can be
    formed simply by swapping the two endpoints and reversing the
    comparisons.  In addition, the semantics of such a loop would
    closely resemble one way of interpreting the existing Python
    for-loops:

        for item in list

    iterates over exactly those values of item that cause the
    expression

        item in list

    to be true.  Similarly, the new format

        for lower <= var < upper:

    would iterate over exactly those integer values of var that cause
    the expression

        lower <= var < upper

    to be true.


Specification

    We propose to extend the syntax of a for statement, currently

        for_stmt: "for" target_list "in" expression_list ":" suite
                  ["else" ":" suite]

    as described below:

        for_stmt: "for" for_test ":" suite ["else" ":" suite]
        for_test: target_list "in" expression_list |
                  or_expr less_comp or_expr less_comp or_expr |
                  or_expr greater_comp or_expr greater_comp or_expr
        less_comp: "<" | "<="
        greater_comp: ">" | ">="

    Similarly, we propose to extend the syntax of list comprehensions,
    currently

        list_for: "for" expression_list "in" testlist [list_iter]

    by replacing it with:

        list_for: "for" for_test [list_iter]

    In all cases the expression formed by for_test would be subject to
    the same precedence rules as comparisons in expressions.  The two
    comparison operators in a for_test must be of the same kind (both
    less_comp or both greater_comp), unlike chained comparisons in
    expressions, which have no such restriction.

    We refer to the two or_expr's occurring on the left and right
    sides of the for-loop syntax as the bounds of the loop, and the
    middle or_expr as the variable of the loop.  When a for-loop using
    the new syntax is executed, the expressions for both bounds will
    be evaluated, and an iterator object created that iterates through
    all integers between the two bounds according to the comparison
    operations used.  The iterator will begin with an integer equal or
    near to the left bound, and then step through the remaining
    integers with a step size of +1 if the comparison operations are
    those described by less_comp, or -1 if they are those described by
    greater_comp.
    The execution will then proceed as if the expression had been

        for variable in iterator

    where "variable" refers to the variable of the loop and "iterator"
    refers to the iterator created for the given integer interval.

    The values taken by the loop variable in an integer for-loop may
    be either plain integers or long integers, according to the
    magnitude of the bounds.  Both bounds of an integer for-loop must
    evaluate to a real numeric type (integer, long, or float).  Any
    other value will cause the for-loop statement to raise a TypeError
    exception.
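
    The semantics above can be emulated in present-day Python 3 with a
    generator.  The following is a minimal sketch, not a reference
    implementation: the name integer_interval and the passing of the
    comparison operators as strings are assumptions made for this
    illustration, since the proposed syntax itself cannot be written
    in existing Python.  Infinite and NaN float bounds are not
    handled.

```python
import math
import operator

# Map each comparison operator to (comparison function, step).
_OPS = {
    "<": (operator.lt, +1),
    "<=": (operator.le, +1),
    ">": (operator.gt, -1),
    ">=": (operator.ge, -1),
}


def integer_interval(left, op1, op2, right):
    """Yield each integer var for which `left op1 var op2 right` holds.

    Hypothetical helper emulating `for left op1 var op2 right:`.
    """
    for bound in (left, right):
        # Excluding bool is a choice made here; the PEP is silent on it.
        if isinstance(bound, bool) or not isinstance(bound, (int, float)):
            raise TypeError("loop bounds must be int or float")
    cmp1, step1 = _OPS[op1]
    cmp2, step2 = _OPS[op2]
    if step1 != step2:
        # The proposed grammar rejects mixed operators at parse time;
        # this sketch can only report the problem at run time.
        raise ValueError("both comparisons must point the same way")

    # Round the left bound toward the interior of the interval, then
    # move one more step if the left comparison is strict and fails.
    var = math.ceil(left) if step1 > 0 else math.floor(left)
    if not cmp1(left, var):
        var += step1

    # Both bounds were evaluated once, before iteration began.
    while cmp2(var, right):
        yield var
        var += step1


print(list(integer_interval(0, "<=", "<", 5)))       # [0, 1, 2, 3, 4]
print(list(integer_interval(5, ">", ">=", 0)))       # [4, 3, 2, 1, 0]
print(list(integer_interval(0.3, "<=", "<=", 3.5)))  # [1, 2, 3]
```

    Because this sketch is a generator, the TypeError is raised when
    iteration starts rather than when the function is called; the
    proposal instead has the for-loop statement itself raise it.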


Issues

    The following issues were raised in discussion of this and related
    proposals on python-list.

    - Should the right bound be evaluated once, or every time through
      the loop?  Clearly, it only makes sense to evaluate the left
      bound once.  For reasons of consistency and efficiency, we have
      chosen the same convention for the right bound.

    - Although the new syntax considerably simplifies integer
      for-loops, list comprehensions using the new syntax are not as
      simple.  We feel that this is appropriate since for-loops are
      more frequent than comprehensions.

    - The proposal does not allow access to integer iterator objects
      such as would be created by xrange.  True, but we see this as a
      shortcoming in the general list-comprehension syntax, beyond the
      scope of this proposal.  In addition, xrange() will still be
      available.

    - The proposal does not allow increments other than 1 and -1.
      More general arithmetic progressions would need to be created by
      range() or xrange(), or by a list comprehension syntax such as

        [2*x for 0 <= x <= 100]

    - The position of the loop variable in the middle of a three-way
      comparison is not as apparent as the variable in the present

        for item in list

      syntax, leading to a possible loss of readability.  We feel that
      this loss is outweighed by the increase in readability from a
      natural integer iteration syntax.

    - To some extent, this PEP addresses the same issues as PEP 276
      [4].  We feel that the two PEPs are not in conflict since PEP
      276 is primarily concerned with half-open ranges starting at 0
      (the easy case of range()) while this PEP is primarily concerned
      with simplifying all other cases.  However, if this PEP is
      approved, its new simpler syntax for integer loops could to some
      extent reduce the motivation for PEP 276.

    - It is not clear whether it makes sense to allow floating point
      bounds for an integer loop: if a float represents an inexact
      value, how can it be used to determine an exact sequence of
      integers?  On the other hand, disallowing float bounds would
      make it difficult to use floor() and ceil() in integer
      for-loops, as it is difficult to use them now with range().  We
      have erred on the side of flexibility, but this may lead to some
      implementation difficulties in determining the smallest and
      largest integer values that would cause a given comparison to be
      true.

    - Should types other than int, long, and float be allowed as
      bounds?  Another choice would be to convert all bounds to
      integers by int(), and allow as bounds anything that can be so
      converted instead of just floats.  However, this would change
      the semantics: 0.3 <= x is not the same as int(0.3) <= x, and it
      would be confusing for a loop with 0.3 as lower bound to start
      at zero.  Also, in general int(f) can be very far from f.
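
    Three of the points above can be checked with existing Python.
    The sketch below assumes Python 3; the function upper() and the
    list calls are hypothetical names introduced only to make the
    single evaluation of a bound observable.

```python
import math

calls = []


def upper():
    """Hypothetical bound expression that records each evaluation."""
    calls.append("evaluated")
    return 3


# range() evaluates its arguments once, before iteration begins.  The
# proposal adopts the same convention for both loop bounds.
visited = [i for i in range(0, upper())]
print(visited, len(calls))  # [0, 1, 2] 1

# Under the proposal, [2*x for 0 <= x <= 100] would denote the same
# arithmetic progression as the comprehension below.
evens = [2 * x for x in range(0, 101)]
print(evens == list(range(0, 201, 2)))  # True

# 0.3 <= x first holds at x == 1, whereas int(0.3) truncates to 0.
first_from_comparison = math.ceil(0.3)
first_from_int = int(0.3)
print(first_from_comparison, first_from_int)  # 1 0
```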


Implementation

    An implementation is not available at this time.  Implementation
    is not expected to pose any great difficulties: the new syntax
    could, if necessary, be recognized by parsing a general expression
    after each "for" keyword and testing whether the top level
    operation of the expression is "in" or a three-way comparison.
    The Python compiler would convert any instance of the new syntax
    into a loop over the items in a special iterator object.
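
    The recognition step can be approximated with the ast module of
    present-day Python 3, which parses a chained comparison into a
    single Compare node.  This is only a sketch of the idea: the
    function classify_for_test is a hypothetical name, it works on
    source text rather than inside the compiler, and it does not
    handle target lists such as "a, b in pairs" or check that the
    middle operand is a valid assignment target.

```python
import ast

_LESS = (ast.Lt, ast.LtE)
_GREATER = (ast.Gt, ast.GtE)


def classify_for_test(source):
    """Classify the expression after "for" as 'in', 'interval' or 'invalid'."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Compare):
        return "invalid"
    ops = node.ops
    # Existing form: a single top-level "in" comparison.
    if len(ops) == 1 and isinstance(ops[0], ast.In):
        return "in"
    # Proposed form: exactly two comparisons pointing the same way.
    if len(ops) == 2 and (
        all(isinstance(op, _LESS) for op in ops)
        or all(isinstance(op, _GREATER) for op in ops)
    ):
        return "interval"
    return "invalid"


print(classify_for_test("item in items"))         # in
print(classify_for_test("lower <= var < upper"))  # interval
print(classify_for_test("lower <= var > upper"))  # invalid
```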


References

    [1] PEP 204, Range Literals
        http://www.python.org/peps/pep-0204.html

    [2] PEP 212, Loop Counter Iteration
        http://www.python.org/peps/pep-0212.html

    [3] PEP 281, Loop Counter Iteration with range and xrange
        http://www.python.org/peps/pep-0281.html

    [4] PEP 276, Simple Iterator for ints
        http://www.python.org/peps/pep-0276.html


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: