From 9b27c83c792aea70d3d03bb2337cb2407f958f4b Mon Sep 17 00:00:00 2001
From: Barry Warsaw
Date: Wed, 6 Mar 2002 13:16:08 +0000
Subject: [PATCH] PEP 284, Integer for-loops, Eppstein & Ewing

---
 pep-0284.txt | 261 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 261 insertions(+)
 create mode 100644 pep-0284.txt

diff --git a/pep-0284.txt b/pep-0284.txt
new file mode 100644
index 000000000..364a5b658
--- /dev/null
+++ b/pep-0284.txt
@@ -0,0 +1,261 @@
+PEP: 284
+Title: Integer for-loops
+Version: $Revision$
+Last-Modified: $Date$
+Author: eppstein@ics.uci.edu (David Eppstein),
+    greg@cosc.canterbury.ac.nz (Greg Ewing)
+Status: Draft
+Type: Standards Track
+Created: 1-Mar-2002
+Python-Version: 2.3
+Post-History:
+
+
+Abstract
+
+    This PEP proposes to simplify iteration over intervals of
+    integers, by extending the range of expressions allowed after a
+    "for" keyword to allow three-way comparisons such as
+
+        for lower <= var < upper:
+
+    in place of the current
+
+        for item in list:
+
+    syntax. The resulting loop or list iteration will loop over all
+    values of var that make the comparison true, starting from the
+    left endpoint of the given interval.
+
+
+Rationale
+
+    One of the most common uses of for-loops in Python is to iterate
+    over an interval of integers. Python provides functions range()
+    and xrange() to generate lists and iterators for such intervals,
+    which work best for the most frequent case: half-open intervals
+    increasing from zero. However, the range() syntax is more awkward
+    for open or closed intervals, and lacks symmetry when reversing
+    the order of iteration. In addition, the call to an unfamiliar
+    function makes it difficult for newcomers to Python to understand
+    code that uses range() or xrange().
+
+    The perceived lack of a natural, intuitive integer iteration
+    syntax has led to heated debate on python-list, and spawned at
+    least four PEPs before this one.
+    PEP 204 [1] (rejected) proposed
+    to re-use Python's slice syntax for integer ranges, leading to a
+    terser syntax but not solving the readability problem of
+    multi-argument range(). PEP 212 [2] (deferred) proposed several
+    syntaxes for directly converting a list to a sequence of integer
+    indices, in place of the current idiom
+
+        range(len(list))
+
+    for such conversion, and PEP 281 [3] proposes to simplify the same
+    idiom by allowing it to be written as
+
+        range(list).
+
+    PEP 276 [4] proposes to allow automatic conversion of integers to
+    iterators, simplifying the most common half-open case but not
+    addressing the complexities of other types of interval.
+    Additional alternatives have been discussed on python-list.
+
+    The solution described here is to allow a three-way comparison
+    after a "for" keyword, both in the context of a for-loop and of a
+    list comprehension:
+
+        for lower <= var < upper:
+
+    This would cause iteration over an interval of consecutive
+    integers, beginning at the left bound in the comparison and ending
+    at the right bound. The exact comparison operations used would
+    determine whether the interval is open or closed at either end and
+    whether the integers are considered in ascending or descending
+    order.
+
+    This syntax closely matches standard mathematical notation, so is
+    likely to be more familiar to Python novices than the current
+    range() syntax. Open and closed interval endpoints are equally
+    easy to express, and the reversal of an integer interval can be
+    formed simply by swapping the two endpoints and reversing the
+    comparisons. In addition, the semantics of such a loop would
+    closely resemble one way of interpreting the existing Python
+    for-loops:
+
+        for item in list
+
+    iterates over exactly those values of item that cause the
+    expression
+
+        item in list
+
+    to be true.
+    Similarly, the new format
+
+        for lower <= var < upper:
+
+    would iterate over exactly those integer values of var that cause
+    the expression
+
+        lower <= var < upper
+
+    to be true.
+
+
+Specification
+
+    We propose to extend the syntax of a for statement, currently
+
+        for_stmt: "for" target_list "in" expression_list ":" suite
+                  ["else" ":" suite]
+
+    as described below:
+
+        for_stmt: "for" for_test ":" suite ["else" ":" suite]
+        for_test: target_list "in" expression_list |
+                  or_expr less_comp or_expr less_comp or_expr |
+                  or_expr greater_comp or_expr greater_comp or_expr
+        less_comp: "<" | "<="
+        greater_comp: ">" | ">="
+
+    Similarly, we propose to extend the syntax of list comprehensions,
+    currently
+
+        list_for: "for" expression_list "in" testlist [list_iter]
+
+    by replacing it with:
+
+        list_for: "for" for_test [list_iter]
+
+    In all cases the expression formed by for_test would be subject to
+    the same precedence rules as comparisons in expressions. The two
+    comp_operators in a for_test must both be of the same kind (both
+    less_comp or both greater_comp), unlike chained comparisons in
+    expressions, which have no such restriction.
+
+    We refer to the two or_expr's occurring on the left and right
+    sides of the for-loop syntax as the bounds of the loop, and the
+    middle or_expr as the variable of the loop. When a for-loop using
+    the new syntax is executed, the expressions for both bounds will
+    be evaluated, and an iterator object created that iterates through
+    all integers between the two bounds according to the comparison
+    operations used. The iterator will begin with an integer equal to
+    or near the left bound, and then step through the remaining
+    integers with a step size of +1 or -1 if the comparison operation
+    is in the set described by less_comp or greater_comp respectively.
+    The execution will then proceed as if the expression had been
+
+        for variable in iterator
+
+    where "variable" refers to the variable of the loop and "iterator"
+    refers to the iterator created for the given integer interval.
+
+    The values taken by the loop variable in an integer for-loop may
+    be either plain integers or long integers, according to the
+    magnitude of the bounds. Both bounds of an integer for-loop must
+    evaluate to a real numeric type (integer, long, or float). Any
+    other value will cause the for-loop statement to raise a TypeError
+    exception.
+
+
+Issues
+
+    The following issues were raised in discussion of this and related
+    proposals on the Python list.
+
+    - Should the right bound be evaluated once, or every time through
+      the loop? Clearly, it only makes sense to evaluate the left
+      bound once. For reasons of consistency and efficiency, we have
+      chosen the same convention for the right bound.
+
+    - Although the new syntax considerably simplifies integer
+      for-loops, list comprehensions using the new syntax are not as
+      simple. We feel that this is appropriate since for-loops are
+      more frequent than comprehensions.
+
+    - The proposal does not allow access to integer iterator objects
+      such as would be created by xrange. True, but we see this as a
+      shortcoming in the general list-comprehension syntax, beyond the
+      scope of this proposal. In addition, xrange() will still be
+      available.
+
+    - The proposal does not allow increments other than 1 and -1.
+      More general arithmetic progressions would need to be created by
+      range() or xrange(), or by a list comprehension syntax such as
+
+          [2*x for 0 <= x <= 100]
+
+    - The position of the loop variable in the middle of a three-way
+      comparison is not as apparent as the variable in the present
+
+          for item in list
+
+      syntax, leading to a possible loss of readability. We feel that
+      this loss is outweighed by the increase in readability from a
+      natural integer iteration syntax.
+
+    - To some extent, this PEP addresses the same issues as PEP 276
+      [4]. We feel that the two PEPs are not in conflict since PEP
+      276 is primarily concerned with half-open ranges starting at 0
+      (the easy case of range()) while this PEP is primarily concerned
+      with simplifying all other cases. However, if this PEP is
+      approved, its new simpler syntax for integer loops could to some
+      extent reduce the motivation for PEP 276.
+
+    - It is not clear whether it makes sense to allow floating point
+      bounds for an integer loop: if a float represents an inexact
+      value, how can it be used to determine an exact sequence of
+      integers? On the other hand, disallowing float bounds would
+      make it difficult to use floor() and ceil() in integer
+      for-loops, as it is difficult to use them now with range(). We
+      have erred on the side of flexibility, but this may lead to some
+      implementation difficulties in determining the smallest and
+      largest integer values that would cause a given comparison to be
+      true.
+
+    - Should types other than int, long, and float be allowed as
+      bounds? Another choice would be to convert all bounds to
+      integers by int(), and allow as bounds anything that can be so
+      converted instead of just floats. However, this would change
+      the semantics: 0.3 <= x is not the same as int(0.3) <= x, and it
+      would be confusing for a loop with 0.3 as lower bound to start
+      at zero. Also, in general int(f) can be very far from f.
+
+
+Implementation
+
+    An implementation is not available at this time. Implementation
+    is not expected to pose any great difficulties: the new syntax
+    could, if necessary, be recognized by parsing a general expression
+    after each "for" keyword and testing whether the top level
+    operation of the expression is "in" or a three-way comparison.
+    The Python compiler would convert any instance of the new syntax
+    into a loop over the items in a special iterator object.
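The "special iterator object" is not specified further in the PEP. As a rough illustration only, the intended semantics can be emulated in present-day Python with an ordinary generator. The sketch below is not part of the proposal: the name integer_interval and its string-based operator arguments are hypothetical, it assumes Python 3 (where int also covers the old long type), and it assumes finite bounds.

```python
import math
import operator

# Hypothetical helper, not part of PEP 284: emulates the proposed
#     for left OP1 var OP2 right:
# by yielding every integer var that makes the comparison true.
_LESS = {"<": operator.lt, "<=": operator.le}
_GREATER = {">": operator.gt, ">=": operator.ge}


def integer_interval(left, left_op, right_op, right):
    """Yield the integers v for which `left left_op v right_op right` holds."""
    # The PEP requires int, long, or float bounds and a TypeError otherwise.
    for bound in (left, right):
        if not isinstance(bound, (int, float)):
            raise TypeError("bounds must be int or float")

    if left_op in _LESS and right_op in _LESS:
        # Ascending: start at the smallest integer satisfying the left test.
        v = math.floor(left) + 1 if left_op == "<" else math.ceil(left)
        step, keep_going = 1, _LESS[right_op]
    elif left_op in _GREATER and right_op in _GREATER:
        # Descending: start at the largest integer satisfying the left test.
        v = math.ceil(left) - 1 if left_op == ">" else math.floor(left)
        step, keep_going = -1, _GREATER[right_op]
    else:
        # The proposed grammar rejects mixed operators such as a < v > b.
        raise ValueError("both comparisons must point the same way")

    # Both bounds were evaluated once, before iteration began.
    while keep_going(v, right):
        yield v
        v += step


print(list(integer_interval(0, "<=", "<", 5)))
print(list(integer_interval(5, ">", ">=", 0)))
print(list(integer_interval(0.3, "<=", "<=", 3.5)))
```

With a float lower bound of 0.3 the first value produced is 1 rather than 0, which is the behaviour the Issues section argues for when it rejects converting bounds with int().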
+
+
+References
+
+    [1] PEP 204, Range Literals
+        http://www.python.org/peps/pep-0204.html
+
+    [2] PEP 212, Loop Counter Iteration
+        http://www.python.org/peps/pep-0212.html
+
+    [3] PEP 281, Loop Counter Iteration with range and xrange
+        http://www.python.org/peps/pep-0281.html
+
+    [4] PEP 276, Simple Iterator for ints
+        http://www.python.org/peps/pep-0276.html
+
+
+Copyright
+
+    This document has been placed in the public domain.
+
+
+
+Local Variables:
+mode: indented-text
+indent-tabs-mode: nil
+fill-column: 70
+End: