reSTify PEP 276 (#357)

Huang Huang 2017-08-22 01:02:31 +08:00 committed by Brett Cannon
parent 8af972b089
commit a5101b1988
1 changed files with 320 additions and 300 deletions

@@ -5,425 +5,445 @@ Last-Modified: $Date$
Author: james_althoff@i2.com (Jim Althoff)
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Nov-2001
Python-Version: 2.3
Post-History:
Abstract
========
Python 2.1 added new functionality to support iterators[1].
Iterators have proven to be useful and convenient in many coding
situations. It is noted that the implementation of Python's
for-loop control structure uses the iterator protocol as of
release 2.1. It is also noted that Python provides iterators for
the following builtin types: lists, tuples, dictionaries, strings,
and files. This PEP proposes the addition of an iterator for the
builtin type int (types.IntType). Such an iterator would simplify
the coding of certain for-loops in Python.
Python 2.1 added new functionality to support iterators [1]_.
Iterators have proven to be useful and convenient in many coding
situations. It is noted that the implementation of Python's
for-loop control structure uses the iterator protocol as of
release 2.1. It is also noted that Python provides iterators for
the following builtin types: lists, tuples, dictionaries, strings,
and files. This PEP proposes the addition of an iterator for the
builtin type int (``types.IntType``). Such an iterator would simplify
the coding of certain for-loops in Python.
BDFL Pronouncement
==================
This PEP was rejected on 17 June 2005 with a note to python-dev.
This PEP was rejected on 17 June 2005 with a note to python-dev.
Much of the original need was met by the enumerate() function which
was accepted for Python 2.3.
Much of the original need was met by the ``enumerate()`` function which
was accepted for Python 2.3.
Also, the proposal both allowed and encouraged misuses such as:
Also, the proposal both allowed and encouraged misuses such as::
>>> for i in 3: print i
0
1
2
>>> for i in 3: print i
0
1
2
Likewise, it was not helpful that the proposal would disable the
syntax error in statements like:
Likewise, it was not helpful that the proposal would disable the
syntax error in statements like::
x, = 1
x, = 1
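
For comparison, the ``enumerate()`` function cited in the pronouncement covers the common case of needing an index alongside each item. The following is only a small sketch in Python 3 syntax; the list ``names`` is made up for illustration.

```python
names = ["spam", "eggs", "ham"]

# Index and item together, without writing range(len(names)).
for index, name in enumerate(names):
    print(index, name)
```
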
Specification
=============
Define an iterator for types.IntType (i.e., the builtin type
"int") that is returned from the builtin function "iter" when
called with an instance of types.IntType as the argument.
Define an iterator for types.IntType (i.e., the builtin type
"int") that is returned from the builtin function "iter" when
called with an instance of types.IntType as the argument.
The returned iterator has the following behavior:
The returned iterator has the following behavior:
- Assume that object i is an instance of types.IntType (the
builtin type int) and that i > 0
- Assume that object i is an instance of ``types.IntType`` (the
builtin type int) and that i > 0
- iter(i) returns an iterator object
- ``iter(i)`` returns an iterator object
- said iterator object iterates through the sequence of ints
0,1,2,...,i-1
- said iterator object iterates through the sequence of ints
0,1,2,...,i-1
Example:
Example:
iter(5) returns an iterator object that iterates through the
sequence of ints 0,1,2,3,4
``iter(5)`` returns an iterator object that iterates through the
sequence of ints 0,1,2,3,4
- if i <= 0, iter(i) returns an "empty" iterator, i.e., one that
raises StopIteration upon the first call of its "next" method
- if i <= 0, ``iter(i)`` returns an "empty" iterator, i.e., one that
raises StopIteration upon the first call of its "next" method
In other words, the conditions and semantics of said iterator are
consistent with the conditions and semantics of the range() and
xrange() functions.
In other words, the conditions and semantics of said iterator are
consistent with the conditions and semantics of the ``range()`` and
``xrange()`` functions.
Note that the sequence 0,1,2,...,i-1 associated with the int i is
considered "natural" in the context of Python programming because
it is consistent with the builtin indexing protocol of sequences
in Python. Python lists and tuples, for example, are indexed
starting at 0 and ending at len(object)-1 (when using positive
indices). In other words, such objects are indexed with the
sequence 0,1,2,...,len(object)-1
Note that the sequence 0,1,2,...,i-1 associated with the int i is
considered "natural" in the context of Python programming because
it is consistent with the builtin indexing protocol of sequences
in Python. Python lists and tuples, for example, are indexed
starting at 0 and ending at len(object)-1 (when using positive
indices). In other words, such objects are indexed with the
sequence 0,1,2,...,len(object)-1
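
The behavior specified above can be sketched in present-day Python as an explicit iterator class. This is only an illustration under stated assumptions: the name ``IntIterator`` is hypothetical, and the code uses Python 3's ``__next__`` method (spelled ``next`` in the Python 2 releases this PEP targets).

```python
class IntIterator:
    """Hypothetical iterator over 0, 1, ..., i-1 for an int i."""

    def __init__(self, i):
        self._stop = i
        self._current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # For i <= 0 the very first call raises StopIteration.
        if self._current >= self._stop:
            raise StopIteration
        value = self._current
        self._current += 1
        return value


print(list(IntIterator(5)))   # [0, 1, 2, 3, 4]
print(list(IntIterator(0)))   # []
print(list(IntIterator(-3)))  # []
```

In each case the result matches ``list(range(i))``, which is the consistency with ``range()`` that the specification asks for.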
Rationale
=========
A common programming idiom is to take a collection of objects and
apply some operation to each item in the collection in some
established sequential order. Python provides the "for in"
looping control structure for handling this common idiom. Cases
arise, however, where it is necessary (or more convenient) to
access each item in an "indexed" collection by iterating through
each index and accessing each item in the collection using the
corresponding index.
A common programming idiom is to take a collection of objects and
apply some operation to each item in the collection in some
established sequential order. Python provides the "for in"
looping control structure for handling this common idiom. Cases
arise, however, where it is necessary (or more convenient) to
access each item in an "indexed" collection by iterating through
each index and accessing each item in the collection using the
corresponding index.
For example, one might have a two-dimensional "table" object where one
requires the application of some operation to the first column of
each row in the table. Depending on the implementation of the table
it might not be possible to access first each row and then each
column as individual objects. It might, rather, be possible to
access a cell in the table using a row index and a column index.
In such a case it is necessary to use an idiom where one iterates
through a sequence of indices (indexes) in order to access the
desired items in the table. (Note that the commonly used
DefaultTableModel class in Java-Swing-Jython has this very protocol).
For example, one might have a two-dimensional "table" object where one
requires the application of some operation to the first column of
each row in the table. Depending on the implementation of the table
it might not be possible to access first each row and then each
column as individual objects. It might, rather, be possible to
access a cell in the table using a row index and a column index.
In such a case it is necessary to use an idiom where one iterates
through a sequence of indices (indexes) in order to access the
desired items in the table. (Note that the commonly used
DefaultTableModel class in Java-Swing-Jython has this very protocol).
Another common example is where one needs to process two or more
collections in parallel. Another example is where one needs to
access, say, every second item in a collection.
Another common example is where one needs to process two or more
collections in parallel. Another example is where one needs to
access, say, every second item in a collection.
There are many other examples where access to items in a
collection is facilitated by a computation on an index thus
necessitating access to the indices rather than direct access to
the items themselves.
There are many other examples where access to items in a
collection is facilitated by a computation on an index thus
necessitating access to the indices rather than direct access to
the items themselves.
Let's call this idiom the "indexed for-loop" idiom. Some
programming languages provide builtin syntax for handling this
idiom. In Python the common convention for implementing the
indexed for-loop idiom is to use the builtin range() or xrange()
function to generate a sequence of indices as in, for example:
Let's call this idiom the "indexed for-loop" idiom. Some
programming languages provide builtin syntax for handling this
idiom. In Python the common convention for implementing the
indexed for-loop idiom is to use the builtin ``range()`` or ``xrange()``
function to generate a sequence of indices as in, for example::
for rowcount in range(table.getRowCount()):
print table.getValueAt(rowcount, 0)
for rowcount in range(table.getRowCount()):
print table.getValueAt(rowcount, 0)
or
or
for rowcount in xrange(table.getRowCount()):
print table.getValueAt(rowcount, 0)
::
From time to time there are discussions in the Python community
about the indexed for-loop idiom. It is sometimes argued that the
need for using the range() or xrange() function for this design
idiom is:
for rowcount in xrange(table.getRowCount()):
print table.getValueAt(rowcount, 0)
- Not obvious (to new-to-Python programmers),
From time to time there are discussions in the Python community
about the indexed for-loop idiom. It is sometimes argued that the
need for using the ``range()`` or ``xrange()`` function for this design
idiom is:
- Error prone (easy to forget, even for experienced Python
programmers)
- Not obvious (to new-to-Python programmers),
- Confusing and distracting for those who feel compelled to understand
the differences and recommended usage of xrange() vis-a-vis range()
- Error prone (easy to forget, even for experienced Python
programmers)
- Unwieldy, especially when combined with the len() function,
i.e., xrange(len(sequence))
- Confusing and distracting for those who feel compelled to understand
the differences and recommended usage of ``xrange()`` vis-a-vis ``range()``
- Not as convenient as equivalent mechanisms in other languages,
- Unwieldy, especially when combined with the ``len()`` function,
i.e., ``xrange(len(sequence))``
- Annoying, a "wart", etc.
- Not as convenient as equivalent mechanisms in other languages,
And from time to time proposals are put forth for ways in which
Python could provide a better mechanism for this idiom. Recent
examples include PEP 204, "Range Literals", and PEP 212, "Loop
Counter Iteration".
- Annoying, a "wart", etc.
Most often, such proposals include changes to Python's syntax and
other "heavyweight" changes.
And from time to time proposals are put forth for ways in which
Python could provide a better mechanism for this idiom. Recent
examples include PEP 204, "Range Literals", and PEP 212, "Loop
Counter Iteration".
Part of the difficulty here is that advocating new syntax implies
a comprehensive solution for "general indexing" that has to
include aspects like:
Most often, such proposals include changes to Python's syntax and
other "heavyweight" changes.
- starting index value
Part of the difficulty here is that advocating new syntax implies
a comprehensive solution for "general indexing" that has to
include aspects like:
- ending index value
- starting index value
- step value
- ending index value
- open intervals versus closed intervals versus half-open intervals
- step value
Finding a new syntax that is comprehensive, simple, general,
Pythonic, appealing to many, easy to implement, not in conflict
with existing structures, not excessively overloading of existing
structures, etc. has proven to be more difficult than one might
anticipate.
- open intervals versus closed intervals versus half-open intervals
The proposal outlined in this PEP tries to address the problem by
suggesting a simple "lightweight" solution that helps the most
common case by using a proven mechanism that is already available
(as of Python 2.1): namely, iterators.
Finding a new syntax that is comprehensive, simple, general,
Pythonic, appealing to many, easy to implement, not in conflict
with existing structures, not excessively overloading of existing
structures, etc. has proven to be more difficult than one might
anticipate.
Because for-loops already use the "iterator" protocol as of Python
2.1, adding an iterator for types.IntType as proposed in this PEP
would enable by default the following shortcut for the indexed
for-loop idiom:
The proposal outlined in this PEP tries to address the problem by
suggesting a simple "lightweight" solution that helps the most
common case by using a proven mechanism that is already available
(as of Python 2.1): namely, iterators.
Because for-loops already use the "iterator" protocol as of Python
2.1, adding an iterator for types.IntType as proposed in this PEP
would enable by default the following shortcut for the indexed
for-loop idiom::
for rowcount in table.getRowCount():
print table.getValueAt(rowcount, 0)
The following benefits for this approach vis-a-vis the current
mechanism of using the ``range()`` or ``xrange()`` functions are claimed
to be:
- Simpler,
- Less cluttered,
- Focuses on the problem at hand without the need to resort to
secondary implementation-oriented functions (``range()`` and
``xrange()``)
And compared to other proposals for change:
- Requires no new syntax
- Requires no new keywords
- Takes advantage of the new and well-established iterator mechanism
And generally:
- Is consistent with iterator-based "convenience" changes already
included (as of Python 2.1) for other builtin types such as:
lists, tuples, dictionaries, strings, and files.
Backwards Compatibility
=======================
The proposed mechanism is generally backwards compatible as it
calls for neither new syntax nor new keywords. All existing,
valid Python programs should continue to work unmodified.
However, this proposal is not perfectly backwards compatible in
the sense that certain statements that are currently invalid
would, under the current proposal, become valid.
Tim Peters has pointed out two such examples:
1) The common case where one forgets to include ``range()`` or
``xrange()``, for example::
for rowcount in table.getRowCount():
print table.getValueAt(rowcount, 0)
The following benefits for this approach vis-a-vis the current
mechanism of using the range() or xrange() functions are claimed
to be:
in Python 2.2 raises a TypeError exception.
- Simpler,
Under the current proposal, the above statement would be valid
and would work as (presumably) intended. Presumably, this is a
good thing.
- Less cluttered,
As noted by Tim, this is the common case of the "forgotten
range" mistake (which one currently corrects by adding a call
to ``range()`` or ``xrange()``).
- Focuses on the problem at hand without the need to resort to
secondary implementation-oriented functions (range() and
xrange())
2) The (hopefully) very uncommon case where one makes a typing
mistake when using tuple unpacking. For example::
And compared to other proposals for change:
x, = 1
- Requires no new syntax
in Python 2.2 raises a ``TypeError`` exception.
- Requires no new keywords
- Takes advantage of the new and well-established iterator mechanism
And generally:
- Is consistent with iterator-based "convenience" changes already
included (as of Python 2.1) for other builtin types such as:
lists, tuples, dictionaries, strings, and files.
Under the current proposal, the above statement would be valid
and would set x to 0. The PEP author has no data as to how
common this typing error is nor how difficult it would be to
catch such an error under the current proposal. He imagines
that it does not occur frequently and that it would be
relatively easy to correct should it happen.
Backwards Compatibility
Issues
======
The proposed mechanism is generally backwards compatible as it
calls for neither new syntax nor new keywords. All existing,
valid Python programs should continue to work unmodified.
Extensive discussions concerning PEP 276 on the Python interest
mailing list suggest a range of opinions: some in favor, some
neutral, some against. Those in favor tend to agree with the
claims above of the usefulness, convenience, ease of learning,
and simplicity of a simple iterator for integers.
However, this proposal is not perfectly backwards compatible in
the sense that certain statements that are currently invalid
would, under the current proposal, become valid.
Issues with PEP 276 include:
Tim Peters has pointed out two such examples:
- Using range/xrange is fine as is.
1) The common case where one forgets to include range() or
xrange(), for example:
Response: Some posters feel this way. Others disagree.
for rowcount in table.getRowCount():
print table.getValueAt(rowcount, 0)
- Some feel that iterating over the sequence "0, 1, 2, ..., n-1"
for an integer n is not intuitive. "for i in 5:" is considered
(by some) to be "non-obvious", for example. Some dislike this
usage because it doesn't have "the right feel". Some dislike it
because they believe that this type of usage forces one to view
integers as sequences and this seems wrong to them. Some
dislike it because they prefer to view for-loops as dealing
with explicit sequences rather than with arbitrary iterators.
in Python 2.2 raises a TypeError exception.
Response: Some like the proposed idiom and see it as simple,
elegant, easy to learn, and easy to use. Some are neutral on
this issue. Others, as noted, dislike it.
Under the current proposal, the above statement would be valid
and would work as (presumably) intended. Presumably, this is a
good thing.
- Is it obvious that ``iter(5)`` maps to the sequence 0,1,2,3,4?
As noted by Tim, this is the common case of the "forgotten
range" mistake (which one currently corrects by adding a call
to range() or xrange()).
Response: Given, as noted above, that Python has a strong
convention for indexing sequences starting at 0 and stopping at
(inclusively) the index whose value is one less than the length
of the sequence, it is argued that the proposed sequence is
reasonably intuitive to the Python programmer while being useful
and practical. More importantly, it is argued that once learned
this convention is very easy to remember. Note that the doc
string for the range function makes a reference to the
natural and useful association between ``range(n)`` and the indices
for a list whose length is n.
2) The (hopefully) very uncommon case where one makes a typing
mistake when using tuple unpacking. For example:
- Possible ambiguity
x, = 1
::
in Python 2.2 raises a TypeError exception.
for i in 10: print i
Under the current proposal, the above statement would be valid
and would set x to 0. The PEP author has no data as to how
common this typing error is nor how difficult it would be to
catch such an error under the current proposal. He imagines
that it does not occur frequently and that it would be
relatively easy to correct should it happen.
might be mistaken for
::
Issues:
for i in (10,): print i
Extensive discussions concerning PEP 276 on the Python interest
mailing list suggest a range of opinions: some in favor, some
neutral, some against. Those in favor tend to agree with the
claims above of the usefulness, convenience, ease of learning,
and simplicity of a simple iterator for integers.
Response: This is exactly the same situation with strings in
current Python (replace 10 with 'spam' in the above, for
example).
Issues with PEP 276 include:
- Too general: in the newest releases of Python there are
contexts -- as with for-loops -- where iterators are called
implicitly. Some fear that having an iterator invoked for
an integer in one of the contexts (excluding for-loops) might
lead to unexpected behavior and bugs. The "x, = 1" example
noted above is a case in point.
- Using range/xrange is fine as is.
Response: From the author's perspective the examples of the
above that were identified in the PEP 276 discussions did
not appear to be ones that would be accidentally misused
in ways that would lead to subtle and hard-to-detect errors.
Response: Some posters feel this way. Others disagree.
In addition, it seems that there is a way to deal with this
issue by using a variation of what is outlined in the
specification section of this proposal. Instead of adding
an ``__iter__`` method to class int, change the for-loop handling
code to convert (in essence) from
- Some feel that iterating over the sequence "0, 1, 2, ..., n-1"
for an integer n is not intuitive. "for i in 5:" is considered
(by some) to be "non-obvious", for example. Some dislike this
usage because it doesn't have "the right feel". Some dislike it
because they believe that this type of usage forces one to view
integers as sequences and this seems wrong to them. Some
dislike it because they prefer to view for-loops as dealing
with explicit sequences rather than with arbitrary iterators.
::
Response: Some like the proposed idiom and see it as simple,
elegant, easy to learn, and easy to use. Some are neutral on
this issue. Others, as noted, dislike it.
for i in n: # when isinstance(n,int) is 1
- Is it obvious that iter(5) maps to the sequence 0,1,2,3,4?
to
Response: Given, as noted above, that Python has a strong
convention for indexing sequences starting at 0 and stopping at
(inclusively) the index whose value is one less than the length
of the sequence, it is argued that the proposed sequence is
reasonably intuitive to the Python programmer while being useful
and practical. More importantly, it is argued that once learned
this convention is very easy to remember. Note that the doc
string for the range function makes a reference to the
natural and useful association between range(n) and the indices
for a list whose length is n.
::
- Possible ambiguity
for i in xrange(n):
for i in 10: print i
This approach gives the same results in a for-loop as an
``__iter__`` method would but would prevent iteration on integer
values in any other context. Lists and tuples, for example,
don't have ``__iter__`` and are handled with special code.
Integer values would be one more special case.
might be mistaken for
- "i in n" seems very unnatural.
for i in (10,): print i
Response: Some feel that "i in len(mylist)" would be easily
understandable and useful. Some don't like it, particularly
when a literal is used as in "i in 5". If the variant
mentioned in the response to the previous issue is implemented,
this issue is moot. If not, then one could also address this
issue by defining a ``__contains__`` method in class int that would
always raise a TypeError. This would then make the behavior of
"i in n" identical to that of current Python.
Response: This is exactly the same situation with strings in
current Python (replace 10 with 'spam' in the above, for
example).
- Might dissuade newbies from using the indexed for-loop idiom when
the standard "for item in collection:" idiom is clearly better.
- Too general: in the newest releases of Python there are
contexts -- as with for-loops -- where iterators are called
implicitly. Some fear that having an iterator invoked for
an integer in one of the contexts (excluding for-loops) might
lead to unexpected behavior and bugs. The "x, = 1" example
noted above is a case in point.
Response: The standard idiom is so nice when it fits that it
needs neither extra "carrot" nor "stick". On the other hand,
one does notice cases of overuse/misuse of the standard idiom
(due, most likely, to the awkwardness of the indexed for-loop
idiom), as in::
Response: From the author's perspective the examples of the
above that were identified in the PEP 276 discussions did
not appear to be ones that would be accidentally misused
in ways that would lead to subtle and hard-to-detect errors.
for item in sequence:
print sequence.index(item)
In addition, it seems that there is a way to deal with this
issue by using a variation of what is outlined in the
specification section of this proposal. Instead of adding
an __iter__ method to class int, change the for-loop handling
code to convert (in essence) from
- Why not propose even bigger changes?
for i in n: # when isinstance(n,int) is 1
The majority of disagreement with PEP 276 came from those who
favor much larger changes to Python to address the more general
problem of specifying a sequence of integers where such
a specification is general enough to handle the starting value,
ending value, and stepping value of the sequence and also
addresses variations of open, closed, and half-open (half-closed)
integer intervals. Many suggestions of such were discussed.
to
These include:
for i in xrange(n):
- adding Haskell-like notation for specifying a sequence of
integers in a literal list,
This approach gives the same results in a for-loop as an
__iter__ method would but would prevent iteration on integer
values in any other context. Lists and tuples, for example,
don't have __iter__ and are handled with special code.
Integer values would be one more special case.
- various uses of slicing notation to specify sequences,
- "i in n" seems very unnatural.
- changes to the syntax of for-in loops to allow the use of
relational operators in the loop header,
Response: Some feel that "i in len(mylist)" would be easily
understandable and useful. Some don't like it, particularly
when a literal is used as in "i in 5". If the variant
mentioned in the response to the previous issue is implemented,
this issue is moot. If not, then one could also address this
issue by defining a __contains__ method in class int that would
always raise a TypeError. This would then make the behavior of
"i in n" identical to that of current Python.
- creation of an integer-interval class along with methods that
overload relational operators or division operators
to provide "slicing" on integer-interval objects,
- Might dissuade newbies from using the indexed for-loop idiom when
the standard "for item in collection:" idiom is clearly better.
- and more.
Response: The standard idiom is so nice when it fits that it
needs neither extra "carrot" nor "stick". On the other hand,
one does notice cases of overuse/misuse of the standard idiom
(due, most likely, to the awkwardness of the indexed for-loop
idiom), as in:
It should be noted that there was much debate but not an
overwhelming consensus for any of these larger-scale suggestions.
for item in sequence:
print sequence.index(item)
- Why not propose even bigger changes?
The majority of disagreement with PEP 276 came from those who
favor much larger changes to Python to address the more general
problem of specifying a sequence of integers where such
a specification is general enough to handle the starting value,
ending value, and stepping value of the sequence and also
addresses variations of open, closed, and half-open (half-closed)
integer intervals. Many suggestions of such were discussed.
These include:
- adding Haskell-like notation for specifying a sequence of
integers in a literal list,
- various uses of slicing notation to specify sequences,
- changes to the syntax of for-in loops to allow the use of
relational operators in the loop header,
- creation of an integer-interval class along with methods that
overload relational operators or division operators
to provide "slicing" on integer-interval objects,
- and more.
It should be noted that there was much debate but not an
overwhelming consensus for any of these larger-scale suggestions.
Clearly, PEP 276 does not propose such a large-scale change
and instead focuses on a specific problem area. Towards the
end of the discussion period, several posters expressed favor
for the narrow focus and simplicity of PEP 276 vis-a-vis the more
ambitious suggestions that were advanced. There did appear to be
consensus for the need for a PEP for any such larger-scale,
alternative suggestion. In light of this recognition, details of
the various alternative suggestions are not discussed here further.
Clearly, PEP 276 does not propose such a large-scale change
and instead focuses on a specific problem area. Towards the
end of the discussion period, several posters expressed favor
for the narrow focus and simplicity of PEP 276 vis-a-vis the more
ambitious suggestions that were advanced. There did appear to be
consensus for the need for a PEP for any such larger-scale,
alternative suggestion. In light of this recognition, details of
the various alternative suggestions are not discussed here further.
Implementation
==============
An implementation is not available at this time but is expected
to be straightforward. The author has implemented a subclass of
int with an __iter__ method (written in Python) as a means to test
out the ideas in this proposal, however.
An implementation is not available at this time but is expected
to be straightforward. The author has implemented a subclass of
int with an ``__iter__`` method (written in Python) as a means to test
out the ideas in this proposal, however.
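
The PEP does not include the author's test subclass. A minimal sketch of what such a subclass might look like, assuming Python 3 and a hypothetical class name ``IterableInt``, is given here. It also reproduces the tuple-unpacking side effect described under Backwards Compatibility.

```python
class IterableInt(int):
    """Hypothetical int subclass whose instances iterate over 0..self-1."""

    def __iter__(self):
        # range() already yields nothing for non-positive values,
        # which gives the "empty" iterator from the specification.
        return iter(range(self))


row_count = IterableInt(3)
for row in row_count:   # the proposed shortcut for range(row_count)
    print(row)          # prints 0, 1, 2 on separate lines

# Side effect noted under Backwards Compatibility: unpacking now succeeds
# where a plain int would raise TypeError.
x, = IterableInt(1)
print(x)                # prints 0
```

Because the class only adds ``__iter__``, arithmetic and comparisons behave as they do for a plain ``int``; only contexts that implicitly call ``iter()`` change.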
References
==========
[1] PEP 234, Iterators
http://www.python.org/dev/peps/pep-0234/
.. [1] PEP 234, Iterators
http://www.python.org/dev/peps/pep-0234/
[2] PEP 204, Range Literals
http://www.python.org/dev/peps/pep-0204/
.. [2] PEP 204, Range Literals
http://www.python.org/dev/peps/pep-0204/
[3] PEP 212, Loop Counter Iteration
http://www.python.org/dev/peps/pep-0212/
.. [3] PEP 212, Loop Counter Iteration
http://www.python.org/dev/peps/pep-0212/
Copyright
=========
This document has been placed in the public domain.
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End:
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: