Moved all the discussion items together at the end, in two sections
"Open Issues" and "Resolved Issues".
parent e5aa4a1379
commit 3deab93236

pep-0234.txt | 224 lines changed
@@ -97,22 +97,6 @@ C API Specification
     reference to themselves; this is needed to make it possible to
     use an iterator (as opposed to a sequence) in a for loop.

-    Discussion: should the next() method be renamed to __next__()?
-    Every other method corresponding to a tp_<something> slot has a
-    special name. On the other hand, this would suggest that there
-    should also be a primitive operation next(x) that would call
-    x.__next__(), and this just looks like adding complexity without
-    benefit. So I think it's better to stick with next(). On the
-    other hand, Marc-Andre Lemburg points out: "Even though .next()
-    reads better, I think that we should stick to the convention that
-    interpreter APIs use the __xxx__ naming scheme. Otherwise, people
-    will have a hard time differentiating between user-level protocols
-    and interpreter-level ones. AFAIK, .next() would be the first
-    low-level API not using this convention." My (BDFL's) response:
-    there are other important protocols with a user-level name
-    (e.g. keys()), and I don't see the importance of this particular
-    rule. BDFL pronouncement: this topic is closed. next() it is.
-

 Python API Specification

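Reviewer note on the next() discussion removed in the hunk above: a minimal sketch of an iterator in the shape the PEP describes, with a method that raises StopIteration at the end and an __iter__() that returns the object itself. The class name Countdown is hypothetical. The PEP text at this revision spells the method next(); Python 3 later renamed it to __next__() and added a built-in next() function, so the sketch assumes Python 3, defines __next__(), and aliases next to it.

```python
class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself so it can be used directly in a for loop.
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

    # Spelling used in the PEP text at this revision.
    next = __next__


it = Countdown(3)
first = it.next()   # explicit user-level call, as the discussion anticipates
rest = list(it)     # list() drives the remaining items through __next__()
```

The explicit it.next() call is the user-level usage that the argument for the plain name relies on.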
@@ -150,35 +134,6 @@ Python API Specification
     above. A class that wants to be an iterator also ought to
     implement __iter__() returning itself.

-    Discussion:
-
-    - The name iter() is an abbreviation. Alternatives proposed
-      include iterate(), harp(), traverse(), narrate().
-
-    - Using the same name for two different operations (getting an
-      iterator from an object and making an iterator for a function
-      with a sentinel value) is somewhat ugly. I haven't seen a
-      better name for the second operation though.
-
-    - There's a bit of undefined behavior for iterators: once a
-      particular iterator object has raised StopIteration, will it
-      also raise StopIteration on all subsequent next() calls? Some
-      say that it would be useful to require this, others say that it
-      is useful to leave this open to individual iterators. Note that
-      this may require an additional state bit for some iterator
-      implementations (e.g. function-wrapping iterators).
-
-    - Some folks have requested the ability to restart an iterator. I
-      believe this should be dealt with by calling iter() on a
-      sequence repeatedly, not by the iterator protocol itself.
-
-    - It was originally proposed that rather than having a next()
-      method, an iterator object should simply be callable. This was
-      rejected in favor of an explicit next() method. The reason is
-      clarity: if you don't know the code very well, "x = s()" does
-      not give a hint about what it does; but "x = s.next()" is pretty
-      clear. BDFL pronouncement: this topic is closed. next() it is.
-

 Dictionary Iterators

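Reviewer note on the iter() items in the hunk above: a small sketch of the two operations that share the name iter(), plus the "restart by calling iter() again" approach. It assumes Python 3 and uses only built-ins and io.StringIO; the helper is_exhausted is hypothetical. The list iterator in CPython keeps raising StopIteration once exhausted, but whether every iterator must behave that way is exactly the question the text leaves open.

```python
import io

stream = io.StringIO("alpha\nbeta\n")

# Two-argument form: call stream.readline() until it returns the sentinel "".
lines = iter(stream.readline, "")
collected = [line.rstrip("\n") for line in lines]

# One-argument form: ask a sequence for an iterator.
seq = [1, 2]
it = iter(seq)
items = list(it)


def is_exhausted(iterator):
    """Hypothetical helper: True if a further next() raises StopIteration."""
    try:
        next(iterator)
    except StopIteration:
        return True
    return False


# Probe the exhausted list iterator twice in a row.
still_done = is_exhausted(it) and is_exhausted(it)

# "Restarting" means asking the sequence for a fresh iterator.
restarted = list(iter(seq))
```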
@@ -211,46 +166,11 @@ Dictionary Iterators
     as long as the restrictions on modifications to the dictionary
     (either by the loop or by another thread) are not violated.

-    There is no doubt that the dict.has_key(x) interpretation of "x
-    in dict" is by far the most useful interpretation, probably the
-    only useful one. There has been resistance against this because
-    "x in list" checks whether x is present among the values, while
-    the proposal makes "x in dict" check whether x is present among
-    the keys. Given that the symmetry between lists and dictionaries
-    is very weak, this argument does not have much weight.
-
-    The main discussion focuses on whether
-
-        for x in dict: ...
-
-    should assign x the successive keys, values, or items of the
-    dictionary. The symmetry between "if x in y" and "for x in y"
-    suggests that it should iterate over keys. This symmetry has been
-    observed by many independently and has even been used to "explain"
-    one using the other. This is because for sequences, "if x in y"
-    iterates over y comparing the iterated values to x. If we adopt
-    both of the above proposals, this will also hold for
-    dictionaries.
-
-    The argument against making "for x in dict" iterate over the keys
-    comes mostly from a practicality point of view: scans of the
-    standard library show that there are about as many uses of "for x
-    in dict.items()" as there are of "for x in dict.keys()", with the
-    items() version having a small majority. Presumably many of the
-    loops using keys() use the corresponding value anyway, by writing
-    dict[x], so (the argument goes) by making both the key and value
-    available, we could support the largest number of cases. While
-    this is true, I (Guido) find the correspondence between "for x in
-    dict" and "if x in dict" too compelling to break, and there's not
-    much overhead in having to write dict[x] to explicitly get the
-    value. We could also add methods to dictionaries that return
-    different kinds of iterators, e.g.
-
-        for key, value in dict.iteritems(): ...
-
-        for value in dict.itervalues(): ...
-
-        for key in dict.iterkeys(): ...
+    If this proposal is accepted, it makes sense to recommend that
+    other mappings, if they support iterators at all, should also
+    iterate over the keys. However, this should not be taken as an
+    absolute rule; specific applications may have different
+    requirements.


 File Iterators
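Reviewer note on the dictionary discussion in the hunk above: a short sketch of the correspondence between "x in dict" and "for x in dict", both of which work on keys. It assumes Python 3, where the iteritems(), itervalues() and iterkeys() methods named in the text do not exist on dict and items(), values() and keys() return views instead. The dictionary contents are made up for illustration.

```python
ages = {"ann": 31, "bob": 27}  # illustrative data only

# "x in d" tests membership among the keys ...
has_ann = "ann" in ages
has_31 = 31 in ages  # values are not consulted

# ... and "for x in d" visits those same keys.
keys = [name for name in ages]

# The value is one subscript away, as the text argues.
pairs_via_subscript = [(name, ages[name]) for name in ages]

# Explicit methods when items or values are wanted directly.
pairs = list(ages.items())
values = list(ages.values())
```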
@@ -309,6 +229,139 @@ Rationale
     {__getitem__, keys, values, items}.


+Open Issues
+
+    The following questions are still open.
+
+    - The name iter() is an abbreviation. Alternatives proposed
+      include iterate(), harp(), traverse(), narrate().
+
+    - Using the same name for two different operations (getting an
+      iterator from an object and making an iterator for a function
+      with a sentinel value) is somewhat ugly. I haven't seen a
+      better name for the second operation though.
+
+    - Once a particular iterator object has raised StopIteration, will
+      it also raise StopIteration on all subsequent next() calls?
+      Some say that it would be useful to require this, others say
+      that it is useful to leave this open to individual iterators.
+      Note that this may require an additional state bit for some
+      iterator implementations (e.g. function-wrapping iterators).
+
+    - Some folks have requested extensions of the iterator protocol,
+      e.g. prev() to get the previous item, current() to get the
+      current item again, finished() to test whether the iterator is
+      finished, and maybe even others, like rewind(), __len__(),
+      position().
+
+      While some of these are useful, many of these cannot easily be
+      implemented for all iterator types without adding arbitrary
+      buffering, and sometimes they can't be implemented at all (or
+      not reasonably). E.g. anything to do with reversing directions
+      can't be done when iterating over a file or function. Maybe a
+      separate PEP can be drafted to standardize the names for such
+      operations when they are implementable.
+
+    - There is still discussion about whether
+
+          for x in dict: ...
+
+      should assign x the successive keys, values, or items of the
+      dictionary. The symmetry between "if x in y" and "for x in y"
+      suggests that it should iterate over keys. This symmetry has been
+      observed by many independently and has even been used to "explain"
+      one using the other. This is because for sequences, "if x in y"
+      iterates over y comparing the iterated values to x. If we adopt
+      both of the above proposals, this will also hold for
+      dictionaries.
+
+      The argument against making "for x in dict" iterate over the keys
+      comes mostly from a practicality point of view: scans of the
+      standard library show that there are about as many uses of "for x
+      in dict.items()" as there are of "for x in dict.keys()", with the
+      items() version having a small majority. Presumably many of the
+      loops using keys() use the corresponding value anyway, by writing
+      dict[x], so (the argument goes) by making both the key and value
+      available, we could support the largest number of cases. While
+      this is true, I (Guido) find the correspondence between "for x in
+      dict" and "if x in dict" too compelling to break, and there's not
+      much overhead in having to write dict[x] to explicitly get the
+      value. We could also add methods to dictionaries that return
+      different kinds of iterators, e.g.
+
+          for key, value in dict.iteritems(): ...
+
+          for value in dict.itervalues(): ...
+
+          for key in dict.iterkeys(): ...
+
+
+Resolved Issues
+
+    The following topics have been decided by consensus or BDFL
+    pronouncement.
+
+    - Two alternative spellings for next() have been proposed but
+      rejected: __next__(), because it corresponds to a type object
+      slot (tp_iternext); and __call__(), because this is the only
+      operation.
+
+      Arguments against __next__(): while many iterators are used in
+      for loops, it is expected that user code will also call next()
+      directly, so having to write __next__() is ugly; also, a
+      possible extension of the protocol would be to allow for prev(),
+      current() and reset() operations; surely we don't want to use
+      __prev__(), __current__(), __reset__().
+
+      Arguments against __call__() (the original proposal): taken out
+      of context, x() is not very readable, while x.next() is clear;
+      there's a danger that every special-purpose object wants to use
+      __call__() for its most common operation, causing more confusion
+      than clarity.
+
+    - Some folks have requested the ability to restart an iterator.
+      This should be dealt with by calling iter() on a sequence
+      repeatedly, not by the iterator protocol itself.
+
+    - It has been questioned whether an exception to signal the end of
+      the iteration isn't too expensive. Several alternatives for the
+      StopIteration exception have been proposed: a special value End
+      to signal the end, a function end() to test whether the iterator
+      is finished, even reusing the IndexError exception.
+
+      - A special value has the problem that if a sequence ever
+        contains that special value, a loop over that sequence will
+        end prematurely without any warning. If the experience with
+        null-terminated C strings hasn't taught us the problems this
+        can cause, imagine the trouble a Python introspection tool
+        would have iterating over a list of all built-in names,
+        assuming that the special End value was a built-in name!
+
+      - Calling an end() function would require two calls per
+        iteration. Two calls is much more expensive than one call
+        plus a test for an exception. Especially the time-critical
+        for loop can test very cheaply for an exception.
+
+      - Reusing IndexError can cause confusion because it can be a
+        genuine error, which would be masked by ending the loop
+        prematurely.
+
+    - Some have asked for a standard iterator type. Presumably all
+      iterators would have to be derived from this type. But this is
+      not the Python way: dictionaries are mappings because they
+      support __getitem__() and a handful of other operations, not
+      because they are derived from an abstract mapping type.
+
+    - Regarding "if key in dict": there is no doubt that the
+      dict.has_key(x) interpretation of "x in dict" is by far the
+      most useful interpretation, probably the only useful one. There
+      has been resistance against this because "x in list" checks
+      whether x is present among the values, while the proposal makes
+      "x in dict" check whether x is present among the keys. Given
+      that the symmetry between lists and dictionaries is very weak,
+      this argument does not have much weight.
+
+
 Mailing Lists

     The iterator protocol has been discussed extensively in a mailing
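Reviewer note on the StopIteration item under "Resolved Issues" in the hunk above: a sketch of why an in-band end marker is fragile while an exception is not. The marker END and both helper functions are hypothetical and exist only for this illustration; the code assumes Python 3 and its built-in next().

```python
END = "End"  # hypothetical in-band end marker, not part of any real API


def take_until_marker(values):
    """Consume values, treating END as 'no more items'."""
    out = []
    for v in values:
        if v == END:
            break
        out.append(v)
    return out


def take_all(values):
    """Drive the iterator by hand; StopIteration is signalled out of band."""
    it = iter(values)
    out = []
    while True:
        try:
            out.append(next(it))
        except StopIteration:
            break
    return out


names = ["len", "End", "iter"]  # the marker happens to be real data
truncated = take_until_marker(names)
complete = take_all(names)
```

With the marker approach the loop stops silently at the second element; with the exception, no value in the data can be mistaken for the end of the iteration.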
@@ -321,6 +374,7 @@ Mailing Lists

         http://groups.yahoo.com/group/python-iter

+
 Copyright

     This document is in the public domain.