diff --git a/pep-0234.txt b/pep-0234.txt
index c0ac73a90..782ad4e39 100644
--- a/pep-0234.txt
+++ b/pep-0234.txt
@@ -97,22 +97,6 @@ C API Specification
     reference to themselves; this is needed to make it possible to
     use an iterator (as opposed to a sequence) in a for loop.
 
-    Discussion: should the next() method be renamed to __next__()?
-    Every other method corresponding to a tp_ slot has a
-    special name. On the other hand, this would suggest that there
-    should also be a primitive operation next(x) that would call
-    x.__next__(), and this just looks like adding complexity without
-    benefit. So I think it's better to stick with next(). On the
-    other hand, Marc-Andre Lemburg points out: "Even though .next()
-    reads better, I think that we should stick to the convention that
-    interpreter APIs use the __xxx__ naming scheme. Otherwise, people
-    will have a hard time differentiating between user-level protocols
-    and interpreter-level ones. AFAIK, .next() would be the first
-    low-level API not using this convention." My (BDFL's) response:
-    there are other important protocols with a user-level name
-    (e.g. keys()), and I don't see the importance of this particular
-    rule. BDFL pronouncement: this topic is closed. next() it is.
-
 
 Python API Specification
 
@@ -150,35 +134,6 @@ Python API Specification
     above. A class that wants to be an iterator also ought to
     implement __iter__() returning itself.
 
-    Discussion:
-
-    - The name iter() is an abbreviation. Alternatives proposed
-      include iterate(), harp(), traverse(), narrate().
-
-    - Using the same name for two different operations (getting an
-      iterator from an object and making an iterator for a function
-      with an sentinel value) is somewhat ugly. I haven't seen a
-      better name for the second operation though.
-
-    - There's a bit of undefined behavior for iterators: once a
-      particular iterator object has raised StopIteration, will it
-      also raise StopIteration on all subsequent next() calls? Some
-      say that it would be useful to require this, others say that it
-      is useful to leave this open to individual iterators. Note that
-      this may require an additional state bit for some iterator
-      implementations (e.g. function-wrapping iterators).
-
-    - Some folks have requested the ability to restart an iterator. I
-      believe this should be dealt with by calling iter() on a
-      sequence repeatedly, not by the iterator protocol itself.
-
-    - It was originally proposed that rather than having a next()
-      method, an iterator object should simply be callable. This was
-      rejected in favor of an explicit next() method. The reason is
-      clarity: if you don't know the code very well, "x = s()" does
-      not give a hint about what it does; but "x = s.next()" is pretty
-      clear. BDFL pronouncement: this topic is closed. next() it is.
-
 
 Dictionary Iterators
 
@@ -211,46 +166,11 @@ Dictionary Iterators
     as long as the restriction on modifications to the dictionary
     (either by the loop or by another thread) are not violated.
 
-    There is no doubt that the dict.has_keys(x) interpretation of "x
-    in dict" is by far the most useful interpretation, probably the
-    only useful one. There has been resistance against this because
-    "x in list" checks whether x is present among the values, while
-    the proposal makes "x in dict" check whether x is present among
-    the keys. Given that the symmetry between lists and dictionaries
-    is very weak, this argument does not have much weight.
-
-    The main discussion focuses on whether
-
-        for x in dict: ...
-
-    should assign x the successive keys, values, or items of the
-    dictionary. The symmetry between "if x in y" and "for x in y"
-    suggests that it should iterate over keys. This symmetry has been
-    observed by many independently and has even been used to "explain"
-    one using the other. This is because for sequences, "if x in y"
-    iterates over y comparing the iterated values to x. If we adopt
-    both of the above proposals, this will also hold for
-    dictionaries.
-
-    The argument against making "for x in dict" iterate over the keys
-    comes mostly from a practicality point of view: scans of the
-    standard library show that there are about as many uses of "for x
-    in dict.items()" as there are of "for x in dict.keys()", with the
-    items() version having a small majority. Presumably many of the
-    loops using keys() use the corresponding value anyway, by writing
-    dict[x], so (the argument goes) by making both the key and value
-    available, we could support the largest number of cases. While
-    this is true, I (Guido) find the correspondence between "for x in
-    dict" and "if x in dict" too compelling to break, and there's not
-    much overhead in having to write dict[x] to explicitly get the
-    value. We could also add methods to dictionaries that return
-    different kinds of iterators, e.g.
-
-        for key, value in dict.iteritems(): ...
-
-        for value in dict.itervalues(): ...
-
-        for key in dict.iterkeys(): ...
+    If this proposal is accepted, it makes sense to recommend that
+    other mappings, if they support iterators at all, should also
+    iterate over the keys. However, this should not be taken as an
+    absolute rule; specific applications may have different
+    requirements.
 
 
 File Iterators
@@ -309,6 +229,139 @@ Rationale
     {__getitem__, keys, values, items}.
 
 
+Open Issues
+
+    The following questions are still open.
+
+    - The name iter() is an abbreviation. Alternatives proposed
+      include iterate(), harp(), traverse(), narrate().
+
+    - Using the same name for two different operations (getting an
+      iterator from an object and making an iterator for a function
+      with a sentinel value) is somewhat ugly. I haven't seen a
+      better name for the second operation though.
+
+    - Once a particular iterator object has raised StopIteration, will
+      it also raise StopIteration on all subsequent next() calls?
+      Some say that it would be useful to require this, others say
+      that it is useful to leave this open to individual iterators.
+      Note that this may require an additional state bit for some
+      iterator implementations (e.g. function-wrapping iterators).
+
+    - Some folks have requested extensions of the iterator protocol,
+      e.g. prev() to get the previous item, current() to get the
+      current item again, finished() to test whether the iterator is
+      finished, and maybe even others, like rewind(), __len__(),
+      position().
+
+      While some of these are useful, many of these cannot easily be
+      implemented for all iterator types without adding arbitrary
+      buffering, and sometimes they can't be implemented at all (or
+      not reasonably). E.g. anything to do with reversing directions
+      can't be done when iterating over a file or function. Maybe a
+      separate PEP can be drafted to standardize the names for such
+      operations when they are implementable.
+
+    - There is still discussion about whether
+
+          for x in dict: ...
+
+      should assign x the successive keys, values, or items of the
+      dictionary. The symmetry between "if x in y" and "for x in y"
+      suggests that it should iterate over keys. This symmetry has been
+      observed by many independently and has even been used to "explain"
+      one using the other. This is because for sequences, "if x in y"
+      iterates over y comparing the iterated values to x. If we adopt
+      both of the above proposals, this will also hold for
+      dictionaries.
+
+      The argument against making "for x in dict" iterate over the keys
+      comes mostly from a practicality point of view: scans of the
+      standard library show that there are about as many uses of "for x
+      in dict.items()" as there are of "for x in dict.keys()", with the
+      items() version having a small majority. Presumably many of the
+      loops using keys() use the corresponding value anyway, by writing
+      dict[x], so (the argument goes) by making both the key and value
+      available, we could support the largest number of cases. While
+      this is true, I (Guido) find the correspondence between "for x in
+      dict" and "if x in dict" too compelling to break, and there's not
+      much overhead in having to write dict[x] to explicitly get the
+      value. We could also add methods to dictionaries that return
+      different kinds of iterators, e.g.
+
+          for key, value in dict.iteritems(): ...
+
+          for value in dict.itervalues(): ...
+
+          for key in dict.iterkeys(): ...
+
+
+Resolved Issues
+
+    The following topics have been decided by consensus or BDFL
+    pronouncement.
+
+    - Two alternative spellings for next() have been proposed but
+      rejected: __next__(), because it corresponds to a type object
+      slot (tp_iternext); and __call__(), because this is the only
+      operation.
+
+      Arguments against __next__(): while many iterators are used in
+      for loops, it is expected that user code will also call next()
+      directly, so having to write __next__() is ugly; also, a
+      possible extension of the protocol would be to allow for prev(),
+      current() and reset() operations; surely we don't want to use
+      __prev__(), __current__(), __reset__().
+
+      Arguments against __call__() (the original proposal): taken out
+      of context, x() is not very readable, while x.next() is clear;
+      there's a danger that every special-purpose object wants to use
+      __call__() for its most common operation, causing more confusion
+      than clarity.
+
+    - Some folks have requested the ability to restart an iterator.
+      This should be dealt with by calling iter() on a sequence
+      repeatedly, not by the iterator protocol itself.
+
+    - It has been questioned whether an exception to signal the end of
+      the iteration isn't too expensive. Several alternatives for the
+      StopIteration exception have been proposed: a special value End
+      to signal the end, a function end() to test whether the iterator
+      is finished, even reusing the IndexError exception.
+
+      - A special value has the problem that if a sequence ever
+        contains that special value, a loop over that sequence will
+        end prematurely without any warning. If the experience with
+        null-terminated C strings hasn't taught us the problems this
+        can cause, imagine the trouble a Python introspection tool
+        would have iterating over a list of all built-in names,
+        assuming that the special End value was a built-in name!
+
+      - Calling an end() function would require two calls per
+        iteration. Two calls is much more expensive than one call
+        plus a test for an exception. Especially the time-critical
+        for loop can test very cheaply for an exception.
+
+      - Reusing IndexError can cause confusion because it can be a
+        genuine error, which would be masked by ending the loop
+        prematurely.
+
+    - Some have asked for a standard iterator type. Presumably all
+      iterators would have to be derived from this type. But this is
+      not the Python way: dictionaries are mappings because they
+      support __getitem__() and a handful of other operations, not
+      because they are derived from an abstract mapping type.
+
+    - Regarding "if key in dict": there is no doubt that the
+      dict.has_key(x) interpretation of "x in dict" is by far the
+      most useful interpretation, probably the only useful one. There
+      has been resistance against this because "x in list" checks
+      whether x is present among the values, while the proposal makes
+      "x in dict" check whether x is present among the keys. Given
+      that the symmetry between lists and dictionaries is very weak,
+      this argument does not have much weight.
+
+
 Mailing Lists
 
     The iterator protocol has been discussed extensively in a mailing
@@ -321,6 +374,7 @@ Mailing Lists
 
         http://groups.yahoo.com/group/python-iter
 
+
 Copyright
 
     This document is in the public domain.