diff --git a/pep-0234.txt b/pep-0234.txt
index f406bd616..0352b2d4f 100644
--- a/pep-0234.txt
+++ b/pep-0234.txt
@@ -66,15 +66,16 @@ C API Specification
     - Some exception is set; this means an error occurred, and should be
       propagated normally.

-    In addition to the tp_iternext slot, every iterator object must
-    also implement a next() method, callable without arguments. This
-    should have the same semantics as the tp_iternext slot function,
-    except that the only way to signal the end of the iteration is to
-    raise StopIteration. The iterator object should not care whether
-    its tp_iternext slot function is called or its next() method, and
-    the caller may mix calls arbitrarily. (The next() method is for
-    the benefit of Python code using iterators directly; the
-    tp_iternext slot is added to make 'for' loops more efficient.)
+    Iterators implemented in C should *not* implement a next() method
+    with similar semantics as the tp_iternext slot! When the type's
+    dictionary is initialized (by PyType_Ready()), the presence of a
+    tp_iternext slot causes a method next() wrapping that slot to be
+    added to the type's tp_dict. (Exception: if the type doesn't use
+    PyObject_GenericGetAttr() to access instance attributes, the
+    next() method in the type's tp_dict may not be seen.) (Due to a
+    misunderstanding in the original text of this PEP, in Python 2.2,
+    all iterator types implemented a next() method that was overridden
+    by the wrapper; this has been fixed in Python 2.3.)

     To ensure binary backwards compatibility, a new flag
     Py_TPFLAGS_HAVE_ITER is added to the set of flags in the tp_flags
@@ -83,7 +84,8 @@ C API Specification
     PyIter_Check() tests whether an object has the appropriate flag
     set and has a non-NULL tp_iternext slot. There is no such macro
     for the tp_iter slot (since the only place where this slot is
-    referenced should be PyObject_GetIter()).
+    referenced should be PyObject_GetIter(), and this can check for
+    the Py_TPFLAGS_HAVE_ITER flag directly).

     (Note: the tp_iter slot can be present on any object; the
     tp_iternext slot should only be present on objects that act as
@@ -107,6 +109,14 @@ C API Specification
     reference to themselves; this is needed to make it possible to
     use an iterator (as opposed to a sequence) in a for loop.

+    Iterator implementations (in C or in Python) should guarantee that
+    once the iterator has signalled its exhaustion, subsequent calls
+    to tp_iternext or to the next() method will continue to do so. It
+    is not specified whether an iterator should enter the exhausted
+    state when an exception (other than StopIteration) is raised.
+    Note that Python cannot guarantee that user-defined or 3rd party
+    iterators implement this requirement correctly.
+

 Python API Specification

@@ -163,10 +173,6 @@ Python API Specification

 Dictionary Iterators

-    The following two proposals are somewhat controversial. They are
-    also independent from the main iterator implementation. However,
-    they are both very useful.
-
     - Dictionaries implement a sq_contains slot that implements the
       same test as the has_key() method. This means that we can write

@@ -204,8 +210,7 @@ Dictionary Iterators
       This means that "for x in dict" is shorthand for "for x in
       dict.iterkeys()".

-    If this proposal is accepted, it makes sense to recommend that
-    other mappings, if they support iterators at all, should also
+    Other mappings, if they support iterators at all, should also
     iterate over the keys. However, this should not be taken as an
     absolute rule; specific applications may have different
     requirements.
@@ -213,11 +218,9 @@ Dictionary Iterators

 File Iterators

-    The following proposal is not controversial, but should be
-    considered a separate step after introducing the iterator
-    framework described above. It is useful because it provides us
-    with a good answer to the complaint that the common idiom to
-    iterate over the lines of a file is ugly and slow.
+    The following proposal is useful because it provides us with a
+    good answer to the complaint that the common idiom to iterate over
+    the lines of a file is ugly and slow.

     - Files implement a tp_iter slot that is equivalent to
       iter(f.readline, ""). This means that we can write
@@ -245,6 +248,33 @@ File Iterators
     solutions don't work for all file types, e.g. they don't work when
     the open file object really represents a pipe or a stream socket.

+    Because the file iterator uses an internal buffer, mixing this
+    with other file operations (e.g. file.readline()) doesn't work
+    right. Also, the following code:
+
+        for line in file:
+            if line == "\n":
+                break
+        for line in file:
+            print line,
+
+    doesn't work as you might expect, because the iterator created by
+    the second for-loop doesn't take the buffer read-ahead by the
+    first for-loop into account. A correct way to write this is:
+
+        it = iter(file)
+        for line in it:
+            if line == "\n":
+                break
+        for line in it:
+            print line,
+
+    (The rationale for these restrictions is that "for line in file"
+    ought to become the recommended, standard way to iterate over the
+    lines of a file, and this should be as fast as can be. The
+    iterator version is considerably faster than calling readline(),
+    due to the internal buffer in the iterator.)
+

 Rationale

@@ -293,9 +323,15 @@ Resolved Issues
       __call__() for its most common operation, causing more
       confusion than clarity.

+      (In retrospect, it might have been better to go for __next__()
+      and have a new built-in, next(it), which calls it.__next__().
+      But alas, it's too late; this has been deployed in Python 2.2
+      since December 2001.)
+
     - Some folks have requested the ability to restart an iterator.
       This should be dealt with by calling iter() on a sequence
-      repeatedly, not by the iterator protocol itself.
+      repeatedly, not by the iterator protocol itself. (See also
+      requested extensions below.)

     - It has been questioned whether an exception to signal the end of
       the iteration isn't too expensive. Several alternatives for the
@@ -361,6 +397,11 @@ Resolved Issues
       Resolution: once StopIteration is raised, calling it.next()
       continues to raise StopIteration.

+      Note: this was in fact not implemented in Python 2.2; there are
+      many cases where an iterator's next() method can raise
+      StopIteration on one call but not on the next. This has been
+      remedied in Python 2.3.
+
     - It has been proposed that a file object should be its own
       iterator, with a next() method returning the next line. This
       has certain advantages, and makes it even clearer that this
@@ -368,7 +409,8 @@ Resolved Issues
       make it even more painful to implement the "sticky
       StopIteration" feature proposed in the previous bullet.

-      Resolution: this has been implemented.
+      Resolution: tentatively rejected (though there are still people
+      arguing for this).

     - Some folks have requested extensions of the iterator protocol,
       e.g. prev() to get the previous item, current() to get the
@@ -386,7 +428,7 @@ Resolved Issues

       Resolution: rejected.

-    - There is still discussion about whether
+    - There has been a long discussion about whether

           for x in dict: ...