Add a number of clarifications and updates.
parent 6c29d4df54
commit 9aee42f23e
90
pep-0234.txt
@@ -66,15 +66,16 @@ C API Specification
 - Some exception is set; this means an error occurred, and should
   be propagated normally.

-In addition to the tp_iternext slot, every iterator object must
-also implement a next() method, callable without arguments. This
-should have the same semantics as the tp_iternext slot function,
-except that the only way to signal the end of the iteration is to
-raise StopIteration. The iterator object should not care whether
-its tp_iternext slot function is called or its next() method, and
-the caller may mix calls arbitrarily. (The next() method is for
-the benefit of Python code using iterators directly; the
-tp_iternext slot is added to make 'for' loops more efficient.)
+Iterators implemented in C should *not* implement a next() method
+with similar semantics as the tp_iternext slot! When the type's
+dictionary is initialized (by PyType_Ready()), the presence of a
+tp_iternext slot causes a method next() wrapping that slot to be
+added to the type's tp_dict. (Exception: if the type doesn't use
+PyObject_GenericGetAttr() to access instance attributes, the
+next() method in the type's tp_dict may not be seen.) (Due to a
+misunderstanding in the original text of this PEP, in Python 2.2,
+all iterator types implemented a next() method that was overridden
+by the wrapper; this has been fixed in Python 2.3.)

 To ensure binary backwards compatibility, a new flag
 Py_TPFLAGS_HAVE_ITER is added to the set of flags in the tp_flags
@@ -83,7 +84,8 @@ C API Specification
 PyIter_Check() tests whether an object has the appropriate flag
 set and has a non-NULL tp_iternext slot. There is no such macro
 for the tp_iter slot (since the only place where this slot is
-referenced should be PyObject_GetIter()).
+referenced should be PyObject_GetIter(), and this can check for
+the Py_TPFLAGS_HAVE_ITER flag directly).

 (Note: the tp_iter slot can be present on any object; the
 tp_iternext slot should only be present on objects that act as
@@ -107,6 +109,14 @@ C API Specification
 reference to themselves; this is needed to make it possible to
 use an iterator (as opposed to a sequence) in a for loop.

+Iterator implementations (in C or in Python) should guarantee that
+once the iterator has signalled its exhaustion, subsequent calls
+to tp_iternext or to the next() method will continue to do so. It
+is not specified whether an iterator should enter the exhausted
+state when an exception (other than StopIteration) is raised.
+Note that Python cannot guarantee that user-defined or 3rd party
+iterators implement this requirement correctly.
+

 Python API Specification

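The exhaustion guarantee added in the hunk above can be sketched in Python. This is a minimal illustration, not code from the PEP: the class name StickyListIter and its _exhausted flag are hypothetical, and the method is spelled __next__() as in Python 3, where the PEP text says next(). Without the flag, appending to the underlying list after exhaustion would bring the iterator back to life.

```python
class StickyListIter:
    """Hypothetical list iterator that stays exhausted once it has finished."""

    def __init__(self, items):
        self._items = items
        self._index = 0
        self._exhausted = False  # set once, never cleared

    def __iter__(self):
        # Iterators return themselves so they can be used in a for loop.
        return self

    def __next__(self):  # Python 3 spelling; the PEP text calls this next()
        if self._exhausted or self._index >= len(self._items):
            # Remember exhaustion so that later growth of the list
            # cannot make the iterator produce values again.
            self._exhausted = True
            raise StopIteration
        value = self._items[self._index]
        self._index += 1
        return value


data = [1, 2]
it = StickyListIter(data)
first_pass = list(it)   # consumes both items and exhausts the iterator
data.append(3)          # the list grows after exhaustion
second_pass = list(it)  # the iterator keeps signalling exhaustion
```

With the flag in place, second_pass stays empty even though the list now holds a third item.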
@@ -163,10 +173,6 @@ Python API Specification

 Dictionary Iterators

-The following two proposals are somewhat controversial. They are
-also independent from the main iterator implementation. However,
-they are both very useful.
-
 - Dictionaries implement a sq_contains slot that implements the
   same test as the has_key() method. This means that we can write

@@ -204,8 +210,7 @@ Dictionary Iterators
 This means that "for x in dict" is shorthand for "for x in
 dict.iterkeys()".

-If this proposal is accepted, it makes sense to recommend that
-other mappings, if they support iterators at all, should also
+Other mappings, if they support iterators at all, should also
 iterate over the keys. However, this should not be taken as an
 absolute rule; specific applications may have different
 requirements.
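The recommendation in the hunk above, that other mappings iterate over their keys, might look like this for a user-defined mapping. This is only a sketch in Python 3 syntax; the class LowercaseKeyMapping and its internals are hypothetical and not part of the PEP.

```python
class LowercaseKeyMapping:
    """Hypothetical mapping that stores keys case-insensitively."""

    def __init__(self, pairs=()):
        self._data = {}
        for key, value in pairs:
            self[key] = value

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __contains__(self, key):
        # Same test as a key lookup, mirroring "key in dict".
        return key.lower() in self._data

    def __iter__(self):
        # Follow the dictionary convention: iterate over the keys.
        return iter(self._data)


headers = LowercaseKeyMapping([("Host", "example.org"), ("Accept", "text/plain")])
keys = sorted(headers)                 # iteration yields keys, not values
values = [headers[k] for k in keys]    # values are reached through the keys
```

Iterating over keys keeps "for x in mapping" and "x in mapping" consistent with each other, which is the behaviour the diff describes for dictionaries.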
@@ -213,11 +218,9 @@ Dictionary Iterators

 File Iterators

-The following proposal is not controversial, but should be
-considered a separate step after introducing the iterator
-framework described above. It is useful because it provides us
-with a good answer to the complaint that the common idiom to
-iterate over the lines of a file is ugly and slow.
+The following proposal is useful because it provides us with a
+good answer to the complaint that the common idiom to iterate over
+the lines of a file is ugly and slow.

 - Files implement a tp_iter slot that is equivalent to
   iter(f.readline, ""). This means that we can write
@@ -245,6 +248,33 @@ File Iterators
 solutions don't work for all file types, e.g. they don't work when
 the open file object really represents a pipe or a stream socket.

+Because the file iterator uses an internal buffer, mixing this
+with other file operations (e.g. file.readline()) doesn't work
+right. Also, the following code:
+
+    for line in file:
+        if line == "\n":
+            break
+    for line in file:
+        print line,
+
+doesn't work as you might expect, because the iterator created by
+the second for-loop doesn't take the buffer read-ahead by the
+first for-loop into account. A correct way to write this is:
+
+    it = iter(file)
+    for line in it:
+        if line == "\n":
+            break
+    for line in it:
+        print line,
+
+(The rationale for these restrictions is that "for line in file"
+ought to become the recommended, standard way to iterate over the
+lines of a file, and this should be as fast as can be. The
+iterator version is considerably faster than calling readline(),
+due to the internal buffer in the iterator.)
+

 Rationale

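The read-ahead pitfall described in the hunk above can be reproduced with a small model. This is a sketch in Python 3 under stated assumptions: BufferedLineIter, its readahead parameter, and the two helper functions are hypothetical stand-ins for the buffered file iterator the text describes, and io.StringIO stands in for a real file.

```python
import io


class BufferedLineIter:
    """Hypothetical line iterator that reads ahead into a private buffer."""

    def __init__(self, stream, readahead=4):
        self._stream = stream
        self._readahead = readahead
        self._buffer = []

    def __iter__(self):
        return self

    def __next__(self):
        if not self._buffer:
            # Refill: pull up to `readahead` lines from the stream at once.
            for _ in range(self._readahead):
                line = self._stream.readline()
                if not line:
                    break
                self._buffer.append(line)
        if not self._buffer:
            raise StopIteration
        return self._buffer.pop(0)


def body_with_two_iterators(stream):
    # Each loop builds its own iterator; lines buffered by the first are lost.
    for line in BufferedLineIter(stream):
        if line == "\n":
            break
    return list(BufferedLineIter(stream))


def body_with_one_iterator(stream):
    # Sharing one iterator keeps the read-ahead buffer across both loops.
    it = BufferedLineIter(stream)
    for line in it:
        if line == "\n":
            break
    return list(it)


TEXT = "header\n\nbody 1\nbody 2\nbody 3\nbody 4\n"
lost = body_with_two_iterators(io.StringIO(TEXT))
kept = body_with_one_iterator(io.StringIO(TEXT))
```

In this model the two-iterator version skips "body 1" and "body 2", because they were already sitting in the first iterator's buffer when the loop broke out; the single-iterator version returns all four body lines.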
@@ -293,9 +323,15 @@ Resolved Issues
 __call__() for its most common operation, causing more confusion
 than clarity.

+(In retrospect, it might have been better to go for __next__()
+and have a new built-in, next(it), which calls it.__next__().
+But alas, it's too late; this has been deployed in Python 2.2
+since December 2001.)
+
 - Some folks have requested the ability to restart an iterator.
   This should be dealt with by calling iter() on a sequence
-  repeatedly, not by the iterator protocol itself.
+  repeatedly, not by the iterator protocol itself. (See also
+  requested extensions below.)

 - It has been questioned whether an exception to signal the end of
   the iteration isn't too expensive. Several alternatives for the
@@ -361,6 +397,11 @@ Resolved Issues
 Resolution: once StopIteration is raised, calling it.next()
 continues to raise StopIteration.

+Note: this was in fact not implemented in Python 2.2; there are
+many cases where an iterator's next() method can raise
+StopIteration on one call but not on the next. This has been
+remedied in Python 2.3.
+
 - It has been proposed that a file object should be its own
   iterator, with a next() method returning the next line. This
   has certain advantages, and makes it even clearer that this
@@ -368,7 +409,8 @@ Resolved Issues
 make it even more painful to implement the "sticky
 StopIteration" feature proposed in the previous bullet.

-Resolution: this has been implemented.
+Resolution: tentatively rejected (though there are still people
+arguing for this).

 - Some folks have requested extensions of the iterator protocol,
   e.g. prev() to get the previous item, current() to get the
@@ -386,7 +428,7 @@ Resolved Issues

 Resolution: rejected.

-- There is still discussion about whether
+- There has been a long discussion about whether

     for x in dict: ...
