Split PEP 342 (Enhanced Iterators) off of PEP 340.

Guido van Rossum 2005-05-11 21:58:43 +00:00
parent cd4723bc31
commit 9cb17e5426
3 changed files with 265 additions and 209 deletions


@ -119,6 +119,7 @@ Index by Category
S 338 Executing modules inside packages with '-m' Coghlan
S 340 Anonymous Block Statements GvR
S 341 Unifying try-except and try-finally Birkenfeld
S 342 Enhanced Iterators GvR
S 754 IEEE 754 Floating Point Special Values Warnes
Finished PEPs (done, implemented in CVS)
@ -378,6 +379,7 @@ Numerical Index
I 339 How to Change CPython's Bytecode Cannon
S 340 Anonymous Block Statements GvR
S 341 Unifying try-except and try-finally Birkenfeld
S 342 Enhanced Iterators GvR
SR 666 Reject Foolish Indentation Creighton
S 754 IEEE 754 Floating Point Special Values Warnes
I 3000 Python 3.0 Plans Kuchling, Cannon


@ -12,30 +12,26 @@ Post-History:
Introduction
This PEP proposes a new type of compound statement which can be
used for resource management purposes, and a new iterator API to
go with it. The new statement type is provisionally called the
block-statement because the keyword to be used has not yet been
chosen.
used for resource management purposes. The new statement type
is provisionally called the block-statement because the keyword
to be used has not yet been chosen.
This PEP competes with several other PEPs: PEP 288 (Generators
Attributes and Exceptions; only the second part), PEP 310
(Reliable Acquisition/Release Pairs), and PEP 325
(Resource-Release Support for Generators).
I should clarify that there are a few separable proposals in this
PEP.
I should clarify that using a generator to "drive" a block
statement is really a separable proposal; with just the definition
of the block statement from the PEP you could implement all the
examples using a class (similar to example 6, which is easily
turned into a template). But the key idea is using a generator to
drive a block statement; the rest is elaboration, so I'd like to
keep these two parts together.
- Using "continue EXPR", which calls itr.__next__(EXPR) so that
EXPR becomes the return value of a yield-expression, is entirely
orthogonal to the rest of the PEP.
- Similarly, using a generator to "drive" a block statement is
also separable; with just the definition of the block statement
from the PEP you could implement all the examples using a class
(similar to example 6, which is easily turned into a template).
But the key idea is using a generator to drive a block statement;
the rest is elaboration.
(PEP 342, Enhanced Iterators, was originally a part of this PEP;
but the two proposals are really independent and with Steven
Bethard's help I have moved it to a separate PEP.)
Motivation and Summary
@ -83,18 +79,6 @@ Use Cases
See the Examples section near the end.
Specification: the __next__() Method
A new method for iterators is proposed, called __next__(). It
takes one optional argument, which defaults to None. Calling the
__next__() method without argument or with None is equivalent to
using the old iterator API, next(). For backwards compatibility,
it is recommended that iterators also implement a next() method as
an alias for calling the __next__() method without an argument.
The argument to the __next__() method may be used by the iterator
as a hint on what to do next.
Specification: the __exit__() Method
An optional new method for iterators is proposed, called
@ -103,76 +87,6 @@ Specification: the __exit__() Method
traceback. If all three arguments are None, sys.exc_info() may be
consulted to provide suitable default values.
Specification: the next() Built-in Function
This is a built-in function defined as follows:
    def next(itr, arg=None):
        nxt = getattr(itr, "__next__", None)
        if nxt is not None:
            return nxt(arg)
        if arg is None:
            return itr.next()
        raise TypeError("next() with arg for old-style iterator")
This function is proposed because there is often a need to call
the next() method outside a for-loop; the new API, and the
backwards compatibility code, is too ugly to have to repeat in
user code.
Note that I'm not proposing a built-in function to call the
__exit__() method of an iterator. I don't expect that this will
be called much outside the block-statement.
Specification: a Change to the 'for' Loop
A small change in the translation of the for-loop is proposed.
The statement
    for VAR1 in EXPR1:
        BLOCK1
    else:
        BLOCK2
will be translated as follows:
    itr = iter(EXPR1)
    arg = None    # Set by "continue EXPR2", see below
    brk = False
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            brk = True
            break
        arg = None
        BLOCK1
    if brk:
        BLOCK2
(However, the variables 'itr' etc. are not user-visible and the
built-in names used cannot be overridden by the user.)
Specification: the Extended 'continue' Statement
In the translation of the for-loop, inside BLOCK1, the new syntax
continue EXPR2
is legal and is translated into
arg = EXPR2
continue
(Where 'arg' references the corresponding hidden variable from the
previous section.)
This is also the case in the body of the block-statement proposed
below.
EXPR2 may contain commas; "continue 1, 2, 3" is equivalent to
"continue (1, 2, 3)".
Specification: the Anonymous Block Statement
A new statement is proposed with the syntax
@ -206,16 +120,15 @@ Specification: the Anonymous Block Statement
parser, however, it is always a loop; break and continue return
transfer to the block's iterator (see below for details).
The translation is subtly different from the translation of a
for-loop: iter() is not called, so EXPR1 should already be an
iterator (not just an iterable); and the iterator is guaranteed to
be notified when the block-statement is left, regardless of
whether this is due to a break, return or exception:
The translation is subtly different from a for-loop: iter() is
not called, so EXPR1 should already be an iterator (not just an
iterable); and the iterator is guaranteed to be notified when
the block-statement is left, regardless of whether this is due
to a break, return or exception:
itr = EXPR1 # The iterator
ret = False # True if a return statement is active
val = None # Return value, if ret == True
arg = None # Argument to __next__() (value from continue)
exc = None # sys.exc_info() tuple if an exception is active
while True:
try:
@ -226,28 +139,23 @@ Specification: the Anonymous Block Statement
else:
raise exc[0], exc[1], exc[2]
else:
VAR1 = next(itr, arg) # May raise StopIteration
VAR1 = itr.next() # May raise StopIteration
except StopIteration:
if ret:
return val
break
try:
ret = False
val = arg = exc = None
val = exc = None
BLOCK1
except:
exc = sys.exc_info()
(Again, the variables and built-ins are hidden from the user.)
(However, the variables 'itr' etc. are not user-visible and the
built-in names used cannot be overridden by the user.)
Inside BLOCK1, the following special translations apply:
- "continue" and "continue EXPR2" are always legal; the latter is
translated as shown earlier:
arg = EXPR2
continue
- "break" is always legal; it is translated into:
exc = (StopIteration, None, None)
@ -261,26 +169,25 @@ Specification: the Anonymous Block Statement
val = EXPR3
continue
The net effect is that break, continue and return behave much the
same as if the block-statement were a for-loop, except that the
iterator gets a chance at resource cleanup before the
block-statement is left, through the optional __exit__() method.
The iterator also gets a chance if the block-statement is left
through raising an exception. If the iterator doesn't have an
__exit__() method, there is no difference with a for-loop (except
that a for-loop calls iter() on EXPR1).
The net effect is that break and return behave much the same as
if the block-statement were a for-loop, except that the iterator
gets a chance at resource cleanup before the block-statement is
left, through the optional __exit__() method. The iterator also
gets a chance if the block-statement is left through raising an
exception. If the iterator doesn't have an __exit__() method,
there is no difference with a for-loop (except that a for-loop
calls iter() on EXPR1).
Note that a yield-statement (or a yield-expression, see below) in
a block-statement is not treated differently. It suspends the
function containing the block *without* notifying the block's
iterator. The block's iterator is entirely unaware of this
yield, since the local control flow doesn't actually leave the
block. In other words, it is *not* like a break, continue or
return statement. When the loop that was resumed by the yield
calls next(), the block is resumed right after the yield. The
generator finalization semantics described below guarantee (within
the limitations of all finalization semantics) that the block will
be resumed eventually.
Note that a yield-statement in a block-statement is not treated
differently. It suspends the function containing the block
*without* notifying the block's iterator. The block's iterator is
entirely unaware of this yield, since the local control flow
doesn't actually leave the block. In other words, it is *not*
like a break or return statement. When the loop that was resumed
by the yield calls next(), the block is resumed right after the
yield. (See example 7 below.) The generator finalization
semantics described below guarantee (within the limitations of all
finalization semantics) that the block will be resumed eventually.
Unlike the for-loop, the block-statement does not have an
else-clause. I think it would be confusing, and emphasize the
@ -291,10 +198,7 @@ Specification: the Anonymous Block Statement
Specification: Generator Exit Handling
Generators will implement the new __next__() method API, as well
as the old argument-less next() method which becomes an alias for
calling __next__() without an argument. They will also implement
the new __exit__() method API.
Generators will implement the new __exit__() method API.
Generators will be allowed to have a yield statement inside a
try-finally statement.
@ -302,54 +206,17 @@ Specification: Generator Exit Handling
The expression argument to the yield-statement will become
optional (defaulting to None).
The yield-statement will be allowed to be used on the right-hand
side of an assignment; in that case it is referred to as
yield-expression. The value of this yield-expression is None
unless __next__() was called with an argument; see below.
A yield-expression must always be parenthesized except when it
occurs at the top-level expression on the right-hand side of an
assignment. So
x = yield 42
x = yield
x = 12 + (yield 42)
x = 12 + (yield)
foo(yield 42)
foo(yield)
are all legal, but
x = 12 + yield 42
x = 12 + yield
foo(yield 42, 12)
foo(yield, 12)
are all illegal. (Some of the edge cases are motivated by the
current legality of "yield 12, 42".)
When __exit__() is called, the generator is resumed but at the
point of the yield-statement or -expression the exception
represented by the __exit__ argument(s) is raised. The generator
may re-raise this exception, raise another exception, or yield
another value, except that if the exception passed in to
__exit__() was StopIteration, it ought to raise StopIteration
(otherwise the effect would be that a break is turned into
continue, which is unexpected at least). When the *initial* call
resuming the generator is an __exit__() call instead of a
__next__() call, the generator's execution is aborted and the
exception is re-raised without passing control to the generator's
body.
When __next__() is called with an argument that is not None, the
yield-expression that it resumes will return the argument. If it
resumes a yield-statement, the value is ignored (this is similar
to ignoring the value returned by a function call). When the
*initial* call to __next__() receives an argument that is not
None, TypeError is raised; this is likely caused by some logic
error. When __next__() is called without an argument or with None
as argument, and a yield-expression is resumed, the
yield-expression returns None.
point of the yield-statement the exception represented by the
__exit__ argument(s) is raised. The generator may re-raise this
exception, raise another exception, or yield another value,
except that if the exception passed in to __exit__() was
StopIteration, it ought to raise StopIteration (otherwise the
effect would be that a break is turned into continue, which is
unexpected at least). When the *initial* call resuming the
generator is an __exit__() call instead of a next() call, the
generator's execution is aborted and the exception is re-raised
without passing control to the generator's body.
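The behaviour described above can be tried out in released Python,
with one loud caveat: generators there have no __exit__() method.
The sketch below uses generator.throw() as the closest shipped
analogue, and the names guarded() and log are hypothetical.  It
shows the exception being raised at the point of the yield (so the
finally clause runs), and an exception thrown into a generator that
was never started being re-raised without running the body.

```python
def guarded(log):
    """Generator that records acquire/release around a single yield."""
    log.append("acquire")
    try:
        yield "resource"
    finally:
        # Runs when an exception is raised at the point of the yield.
        log.append("release")


# Case 1: the generator is suspended at the yield when the exception arrives.
started_log = []
gen = guarded(started_log)
resource = next(gen)
try:
    gen.throw(KeyError("boom"))   # stand-in for the draft's __exit__() call
    propagated = False
except KeyError:
    propagated = True             # the generator let the exception through

# Case 2: the very first call is the exception; the body never runs.
fresh_log = []
fresh = guarded(fresh_log)
try:
    fresh.throw(ValueError("too early"))
    fresh_raised = False
except ValueError:
    fresh_raised = True
```

In the first case the log ends up as ["acquire", "release"]; in the
second it stays empty, since control never reaches the body.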
When a generator that has not yet terminated is garbage-collected
(either through reference counting or by the cyclical garbage
@ -363,16 +230,6 @@ Specification: Generator Exit Handling
is no different than the guarantees that are made about finalizers
(__del__() methods) of other objects.
Note: the syntactic extensions to yield make its use very similar
to that in Ruby. This is intentional. Do note that in Python the
block passes a value to the generator using "continue EXPR" rather
than "return EXPR", and the underlying mechanism whereby control
is passed between the generator and the block is completely
different. Blocks in Python are not compiled into thunks; rather,
yield suspends execution of the generator's frame. Some edge
cases work differently; in Python, you cannot save the block for
later use, and you cannot test whether there is a block or not.
Alternatives Considered and Rejected
- Many alternatives have been proposed for 'block'. I haven't
@ -654,26 +511,23 @@ Examples
8. A variant on opening() that also returns an error condition:
def opening_w_error(filename, mode="r"):
try:
f = open(filename, mode)
except IOError, err:
yield None, err
else:
try:
f = open(filename, mode)
except IOError, err:
yield None, err
else:
try:
yield f, None
finally:
f.close()
yield f, None
finally:
f.close()
Used as follows:
block opening_w_error("/etc/passwd", "a") as f, err:
if err:
print "IOError:", err
else:
f.write("guido::0:0::/:/bin/sh\n")
9. More examples are needed: showing "continue EXPR", and the use
of continue, break and return in a block-statement.
print "IOError:", err
else:
f.write("guido::0:0::/:/bin/sh\n")
Acknowledgements
@ -695,7 +549,6 @@ References
[3] http://effbot.org/zone/asyncore-generators.htm
Copyright
This document has been placed in the public domain.

pep-0342.txt Normal file

@ -0,0 +1,201 @@
PEP: 342
Title: Enhanced Iterators
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 10-May-2005
Post-History:
Introduction
This PEP proposes a new iterator API that allows values to be
passed into an iterator using "continue EXPR". These values are
received in the iterator as an argument to the new __next__
method, and can be accessed in a generator with a
yield-expression.
The content of this PEP is derived from the original content of
PEP 340, broken off into its own PEP because the new iterator API
is largely orthogonal to the anonymous block statement
discussion. Thanks to Steven Bethard for doing the editing.
Motivation and Summary
TBD.
Use Cases
See the Examples section near the end.
Specification: the __next__() Method
A new method for iterators is proposed, called __next__(). It
takes one optional argument, which defaults to None. Calling the
__next__() method without argument or with None is equivalent to
using the old iterator API, next(). For backwards compatibility,
it is recommended that iterators also implement a next() method as
an alias for calling the __next__() method without an argument.
The argument to the __next__() method may be used by the iterator
as a hint on what to do next.
Specification: the next() Built-in Function
This is a built-in function defined as follows:
    def next(itr, arg=None):
        nxt = getattr(itr, "__next__", None)
        if nxt is not None:
            return nxt(arg)
        if arg is None:
            return itr.next()
        raise TypeError("next() with arg for old-style iterator")
This function is proposed because there is often a need to call
the next() method outside a for-loop; the new API, and the
backwards compatibility code, is too ugly to have to repeat in
user code.
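A minimal runnable sketch of this dispatch follows.  It is named
pep_next() (a hypothetical name) so that it does not shadow the
next() built-in of released Python, which has a different signature.
The Echo and OldCounter classes are likewise invented for
illustration: one exposes the proposed __next__(arg), the other only
the old argument-less next().

```python
def pep_next(itr, arg=None):
    """Sketch of the next() built-in as proposed in this draft."""
    nxt = getattr(itr, "__next__", None)
    if nxt is not None:
        return nxt(arg)              # new-style: always pass the hint
    if arg is None:
        return itr.next()            # old-style: argument-less call only
    raise TypeError("next() with arg for old-style iterator")


class Echo:
    """New-style iterator: __next__ accepts an optional hint."""

    def __next__(self, arg=None):
        return "nothing" if arg is None else "got %r" % (arg,)


class OldCounter:
    """Old-style iterator: only an argument-less next() method."""

    def __init__(self):
        self.n = 0

    def next(self):
        self.n += 1
        return self.n


counter = OldCounter()
results = [pep_next(Echo()), pep_next(Echo(), 5),
           pep_next(counter), pep_next(counter)]

try:
    pep_next(counter, "hint")        # old-style iterator cannot take a hint
    rejected = False
except TypeError:
    rejected = True
```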
Specification: a Change to the 'for' Loop
A small change in the translation of the for-loop is proposed.
The statement
    for VAR1 in EXPR1:
        BLOCK1
    else:
        BLOCK2
will be translated as follows:
    itr = iter(EXPR1)
    arg = None    # Set by "continue EXPR2", see below
    brk = False
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            brk = True
            break
        arg = None
        BLOCK1
    if brk:
        BLOCK2
(However, the variables 'itr' etc. are not user-visible and the
built-in names used cannot be overridden by the user.)
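The translation can be emulated with ordinary functions, as a rough
sketch.  Everything here is an assumption made for illustration:
translated_for() and Accumulator are hypothetical names, BLOCK1 is
modelled as a callable whose return value plays the role of EXPR2 in
"continue EXPR2", and the iterator is assumed to be new-style, so
__next__(arg) is called directly instead of going through the
proposed next() built-in.  A "break" in the body is not modelled.

```python
class Accumulator:
    """Hypothetical iterator that adds each hint to a running total."""

    def __init__(self, limit):
        self.total = 0
        self.limit = limit

    def __iter__(self):
        return self

    def __next__(self, arg=None):
        if arg is not None:
            self.total += arg        # value passed in by "continue EXPR2"
        if self.total >= self.limit:
            raise StopIteration
        return self.total


def translated_for(expr1, block1, block2=None):
    """Emulate the proposed for-loop translation."""
    itr = iter(expr1)
    arg = None
    brk = False                      # same flag name as in the draft
    while True:
        try:
            var1 = itr.__next__(arg) # stand-in for next(itr, arg)
        except StopIteration:
            brk = True
            break
        arg = None                   # reset before running the body
        arg = block1(var1)           # body's result models "continue EXPR2"
    if brk and block2 is not None:
        block2()                     # the else-clause


seen = []
finished = []


def body(value):
    seen.append(value)
    return 4                         # models "continue 4"


translated_for(Accumulator(10), body, lambda: finished.append(True))
```

With a limit of 10 and "continue 4" on every pass, the body sees 0,
4 and 8 before the iterator raises StopIteration and the else-clause
runs.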
Specification: the Extended 'continue' Statement
In the translation of the for-loop, inside BLOCK1, the new syntax
continue EXPR2
is legal and is translated into
    arg = EXPR2
    continue
(Where 'arg' references the corresponding hidden variable from the
previous section.)
This is also the case in the body of the block-statement proposed
in PEP 340.
EXPR2 may contain commas; "continue 1, 2, 3" is equivalent to
"continue (1, 2, 3)".
Specification: Generators and Yield-Expressions
Generators will implement the new __next__() method API, as well
as the old argument-less next() method which becomes an alias for
calling __next__() without an argument.
The yield-statement will be allowed to be used on the right-hand
side of an assignment; in that case it is referred to as
yield-expression. The value of this yield-expression is None
unless __next__() was called with an argument; see below.
A yield-expression must always be parenthesized except when it
occurs at the top-level expression on the right-hand side of an
assignment. So
x = yield 42
x = yield
x = 12 + (yield 42)
x = 12 + (yield)
foo(yield 42)
foo(yield)
are all legal, but
x = 12 + yield 42
x = 12 + yield
foo(yield 42, 12)
foo(yield, 12)
are all illegal. (Some of the edge cases are motivated by the
current legality of "yield 12, 42".)
When __next__() is called with an argument that is not None, the
yield-expression that it resumes will return the argument. If it
resumes a yield-statement, the value is ignored (this is similar
to ignoring the value returned by a function call). When the
*initial* call to __next__() receives an argument that is not
None, TypeError is raised; this is likely caused by some logic
error. When __next__() is called without an argument or with None
as argument, and a yield-expression is resumed, the
yield-expression returns None.
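These rules can be exercised in released Python, with the caveat
that generators there do not accept an argument to __next__().  The
sketch below assumes generator.send(value) as a stand-in for the
draft's __next__(value); running_average() is a hypothetical
example.  It shows a yield-expression returning the passed-in value,
returning None on a plain next() call, and a TypeError when the
initial call carries a non-None value.

```python
def running_average():
    """Yield the average of all values passed in so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average        # yield-expression: receives the argument
        if value is None:
            continue                 # resumed without an argument
        total += value
        count += 1
        average = total / count


gen = running_average()
first = next(gen)                    # initial call must not carry a value
after_10 = gen.send(10)              # stand-in for __next__(10)
after_20 = gen.send(20)
unchanged = next(gen)                # yield-expression returns None here

fresh = running_average()
try:
    fresh.send(1)                    # non-None value on the initial call
    initial_rejected = False
except TypeError:
    initial_rejected = True
```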
Note: the syntactic extensions to yield make its use very similar
to that in Ruby. This is intentional. Do note that in Python the
block passes a value to the generator using "continue EXPR" rather
than "return EXPR", and the underlying mechanism whereby control
is passed between the generator and the block is completely
different. Blocks in Python are not compiled into thunks; rather,
yield suspends execution of the generator's frame. Some edge
cases work differently; in Python, you cannot save the block for
later use, and you cannot test whether there is a block or not.
Alternative
An alternative proposal is still under consideration, where
instead of adding a __next__() method, the existing next() method
is given an optional argument. The next() built-in function is
then unnecessary. The only line that changes in the translation is
the line
VAR1 = next(itr, arg)
which will be replaced by this
    if arg is None:
        VAR1 = itr.next()
    else:
        VAR1 = itr.next(arg)
If "continue EXPR2" is used and EXPR2 does not evaluate to None,
and the iterator's next() method does not support the optional
argument, a TypeError exception will be raised, which is the same
behavior as above.
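As a small sketch of this alternative, the replacement lines can be
wrapped in a helper.  The names advance(), Plain and Hinted are
hypothetical; Plain has a next() method without the optional
argument and Hinted has one with it.

```python
def advance(itr, arg=None):
    """Emulate the alternative translation of 'VAR1 = next(itr, arg)'."""
    if arg is None:
        return itr.next()
    return itr.next(arg)             # TypeError if next() takes no argument


class Plain:
    """Iterator whose next() does not support the optional argument."""

    def next(self):
        return "plain"


class Hinted:
    """Iterator whose next() accepts the optional argument."""

    def next(self, arg=None):
        return arg


plain_value = advance(Plain())
hinted_value = advance(Hinted(), 42)

try:
    advance(Plain(), 42)             # "continue 42" with an unaware iterator
    unsupported = False
except TypeError:
    unsupported = True
```

Here the TypeError comes from the ordinary argument check on the
method call, so no extra compatibility code is needed.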
This proposal is more compatible (no new method name, no new
built-in needed) but less future-proof; in some sense it was a
mistake to call this method next() instead of __next__(), since
*all* other operations corresponding to function pointers in the C
type structure have names with leading and trailing underscores.
Acknowledgements
See Acknowledgements of PEP 340.
References
TBD.
Copyright
This document has been placed in the public domain.