PEP: 3142
Title: Add a "while" clause to generator expressions
Version: $Revision$
Last-Modified: $Date$
Author: Gerald Britton <gerald.britton@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 12-Jan-2009
Python-Version: 3.0
Post-History:


Abstract

   This PEP proposes an enhancement to generator expressions, adding a
   "while" clause to complement the existing "if" clause.


Rationale

   A generator expression (PEP 289 [1]) is a concise method to serve
   dynamically-generated objects to list comprehensions (PEP 202 [2]).
   Current generator expressions allow for an "if" clause to filter
   the objects that are returned to those meeting some set of
   criteria.  However, since the "if" clause is evaluated for every
   object that may be returned, in some cases it is possible that all
   objects would be rejected after a certain point.  For example:

       g = (n for n in range(100) if n*n < 50)

   which is equivalent to the using a generator function
   (PEP 255 [3]):

       def __gen(exp):
           for n in exp:
               if n*n < 50:
                   yield n
       g = __gen(iter(range(10)))

   would yield 0, 1, 2, 3, 4, 5, 6 and 7, but would also consider
   the numbers from 8 to 99 and reject them all since n*n >= 50 for
   numbers in that range.  Allowing for a "while" clause would allow
   the redundant tests to be short-circuited:

       g = (n for n in range(100) while n*n < 50)

   would also yield 0, 1, 2, 3, 4, 5, 6 and 7, but would stop at 8
   since the condition (n*n < 50) is no longer true.  This would be
   equivalent to the generator function:

       def __gen(exp):
           for n in exp:
               if n*n < 50:
                   yield n
               else:
                   break
       g = __gen(iter(range(100)))

   Currently, in order to achieve the same result, one would need to
   either write a generator function such as the one above or use the
   takewhile function from itertools:

       from itertools import takewhile
       g = takewhile(lambda n: n*n < 50, range(100))

   The takewhile code achieves the same result as the proposed syntax,
   albeit in a longer (some would say "less-elegant") fashion.  Also,
   the takewhile version requires an extra function call (the lambda
   in the example above) with the associated performance penalty.
   A simple test shows that:

       for n in (n for n in range(100) if 1): pass

   performs about 10% better than:

       for n in takewhile(lambda n: 1, range(100)): pass

   though they achieve similar results.  (The first example uses a
   generator; takewhile is an iterator).  If similarly implemented,
   a "while" clause should perform about the same as the "if" clause
   does today.

   The reader may ask if the "if" and "while" clauses should be
   mutually exclusive.  There are good examples that show that there
   are times when both may be used to good advantage. For example:

       p = (p for p in primes() if p > 100 while p < 1000)

   should return prime numbers found between 100 and 1000, assuming
   I have a primes() generator that yields prime numbers.

   Adding a "while" clause to generator expressions maintains the
   compact form while adding a useful facility for short-circuiting
   the expression.


Acknowledgements

   Raymond Hettinger first proposed the concept of generator
   expressions in January 2002.


References

   [1] PEP 289: Generator Expressions
       http://www.python.org/dev/peps/pep-0289/

   [2] PEP 202: List Comprehensions
       http://www.python.org/dev/peps/pep-0202/

   [3] PEP 255: Simple Generators
       http://www.python.org/dev/peps/pep-0255/


Copyright

   This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: