PEP: 288
Title: Generators Attributes and Exceptions
Version: $Revision$
Last-Modified: $Date$
Author: python@rcn.com (Raymond D. Hettinger)
Status: Draft
Type: Standards Track
Created: 21-Mar-2002
Python-Version: 2.4
Post-History:


Abstract

    This PEP introduces ideas for enhancing the generators introduced
    in Python version 2.2 [1].  The goal is to increase the
    convenience, utility, and power of generators by providing a
    mechanism for passing data into a generator and for triggering
    exceptions inside a generator.

    These mechanisms were first proposed along with two other
    generator tools in PEP 279 [7].  They were split off into this
    separate PEP to allow the ideas more time to mature and for
    alternatives to be considered.  Subsequently, the argument passing
    idea gave way to Detlef Lannert's idea of using attributes.


Rationale

    Python 2.2 introduced the concept of an iterable interface as
    proposed in PEP 234 [2].  The iter() factory function was provided
    as a common calling convention, and deep changes were made to use
    iterators as a unifying theme throughout Python.  The unification
    came in the form of establishing a common iterable interface for
    mappings, sequences, and file objects.

    Generators, as proposed in PEP 255 [1], were introduced as a means
    for making it easier to create iterators, especially ones with
    complex internal states.

    The next step in the evolution of generators is to allow
    generators to accept attribute assignments.  This allows data to
    be passed in a standard Python fashion.

    A related evolutionary step is to add a generator method to enable
    exceptions to be passed to a generator.  Currently, there is no
    clean method for triggering exceptions from outside the generator.
    Also, generator exception passing helps mitigate the try/finally
    prohibition for generators.

    These suggestions are designed to take advantage of the existing
    implementation and require little additional effort to
    incorporate.  They are backwards compatible and require no new
    keywords.
    They are being recommended for Python version 2.4.


Specification for Generator Attributes

    Essentially, the proposal is to emulate attribute writing for
    classes.  The only wrinkle is that generators lack a way to refer
    to instances of themselves.  So, generators need an automatic
    instance variable, __self__.

    Here is a minimal example:

        def mygen():
            while True:
                print __self__.data
                yield None

        g = mygen()
        g.data = 1
        g.next()                # prints 1
        g.data = 2
        g.next()                # prints 2

    The control flow of 'yield' and 'next' is unchanged by this
    proposal.  The only change is that data can be sent into the
    generator.  By analogy, consider the quality improvement from
    GOSUB (which had no argument passing mechanism) to modern
    procedure calls (which can pass in arguments and return values).

    Most of the underlying machinery is already in place; only the
    __self__ variable needs to be added.

    Yield is more than just a simple iterator creator.  It does
    something else truly wonderful -- it suspends execution and saves
    state.  It is good for a lot more than writing iterators.  This
    proposal further taps its capabilities by making it easier to
    share data with the generator.

    The attribute mechanism is especially useful for:

        1. Sending data to any generator
        2. Writing lazy consumers with complex execution states
        3. Writing co-routines (as demonstrated in Dr. Mertz's
           articles [3])

    The proposal is a clear improvement over the existing alternative
    of passing data via global variables.  It is also much simpler,
    more readable, and easier to debug than an approach involving the
    threading module with its attendant mutexes, semaphores, and data
    queues.  A class-based approach competes well when there are no
    complex execution states or variable states.  However, when the
    complexity increases, generators with writable attributes are much
    simpler because they automatically save state (unlike classes,
    which must explicitly save the variable and execution state in
    instance variables).
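
    The attribute-passing idea above can be approximated on a current
    Python 3 interpreter without the proposed __self__ variable.  The
    sketch below is only an emulation under stated assumptions: the
    names AttrGenerator, running_total, and me are hypothetical and do
    not come from this PEP, and the wrapper object is passed to the
    generator function explicitly to stand in for __self__.

```python
class AttrGenerator:
    """Hypothetical wrapper: an iterator that accepts attribute writes."""

    def __init__(self, genfunc, *args, **kwargs):
        # Pass the wrapper itself as the first argument.  It plays the
        # role of the automatic __self__ variable proposed in the PEP.
        self._gen = genfunc(self, *args, **kwargs)

    def __iter__(self):
        return self

    def __next__(self):
        # Resume the underlying generator until its next yield.
        return next(self._gen)


def running_total(me):
    """Lazy consumer: adds each value assigned to me.data to a total."""
    total = 0
    while True:
        total += me.data        # read the data sent in by the caller
        yield total


g = AttrGenerator(running_total)
results = []
for value in (1, 2, 3):
    g.data = value              # send data in by attribute assignment
    results.append(next(g))     # let the consumer process it
```

    The local variable total survives between calls because the
    generator's frame is suspended rather than discarded, which is the
    state-saving property the paragraph above contrasts with classes.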


Examples

    Example of a Complex Consumer

        The encoder for arithmetic compression sends a series of
        fractional values to a complex, lazy consumer.  That consumer
        makes computations based on previous inputs and only writes
        out when certain conditions have been met.  After the last
        fraction is received, it has a procedure for flushing any
        unwritten data.


    Example of a Consumer Stream

        def filelike(packagename, appendOrOverwrite):
            data = []
            if appendOrOverwrite == 'w+':
                data.extend(packages[packagename])
            try:
                while True:
                    data.append(__self__.dat)
                    yield None
            except FlushStream:
                packages[packagename] = data

        ostream = filelike('mydest','w')
        ostream.dat = firstdat; ostream.next()
        ostream.dat = firstdat; ostream.next()
        ostream.throw(FlushStream)  # Throw is proposed below


    Example of a Complex Consumer

        Loop over the picture files in a directory, shrink them one at
        a time to thumbnail size using PIL [4], and send them to a
        lazy consumer.  That consumer is responsible for creating a
        large blank image, accepting thumbnails one at a time, and
        placing them in a 5 by 3 grid format onto the blank image.
        Whenever the grid is full, it writes out the large image as an
        index print.  A FlushStream exception indicates that no more
        thumbnails are available and that the partial index print
        should be written out if there are one or more thumbnails on
        it.


    Example of a Producer and Consumer Used Together in a Pipe-like
    Fashion

        'Analogy to Linux style pipes:  source | upper | sink'
        sink = sinkgen()
        for word in source():
            sink.data = word.upper()
            sink.next()


Initialization Mechanism

    If the attribute passing idea is accepted, Detlef Lannert further
    proposed that generator instances have attributes initialized to
    values in the generator's func_dict.  This makes it easy to set
    default values.
    For example:

        def mygen():
            while True:
                print __self__.data
                yield None

        mygen.data = 0
        g = mygen()             # g initialized with .data set to 0
        g.next()                # prints 0
        g.data = 1
        g.next()                # prints 1


Rejected Alternative

    One idea for passing data into a generator was to pass an argument
    through next() and make an assignment using the yield keyword:

        datain = yield dataout
        . . .
        dataout = gen.next(datain)

    The intractable problem is that the argument to the first next()
    call has to be thrown away, because it doesn't correspond to a
    yield keyword.


Specification for Generator Exception Passing:

    Add a .throw(exception) method to the generator interface:

        import time

        def logger():
            start = time.time()
            log = []
            try:
                while True:
                    log.append(time.time() - start)
                    yield log[-1]
            except WriteLog:
                writelog(log)

        g = logger()
        for i in [10,20,40,80,160]:
            testsuite(i)
            g.next()
        g.throw(WriteLog)

    There is no existing work-around for triggering an exception
    inside a generator.  It is the only case in Python where an
    exception cannot be raised into, or passed through, active code.

    Generator exception passing also helps address an intrinsic
    limitation on generators, the prohibition against their using
    try/finally to trigger clean-up code [1].  Without .throw(), the
    current work-around forces the resolution or clean-up code to be
    moved outside the generator.

    Note A:  The name of the throw method was selected for several
    reasons.  Raise is a keyword and so cannot be used as a method
    name.  Unlike raise, which immediately raises an exception from
    the current execution point, throw will first return to the
    generator and then raise the exception.  The word throw is
    suggestive of putting the exception in another location.  The word
    throw is already associated with exceptions in other languages.

    Alternative method names were considered: resolve(), signal(),
    genraise(), raiseinto(), and flush().  None of these seem to fit
    as well as throw().
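
    The logger example from the specification above can be tried on a
    current Python 3 interpreter, where generator objects provide a
    throw() method.  This is a sketch under stated assumptions, not
    the implementation this PEP proposes: WriteLog is defined here as
    a hypothetical exception class, and the list named written is a
    stand-in for the writelog() function that the PEP leaves
    undefined.

```python
import time


class WriteLog(Exception):
    """Hypothetical signal asking the logger to write out its log."""


written = []    # stand-in for the destination of the PEP's writelog()


def logger():
    start = time.time()
    log = []
    try:
        while True:
            log.append(time.time() - start)
            yield log[-1]
    except WriteLog:
        # The exception arrives at the suspended yield, so the
        # clean-up code can stay inside the generator.
        written.extend(log)


g = logger()
for _ in range(3):
    next(g)                     # record three elapsed-time entries

try:
    g.throw(WriteLog)           # raise WriteLog inside the generator
except StopIteration:
    # The generator handled WriteLog and then finished without
    # yielding again, so throw() reports exhaustion to the caller.
    pass
```

    In current Python the caller sees StopIteration when the generator
    ends after handling the thrown exception, which is why the final
    call is wrapped in try/except.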

    Note B:  The throw syntax should exactly match raise's syntax:

        throw([expression, [expression, [expression]]])

    Accordingly, it should be implemented to handle all of the
    following:

        raise string                    g.throw(string)
        raise string, data              g.throw(string,data)
        raise class, instance           g.throw(class,instance)
        raise instance                  g.throw(instance)
        raise                           g.throw()


    Comments from GvR:  I'm not convinced that the cleanup problem
    that this is trying to solve exists in practice.  I've never felt
    the need to put yield inside a try/except.  I think the PEP
    doesn't make enough of a case that this is useful.  This one gets
    a big fat -1 until there's a good motivational section.

    Comments from Ka-Ping Yee:  I agree that the exception issue needs
    to be resolved and [that] you have suggested a fine solution.

    Comments from Neil Schemenauer:  The exception passing idea is one
    I hadn't thought of before and looks interesting.  If we enable
    the passing of values back, then we should add this feature too.

    Comments from Magnus Lie Hetland:  Even though I cannot speak for
    the ease of implementation, I vote +1 for the exception passing
    mechanism.

    Comments from the Community:  The response has been mostly
    favorable.  One negative comment from GvR is shown above.  The
    other was from Martin von Loewis, who was concerned that it could
    be difficult to implement and is withholding his support until a
    working patch is available.  To probe Martin's comment, I checked
    with the implementers of the original generator PEP for an opinion
    on the ease of implementation.  They felt that implementation
    would be straightforward and could be grafted onto the existing
    implementation without disturbing its internals.

    Author response:  When the sole use of generators is to simplify
    writing iterators for lazy producers, then the odds of needing
    generator exception passing are slim.
    If, on the other hand, generators are used to write lazy
    consumers, create coroutines, generate output streams, or simply
    for their marvelous capability for restarting a previously frozen
    state, THEN the need to raise exceptions will come up frequently.


References

    [1] PEP 255 Simple Generators
        http://www.python.org/peps/pep-0255.html

    [2] PEP 234 Iterators
        http://www.python.org/peps/pep-0234.html

    [3] Dr. David Mertz's draft column for Charming Python.
        http://gnosis.cx/publish/programming/charming_python_b5.txt
        http://gnosis.cx/publish/programming/charming_python_b7.txt

    [4] PIL, the Python Imaging Library can be found at:
        http://www.pythonware.com/products/pil/


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: