PEP: 205 Title: Weak References Version: $Revision$ Owner: Fred L. Drake, Jr. Python-Version: 2.1 Status: Incomplete Type: Standards Track Post-History: Motivation There are two basic applications for weak references which have been noted by Python programmers: object caches and reduction of pain from circular references. Caches (weak dictionaries) There is a need to allow objects to be maintained to represent external state, mapping a single instance to the external reality, where allowing multiple instances to be mapped to the same external resource would create unnecessary difficulty maintaining synchronization among instances. In these cases, a common idiom is to support a cache of instances; a factory function is used to return either a new or existing instance. The difficulty in this approach is that one of two things must be tolerated: either the cache grows without bound, or there needs to be explicit management of the cache elsewhere in the application. The later can be very tedious and leads to more code than is really necessary to solve the problem at hand, and the former can be unacceptable for long-running processes or even relatively short processes with substantial memory requirements. - External objects that need to be represented by a single instance, no matter how many internal users there are. This can be useful for representing files that need to be written back to disk in whole rather than locked & modified for every use. - Objects which are expensive to create, but may be needed by multiple internal consumers. Similar to the first case, but not necessarily bound to external resources, and possibly not an issue for shared state. Weak references are only useful in this case if there is some flavor of "soft" references or if there is a high likelihood that users of individual objects will overlap in lifespan. Circular references - DOMs require a huge amount of circular (to parent & document nodes), but most of these aren't useful. Using weak references allows applications to hold onto less of the tree without a lot of difficulty. This might be especially useful in the context of something like xml.dom.pulldom. Weak References in Java http://java.sun.com/j2se/1.3/docs/api/java/lang/ref/package-summary.html Java provides three forms of weak references, and one interesting helper class. The three forms are called "weak", "soft", and "phantom" references. The relevant classes are defined in the java.lang.ref package. For each of the reference types, there is an option to add the reference to a queue when it is invalidated by the memory allocator. The primary purpose of this facility seems to be that it allows larger structures to be composed to incorporate weak-reference semantics without having to impose substantial additional locking requirements. For instance, it would not be difficult to use this facility to create a "weak" hash table which removes keys and referents when a reference is no longer used elsewhere. Using weak references for the objects without some sort of notification queue for invalidations leads to much more tedious implementation of the various operations required on hash tables. This can be a performance bottleneck if deallocations of the stored objects are infrequent. Java's "weak" references are most like Diane Hackborn's old vref proposal: a reference object refers to a single Python object, but does not own a reference to that object. When that object is deallocated, the reference object is invalidated. Users of the reference object can easily determine that the reference has been invalidated, or a NullObjectDereferenceError can be raised when an attempt is made to use the referred-to object. The "soft" references are similar, but are not invalidated as soon as all other references to the referred-to object have been released. The "soft" reference does own a reference, but allows the memory allocator to free the referent if the memory is needed elsewhere. It is not clear whether this means soft references are released before the malloc() implementation calls sbrk() or its equivalent, or if soft references are only cleared when malloc() returns NULL. XXX -- Need to figure out what phantom references are all about. Unlike the other two reference types, "phantom" references must be associated with an invalidation queue. Previous Weak Reference Work in Python Diane Hackborn's vref proposal. 'vref' objects were very similar to java.lang.ref.WeakReference objects, except there was no equivalent to the invalidation queues. Implementing a "weak dictionary" would be just as difficult as using only weak references (without the invalidation queue) in Java. Information on this has disappeared from the Web. Original discussion occurred in the comp.lang.python newsgroup; a good archive of that may turn up something more. Dianne doesn't have any record of her proposal, and doesn't recall doing an implementation. Marc-André Lemburg's mx.Proxy package. These Web pages appear to be unavailable at the moment. http://starship.python.net/crew/lemburg/ The weakdict module by Dieter Maurer is implemented in C and Python. It appears that the Web pages have not been updated since Python 1.5.2a, so I'm not yet sure if the implementation is compatible with Python 2.0. http://www.handshake.de/~dieter/weakdict.html Possible Applications PyGTK+ bindings? Tkinter? DOM trees? Proposed Implementation XXX -- Not yet. Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: