PEP: 205
Title: Weak References
Version: $Revision$
Owner: Fred L. Drake, Jr. <fdrake@acm.org>
Python-Version: 2.1
Status: Incomplete
Type: Standards Track
Post-History:

Motivation

There are two basic applications for weak references which have
been noted by Python programmers: object caches and reduction of
pain from circular references.

Caches (weak dictionaries)

Some objects are maintained to represent external state, with a
single instance mapped to each external resource. Allowing
multiple instances to be mapped to the same external resource
would create unnecessary difficulty maintaining synchronization
among instances. In these cases, a common idiom is to support a
cache of instances; a factory function is used to return either
a new or existing instance.

The difficulty in this approach is that one of two things must
be tolerated: either the cache grows without bound, or there
needs to be explicit management of the cache elsewhere in the
application. The latter can be very tedious and leads to more
code than is really necessary to solve the problem at hand,
and the former can be unacceptable for long-running processes
or even relatively short processes with substantial memory
requirements.

- External objects that need to be represented by a single
  instance, no matter how many internal users there are. This
  can be useful for representing files that need to be written
  back to disk in whole rather than locked & modified for
  every use.

- Objects which are expensive to create, but may be needed by
  multiple internal consumers. Similar to the first case, but
  not necessarily bound to external resources, and possibly
  not an issue for shared state. Weak references are only
  useful in this case if there is some flavor of "soft"
  references or if there is a high likelihood that users of
  individual objects will overlap in lifespan.

Circular references

- DOMs require a huge number of circular references (to parent &
  document nodes), but most of these aren't useful. Using weak
  references allows applications to hold onto less of the tree
  without a lot of difficulty. This might be especially
  useful in the context of something like xml.dom.pulldom.

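As a rough illustration of the tree case, the sketch below keeps
strong references from parent to child and a weak reference from
child to parent, so no reference cycle is formed. It assumes the
standard-library weakref module, which postdates this draft; the
Node class is hypothetical and is not an xml.dom interface.

```python
import weakref


class Node:
    """Hypothetical tree node whose parent link is a weak reference."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []      # strong references, parent -> child
        self._parent = None     # weak reference, child -> parent
        if parent is not None:
            self._parent = weakref.ref(parent)
            parent.children.append(self)

    @property
    def parent(self):
        # Calling the weak reference yields the parent, or None once
        # the parent has been deallocated.
        return self._parent() if self._parent is not None else None
```

An application that keeps only a child node therefore does not keep
the rest of the tree alive; its parent attribute simply becomes
None once the parent is gone.
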
Weak References in Java

http://java.sun.com/j2se/1.3/docs/api/java/lang/ref/package-summary.html

Java provides three forms of weak references, and one interesting
helper class. The three forms are called "weak", "soft", and
"phantom" references. The relevant classes are defined in the
java.lang.ref package.

For each of the reference types, there is an option to add the
reference to a queue when it is invalidated by the memory
allocator. The primary purpose of this facility seems to be that
it allows larger structures to be composed to incorporate
weak-reference semantics without having to impose substantial
additional locking requirements. For instance, it would not be
difficult to use this facility to create a "weak" hash table which
removes keys and referents when a reference is no longer used
elsewhere. Using weak references for the objects without some
sort of notification queue for invalidations leads to much more
tedious implementation of the various operations required on hash
tables. This can be a performance bottleneck if deallocations of
the stored objects are infrequent.

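To make the queue idea concrete, here is a small Python sketch of
a weak-value table that defers cleanup through an invalidation
queue. It is only an analogy to the Java facility, not a port of
it: it assumes the standard-library weakref module (which postdates
this draft) and uses weak reference callbacks to feed the queue.
The class name QueuedWeakValueTable and its methods are
hypothetical.

```python
import collections
import weakref


class QueuedWeakValueTable:
    """Toy weak-value table that defers cleanup through a queue."""

    def __init__(self):
        self._refs = {}                     # key -> weak reference
        self._dead = collections.deque()    # keys awaiting removal

    def __setitem__(self, key, value):
        self._drain()
        dead = self._dead
        # The callback only records the key; it never touches the
        # table itself, so table operations stay simple.
        self._refs[key] = weakref.ref(
            value, lambda ref, k=key: dead.append(k))

    def get(self, key, default=None):
        self._drain()
        ref = self._refs.get(key)
        value = ref() if ref is not None else None
        return default if value is None else value

    def __len__(self):
        self._drain()
        return len(self._refs)

    def _drain(self):
        # Remove entries whose referents have been invalidated.
        while self._dead:
            key = self._dead.popleft()
            ref = self._refs.get(key)
            if ref is not None and ref() is None:
                del self._refs[key]
```

Each table operation first drains the queue, so stale entries are
removed in bulk rather than by scanning every entry for dead
references.
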
Java's "weak" references are most like Dianne Hackborn's old vref
proposal: a reference object refers to a single Python object,
but does not own a reference to that object. When that object is
deallocated, the reference object is invalidated. Users of the
reference object can easily determine that the reference has been
invalidated, or a NullObjectDereferenceError can be raised when
an attempt is made to use the referred-to object.

The "soft" references are similar, but are not invalidated as soon
as all other references to the referred-to object have been
released. The "soft" reference does own a reference, but allows
the memory allocator to free the referent if the memory is needed
elsewhere. It is not clear whether this means soft references are
released before the malloc() implementation calls sbrk() or its
equivalent, or if soft references are only cleared when malloc()
returns NULL.

XXX -- Need to figure out what phantom references are all about.

Unlike the other two reference types, "phantom" references must be
associated with an invalidation queue.

Previous Weak Reference Work in Python

Dianne Hackborn proposed something called "virtual references".
'vref' objects were very similar to java.lang.ref.WeakReference
objects, except there was no equivalent to the invalidation
queues. Implementing a "weak dictionary" would be just as
difficult as using only weak references (without the invalidation
queue) in Java. Information on this has disappeared from the Web.
Original discussion occurred in the comp.lang.python newsgroup; a
good archive of that may turn up something more.

Marc-André Lemburg's mx.Proxy package. These Web pages appear to
be unavailable at the moment.

http://starship.python.net/crew/lemburg/

The weakdict module by Dieter Maurer is implemented in C and
Python. It appears that the Web pages have not been updated since
Python 1.5.2a, so I'm not yet sure if the implementation is
compatible with Python 2.0.

http://www.handshake.de/~dieter/weakdict.html

PyWeakReference by Alex Shindich:

http://sourceforge.net/projects/pyweakreference/

Possible Applications

PyGTK+ bindings?

Tkinter -- could avoid circular references by using weak
references from widgets to their parents. Objects won't be
discarded any sooner in the typical case, but there won't be so
much dependence on the programmer calling .destroy() before
releasing a reference. This would mostly benefit long-running
applications.

DOM trees?

Proposed Implementation

XXX -- Not yet.

Appendix -- Dianne Hackborn's vref proposal (1995)

[This has been indented and paragraphs reflowed, but there have been
no content changes. --Fred]

Proposal: Virtual References

In an attempt to partly address the recurring discussion
concerning reference counting vs. garbage collection, I would like
to propose an extension to Python which should help in the
creation of "well structured" cyclic graphs. In particular, it
should allow at least trees with parent back-pointers and
doubly-linked lists to be created without worry about cycles.

The basic mechanism I'd like to propose is that of a "virtual
reference," or a "vref" from here on out. A vref is essentially a
handle on an object that does not increment the object's reference
count. This means that holding a vref on an object will not keep
the object from being destroyed. This would allow the Python
programmer, for example, to create the aforementioned tree
structure, which is automatically destroyed when it
is no longer in use -- by making all of the parent back-references
into vrefs, they no longer create reference cycles which keep the
tree from being destroyed.

In order to implement this mechanism, the Python core must ensure
that no -real- pointers are ever left referencing objects that no
longer exist. The implementation I would like to propose involves
two basic additions to the current Python system:

1. A new "vref" type, through which the Python programmer creates
   and manipulates virtual references. Internally, it is
   basically a C-level Python object with a pointer to the Python
   object it is a reference to. Unlike all other Python code,
   however, it does not change the reference count of this object.
   In addition, it includes two pointers to implement a
   doubly-linked list, which is used below.

2. The addition of a new field to the basic Python object
   [PyObject_Head in object.h], which is either NULL, or points to
   the head of a list of all vref objects that reference it. When
   a vref object attaches itself to another object, it adds itself
   to this linked list. Then, if an object with any vrefs on it
   is deallocated, it may walk this list and ensure that all of
   the vrefs on it point to some safe value, e.g. Nothing.

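The bookkeeping in the two items above can be modelled in pure
Python. The following toy sketch is not part of the 1995 proposal
and is not how any Python release implements weak references: the
classes VRef and Target and the function simulate_dealloc are
hypothetical, ordinary Python attributes are strong references, and
"deallocation" is simulated by an explicit call. It shows only the
list manipulation.

```python
class Target:
    """Toy object carrying the extra per-object field from item 2."""

    def __init__(self):
        self.vref_head = None       # None plays the role of NULL


class VRef:
    """Toy virtual reference linked into its target's vref list."""

    def __init__(self, target):
        self.object = target
        self.prev = None
        self.next = target.vref_head    # link in at the head
        if self.next is not None:
            self.next.prev = self
        target.vref_head = self

    def detach(self):
        # Constant-time unlink; presumably one reason the proposal
        # makes the list doubly linked (an assumption).
        target = self.object
        if target is None:
            return
        if self.prev is None:
            target.vref_head = self.next
        else:
            self.prev.next = self.next
        if self.next is not None:
            self.next.prev = self.prev
        self.object = self.prev = self.next = None


def simulate_dealloc(target):
    """Walk the vref list and reset every vref to a safe value."""
    vref = target.vref_head
    while vref is not None:
        following = vref.next
        vref.object = None          # None stands in for "Nothing"
        vref.prev = vref.next = None
        vref = following
    target.vref_head = None
```

When no vrefs exist, the only work in simulate_dealloc is the check
that vref_head is None, which mirrors the NULL check the proposal
describes.
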
This implementation should hopefully have a minimal impact on the
current Python core -- when no vrefs exist, it should only add one
pointer to all objects, and a check for a NULL pointer every time
an object is deallocated.

Back at the Python language level, I have considered two possible
semantics for the vref object --

==> Pointer semantics:

In this model, a vref behaves essentially like a Python-level
pointer; the Python program must explicitly dereference the vref
to manipulate the actual object it references.

An example vref module using this model could include the
function "new"; when used as 'MyVref = vref.new(MyObject)', it
returns a new vref object such that MyVref.object ==
MyObject. MyVref.object would then change to Nothing if
MyObject is ever deallocated.

For a concrete example, we may introduce some new C-style syntax:

    & -- unary operator, creates a vref on an object, same as vref.new().
    * -- unary operator, dereference a vref, same as VrefObject.object.

We can then define:

    1. type(&MyObject) == vref.VrefType
    2. *(&MyObject) == MyObject
    3. (*(&MyObject)).attr == MyObject.attr
    4. &&MyObject == Nothing
    5. *MyObject -> exception

Rule #4 is subtle, but comes about because we have made a vref
to (a vref with no real references). Thus the outer vref is
cleared to Nothing when the inner one inevitably disappears.

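For comparison only, and not as part of the 1995 proposal, the
standard-library weakref module (which postdates this text)
behaves much like the pointer model: weakref.ref() plays the role
of the & operator, calling the reference plays the role of *, and
None takes the place of Nothing. The class MyClass below is a
hypothetical example object.

```python
import gc
import weakref


class MyClass:
    """Hypothetical class; any weak-referenceable object would do."""
    attr = 42


my_object = MyClass()
my_vref = weakref.ref(my_object)    # plays the role of &MyObject

assert type(my_vref) is weakref.ReferenceType   # compare rule 1
assert my_vref() is my_object                   # compare rule 2
assert my_vref().attr == my_object.attr         # compare rule 3

del my_object       # drop the only strong reference
gc.collect()        # not needed on CPython; harmless elsewhere
assert my_vref() is None    # None in place of "Nothing"
```

The explicit call makes every dereference visible in the code,
which is the burden on the programmer that the proxy model below
tries to remove.
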
==> Proxy semantics:

In this model, the Python programmer manipulates vref objects
just as if she were manipulating the object it is a reference
to. This is accomplished by implementing the vref so that all
operations on it are redirected to its referenced object. With
this model, the dereference operator (*) no longer makes sense;
instead, we have only the reference operator (&), and define:

    1. type(&MyObject) == type(MyObject)
    2. &MyObject == MyObject
    3. (&MyObject).attr == MyObject.attr
    4. &&MyObject == MyObject

Again, rule #4 is important -- here, the outer vref is in fact a
reference to the original object, and -not- the inner vref.
This is because all operations applied to a vref actually apply
to its object, so that creating a vref of a vref actually
results in creating a vref of the latter's object.

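Again for comparison only, and not as part of the 1995 proposal,
weakref.proxy() in the standard-library weakref module (which
postdates this text) comes close to the proxy model. The sketch
below shows one place where it agrees with the rules above and one
where it differs; MyClass is a hypothetical example object.

```python
import gc
import weakref


class MyClass:
    """Hypothetical class; any weak-referenceable object would do."""
    attr = 42


my_object = MyClass()
my_proxy = weakref.proxy(my_object)     # plays the role of &MyObject

assert my_proxy.attr == my_object.attr  # compare rule 3

# Unlike rule 1, the proxy keeps a type of its own.
assert type(my_proxy) is weakref.ProxyType
assert type(my_proxy) is not type(my_object)

del my_object       # drop the only strong reference
gc.collect()        # not needed on CPython; harmless elsewhere
try:
    my_proxy.attr
    outcome = "still alive"
except ReferenceError:
    outcome = "referent gone"
```

Using a proxy after its referent has gone raises ReferenceError
rather than yielding a placeholder value such as Nothing.
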
The first, pointer semantics, has the advantage that it would be
very easy to implement; the vref type is extremely simple,
requiring at minimum a single attribute, object, and a function to
create a reference.

However, I really like the proxy semantics. Not only does it put
less of a burden on the Python programmer, but it allows you to do
nice things like use a vref anywhere you would use the actual
object. Unfortunately, it would probably be an extreme pain, if not
practically impossible, to implement in the current Python
implementation. I do have some thoughts, though, on how to do
this, if it seems interesting; one possibility is to introduce new
type-checking functions which handle the vref. This would
hopefully allow older C modules which don't expect vrefs to simply
return a type error, until they can be fixed.

Finally, there are some other additional capabilities that this
system could provide. One that seems particularly interesting to
me involves allowing the Python programmer to add a "destructor"
function to a vref -- this Python function would be called
immediately prior to the referenced object being deallocated,
allowing a Python program to invisibly attach itself to another
object and watch for it to disappear. This seems neat, though I
haven't actually come up with any practical uses for it, yet... :)

-- Dianne

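The "destructor" idea in the appendix has a close counterpart in
the callback argument of weakref.ref() in the standard-library
weakref module, which postdates both the proposal and this draft.
The sketch below is a comparison, not part of the proposal; the
names Watched, on_gone and events are hypothetical.

```python
import gc
import weakref


class Watched:
    """Hypothetical object whose disappearance we want to observe."""


events = []


def on_gone(ref):
    # Invoked with the (now dead) weak reference as its only argument.
    events.append("referent gone")


target = Watched()
# The weak reference must itself stay alive for the callback to run.
watcher = weakref.ref(target, on_gone)

del target      # drop the only strong reference
gc.collect()    # not needed on CPython; harmless elsewhere
```

One difference from the proposal: by the time the callback runs,
the referent can no longer be reached through the weak reference,
so the callback observes the disappearance but cannot inspect the
object.
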
Copyright

This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: