PEP: 205
Title: Weak References
Version: $Revision$
Author: Fred L. Drake, Jr. <fdrake@acm.org>
Python-Version: 2.1
Status: Incomplete
Type: Standards Track
Post-History: 11 January 2001

Motivation

There are two basic applications for weak references which have
been noted by Python programmers: object caches and reduction of
pain from circular references.

Caches (weak dictionaries)

There is a need to allow objects to be maintained that represent
external state, mapping a single instance to the external
reality, where allowing multiple instances to be mapped to the
same external resource would create unnecessary difficulty
maintaining synchronization among instances. In these cases,
a common idiom is to support a cache of instances; a factory
function is used to return either a new or existing instance.
The difficulty in this approach is that one of two things must
be tolerated: either the cache grows without bound, or there
needs to be explicit management of the cache elsewhere in the
application. The latter can be very tedious and leads to more
code than is really necessary to solve the problem at hand,
and the former can be unacceptable for long-running processes
or even relatively short processes with substantial memory
requirements. (A sketch of the cache idiom appears after the
list below.)
- External objects that need to be represented by a single
instance, no matter how many internal users there are. This
can be useful for representing files that need to be written
back to disk in whole rather than locked & modified for
every use.
- Objects that are expensive to create, but may be needed by
multiple internal consumers. Similar to the first case, but
not necessarily bound to external resources, and possibly
not an issue for shared state. Weak references are only
useful in this case if there is some flavor of "soft"
references or if there is a high likelihood that users of
individual objects will overlap in lifespan.
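
As a non-normative illustration, the cache idiom described above
can be written with a weak dictionary holding the cached values.
The sketch below uses the weakref module as it eventually shipped
(weakref.WeakValueDictionary rather than the weakref.mapping()
spelling proposed later in this document); the ExternalFile and
get_file names are hypothetical.

    import weakref

    class ExternalFile:
        """Hypothetical wrapper around some external resource."""
        def __init__(self, path):
            self.path = path

    # Values are held weakly, so the cache never keeps an instance
    # alive by itself; an entry disappears when its last user lets go.
    _cache = weakref.WeakValueDictionary()

    def get_file(path):
        """Factory: return the existing instance for path, or create one."""
        obj = _cache.get(path)
        if obj is None:
            obj = ExternalFile(path)
            _cache[path] = obj
        return obj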

Circular references

- DOMs require a huge number of circular references (to parent &
document nodes), but these could be eliminated using a weak
dictionary mapping from each node to its parent. This
might be especially useful in the context of something like
xml.dom.pulldom, allowing the .unlink() operation to become
a no-op.
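
As a rough, non-normative sketch of the back-pointer idea, the
upward link can simply be held through a weak reference, written
here with weakref.ref as it eventually shipped; the Node class is
a hypothetical stand-in for a DOM node.

    import weakref

    class Node:
        """Hypothetical stand-in for a DOM node."""
        def __init__(self, parent=None):
            self.children = []
            # The upward link is weak, so parent and child do not form
            # a reference cycle; dropping the root releases the tree.
            self._parent = weakref.ref(parent) if parent is not None else None
            if parent is not None:
                parent.children.append(self)

        @property
        def parent(self):
            return self._parent() if self._parent is not None else None
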
This proposal is divided into the following sections:
- Proposed Solution
- Implementation Strategy
- Possible Applications
- Previous Weak Reference Work in Python
- Weak References in Java
The full text of one early proposal is included as an appendix
since it does not appear to be available on the net.

Proposed Solution

Weak references should be able to point to any Python object that
may have substantial memory size (directly or indirectly), or hold
references to external resources (database connections, open
files, etc.).
A new module, weakref, will contain new functions used to create
weak references. weakref.new() will create a "weak reference
object" and optionally attach a callback which will be called when
the object is about to be finalized. weakref.mapping() will
create a "weak dictionary". A third function, weakref.proxy(),
will create a proxy object that behaves somewhat like the original
object.
A weak reference object will allow access to the referenced object
if it hasn't been collected and to determine if the object still
exists in memory. These operations are accessed via the .get()
and .islive() methods. If the object has been collected, .get()
will return None and .islive() will return false. Weak references
are not hashable.
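
For orientation, the weakref module that eventually shipped spells
these operations a little differently: weakref.new() became
weakref.ref(), and the .get()/.islive() pair is expressed by simply
calling the reference object. A minimal sketch under that
assumption:

    import weakref

    class Thing:
        pass

    obj = Thing()
    ref = weakref.ref(obj)       # the role of weakref.new(obj)

    print(ref() is obj)          # like .get(): the referent, while alive
    print(ref() is not None)     # like .islive(): True while alive

    del obj                      # drop the last strong reference
    print(ref())                 # now None (immediately, on CPython)
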
A weak dictionary maps arbitrary keys to values, but does not own
a reference to the values. When a value is finalized, the
(key, value) pairs in which it appears as the value are removed from
all the mappings containing such pairs. These mapping objects provide
an additional method, .setitem(), which accepts a key, value, and
optional callback which will be called when the object is removed
from the mapping. If the object is removed from the mapping by
deleting the entry by key or calling the mapping's .clear()
method, the callback is not called (since the object is not about
to be finalized), and will be unregistered from the object. Like
dictionaries, weak dictionaries are not hashable.
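
A small sketch of the weak-dictionary behavior, again using the
shipped spelling (weakref.WeakValueDictionary); note that the
shipped type does not provide the per-item .setitem() callback
described above, so only the automatic removal is shown:

    import weakref

    class Value:
        pass

    d = weakref.WeakValueDictionary()
    v = Value()
    d["key"] = v        # the dictionary does not own a reference to v

    print("key" in d)   # True: v is still alive elsewhere
    del v               # the last strong reference goes away...
    print("key" in d)   # False: entry removed automatically (on CPython)
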
Proxy objects are weak references that attempt to behave like the
object they proxy, as much as they can. Regardless of the
underlying type, proxies are not hashable since their ability to
act as a weak reference relies on a fundamental mutability that
will cause failures when used as dictionary keys -- even if the
proper hash value is computed before the referent dies, the
resulting proxy cannot be used as a dictionary key since it cannot
be compared once the referent has expired, and comparability is
necessary for dictionary keys. Operations on proxy objects after
the referent dies cause weakref.ReferenceError to be raised in
most cases. "is" comparisons, type(), and id() will continue to
work, but always refer to the proxy and not the referent.
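
A brief sketch of proxy behavior using weakref.proxy as it shipped;
the class C here is purely illustrative:

    import weakref

    class C:
        attr = "value"

    obj = C()
    p = weakref.proxy(obj)

    print(p.attr)       # attribute access is forwarded to the referent
    print(type(p))      # the proxy's own type, not type(obj)

    del obj             # the referent expires
    try:
        p.attr
    except weakref.ReferenceError:
        print("operations on the dead proxy raise ReferenceError")
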
The callbacks registered with weak references must accept a single
parameter, which will be the weakly referenced object itself.
The object can be resurrected by creating some other reference to
the object in the callback, in which case the weak reference
generating the callback will still be cleared but no remaining
weak references to the object will be cleared.
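
For comparison, in the weakref module as it eventually shipped the
callback receives the weak reference object itself (already
cleared) rather than the dying referent; a minimal sketch of
registering such a callback:

    import weakref

    class Thing:
        pass

    def on_collect(ref):
        # Note: in the module as shipped, the callback receives the weak
        # reference object (already cleared), not the referent itself.
        print("referent collected; ref() is now", ref())

    obj = Thing()
    r = weakref.ref(obj, on_collect)
    del obj             # on CPython the callback runs here, printing None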

Implementation Strategy

The implementation of weak references will include a list of
reference containers that must be cleared for each weakly-
referencable object. If the reference is from a weak dictionary,
the dictionary entry is cleared first. Then, any associated
callback is called with the object passed as a parameter. If the
callback generates additional references, no further callbacks are
called and the object is resurrected. Once all callbacks have
been called and the object has not been resurrected, it is
finalized and deallocated.
Many built-in types will participate in the weak-reference
management, and any extension type can elect to do so. The type
structure will contain an additional field which provides an
offset into the instance structure which contains a list of weak
reference structures. If the value of the field is <= 0, the
object does not participate. In this case, weakref.new(),
<weakdict>.setitem(), .setdefault(), and item assignment will
raise TypeError. If the value of the field is > 0, a new weak
reference can be generated and added to the list.
This approach is taken to allow arbitrary extension types to
participate, without taking a memory hit for numbers or other
small types.
XXX -- Need to determine the exact list of standard types which
should support weak references. This should include instances and
dictionaries. Whether it should include strings, buffers, lists,
files, or sockets is debatable.
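
For reference, the participation rule is visible from Python in the
implementation that shipped: instances of ordinary classes can be
weakly referenced, while small built-in objects such as ints cannot
(the final set of supporting types differs somewhat from the open
question above). A quick sketch:

    import weakref

    class Record:
        pass

    weakref.ref(Record())     # instances of ordinary classes participate

    try:
        weakref.ref(42)       # small built-in objects such as ints do not
    except TypeError as exc:
        print(exc)            # "cannot create weak reference to 'int' object"

    try:
        weakref.ref({})       # in current CPython, plain dicts do not either
    except TypeError as exc:
        print(exc)
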
There is a preliminary patch on SourceForge:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

Possible Applications

PyGTK+ bindings?

Tkinter -- could avoid circular references by using weak
references from widgets to their parents. Objects won't be
discarded any sooner in the typical case, but there won't be so
much dependence on the programmer calling .destroy() before
releasing a reference. This would mostly benefit long-running
applications.

DOM trees.

Previous Weak Reference Work in Python

Dianne Hackborn has proposed something called "virtual references".
'vref' objects are very similar to java.lang.ref.WeakReference
objects, except there is no equivalent to the invalidation
queues. Implementing a "weak dictionary" would be just as
difficult as using only weak references (without the invalidation
queue) in Java. Information on this has disappeared from the Web,
but is included below as an Appendix.

Marc-André Lemburg's mx.Proxy package:
http://www.lemburg.com/files/python/mxProxy.html

The weakdict module by Dieter Maurer is implemented in C and
Python. It appears that the Web pages have not been updated since
Python 1.5.2a, so I'm not yet sure if the implementation is
compatible with Python 2.0.
http://www.handshake.de/~dieter/weakdict.html

PyWeakReference by Alex Shindich:
http://sourceforge.net/projects/pyweakreference/

Eric Tiedemann has a weak dictionary implementation:
http://www.hyperreal.org/~est/python/weak/

Weak References in Java

http://java.sun.com/j2se/1.3/docs/api/java/lang/ref/package-summary.html

Java provides three forms of weak references, and one interesting
helper class. The three forms are called "weak", "soft", and
"phantom" references. The relevant classes are defined in the
java.lang.ref package.
For each of the reference types, there is an option to add the
reference to a queue when it is invalidated by the memory
allocator. The primary purpose of this facility seems to be that
it allows larger structures to be composed to incorporate
weak-reference semantics without having to impose substantial
additional locking requirements. For instance, it would not be
difficult to use this facility to create a "weak" hash table which
removes keys and referents when a reference is no longer used
elsewhere. Using weak references for the objects without some
sort of notification queue for invalidations leads to much more
tedious implementation of the various operations required on hash
tables. This can be a performance bottleneck if deallocations of
the stored objects are infrequent.
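
The queue facility itself is not part of this proposal, but a rough
Python analogue can be assembled from the callback mechanism
described earlier (using the shipped weakref.ref spelling): dead
references are appended to a deque that the application drains in
batches. The Image class below is hypothetical.

    import collections
    import weakref

    class Image:
        """Hypothetical expensive object."""

    dead_refs = collections.deque()          # plays the role of Java's queue

    img = Image()
    r = weakref.ref(img, dead_refs.append)   # enqueue the ref when img dies

    del img                                  # callback fires here on CPython
    while dead_refs:                         # drain the "queue" in one batch,
        print("collected:", dead_refs.popleft())   # e.g. to purge table entries
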
Java's "weak" references are most like Dianne Hackborn's old vref
proposal: a reference object refers to a single Python object,
but does not own a reference to that object. When that object is
deallocated, the reference object is invalidated. Users of the
reference object can easily determine that the reference has been
invalidated, or a NullObjectDereferenceError can be raised when
an attempt is made to use the referred-to object.
The "soft" references are similar, but are not invalidated as soon
as all other references to the referred-to object have been
released. The "soft" reference does own a reference, but allows
the memory allocator to free the referent if the memory is needed
elsewhere. It is not clear whether this means soft references are
released before the malloc() implementation calls sbrk() or its
equivalent, or if soft references are only cleared when malloc()
returns NULL.
XXX -- Need to figure out what phantom references are all about.
Unlike the other two reference types, "phantom" references must be
associated with an invalidation queue.

Appendix -- Dianne Hackborn's vref proposal (1995)

[This has been indented and paragraphs reflowed, but there have been
no content changes. --Fred]

Proposal: Virtual References

In an attempt to partly address the recurring discussion
concerning reference counting vs. garbage collection, I would like
to propose an extension to Python which should help in the
creation of "well structured" cyclic graphs. In particular, it
should allow at least trees with parent back-pointers and
doubly-linked lists to be created without worry about cycles.
The basic mechanism I'd like to propose is that of a "virtual
reference," or a "vref" from here on out. A vref is essentially a
handle on an object that does not increment the object's reference
count. This means that holding a vref on an object will not keep
the object from being destroyed. This would allow the Python
programmer, for example, to create the aforementioned tree
structure, which is automatically destroyed when it
is no longer in use -- by making all of the parent back-references
into vrefs, they no longer create reference cycles which keep the
tree from being destroyed.
In order to implement this mechanism, the Python core must ensure
that no -real- pointers are ever left referencing objects that no
longer exist. The implementation I would like to propose involves
two basic additions to the current Python system:
1. A new "vref" type, through which the Python programmer creates
and manipulates virtual references. Internally, it is
basically a C-level Python object with a pointer to the Python
object it is a reference to. Unlike all other Python code,
however, it does not change the reference count of this object.
In addition, it includes two pointers to implement a
doubly-linked list, which is used below.
2. The addition of a new field to the basic Python object
[PyObject_Head in object.h], which is either NULL, or points to
the head of a list of all vref objects that reference it. When
a vref object attaches itself to another object, it adds itself
to this linked list. Then, if an object with any vrefs on it
is deallocated, it may walk this list and ensure that all of
the vrefs on it point to some safe value, e.g. Nothing.
This implementation should hopefully have a minimal impact on the
current Python core -- when no vrefs exist, it should only add one
pointer to all objects, and a check for a NULL pointer every time
an object is deallocated.
Back at the Python language level, I have considered two possible
semantics for the vref object --
==> Pointer semantics:
In this model, a vref behaves essentially like a Python-level
pointer; the Python program must explicitly dereference the vref
to manipulate the actual object it references.
An example vref module using this model could include the
function "new"; When used as 'MyVref = vref.new(MyObject)', it
returns a new vref object such that that MyVref.object ==
MyObject. MyVref.object would then change to Nothing if
MyObject is ever deallocated.
For a concrete example, we may introduce some new C-style syntax:
& -- unary operator, creates a vref on an object, same as vref.new().
* -- unary operator, dereference a vref, same as VrefObject.object.
We can then define:
1. type(&MyObject) == vref.VrefType
2. *(&MyObject) == MyObject
3. (*(&MyObject)).attr == MyObject.attr
4. &&MyObject == Nothing
5. *MyObject -> exception
Rule #4 is subtle, but comes about because we have made a vref
to (a vref with no real references). Thus the outer vref is
cleared to Nothing when the inner one inevitably disappears.
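
For comparison only: the weakref.ref type that Python eventually
grew behaves much like this pointer-semantics model, with a call on
the reference standing in for the * operator (the & / * syntax and
rule #4 have no direct counterpart). A sketch:

    import weakref

    class Thing:
        attr = 42

    obj = Thing()
    ref = weakref.ref(obj)            # roughly MyVref = vref.new(MyObject)

    print(type(ref) is weakref.ref)   # cf. rule 1, with the shipped type
    print(ref() is obj)               # cf. rule 2: "*" becomes a call
    print(ref().attr == obj.attr)     # cf. rule 3

    del obj                           # the last real reference disappears
    print(ref())                      # None plays the role of Nothing
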
==> Proxy semantics:
In this model, the Python programmer manipulates vref objects
just as if she were manipulating the object it is a reference
of. This is accomplished by implementing the vref so that all
operations on it are redirected to its referenced object. With
this model, the dereference operator (*) no longer makes sense;
instead, we have only the reference operator (&), and define:
1. type(&MyObject) == type(MyObject)
2. &MyObject == MyObject
3. (&MyObject).attr == MyObject.attr
4. &&MyObject == MyObject
Again, rule #4 is important -- here, the outer vref is in fact a
reference to the original object, and -not- the inner vref.
This is because all operations applied to a vref actually apply
to its object, so that creating a vref of a vref actually
results in creating a vref of the latter's object.
The first, pointer semantics, has the advantage that it would be
very easy to implement; the vref type is extremely simple,
requiring at minimum a single attribute, object, and a function to
create a reference.
However, I really like the proxy semantics. Not only does it put
less of a burden on the Python programmer, but it allows you to do
nice things like use a vref anywhere you would use the actual
object. Unfortunately, it would probably be an extreme pain, if not
practically impossible, to implement in the current Python
implementation. I do have some thoughts, though, on how to do
this, if it seems interesting; one possibility is to introduce new
type-checking functions which handle the vref. This would
hopefully cause older C modules which don't expect vrefs to simply
return a type error, until they can be fixed.
Finally, there are some other additional capabilities that this
system could provide. One that seems particularly interesting to
me involves allowing the Python programmer to add a "destructor"
function to a vref -- this Python function would be called
immediately prior to the referenced object being deallocated,
allowing a Python program to invisibly attach itself to another
object and watch for it to disappear. This seems neat, though I
haven't actually come up with any practical uses for it, yet... :)
-- Dianne

Copyright

This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: