Greg Stein was a bad boy! <268 spank>

Jeremy Hylton's "Optimized Access to Module Namespaces" PEP gets 267. We'll let Greg slide this time, so his (albeit incomplete other than a Rationale) PEP gets 268. It must get filled in soon though or it'll get moved to the "Empty" category. :(
2001-08-21 00:02:26 +00:00 · 2001-08-21 00:02:26 +00:00 · 5331ad949f
parent 318841b0cc
commit 5331ad949f
3 changed files with 319 additions and 41 deletions
--- a/pep-0000.txt
+++ b/pep-0000.txt
@@ -60,7 +60,8 @@ Index by Category
 S   262  Database of Installed Python Packages        Kuchling
 S   263  Defining Python Source Code Encodings        Lemburg
 S   265  Sorting Dictionaries by Value                Griffin
- S   267  Extended HTTP functionality and WebDAV       Stein
+ S   267  Optimized Access to Module Namespaces        Hylton
+ S   268  Extended HTTP functionality and WebDAV       Stein
 Py-in-the-sky PEPs (not considered for Python 2.2)
@@ -214,7 +215,8 @@ Numerical Index
 SA  264  Future statements in simulated shells        Hudson
 S   265  Sorting Dictionaries by Value                Griffin
 S   266  Optimizing Global Variable/Attribute Access  Montanaro
- S   267  Extended HTTP functionality and WebDAV       Stein
+ S   267  Optimized Access to Module Namespaces        Hylton
+ S   268  Extended HTTP functionality and WebDAV       Stein
 Key
--- a/pep-0267.txt
+++ b/pep-0267.txt
@@ -1,66 +1,268 @@
 PEP: 267
-Title: Extended HTTP functionality and WebDAV
+Title: Optimized Access to Module Namespaces
 Version: $Revision$
 Last-Modified: $Date$
-Author: gstein@lyra.org (Greg Stein)
+Author: jeremy@zope.com (Jeremy Hylton)
 Status: Draft
-Type: Standards Track
+Type: Standards Track
-Created: 20-Aug-2001
+Created: 23-May-2001
 Python-Version: 2.2
 Post-History:
 Abstract
-    This PEP discusses new modules and extended functionality for
+    This PEP proposes a new implementation of global module namespaces
-    Python's HTTP support. Notably, the addition of authenticated
+    and the builtin namespace that speeds name resolution.  The
-    requests, proxy support, authenticated proxy usage, and WebDAV [1]
+    implementation would use an array of object pointers for most
-    capabilities.
+    operations in these namespaces.  The compiler would assign indices
    for global variables and module attributes at compile time.
    The current implementation represents these namespaces as
    dictionaries.  A global name incurs a dictionary lookup each time
    it is used; a builtin name incurs two dictionary lookups, a failed
    lookup in the global namespace and a second lookup in the builtin
    namespace.
    This implementation should speed Python code that uses
    module-level functions and variables.  It should also eliminate
    awkward coding styles that have evolved to speed access to these
    names.
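    As an illustration of the kind of coding style this paragraph
    refers to, the sketch below contrasts a function that uses the
    builtin len by its global name with one that rebinds it to a local
    through a default argument.  The function names are hypothetical,
    and the code uses present-day Python syntax rather than the Python
    2.2 syntax this PEP targets.

```python
def count_lengths(seqs):
    # Under the dictionary-based scheme described above, each use of
    # 'len' is a failed lookup in the module globals followed by a
    # lookup in the builtin namespace.
    total = 0
    for s in seqs:
        total += len(s)
    return total


def count_lengths_fast(seqs, len=len):
    # The default argument binds the builtin once, when the function
    # is defined, so the loop body reads a local variable instead.
    total = 0
    for s in seqs:
        total += len(s)
    return total
```

    Both functions return the same result; the second only trades
    readability for cheaper name resolution, which is the trade the
    proposal aims to make unnecessary.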
    The implementation is complicated because the global and builtin
    namespaces can be modified dynamically in ways that are impossible
    for the compiler to detect.  (Example: A module's namespace is
    modified by a script after the module is imported.)  As a result,
    the implementation must maintain several auxiliary data structures
    to preserve these dynamic features.
-Rationale
+Introduction
-    Python has been quite popular as a result of its "batteries
+    This PEP proposes a new implementation of attribute access for
-    included" positioning. One of the most heavily used protocols,
+    module objects that optimizes access to module variables known at
-    HTTP, has been included with Python for years (httplib). However,
+    compile time.  The module will store these variables in an array
-    this support has not kept up with the full needs and requirements
+    and provide an interface to look up attributes using array offsets.
-    of many HTTP-based applications and systems. In addition, new
+    For globals, builtins, and attributes of imported modules, the
-    protocols based on HTTP, such as WebDAV and XML-RPC, are becoming
+    compiler will generate code that uses the array offsets for fast
-    useful and are seeing increasing usage. Supplying this
+    access.
-    functionality meets Python's "batteries included" role and also
-    keeps Python at the leading edge of new technologies.
-    While authentication and proxy support are two very notable
+    [describe the key parts of the design: dlict, compiler support,
-    features missing from Python's core HTTP processing, they are
+    stupid name trick workarounds, optimization of other module's
-    minimally handled as part of Python's URL handling (urllib and
+    globals]
-    urllib2). However, applications that need fine-grained or
-    sophisticated HTTP handling cannot make use of the features while
-    they reside in urllib. Refactoring these features into a location
-    where they can be directly associated with an HTTP connection will
-    improve their utility for both urllib and for sophisticated
-    applications.
-    The motivation for this PEP was from several people requesting
+    The implementation will preserve existing semantics for module
-    these features directly, and from a number of feature requests on
+    namespaces, including the ability to modify module namespaces at
-    SourceForge. Since the exact form of the modules to be provided
+    runtime in ways that affect the visibility of builtin names.
-    and the classes/architecture used could be subject to debate, this
-    PEP was created to provide a focal point for those discussions.
-Specification
+DLict design
-    more info here...
+    The namespaces are implemented using a data structure that has
    sometimes gone under the name dlict.  It is a dictionary that has
    numbered slots for some dictionary entries.  The type must be
    implemented in C to achieve acceptable performance.  The new
    type-class unification work should make this fairly easy.  The
    DLict will presumably be a subclass of dictionary with an
    alternate storage model for some keys.
    A Python implementation is included here to illustrate the basic
    design:
        """A dictionary-list hybrid"""
        import types
        class DLict:
            def __init__(self, names):
                # names maps each compile-time name to its slot index
                assert isinstance(names, types.DictType)
                size = len(names)
                self.names = names
                self.list = [None] * size
                self.empty = [1] * size
                self.dict = {}
                self.size = 0
            def __getitem__(self, name):
                i = self.names.get(name)
                if i is None:
                    return self.dict[name]
                if self.empty[i] is not None:
                    raise KeyError, name
                return self.list[i]
            def __setitem__(self, name, val):
                i = self.names.get(name)
                if i is None:
                    self.dict[name] = val
                else:
                    # count the slot only when it goes from empty to filled
                    if self.empty[i] is not None:
                        self.size += 1
                    self.empty[i] = None
                    self.list[i] = val
            def __delitem__(self, name):
                i = self.names.get(name)
                if i is None:
                    del self.dict[name]
                else:
                    if self.empty[i] is not None:
                        raise KeyError, name
                    self.empty[i] = 1
                    self.list[i] = None
                    self.size -= 1
            def keys(self):
                # only report slots that currently hold a value
                keys = [n for n, i in self.names.items()
                        if self.empty[i] is None]
                return keys + self.dict.keys()
            def values(self):
                # self.names maps names to slot indices, not to values
                vals = [self.list[i] for i in self.names.values()
                        if self.empty[i] is None]
                return vals + self.dict.values()
            def items(self):
                # pair each filled slot's name with its stored value
                items = [(n, self.list[i]) for n, i in self.names.items()
                         if self.empty[i] is None]
                return items + self.dict.items()
            def __len__(self):
                return self.size + len(self.dict)
            def __cmp__(self, dlict):
                c = cmp(self.names, dlict.names)
                if c != 0:
                    return c
                c = cmp(self.size, dlict.size)
                if c != 0:
                    return c
                for i in range(len(self.names)):
                    c = cmp(self.empty[i], dlict.empty[i])
                    if c != 0:
                        return c
                    if self.empty[i] is None:
                        c = cmp(self.list[i], dlict.list[i])
                        if c != 0:
                            return c
                return cmp(self.dict, dlict.dict)
            def clear(self):
                self.dict.clear()
                for i in range(len(self.names)):
                    if self.empty[i] is None:
                        self.empty[i] = 1
                        self.list[i] = None
                # every slot is empty again, so no slots count toward len()
                self.size = 0
            def update(self):
                pass
            def load(self, index):
                """dlict-special method to support indexed access"""
                if self.empty[index] is None:
                    return self.list[index]
                else:
                    raise KeyError, index # XXX might want reverse mapping
            def store(self, index, val):
                """dlict-special method to support indexed access"""
                if self.empty[index] is not None:
                    self.size += 1
                self.empty[index] = None
                self.list[index] = val
            def delete(self, index):
                """dlict-special method to support indexed access"""
                if self.empty[index] is None:
                    self.size -= 1
                self.empty[index] = 1
                self.list[index] = None
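    To show how by-name and indexed access are meant to agree, here is
    a cut-down sketch of the same idea in present-day Python syntax.
    The class name SlotDict and the sample names are hypothetical; only
    the subset of methods needed for the demonstration is included.

```python
class SlotDict:
    """Cut-down port of the DLict idea, for illustration only."""

    def __init__(self, names):
        # names maps each compile-time name to its slot index
        self.names = dict(names)
        self.list = [None] * len(names)
        self.empty = [True] * len(names)
        self.dict = {}  # names that were not known at compile time

    def __getitem__(self, name):
        i = self.names.get(name)
        if i is None:
            return self.dict[name]
        return self.load(i)

    def __setitem__(self, name, val):
        i = self.names.get(name)
        if i is None:
            self.dict[name] = val
        else:
            self.store(i, val)

    def load(self, index):
        if self.empty[index]:
            raise KeyError(index)
        return self.list[index]

    def store(self, index, val):
        self.empty[index] = False
        self.list[index] = val


ns = SlotDict({"spam": 0, "eggs": 1})
ns["spam"] = 1      # by-name store lands in slot 0
ns.store(1, 2)      # indexed store is visible by name
ns["extra"] = 3     # unknown name falls back to the backup dict
```

    After these three stores, ns.load(0) and ns["eggs"] read the slot
    array, while "extra" lives only in the backup dictionary.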
-Reference Implementation
+Compiler issues
-    reference imp will probably go into /nondist/sandbox/...
+    The compiler currently collects the names of all global variables
    in a module.  These are names bound at the module level or bound
    in a class or function body that declares them to be global.
    The compiler would assign indices for each global name and add the
    names and indices of the globals to the module's code object.
    Each code object would then be bound irrevocably to the module it
    was defined in.  (Not sure if there are some subtle problems with
    this.)
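    A rough sketch of the index assignment step follows.  It uses the
    ast module from present-day Python, which did not exist when this
    PEP was written, and the function name assign_global_indices is
    hypothetical.  It handles only simple assignments, def, class,
    import, and global statements; a real compiler would see every
    binding form.

```python
import ast


def assign_global_indices(source):
    """Return {name: index} for names a module binds globally."""
    tree = ast.parse(source)
    names = set()
    # Names bound directly at module level.
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name != "*":
                    names.add((alias.asname or alias.name).split(".")[0])
    # Names declared global inside function or class bodies.
    for node in ast.walk(tree):
        if isinstance(node, ast.Global):
            names.update(node.names)
    # Sorting is an arbitrary choice that makes the numbering stable.
    return {name: i for i, name in enumerate(sorted(names))}


SOURCE = """
import os
limit = 10
def bump():
    global counter
    counter = 1
"""
indices = assign_global_indices(SOURCE)
```

    Note that counter receives an index even though it is only bound
    inside a function, because the function declares it global.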
    For attributes of imported modules, the module will store an
    indirection record.  Internally, the module will store a pointer
    to the defining module and the offset of the attribute in the
    defining module's global variable array.  The offset would be
    initialized the first time the name is looked up.
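    The indirection record might look roughly like the sketch below,
    written in present-day Python.  ModuleArray and Indirection are
    hypothetical names standing in for the C-level structures; the
    point is only that the offset is resolved once, on first lookup,
    and that later loads go straight to the defining module's array.

```python
class ModuleArray:
    """Hypothetical stand-in for a module's global variable array."""

    def __init__(self, names):
        self.index = {name: i for i, name in enumerate(names)}
        self.slots = [None] * len(names)


class Indirection:
    """Pointer to the defining module plus a lazily resolved offset."""

    def __init__(self, module, name):
        self.module = module
        self.name = name
        self.offset = None  # filled in the first time the name is looked up

    def load(self):
        if self.offset is None:
            self.offset = self.module.index[self.name]
        return self.module.slots[self.offset]


defining = ModuleArray(["pi", "sqrt"])
defining.slots[0] = 3.14159
ref = Indirection(defining, "pi")
first = ref.load()       # resolves the offset, then reads the slot
defining.slots[0] = 3    # rebinding in the defining module...
second = ref.load()      # ...is seen through the same record
```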
-References
+Runtime model
-    [1] http://www.webdav.org/
+    The Python VM will be extended with new opcodes to access globals
    and module attributes via a module-level array.
    A function object would need to point to the module that defined
    it in order to provide access to the module-level global array.
    For module attributes stored in the dlict (call them static
    attributes), the get/set/delattr implementation would need to track
    access to these attributes using the old by-name interface.  If a
    static attribute is updated dynamically, e.g.
        mod.__dict__["foo"] = 2
    the implementation would need to update the array slot instead of
    the backup dict.
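    The behaviour that has to be preserved can be observed in any
    current Python.  In the sketch below the module name mod and the
    function get_foo are hypothetical; the code builds a throwaway
    module, rebinds one of its globals through __dict__, and checks
    that a function defined in that module sees the new value.

```python
import types

mod = types.ModuleType("mod")
# Define a global and a function that reads it inside the module.
exec("foo = 1\ndef get_foo():\n    return foo\n", mod.__dict__)

before = mod.get_foo()
mod.__dict__["foo"] = 2   # dynamic update through the by-name interface
after = mod.get_foo()
```

    Under the proposed design, foo would live in an array slot, so the
    write through __dict__ must reach that slot for get_foo to keep
    returning the current value.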
 Backwards compatibility
    The dlict will need to maintain meta-information about whether a
    slot is currently used or not.  It will also need to maintain a
    pointer to the builtin namespace.  When a name is not currently
    used in the global namespace, the lookup will have to fail over to
    the builtin namespace.
    In the reverse case, each module may need a special accessor
    function for the builtin namespace that checks to see if a global
    shadowing the builtin has been added dynamically.  This check
    would only occur if there was a dynamic change to the module's
    dlict, i.e. when a name is bound that wasn't discovered at
    compile-time.
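    The two directions of fallback described above can be sketched as
    follows, in present-day Python.  GlobalSlots and its method names
    are hypothetical, and the sketch assumes a sentinel object marks an
    unused slot; the PEP itself does not fix these details.

```python
import builtins

_EMPTY = object()  # marks a slot whose name is currently unbound


class GlobalSlots:
    def __init__(self, names):
        self.index = {name: i for i, name in enumerate(names)}
        self.slots = [_EMPTY] * len(names)
        self.extra = {}  # names bound dynamically, unknown at compile time

    def load_global(self, i, name):
        """Indexed load of a known global; fails over to the builtins."""
        val = self.slots[i]
        if val is _EMPTY:
            try:
                return getattr(builtins, name)
            except AttributeError:
                raise NameError(name) from None
        return val

    def load_builtin(self, name):
        """Accessor for a name the compiler assumed to be a builtin."""
        # The shadowing check only matters once something has been
        # added dynamically, so the common case skips straight past it.
        if self.extra and name in self.extra:
            return self.extra[name]
        return getattr(builtins, name)

    def set_by_name(self, name, val):
        i = self.index.get(name)
        if i is None:
            self.extra[name] = val
        else:
            self.slots[i] = val


g = GlobalSlots(["len"])             # a module that may bind 'len'
unbound = g.load_global(0, "len")    # slot unused: builtin len
g.set_by_name("len", lambda x: 0)
shadowed = g.load_global(0, "len")   # slot used: the module's own len
g.set_by_name("max", min)            # name not seen at compile time
dynamic = g.load_builtin("max")      # accessor notices the shadowing
```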
    These mechanisms would have little if any cost for the common case
    where a module's global namespace is not modified in strange
    ways at runtime.  They would add overhead for modules that did
    unusual things with global names, but this is an uncommon practice
    and probably one worth discouraging.
    It may be desirable to disable dynamic additions to the global
    namespace in some future version of Python.  If so, the new
    implementation could provide warnings.
 Related PEPs
    PEP 266, Optimizing Global Variable/Attribute Access, proposes a
    different mechanism for optimizing access to global variables as
    well as attributes of objects.  The mechanism uses two new opcodes
    TRACK_OBJECT and UNTRACK_OBJECT to create a slot in the local
    variables array that aliases the global or object attribute.  If
    the object being aliased is rebound, the rebind operation is
    responsible for updating the aliases.
    The object tracking approach applies to a wider range of
    objects than just modules.  It may also have a higher runtime cost,
    because each function that uses a global or object attribute must
    execute extra opcodes to register its interest in an object and
    unregister on exit; the cost of registration is unclear, but
    presumably involves a dynamically resizable data structure to hold
    a list of callbacks.
    The implementation proposed here avoids the need for registration,
    because it does not create aliases.  Instead it allows functions
    that reference a global variable or module attribute to retain a
    pointer to the location where the original binding is stored.  A
    second advantage is that the initial lookup is performed once per
    module rather than once per function call.
 Copyright
--- a/pep-0268.txt
+++ b/pep-0268.txt
@@ -0,0 +1,74 @@
 PEP: 268
 Title: Extended HTTP functionality and WebDAV
 Version: $Revision$
 Last-Modified: $Date$
 Author: gstein@lyra.org (Greg Stein)
 Status: Draft
 Type: Standards Track
 Created: 20-Aug-2001
 Python-Version: 2.2
 Post-History:
 Abstract
    This PEP discusses new modules and extended functionality for
    Python's HTTP support.  Notably, the addition of authenticated
    requests, proxy support, authenticated proxy usage, and WebDAV [1]
    capabilities.
 Rationale
    Python has been quite popular as a result of its "batteries
    included" positioning.  One of the most heavily used protocols,
    HTTP, has been included with Python for years (httplib).  However,
    this support has not kept up with the full needs and requirements
    of many HTTP-based applications and systems.  In addition, new
    protocols based on HTTP, such as WebDAV and XML-RPC, are becoming
    useful and are seeing increasing usage.  Supplying this
    functionality meets Python's "batteries included" role and also
    keeps Python at the leading edge of new technologies.
    While authentication and proxy support are two very notable
    features missing from Python's core HTTP processing, they are
    minimally handled as part of Python's URL handling (urllib and
    urllib2).  However, applications that need fine-grained or
    sophisticated HTTP handling cannot make use of the features while
    they reside in urllib.  Refactoring these features into a location
    where they can be directly associated with an HTTP connection will
    improve their utility for both urllib and for sophisticated
    applications.
    The motivation for this PEP was from several people requesting
    these features directly, and from a number of feature requests on
    SourceForge.  Since the exact form of the modules to be provided
    and the classes/architecture used could be subject to debate, this
    PEP was created to provide a focal point for those discussions.
 Specification
    more info here...
 Reference Implementation
    reference imp will probably go into /nondist/sandbox/...
 References
    [1] http://www.webdav.org/
 Copyright
    This document has been placed in the public domain.
 Local Variables:
 mode: indented-text
 indent-tabs-mode: nil
 End: