- some reflow caused by David Goodger's spell checking

- added "Optional Extensions to the Importer Protocol" section

- fixed flaw in the importer protocol: i.find_module() will now receive the package.__path__ (or None for a plain module) as an additional argument, if it's installed on sys.meta_path. This is needed to be able to add a hook that implements the full sys.path/pkg.__path__ semantics. See also footnote [7]. The patch on sf will be updated shortly to match the new version of the PEP.
2002-12-23 22:13:48 +00:00
parent fb0080cbcb
commit 74737f0d45
1 changed file with 101 additions and 20 deletions
--- a/pep-0302.txt
+++ b/pep-0302.txt
@@ -190,17 +190,17 @@ Specification part 1: The Importer Protocol
    module being imported (may be a dotted name) and a reference to the
    current global namespace.

-    The built-in __import__ function (known as PyImport_ImportModuleEx in
-    import.c) will then check to see whether the module doing the import
-    is a package by looking for a __path__ variable in the current
-    global namespace.  If it is indeed a package, it first tries to do
-    the import relative to the package.  For example if a package named
-    "spam" does "import eggs", it will first look for a module named
-    "spam.eggs".  If that fails, the import continues as an absolute
-    import: it will look for a module named "eggs".  Dotted name imports
-    work pretty much the same: if package "spam" does "import
-    eggs.bacon", first "spam.eggs.bacon" is tried, and only if that
-    fails "eggs.bacon" is tried.
+    The built-in __import__ function (known as PyImport_ImportModuleEx
+    in import.c) will then check to see whether the module doing the
+    import is a package by looking for a __path__ variable in the
+    current global namespace.  If it is indeed a package, it first tries
+    to do the import relative to the package.  For example if a package
+    named "spam" does "import eggs", it will first look for a module
+    named "spam.eggs".  If that fails, the import continues as an
+    absolute import: it will look for a module named "eggs".  Dotted
+    name imports work pretty much the same: if package "spam" does
+    "import eggs.bacon", first "spam.eggs.bacon" is tried, and only if
+    that fails "eggs.bacon" is tried.
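
As a rough illustration of the lookup order described in the paragraph above (not the actual import.c logic), the candidate names could be computed as follows. The helper name `candidate_names` is hypothetical, and the sketch covers only the case the text describes: an importing namespace that belongs to a package, identified by its `__path__` variable.

```python
def candidate_names(name, importing_globals):
    """Return the fully qualified names to try, in order.

    Hypothetical helper: mirrors only the rule described in the text
    (package-relative name first, then the absolute name).
    """
    if "__path__" in importing_globals:
        # The importing namespace is a package: try the relative name first.
        package = importing_globals["__name__"]
        return [package + "." + name, name]
    # Not a package by the rule above; this sketch falls back to the
    # absolute name only and does not cover modules nested in a package.
    return [name]


spam_globals = {"__name__": "spam", "__path__": ["/hypothetical/spam"]}
print(candidate_names("eggs", spam_globals))
print(candidate_names("eggs.bacon", spam_globals))
print(candidate_names("eggs", {"__name__": "plain"}))
```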

    Deeper down in the mechanism, a dotted name import is split up by
    its components.  For "import spam.ham", first an "import spam" is
@@ -214,11 +214,15 @@ Specification part 1: The Importer Protocol
    The protocol involves two objects: an importer and a loader.  An
    importer object has a single method:

-        importer.find_module(fullname)
+        importer.find_module(fullname, path=None)

-    This method returns a loader object if the module was found, or None
-    if it wasn't.  If find_module() raises an exception, it will be
-    propagated to the caller, aborting the import.
+    This method will be called with the fully qualified name of the
+    module.  If the importer is installed on sys.meta_path, it will
+    receive a second argument, which is None for a top-level module, or
+    package.__path__ for submodules or subpackages[7].  It should return
+    a loader object if the module was found, or None if it wasn't.  If
+    find_module() raises an exception, it will be propagated to the
+    caller, aborting the import.
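
A minimal sketch of an importer that honors the revised find_module() signature. The class name `DictImporter`, its in-memory `modules` table, and the placeholder loader strings are assumptions made for illustration. The sketch calls find_module() directly instead of installing the object on sys.meta_path.

```python
class DictImporter:
    """Hypothetical importer that 'finds' modules listed in a dict."""

    def __init__(self, modules):
        # Maps fully qualified module name -> loader object.
        # Plain strings stand in for real loader objects here.
        self.modules = modules

    def find_module(self, fullname, path=None):
        # 'path' is None for a top-level module, or the parent package's
        # __path__ when the importer is called from sys.meta_path.
        # This sketch ignores 'path' and looks up 'fullname' only.
        # Returns a loader if the module is known, None otherwise.
        return self.modules.get(fullname)


importer = DictImporter({
    "spam": "loader-for-spam",
    "spam.eggs": "loader-for-spam.eggs",
})
print(importer.find_module("spam"))
print(importer.find_module("spam.eggs", ["/hypothetical/spam"]))
print(importer.find_module("ham"))
```

Giving `path` a default of None lets the same method serve both call styles: with one argument, and with the extra argument passed to importers installed on sys.meta_path.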

    A loader object also has one method:

@@ -348,6 +352,72 @@ Packages and the role of __path__
    which can be empty.


+Optional Extensions to the Importer Protocol
+
+    The Importer Protocol defines two optional extensions.  One is to
+    retrieve data files, the other is to support module packaging tools
+    and/or tools that analyze module dependencies (for example Freeze
+    [3]).  The latter category of tools usually don't actually *load*
+    modules, they only need to know if and where they are available.
+    Both extensions are highly recommended for general purpose
+    importers, but may safely be left out if those features aren't
+    needed.
+
+    To retrieve the data for arbitrary "files" from the underlying
+    storage backend, loader objects may supply a method named get_data:
+
+        loader.get_data(name)
+
+    This method returns the data as a string, or raises IOError if the
+    "file" wasn't found.  The 'name' argument should be seen as a
+    'cookie', meaning the protocol doesn't prescribe any semantics for
+    it. However, for importer objects that have some file system-like
+    properties (for example zipimporter) it is recommended to use os.sep
+    as a separator character to specify a (possibly virtual) directory
+    hierarchy. For example if the importer allows access to a module's
+    source code via i.get_data(name), the 'name' argument should be
+    constructed like this:
+
+        name = mod.__name__.replace(".", os.sep) + ".py"
+
+    Note that this is not the recommended way to retrieve source code;
+    the (also optional) method loader.get_source(fullname) is more
+    general, as it doesn't imply *any* file-system-like characteristics.
+    This leads us to the next extension.
+
+    The following set of methods may be implemented if support for (for
+    example) Freeze-like tools is desirable.  It consists of three
+    additional methods; to make things easier for the caller, either
+    all three should be implemented or none at all.
+
+        loader.get_package_path(fullname)
+        loader.get_code(fullname)
+        loader.get_source(fullname)
+
+    All three methods should raise ImportError if the module wasn't
+    found.
+
+    The loader.get_package_path(fullname) method should return None if
+    the module specified by 'fullname' is not a package, or a list to
+    serve as pkg.__path__ if it is.  It can be used to check
+    package-ness for a module ("loader.get_package_path(fullname) is not
+    None") but its main purpose is to tell our caller what pkg.__path__
+    would be if the module would actually be loaded.
+
+    The loader.get_code(fullname) method should return the code object
+    associated with the module, or None if it's a built-in or extension
+    module.  If the loader doesn't have the code object but it _does_
+    have the source code, it should return the compiled source code.
+    (This is so that our caller doesn't also need to check get_source()
+    if all it needs is the code object.)
+
+    The loader.get_source(fullname) method should return the source code
+    for the module as a string (using newline characters for line
+    endings) or None if the source is not available (yet it should still
+    raise ImportError if the module can't be found by the importer at
+    all).
+
+
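
The optional extensions described in the section above can be sketched with a small in-memory loader. The class name `DictLoader` and its three dict arguments are hypothetical. The sketch assumes that a module "exists" exactly when it has an entry in `sources`, and that an entry of None means the source is unavailable.

```python
import os


class DictLoader:
    """Hypothetical loader backed by in-memory dicts."""

    def __init__(self, sources, packages, files):
        self.sources = sources    # fullname -> source text, or None
        self.packages = packages  # fullname -> list to serve as __path__
        self.files = files        # 'cookie' name -> raw data

    def _check(self, fullname):
        # All three Freeze-style methods raise ImportError for unknown modules.
        if fullname not in self.sources:
            raise ImportError("no module named %s" % fullname)

    def get_data(self, name):
        # 'name' is an opaque cookie; a missing entry raises IOError.
        try:
            return self.files[name]
        except KeyError:
            raise IOError("no such file: %s" % name)

    def get_package_path(self, fullname):
        # None for a plain module, a list for a package.
        self._check(fullname)
        return self.packages.get(fullname)

    def get_source(self, fullname):
        self._check(fullname)
        return self.sources[fullname]

    def get_code(self, fullname):
        # Only source is stored here, so compile it on demand.
        source = self.get_source(fullname)
        if source is None:
            return None
        return compile(source, "<%s>" % fullname, "exec")


eggs_source = "ANSWER = 6 * 7\n"
eggs_file = "spam.eggs".replace(".", os.sep) + ".py"
loader = DictLoader(
    sources={"spam": "", "spam.eggs": eggs_source},
    packages={"spam": ["/hypothetical/spam"]},
    files={eggs_file: eggs_source},
)

namespace = {}
exec(loader.get_code("spam.eggs"), namespace)
print(namespace["ANSWER"])
print(loader.get_package_path("spam"))
print(loader.get_package_path("spam.eggs") is None)
```

A dependency-analysis tool could call get_package_path() and get_code() on such a loader without ever executing the module; the exec() call here only demonstrates that the returned code object is usable.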
 Integration with the 'imp' module

    The new import hooks are not easily integrated in the existing
@@ -355,10 +425,11 @@ Integration with the 'imp' module
    whether it's possible at all without breaking code; it is better to
    simply add a new function to the imp module.  The meaning of the
    existing imp.find_module() and imp.load_module() calls changes from:
-    "they expose the built-in import mechanism" to "they expose the basic
-    *unhooked* built-in import mechanism".  They simply won't invoke any
-    import hooks.  A new imp module function is proposed under the name
-    "find_module2", with is used like the following pattern:
+    "they expose the built-in import mechanism" to "they expose the
+    basic *unhooked* built-in import mechanism".  They simply won't
+    invoke any import hooks.  A new imp module function is proposed
+    under the name "find_module2", which is used as in the following
+    pattern:

        loader = imp.find_module2(fullname, path)
        if loader is not None:
@@ -438,21 +509,31 @@ Implementation
    http://www.python.org/sf/652586


-References
+References and Footnotes

    [1] Installer by Gordon McMillan
    http://www.mcmillan-inc.com/install1.html
+
    [2] PEP 273, Import Modules from Zip Archives, Ahlstrom
    http://www.python.org/peps/pep-0273.html
+
    [3] The Freeze tool
    Tools/freeze/ in a Python source distribution
+
    [4] Squeeze
    http://starship.python.net/crew/fredrik/ipa/squeeze.htm
+
    [5] py2exe by Thomas Heller
    http://py2exe.sourceforge.net/
+
    [6] imp.set_frozenmodules() patch
    http://www.python.org/sf/642578

+    [7] The path argument to importer.find_module() is there because the
+    pkg.__path__ variable may be needed at this point.  It may either
+    come from the actual parent module or be supplied by
+    imp.find_module() or the proposed imp.find_module2() function.
+

 Copyright