2002-12-20 08:07:24 -05:00
|
|
|
|
PEP: 302
|
|
|
|
|
Title: New Import Hooks
|
|
|
|
|
Version: $Revision$
|
|
|
|
|
Last-Modified: $Date$
|
|
|
|
|
Author: Just van Rossum <just@letterror.com>,
|
2015-02-12 14:13:56 -05:00
|
|
|
|
Paul Moore <p.f.moore@gmail.com>
|
2007-05-18 14:05:05 -04:00
|
|
|
|
Status: Final
|
2002-12-20 08:07:24 -05:00
|
|
|
|
Type: Standards Track
|
2012-05-03 11:47:57 -04:00
|
|
|
|
Content-Type: text/x-rst
|
2002-12-20 08:07:24 -05:00
|
|
|
|
Created: 19-Dec-2002
|
|
|
|
|
Python-Version: 2.3
|
|
|
|
|
Post-History: 19-Dec-2002
|
|
|
|
|
|
2013-05-04 14:04:46 -04:00
|
|
|
|
.. warning::
|
|
|
|
|
The language reference for import [10]_ and importlib documentation
|
2016-07-11 11:14:08 -04:00
|
|
|
|
[11]_ now supersede this PEP. This document is no longer updated
|
2013-05-04 14:04:46 -04:00
|
|
|
|
and provided for historical purposes only.
|
|
|
|
|
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
Abstract
|
2012-05-03 11:47:57 -04:00
|
|
|
|
========
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
This PEP proposes to add a new set of import hooks that offer better
|
|
|
|
|
customization of the Python import mechanism. Contrary to the current
|
|
|
|
|
``__import__`` hook, a new-style hook can be injected into the existing
|
|
|
|
|
scheme, allowing for a finer grained control of how modules are found and how
|
|
|
|
|
they are loaded.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Motivation
|
2012-05-03 11:47:57 -04:00
|
|
|
|
==========
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
The only way to customize the import mechanism is currently to override the
|
|
|
|
|
built-in ``__import__`` function. However, overriding ``__import__`` has many
|
|
|
|
|
problems. To begin with:
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2016-05-03 04:18:02 -04:00
|
|
|
|
* An ``__import__`` replacement needs to *fully* reimplement the entire
|
|
|
|
|
import mechanism, or call the original ``__import__`` before or after the
|
|
|
|
|
custom code.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2016-05-03 04:18:02 -04:00
|
|
|
|
* It has very complex semantics and responsibilities.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2016-05-03 04:18:02 -04:00
|
|
|
|
* ``__import__`` gets called even for modules that are already in
|
|
|
|
|
``sys.modules``, which is almost never what you want, unless you're writing
|
|
|
|
|
some sort of monitoring tool.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
The situation gets worse when you need to extend the import mechanism from C:
|
|
|
|
|
it's currently impossible, apart from hacking Python's ``import.c`` or
|
|
|
|
|
reimplementing much of ``import.c`` from scratch.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
There is a fairly long history of tools written in Python that allow extending
|
|
|
|
|
the import mechanism in various way, based on the ``__import__`` hook. The
|
|
|
|
|
Standard Library includes two such tools: ``ihooks.py`` (by GvR) and
|
|
|
|
|
``imputil.py`` [1]_ (Greg Stein), but perhaps the most famous is ``iu.py`` by
|
|
|
|
|
Gordon McMillan, available as part of his Installer package. Their usefulness
|
|
|
|
|
is somewhat limited because they are written in Python; bootstrapping issues
|
|
|
|
|
need to worked around as you can't load the module containing the hook with
|
|
|
|
|
the hook itself. So if you want the entire Standard Library to be loadable
|
|
|
|
|
from an import hook, the hook must be written in C.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Use cases
|
2012-05-03 11:47:57 -04:00
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This section lists several existing applications that depend on import hooks.
|
|
|
|
|
Among these, a lot of duplicate work was done that could have been saved if
|
|
|
|
|
there had been a more flexible import hook at the time. This PEP should make
|
|
|
|
|
life a lot easier for similar projects in the future.
|
|
|
|
|
|
|
|
|
|
Extending the import mechanism is needed when you want to load modules that
|
|
|
|
|
are stored in a non-standard way. Examples include modules that are bundled
|
|
|
|
|
together in an archive; byte code that is not stored in a ``pyc`` formatted
|
|
|
|
|
file; modules that are loaded from a database over a network.
|
|
|
|
|
|
2022-01-21 06:03:51 -05:00
|
|
|
|
The work on this PEP was partly triggered by the implementation of :pep:`273`,
|
2012-05-03 11:47:57 -04:00
|
|
|
|
which adds imports from Zip archives as a built-in feature to Python. While
|
|
|
|
|
the PEP itself was widely accepted as a must-have feature, the implementation
|
|
|
|
|
left a few things to desire. For one thing it went through great lengths to
|
|
|
|
|
integrate itself with ``import.c``, adding lots of code that was either
|
|
|
|
|
specific for Zip file imports or *not* specific to Zip imports, yet was not
|
2022-01-21 06:03:51 -05:00
|
|
|
|
generally useful (or even desirable) either. Yet the :pep:`273` implementation
|
2012-05-03 11:47:57 -04:00
|
|
|
|
can hardly be blamed for this: it is simply extremely hard to do, given the
|
|
|
|
|
current state of ``import.c``.
|
|
|
|
|
|
|
|
|
|
Packaging applications for end users is a typical use case for import hooks,
|
|
|
|
|
if not *the* typical use case. Distributing lots of source or ``pyc`` files
|
|
|
|
|
around is not always appropriate (let alone a separate Python installation),
|
|
|
|
|
so there is a frequent desire to package all needed modules in a single file.
|
|
|
|
|
So frequent in fact that multiple solutions have been implemented over the
|
|
|
|
|
years.
|
|
|
|
|
|
|
|
|
|
The oldest one is included with the Python source code: Freeze [2]_. It puts
|
|
|
|
|
marshalled byte code into static objects in C source code. Freeze's "import
|
|
|
|
|
hook" is hard wired into ``import.c``, and has a couple of issues. Later
|
|
|
|
|
solutions include Fredrik Lundh's Squeeze, Gordon McMillan's Installer, and
|
|
|
|
|
Thomas Heller's py2exe [3]_. MacPython ships with a tool called
|
|
|
|
|
``BuildApplication``.
|
|
|
|
|
|
|
|
|
|
Squeeze, Installer and py2exe use an ``__import__`` based scheme (py2exe
|
|
|
|
|
currently uses Installer's ``iu.py``, Squeeze used ``ihooks.py``), MacPython
|
|
|
|
|
has two Mac-specific import hooks hard wired into ``import.c``, that are
|
|
|
|
|
similar to the Freeze hook. The hooks proposed in this PEP enables us (at
|
2021-02-03 09:06:23 -05:00
|
|
|
|
least in theory; it's not a short-term goal) to get rid of the hard coded
|
2012-05-03 11:47:57 -04:00
|
|
|
|
hooks in ``import.c``, and would allow the ``__import__``-based tools to get
|
|
|
|
|
rid of most of their ``import.c`` emulation code.
|
|
|
|
|
|
|
|
|
|
Before work on the design and implementation of this PEP was started, a new
|
|
|
|
|
``BuildApplication``-like tool for Mac OS X prompted one of the authors of
|
|
|
|
|
this PEP (JvR) to expose the table of frozen modules to Python, in the ``imp``
|
|
|
|
|
module. The main reason was to be able to use the freeze import hook
|
|
|
|
|
(avoiding fancy ``__import__`` support), yet to also be able to supply a set
|
|
|
|
|
of modules at runtime. This resulted in issue #642578 [4]_, which was
|
|
|
|
|
mysteriously accepted (mostly because nobody seemed to care either way ;-).
|
|
|
|
|
Yet it is completely superfluous when this PEP gets accepted, as it offers a
|
|
|
|
|
much nicer and general way to do the same thing.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rationale
|
2012-05-03 11:47:57 -04:00
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
While experimenting with alternative implementation ideas to get built-in Zip
|
|
|
|
|
import, it was discovered that achieving this is possible with only a fairly
|
|
|
|
|
small amount of changes to ``import.c``. This allowed to factor out the
|
|
|
|
|
Zip-specific stuff into a new source file, while at the same time creating a
|
|
|
|
|
*general* new import hook scheme: the one you're reading about now.
|
|
|
|
|
|
|
|
|
|
An earlier design allowed non-string objects on ``sys.path``. Such an object
|
|
|
|
|
would have the necessary methods to handle an import. This has two
|
|
|
|
|
disadvantages: 1) it breaks code that assumes all items on ``sys.path`` are
|
|
|
|
|
strings; 2) it is not compatible with the ``PYTHONPATH`` environment variable.
|
|
|
|
|
The latter is directly needed for Zip imports. A compromise came from Jython:
|
|
|
|
|
allow string *subclasses* on ``sys.path``, which would then act as importer
|
|
|
|
|
objects. This avoids some breakage, and seems to work well for Jython (where
|
|
|
|
|
it is used to load modules from ``.jar`` files), but it was perceived as an
|
|
|
|
|
"ugly hack".
|
|
|
|
|
|
2021-02-03 09:06:23 -05:00
|
|
|
|
This led to a more elaborate scheme, (mostly copied from McMillan's
|
2012-05-03 11:47:57 -04:00
|
|
|
|
``iu.py``) in which each in a list of candidates is asked whether it can
|
|
|
|
|
handle the ``sys.path`` item, until one is found that can. This list of
|
|
|
|
|
candidates is a new object in the ``sys`` module: ``sys.path_hooks``.
|
|
|
|
|
|
|
|
|
|
Traversing ``sys.path_hooks`` for each path item for each new import can be
|
|
|
|
|
expensive, so the results are cached in another new object in the ``sys``
|
|
|
|
|
module: ``sys.path_importer_cache``. It maps ``sys.path`` entries to importer
|
|
|
|
|
objects.
|
|
|
|
|
|
|
|
|
|
To minimize the impact on ``import.c`` as well as to avoid adding extra
|
|
|
|
|
overhead, it was chosen to not add an explicit hook and importer object for
|
|
|
|
|
the existing file system import logic (as ``iu.py`` has), but to simply fall
|
|
|
|
|
back to the built-in logic if no hook on ``sys.path_hooks`` could handle the
|
|
|
|
|
path item. If this is the case, a ``None`` value is stored in
|
|
|
|
|
``sys.path_importer_cache``, again to avoid repeated lookups. (Later we can
|
|
|
|
|
go further and add a real importer object for the built-in mechanism, for now,
|
|
|
|
|
the ``None`` fallback scheme should suffice.)
|
|
|
|
|
|
|
|
|
|
A question was raised: what about importers that don't need *any* entry on
|
|
|
|
|
``sys.path``? (Built-in and frozen modules fall into that category.) Again,
|
|
|
|
|
Gordon McMillan to the rescue: ``iu.py`` contains a thing he calls the
|
|
|
|
|
*metapath*. In this PEP's implementation, it's a list of importer objects
|
|
|
|
|
that is traversed *before* ``sys.path``. This list is yet another new object
|
|
|
|
|
in the ``sys`` module: ``sys.meta_path``. Currently, this list is empty by
|
|
|
|
|
default, and frozen and built-in module imports are done after traversing
|
|
|
|
|
``sys.meta_path``, but still before ``sys.path``.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specification part 1: The Importer Protocol
|
2012-05-03 11:47:57 -04:00
|
|
|
|
===========================================
|
|
|
|
|
|
|
|
|
|
This PEP introduces a new protocol: the "Importer Protocol". It is important
|
|
|
|
|
to understand the context in which the protocol operates, so here is a brief
|
|
|
|
|
overview of the outer shells of the import mechanism.
|
|
|
|
|
|
|
|
|
|
When an import statement is encountered, the interpreter looks up the
|
|
|
|
|
``__import__`` function in the built-in name space. ``__import__`` is then
|
|
|
|
|
called with four arguments, amongst which are the name of the module being
|
|
|
|
|
imported (may be a dotted name) and a reference to the current global
|
|
|
|
|
namespace.
|
|
|
|
|
|
|
|
|
|
The built-in ``__import__`` function (known as ``PyImport_ImportModuleEx()``
|
|
|
|
|
in ``import.c``) will then check to see whether the module doing the import is
|
|
|
|
|
a package or a submodule of a package. If it is indeed a (submodule of a)
|
|
|
|
|
package, it first tries to do the import relative to the package (the parent
|
2016-07-11 11:14:08 -04:00
|
|
|
|
package for a submodule). For example, if a package named "spam" does "import
|
2012-05-03 11:47:57 -04:00
|
|
|
|
eggs", it will first look for a module named "spam.eggs". If that fails, the
|
|
|
|
|
import continues as an absolute import: it will look for a module named
|
|
|
|
|
"eggs". Dotted name imports work pretty much the same: if package "spam" does
|
|
|
|
|
"import eggs.bacon" (and "spam.eggs" exists and is itself a package),
|
|
|
|
|
"spam.eggs.bacon" is tried. If that fails "eggs.bacon" is tried. (There are
|
|
|
|
|
more subtleties that are not described here, but these are not relevant for
|
|
|
|
|
implementers of the Importer Protocol.)
|
|
|
|
|
|
|
|
|
|
Deeper down in the mechanism, a dotted name import is split up by its
|
|
|
|
|
components. For "import spam.ham", first an "import spam" is done, and only
|
|
|
|
|
when that succeeds is "ham" imported as a submodule of "spam".
|
|
|
|
|
|
|
|
|
|
The Importer Protocol operates at this level of *individual* imports. By the
|
|
|
|
|
time an importer gets a request for "spam.ham", module "spam" has already been
|
|
|
|
|
imported.
|
|
|
|
|
|
|
|
|
|
The protocol involves two objects: a *finder* and a *loader*. A finder object
|
|
|
|
|
has a single method::
|
|
|
|
|
|
|
|
|
|
finder.find_module(fullname, path=None)
|
|
|
|
|
|
|
|
|
|
This method will be called with the fully qualified name of the module. If
|
|
|
|
|
the finder is installed on ``sys.meta_path``, it will receive a second
|
|
|
|
|
argument, which is ``None`` for a top-level module, or ``package.__path__``
|
|
|
|
|
for submodules or subpackages [5]_. It should return a loader object if the
|
|
|
|
|
module was found, or ``None`` if it wasn't. If ``find_module()`` raises an
|
|
|
|
|
exception, it will be propagated to the caller, aborting the import.
|
|
|
|
|
|
|
|
|
|
A loader object also has one method::
|
|
|
|
|
|
|
|
|
|
loader.load_module(fullname)
|
|
|
|
|
|
|
|
|
|
This method returns the loaded module or raises an exception, preferably
|
|
|
|
|
``ImportError`` if an existing exception is not being propagated. If
|
|
|
|
|
``load_module()`` is asked to load a module that it cannot, ``ImportError`` is
|
|
|
|
|
to be raised.
|
|
|
|
|
|
|
|
|
|
In many cases the finder and loader can be one and the same object:
|
|
|
|
|
``finder.find_module()`` would just return ``self``.
|
|
|
|
|
|
|
|
|
|
The ``fullname`` argument of both methods is the fully qualified module name,
|
|
|
|
|
for example "spam.eggs.ham". As explained above, when
|
|
|
|
|
``finder.find_module("spam.eggs.ham")`` is called, "spam.eggs" has already
|
|
|
|
|
been imported and added to ``sys.modules``. However, the ``find_module()``
|
|
|
|
|
method isn't necessarily always called during an actual import: meta tools
|
|
|
|
|
that analyze import dependencies (such as freeze, Installer or py2exe) don't
|
|
|
|
|
actually load modules, so a finder shouldn't *depend* on the parent package
|
|
|
|
|
being available in ``sys.modules``.
|
|
|
|
|
|
|
|
|
|
The ``load_module()`` method has a few responsibilities that it must fulfill
|
|
|
|
|
*before* it runs any code:
|
|
|
|
|
|
2016-05-03 04:18:02 -04:00
|
|
|
|
* If there is an existing module object named 'fullname' in ``sys.modules``,
|
|
|
|
|
the loader must use that existing module. (Otherwise, the ``reload()``
|
|
|
|
|
builtin will not work correctly.) If a module named 'fullname' does not
|
|
|
|
|
exist in ``sys.modules``, the loader must create a new module object and
|
|
|
|
|
add it to ``sys.modules``.
|
|
|
|
|
|
|
|
|
|
Note that the module object *must* be in ``sys.modules`` before the loader
|
|
|
|
|
executes the module code. This is crucial because the module code may
|
|
|
|
|
(directly or indirectly) import itself; adding it to ``sys.modules``
|
|
|
|
|
beforehand prevents unbounded recursion in the worst case and multiple
|
|
|
|
|
loading in the best.
|
|
|
|
|
|
|
|
|
|
If the load fails, the loader needs to remove any module it may have
|
|
|
|
|
inserted into ``sys.modules``. If the module was already in ``sys.modules``
|
|
|
|
|
then the loader should leave it alone.
|
|
|
|
|
|
|
|
|
|
* The ``__file__`` attribute must be set. This must be a string, but it may
|
|
|
|
|
be a dummy value, for example "<frozen>". The privilege of not having a
|
|
|
|
|
``__file__`` attribute at all is reserved for built-in modules.
|
|
|
|
|
|
|
|
|
|
* The ``__name__`` attribute must be set. If one uses ``imp.new_module()``
|
|
|
|
|
then the attribute is set automatically.
|
|
|
|
|
|
|
|
|
|
* If it's a package, the ``__path__`` variable must be set. This must be a
|
|
|
|
|
list, but may be empty if ``__path__`` has no further significance to the
|
|
|
|
|
importer (more on this later).
|
|
|
|
|
|
|
|
|
|
* The ``__loader__`` attribute must be set to the loader object. This is
|
|
|
|
|
mostly for introspection and reloading, but can be used for
|
|
|
|
|
importer-specific extras, for example getting data associated with an
|
|
|
|
|
importer.
|
|
|
|
|
|
2022-01-21 06:03:51 -05:00
|
|
|
|
* The ``__package__`` attribute must be set (:pep:`366`).
|
2016-05-03 04:18:02 -04:00
|
|
|
|
|
|
|
|
|
If the module is a Python module (as opposed to a built-in module or a
|
|
|
|
|
dynamically loaded extension), it should execute the module's code in the
|
|
|
|
|
module's global name space (``module.__dict__``).
|
|
|
|
|
|
|
|
|
|
Here is a minimal pattern for a ``load_module()`` method::
|
|
|
|
|
|
|
|
|
|
# Consider using importlib.util.module_for_loader() to handle
|
|
|
|
|
# most of these details for you.
|
|
|
|
|
def load_module(self, fullname):
|
|
|
|
|
code = self.get_code(fullname)
|
|
|
|
|
ispkg = self.is_package(fullname)
|
|
|
|
|
mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
|
|
|
|
|
mod.__file__ = "<%s>" % self.__class__.__name__
|
|
|
|
|
mod.__loader__ = self
|
|
|
|
|
if ispkg:
|
|
|
|
|
mod.__path__ = []
|
|
|
|
|
mod.__package__ = fullname
|
|
|
|
|
else:
|
|
|
|
|
mod.__package__ = fullname.rpartition('.')[0]
|
|
|
|
|
exec(code, mod.__dict__)
|
|
|
|
|
return mod
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specification part 2: Registering Hooks
|
2012-05-03 11:47:57 -04:00
|
|
|
|
=======================================
|
|
|
|
|
|
|
|
|
|
There are two types of import hooks: *Meta hooks* and *Path hooks*. Meta
|
|
|
|
|
hooks are called at the start of import processing, before any other import
|
|
|
|
|
processing (so that meta hooks can override ``sys.path`` processing, frozen
|
|
|
|
|
modules, or even built-in modules). To register a meta hook, simply add the
|
|
|
|
|
finder object to ``sys.meta_path`` (the list of registered meta hooks).
|
|
|
|
|
|
|
|
|
|
Path hooks are called as part of ``sys.path`` (or ``package.__path__``)
|
|
|
|
|
processing, at the point where their associated path item is encountered. A
|
|
|
|
|
path hook is registered by adding an importer factory to ``sys.path_hooks``.
|
|
|
|
|
|
|
|
|
|
``sys.path_hooks`` is a list of callables, which will be checked in sequence
|
|
|
|
|
to determine if they can handle a given path item. The callable is called
|
|
|
|
|
with one argument, the path item. The callable must raise ``ImportError`` if
|
|
|
|
|
it is unable to handle the path item, and return an importer object if it can
|
|
|
|
|
handle the path item. Note that if the callable returns an importer object
|
|
|
|
|
for a specific ``sys.path`` entry, the builtin import machinery will not be
|
|
|
|
|
invoked to handle that entry any longer, even if the importer object later
|
|
|
|
|
fails to find a specific module. The callable is typically the class of the
|
|
|
|
|
import hook, and hence the class ``__init__()`` method is called. (This is
|
|
|
|
|
also the reason why it should raise ``ImportError``: an ``__init__()`` method
|
|
|
|
|
can't return anything. This would be possible with a ``__new__()`` method in
|
|
|
|
|
a new style class, but we don't want to require anything about how a hook is
|
|
|
|
|
implemented.)
|
|
|
|
|
|
|
|
|
|
The results of path hook checks are cached in ``sys.path_importer_cache``,
|
|
|
|
|
which is a dictionary mapping path entries to importer objects. The cache is
|
|
|
|
|
checked before ``sys.path_hooks`` is scanned. If it is necessary to force a
|
|
|
|
|
rescan of ``sys.path_hooks``, it is possible to manually clear all or part of
|
|
|
|
|
``sys.path_importer_cache``.
|
|
|
|
|
|
|
|
|
|
Just like ``sys.path`` itself, the new ``sys`` variables must have specific
|
|
|
|
|
types:
|
|
|
|
|
|
2016-05-03 04:18:02 -04:00
|
|
|
|
* ``sys.meta_path`` and ``sys.path_hooks`` must be Python lists.
|
|
|
|
|
* ``sys.path_importer_cache`` must be a Python dict.
|
2012-05-03 11:47:57 -04:00
|
|
|
|
|
|
|
|
|
Modifying these variables in place is allowed, as is replacing them with new
|
|
|
|
|
objects.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Packages and the role of ``__path__``
|
|
|
|
|
=====================================
|
|
|
|
|
|
|
|
|
|
If a module has a ``__path__`` attribute, the import mechanism will treat it
|
|
|
|
|
as a package. The ``__path__`` variable is used instead of ``sys.path`` when
|
|
|
|
|
importing submodules of the package. The rules for ``sys.path`` therefore
|
|
|
|
|
also apply to ``pkg.__path__``. So ``sys.path_hooks`` is also consulted when
|
|
|
|
|
``pkg.__path__`` is traversed. Meta importers don't necessarily use
|
|
|
|
|
``sys.path`` at all to do their work and may therefore ignore the value of
|
|
|
|
|
``pkg.__path__``. In this case it is still advised to set it to list, which
|
|
|
|
|
can be empty.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
2002-12-23 17:13:48 -05:00
|
|
|
|
Optional Extensions to the Importer Protocol
|
2012-05-03 11:47:57 -04:00
|
|
|
|
============================================
|
2002-12-23 17:13:48 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
The Importer Protocol defines three optional extensions. One is to retrieve
|
|
|
|
|
data files, the second is to support module packaging tools and/or tools that
|
|
|
|
|
analyze module dependencies (for example Freeze), while the last is to support
|
|
|
|
|
execution of modules as scripts. The latter two categories of tools usually
|
|
|
|
|
don't actually *load* modules, they only need to know if and where they are
|
|
|
|
|
available. All three extensions are highly recommended for general purpose
|
|
|
|
|
importers, but may safely be left out if those features aren't needed.
|
2009-02-07 21:41:22 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
To retrieve the data for arbitrary "files" from the underlying storage
|
|
|
|
|
backend, loader objects may supply a method named ``get_data()``::
|
2002-12-23 17:13:48 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
loader.get_data(path)
|
|
|
|
|
|
|
|
|
|
This method returns the data as a string, or raise ``IOError`` if the "file"
|
|
|
|
|
wasn't found. The data is always returned as if "binary" mode was used -
|
|
|
|
|
there is no CRLF translation of text files, for example. It is meant for
|
|
|
|
|
importers that have some file-system-like properties. The 'path' argument is
|
|
|
|
|
a path that can be constructed by munging ``module.__file__`` (or
|
|
|
|
|
``pkg.__path__`` items) with the ``os.path.*`` functions, for example::
|
|
|
|
|
|
|
|
|
|
d = os.path.dirname(__file__)
|
|
|
|
|
data = __loader__.get_data(os.path.join(d, "logo.gif"))
|
|
|
|
|
|
|
|
|
|
The following set of methods may be implemented if support for (for example)
|
|
|
|
|
Freeze-like tools is desirable. It consists of three additional methods
|
|
|
|
|
which, to make it easier for the caller, each of which should be implemented,
|
|
|
|
|
or none at all::
|
|
|
|
|
|
|
|
|
|
loader.is_package(fullname)
|
|
|
|
|
loader.get_code(fullname)
|
|
|
|
|
loader.get_source(fullname)
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
All three methods should raise ``ImportError`` if the module wasn't found.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
The ``loader.is_package(fullname)`` method should return ``True`` if the
|
|
|
|
|
module specified by 'fullname' is a package and ``False`` if it isn't.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
The ``loader.get_code(fullname)`` method should return the code object
|
|
|
|
|
associated with the module, or ``None`` if it's a built-in or extension
|
|
|
|
|
module. If the loader doesn't have the code object but it *does* have the
|
|
|
|
|
source code, it should return the compiled source code. (This is so that our
|
|
|
|
|
caller doesn't also need to check ``get_source()`` if all it needs is the code
|
|
|
|
|
object.)
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
The ``loader.get_source(fullname)`` method should return the source code for
|
|
|
|
|
the module as a string (using newline characters for line endings) or ``None``
|
|
|
|
|
if the source is not available (yet it should still raise ``ImportError`` if
|
|
|
|
|
the module can't be found by the importer at all).
|
|
|
|
|
|
2022-01-21 06:03:51 -05:00
|
|
|
|
To support execution of modules as scripts (:pep:`338`),
|
|
|
|
|
the above three methods for
|
2012-05-03 11:47:57 -04:00
|
|
|
|
finding the code associated with a module must be implemented. In addition to
|
|
|
|
|
those methods, the following method may be provided in order to allow the
|
|
|
|
|
``runpy`` module to correctly set the ``__file__`` attribute::
|
|
|
|
|
|
|
|
|
|
loader.get_filename(fullname)
|
|
|
|
|
|
|
|
|
|
This method should return the value that ``__file__`` would be set to if the
|
|
|
|
|
named module was loaded. If the module is not found, then ``ImportError``
|
|
|
|
|
should be raised.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Integration with the 'imp' module
|
|
|
|
|
=================================
|
|
|
|
|
|
|
|
|
|
The new import hooks are not easily integrated in the existing
|
|
|
|
|
``imp.find_module()`` and ``imp.load_module()`` calls. It's questionable
|
|
|
|
|
whether it's possible at all without breaking code; it is better to simply add
|
|
|
|
|
a new function to the ``imp`` module. The meaning of the existing
|
|
|
|
|
``imp.find_module()`` and ``imp.load_module()`` calls changes from: "they
|
|
|
|
|
expose the built-in import mechanism" to "they expose the basic *unhooked*
|
|
|
|
|
built-in import mechanism". They simply won't invoke any import hooks. A new
|
|
|
|
|
``imp`` module function is proposed (but not yet implemented) under the name
|
|
|
|
|
``get_loader()``, which is used as in the following pattern::
|
|
|
|
|
|
|
|
|
|
loader = imp.get_loader(fullname, path)
|
|
|
|
|
if loader is not None:
|
|
|
|
|
loader.load_module(fullname)
|
|
|
|
|
|
|
|
|
|
In the case of a "basic" import, one the `imp.find_module()` function would
|
|
|
|
|
handle, the loader object would be a wrapper for the current output of
|
|
|
|
|
``imp.find_module()``, and ``loader.load_module()`` would call
|
|
|
|
|
``imp.load_module()`` with that output.
|
|
|
|
|
|
|
|
|
|
Note that this wrapper is currently not yet implemented, although a Python
|
|
|
|
|
prototype exists in the ``test_importhooks.py`` script (the ``ImpWrapper``
|
|
|
|
|
class) included with the patch.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
2002-12-24 15:36:32 -05:00
|
|
|
|
Forward Compatibility
|
2012-05-03 11:47:57 -04:00
|
|
|
|
=====================
|
2002-12-24 15:36:32 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
Existing ``__import__`` hooks will not invoke new-style hooks by magic, unless
|
|
|
|
|
they call the original ``__import__`` function as a fallback. For example,
|
|
|
|
|
``ihooks.py``, ``iu.py`` and ``imputil.py`` are in this sense not forward
|
|
|
|
|
compatible with this PEP.
|
2002-12-24 15:36:32 -05:00
|
|
|
|
|
|
|
|
|
|
2002-12-20 08:07:24 -05:00
|
|
|
|
Open Issues
|
2012-05-03 11:47:57 -04:00
|
|
|
|
===========
|
|
|
|
|
|
|
|
|
|
Modules often need supporting data files to do their job, particularly in the
|
|
|
|
|
case of complex packages or full applications. Current practice is generally
|
|
|
|
|
to locate such files via ``sys.path`` (or a ``package.__path__`` attribute).
|
|
|
|
|
This approach will not work, in general, for modules loaded via an import
|
|
|
|
|
hook.
|
|
|
|
|
|
|
|
|
|
There are a number of possible ways to address this problem:
|
|
|
|
|
|
2016-05-03 04:18:02 -04:00
|
|
|
|
* "Don't do that". If a package needs to locate data files via its
|
|
|
|
|
``__path__``, it is not suitable for loading via an import hook. The
|
|
|
|
|
package can still be located on a directory in ``sys.path``, as at present,
|
|
|
|
|
so this should not be seen as a major issue.
|
|
|
|
|
|
|
|
|
|
* Locate data files from a standard location, rather than relative to the
|
|
|
|
|
module file. A relatively simple approach (which is supported by
|
|
|
|
|
distutils) would be to locate data files based on ``sys.prefix`` (or
|
|
|
|
|
``sys.exec_prefix``). For example, looking in
|
|
|
|
|
``os.path.join(sys.prefix, "data", package_name)``.
|
|
|
|
|
|
|
|
|
|
* Import hooks could offer a standard way of getting at data files relative
|
|
|
|
|
to the module file. The standard ``zipimport`` object provides a method
|
|
|
|
|
``get_data(name)`` which returns the content of the "file" called ``name``,
|
|
|
|
|
as a string. To allow modules to get at the importer object, ``zipimport``
|
|
|
|
|
also adds an attribute ``__loader__`` to the module, containing the
|
|
|
|
|
``zipimport`` object used to load the module. If such an approach is used,
|
|
|
|
|
it is important that client code takes care not to break if the
|
|
|
|
|
``get_data()`` method is not available, so it is not clear that this
|
|
|
|
|
approach offers a general answer to the problem.
|
2012-05-03 11:47:57 -04:00
|
|
|
|
|
|
|
|
|
It was suggested on python-dev that it would be useful to be able to receive a
|
|
|
|
|
list of available modules from an importer and/or a list of available data
|
|
|
|
|
files for use with the ``get_data()`` method. The protocol could grow two
|
|
|
|
|
additional extensions, say ``list_modules()`` and ``list_files()``. The
|
|
|
|
|
latter makes sense on loader objects with a ``get_data()`` method. However,
|
|
|
|
|
it's a bit unclear which object should implement ``list_modules()``: the
|
|
|
|
|
importer or the loader or both?
|
|
|
|
|
|
|
|
|
|
This PEP is biased towards loading modules from alternative places: it
|
|
|
|
|
currently doesn't offer dedicated solutions for loading modules from
|
|
|
|
|
alternative file formats or with alternative compilers. In contrast, the
|
|
|
|
|
``ihooks`` module from the standard library does have a fairly straightforward
|
|
|
|
|
way to do this. The Quixote project [7]_ uses this technique to import PTL
|
|
|
|
|
files as if they are ordinary Python modules. To do the same with the new
|
|
|
|
|
hooks would either mean to add a new module implementing a subset of
|
|
|
|
|
``ihooks`` as a new-style importer, or add a hookable built-in path importer
|
|
|
|
|
object.
|
|
|
|
|
|
|
|
|
|
There is no specific support within this PEP for "stacking" hooks. For
|
|
|
|
|
example, it is not obvious how to write a hook to load modules from ``tar.gz``
|
|
|
|
|
files by combining separate hooks to load modules from ``.tar`` and ``.gz``
|
|
|
|
|
files. However, there is no support for such stacking in the existing hook
|
|
|
|
|
mechanisms (either the basic "replace ``__import__``" method, or any of the
|
|
|
|
|
existing import hook modules) and so this functionality is not an obvious
|
|
|
|
|
requirement of the new mechanism. It may be worth considering as a future
|
|
|
|
|
enhancement, however.
|
|
|
|
|
|
|
|
|
|
It is possible (via ``sys.meta_path``) to add hooks which run before
|
|
|
|
|
``sys.path`` is processed. However, there is no equivalent way of adding
|
|
|
|
|
hooks to run after ``sys.path`` is processed. For now, if a hook is required
|
|
|
|
|
after ``sys.path`` has been processed, it can be simulated by adding an
|
|
|
|
|
arbitrary "cookie" string at the end of ``sys.path``, and having the required
|
|
|
|
|
hook associated with this cookie, via the normal ``sys.path_hooks``
|
|
|
|
|
processing. In the longer term, the path handling code will become a "real"
|
|
|
|
|
hook on ``sys.meta_path``, and at that stage it will be possible to insert
|
|
|
|
|
user-defined hooks either before or after it.
|
2003-01-02 13:47:04 -05:00
|
|
|
|
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
Implementation
|
2012-05-03 11:47:57 -04:00
|
|
|
|
==============
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2022-01-21 06:03:51 -05:00
|
|
|
|
The :pep:`302` implementation has been integrated with Python as of 2.3a1. An
|
2012-05-03 11:47:57 -04:00
|
|
|
|
earlier version is available as patch #652586 [9]_, but more interestingly,
|
|
|
|
|
the issue contains a fairly detailed history of the development and design.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2022-01-21 06:03:51 -05:00
|
|
|
|
:pep:`273` has been implemented using :pep:`302`'s import hooks.
|
2004-09-27 21:11:15 -04:00
|
|
|
|
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2002-12-23 17:13:48 -05:00
|
|
|
|
References and Footnotes
|
2012-05-03 11:47:57 -04:00
|
|
|
|
========================
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [1] imputil module
|
|
|
|
|
http://docs.python.org/library/imputil.html
|
2002-12-23 17:13:48 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [2] The Freeze tool.
|
|
|
|
|
See also the ``Tools/freeze/`` directory in a Python source distribution
|
2002-12-23 17:13:48 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [3] py2exe by Thomas Heller
|
|
|
|
|
http://www.py2exe.org/
|
2002-12-23 17:13:48 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [4] imp.set_frozenmodules() patch
|
|
|
|
|
http://bugs.python.org/issue642578
|
2002-12-23 17:13:48 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [5] The path argument to ``finder.find_module()`` is there because the
|
|
|
|
|
``pkg.__path__`` variable may be needed at this point. It may either come
|
|
|
|
|
from the actual parent module or be supplied by ``imp.find_module()`` or
|
|
|
|
|
the proposed ``imp.get_loader()`` function.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [7] Quixote, a framework for developing Web applications
|
|
|
|
|
http://www.mems-exchange.org/software/quixote/
|
2002-12-24 15:36:32 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
.. [9] New import hooks + Import from Zip files
|
|
|
|
|
http://bugs.python.org/issue652586
|
2012-04-27 17:41:29 -04:00
|
|
|
|
|
2013-05-04 14:04:46 -04:00
|
|
|
|
.. [10] Language reference for imports
|
|
|
|
|
http://docs.python.org/3/reference/import.html
|
|
|
|
|
|
|
|
|
|
.. [11] importlib documentation
|
|
|
|
|
http://docs.python.org/3/library/importlib.html#module-importlib
|
|
|
|
|
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
Copyright
|
2012-05-03 11:47:57 -04:00
|
|
|
|
=========
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
This document has been placed in the public domain.
|
2002-12-20 08:07:24 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2012-05-03 11:47:57 -04:00
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
coding: utf-8
|
|
|
|
|
End:
|