Finally publish the import engine PEP on python.org

Nick Coghlan 2011-11-13 15:04:07 +10:00
parent 13fec55fba
commit 8a31bb792d
1 changed files with 247 additions and 0 deletions

pep-0406.txt Normal file

@ -0,0 +1,247 @@
PEP: 406
Title: Improved Encapsulation of Import State
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>, Greg Slodkowicz <jergosh@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-Jul-2011
Post-History: 31-Jul-2011, 13-Nov-2011
Abstract
========
This PEP proposes the introduction of a new ``ImportEngine`` class as part of
``importlib`` which would encapsulate all state related to importing modules
into a single object. Creating new instances of this object would then provide
an alternative to completely replacing the built-in implementation of the
import statement by overriding the ``__import__()`` function. To work with
both the builtin import functionality and importing via import engine objects,
module importers and loaders must accept an optional ``engine`` parameter. In
that sense, this PEP constitutes a revision of finder and loader interfaces
described in PEP 302 [1]_. However, the standard import process will not
supply the additional argument, so this proposal remains fully backwards
compatible.
The PEP also proposes inclusion of a ``GlobalImportEngine`` subclass and a
globally accessible instance of that class, which "writes through" to the
process global state and invokes importers and loaders without the additional
``engine`` argument. This provides a backwards compatible bridge between the
proposed encapsulated API and the legacy process global state.
Rationale
=========
Currently, most state related to the import system is stored as module level
attributes in the ``sys`` module. The one exception is the import lock, which
is not accessible directly, but only via the related functions in the ``imp``
module. The current process global import state comprises:
* sys.modules
* sys.path
* sys.path_hooks
* sys.meta_path
* sys.path_importer_cache
* the import lock (imp.lock_held()/acquire_lock()/release_lock())
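As a minimal illustration of the state listed above, the following sketch
takes shallow copies of the five ``sys`` attributes. The helper name
``snapshot_import_state`` is hypothetical and not part of this proposal, and
the import lock is deliberately left out because it is not exposed as a
plain attribute.

```python
import sys


def snapshot_import_state():
    """Return shallow copies of the process-global import state held in sys.

    Hypothetical helper for illustration only; the import lock is omitted.
    """
    return {
        "modules": dict(sys.modules),
        "path": list(sys.path),
        "path_hooks": list(sys.path_hooks),
        "meta_path": list(sys.meta_path),
        "path_importer_cache": dict(sys.path_importer_cache),
    }


state = snapshot_import_state()
# The snapshot is independent: changing it leaves sys.path untouched.
state["path"].append("/hypothetical/extra/dir")
```
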
Isolating this state would allow multiple import states to be
conveniently stored within a process. Placing the import functionality
in a self-contained object would also allow subclassing to add additional
features (e.g. module import notifications or fine-grained control
over which modules can be imported). The engine would also be
subclassed to make it possible to use the import engine API to
interact with the existing process-global state.
The namespace PEPs (especially PEP 402) raise a potential need for
*additional* process global state, in order to correctly update package paths
as ``sys.path`` is modified.
Proposal
========
We propose introducing an ImportEngine class to encapsulate import
functionality. This includes an ``__import__()`` method which can
be used as an alternative to the built-in ``__import__()`` when
desired and also an ``import_module()`` method, equivalent to
``importlib.import_module()`` [3]_.
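For reference, the two existing entry points that these methods mirror differ
in what they return for a dotted name. The short sketch below uses only the
current builtin ``__import__()`` and ``importlib.import_module()``; the
proposed engine methods would presumably follow the same conventions, but
that is an assumption rather than something shown here.

```python
import importlib
import os.path

# With an empty fromlist, __import__ returns the top-level package ...
top = __import__("os.path")
# ... while a non-empty fromlist makes it return the named submodule.
sub = __import__("os.path", fromlist=["join"])
# import_module always returns the module that was named.
named = importlib.import_module("os.path")
```
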
Since new-style finders and loaders should also have the option to
modify the global import state, we introduce a ``GlobalImportEngine``
class with an interface identical to ``ImportEngine`` but taking
advantage of the global state. This can be easily implemented using
class properties.
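One way the write-through behaviour could be expressed with properties is
sketched below. ``SketchGlobalEngine`` and ``_sys_attribute`` are
hypothetical names chosen for this example; the sketch covers only the state
attributes, not the import methods, and is not taken from the reference
implementation.

```python
import sys


def _sys_attribute(name):
    """Build a property that reads and writes the attribute ``sys.<name>``."""
    def getter(self):
        return getattr(sys, name)

    def setter(self, value):
        setattr(sys, name, value)

    return property(getter, setter)


class SketchGlobalEngine:
    """Hypothetical stand-in: every state attribute delegates to ``sys``."""
    modules = _sys_attribute("modules")
    path = _sys_attribute("path")
    path_hooks = _sys_attribute("path_hooks")
    meta_path = _sys_attribute("meta_path")
    path_importer_cache = _sys_attribute("path_importer_cache")


engine = SketchGlobalEngine()
# Reads return the very same objects that live in sys.
same_path_object = engine.path is sys.path
```

Because the properties hold no state of their own, rebinding
``engine.path`` rebinds ``sys.path`` as well, which is the "write through"
behaviour described in the abstract.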
Specification
=============
ImportEngine API
~~~~~~~~~~~~~~~~
The proposed extension consists of the following objects:
``importlib.engine.ImportEngine``
``from_engine(self, other)``
Create a new import object from another ImportEngine instance. The
new object is initialised with a copy of the state in ``other``. When
called on ``importlib.engine.sysengine``, ``from_engine()`` can be
used to create an ``ImportEngine`` object with a **copy** of the
global import state.
``__import__(self, name, globals={}, locals={}, fromlist=[], level=0)``
Reimplementation of the builtin ``__import__()`` function. The
import of a module will proceed using the state stored in the
ImportEngine instance rather than the global import state. For full
documentation of ``__import__`` functionality, see [2]_.
``__import__()`` from ``ImportEngine`` and its subclasses can be used
to customise the behaviour of the ``import`` statement by replacing
``builtins.__import__`` with ``ImportEngine().__import__``.
``import_module(name, package=None)``
A reimplementation of ``importlib.import_module()`` which uses the
import state stored in the ImportEngine instance. See [3]_ for a full
reference.
``modules, path, path_hooks, meta_path, path_importer_cache``
Instance-specific versions of their process global ``sys`` equivalents.
``importlib.engine.GlobalImportEngine(ImportEngine)``
Convenience class to provide engine-like access to the global state.
Provides ``__import__()``, ``import_module()`` and ``from_engine()``
methods like ``ImportEngine`` but writes through to the global state
in ``sys``.
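The copy semantics of ``from_engine()`` can be sketched as follows.
``SketchEngine`` is a hypothetical class holding only the state attributes;
writing ``from_engine()`` as a classmethod is an assumption made for this
example, and the ``sys`` module itself stands in for ``sysengine`` because
it exposes attributes with the same names.

```python
import sys


class SketchEngine:
    """Hypothetical engine that carries only the import state attributes."""

    def __init__(self):
        self.modules = {}
        self.path = []
        self.path_hooks = []
        self.meta_path = []
        self.path_importer_cache = {}

    @classmethod
    def from_engine(cls, other):
        """Create an engine initialised with shallow copies of other's state."""
        engine = cls()
        engine.modules = dict(other.modules)
        engine.path = list(other.path)
        engine.path_hooks = list(other.path_hooks)
        engine.meta_path = list(other.meta_path)
        engine.path_importer_cache = dict(other.path_importer_cache)
        return engine


# ``sys`` stands in for the proposed sysengine object (assumption).
isolated = SketchEngine.from_engine(sys)
isolated.path.append("/hypothetical/plugin/dir")
```

The containers are copied, but the module objects inside ``modules`` are
shared, so the isolated engine starts with the modules already imported by
the process while later changes to its containers stay local.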
Global variables
~~~~~~~~~~~~~~~~
``importlib.engine.sysengine``
A precreated instance of ``GlobalImportEngine``. Intended for use by
importers and loaders that have been updated to accept optional ``engine``
parameters and with ``ImportEngine.from_engine(sysengine)`` to start with
a copy of the process global import state.
Necessary changes to finder/loader interfaces:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``find_module(cls, fullname, path=None, engine=None)``
``load_module(cls, fullname, path=None, engine=None)``
The only difference between engine compatible and PEP 302 compatible
finders/loaders is the presence of an additional ``engine`` parameter.
This is intended to specify an instance of ``ImportEngine`` or of one of its
subclasses. This parameter is optional so that engine compatible finders and
loaders can be made backwards compatible with PEP 302 calling conventions by
falling back on ``importlib.engine.sysengine`` with the following simple pattern::

    def find_module(cls, fullname, path=None, engine=None):
        # Fall back to the process-global engine when called PEP 302 style.
        if engine is None:
            engine = importlib.engine.sysengine
        ...

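A fuller, runnable variant of this fallback pattern might look like the
sketch below. Since ``importlib.engine`` does not exist outside this
proposal, the ``sys`` module is used as a stand-in for ``sysengine``;
``SketchFinder`` is hypothetical, and it returns a file path where a real
finder would return a loader.

```python
import os
import sys
from types import SimpleNamespace

# Stand-in for the proposed importlib.engine.sysengine (assumption):
# the sys module already exposes a ``path`` attribute.
_SYSENGINE_STAND_IN = sys


class SketchFinder:
    """Hypothetical finder that accepts the optional ``engine`` argument."""

    @classmethod
    def find_module(cls, fullname, path=None, engine=None):
        # PEP 302 style callers pass no engine, so use the global state.
        if engine is None:
            engine = _SYSENGINE_STAND_IN
        search_path = engine.path if path is None else path
        filename = fullname.rpartition(".")[2] + ".py"
        for entry in search_path:
            candidate = os.path.join(entry or os.curdir, filename)
            if os.path.isfile(candidate):
                # A real finder would return a loader object here.
                return candidate
        return None


# An engine with an empty path finds nothing, whatever sys.path contains.
empty_engine = SimpleNamespace(path=[])
not_found = SketchFinder.find_module("os", engine=empty_engine)
```
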
Open Issues
===========
API design for falling back to global import state
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The current proposal relies on the ``from_engine()`` API to fall back to the
global import state. It may be desirable to offer a variant that instead falls
back to the global import state dynamically.
However, one big advantage of starting with an "as isolated as possible"
design is that it becomes possible to experiment with subclasses that blur
the boundaries between the engine instance state and the process global state
in various ways.
Builtin and extension modules must be process global
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Due to platform limitations, only one copy of each builtin and extension
module can readily exist in each process. Accordingly, it is impossible for
each ``ImportEngine`` instance to load such modules independently.
The simplest solution is for ``ImportEngine`` to refuse to load such modules,
raising ``ImportError``. ``GlobalImportEngine`` would be able to load them
normally.
``ImportEngine`` will still return such modules from a prepopulated module
cache - it's only loading them directly which causes problems.
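The refusal rule described above could be approximated as in the following
sketch. Both function names are hypothetical. Detecting extension modules by
checking for ``importlib.machinery.ExtensionFileLoader`` is one possible
heuristic rather than the reference implementation's approach, and loading
of ordinary Python modules is left out.

```python
import importlib.machinery
import importlib.util
import sys


def must_be_process_global(name):
    """Return True for builtin modules and file-based extension modules."""
    if name in sys.builtin_module_names:
        return True
    spec = importlib.util.find_spec(name)
    return spec is not None and isinstance(
        spec.loader, importlib.machinery.ExtensionFileLoader
    )


def sketch_load(name, module_cache):
    """Return a cached module, refusing to load process-global ones afresh."""
    if name in module_cache:
        # Prepopulated entries are returned as-is, even for builtins.
        return module_cache[name]
    if must_be_process_global(name):
        raise ImportError(
            f"{name!r} can only be loaded by the global engine", name=name
        )
    raise NotImplementedError("pure-Python loading is out of scope here")


cached = sketch_load("sys", {"sys": sys})
```
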
Nested imports
~~~~~~~~~~~~~~
The reference implementation currently applies only to the outermost import.
Any imports by the module being imported will be handled using the standard
import machinery.
One way to handle this is to place the burden on the implementation of module
loaders to set ``module.__dict__["__import__"] = engine.__import__`` before
running the module's code. The ``importlib`` design facilitates this by
allowing the change to be made in one place (``_LoaderBasics._load_module``).
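The sketch below shows one way a loader could route a module's nested
imports through an engine. Note the substitution of technique: in CPython
the ``import`` statement looks up ``__import__`` in the builtins namespace
of the executing frame, so this example installs a per-module
``__builtins__`` mapping instead of assigning the key directly on the module
dict. ``exec_with_engine_import`` and ``recording_import`` are hypothetical
names, and the recording function merely stands in for
``engine.__import__``.

```python
import builtins
import types


def exec_with_engine_import(source, module_name, engine_import):
    """Run source in a new module whose import statements use engine_import."""
    module = types.ModuleType(module_name)
    # Copy the real builtins, then swap in the engine's import function.
    module_builtins = dict(vars(builtins))
    module_builtins["__import__"] = engine_import
    module.__dict__["__builtins__"] = module_builtins
    exec(compile(source, f"<{module_name}>", "exec"), module.__dict__)
    return module


seen = []


def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Stand-in for engine.__import__: record the request, then delegate."""
    seen.append(name)
    return builtins.__import__(name, globals, locals, fromlist, level)


mod = exec_with_engine_import("import json\n", "sketch_nested", recording_import)
```

Only the import statement executed inside the new module reaches
``recording_import``; imports performed internally by ``json`` still use
the standard machinery, which mirrors the limitation described above.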
Scope of API updates
~~~~~~~~~~~~~~~~~~~~
The reference implementation focuses on finding and loading modules. There
may be other PEP 302 APIs that should also be updated to accept an optional
``engine`` parameter.
Reference Implementation
========================
A reference implementation [4]_ based on Brett Cannon's importlib has been
developed by Greg Slodkowicz as part of the 2011 Google Summer of Code. Note
that the current implementation avoids modifying existing code, and hence
duplicates a lot of things unnecessarily. An actual implementation would just
modify any such affected code in place.
References
==========
.. [1] PEP 302, New Import Hooks, van Rossum, Moore
(http://www.python.org/dev/peps/pep-0302)
.. [2] __import__() builtin function, The Python Standard Library documentation
(http://docs.python.org/library/functions.html#__import__)
.. [3] Importlib documentation, Cannon
(http://docs.python.org/dev/library/importlib)
.. [4] Reference implementation
(https://bitbucket.org/jergosh/gsoc_import_engine/src/default/Lib/importlib/engine.py)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: