836 lines
32 KiB
Plaintext
836 lines
32 KiB
Plaintext
PEP: 451
|
||
Title: A ModuleSpec Type for the Import System
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Eric Snow <ericsnowcurrently@gmail.com>
|
||
Discussions-To: import-sig@python.org
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 8-Aug-2013
|
||
Python-Version: 3.4
|
||
Post-History: 8-Aug-2013, 28-Aug-2013, 18-Sep-2013, 24-Sep-2013, 4-Oct-2013
|
||
Resolution:
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes to add a new class to importlib.machinery called
|
||
"ModuleSpec". It will provide all the import-related information used
|
||
to load a module and will be available without needing to load the
|
||
module first. Finders will directly provide a module's spec instead of
|
||
a loader (which they will continue to provide indirectly). The import
|
||
machinery will be adjusted to take advantage of module specs, including
|
||
using them to load modules.
|
||
|
||
|
||
Terms and Concepts
|
||
==================
|
||
|
||
The changes in this proposal are an opportunity to make several
|
||
existing terms and concepts more clear, whereas currently they are
|
||
(unfortunately) ambiguous. New concepts are also introduced in this
|
||
proposal. Finally, it's worth explaining a few other existing terms
|
||
with which people may not be so familiar. For the sake of context, here
|
||
is a brief summary of all three groups of terms and concepts. A more
|
||
detailed explanation of the import system is found at
|
||
[import_system_docs]_.
|
||
|
||
name
|
||
----
|
||
|
||
In this proposal, a module's "name" refers to its fully-qualified name,
|
||
meaning the fully-qualified name of the module's parent (if any) joined
|
||
to the simple name of the module by a period.
|
||
|
||
finder
|
||
------
|
||
|
||
A "finder" is an object that identifies the loader that the import
|
||
system should use to load a module. Currently this is accomplished by
|
||
calling the finder's find_module() method, which returns the loader.
|
||
|
||
Finders are strictly responsible for providing the loader, which they do
|
||
through their find_module() method. The import system then uses that
|
||
loader to load the module.
|
||
|
||
loader
|
||
------
|
||
|
||
A "loader" is an object that is used to load a module during import.
|
||
Currently this is done by calling the loader's load_module() method. A
|
||
loader may also provide APIs for getting information about the modules
|
||
it can load, as well as about data from sources associated with such a
|
||
module.
|
||
|
||
Right now loaders (via load_module()) are responsible for certain
|
||
boilerplate, import-related operations. These are:
|
||
|
||
1. perform some (module-related) validation;
|
||
2. create the module object;
|
||
3. set import-related attributes on the module;
|
||
4. "register" the module to sys.modules;
|
||
5. exec the module;
|
||
6. clean up in the event of failure while loading the module.
|
||
|
||
This all takes place during the import system's call to
|
||
Loader.load_module().
|
||
|
||
origin
|
||
------
|
||
|
||
This is a new term and concept. The idea of it exists subtly in the
|
||
import system already, but this proposal makes the concept explicit.
|
||
|
||
"origin" in an import context means the system (or resource within a
|
||
system) from which a module originates. For the purposes of this
|
||
proposal, "origin" is also a string which identifies such a resource or
|
||
system. "origin" is applicable to all modules.
|
||
|
||
For example, the origin for built-in and frozen modules is the
|
||
interpreter itself. The import system already identifies this origin as
|
||
"built-in" and "frozen", respectively. This is demonstrated in the
|
||
following module repr: "<module 'sys' (built-in)>".
|
||
|
||
In fact, the module repr is already a relatively reliable, though
|
||
implicit, indicator of a module's origin. Other modules also indicate
|
||
their origin through other means, as described in the entry for
|
||
"location".
|
||
|
||
It is up to the loader to decide on how to interpret and use a module's
|
||
origin, if at all.
|
||
|
||
location
|
||
--------
|
||
|
||
This is a new term. However the concept already exists clearly in the
|
||
import system, as associated with the ``__file__`` and ``__path__``
|
||
attributes of modules, as well as the name/term "path" elsewhere.
|
||
|
||
A "location" is a resource or "place", rather than a system at large,
|
||
from which a module is loaded. It qualifies as an "origin". Examples
|
||
of locations include filesystem paths and URLs. A location is
|
||
identified by the name of the resource, but may not necessarily identify
|
||
the system to which the resource pertains. In such cases the loader
|
||
would have to identify the system itself.
|
||
|
||
In contrast to other kinds of module origin, a location cannot be
|
||
inferred by the loader just by the module name. Instead, the loader
|
||
must be provided with a string to identify the location, usually by the
|
||
finder that generates the loader. The loader then uses this information
|
||
to locate the resource from which it will load the module. In theory
|
||
you could load the module at a given location under various names.
|
||
|
||
The most common example of locations in the import system are the
|
||
files from which source and extension modules are loaded. For these
|
||
modules the location is identified by the string in the ``__file__``
|
||
attribute. Although ``__file__`` isn't particularly accurate for some
|
||
modules (e.g. zipped), it is currently the only way that the import
|
||
system indicates that a module has a location.
|
||
|
||
A module that has a location may be called "locatable".
|
||
|
||
cache
|
||
-----
|
||
|
||
The import system stores compiled modules in the __pycache__ directory
|
||
as an optimization. This module cache that we use today was provided by
|
||
PEP 3147. For this proposal, the relevant API for module caching is the
|
||
``__cache__`` attribute of modules and the cache_from_source() function
|
||
in importlib.util. Loaders are responsible for putting modules into the
|
||
cache (and loading out of the cache). Currently the cache is only used
|
||
for compiled source modules. However, loaders may take advantage of
|
||
the module cache for other kinds of modules.
|
||
|
||
package
|
||
-------
|
||
|
||
The concept does not change, nor does the term. However, the
|
||
distinction between modules and packages is mostly superficial.
|
||
Packages *are* modules. They simply have a ``__path__`` attribute and
|
||
import may add attributes bound to submodules. The typically perceived
|
||
difference is a source of confusion. This proposal explicitly
|
||
de-emphasizes the distinction between packages and modules where it
|
||
makes sense to do so.
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
The import system has evolved over the lifetime of Python. In late 2002
|
||
PEP 302 introduced standardized import hooks via finders and
|
||
loaders and sys.meta_path. The importlib module, introduced
|
||
with Python 3.1, now exposes a pure Python implementation of the APIs
|
||
described by PEP 302, as well as of the full import system. It is now
|
||
much easier to understand and extend the import system. While a benefit
|
||
to the Python community, this greater accessibilty also presents a
|
||
challenge.
|
||
|
||
As more developers come to understand and customize the import system,
|
||
any weaknesses in the finder and loader APIs will be more impactful. So
|
||
the sooner we can address any such weaknesses the import system, the
|
||
better...and there are a couple we can take care of with this proposal.
|
||
|
||
Firstly, any time the import system needs to save information about a
|
||
module we end up with more attributes on module objects that are
|
||
generally only meaningful to the import system. It would be nice to
|
||
have a per-module namespace in which to put future import-related
|
||
information and to pass around within the import system. Secondly,
|
||
there's an API void between finders and loaders that causes undue
|
||
complexity when encountered. The PEP 420 (namespace packages)
|
||
implementation had to work around this. The complexity surfaced again
|
||
during recent efforts on a separate proposal. [ref_files_pep]_
|
||
|
||
The `finder`_ and `loader`_ sections above detail current responsibility
|
||
of both. Notably, loaders are not required to provide any of the
|
||
functionality of their load_module() through other methods. Thus,
|
||
though the import-related information about a module is likely available
|
||
without loading the module, it is not otherwise exposed.
|
||
|
||
Furthermore, the requirements assocated with load_module() are
|
||
common to all loaders and mostly are implemented in exactly the same
|
||
way. This means every loader has to duplicate the same boilerplate
|
||
code. importlib.util provides some tools that help with this, but
|
||
it would be more helpful if the import system simply took charge of
|
||
these responsibilities. The trouble is that this would limit the degree
|
||
of customization that load_module() could easily continue to facilitate.
|
||
|
||
More importantly, While a finder *could* provide the information that
|
||
the loader's load_module() would need, it currently has no consistent
|
||
way to get it to the loader. This is a gap between finders and loaders
|
||
which this proposal aims to fill.
|
||
|
||
Finally, when the import system calls a finder's find_module(), the
|
||
finder makes use of a variety of information about the module that is
|
||
useful outside the context of the method. Currently the options are
|
||
limited for persisting that per-module information past the method call,
|
||
since it only returns the loader. Popular options for this limitation
|
||
are to store the information in a module-to-info mapping somewhere on
|
||
the finder itself, or store it on the loader.
|
||
|
||
Unfortunately, loaders are not required to be module-specific. On top
|
||
of that, some of the useful information finders could provide is
|
||
common to all finders, so ideally the import system could take care of
|
||
those details. This is the same gap as before between finders and
|
||
loaders.
|
||
|
||
As an example of complexity attributable to this flaw, the
|
||
implementation of namespace packages in Python 3.3 (see PEP 420) added
|
||
FileFinder.find_loader() because there was no good way for
|
||
find_module() to provide the namespace search locations.
|
||
|
||
The answer to this gap is a ModuleSpec object that contains the
|
||
per-module information and takes care of the boilerplate functionality
|
||
involved with loading the module.
|
||
|
||
|
||
Specification
|
||
=============
|
||
|
||
The goal is to address the gap between finders and loaders while
|
||
changing as little of their semantics as possible. Though some
|
||
functionality and information is moved to the new ModuleSpec type,
|
||
their behavior should remain the same. However, for the sake of clarity
|
||
the finder and loader semantics will be explicitly identified.
|
||
|
||
Here is a high-level summary of the changes described by this PEP. More
|
||
detail is available in later sections.
|
||
|
||
importlib.machinery.ModuleSpec (new)
|
||
------------------------------------
|
||
|
||
A specification for a module's import-system-related state. See the
|
||
`ModuleSpec`_ section below for a more detailed description.
|
||
|
||
* ModuleSpec(name, loader, \*, origin=None, loader_state=None, is_package=None)
|
||
|
||
Attributes:
|
||
|
||
* name - a string for the fully-qualified name of the module.
|
||
* loader - the loader to use for loading.
|
||
* origin - the name of the place from which the module is loaded,
|
||
e.g. "builtin" for built-in modules and the filename for modules
|
||
loaded from source.
|
||
* submodule_search_locations - list of strings for where to find
|
||
submodules, if a package (None otherwise).
|
||
* loader_state - a container of extra module-specific data for use
|
||
during loading.
|
||
* cached (property) - a string for where the compiled module should be
|
||
stored.
|
||
* parent (RO-property) - the fully-qualified name of the package to
|
||
which the module belongs as a submodule (or None).
|
||
* has_location (RO-property) - a flag indicating whether or not the
|
||
module's "origin" attribute refers to a location.
|
||
|
||
Instance Methods:
|
||
|
||
* module_repr() - provide a repr string for the spec'ed module;
|
||
non-locatable modules will use their origin (e.g. "built-in").
|
||
* init_module_attrs(module) - set any of a module's import-related
|
||
attributes that aren't already set.
|
||
|
||
importlib.util Additions
|
||
------------------------
|
||
|
||
These are ModuleSpec factory functions, meant as a convenience for
|
||
finders. See the `Factory Functions`_ section below for more detail.
|
||
|
||
* spec_from_file_location(name, location, \*, loader=None, submodule_search_locations=None)
|
||
- build a spec from file-oriented information and loader APIs.
|
||
* from_loader(name, loader, \*, origin=None, is_package=None) - build
|
||
a spec with missing information filled in by using loader APIs.
|
||
|
||
Other API Additions
|
||
-------------------
|
||
|
||
* importlib.find_spec(name, path=None) will work exactly the same as
|
||
importlib.find_loader() (which it replaces), but return a spec instead
|
||
of a loader.
|
||
|
||
For loaders:
|
||
|
||
* importlib.abc.Loader.exec_module(module) will execute a module in its
|
||
own namespace. It replaces importlib.abc.Loader.load_module(), taking
|
||
over its module execution functionality.
|
||
* importlib.abc.Loader.create_module(spec) (optional) will return the
|
||
module to use for loading.
|
||
|
||
For modules:
|
||
|
||
* Module objects will have a new attribute: ``__spec__``.
|
||
|
||
API Changes
|
||
-----------
|
||
|
||
* InspectLoader.is_package() will become optional.
|
||
|
||
Deprecations
|
||
------------
|
||
|
||
* importlib.abc.MetaPathFinder.find_module()
|
||
* importlib.abc.PathEntryFinder.find_module()
|
||
* importlib.abc.PathEntryFinder.find_loader()
|
||
* importlib.abc.Loader.load_module()
|
||
* importlib.abc.Loader.module_repr()
|
||
* The parameters and attributes of the various loaders in
|
||
importlib.machinery
|
||
* importlib.util.set_package()
|
||
* importlib.util.set_loader()
|
||
* importlib.find_loader()
|
||
|
||
Removals
|
||
--------
|
||
|
||
These were introduced prior to Python 3.4's release, so they can simply
|
||
be removed.
|
||
|
||
* importlib.abc.Loader.init_module_attrs()
|
||
* importlib.util.module_to_load()
|
||
|
||
Other Changes
|
||
-------------
|
||
|
||
* The import system implementation in importlib will be changed to make
|
||
use of ModuleSpec.
|
||
* importlib.reload() will make use of ModuleSpec.
|
||
* A module's import-related attributes (other than ``__spec__``) will no
|
||
longer be used directly by the import system during that module's
|
||
import. However, this does not impact use of those attributes
|
||
(e.g. ``__path__``) when loading other modules (e.g. submodules).
|
||
* Import-related attributes should no longer be added to modules
|
||
directly, except by the import system.
|
||
* The module type's ``__repr__()`` will be a thin wrapper around a pure
|
||
Python implementation which will leverage ModuleSpec.
|
||
* The spec for the ``__main__`` module will reflect the appropriate
|
||
name and origin.
|
||
|
||
Backward-Compatibility
|
||
----------------------
|
||
|
||
* If a finder does not define find_spec(), a spec is derived from
|
||
the loader returned by find_module().
|
||
* PathEntryFinder.find_loader() still takes priority over
|
||
find_module().
|
||
* Loader.load_module() is used if exec_module() is not defined.
|
||
|
||
What Will not Change?
|
||
---------------------
|
||
|
||
* The syntax and semantics of the import statement.
|
||
* Existing finders and loaders will continue to work normally.
|
||
* The import-related module attributes will still be initialized with
|
||
the same information.
|
||
* Finders will still create loaders (now storing them in specs).
|
||
* Loader.load_module(), if a module defines it, will have all the
|
||
same requirements and may still be called directly.
|
||
* Loaders will still be responsible for module data APIs.
|
||
* importlib.reload() will still overwrite the import-related attributes.
|
||
|
||
Responsibilities
|
||
----------------
|
||
|
||
Here's a quick breakdown of where responsibilities lie after this PEP.
|
||
|
||
finders:
|
||
|
||
* create loader
|
||
* create spec
|
||
|
||
loaders:
|
||
|
||
* create module (optional)
|
||
* execute module
|
||
|
||
ModuleSpec:
|
||
|
||
* orchestrate module loading
|
||
* boilerplate for module loading, including managing sys.modules and
|
||
setting import-related attributes
|
||
* create module if loader doesn't
|
||
* call loader.exec_module(), passing in the module in which to exec
|
||
* contain all the information the loader needs to exec the module
|
||
* provide the repr for modules
|
||
|
||
|
||
What Will Existing Finders and Loaders Have to Do Differently?
|
||
==============================================================
|
||
|
||
Immediately? Nothing. The status quo will be deprecated, but will
|
||
continue working. However, here are the things that the authors of
|
||
finders and loaders should change relative to this PEP:
|
||
|
||
* Implement find_spec() on finders.
|
||
* Implement exec_module() on loaders, if possible.
|
||
|
||
The ModuleSpec factory functions in importlib.util are intended to be
|
||
helpful for converting existing finders. from_loader() and
|
||
from_file_location() are both straight-forward utilities in this
|
||
regard. In the case where loaders already expose methods for creating
|
||
and preparing modules, ModuleSpec.from_module() may be useful to
|
||
the corresponding finder.
|
||
|
||
For existing loaders, exec_module() should be a relatively direct
|
||
conversion from the non-boilerplate portion of load_module(). In some
|
||
uncommon cases the loader should also implement create_module().
|
||
|
||
|
||
ModuleSpec Users
|
||
================
|
||
|
||
ModuleSpec objects have 3 distinct target audiences: Python itself,
|
||
import hooks, and normal Python users.
|
||
|
||
Python will use specs in the import machinery, in interpreter startup,
|
||
and in various standard library modules. Some modules are
|
||
import-oriented, like pkgutil, and others are not, like pickle and
|
||
pydoc. In all cases, the full ModuleSpec API will get used.
|
||
|
||
Import hooks (finders and loaders) will make use of the spec in specific
|
||
ways. First of all, finders may use the spec factory functions in
|
||
importlib.util to create spec objects. They may also directly adjust
|
||
the spec attributes after the spec is created. Secondly, the finder may
|
||
bind additional information to the spec (in finder_extras) for the
|
||
loader to consume during module creation/execution. Finally, loaders
|
||
will make use of the attributes on a spec when creating and/or executing
|
||
a module.
|
||
|
||
Python users will be able to inspect a module's ``__spec__`` to get
|
||
import-related information about the object. Generally, Python
|
||
applications and interactive users will not be using the ``ModuleSpec``
|
||
factory functions nor any the instance methods.
|
||
|
||
|
||
How Loading Will Work
|
||
=====================
|
||
|
||
This is an outline of what happens in ModuleSpec's loading
|
||
functionality::
|
||
|
||
def load(spec):
|
||
if not hasattr(spec.loader, 'exec_module'):
|
||
module = spec.loader.load_module(spec.name)
|
||
spec.init_module_attrs(module)
|
||
return sys.modules[spec.name]
|
||
|
||
module = None
|
||
if hasattr(spec.loader, 'create_module'):
|
||
module = spec.loader.create_module(spec)
|
||
if module is None:
|
||
module = ModuleType(spec.name)
|
||
spec.init_module_attrs(module)
|
||
|
||
sys.modues[spec.name] = module
|
||
try:
|
||
spec.loader.exec_module(module)
|
||
except BaseException:
|
||
try:
|
||
del sys.modules[spec.name]
|
||
except KeyError:
|
||
pass
|
||
raise
|
||
return sys.modules[spec.name]
|
||
|
||
These steps are exactly what Loader.load_module() is already
|
||
expected to do. Loaders will thus be simplified since they will only
|
||
need to implement exec_module().
|
||
|
||
Note that we must return the module from sys.modules. During loading
|
||
the module may have replaced itself in sys.modules. Since we don't have
|
||
a post-import hook API to accommodate the use case, we have to deal with
|
||
it. However, in the replacement case we do not worry about setting the
|
||
import-related module attributes on the object. The module writer is on
|
||
their own if they are doing this.
|
||
|
||
|
||
ModuleSpec
|
||
==========
|
||
|
||
Attributes
|
||
----------
|
||
|
||
Each of the following names is an attribute on ModuleSpec objects. A
|
||
value of None indicates "not set". This contrasts with module
|
||
objects where the attribute simply doesn't exist. Most of the
|
||
attributes correspond to the import-related attributes of modules. Here
|
||
is the mapping. The reverse of this mapping is used by
|
||
ModuleSpec.init_module_attrs().
|
||
|
||
========================== ==============
|
||
On ModuleSpec On Modules
|
||
========================== ==============
|
||
name __name__
|
||
loader __loader__
|
||
package __package__
|
||
origin __file__*
|
||
cached __cached__*,**
|
||
submodule_search_locations __path__**
|
||
loader_state \-
|
||
has_location \-
|
||
========================== ==============
|
||
|
||
| \* Set on the module only if spec.has_location is true.
|
||
| \*\* Set on the module only if the spec attribute is not None.
|
||
|
||
While package and has_location are read-only properties, the remaining
|
||
attributes can be replaced after the module spec is created and even
|
||
after import is complete. This allows for unusual cases where directly
|
||
modifying the spec is the best option. However, typical use should not
|
||
involve changing the state of a module's spec.
|
||
|
||
**origin**
|
||
|
||
"origin" is a string for the name of the place from which the module
|
||
originates. See `origin`_ above. Aside from the informational value,
|
||
it is also used in module_repr(). In the case of a spec where
|
||
"has_location" is true, ``__file__`` is set to the value of "origin".
|
||
For built-in modules "origin" would be set to "built-in".
|
||
|
||
**has_location**
|
||
|
||
As explained in the `location`_ section above, many modules are
|
||
"locatable", meaning there is a corresponding resource from which the
|
||
module will be loaded and that resource can be described by a string.
|
||
In contrast, non-locatable modules can't be loaded in this fashion, e.g.
|
||
builtin modules and modules dynamically created in code. For these, the
|
||
name is the only way to access them, so they have an "origin" but not a
|
||
"location".
|
||
|
||
"has_location" is true if the module is locatable. In that case the
|
||
spec's origin is used as the location and ``__file__`` is set to
|
||
spec.origin. If additional location information is required (e.g.
|
||
zipimport), that information may be stored in spec.loader_state.
|
||
|
||
"has_location" may be implied from the existence of a load_data() method
|
||
on the loader.
|
||
|
||
Incidently, not all locatable modules will be cachable, but most will.
|
||
|
||
**submodule_search_locations**
|
||
|
||
The list of location strings, typically directory paths, in which to
|
||
search for submodules. If the module is a package this will be set to
|
||
a list (even an empty one). Otherwise it is None.
|
||
|
||
The name of the corresponding module attribute, ``__path__``, is
|
||
relatively ambiguous. Instead of mirroring it, we use a more explicit
|
||
attribute name that makes the purpose clear.
|
||
|
||
**loader_state**
|
||
|
||
A finder may set loader_state to any value to provide additional
|
||
data for the loader to use during loading. A value of None is the
|
||
default and indicates that there is no additional data. Otherwise it
|
||
can be set to any object, such as a dict, list, or
|
||
types.SimpleNamespace, containing the relevant extra information.
|
||
|
||
For example, zipimporter could use it to pass the zip archive name
|
||
to the loader directly, rather than needing to derive it from origin
|
||
or create a custom loader for each find operation.
|
||
|
||
loader_state is meant for use by the finder and corresponding loader.
|
||
It is not guaranteed to be a stable resource for any other use.
|
||
|
||
Factory Functions
|
||
-----------------
|
||
|
||
**spec_from_file_location(name, location, \*, loader=None, submodule_search_locations=None)**
|
||
|
||
Build a spec from file-oriented information and loader APIs.
|
||
|
||
* "origin" will be set to the location.
|
||
* "has_location" will be set to True.
|
||
* "cached" will be set to the result of calling cache_from_source().
|
||
|
||
* "origin" can be deduced from loader.get_filename() (if "location" is
|
||
not passed in.
|
||
* "loader" can be deduced from suffix if the location is a filename.
|
||
* "submodule_search_locations" can be deduced from loader.is_package()
|
||
and from os.path.dirname(location) if locatin is a filename.
|
||
|
||
**from_loader(name, loader, \*, origin=None, is_package=None)**
|
||
|
||
Build a spec with missing information filled in by using loader APIs.
|
||
|
||
* "has_location" can be deduced from loader.get_data.
|
||
* "origin" can be deduced from loader.get_filename().
|
||
* "submodule_search_locations" can be deduced from loader.is_package()
|
||
and from os.path.dirname(location) if locatin is a filename.
|
||
|
||
**spec_from_module(module, loader=None)**
|
||
|
||
Build a spec based on the import-related attributes of an existing
|
||
module. The spec attributes are set to the corresponding import-
|
||
related module attributes. See the table in `Attributes`_.
|
||
|
||
Omitted Attributes and Methods
|
||
------------------------------
|
||
|
||
There is no "PathModuleSpec" subclass of ModuleSpec that separates out
|
||
has_location, cached, and submodule_search_locations. While that might
|
||
make the separation cleaner, module objects don't have that distinction.
|
||
ModuleSpec will support both cases equally well.
|
||
|
||
While "is_package" would be a simple additional attribute (aliasing
|
||
self.submodule_search_locations is not None), it perpetuates the
|
||
artificial (and mostly erroneous) distinction between modules and
|
||
packages.
|
||
|
||
Conceivably, a ModuleSpec.load() method could optionally take a list of
|
||
modules with which to interact instead of sys.modules. That
|
||
capability is left out of this PEP, but may be pursued separately at
|
||
some other time, including relative to PEP 406 (import engine).
|
||
|
||
Likewise load() could be leveraged to implement multi-version
|
||
imports. While interesting, doing so is outside the scope of this
|
||
proposal.
|
||
|
||
Others:
|
||
|
||
* Add ModuleSpec.submodules (RO-property) - returns possible submodules
|
||
relative to the spec.
|
||
* Add ModuleSpec.loaded (RO-property) - the module in sys.module, if
|
||
any.
|
||
* Add ModuleSpec.data - a descriptor that wraps the data API of the
|
||
spec's loader.
|
||
* Also see [cleaner_reload_support]_.
|
||
|
||
|
||
Backward Compatibility
|
||
----------------------
|
||
|
||
ModuleSpec doesn't have any. This would be a different story if
|
||
Finder.find_module() were to return a module spec instead of loader.
|
||
In that case, specs would have to act like the loader that would have
|
||
been returned instead. Doing so would be relatively simple, but is an
|
||
unnecessary complication. It was part of earlier versions of this PEP.
|
||
|
||
Subclassing
|
||
-----------
|
||
|
||
Subclasses of ModuleSpec are allowed, but should not be necessary.
|
||
Simply setting loader_state or adding functionality to a custom
|
||
finder or loader will likely be a better fit and should be tried first.
|
||
However, as long as a subclass still fulfills the requirements of the
|
||
import system, objects of that type are completely fine as the return
|
||
value of Finder.find_spec().
|
||
|
||
|
||
Existing Types
|
||
==============
|
||
|
||
Module Objects
|
||
--------------
|
||
|
||
Other than adding ``__spec__``, none of the import-related module
|
||
attributes will be changed or deprecated, though some of them could be;
|
||
any such deprecation can wait until Python 4.
|
||
|
||
A module's spec will not be kept in sync with the corresponding import-
|
||
related attributes. Though they may differ, in practice they will
|
||
typically be the same.
|
||
|
||
One notable exception is that case where a module is run as a script by
|
||
using the ``-m`` flag. In that case ``module.__spec__.name`` will
|
||
reflect the actual module name while ``module.__name__`` will be
|
||
``__main__``.
|
||
|
||
A module's spec is not guaranteed to be identical between two modules
|
||
with the same name. Likewise there is no guarantee that successive
|
||
calls to importlib.find_spec() will return the same object or even an
|
||
equivalent object, though at least the latter is likely.
|
||
|
||
Finders
|
||
-------
|
||
|
||
Finders are still responsible for creating the loader. That loader will
|
||
now be stored in the module spec returned by find_spec() rather
|
||
than returned directly. As is currently the case without the PEP, if a
|
||
loader would be costly to create, that loader can be designed to defer
|
||
the cost until later.
|
||
|
||
**MetaPathFinder.find_spec(name, path=None)**
|
||
|
||
**PathEntryFinder.find_spec(name)**
|
||
|
||
Finders will return ModuleSpec objects when find_spec() is
|
||
called. This new method replaces find_module() and
|
||
find_loader() (in the PathEntryFinder case). If a loader does
|
||
not have find_spec(), find_module() and find_loader() are
|
||
used instead, for backward-compatibility.
|
||
|
||
Adding yet another similar method to loaders is a case of practicality.
|
||
find_module() could be changed to return specs instead of loaders.
|
||
This is tempting because the import APIs have suffered enough,
|
||
especially considering PathEntryFinder.find_loader() was just
|
||
added in Python 3.3. However, the extra complexity and a less-than-
|
||
explicit method name aren't worth it.
|
||
|
||
Loaders
|
||
-------
|
||
|
||
**Loader.exec_module(module)**
|
||
|
||
Loaders will have a new method, exec_module(). Its only job
|
||
is to "exec" the module and consequently populate the module's
|
||
namespace. It is not responsible for creating or preparing the module
|
||
object, nor for any cleanup afterward. It has no return value.
|
||
exec_module() will be used during both loading and reloading.
|
||
|
||
exec_module() should properly handle the case where it is called more
|
||
than once. For some kinds of modules this may mean raising ImportError
|
||
every time after the first time the method is called. This is
|
||
particularly relevant for reloading, where some kinds of modules do not
|
||
support in-place reloading.
|
||
|
||
**Loader.create_module(spec)**
|
||
|
||
Loaders may also implement create_module() that will return a
|
||
new module to exec. It may return None to indicate that the default
|
||
module creation code should be used. One use case, though atypical, for
|
||
create_module() is to provide a module that is a subclass of the builtin
|
||
module type. Most loaders will not need to implement create_module(),
|
||
|
||
create_module() should properly handle the case where it is called more
|
||
than once for the same spec/module. This may include returning None or
|
||
raising ImportError.
|
||
|
||
.. note::
|
||
|
||
exec_module() and create_module() should not set any import-related
|
||
module attributes. The fact that load_module() does is a design flaw
|
||
that this proposal aims to correct.
|
||
|
||
Other changes:
|
||
|
||
PEP 420 introduced the optional module_repr() loader method to limit
|
||
the amount of special-casing in the module type's ``__repr__()``. Since
|
||
this method is part of ModuleSpec, it will be deprecated on loaders.
|
||
However, if it exists on a loader it will be used exclusively.
|
||
|
||
Loader.init_module_attr() method, added prior to Python 3.4's
|
||
release , will be removed in favor of the same method on ModuleSpec.
|
||
|
||
However, InspectLoader.is_package() will not be deprecated even
|
||
though the same information is found on ModuleSpec. ModuleSpec
|
||
can use it to populate its own is_package if that information is
|
||
not otherwise available. Still, it will be made optional.
|
||
|
||
One consequence of ModuleSpec is that loader ``__init__`` methods will
|
||
no longer need to accommodate per-module state. The path-based loaders
|
||
in importlib take arguments in their ``__init__()`` and have
|
||
corresponding attributes. However, the need for those values is
|
||
eliminated by module specs.
|
||
|
||
In addition to executing a module during loading, loaders will still be
|
||
directly responsible for providing APIs concerning module-related data.
|
||
|
||
|
||
Other Changes
|
||
=============
|
||
|
||
* The various finders and loaders provided by importlib will be
|
||
updated to comply with this proposal.
|
||
* The spec for the ``__main__`` module will reflect how the interpreter
|
||
was started. For instance, with ``-m`` the spec's name will be that
|
||
of the run module, while ``__main__.__name__`` will still be
|
||
"__main__".
|
||
* We add importlib.find_spec() to mirror
|
||
importlib.find_loader() (which becomes deprecated).
|
||
* importlib.reload() is changed to use ModuleSpec.load().
|
||
* importlib.reload() will now make use of the per-module import
|
||
lock.
|
||
|
||
|
||
Reference Implementation
|
||
========================
|
||
|
||
A reference implementation will be available at
|
||
http://bugs.python.org/issue18864.
|
||
|
||
|
||
Open Issues
|
||
==============
|
||
|
||
\* The impact of this change on pkgutil (and setuptools) needs looking
|
||
into. It has some generic function-based extensions to PEP 302. These
|
||
may break if importlib starts wrapping loaders without the tools'
|
||
knowledge.
|
||
|
||
\* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc,
|
||
inspect.
|
||
|
||
For instance, pickle should be updated in the ``__main__`` case to look
|
||
at ``module.__spec__.name``.
|
||
|
||
\* Impact on some kinds of lazy loading modules. [lazy_import_concerns]_
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [ref_files_pep] http://mail.python.org/pipermail/import-sig/2013-August/000658.html
|
||
|
||
.. [import_system_docs] http://docs.python.org/3/reference/import.html
|
||
|
||
.. [cleaner_reload_support] https://mail.python.org/pipermail/import-sig/2013-September/000735.html
|
||
|
||
.. [lazy_import_concerns] https://mail.python.org/pipermail/python-dev/2013-August/128129.html
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|
||
|