From ed9161cd0d44f94e121b58e3b5e963e646a4e53e Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Wed, 18 Sep 2013 03:46:21 -0600 Subject: [PATCH] PEP 451 cleanup in response to comments. --- pep-0451.txt | 830 +++++++++++++++++++-------------------------------- 1 file changed, 307 insertions(+), 523 deletions(-) diff --git a/pep-0451.txt b/pep-0451.txt index 804b4e66e..fb3a05c88 100644 --- a/pep-0451.txt +++ b/pep-0451.txt @@ -9,7 +9,7 @@ Type: Standards Track Content-Type: text/x-rst Created: 8-Aug-2013 Python-Version: 3.4 -Post-History: 8-Aug-2013, 28-Aug-2013 +Post-History: 8-Aug-2013, 28-Aug-2013, 18-Sep-2013 Resolution: @@ -19,9 +19,10 @@ Abstract This PEP proposes to add a new class to ``importlib.machinery`` called ``ModuleSpec``. It will be authoritative for all the import-related information about a module, and will be available without needing to -load the module first. Finders will provide a module's spec instead of -a loader. The import machinery will be adjusted to take advantage of -module specs, including using them to load modules. +load the module first. Finders will directly provide a module's spec +instead of a loader (which they will continue to provide indirectly). +The import machinery will be adjusted to take advantage of module specs, +including using them to load modules. Motivation @@ -43,18 +44,20 @@ better...and there are a couple we can take care of with this proposal. Firstly, any time the import system needs to save information about a module we end up with more attributes on module objects that are -generally only meaningful to the import system and occasionally to some -people. It would be nice to have a per-module namespace to put future -import-related information. Secondly, there's an API void between -finders and loaders that causes undue complexity when encountered. +generally only meaningful to the import system. 
It would be nice to +have a per-module namespace in which to put future import-related +information and to pass around within the import system. Secondly, +there's an API void between finders and loaders that causes undue +complexity when encountered. -Currently finders are strictly responsible for providing the loader -which the import system will use to load the module. The loader is then -responsible for doing some checks, creating the module object, setting -import-related attributes, "installing" the module to ``sys.modules``, -and loading the module, along with some cleanup. This all takes place -during the import system's call to ``Loader.load_module()``. Loaders -also provide some APIs for accessing data associated with a module. +Currently finders are strictly responsible for providing the loader, +through their find_module() method, which the import system will use to +load the module. The loader is then responsible for doing some checks, +creating the module object, setting import-related attributes, +"installing" the module to ``sys.modules``, and loading the module, +along with some cleanup. This all takes place during the import +system's call to ``Loader.load_module()``. Loaders also provide some +APIs for accessing data associated with a module. Loaders are not required to provide any of the functionality of ``load_module()`` through other methods. Thus, though the import- @@ -81,7 +84,8 @@ the finder itself, or store it on the loader. Unfortunately, loaders are not required to be module-specific. On top of that, some of the useful information finders could provide is common to all finders, so ideally the import system could take care of -that. This is the same gap as before between finders and loaders. +those details. This is the same gap as before between finders and +loaders. 
As an example of complexity attributable to this flaw, the implementation of namespace packages in Python 3.3 (see PEP 420) added @@ -90,7 +94,7 @@ implementation of namespace packages in Python 3.3 (see PEP 420) added The answer to this gap is a ``ModuleSpec`` object that contains the per-module information and takes care of the boilerplate functionality -of loading the module. +involved with loading the module. (The idea gained momentum during discussions related to another PEP.[1]) @@ -110,100 +114,106 @@ detail is available in later sections. importlib.machinery.ModuleSpec (new) ------------------------------------ +A specification for a module's import-system-related state. + +* ModuleSpec(name, loader, \*, origin=None, loading_info=None, is_package=None) + Attributes: * name - a string for the name of the module. * loader - the loader to use for loading and for module data. -* origin - a string for the location from which the module is loaded. +* origin - a string for the location from which the module is loaded, + e.g. "builtin" for built-in modules and the filename for modules + loaded from source. * submodule_search_locations - strings for where to find submodules, if a package. -* loading_info - a container of data for use during loading (or None). +* loading_info - a container of extra data for use during loading. * cached (property) - a string for where the compiled module will be - stored. -* is_location (RO-property) - the module's origin refers to a location. - -.. XXX Find a better name than loading_info? -.. XXX Add ``submodules`` (RO-property) - returns possible submodules - relative to spec (or None)? -.. XXX Add ``loaded`` (RO-property) - the module in sys.modules, if any? - -Factory Methods: - -* from_file_location() - factory for file-based module specs. -* from_module() - factory based on import-related module attributes. -* from_loader() - factory based on information provided by loaders. - -.. 
XXX Move the factories to importlib.util or make class-only? + stored (see PEP 3147). +* package (RO-property) - the name of the module's parent (or None). +* has_location (RO-property) - the module's origin refers to a location. Instance Methods: -* init_module_attrs() - populate a module's import-related attributes. -* module_repr() - provide a repr string for a module. -* create() - provide a new module to use for loading. -* exec() - execute the spec into a module namespace. -* load() - prepare a module and execute it in a protected way. -* reload() - re-execute a module in a protected way. +* module_repr() - provide a repr string for the spec'ed module. +* init_module_attrs(module) - set any of a module's import-related + attributes that aren't already set. -.. XXX Make module_repr() match the spec (BC problem?)? +importlib.util Additions +------------------------ -API Additions -------------- +* spec_from_file_location(name, location, \*, loader=None, submodule_search_locations=None) + - factory for file-based module specs. +* from_loader(name, loader, \*, origin=None, is_package=None) - factory + based on information provided by loaders. +* spec_from_module(module, loader=None) - factory based on existing + import-related module attributes. This function is expected to be + used only in some backward-compatibility situations. -* ``importlib.abc.Loader.exec_module()`` will execute a module in its - own namespace, replacing ``importlib.abc.Loader.load_module()``. -* ``importlib.abc.Loader.create_module()`` (optional) will return a new +Other API Additions +------------------- + +* importlib.abc.Loader.exec_module(module) will execute a module in its + own namespace. It replaces ``importlib.abc.Loader.load_module()``. +* importlib.abc.Loader.create_module(spec) (optional) will return a new module to use for loading. * Module objects will have a new attribute: ``__spec__``. -* ``importlib.find_spec()`` will return the spec for a module. 
-* ``__subclasshook__()`` will be implemented on the importlib ABCs. +* importlib.find_spec(name, path=None) will return the spec for a + module. -.. XXX Do __subclasshook__() separately from the PEP (issue18862). +exec_module() and create_module() should not set any import-related +module attributes. The fact that load_module() does is a design flaw +that this proposal aims to correct. API Changes ----------- -* Import-related module attributes will no longer be authoritative nor - used by the import system. * ``InspectLoader.is_package()`` will become optional. -.. XXX module __repr__() will prefer spec attributes? - Deprecations ------------ -* ``importlib.abc.MetaPathFinder.find_module()`` -* ``importlib.abc.PathEntryFinder.find_module()`` -* ``importlib.abc.PathEntryFinder.find_loader()`` -* ``importlib.abc.Loader.load_module()`` -* ``importlib.abc.Loader.module_repr()`` +* importlib.abc.MetaPathFinder.find_module() +* importlib.abc.PathEntryFinder.find_module() +* importlib.abc.PathEntryFinder.find_loader() +* importlib.abc.Loader.load_module() +* importlib.abc.Loader.module_repr() * The parameters and attributes of the various loaders in - ``importlib.machinery`` -* ``importlib.util.set_package()`` -* ``importlib.util.set_loader()`` -* ``importlib.find_loader()`` + importlib.machinery +* importlib.util.set_package() +* importlib.util.set_loader() +* importlib.find_loader() Removals -------- -* ``importlib.abc.Loader.init_module_attrs()`` -* ``importlib.util.module_to_load()`` +These were introduced prior to Python 3.4's release. + +* importlib.abc.Loader.init_module_attrs() +* importlib.util.module_to_load() Other Changes ------------- +* The import system implementation in importlib will be changed to make + use of ModuleSpec. +* Import-related module attributes (other than ``__spec__``) will no + longer be used directly by the import system. +* Import-related attributes should no longer be added to modules + directly. 
+* The module type's ``__repr__()`` will be a thin wrapper around a pure + Python implementation which will leverage ModuleSpec. * The spec for the ``__main__`` module will reflect the appropriate name and origin. -* The module type's ``__repr__`` will defer to ModuleSpec exclusively. Backward-Compatibility ---------------------- -* If a finder does not define ``find_spec()``, a spec is derived from - the loader returned by ``find_module()``. -* ``PathEntryFinder.find_loader()`` will be used, if defined. -* ``Loader.load_module()`` is used if ``exec_module()`` is not defined. -* ``Loader.module_repr()`` is used by ``ModuleSpec.module_repr()`` if it - exists. +* If a finder does not define find_spec(), a spec is derived from + the loader returned by find_module(). +* PathEntryFinder.find_loader() still takes priority over + find_module(). +* Loader.load_module() is used if exec_module() is not defined. What Will not Change? --------------------- @@ -212,10 +222,33 @@ What Will not Change? * Existing finders and loaders will continue to work normally. * The import-related module attributes will still be initialized with the same information. -* Finders will still create loaders, storing them in the specs. -* ``Loader.load_module()``, if a module defines it, will have all the +* Finders will still create loaders (now storing them in specs). +* Loader.load_module(), if a module defines it, will have all the same requirements and may still be called directly. * Loaders will still be responsible for module data APIs. +* importlib.reload() will still overwrite the import-related attributes. + + +What Will Existing Finders and Loaders Have to Do Differently? +============================================================== + +Immediately? Nothing. The status quo will be deprecated, but will +continue working. However, here are the things that the authors of +finders and loaders should change relative to this PEP: + +* Implement ``find_spec()`` on finders. 
+* Implement ``exec_module()`` on loaders, if possible. + +The ModuleSpec factory functions in importlib.util are intended to be +helpful for converting existing finders. ``from_loader()`` and +``spec_from_file_location()`` are both straightforward utilities in this +regard. In the case where loaders already expose methods for creating +and preparing modules, ``spec_from_module()`` may be useful to +the corresponding finder. + +For existing loaders, exec_module() should be a relatively direct +conversion from the non-boilerplate portion of load_module(). In some +uncommon cases the loader should also implement create_module(). ModuleSpec Users @@ -230,456 +263,216 @@ import-oriented, like pkgutil, and others are not, like pickle and pydoc. In all cases, the full ``ModuleSpec`` API will get used. Import hooks (finders and loaders) will make use of the spec in specific -ways, mostly without using the ``ModuleSpec`` instance methods. First -of all, finders will use the factory methods to create spec objects. -They may also directly adjust the spec attributes after the spec is -created. Secondly, the finder may bind additional information to the -spec for the loader to consume during module creation/execution. -Finally, loaders will make use of the attributes on a spec when creating -and/or executing a module. +ways. First of all, finders may use the spec factory functions in +importlib.util to create spec objects. They may also directly adjust +the spec attributes after the spec is created. Secondly, the finder may +bind additional information to the spec (in loading_info) for the +loader to consume during module creation/execution. Finally, loaders +will make use of the attributes on a spec when creating and/or executing +a module. Python users will be able to inspect a module's ``__spec__`` to get -import-related information about the object. Generally, they will not -be using the ``ModuleSpec`` factory methods nor the instance methods. 
-However, each spec has methods named ``create``, ``exec``, ``load``, and -``reload``. Since they are so easy to access (and misunderstand/abuse), -their function and availability require explicit consideration in this -proposal. - - -What Will Existing Finders and Loaders Have to Do Differently? -============================================================== - -Immediately? Nothing. The status quo will be deprecated, but will -continue working. However, here are the things that the authors of -finders and loaders should change relative to this PEP: - -* Implement ``find_spec()`` on finders. -* Implement ``exec_module()`` on loaders, if possible. - -The factory methods of ``ModuleSpec`` are intended to be helpful for -converting existing finders. ``from_loader()`` and -``from_file_location()`` are both straight-forward utilities in this -regard. In the case where loaders already expose methods for creating -and preparing modules, a finder may use ``ModuleSpec.from_module()`` on -a throw-away module to create the appropriate spec. - -As for loaders, ``exec_module()`` should be a relatively direct -conversion from a portion of the existing ``load_module()``. However, -``Loader.create_module()`` will also be necessary in some uncommon -cases. Furthermore, ``load_module()`` will still work as a final option -when ``exec_module()`` is not appropriate. +import-related information about the object. Generally, Python +applications and interactive users will not be using the ``ModuleSpec`` +factory functions nor any of the instance methods. How Loading Will Work ===================== -This is an outline of what happens in ``ModuleSpec.load()``. +This is an outline of what happens in ModuleSpec's loading +functionality:: -1. A new module is created by calling ``spec.create()``. + def load(spec): + if not hasattr(spec.loader, 'exec_module'): + module = spec.loader.load_module(spec.name) + spec.init_module_attrs(module) + return sys.modules[spec.name] - a. 
If the loader has a ``create_module()`` method, it gets called. - Otherwise a new module gets created. - b. The import-related module attributes are set. + module = None + if hasattr(spec.loader, 'create_module'): + module = spec.loader.create_module(spec) + if module is None: + module = ModuleType(spec.name) + spec.init_module_attrs(module) -2. The module is added to sys.modules. -3. ``spec.exec(module)`` gets called. - - a. If the loader has an ``exec_module()`` method, it gets called. - Otherwise ``load_module()`` gets called for backward-compatibility - and the resulting module is updated to match the spec. - -4. If there were any errors the module is removed from sys.modules. -5. If the module was replaced in sys.modules during ``exec()``, the one - in sys.modules is updated to match the spec. -6. The module in sys.modules is returned. + spec._initializing = True + sys.modules[spec.name] = module + try: + spec.loader.exec_module(module) + except Exception: + del sys.modules[spec.name] + finally: + spec._initializing = False + return sys.modules[spec.name] These steps are exactly what ``Loader.load_module()`` is already expected to do. Loaders will thus be simplified since they will only -need to implement the portion in step 3a. +need to implement exec_module(). + +Note that we must return the module from sys.modules. During loading +the module may have replaced itself in sys.modules. Since we don't have +a post-import hook API to accommodate the use case, we have to deal with +it. However, in the replacement case we do not worry about setting the +import-related module attributes on the object. The module writer is on +their own if they are doing this. ModuleSpec ========== -This is a new class which defines the import-related values to use when -loading the module. It closely corresponds to the import-related -attributes of module objects. 
``ModuleSpec`` objects may also be used -by finders and loaders and other import-related APIs to hold extra -import-related state concerning the module. This greatly reduces the -need to add any new new import-related attributes to module objects, and -loader ``__init__`` methods will no longer need to accommodate such -per-module state. - -General Notes -------------- - -* The spec for each module instance will be unique to that instance even - if the information is identical to that of another spec. -* A module's spec is not intended to be modified by anything but - finders. - -Creating a ModuleSpec ---------------------- - -**ModuleSpec(name, loader, *, origin=None, is_package=None)** - -.. container:: - - ``name``, ``loader``, and ``origin`` are set on the new instance - without any modification. If ``is_package`` is not passed in, the - loader's ``is_package()`` gets called (if available), or it defaults - to `False`. If ``is_package`` is true, - ``submodule_search_locations`` is set to a new empty list. Otherwise - it is set to None. - - Other attributes not listed as parameters (such as ``package``) are - either read-only dynamic properties or default to None. - -**from_filename(name, loader, *, filename=None, submodule_search_locations=None)** - -.. container:: - - This factory classmethod allows a suitable ModuleSpec instance to be - easily created with extra file-related information. This includes - the values that would be set on a module as ``__file__`` or - ``__cached__``. - - ``is_location`` is set to True for specs created using - ``from_filename()``. - -**from_module(module, loader=None)** - -.. container:: - - This factory is used to create a spec based on the import-related - attributes of an existing module. Since modules should already have - ``__spec__`` set, this method has limited utility. - -**from_loader(name, loader, *, origin=None, is_package=None)** - -.. 
container:: - - A factory classmethod that returns a new ``ModuleSpec`` derived from - the arguments. ``is_package`` is used inside the method to indicate - that the module is a package. If not explicitly passed in, it falls - back to using the result of the loader's ``is_package()``, if - available. If not available, if defaults to False. - - In contrast to ``ModuleSpec.__init__()``, which takes the arguments - as-is, ``from_loader()`` calculates missing values from the ones - passed in, as much as possible. This replaces the behavior that is - currently provided by several ``importlib.util`` functions as well as - the optional ``init_module_attrs()`` method of loaders. Just to be - clear, here is a more detailed description of those calculations:: - - If not passed in, ``filename`` is to the result of calling the - loader's ``get_filename()``, if available. Otherwise it stays - unset (``None``). - - If not passed in, ``submodule_search_locations`` is set to an empty - list if ``is_package`` is true. Then the directory from ``filename`` - is appended to it, if possible. If ``is_package`` is false, - ``submodule_search_locations`` stays unset. - - If ``cached`` is not passed in and ``filename`` is passed in, - ``cached`` is derived from it. For filenames with a source suffix, - it set to the result of calling - ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. - ``.pyc``), ``cached`` is set to the value of ``filename``. If - ``filename`` is not passed in or ``cache_from_source()`` raises - ``NotImplementedError``, ``cached`` stays unset. - - If not passed in, ``origin`` is set to ``filename``. Thus if - ``filename`` is unset, ``origin`` stays unset. - - Attributes ---------- -Each of the following names is an attribute on ``ModuleSpec`` objects. -A value of ``None`` indicates "not set". This contrasts with module -objects where the attribute simply doesn't exist. +Each of the following names is an attribute on ModuleSpec objects. 
A +value of ``None`` indicates "not set". This contrasts with module +objects where the attribute simply doesn't exist. Most of the +attributes correspond to the import-related attributes of modules. Here +is the mapping. The reverse of this mapping is used by +ModuleSpec.init_module_attrs(). -While ``package`` is a read-only property, the remaining attributes can -be replaced after the module spec is created and even after import is -complete. This allows for unusual cases where directly modifying the -spec is the best option. However, typical use should not involve -changing the state of a module's spec. - -Most of the attributes correspond to the import-related attributes of -modules. Here is the mapping, followed by a description of the -attributes. The reverse of this mapping is used by -``ModuleSpec.init_module_attrs()``. - -========================== =========== +========================== ============== On ModuleSpec On Modules -========================== =========== +========================== ============== name __name__ loader __loader__ package __package__ origin __file__* -cached __cached__* +cached __cached__*,** submodule_search_locations __path__** loading_info \- -has_location (RO-property) \- -========================== =========== +has_location \- +========================== ============== -\* Only if ``is_location`` is true. -\*\* Only if not None. +\* Set only if has_location is true. +\*\* Set only if the spec attribute is not None. -**name** - -.. container:: - - The module's fully resolved and absolute name. It must be set. - -**loader** - -.. container:: - - The loader to use during loading and for module data. These specific - functionalities do not change for loaders. Finders are still - responsible for creating the loader and this attribute is where it is - stored. The loader must be set. 
+While package and has_location are read-only properties, the remaining +attributes can be replaced after the module spec is created and even +after import is complete. This allows for unusual cases where directly +modifying the spec is the best option. However, typical use should not +involve changing the state of a module's spec. **origin** -.. container:: +origin is a string for the place from which the module originates. +Aside from the informational value, it is also used in module_repr(). - A string for the location from which the module originates. Aside from - the informational value, it is also used in ``module_repr()``. - - The module attribute ``__file__`` has a similar but more restricted - meaning. Not all modules have it set (e.g. built-in modules). However, - ``origin`` is applicable to essentially all modules. For built-in - modules it would be set to "built-in". - -Secondary Attributes --------------------- - -Some of the ``ModuleSpec`` attributes are not set via arguments when -creating a new spec. Either they are strictly dynamically calculated -properties or they are simply set to None (aka "not set"). For the -latter case, those attributes may still be set directly. - -**package** - -.. container:: - - A dynamic property that gives the name of the module's parent. The - value is derived from ``name`` and ``is_package``. For packages it is - the value of ``name``. Otherwise it is equivalent to - ``name.rpartition('.')[0]``. Consequently, a top-level module will have - the empty string for ``package``. +The module attribute ``__file__`` has a similar but more restricted +meaning. Not all modules have it set (e.g. built-in modules). However, +``origin`` is applicable to all modules. For built-in modules it would +be set to "built-in". **has_location** -.. container:: +Some modules can be loaded by reference to a location, e.g. a filesystem +path or a URL or something of the sort. 
Having the location lets you +load the module, but in theory you could load that module under various +names. - Some modules can be loaded by reference to a location, e.g. a filesystem - path or a URL or something of the sort. Having the location lets you - load the module, but in theory you could load that module under various - names. +In contrast, non-located modules can't be loaded in this fashion, e.g. +builtin modules and modules dynamically created in code. For these, the +name is the only way to access them, so they have an "origin" but not a +"location". - In contrast, non-located modules can't be loaded in this fashion, e.g. - builtin modules and modules dynamically created in code. For these, the - name is the only way to access them, so they have an "origin" but not a - "location". +This attribute reflects whether or not the module is locatable. If it +is, origin must be set to the module's location and ``__file__`` will be +set on the module. Not all locatable modules will be cachable, but most +will. - This attribute reflects whether or not the module is locatable. If it - is, ``origin`` must be set to the module's location and ``__file__`` - will be set on the module. Furthermore, a locatable module is also - cacheable and so ``__cached__`` is tied to ``has_location``. - - The corresponding module attribute name, ``__file__``, is somewhat - inaccurate and potentially confusion, so we will use a more explicit - combination of ``origin`` and ``has_location`` to represent the same - information. Having a separate ``filename`` is unncessary since we have - ``origin``. - -**cached** - -.. container:: - - A string for the location where the compiled code for a module should be - stored. PEP 3147 details the caching mechanism of the import system. - - If ``has_location`` is true, this location string is set on the module - as ``__cached__``. 
When ``from_filename()`` is used to create a spec, - ``cached`` is set to the result of calling - ``importlib.util.source_to_cache()``. - - ``cached`` is not necessarily a file location. A finder or loader may - store an alternate location string in ``cached``. However, in practice - this will be the file location dicated by PEP 3147. +The corresponding module attribute name, ``__file__``, is somewhat +inaccurate and potentially confusing, so we will use a more explicit +combination of origin and has_location to represent the same +information. Having a separate filename is unnecessary since we have +origin. **submodule_search_locations** -.. container:: +The list of location strings, typically directory paths, in which to +search for submodules. If the module is a package this will be set to +a list (even an empty one). Otherwise it is ``None``. - The list of location strings, typically directory paths, in which to - search for submodules. If the module is a package this will be set to - a list (even an empty one). Otherwise it is ``None``. - - The corresponding module attribute's name, ``__path__``, is relatively - ambiguous. Instead of mirroring it, we use a more explicit name that - makes the purpose clear. +The corresponding module attribute's name, ``__path__``, is relatively +ambiguous. Instead of mirroring it, we use a more explicit name that +makes the purpose clear. **loading_info** -.. container:: +A finder may set loading_info to any value to provide additional +data for the loader to use during loading. A value of None is the +default and indicates that there is no additional data. Otherwise it +can be set to any object, such as a dict, list, or +types.SimpleNamespace, containing the relevant extra information. - A finder may set ``loading_info`` to any value to provide additional - data for the loader to use during loading. A value of ``None`` is the - default and indicates that there is no additional data. 
Otherwise it is - likely set to some containers, such as a ``dict``, ``list``, or - ``types.SimpleNamespace`` containing the relevant extra information. +For example, zipimporter could use it to pass the zip archive name +to the loader directly, rather than needing to derive it from origin +or create a custom loader for each find operation. - For example, ``zipimporter`` could use it to pass the zip archive name - to the loader directly, rather than needing to derive it from ``origin`` - or create a custom loader for each find operation. - -Methods -------- - -**module_repr()** - -.. container:: - - Returns a repr string for the module, based on the module's import- - related attributes and falling back to the spec's attributes. The - string will reflect the current output of the module type's - ``__repr__()``. - - The module type's ``__repr__()`` will use the module's ``__spec__`` - exclusively. If the module does not have ``__spec__`` set, a spec is - generated using ``ModuleSpec.from_module()``. - - Since the module attributes may be out of sync with the spec and to - preserve backward-compatibility in that case, we defer to the module - attributes and only when they are missing do we fall back to the spec - attributes. - -**init_module_attrs(module)** - -.. container:: - - Sets the module's import-related attributes to the corresponding values - in the module spec. If ``has_location`` is false on the spec, - ``__file__`` and ``__cached__`` are not set on the module. ``__path__`` - is only set on the module if ``submodule_search_locations`` is None. - For the rest of the import-related module attributes, a ``None`` value - on the spec (aka "not set") means ``None`` will be set on the module. - If any of the attributes are already set on the module, the existing - values are replaced. The module's own ``__spec__`` is not consulted but - does get replaced with the spec on which ``init_module_attrs()`` was - called. 
The earlier mapping of ``ModuleSpec`` attributes to module - attributes indicates which attributes are involved on both sides. - -**create()** - -.. container:: - - A new module is created relative to the spec and its import-related - attributes are set accordingly. If the spec's loader has a - ``create_module()`` method, that gets called to create the module. This - give the loader a chance to do any pre-loading initialization that can't - otherwise be accomplished elsewhere. Otherwise a bare module object is - created. In both cases ``init_module_attrs()`` is called on the module - before it gets returned. - -**exec(module)** - -.. container:: - - The spec's loader is used to execute the module. If the loader has - ``exec_module()`` defined, the namespace of ``module`` is the target of - execution. Otherwise the loader's ``load_module()`` is called, which - ignores ``module`` and returns the module that was the actual - execution target. In that case the import-related attributes of that - module are updated to reflect the spec. In both cases the targeted - module is the one that gets returned. - -**load()** - -.. container:: - - This method captures the current functionality of and requirements on - ``Loader.load_module()`` without any semantic changes. It is - essentially a wrapper around ``create()`` and ``exec()`` with some - extra functionality regarding ``sys.modules``. - - itself in ``sys.modules`` while executing. Consequently, the module in - ``sys.modules`` is the one that gets returned by ``load()``. - - Right before ``exec()`` is called, the module is added to - ``sys.modules``. In the case of error during loading the module is - removed from ``sys.modules``. The module in ``sys.modules`` when - ``load()`` finishes is the one that gets returned. Returning the module - from ``sys.modules`` accommodates the ability of the module to replace - itself there while it is executing (during load). 
- - As already noted, this is what already happens in the import system. - ``load()`` is not meant to change any of this behavior. - - If ``loader`` is not set (``None``), ``load()`` raises a ValueError. - -**reload(module)** - -.. container:: - - As with ``load()`` this method faithfully fulfills the semantics of - ``Loader.load_module()`` in the reload case, with one exception: - reloading a module when ``exec_module()`` is available actually uses - ``module`` rather than ignoring it in favor of the one in - ``sys.modules``, as ``Loader.load_module()`` does. The functionality - here mirrors that of ``load()``, minus the ``create()`` call and the - ``sys.modules`` handling. - -.. XXX add more of importlib.reload()'s boilerplate to reload()? +loading_info is meant for use by the finder and corresponding loader. +It is not guaranteed to be a stable resource for any other use. Omitted Attributes and Methods ------------------------------ -There is no ``PathModuleSpec`` subclass of ``ModuleSpec`` that provides -the ``has_location``, ``cached``, and ``submodule_search_locations`` -functionality. While that might make the separation cleaner, module -objects don't have that distinction. ``ModuleSpec`` will support both -cases equally well. +The following ModuleSpec methods are not part of the public API since +it is easy to use them incorrectly and only the import system really +needs them (i.e. they would be an attractive nuisance). -While ``is_package`` would be a simple additional attribute (aliasing +* create() - provide a new module to use for loading. +* exec(module) - execute the spec into a module namespace. +* load() - prepare a module and execute it in a protected way. +* reload(module) - re-execute a module in a protected way. + +Here are other omissions: + +There is no PathModuleSpec subclass of ModuleSpec that separates out +has_location, cached, and submodule_search_locations. 
While that might +make the separation cleaner, module objects don't have that distinction. +ModuleSpec will support both cases equally well. + +While is_package would be a simple additional attribute (aliasing ``self.submodule_search_locations is not None``), it perpetuates the artificial (and mostly erroneous) distinction between modules and packages. -Conceivably, ``ModuleSpec.load()`` could optionally take a list of -modules with which to interact instead of ``sys.modules``. That +Conceivably, a ModuleSpec.load() method could optionally take a list of +modules with which to interact instead of sys.modules. That capability is left out of this PEP, but may be pursued separately at some other time, including relative to PEP 406 (import engine). -Likewise ``load()`` could be leveraged to implement multi-version +Likewise load() could be leveraged to implement multi-version imports. While interesting, doing so is outside the scope of this proposal. +Others: + +* Add ModuleSpec.submodules (RO-property) - returns possible submodules + relative to the spec. +* Add ModuleSpec.loaded (RO-property) - the module in sys.modules, if + any. +* Add ModuleSpec.data - a descriptor that wraps the data API of the + spec's loader. +* Also see [3]. + + Backward Compatibility ---------------------- -``ModuleSpec`` doesn't have any. This would be a different story if -``Finder.find_module()`` were to return a module spec instead of loader. +ModuleSpec doesn't have any. This would be a different story if +Finder.find_module() were to return a module spec instead of a loader. In that case, specs would have to act like the loader that would have been returned instead. Doing so would be relatively simple, but is an -unnecessary complication. +unnecessary complication. It was part of earlier versions of this PEP. Subclassing ----------- Subclasses of ModuleSpec are allowed, but should not be necessary.
-Simply setting ``loading_info`` or adding functionality to a custom +Simply setting loading_info or adding functionality to a custom finder or loader will likely be a better fit and should be tried first. However, as long as a subclass still fulfills the requirements of the import system, objects of that type are completely fine as the return -value of ``Finder.find_spec()``. +value of Finder.find_spec(). Existing Types @@ -688,95 +481,78 @@ Existing Types Module Objects -------------- -**__spec__** +Other than adding ``__spec__``, none of the import-related module +attributes will be changed or deprecated, though some of them could be; +any such deprecation can wait until Python 4. -.. container:: - - Module objects will now have a ``__spec__`` attribute to which the - module's spec will be bound. - -None of the other import-related module attributes will be changed or -deprecated, though some of them could be; any such deprecation can wait -until Python 4. - -``ModuleSpec`` objects will not be kept in sync with the corresponding -module object's import-related attributes. Though they may differ, in -practice they will typically be the same. +A module's spec will not be kept in sync with the corresponding import- +related attributes. Though they may differ, in practice they will +typically be the same. One notable exception is that case where a module is run as a script by using the ``-m`` flag. In that case ``module.__spec__.name`` will reflect the actual module name while ``module.__name__`` will be ``__main__``. +Notably, the spec for each module instance will be unique to that +instance even if the information is identical to that of another spec. +This won't happen in general. + Finders ------- -**MetaPathFinder.find_spec(name, path=None)** - -**PathEntryFinder.find_spec(name)** - -.. container:: - - Finders will return ModuleSpec objects when ``find_spec()`` is - called. 
This new method replaces ``find_module()`` and - ``find_loader()`` (in the ``PathEntryFinder`` case). If a loader does - not have ``find_spec()``, ``find_module()`` and ``find_loader()`` are - used instead, for backward-compatibility. - - Adding yet another similar method to loaders is a case of practicality. - ``find_module()`` could be changed to return specs instead of loaders. - This is tempting because the import APIs have suffered enough, - especially considering ``PathEntryFinder.find_loader()`` was just - added in Python 3.3. However, the extra complexity and a less-than- - explicit method name aren't worth it. - Finders are still responsible for creating the loader. That loader will now be stored in the module spec returned by ``find_spec()`` rather than returned directly. As is currently the case without the PEP, if a loader would be costly to create, that loader can be designed to defer the cost until later. +**MetaPathFinder.find_spec(name, path=None)** + +**PathEntryFinder.find_spec(name)** + +Finders will return ModuleSpec objects when ``find_spec()`` is +called. This new method replaces ``find_module()`` and +``find_loader()`` (in the ``PathEntryFinder`` case). If a finder does +not have ``find_spec()``, ``find_module()`` and ``find_loader()`` are +used instead, for backward-compatibility. + +Adding yet another similar method to finders is a case of practicality. +``find_module()`` could be changed to return specs instead of loaders. +This is tempting because the import APIs have suffered enough, +especially considering ``PathEntryFinder.find_loader()`` was just +added in Python 3.3. However, the extra complexity and a less-than- +explicit method name aren't worth it. + Loaders ------- **Loader.exec_module(module)** -.. container:: +Loaders will have a new method, exec_module(). Its only job +is to "exec" the module and consequently populate the module's +namespace.
It is not responsible for creating or preparing the module +object, nor for any cleanup afterward. It has no return value. - Loaders will have a new method, ``exec_module()``. Its only job - is to "exec" the module and consequently populate the module's - namespace. It is not responsible for creating or preparing the module - object, nor for any cleanup afterward. It has no return value. - -**Loader.load_module(fullname)** - -.. container:: - - The ``load_module()`` of loaders will still work and be an active part - of the loader API. It is still useful for cases where the default - module creation/prepartion/cleanup is not appropriate for the loader. - If implemented, ``load_module()`` will still be responsible for its - current requirements (prep/exec/etc.) since the method may be called - directly. - - For example, the C API for extension modules only supports the full - control of ``load_module()``. As such, ``ExtensionFileLoader`` will not - implement ``exec_module()``. In the future it may be appropriate to - produce a second C API that would support an ``exec_module()`` - implementation for ``ExtensionFileLoader``. Such a change is outside - the scope of this PEP. - -A loader must define either ``exec_module()`` or ``load_module()``. If -both exist on the loader, ``ModuleSpec.load()`` uses ``exec_module()`` -and ignores ``load_module()``. +exec_module() should properly handle the case where it is called more +than once. For some kinds of modules this may mean raising ImportError +every time after the first time the method is called. This is +particularly relevant for reloading, where some kinds of modules do not +support in-place reloading. **Loader.create_module(spec)** -.. container:: +Loaders may also implement create_module() that will return a +new module to exec. It may return None to indicate that the default +module creation code should be used. One use case for create_module() +is to provide a module that is a subclass of the builtin module type. 
+Most loaders will not need to implement create_module(). - Loaders may also implement ``create_module()`` that will return a - new module to exec. However, most loaders will not need to implement - the method. +create_module() should properly handle the case where it is called more +than once for the same spec/module. This may include returning None or +raising ImportError. + +Other changes: PEP 420 introduced the optional ``module_repr()`` loader method to limit the amount of special-casing in the module type's ``__repr__()``. Since @@ -791,11 +567,11 @@ though the same information is found on ``ModuleSpec``. ``ModuleSpec`` can use it to populate its own ``is_package`` if that information is not otherwise available. Still, it will be made optional. -The path-based loaders in ``importlib`` take arguments in their -``__init__()`` and have corresponding attributes. However, the need for -those values is eliminated by module specs. The only exception is -``FileLoader.get_filename()``, which uses ``self.path``. The signatures -for these loaders and the accompanying attributes will be deprecated. +One consequence of ModuleSpec is that loader ``__init__`` methods will +no longer need to accommodate per-module state. The path-based loaders +in ``importlib`` take arguments in their ``__init__()`` and have +corresponding attributes. However, the need for those values is +eliminated by module specs. In addition to executing a module during loading, loaders will still be directly responsible for providing APIs concerning module-related data. @@ -804,7 +580,7 @@ directly responsible for providing APIs concerning module-related data. Other Changes ============= -* The various finders and loaders provided by ``importlib`` will be +* The various finders and loaders provided by importlib will be updated to comply with this proposal. * The spec for the ``__main__`` module will reflect how the interpreter was started.
For instance, with ``-m`` the spec's name will be that @@ -812,12 +588,9 @@ Other Changes "__main__". * We add ``importlib.find_spec()`` to mirror ``importlib.find_loader()`` (which becomes deprecated). -* Deprecations in ``importlib.util``: ``set_package()``, - ``set_loader()``, and ``module_for_loader()``. ``module_to_load()`` - (introduced prior to Python 3.4's release) can be removed. * ``importlib.reload()`` is changed to use ``ModuleSpec.load()``. -* ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of - the per-module import lock, whereas ``Loader.load_module()`` did not. +* ``importlib.reload()`` will now make use of the per-module import + lock. Reference Implementation @@ -838,11 +611,18 @@ knowledge. \* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc, inspect. -\* Add ``ModuleSpec.data`` as a descriptor that wraps the data API of the -spec's loader? +For instance, pickle should be updated in the __main__ case to look at +``module.__spec__.name``. -\* How to limit possible end-user confusion/abuses relative to spec -attributes (since __spec__ will make them really accessible)? +\* Impact on some kinds of lazy loading modules. See [3]. + +\* Find a better name than loading_info? Perhaps loading_data, +loader_state, or loader_info. + +\* Change loader.create_module() to prepare_module()? + +\* Add more explicit reloading support to exec_module() (and +prepare_module())? References @@ -850,6 +630,10 @@ References [1] http://mail.python.org/pipermail/import-sig/2013-August/000658.html +[2] https://mail.python.org/pipermail/import-sig/2013-September/000735.html + +[3] https://mail.python.org/pipermail/python-dev/2013-August/128129.html + Copyright =========