From dffee71a33655aad8438b66363897b361b970500 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Wed, 28 Aug 2013 02:43:45 -0600 Subject: [PATCH] Big update to PEP 451 in response to feedback. --- pep-0451.txt | 897 +++++++++++++++++++++++++++++++++------------------ 1 file changed, 577 insertions(+), 320 deletions(-) diff --git a/pep-0451.txt b/pep-0451.txt index 4d88c2314..ce1bf7b0c 100644 --- a/pep-0451.txt +++ b/pep-0451.txt @@ -17,10 +17,11 @@ Abstract ======== This PEP proposes to add a new class to ``importlib.machinery`` called -``ModuleSpec``. It will contain all the import-related information -about a module without needing to load the module first. Finders will -now return a module's spec rather than a loader. The import system will -use the spec to load the module. +``ModuleSpec``. It will be authoritative for all the import-related +information about a module, and will be available without needing to +load the module first. Finders will provide a module's spec instead of +a loader. The import machinery will be adjusted to take advantage of +module specs, including using them to load modules. Motivation @@ -85,7 +86,7 @@ that. This is the same gap as before between finders and loaders. As an example of complexity attributable to this flaw, the implementation of namespace packages in Python 3.3 (see PEP 420) added ``FileFinder.find_loader()`` because there was no good way for -``find_module()`` to provide the namespace path. +``find_module()`` to provide the namespace search locations. The answer to this gap is a ``ModuleSpec`` object that contains the per-module information and takes care of the boilerplate functionality @@ -100,334 +101,603 @@ Specification The goal is to address the gap between finders and loaders while changing as little of their semantics as possible. Though some functionality and information is moved to the new ``ModuleSpec`` type, -their semantics should remain the same. 
However, for the sake of -clarity, those semantics will be explicitly identified. +their behavior should remain the same. However, for the sake of clarity +the finder and loader semantics will be explicitly identified. + +This is a high-level summary of the changes described by this PEP. More +detail is available in later sections. + +importlib.machinery.ModuleSpec (new) +------------------------------------ + +Attributes: + +* name - a string for the name of the module. +* loader - the loader to use for loading and for module data. +* origin - a string for the location from which the module is loaded. +* submodule_search_locations - strings for where to find submodules, + if a package. +* loading_info - a container of data for use during loading (or None). +* cached (property) - a string for where the compiled module will be + stored. +* is_location (RO-property) - the module's origin refers to a location. + +.. XXX Find a better name than loading_info? +.. XXX Add ``submodules`` (RO-property) - returns possible submodules + relative to spec (or None)? +.. XXX Add ``loaded`` (RO-property) - the module in sys.modules, if any? + +Factory Methods: + +* from_file_location() - factory for file-based module specs. +* from_module() - factory based on import-related module attributes. +* from_loader() - factory based on information provided by loaders. + +.. XXX Move the factories to importlib.util or make class-only? + +Instance Methods: + +* init_module_attrs() - populate a module's import-related attributes. +* module_repr() - provide a repr string for a module. +* create() - provide a new module to use for loading. +* exec() - execute the spec into a module namespace. +* load() - prepare a module and execute it in a protected way. +* reload() - re-execute a module in a protected way. + +.. XXX Make module_repr() match the spec (BC problem?)? 
+ +API Additions +------------- + +* ``importlib.abc.Loader.exec_module()`` will execute a module in its + own namespace, replacing ``importlib.abc.Loader.load_module()``. +* ``importlib.abc.Loader.create_module()`` (optional) will return a new + module to use for loading. +* Module objects will have a new attribute: ``__spec__``. +* ``importlib.find_spec()`` will return the spec for a module. +* ``__subclasshook__()`` will be implemented on the importlib ABCs. + +.. XXX Do __subclasshook__() separately from the PEP (issue18862). + +API Changes +----------- + +* Import-related module attributes will no longer be authoritative nor + used by the import system. +* ``InspectLoader.is_package()`` will become optional. + +.. XXX module __repr__() will prefer spec attributes? + +Deprecations +------------ + +* ``importlib.abc.MetaPathFinder.find_module()`` +* ``importlib.abc.PathEntryFinder.find_module()`` +* ``importlib.abc.PathEntryFinder.find_loader()`` +* ``importlib.abc.Loader.load_module()`` +* ``importlib.abc.Loader.module_repr()`` +* The parameters and attributes of the various loaders in + ``importlib.machinery`` +* ``importlib.util.set_package()`` +* ``importlib.util.set_loader()`` +* ``importlib.find_loader()`` + +Removals +-------- + +* ``importlib.abc.Loader.init_module_attrs()`` +* ``importlib.util.module_to_load()`` + +Other Changes +------------- + +* The spec for the ``__main__`` module will reflect the appropriate + name and origin. +* The module type's ``__repr__`` will defer to ModuleSpec exclusively. + +Backward-Compatibility +---------------------- + +* If a finder does not define ``find_spec()``, a spec is derived from + the loader returned by ``find_module()``. +* ``PathEntryFinder.find_loader()`` will be used, if defined. +* ``Loader.load_module()`` is used if ``exec_module()`` is not defined. +* ``Loader.module_repr()`` is used by ``ModuleSpec.module_repr()`` if it + exists. + +What Will not Change? 
+--------------------- + +* The syntax and semantics of the import statement. +* Existing finders and loaders will continue to work normally. +* The import-related module attributes will still be initialized with + the same information. +* Finders will still create loaders, storing them in the specs. +* ``Loader.load_module()``, if a module defines it, will have all the + same requirements and may still be called directly. +* Loaders will still be responsible for module data APIs. + + +ModuleSpec Users +================ + +``ModuleSpec`` objects have 3 distinct target audiences: Python itself, +import hooks, and normal Python users. + +Python will use specs in the import machinery, in interpreter startup, +and in various standard library modules. Some modules are +import-oriented, like pkgutil, and others are not, like pickle and +pydoc. In all cases, the full ``ModuleSpec`` API will get used. + +Import hooks (finders and loaders) will make use of the spec in specific +ways, mostly without using the ``ModuleSpec`` instance methods. First +of all, finders will use the factory methods to create spec objects. +They may also directly adjust the spec attributes after the spec is +created. Secondly, the finder may bind additional information to the +spec for the loader to consume during module creation/execution. +Finally, loaders will make use of the attributes on a spec when creating +and/or executing a module. + +Python users will be able to inspect a module's ``__spec__`` to get +import-related information about the object. Generally, they will not +be using the ``ModuleSpec`` factory methods nor the instance methods. +However, each spec has methods named ``create``, ``exec``, ``load``, and +``reload``. Since they are so easy to access (and misunderstand/abuse), +their function and availability require explicit consideration in this +proposal. + + +What Will Existing Finders and Loaders Have to Do Differently? 
+============================================================== + +Immediately? Nothing. The status quo will be deprecated, but will +continue working. However, here are the things that the authors of +finders and loaders should change relative to this PEP: + +* Implement ``find_spec()`` on finders. +* Implement ``exec_module()`` on loaders, if possible. + +The factory methods of ``ModuleSpec`` are intended to be helpful for +converting existing finders. ``from_loader()`` and +``from_file_location()`` are both straight-forward utilities in this +regard. In the case where loaders already expose methods for creating +and preparing modules, a finder may use ``ModuleSpec.from_module()`` on +a throw-away module to create the appropriate spec. + +As for loaders, ``exec_module()`` should be a relatively direct +conversion from a portion of the existing ``load_module()``. However, +``Loader.create_module()`` will also be necessary in some uncommon +cases. Furthermore, ``load_module()`` will still work as a final option +when ``exec_module()`` is not appropriate. + + +How Loading Will Work +===================== + +This is an outline of what happens in ``ModuleSpec.load()``. + +1. A new module is created by calling ``spec.create()``. + + a. If the loader has a ``create_module()`` method, it gets called. + Otherwise a new module gets created. + b. The import-related module attributes are set. + +2. The module is added to sys.modules. +3. ``spec.exec(module)`` gets called. + + a. If the loader has an ``exec_module()`` method, it gets called. + Otherwise ``load_module()`` gets called for backward-compatibility + and the resulting module is updated to match the spec. + +4. If there were any errors the module is removed from sys.modules. +5. If the module was replaced in sys.modules during ``exec()``, the one + in sys.modules is updated to match the spec. +6. The module in sys.modules is returned. 
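The numbered steps above can be approximated with helpers that ship in released versions of ``importlib``. This is only a sketch under stated assumptions, not the PEP's reference implementation: the draft methods ``spec.create()``, ``spec.exec()``, and ``spec.load()`` are stood in for by ``importlib.util.module_from_spec()`` and ``loader.exec_module()``, the ``load_module()`` fallback and step 5 are left out, and ``DictLoader`` and ``load_from_spec`` are hypothetical names used for illustration.

```python
import importlib.abc
import importlib.util
import sys


class DictLoader(importlib.abc.Loader):
    """Hypothetical loader that executes source text held in a dict."""

    def __init__(self, sources):
        self.sources = sources

    def create_module(self, spec):
        # Step 1a: returning None requests an ordinary new module object.
        return None

    def exec_module(self, module):
        # Step 3a: execute the code in the module's own namespace.
        exec(self.sources[module.__name__], module.__dict__)


def load_from_spec(spec):
    """Hypothetical helper approximating the outlined steps (not the PEP API)."""
    if spec.loader is None:
        raise ValueError("spec has no loader")
    # Step 1: create the module and set its import-related attributes.
    module = importlib.util.module_from_spec(spec)
    # Step 2: register the module before executing it.
    sys.modules[spec.name] = module
    try:
        # Step 3: hand the module to the loader for execution.
        spec.loader.exec_module(module)
    except BaseException:
        # Step 4: on any error, remove the module from sys.modules.
        sys.modules.pop(spec.name, None)
        raise
    # Step 6: return whatever is in sys.modules, since the module may have
    # replaced itself there while executing. (Step 5, updating a replaced
    # module to match the spec, is omitted from this sketch.)
    return sys.modules[spec.name]


loader = DictLoader({"demo_mod": "answer = 6 * 7"})
spec = importlib.util.spec_from_loader("demo_mod", loader)
module = load_from_spec(spec)
print(module.answer, module.__spec__ is spec)
```

The loader itself only supplies the execution step; the surrounding bookkeeping around ``sys.modules`` lives in one place outside the loader, which is the simplification this section describes.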
+ +These steps are exactly what ``Loader.load_module()`` is already +expected to do. Loaders will thus be simplified since they will only +need to implement the portion in step 3a. + ModuleSpec ----------- +========== -A new class which defines the import-related values to use when loading -the module. It closely corresponds to the import-related attributes of -module objects. ``ModuleSpec`` objects may also be used by finders and -loaders and other import-related APIs to hold extra import-related -state about the module. This greatly reduces the need to add any new -new import-related attributes to module objects, and loader ``__init__`` -methods won't need to accommodate such per-module state. +This is a new class which defines the import-related values to use when +loading the module. It closely corresponds to the import-related +attributes of module objects. ``ModuleSpec`` objects may also be used +by finders and loaders and other import-related APIs to hold extra +import-related state concerning the module. This greatly reduces the +need to add any new import-related attributes to module objects, and +loader ``__init__`` methods will no longer need to accommodate such +per-module state. -Creating a ModuleSpec: +General Notes +------------- -``ModuleSpec(name, loader, *, origin=None, filename=None, cached=None, -path=None)`` +* The spec for each module instance will be unique to that instance even + if the information is identical to that of another spec. +* A module's spec is not intended to be modified by anything but + finders. -Passed in parameter values are assigned directly to the corresponding -attributes below. Other attributes not listed as parameters (such as -``package``) are read-only properties that are automatically derived -from these values. - -The ``ModuleSpec.from_loader()`` class method allows a suitable -ModuleSpec instance to be easily created from a PEP 302 loader object. 
- -ModuleSpec Attributes +Creating a ModuleSpec --------------------- +**ModuleSpec(name, loader, *, origin=None, is_package=None)** + +.. container:: + + ``name``, ``loader``, and ``origin`` are set on the new instance + without any modification. If ``is_package`` is not passed in, the + loader's ``is_package()`` gets called (if available), or it defaults + to ``False``. If ``is_package`` is true, + ``submodule_search_locations`` is set to a new empty list. Otherwise + it is set to None. + + Other attributes not listed as parameters (such as ``package``) are + either read-only dynamic properties or default to None. + +**from_filename(name, loader, *, filename=None, submodule_search_locations=None)** + +.. container:: + + This factory classmethod allows a suitable ModuleSpec instance to be + easily created with extra file-related information. This includes + the values that would be set on a module as ``__file__`` or + ``__cached__``. + + ``is_location`` is set to True for specs created using + ``from_filename()``. + +**from_module(module, loader=None)** + +.. container:: + + This factory is used to create a spec based on the import-related + attributes of an existing module. Since modules should already have + ``__spec__`` set, this method has limited utility. + +**from_loader(name, loader, *, origin=None, is_package=None)** + +.. container:: + + A factory classmethod that returns a new ``ModuleSpec`` derived from + the arguments. ``is_package`` is used inside the method to indicate + that the module is a package. If not explicitly passed in, it falls + back to using the result of the loader's ``is_package()``, if + available. If not available, it defaults to False. + + In contrast to ``ModuleSpec.__init__()``, which takes the arguments + as-is, ``from_loader()`` calculates missing values from the ones + passed in, as much as possible. 
This replaces the behavior that is + currently provided by several ``importlib.util`` functions as well as + the optional ``init_module_attrs()`` method of loaders. Just to be + clear, here is a more detailed description of those calculations:: + + If not passed in, ``filename`` is set to the result of calling the + loader's ``get_filename()``, if available. Otherwise it stays + unset (``None``). + + If not passed in, ``submodule_search_locations`` is set to an empty + list if ``is_package`` is true. Then the directory from ``filename`` + is appended to it, if possible. If ``is_package`` is false, + ``submodule_search_locations`` stays unset. + + If ``cached`` is not passed in and ``filename`` is passed in, + ``cached`` is derived from it. For filenames with a source suffix, + it is set to the result of calling + ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. + ``.pyc``), ``cached`` is set to the value of ``filename``. If + ``filename`` is not passed in or ``cache_from_source()`` raises + ``NotImplementedError``, ``cached`` stays unset. + + If not passed in, ``origin`` is set to ``filename``. Thus if + ``filename`` is unset, ``origin`` stays unset. + + +Attributes +---------- + Each of the following names is an attribute on ``ModuleSpec`` objects. A value of ``None`` indicates "not set". This contrasts with module objects where the attribute simply doesn't exist. -While ``package`` and ``is_package`` are read-only properties, the -remaining attributes can be replaced after the module spec is created -and after import is complete. This allows for unusual cases where -modifying the spec is the best option. However, typical use should not -involve changing the state of a module's spec. +While ``package`` is a read-only property, the remaining attributes can +be replaced after the module spec is created and even after import is +complete. This allows for unusual cases where directly modifying the +spec is the best option. 
However, typical use should not involve +changing the state of a module's spec. Most of the attributes correspond to the import-related attributes of modules. Here is the mapping, followed by a description of the attributes. The reverse of this mapping is used by -``init_module_attrs()``. +``ModuleSpec.init_module_attrs()``. -============= =========== -On ModuleSpec On Modules -============= =========== -name __name__ -loader __loader__ -package __package__ -is_package - -origin - -filename __file__ -cached __cached__ -path __path__ -============= =========== +========================== =========== +On ModuleSpec On Modules +========================== =========== +name __name__ +loader __loader__ +package __package__ +origin __file__* +cached __cached__* +submodule_search_locations __path__** +loading_info \- +has_location (RO-property) \- +========================== =========== -``name`` +\* Only if ``is_location`` is true. +\*\* Only if not None. -The module's fully resolved and absolute name. It must be set. +**name** -``loader`` +.. container:: -The loader to use during loading and for module data. These specific -functionalities do not change for loaders. Finders are still -responsible for creating the loader and this attribute is where it is -stored. The loader must be set. + The module's fully resolved and absolute name. It must be set. -``package`` +**loader** -The name of the module's parent. This is a dynamic attribute with a -value derived from ``name`` and ``is_package``. For packages it is the -value of ``name``. Otherwise it is equivalent to -``name.rpartition('.')[0]``. Consequently, a top-level module will have -the empty string for ``package``. +.. container:: + The loader to use during loading and for module data. These specific + functionalities do not change for loaders. Finders are still + responsible for creating the loader and this attribute is where it is + stored. The loader must be set. 
-``is_package`` +**origin** -Whether or not the module is a package. This dynamic attribute is True -if ``path`` is not None (e.g. the empty list is a "true" value), else it -is false. +.. container:: -``origin`` + A string for the location from which the module originates. Aside from + the informational value, it is also used in ``module_repr()``. -A string for the location from which the module originates. If -``filename`` is set, ``origin`` should be set to the same value unless -some other value is more appropriate. ``origin`` is used in -``module_repr()`` if it does not match the value of ``filename``. + The module attribute ``__file__`` has a similar but more restricted + meaning. Not all modules have it set (e.g. built-in modules). However, + ``origin`` is applicable to essentially all modules. For built-in + modules it would be set to "built-in". -Using ``filename`` for this meaning would be inaccurate, since not all -modules have path-based locations. For instance, built-in modules do -not have ``__file__`` set. Yet it is useful to have a descriptive -string indicating that it originated from the interpreter as a built-in -module. So built-in modules will have ``origin`` set to ``"built-in"``. +Secondary Attributes +-------------------- -Path-based attributes: +Some of the ``ModuleSpec`` attributes are not set via arguments when +creating a new spec. Either they are strictly dynamically calculated +properties or they are simply set to None (aka "not set"). For the +latter case, those attributes may still be set directly. -If any of these is set, it indicates that the module is path-based. For -reference, a path entry is a string for a location where the import -system will look for modules, e.g. the path entries in ``sys.path`` or a -package's ``__path__``). +**package** -``filename`` +.. container:: -Like ``origin``, but limited to a path-based location. 
If ``filename`` -is set, ``origin`` should be set to the same string, unless origin is -explicitly set to something else. ``filename`` is not necessarily an -actual file name, but could be any location string based on a path -entry. Regarding the attribute name, while it is potentially -inaccurate, it is both consistent with the equivalent module attribute -and generally accurate. + A dynamic property that gives the name of the module's parent. The + value is derived from ``name`` and ``is_package``. For packages it is + the value of ``name``. Otherwise it is equivalent to + ``name.rpartition('.')[0]``. Consequently, a top-level module will have + the empty string for ``package``. -.. XXX Would a different name be better? ``path_location``? +**has_location** -``cached`` +.. container:: -The path-based location where the compiled code for a module should be -stored. If ``filename`` is set to a source file, this should be set to -corresponding path that PEP 3147 specifies. The -``importlib.util.source_to_cache()`` function facilitates getting the -correct value. + Some modules can be loaded by reference to a location, e.g. a filesystem + path or a URL or something of the sort. Having the location lets you + load the module, but in theory you could load that module under various + names. -``path`` + In contrast, non-located modules can't be loaded in this fashion, e.g. + builtin modules and modules dynamically created in code. For these, the + name is the only way to access them, so they have an "origin" but not a + "location". -The list of path entries in which to search for submodules if this -module is a package. Otherwise it is ``None``. + This attribute reflects whether or not the module is locatable. If it + is, ``origin`` must be set to the module's location and ``__file__`` + will be set on the module. Furthermore, a locatable module is also + cacheable and so ``__cached__`` is tied to ``has_location``. -.. XXX add a path-based subclass? 
+ The corresponding module attribute name, ``__file__``, is somewhat + inaccurate and potentially confusing, so we will use a more explicit + combination of ``origin`` and ``has_location`` to represent the same + information. Having a separate ``filename`` is unnecessary since we have + ``origin``. -ModuleSpec Methods ------------------- +**cached** -``from_loader(name, loader, *, is_package=None, origin=None, filename=None, cached=None, path=None)`` +.. container:: -.. XXX use a different name? + A string for the location where the compiled code for a module should be + stored. PEP 3147 details the caching mechanism of the import system. -A factory classmethod that returns a new ``ModuleSpec`` derived from the -arguments. ``is_package`` is used inside the method to indicate that -the module is a package. If not explicitly passed in, it is set to -``True`` if ``path`` is passed in. It falls back to using the result of -the loader's ``is_package()``, if available. Finally it defaults to -False. The remaining parameters have the same meaning as the -corresponding ``ModuleSpec`` attributes. + If ``has_location`` is true, this location string is set on the module + as ``__cached__``. When ``from_filename()`` is used to create a spec, + ``cached`` is set to the result of calling + ``importlib.util.cache_from_source()``. -In contrast to ``ModuleSpec.__init__()``, which takes the arguments -as-is, ``from_loader()`` calculates missing values from the ones passed -in, as much as possible. This replaces the behavior that is currently -provided by several ``importlib.util`` functions as well as the optional -``init_module_attrs()`` method of loaders. Just to be clear, here is a -more detailed description of those calculations:: + ``cached`` is not necessarily a file location. A finder or loader may + store an alternate location string in ``cached``. However, in practice + this will be the file location dictated by PEP 3147. 
- If not passed in, ``filename`` is to the result of calling the - loader's ``get_filename()``, if available. Otherwise it stays - unset (``None``). +**submodule_search_locations** - If not passed in, ``path`` is set to an empty list if - ``is_package`` is true. Then the directory from ``filename`` is - appended to it, if possible. If ``is_package`` is false, ``path`` - stays unset. +.. container:: - If ``cached`` is not passed in and ``filename`` is passed in, - ``cached`` is derived from it. For filenames with a source suffix, - it set to the result of calling - ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. - ``.pyc``), ``cached`` is set to the value of ``filename``. If - ``filename`` is not passed in or ``cache_from_source()`` raises - ``NotImplementedError``, ``cached`` stays unset. + The list of location strings, typically directory paths, in which to + search for submodules. If the module is a package this will be set to + a list (even an empty one). Otherwise it is ``None``. - If not passed in, ``origin`` is set to ``filename``. Thus if - ``filename`` is unset, ``origin`` stays unset. + The corresponding module attribute's name, ``__path__``, is relatively + ambiguous. Instead of mirroring it, we use a more explicit name that + makes the purpose clear. -``module_repr()`` +**loading_info** -Returns a repr string for the module if ``origin`` is set and -``filename`` is not set. The string refers to the value of ``origin``. -Otherwise ``module_repr()`` returns None. This indicates to the module -type's ``__repr__()`` that it should fall back to the default repr. +.. container:: -We could also have ``module_repr()`` produce the repr for the case where -``filename`` is set or where ``origin`` is not set, mirroring the repr -that the module type produces directly. However, the repr string is -derived from the import-related module attributes, which might be out of -sync with the spec. 
+ A finder may set ``loading_info`` to any value to provide additional + data for the loader to use during loading. A value of ``None`` is the + default and indicates that there is no additional data. Otherwise it is + likely set to some container, such as a ``dict``, ``list``, or + ``types.SimpleNamespace`` containing the relevant extra information. -.. XXX Is using the spec close enough? Probably not. + For example, ``zipimporter`` could use it to pass the zip archive name + to the loader directly, rather than needing to derive it from ``origin`` + or create a custom loader for each find operation. -The implementation of the module type's ``__repr__()`` will change to -accommodate this PEP. However, the current functionality will remain to -handle the case where a module does not have a ``__spec__`` attribute. +Methods +------- -.. XXX Clarify the above justification. +**module_repr()** -``init_module_attrs(module)`` +.. container:: -Sets the module's import-related attributes to the corresponding values -in the module spec. If a path-based attribute is not set on the spec, -it is not set on the module. For the rest, a ``None`` value on the spec -(aka "not set") means ``None`` will be set on the module. If any of the -attributes are already set on the module, the existing values are -replaced. The module's own ``__spec__`` is not consulted but does get -replaced with the spec on which ``init_module_attrs()`` was called. -The earlier mapping of ``ModuleSpec`` attributes to module attributes -indicates which attributes are involved on both sides. + Returns a repr string for the module, based on the module's import- + related attributes and falling back to the spec's attributes. The + string will reflect the current output of the module type's + ``__repr__()``. -``load(module=None, *, is_reload=False)`` + The module type's ``__repr__()`` will use the module's ``__spec__`` + exclusively. 
If the module does not have ``__spec__`` set, a spec is + generated using ``ModuleSpec.from_module()``. -This method captures the current functionality of and requirements on -``Loader.load_module()`` without any semantic changes, except one. -Reloading a module when ``exec_module()`` is available actually uses -``module`` rather than ignoring it in favor of the one in -``sys.modules``, as ``Loader.load_module()`` does. + Since the module attributes may be out of sync with the spec and to + preserve backward-compatibility in that case, we defer to the module + attributes and only when they are missing do we fall back to the spec + attributes. -``module`` is only allowed when ``is_reload`` is true. This means that -``is_reload`` could be dropped as a parameter. However, doing so would -mean we could not use ``None`` to indicate that the module should be -pulled from ``sys.modules``. Furthermore, ``is_reload`` makes the -intent of the call clear. +**init_module_attrs(module)** -There are two parts to what happens in ``load()``. First, the module is -prepared, loaded, updated appropriately, and left available for the -second part. This is described in more detail shortly. +.. container:: -Second, in the case of error during a normal load (not reload) the -module is removed from ``sys.modules``. If no error happened, the -module is pulled from ``sys.modules``. This the module returned by -``load()``. Before it is returned, if it is a different object than the -one produced by the first part, attributes of the module from -``sys.modules`` are updated to reflect the spec. + Sets the module's import-related attributes to the corresponding values + in the module spec. If ``has_location`` is false on the spec, + ``__file__`` and ``__cached__`` are not set on the module. ``__path__`` + is only set on the module if ``submodule_search_locations`` is not None. 
+ For the rest of the import-related module attributes, a ``None`` value + on the spec (aka "not set") means ``None`` will be set on the module. + If any of the attributes are already set on the module, the existing + values are replaced. The module's own ``__spec__`` is not consulted but + does get replaced with the spec on which ``init_module_attrs()`` was + called. The earlier mapping of ``ModuleSpec`` attributes to module + attributes indicates which attributes are involved on both sides. -Returning the module from ``sys.modules`` accommodates the ability of -the module to replace itself there while it is executing (during load). +**create()** -As already noted, this is what already happens in the import system. -``load()`` is not meant to change any of this behavior. +.. container:: -Regarding the first part of ``load()``, the following describes what -happens. It depends on if ``is_reload`` is true and if the loader has -``exec_module()``. + A new module is created relative to the spec and its import-related + attributes are set accordingly. If the spec's loader has a + ``create_module()`` method, that gets called to create the module. This + gives the loader a chance to do any pre-loading initialization that can't + otherwise be accomplished elsewhere. Otherwise a bare module object is + created. In both cases ``init_module_attrs()`` is called on the module + before it gets returned. -For normal load with ``exec_module()`` available:: +**exec(module)** - A new module is created, ``init_module_attrs()`` is called to set - its attributes, and it is set on sys.modules. At that point - the loader's ``exec_module()`` is called, after which the module - is ready for the second part of loading. +.. container:: -.. XXX What if the module already exists in sys.modules? + The spec's loader is used to execute the module. If the loader has + ``exec_module()`` defined, the namespace of ``module`` is the target of + execution. 
Otherwise the loader's ``load_module()`` is called, which + ignores ``module`` and returns the module that was the actual + execution target. In that case the import-related attributes of that + module are updated to reflect the spec. In both cases the targeted + module is the one that gets returned. -For normal load without ``exec_module()`` available:: +**load()** - The loader's ``load_module()`` is called and the attributes of the - module it returns are updated to match the spec. +.. container:: -For reload with ``exec_module()`` available:: + This method captures the current functionality of and requirements on + ``Loader.load_module()`` without any semantic changes. It is + essentially a wrapper around ``create()`` and ``exec()`` with some + extra functionality regarding ``sys.modules``. - If ``module`` is ``None``, it is pulled from ``sys.modules``. If - still ``None``, ImportError is raised. Otherwise ``exec_module()`` - is called, passing in the module-to-be-reloaded. -For reload without ``exec_module()`` available:: + Right before ``exec()`` is called, the module is added to + ``sys.modules``. In the case of error during loading the module is + removed from ``sys.modules``. The module in ``sys.modules`` when + ``load()`` finishes is the one that gets returned. Returning the module + from ``sys.modules`` accommodates the ability of the module to replace + itself there while it is executing (during load). - The loader's ``load_module()`` is called and the attributes of the - module it returns are updated to match the spec. + As already noted, this is what already happens in the import system. + ``load()`` is not meant to change any of this behavior. -There is some boilerplate involved when ``exec_module()`` is available, -but only the boilerplate that the import system uses currently. 
+ If ``loader`` is not set (``None``), ``load()`` raises a ValueError. -If ``loader`` is not set (``None``), ``load()`` raises a ValueError. If -``module`` is passed in but ``is_reload`` is false, a ValueError is also -raises to indicate that ``load()`` was called incorrectly. There may be -use cases for calling ``load()`` in that way, but they are outside the -scope of this PEP +**reload(module)** -.. XXX add reload(module=None) and drop load()'s parameters entirely? -.. XXX add more of importlib.reload()'s boilerplate to load()/reload()? +.. container:: + + As with ``load()`` this method faithfully fulfills the semantics of + ``Loader.load_module()`` in the reload case, with one exception: + reloading a module when ``exec_module()`` is available actually uses + ``module`` rather than ignoring it in favor of the one in + ``sys.modules``, as ``Loader.load_module()`` does. The functionality + here mirrors that of ``load()``, minus the ``create()`` call and the + ``sys.modules`` handling. + +.. XXX add more of importlib.reload()'s boilerplate to reload()? Omitted Attributes and Methods ------------------------------ -``ModuleSpec`` does not have a ``from_module()`` factory method since -all modules should already have a spec. +There is no ``PathModuleSpec`` subclass of ``ModuleSpec`` that provides +the ``has_location``, ``cached``, and ``submodule_search_locations`` +functionality. While that might make the separation cleaner, module +objects don't have that distinction. ``ModuleSpec`` will support both +cases equally well. -Additionally, there is no ``PathModuleSpec`` subclass of ``ModuleSpec`` -that provides the ``filename``, ``cached``, and ``path`` functionality. -While that might make the separation cleaner, module objects don't have -that distinction. ``ModuleSpec`` will support both cases equally well. 
+While ``is_package`` would be a simple additional attribute (aliasing +``self.submodule_search_locations is not None``), it perpetuates the +artificial (and mostly erroneous) distinction between modules and +packages. + +Conceivably, ``ModuleSpec.load()`` could optionally take a list of +modules with which to interact instead of ``sys.modules``. That +capability is left out of this PEP, but may be pursued separately at +some other time, including relative to PEP 406 (import engine). + +Likewise ``load()`` could be leveraged to implement multi-version +imports. While interesting, doing so is outside the scope of this +proposal. Backward Compatibility ---------------------- -Since ``Finder.find_module()`` methods would now return a module spec -instead of loader, specs must act like the loader that would have been -returned instead. This is relatively simple to solve since the loader -is available as an attribute of the spec. We will use ``__getattr__()`` -to do it. - -However, ``ModuleSpec.is_package`` (an attribute) conflicts with -``InspectLoader.is_package()`` (a method). Working around this requires -a more complicated solution but is not a large obstacle. Simply making -``ModuleSpec.is_package`` a method does not reflect that is a relatively -static piece of data. ``module_repr()`` also conflicts with the same -method on loaders, but that workaround is not complicated since both are -methods. - -Unfortunately, the ability to proxy does not extend to ``id()`` -comparisons and ``isinstance()`` tests. In the case of the return value -of ``find_module()``, we accept that break in backward compatibility. -However, we will mitigate the problem with ``isinstance()`` somewhat by -registering ``ModuleSpec`` on the loaders in ``importlib.abc``. +``ModuleSpec`` doesn't have any. This would be a different story if +``Finder.find_module()`` were to return a module spec instead of loader. 
+In that case, specs would have to act like the loader that would have +been returned instead. Doing so would be relatively simple, but is an +unnecessary complication. Subclassing ----------- Subclasses of ModuleSpec are allowed, but should not be necessary. -Adding functionality to a custom finder or loader will likely be a -better fit and should be tried first. However, as long as a subclass -still fulfills the requirements of the import system, objects of that -type are completely fine as the return value of ``find_module()``. +Simply setting ``loading_info`` or adding functionality to a custom +finder or loader will likely be a better fit and should be tried first. +However, as long as a subclass still fulfills the requirements of the +import system, objects of that type are completely fine as the return +value of ``Finder.find_spec()``. + + +Existing Types +============== Module Objects -------------- -Module objects will now have a ``__spec__`` attribute to which the -module's spec will be bound. None of the other import-related module -attributes will be changed or deprecated, though some of them could be; -any such deprecation can wait until Python 4. +**__spec__** + +.. container:: + + Module objects will now have a ``__spec__`` attribute to which the + module's spec will be bound. + +None of the other import-related module attributes will be changed or +deprecated, though some of them could be; any such deprecation can wait +until Python 4. ``ModuleSpec`` objects will not be kept in sync with the corresponding module object's import-related attributes. Though they may differ, in @@ -438,32 +708,30 @@ using the ``-m`` flag. In that case ``module.__spec__.name`` will reflect the actual module name while ``module.__name__`` will be ``__main__``. -The ``__file__`` attribute will be set where applicable in the same way -it is now. For instance, zip imports will still have it set for -backward-compatibility reasons. 
However, the recommendation will be to -have ``__file__`` set only for actual filenames from now on. - Finders ------- -Finders will now return ModuleSpec objects when ``find_module()`` is -called rather than loaders. For backward compatility, ``Modulespec`` -objects proxy the attributes of their ``loader`` attribute. +**MetaPathFinder.find_spec(name, path=None)** -Adding another similar method to avoid backward-compatibility issues -is undersireable if avoidable. The import APIs have suffered enough, -especially considering ``PathEntryFinder.find_loader()`` was just -added in Python 3.3. The approach taken by this PEP should be -sufficient to address backward-compatibility issues for -``find_module()``. +**PathEntryFinder.find_spec(name)** -The change to ``find_module()`` applies to both ``MetaPathFinder`` and -``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be -deprecated and, for backward compatibility, implicitly special-cased if -the method exists on a finder. +.. container:: + + Finders will return ModuleSpec objects when ``find_spec()`` is + called. This new method replaces ``find_module()`` and + ``find_loader()`` (in the ``PathEntryFinder`` case). If a finder does + not have ``find_spec()``, ``find_module()`` and ``find_loader()`` are + used instead, for backward-compatibility. + + Adding yet another similar method to finders is a case of practicality. + ``find_module()`` could be changed to return specs instead of loaders. + This is tempting because the import APIs have suffered enough, + especially considering ``PathEntryFinder.find_loader()`` was just + added in Python 3.3. However, the extra complexity and a less-than- + explicit method name aren't worth it. Finders are still responsible for creating the loader. That loader will -now be stored in the module spec returned by ``find_module()`` rather +now be stored in the module spec returned by ``find_spec()`` rather than returned directly.
As is currently the case without the PEP, if a loader would be costly to create, that loader can be designed to defer the cost until later. @@ -471,26 +739,45 @@ the cost until later. Loaders ------- -Loaders will have a new method, ``exec_module(module)``. Its only job -is to "exec" the module and consequently populate the module's -namespace. It is not responsible for creating or preparing the module -object, nor for any cleanup afterward. It has no return value. +**Loader.exec_module(module)** -The ``load_module()`` of loaders will still work and be an active part -of the loader API. It is still useful for cases where the default -module creation/prepartion/cleanup is not appropriate for the loader. +.. container:: -For example, the C API for extension modules only supports the full -control of ``load_module()``. As such, ``ExtensionFileLoader`` will not -implement ``exec_module()``. In the future it may be appropriate to -produce a second C API that would support an ``exec_module()`` -implementation for ``ExtensionFileLoader``. Such a change is outside -the scope of this PEP. + Loaders will have a new method, ``exec_module()``. Its only job + is to "exec" the module and consequently populate the module's + namespace. It is not responsible for creating or preparing the module + object, nor for any cleanup afterward. It has no return value. + +**Loader.load_module(fullname)** + +.. container:: + + The ``load_module()`` of loaders will still work and be an active part + of the loader API. It is still useful for cases where the default + module creation/preparation/cleanup is not appropriate for the loader. + If implemented, ``load_module()`` will still be responsible for its + current requirements (prep/exec/etc.) since the method may be called + directly. + + For example, the C API for extension modules only supports the full + control of ``load_module()``. As such, ``ExtensionFileLoader`` will not + implement ``exec_module()``.
In the future it may be appropriate to + produce a second C API that would support an ``exec_module()`` + implementation for ``ExtensionFileLoader``. Such a change is outside + the scope of this PEP. A loader must define either ``exec_module()`` or ``load_module()``. If both exist on the loader, ``ModuleSpec.load()`` uses ``exec_module()`` and ignores ``load_module()``. +**Loader.create_module(spec)** + +.. container:: + + Loaders may also implement ``create_module()`` that will return a + new module to exec. However, most loaders will not need to implement + the method. + PEP 420 introduced the optional ``module_repr()`` loader method to limit the amount of special-casing in the module type's ``__repr__()``. Since this method is part of ``ModuleSpec``, it will be deprecated on loaders. @@ -506,86 +793,56 @@ not otherwise available. Still, it will be made optional. The path-based loaders in ``importlib`` take arguments in their ``__init__()`` and have corresponding attributes. However, the need for -those values is eliminated. The only exception is +those values is eliminated by module specs. The only exception is ``FileLoader.get_filename()``, which uses ``self.path``. The signatures for these loaders and the accompanying attributes will be deprecated. In addition to executing a module during loading, loaders will still be directly responsible for providing APIs concerning module-related data. + Other Changes -------------- +============= * The various finders and loaders provided by ``importlib`` will be updated to comply with this proposal. - * The spec for the ``__main__`` module will reflect how the interpreter was started. For instance, with ``-m`` the spec's name will be that of the run module, while ``__main__.__name__`` will still be "__main__". - -* We add ``importlib.find_module()`` to mirror +* We add ``importlib.find_spec()`` to mirror ``importlib.find_loader()`` (which becomes deprecated). 
- * Deprecations in ``importlib.util``: ``set_package()``, ``set_loader()``, and ``module_for_loader()``. ``module_to_load()`` (introduced prior to Python 3.4's release) can be removed. - * ``importlib.reload()`` is changed to use ``ModuleSpec.load()``. - * ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of the per-module import lock, whereas ``Loader.load_module()`` did not. + Reference Implementation ------------------------- +======================== -A reference implementation is available at . +A reference implementation will be available at +http://bugs.python.org/issue18864. -Open Questions +Open Issues ============== -* How to avoid having custom ModuleSpec attributes conflict with future - normal attributes? +\* The impact of this change on pkgutil (and setuptools) needs looking +into. It has some generic function-based extensions to PEP 302. These +may break if importlib starts wrapping loaders without the tools' +knowledge. -This could be done with a sub-namespace bound to a single ModuleSpec -attribute. It could also be done by reserving names with a single -leading underscore for custom attributes. Or we could just not worry -about it. +\* Other modules to look at: runpy (and pythonrun.c), pickle, pydoc, +inspect. -* Get rid of the ``is_package`` property? +\* Add ``ModuleSpec.data`` as a descriptor that wraps the data API of the +spec's loader? -It duplicates information -both in the ``ModuleSpec()`` signature and in attributes. It is -technically unncessary in light of the path attribute and it conflicts -with ``InspectLoader.is_package()``, which makes the implementation more -complicated. However, it also provides an explicit indicator of -package-ness, which helps those less familiar with the import system. - -* Deprecate the use of ``__file__`` for anything except actual files? - -* Introduce a new extension module API that takes advantage of - ``ModuleSpec``? I'd rather that be part of a separate proposal. 
- -* Add ``create_module()`` to loaders? - -It would take a ``ModuleSpec`` -and return the module that should be passed to ``spec.exec()``. This -method would be helpful for new extension module import APIs. - -* Have ``ModuleSpec.module_repr()`` replace more of the module type's - ``__repr__()`` implementation? - -A compliant module is required to have -``__spec__`` set so that should work. However, currently the repr uses -the module attributes. Using the spec attributes would give precedence -to the spec in the case that they differ, which would be -backward-incompatible. - -* Factor the path-based attributes/functionality into a subclass-- - something like ``PathModuleSpec``? - -It looks like there just isn't enough benefit to doing so. +\* How to limit possible end-user confusion/abuses relative to spec +attributes (since __spec__ will make them really accessible)? References