Fix backward-compatibility hole described by Jeff Hardy in:
http://mail.python.org/pipermail/python-dev/2011-July/112370.html

Using the approach described here:
http://mail.python.org/pipermail/python-dev/2011-July/112374.html

This should now restrict backward-compatibility concerns to tool-support questions, unless somebody comes up with another way to break it. ;-)
parent 62a9a418f9
commit d65528918f
191
pep-0402.txt
@@ -339,17 +339,57 @@ it. If this is done *only* by importing a top-level module (i.e., not
 checking for a ``__version__`` or some other attribute), *and* there
 is a directory of the same name as the sought-for package on
 ``sys.path`` somewhere, *and* the package is not actually installed,
-then such code could *perhaps* be fooled into thinking a package is
-installed that really isn't.
+then such code could be fooled into thinking a package is installed
+that really isn't.
 
-However, even in the rare case where all these conditions line up to
-happen at once, the failure is more likely to be annoying than
-damaging. In most cases, after all, the code will simply fail a
-little later on, when it actually tries to DO something with the
-imported (but empty) module. (And code that checks ``__version__``
-attributes or for the presence of some desired function, class, or
-module in the package will not see a false positive result in the
-first place.)
+For example, suppose someone writes a script (``datagen.py``)
+containing the following code::
+
+    try:
+        import json
+    except ImportError:
+        import simplejson as json
+
+And runs it in a directory laid out like this::
+
+    datagen.py
+    json/
+        foo.js
+        bar.js
+
+If ``import json`` succeeded due to the mere presence of the ``json/``
+subdirectory, the code would incorrectly believe that the ``json``
+module was available, and proceed to fail with an error.
+
+However, we can prevent corner cases like these from arising, simply
+by making one small change to the algorithm presented so far. Instead
+of allowing you to import a "pure virtual" package (like ``zc``),
+we allow only importing of the *contents* of virtual packages.
+
+That is, a statement like ``import zc`` should raise ``ImportError``
+if there is no ``zc.py`` or ``zc/__init__.py`` on ``sys.path``. But,
+doing ``import zc.buildout`` should still succeed, as long as there's
+a ``zc/buildout.py`` or ``zc/buildout/__init__.py`` on ``sys.path``.
+
+In other words, we don't allow pure virtual packages to be imported
+directly, only modules and self-contained packages. (This is an
+acceptable limitation, because there is no *functional* value to
+importing such a package by itself. After all, the module object
+will have no *contents* until you import at least one of its
+subpackages or submodules!)
+
+Once ``zc.buildout`` has been successfully imported, though, there
+*will* be a ``zc`` module in ``sys.modules``, and trying to import it
+will of course succeed. We are only preventing an *initial* import
+from succeeding, in order to prevent false-positive import successes
+when clashing subdirectories are present on ``sys.path``.
+
+So, with this slight change, the ``datagen.py`` example above will
+work correctly. When it does ``import json``, the mere presence of a
+``json/`` directory will simply not affect the import process at all,
+even if it contains ``.py`` files. The ``json/`` directory will still
+only be searched in the case where an import like ``import
+json.converter`` is attempted.
 
 Meanwhile, tools that expect to locate packages and modules by
 walking a directory tree can be updated to use the existing
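
The amended rule in the hunk above can be illustrated with a small model. This is only a sketch, not the PEP's implementation: it assumes a single ``sys.path`` entry whose contents are given as a set of relative file paths, and the helper name ``can_import`` is hypothetical.

```python
def can_import(name, files):
    """Return True if `name` would be importable under the amended rule.

    `files` is a set of '/'-separated paths relative to one hypothetical
    sys.path entry. A name is importable only if it resolves to a real
    module (name.py) or a self-contained package (name/__init__.py);
    a bare directory never satisfies an import by itself.
    """
    base = name.replace(".", "/")
    return (base + ".py") in files or (base + "/__init__.py") in files


# The datagen.py scenario: a json/ directory holding only data files.
datagen_dir = {"datagen.py", "json/foo.js", "json/bar.js"}

# A pure virtual package: zc/ has no __init__.py, only a submodule.
zc_dir = {"zc/buildout.py"}
```

In this model ``can_import("json", datagen_dir)`` is false, so the ``except ImportError`` branch of ``datagen.py`` would run, while ``can_import("zc.buildout", zc_dir)`` is true even though ``can_import("zc", zc_dir)`` is not.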
@@ -361,41 +401,54 @@ packages in memory should use the other APIs described in the
 Specification
 =============
 
-Two changes are made to the existing import process.
+A change is made to the existing import process, when importing
+names containing at least one ``.`` -- that is, imports of modules
+that have a parent package.
 
-First, the built-in ``__import__`` function must not raise an
-``ImportError`` when importing a submodule of a module with no
-``__path__``. Instead, it must attempt to *create* a ``__path__``
-attribute for the parent module first, as described in `__path__
-creation`_, below.
+Specifically, if the parent package does not exist, or exists but
+lacks a ``__path__`` attribute, an attempt is first made to create a
+"virtual path" for the parent package (following the algorithm
+described in the section on `virtual paths`_, below).
 
-Second, if searching ``sys.meta_path`` and ``sys.path`` (or a parent
-package ``__path__``) fails to find a module being imported, the
-import process must attempt to create a ``__path__`` attribute for
-the missing module. If the attempt succeeds, an empty module is
-created and its ``__path__`` is set. Otherwise, importing fails.
+If the computed "virtual path" is empty, an ``ImportError`` results,
+just as it would today. However, if a non-empty virtual path is
+obtained, the normal import of the submodule or subpackage proceeds,
+using that virtual path to find the submodule or subpackage. (Just
+as it would have with the parent's ``__path__``, if the parent package
+had existed and had a ``__path__``.)
 
-In both of the above cases, if a non-empty ``__path__`` is created,
-the name of the module whose ``__path__`` was created is added to
-``sys.virtual_packages`` -- an initially-empty ``set()`` of package
-names.
+When a submodule or subpackage is found (but not yet loaded),
+the parent package is created and added to ``sys.modules`` (if it
+didn't exist before), and its ``__path__`` is set to the computed
+virtual path (if it wasn't already set).
 
-(This way, code that extends ``sys.path`` at runtime can find out
-what virtual packages are currently imported, and thereby add any
-new subdirectories to those packages' ``__path__`` attributes. See
-`Standard Library Changes/Additions`_ below for more details.)
+In this way, when the actual loading of the submodule or subpackage
+occurs, it will see a parent package existing, and any relative
+imports will work correctly. However, if no submodule or subpackage
+exists, then the parent package will *not* be created, nor will a
+standalone module be converted into a package (by the addition of a
+spurious ``__path__`` attribute).
 
-Conversely, if an empty ``__path__`` results, an ``ImportError``
-is immediately raised, and the module is not created or changed, nor
-is its name added to ``sys.virtual_packages``.
+Note, by the way, that this change must be applied *recursively*: that
+is, if ``foo`` and ``foo.bar`` are pure virtual packages, then
+``import foo.bar.baz`` must wait until ``foo.bar.baz`` is found before
+creating module objects for *both* ``foo`` and ``foo.bar``, and then
+create both of them together, properly setting the ``foo`` module's
+``.bar`` attribute to point to the ``foo.bar`` module.
+
+In this way, pure virtual packages are never directly importable:
+an ``import foo`` or ``import foo.bar`` by itself will fail, and the
+corresponding modules will not appear in ``sys.modules`` until they
+are needed to point to a *successfully* imported submodule or
+self-contained subpackage.
 
 
-``__path__`` Creation
----------------------
+Virtual Paths
+-------------
 
-A virtual ``__path__`` is created by obtaining a PEP 302 "importer"
-object for each of the path entries found in ``sys.path`` (for a
-top-level module) or the parent ``__path__`` (for a submodule).
+A virtual path is created by obtaining a PEP 302 "importer" object for
+each of the path entries found in ``sys.path`` (for a top-level
+module) or the parent ``__path__`` (for a submodule).
 
 (Note: because ``sys.meta_path`` importers are not associated with
 ``sys.path`` or ``__path__`` entry strings, such importers do *not*
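
The ordering described in the new Specification text (find the leaf first, then create the parents, then load) can be sketched as follows. This is an illustrative model only and not the real import machinery: ``import_submodule``, ``virtual_paths``, ``find_leaf`` and ``modules`` are all hypothetical names, with ``modules`` standing in for ``sys.modules`` and ``virtual_paths`` standing in for already-computed virtual paths.

```python
import types


def import_submodule(fullname, virtual_paths, find_leaf, modules):
    """Import `fullname`, creating virtual parents only once the leaf is found.

    virtual_paths: dict mapping a package name to its precomputed virtual
        path (a list of directories, possibly empty).
    find_leaf: callable(fullname, path) returning a zero-argument loader
        callable, or None if nothing is found.
    modules: dict standing in for sys.modules.
    """
    parts = fullname.split(".")
    parents = [".".join(parts[:i]) for i in range(1, len(parts))]

    # Step 1: work out a search path for every ancestor without touching
    # `modules`. An empty virtual path means ImportError, as it would today.
    paths = {}
    for name in parents:
        path = getattr(modules.get(name), "__path__", None)
        if path is None:
            path = virtual_paths.get(name, [])
        if not path:
            raise ImportError("No module named " + fullname)
        paths[name] = path

    # Step 2: locate the leaf; on failure nothing was created or changed.
    load = find_leaf(fullname, paths[parents[-1]] if parents else None)
    if load is None:
        raise ImportError("No module named " + fullname)

    # Step 3: the leaf was found, so create any missing parents, set their
    # __path__ if unset, and link each child onto its parent.
    previous = None
    for name in parents:
        module = modules.get(name)
        if module is None:
            module = modules[name] = types.ModuleType(name)
        if not hasattr(module, "__path__"):
            module.__path__ = paths[name]
        if previous is not None:
            setattr(previous, name.rpartition(".")[2], module)
        previous = module

    # Step 4: load the leaf; its parents already exist at this point.
    leaf = modules[fullname] = load()
    if previous is not None:
        setattr(previous, parts[-1], leaf)
    return leaf
```

Because the parents are only created in step 3, a failed lookup leaves ``modules`` untouched, and a top-level name with no real module behind it simply raises ``ImportError``.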
@@ -403,18 +456,34 @@ participate in this process.)
 
 Each importer is checked for a ``get_subpath()`` method, and if
 present, the method is called with the full name of the module/package
-the ``__path__`` is being constructed for. The return value is either
-a string representing a subdirectory for the requested package, or
+the path is being constructed for. The return value is either a
+string representing a subdirectory for the requested package, or
 ``None`` if no such subdirectory exists.
 
-The strings returned by the importers are added to the ``__path__``
+The strings returned by the importers are added to the path list
 being built, in the same order as they are found. (``None`` values
 and missing ``get_subpath()`` methods are simply skipped.)
 
-In Python code, the algorithm would look something like this::
+The resulting list (whether empty or not) is then stored in a
+``sys.virtual_package_paths`` dictionary, keyed by module name.
+
+This dictionary has two purposes. First, it serves as a cache, in
+the event that more than one attempt is made to import a submodule
+of a virtual package.
+
+Second, and more importantly, the dictionary can be used by code that
+extends ``sys.path`` at runtime to *update* imported packages'
+``__path__`` attributes accordingly. (See `Standard Library
+Changes/Additions`_ below for more details.)
+
+In Python code, the virtual path construction algorithm would look
+something like this::
 
     def get_virtual_path(modulename, parent_path=None):
 
+        if modulename in sys.virtual_package_paths:
+            return sys.virtual_package_paths[modulename]
+
         if parent_path is None:
             parent_path = sys.path
 
@@ -429,6 +498,7 @@ In Python code, the algorithm would look something like this::
                 if subpath is not None:
                     path.append(subpath)
 
+        sys.virtual_package_paths[modulename] = path
         return path
 
 And a function like this one should be exposed in the standard
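
For readers who want to try the cached lookup, here is a self-contained sketch. It makes several assumptions that are not part of the commit: ``sys.virtual_package_paths`` does not exist in any released Python, so a module-level dictionary stands in for it, and ``DirImporter`` is a hypothetical directory-only importer whose ``get_subpath()`` merely checks for a subdirectory.

```python
import os
import sys
import tempfile

# Stand-in for the proposed sys.virtual_package_paths dictionary.
virtual_package_paths = {}


class DirImporter:
    """Hypothetical importer for one directory-based path entry."""

    def __init__(self, entry):
        self.entry = entry

    def get_subpath(self, fullname):
        # Only the last name component is joined, because `entry` is
        # already the parent's directory (or a sys.path entry).
        subdir = os.path.join(self.entry, fullname.rpartition(".")[2])
        return subdir if os.path.isdir(subdir) else None


def get_virtual_path(modulename, parent_path=None, get_importer=DirImporter):
    if modulename in virtual_package_paths:
        return virtual_package_paths[modulename]  # cache hit

    if parent_path is None:
        parent_path = sys.path

    path = []
    for entry in parent_path:
        importer = get_importer(entry)
        if hasattr(importer, "get_subpath"):
            subpath = importer.get_subpath(modulename)
            if subpath is not None:
                path.append(subpath)

    # Stored whether empty or not, as the amended text specifies.
    virtual_package_paths[modulename] = path
    return path


# Demo: two path entries contain a zc/ subdirectory, the third does not.
root = tempfile.mkdtemp()
entries = [os.path.join(root, d) for d in ("a", "b", "c")]
for entry in entries[:2]:
    os.makedirs(os.path.join(entry, "zc"))
os.makedirs(entries[2])

zc_path = get_virtual_path("zc", entries)
```

Caching empty lists as well as non-empty ones means a later ``sys.path`` extension has a list object to append to, which is the second purpose the commit gives for the dictionary.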
@@ -453,19 +523,25 @@ Specifically the proposed changes and additions to ``pkgutil`` are:
   path.
 
   The implementation of this function does a simple top-down traversal
-  of ``sys.virtual_packages``, and performs any necessary
-  ``get_subpath()`` calls to identify what path entries need to
-  be added to each package's ``__path__``, given that `path_entry`
+  of ``sys.virtual_package_paths``, and performs any necessary
+  ``get_subpath()`` calls to identify what path entries need to be
+  added to the virtual path for that package, given that `path_entry`
   has been added to ``sys.path``. (Or, in the case of sub-packages,
-  adding a derived subpath entry, based on their parent namespace's
-  ``__path__``.)
+  adding a derived subpath entry, based on their parent package's
+  virtual path.)
 
+  (Note: this function must update both the path values in
+  ``sys.virtual_package_paths`` as well as the ``__path__`` attributes
+  of any corresponding modules in ``sys.modules``, even though in the
+  common case they will both be the same ``list`` object.)
+
 * A new ``iter_virtual_packages(parent='')`` function to allow
-  top-down traversal of virtual packages in ``sys.virtual_packages``,
-  by yielding the child virtual packages of `parent`. For example,
-  calling ``iter_virtual_packages("zope")`` might yield ``zope.app``
-  and ``zope.products`` (if they are imported virtual packages listed
-  in ``sys.virtual_packages``), but **not** ``zope.foo.bar``.
+  top-down traversal of virtual packages from
+  ``sys.virtual_package_paths``, by yielding the child virtual
+  packages of `parent`. For example, calling
+  ``iter_virtual_packages("zope")`` might yield ``zope.app``
+  and ``zope.products`` (if they are virtual packages listed in
+  ``sys.virtual_package_paths``), but **not** ``zope.foo.bar``.
   (This function is needed to implement ``extend_virtual_paths()``,
   but is also potentially useful for other code that needs to inspect
   imported virtual packages.)
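
The two ``pkgutil`` functions in the hunk above could be sketched roughly as below. This is a guess at their shape, not the proposed implementation: the extra ``registry``, ``modules`` and ``get_subpath`` parameters are hypothetical stand-ins for ``sys.virtual_package_paths``, ``sys.modules`` and the importers' ``get_subpath()`` method, passed explicitly so the sketch runs on any Python.

```python
def iter_virtual_packages(registry, parent=""):
    """Yield the names in `registry` that are direct children of `parent`."""
    prefix = parent + "." if parent else ""
    for name in registry:
        rest = name[len(prefix):]
        if name.startswith(prefix) and rest and "." not in rest:
            yield name


def extend_virtual_paths(path_entry, registry, modules, get_subpath):
    """Top-down traversal adding subpaths derived from a new path entry.

    get_subpath: callable(entry, fullname) returning a subdirectory string
        or None (a stand-in for importer.get_subpath()).
    """

    def extend(parent, new_entries):
        for name in list(iter_virtual_packages(registry, parent)):
            added = []
            for entry in new_entries:
                subpath = get_subpath(entry, name)
                if subpath is None:
                    continue
                added.append(subpath)
                # Update the registry list and, if it is a different
                # object, the imported module's __path__ as well.
                targets = [registry[name]]
                module_path = getattr(modules.get(name), "__path__", None)
                if module_path is not None and module_path is not registry[name]:
                    targets.append(module_path)
                for target in targets:
                    if subpath not in target:
                        target.append(subpath)
            if added:
                # Children are searched only under the newly added subpaths.
                extend(name, added)

    extend("", [path_entry])
```

Recursing only into packages that actually gained a subpath keeps the traversal proportional to what the new entry contributes, rather than to the total number of registered virtual packages.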
@@ -500,10 +576,11 @@ For users, developers, and distributors of virtual packages:
   and do other things that make more sense for a self-contained
   project than for a mere "namespace" package.
 
-* ``sys.virtual_packages`` is allowed to contain non-existent or
-  not-yet-imported package names; code that uses its contents should
-  not assume that every name in this set is also present in
-  ``sys.modules`` or that importing the name will necessarily succeed.
+* ``sys.virtual_package_paths`` is allowed to contain entries for
+  non-existent or not-yet-imported package names; code that uses its
+  contents should not assume that every key in this dictionary is also
+  present in ``sys.modules`` or that importing the name will
+  necessarily succeed.
 
 * If you are changing a currently self-contained package into a
   virtual one, it's important to note that you can no longer use its
@@ -539,7 +616,9 @@ For those implementing PEP \302 importer objects:
   XXX This might list a lot of not-really-packages. Should we
   require importable contents to exist? If so, how deep do we
   search, and how do we prevent e.g. link loops, or traversing onto
-  different filesystems, etc.? Ick.
+  different filesystems, etc.? Ick. Also, if virtual packages are
+  listed, they still can't be *imported*, which is a problem for the
+  way that ``pkgutil.walk_modules()`` is currently implemented.
 
 * "Meta" importers (i.e., importers placed on ``sys.meta_path``) do
   not need to implement ``get_subpath()``, because the method