Update for latest import-sig discussion: fix proposed -m semantics, better explanation of why we can change this now when we couldn't in 2.x

Nick Coghlan 2011-11-24 22:37:18 +10:00
parent 28707a70f1
commit efd34dcbef
1 changed files with 116 additions and 60 deletions

@ -38,8 +38,9 @@ accommodate PEP 382, but has some critical incompatibilities with respect to
the implicit namespace package mechanism proposed in PEP 402.
Finally, PEP 328 eliminated implicit relative imports from imported modules.
This PEP proposes that implicit relative imports from main modules also be
eliminated.
This PEP proposes that the de facto implicit relative imports from main
modules that are provided by the current initialisation behaviour for
``sys.path[0]`` also be eliminated.
What's in a ``__name__``?
@ -62,11 +63,11 @@ The key use cases identified for this module attribute are:
Traps for the Unwary
====================
The overloading of the semantics of ``__name__`` have resulted in several
traps for the unwary. These traps can be quite annoying in practice, as
they are highly unobvious and can cause quite confusing behaviour. A lot of
the time, you won't even notice them, which just makes them all the more
surprising when they do come up.
The overloading of the semantics of ``__name__``, along with some historically
associated behaviour in the initialisation of ``sys.path[0]``, has resulted in
several traps for the unwary. These traps can be quite annoying in practice,
as they are highly unobvious (especially to beginners) and can cause quite
confusing behaviour.
Why are my imports broken?
@ -96,47 +97,51 @@ the exact same error when determining the value for ``sys.path[0]``.
The impact of this can be seen relatively frequently if you follow the
"python" and "import" tags on Stack Overflow. When I had the time to follow
it myself, I regularly encountered people struggling to understand the
behaviour of straightforward package layouts like the following::
behaviour of straightforward package layouts like the following (I actually
use package layouts along these lines in my own projects)::
project/
setup.py
package/
example/
__init__.py
foo.py
tests/
__init__.py
test_foo.py
I would actually often see it without the ``__init__.py`` files first, but
that's a trivial fix to explain. What's hard to explain is that all of the
following ways to invoke ``test_foo.py`` *probably won't work* due to broken
imports (either failing to find ``package`` for absolute imports, complaining
about relative imports in a non-package for explicit relative imports, or
issuing even more obscure errors if some other submodule happens to shadow
the name of a top-level module, such as a ``package.json`` module that
handled serialisation or a ``package.tests.unittest`` test runner)::
While I would often see it without the ``__init__.py`` files first, that's a
trivial fix to explain. What's hard to explain is that all of the following
ways to invoke ``test_foo.py`` *probably won't work* due to broken imports
(either failing to find ``example`` for absolute imports, complaining
about relative imports in a non-package or beyond the toplevel package for
explicit relative imports, or issuing even more obscure errors if some other
submodule happens to shadow the name of a top-level module, such as an
``example.json`` module that handled serialisation or an
``example.tests.unittest`` test runner)::
# working directory: project/package/tests
# These commands will most likely *FAIL*, even if the code is correct
# working directory: project/example/tests
./test_foo.py
python test_foo.py
python -m test_foo
python -c "from test_foo import main; main()"
python -m package.tests.test_foo
python -c "from package.tests.test_foo import main; main()"
# working directory: project/package
tests/test_foo.py
python tests/test_foo.py
python -m tests.test_foo
python -c "from tests.test_foo import main; main()"
python -m package.tests.test_foo
python -c "from package.tests.test_foo import main; main()"
# working directory: project
package/tests/test_foo.py
python package/tests/test_foo.py
example/tests/test_foo.py
python example/tests/test_foo.py
# working directory: project/..
project/package/tests/test_foo.py
python project/package/tests/test_foo.py
project/example/tests/test_foo.py
python project/example/tests/test_foo.py
# The -m and -c approaches don't work from here either, but the failure
# to find 'package' correctly is pretty easy to explain in this case
# to find 'example' correctly is easier to explain in this case
That's right, that long list is of all the methods of invocation that will
almost certainly *break* if you try them, and the error messages won't make
@ -162,16 +167,24 @@ via the ``-m`` switch), the following also works properly::
The fact that most methods of invoking Python code from the command line
break when that code is inside a package, and the two that do work are highly
sensitive to the current working directory is all thoroughly confusing for a
beginner, and I personally believe it is one of the key factors leading
beginner. I personally believe it is one of the key factors leading
to the perception that Python packages are complicated and hard to get right.
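As a rough illustration of the behaviour described above, the following sketch rebuilds the ``project/example/tests`` layout in a temporary directory and compares direct execution of ``test_foo.py`` with the ``-m`` switch, both run from ``project``. The file contents and the helper names (``build_layout``, ``run``, ``demo``) are invented for this example, and it assumes no unrelated ``example`` package is installed in the interpreter being used.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_layout(root):
    """Create project/example/{foo.py, tests/test_foo.py} under root."""
    project = Path(root) / "project"
    tests = project / "example" / "tests"
    tests.mkdir(parents=True)
    (project / "example" / "__init__.py").write_text("")
    (project / "example" / "foo.py").write_text("VALUE = 42\n")
    (tests / "__init__.py").write_text("")
    # An absolute import of the package the test module lives in
    (tests / "test_foo.py").write_text(
        "from example import foo\n"
        "print('ok', foo.VALUE)\n"
    )
    return project


def run(args, cwd):
    """Run the current interpreter with args, using cwd as working directory."""
    # Drop variables that would change how sys.path is initialised
    env = {k: v for k, v in os.environ.items()
           if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    return subprocess.run([sys.executable, *args], cwd=cwd, env=env,
                          capture_output=True, text=True)


def demo():
    """Return the results of running test_foo.py by path and via -m."""
    with tempfile.TemporaryDirectory() as tmp:
        project = build_layout(tmp)
        by_path = run(["example/tests/test_foo.py"], project)
        by_module = run(["-m", "example.tests.test_foo"], project)
        return by_path, by_module


if __name__ == "__main__":
    by_path, by_module = demo()
    print("by path :", by_path.returncode, by_path.stderr.strip().splitlines()[-1:])
    print("by -m   :", by_module.returncode, by_module.stdout.strip())
```

Running by path puts ``project/example/tests`` in ``sys.path[0]``, so ``example`` cannot be found, while ``-m`` from ``project`` puts the working directory there and the import succeeds.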
This problem isn't even limited to the command line - if ``test_foo.py`` is
open in Idle and you attempt to run it by pressing F5, then it will fail in
just the same way it would if run directly from the command line.
open in Idle and you attempt to run it by pressing F5, or if you try to run
it by clicking on it in a graphical filebrowser, then it will fail in just
the same way it would if run directly from the command line.
There's a reason the general ``sys.path`` guideline mentioned above exists,
and the fact that the interpreter itself doesn't follow it when determining
``sys.path[0]`` is the root cause of all sorts of grief.
There's a reason the general "no package directories on ``sys.path``"
guideline exists, and the fact that the interpreter itself doesn't follow
it when determining ``sys.path[0]`` is the root cause of all sorts of grief.
In the past, this couldn't be fixed due to backwards compatibility concerns.
However, scripts potentially affected by this problem will *already* require
fixes when porting to Python 3.x (due to the elimination of implicit
relative imports when importing modules normally). This provides a convenient
opportunity to implement a corresponding change in the initialisation
semantics for ``sys.path[0]``.
Importing the main module twice
@ -282,8 +295,8 @@ Two alternative names were also considered for the new attribute: "full name"
(``__fullname__``) and "implementation name" (``__implname__``).
Either of those would actually be valid for the use case in this PEP.
However, as a meta-issue, it seemed needlessly inconsistent to add *two*
terms that were essentially "like ``__name__``, but different in some cases
However, as a meta-issue, PEP 3155 is *also* adding a new attribute (for
functions and classes) that is "like ``__name__``, but different in some cases
where ``__name__`` is missing necessary information" and those terms aren't
accurate for the PEP 3155 function and class use case.
@ -294,6 +307,11 @@ case for PEP 3155 (in that PEP, ``__name__`` and ``__qualname__`` always
refer to the same function or class, it's just that ``__name__`` is
insufficient to accurately identify nested functions and classes).
Since it seems needlessly inconsistent to add *two* new terms for attributes
that only exist because backwards compatibility concerns keep us from
changing the behaviour of ``__name__`` itself, this PEP instead chose to
adopt the PEP 3155 terminology.
If the relative inscrutability of "qualified name" and ``__qualname__``
encourages interested developers to look them up at least once rather than
assuming they know what they mean just from the name and guessing wrong,
@ -314,9 +332,9 @@ for the unwary noted above, or else provide straightforward mechanisms for
dealing with them.
A rough draft of some of the concepts presented here was first posted on the
python-ideas list [1]_, but they have evolved considerably since first being
python-ideas list ([1]_), but they have evolved considerably since first being
discussed in that thread. Further discussion has subsequently taken place on
the import-sig mailing list [2]_.
the import-sig mailing list ([2]_, [3]_).
Fixing main module imports inside packages
@ -346,25 +364,19 @@ as follows::
It is proposed that this initialisation process be modified to take
package details stored on the filesystem into account::
# Interactive prompt, -c switch
in_package, path_entry, modname = split_path_module(os.getcwd(), '')
# Interactive prompt, -m switch, -c switch
in_package, path_entry, _ignored = split_path_module(os.getcwd(), '')
if in_package:
sys.path.insert(0, path_entry)
else:
sys.path.insert(0, '')
# Start interactive prompt or run -c command as usual
# __main__.__qualname__ is set to "__main__"
::
# -m switch
modname = <<argument to -m switch>>
in_package, path_entry, modname = split_path_module(os.getcwd(), modname)
if in_package:
sys.path.insert(0, path_entry)
else:
sys.path.insert(0, '')
# modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
# The -m switch uses the same sys.path[0] calculation, but:
# modname is the argument to the -m switch
# modname is passed to ``runpy._run_module_as_main()`` as usual
# __main__.__qualname__ is set to modname
::
@ -373,6 +385,7 @@ package details stored on the filesystem into account::
modname = "__main__"
path_entry, modname = split_path_module(sys.argv[0], modname)
sys.path.insert(0, path_entry)
# modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
# __main__.__qualname__ is set to modname
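The directory-walking step that the initialisation above relies on could look something like the following. This is only a minimal sketch: ``split_package_dir`` is a hypothetical name rather than the helper referenced in the pseudocode, and it assumes that a directory counts as a package only when it contains an ``__init__.py`` file (so the namespace package proposals in PEP 382 and PEP 402 are ignored).

```python
import os


def split_package_dir(directory):
    """Hypothetical helper: return (in_package, path_entry, pkg_name).

    Walks upwards from directory for as long as each directory holds an
    __init__.py file. path_entry is the first ancestor that is not a
    package directory, and pkg_name is the dotted name of the package
    that directory corresponds to ('' when it is not a package).
    """
    path_entry = os.path.abspath(directory)
    parts = []
    while os.path.isfile(os.path.join(path_entry, "__init__.py")):
        parent, name = os.path.split(path_entry)
        if parent == path_entry:  # reached the filesystem root
            break
        parts.append(name)
        path_entry = parent
    pkg_name = ".".join(reversed(parts))
    return bool(parts), path_entry, pkg_name
```

For the layout used earlier, calling this on ``project/example/tests`` would give ``project`` as the path entry and ``example.tests`` as the package name, while calling it on ``project`` itself would report that the directory is not inside a package.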
@ -423,11 +436,9 @@ exposed directly to Python users via the ``runpy`` module.
With this fix in place, and the same simple package layout described earlier,
*all* of the following commands would invoke the test suite correctly::
# working directory: project/package/tests
# working directory: project/example/tests
./test_foo.py
python test_foo.py
python -m test_foo
python -m tests.test_foo
python -m package.tests.test_foo
python -c "from .test_foo import main; main()"
python -c "from ..tests.test_foo import main; main()"
@ -436,27 +447,64 @@ With this fix in place, and the same simple package layout described earlier,
# working directory: project/package
tests/test_foo.py
python tests/test_foo.py
python -m tests.test_foo
python -m package.tests.test_foo
python -c "from .tests.test_foo import main; main()"
python -c "from package.tests.test_foo import main; main()"
# working directory: project
package/tests/test_foo.py
python package/tests/test_foo.py
example/tests/test_foo.py
python example/tests/test_foo.py
python -m package.tests.test_foo
python -c "from package.tests.test_foo import main; main()"
# working directory: project/..
project/package/tests/test_foo.py
python project/package/tests/test_foo.py
project/example/tests/test_foo.py
python project/example/tests/test_foo.py
# The -m and -c approaches still don't work from here, but the failure
# to find 'package' correctly is pretty easy to explain in this case
With these changes, clicking Python modules in a graphical file browser
should always execute them correctly, even if they live inside a package.
Depending on the details of how it invokes the script, Idle would likely also
be able to run ``test_foo.py`` correctly with F5, without needing any Idle
specific fixes.
Optional addition: command line relative imports
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With the above changes in place, it would be a fairly minor addition to allow
explicit relative imports as arguments to the ``-m`` switch::
# working directory: project/example/tests
python -m .test_foo
python -m ..tests.test_foo
# working directory: project/example/
python -m .tests.test_foo
With this addition, system initialisation for the ``-m`` switch would change
as follows::
# -m switch (permitting explicit relative imports)
in_package, path_entry, pkg_name = split_path_module(os.getcwd(), '')
qualname = <<argument to -m switch>>
if qualname.startswith('.'):
modname = qualname
while modname.startswith('.'):
modname = modname[1:]
pkg_name, sep, _ignored = pkg_name.rpartition('.')
if not sep:
raise ImportError("Attempted relative import beyond top level package")
qualname = pkg_name + '.' + modname
if in_package:
sys.path.insert(0, path_entry)
else:
sys.path.insert(0, '')
# qualname is passed to ``runpy._run_module_as_main()``
# __main__.__qualname__ is set to qualname
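The name resolution step for a relative ``-m`` argument can be sketched with the existing ``importlib.util.resolve_name()`` function, which applies the same rules as explicit relative imports (one leading dot means the current package). This is an illustration rather than the proposed implementation: ``qualify_m_argument`` and its ``cwd_package`` parameter are hypothetical, and ``cwd_package`` is assumed to be the dotted name of the package containing the working directory with no trailing dot.

```python
from importlib.util import resolve_name


def qualify_m_argument(arg, cwd_package):
    """Turn a (possibly relative) -m argument into an absolute module name.

    cwd_package is the dotted name of the package containing the current
    working directory, or '' when that directory is not inside a package.
    """
    if not arg.startswith("."):
        return arg  # already absolute, nothing to resolve
    if not cwd_package:
        raise ImportError("relative module name used outside a package")
    # Raises ImportError (ValueError before Python 3.9) if the leading
    # dots reach beyond the top level package
    return resolve_name(arg, cwd_package)
```

With ``cwd_package`` set to ``example.tests``, both ``.test_foo`` and ``..tests.test_foo`` resolve to ``example.tests.test_foo``, matching the command line examples given for this optional addition.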
Compatibility with PEP 382
~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -486,7 +534,7 @@ adopted, the core concept of making import semantics from main and other
modules more consistent would no longer be feasible.
This incompatibility is discussed in more detail in the relevant import-sig
thread [2]_.
threads ([2]_, [3]_).
Potential incompatibilities with scripts stored in packages
@ -637,6 +685,10 @@ backwards incompatibilities with existing code that deliberately manipulates
relative imports by adjusting ``__name__`` rather than setting ``__package__``
directly.
This PEP does *not* propose that ``__package__`` be deprecated. While it is
technically redundant following the introduction of ``__qualname__``, it just
isn't worth the hassle of deprecating it within the lifetime of Python 3.x.
Reference Implementation
========================
@ -653,7 +705,11 @@ References
.. [2] PEP 395 (Module aliasing) and the namespace PEPs
(http://mail.python.org/pipermail/import-sig/2011-November/000382.html)
.. [3] Updated PEP 395 (aka "Implicit Relative Imports Must Die!")
(http://mail.python.org/pipermail/import-sig/2011-November/000397.html)
.. [4] Elaboration of compatibility problems between this PEP and PEP 402
(http://mail.python.org/pipermail/import-sig/2011-November/000403.html)
Copyright
=========