Minor updates after additional review of the published version of PEP 395

Nick Coghlan 2011-11-19 22:49:17 +10:00
parent d0145b5271
commit ec5d9254c7
1 changed file with 49 additions and 33 deletions

@ -28,10 +28,14 @@ This PEP builds on the "qualified name" concept introduced by PEP 3155, and
also shares in that PEP's aim of fixing some ugly corner cases when dealing
with serialisation of arbitrary functions and classes.
It is also affected by the two competing "namespace package" PEPs (PEP 382
and PEP 402). This PEP would require some minor adjustments to accommodate
PEP 382, but has some critical incompatibilities with respect to the namespace
package mechanism proposed in PEP 402.
It also builds on PEP 366, which took initial tentative steps towards making
explicit relative imports from the main module work correctly in at least
*some* circumstances.
This PEP is also affected by the two competing "namespace package" PEPs
(PEP 382 and PEP 402). This PEP would require some minor adjustments to
accommodate PEP 382, but has some critical incompatibilities with respect to
the implicit namespace package mechanism proposed in PEP 402.
Finally, PEP 328 eliminated implicit relative imports from imported modules.
This PEP proposes that implicit relative imports from main modules also be
@ -136,8 +140,8 @@ handled serialisation or a ``package.tests.unittest`` test runner)::
That's right, that long list is of all the methods of invocation that will
almost certainly *break* if you try them, and the error messages won't make
any sense if you're not already intimately not only with the way Python's
import system works, but also with how it gets initialised.
any sense if you're not already intimately familiar not only with the way
Python's import system works, but also with how it gets initialised.
For a long time, the only way to get ``sys.path`` right with that kind of
setup was to either set it manually in ``test_foo.py`` itself (hardly
@ -173,31 +177,31 @@ and the fact that the interpreter itself doesn't follow it when determining
Importing the main module twice
-------------------------------
Another venerable trap is the issue of (effectively) importing ``__main__``
twice. This occurs when the main module is also imported under its real
name, effectively creating two instances of the same module under
different names.
Another venerable trap is the issue of importing ``__main__`` twice. This
occurs when the main module is also imported under its real name, effectively
creating two instances of the same module under different names.
If the state stored in ``__main__`` is significant to the correct operation
of the program, then this duplication can cause obscure and surprising
errors.
of the program, or if there is top-level code in the main module that has
non-idempotent side effects, then this duplication can cause obscure and
surprising errors.
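
As a minimal sketch of this trap (the file name ``demo_main.py`` and the helper ``run_demo`` are hypothetical and not part of the PEP), the following writes a small script to a temporary directory, runs it directly, and lets it import itself under its real name. The top-level code runs twice and two distinct module objects end up in ``sys.modules``:

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a hypothetical script that imports itself under its real name.
SCRIPT = textwrap.dedent('''
    import sys
    print("running top-level code as", __name__)
    if __name__ == "__main__":
        import demo_main  # same file, second module object
        same = sys.modules["__main__"] is sys.modules["demo_main"]
        print("same module object:", same)
''')


def run_demo():
    """Run the script directly and return the lines it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "demo_main.py"
        script.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

Direct execution puts the script's directory in ``sys.path[0]``, which is what allows the second import to find the same file again.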
In a bit of a pickle
--------------------
Something many users may not realise is that the ``pickle`` module serialises
objects based on the ``__name__`` of the containing module. So objects
defined in ``__main__`` are pickled that way, and won't be unpickled
correctly by another python instance that only imported that module instead
of running it directly. This behaviour is the underlying reason for the
advice from many Python veterans to do as little as possible in the
``__main__`` module in any application that involves any form of object
serialisation and persistence.
Something many users may not realise is that the ``pickle`` module sometimes
relies on the ``__module__`` attribute when serialising instances of arbitrary
classes. So instances of classes defined in ``__main__`` are pickled that way,
and won't be unpickled correctly by another python instance that only imported
that module instead of running it directly. This behaviour is the underlying
reason for the advice from many Python veterans to do as little as possible
in the ``__main__`` module in any application that involves any form of
object serialisation and persistence.
Similarly, when creating a pseudo-module, pickles rely on the name of the
module where a class is actually defined, rather than the officially
documented location for that class in the module hierarchy.
Similarly, when creating a pseudo-module (see next paragraph), pickles rely
on the name of the module where a class is actually defined, rather than the
officially documented location for that class in the module hierarchy.
For the purposes of this PEP, a "pseudo-module" is a package designed like
the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These
@ -211,7 +215,8 @@ API.
While this PEP focuses specifically on ``pickle`` as the principal
serialisation scheme in the standard library, this issue may also affect
other mechanisms that support serialisation of arbitrary class instances
and rely on ``__name__`` to determine how to handle deserialisation.
and rely on ``__module__`` attributes to determine how to handle
deserialisation.
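
The dependence on ``__module__`` can be observed with a pseudo-module from the standard library. The sketch below is illustrative only and assumes CPython's layout, where ``unittest.TestCase`` is actually defined in the ``unittest.case`` submodule:

```python
import pickle
import unittest

# The documented location is ``unittest.TestCase``, but the class records
# the submodule in which it is actually defined.
defining_module = unittest.TestCase.__module__

# Classes are pickled by reference: the payload stores the module name and
# the class name rather than the class body.
payload = pickle.dumps(unittest.TestCase)

print(defining_module)
print(defining_module.encode("ascii") in payload)
```

A class defined in a directly executed script would record ``__main__`` in the same way, which is why such pickles cannot be loaded by a process that only imported the module.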
Where's the source?
@ -240,7 +245,8 @@ that simply aren't valid whenever the main module isn't an ordinary directly
executed script or top-level module. Packages and non-top-level modules
executed via the ``-m`` switch, as well as directly executed zipfiles or
directories, are likely to make multiprocessing on Windows do the wrong thing
(either quietly or noisily) when spawning a new process.
(either quietly or noisily, depending on application details) when spawning a
new process.
While this issue currently only affects Windows directly, it also impacts
any proposals to provide Windows-style "clean process" invocation via the
@ -256,10 +262,6 @@ to add a new module level attribute: ``__qualname__``. This abbreviation of
path to a nested class or function definition relative to the top level
module.
If a module loader does not initialise ``__qualname__`` itself, then the
import system will add it automatically (setting it to the same value as
``__name__``).
For modules, ``__qualname__`` will normally be the same as ``__name__``, just
as it is for top-level functions and classes in PEP 3155. However, it will
differ in some situations so that the above problems can be addressed.
@ -268,6 +270,10 @@ Specifically, whenever ``__name__`` is modified for some other purpose (such
as to denote the main module), then ``__qualname__`` will remain unchanged,
allowing code that needs it to access the original unmodified value.
If a module loader does not initialise ``__qualname__`` itself, then the
import system will add it automatically (setting it to the same value as
``__name__``).
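
The fallback described above might be sketched as follows. Note that a module level ``__qualname__`` is a proposal of this PEP rather than existing interpreter behaviour, and ``ensure_qualname`` is a hypothetical helper used only for illustration:

```python
import types


def ensure_qualname(module):
    """Hypothetical stand-in for the proposed import-system fallback."""
    # Only fill in the attribute when the loader did not set it itself.
    if not hasattr(module, "__qualname__"):
        module.__qualname__ = module.__name__
    return module


# A loader that does not initialise __qualname__.
mod = ensure_qualname(types.ModuleType("package.tests.test_foo"))

# Later, __name__ is modified to denote the main module;
# __qualname__ keeps the original, unmodified value.
mod.__name__ = "__main__"
print(mod.__name__, mod.__qualname__)
```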
Eliminating the Traps
=====================
@ -280,7 +286,7 @@ dealing with them.
A rough draft of some of the concepts presented here was first posted on the
python-ideas list [1]_, but they have evolved considerably since first being
discussed in that thread. Further discussion has subsequently taken place on
import-sig [2]_.
the import-sig mailing list [2]_.
Fixing main module imports inside packages
@ -292,14 +298,18 @@ will look for Python's explicit package directory markers and use them to find
the appropriate directory to add to ``sys.path``.
The current algorithm for setting ``sys.path[0]`` in relevant cases is roughly
as follows:
as follows::
# Interactive prompt, -m switch, -c switch
sys.path.insert(0, '')
::
# Valid sys.path entry execution (i.e. directory and zip execution)
sys.path.insert(0, sys.argv[0])
::
# Direct script execution
sys.path.insert(0, os.path.dirname(sys.argv[0]))
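
The three cases above can be gathered into a single function for comparison. This is only a sketch of the algorithm as the text describes it; the function name and the ``mode`` labels are assumptions made for illustration and do not correspond to any interpreter API:

```python
import os


def initial_path_entry(mode, argv0=""):
    """Return the value described for sys.path[0] in each launch mode."""
    if mode in ("interactive", "module", "command"):
        # Interactive prompt, -m switch, -c switch
        return ""
    if mode == "path_entry":
        # Directory or zipfile execution: the argument itself
        return argv0
    if mode == "script":
        # Direct script execution: the directory containing the script
        return os.path.dirname(argv0)
    raise ValueError(f"unknown launch mode: {mode!r}")


script = os.path.join("package", "tests", "test_foo.py")
print(repr(initial_path_entry("script", script)))
```

For a script inside a package, the result is the package's own test directory rather than the directory containing the top-level package, which is the root of the import failures described earlier.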
@ -315,6 +325,8 @@ package details stored on the filesystem into account::
# Start interactive prompt or run -c command as usual
# __main__.__qualname__ is set to "__main__"
::
# -m switch
modname = <<argument to -m switch>>
in_package, path_entry, modname = split_path_module(os.getcwd(), modname)
@ -325,6 +337,8 @@ package details stored on the filesystem into account::
# modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
# __main__.__qualname__ is set to modname
::
# Valid sys.path entry execution (i.e. directory and zip execution)
modname = "__main__"
path_entry, modname = split_path_module(sys.argv[0], modname)
@ -332,6 +346,8 @@ package details stored on the filesystem into account::
# modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
# __main__.__qualname__ is set to modname
::
# Direct script execution
in_package, path_entry, modname = split_path_module(sys.argv[0])
sys.path.insert(0, path_entry)
@ -497,7 +513,7 @@ functions and classes will be given a new ``__qualmodule__`` attribute
that refers to the ``__qualname__`` of their module.
This isn't strictly necessary for functions (you could find out their
module's qualified name by looking in their globals dictionary), it is
module's qualified name by looking in their globals dictionary), but it is
needed for classes, since they don't hold a reference to the globals of
their defining module. Once a new attribute is added to classes, it is
more convenient to keep the API consistent and add a new attribute to
@ -550,7 +566,7 @@ proposes that the behaviour of explicit relative imports be left alone.
In particular, if ``__package__`` is not set in a module when an explicit
relative import occurs, the automatically cached value will continue to be
derived from ``__name__`` rather than ``__qualname__``. This minimises any
backwards incompatibilities with code that deliberately manipulates
backwards incompatibilities with existing code that deliberately manipulates
relative imports by adjusting ``__name__`` rather than setting ``__package__``
directly.
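
A rough sketch of that derivation, assuming it behaves like the import system's usual fallback (the helper name ``derive_package`` is hypothetical):

```python
def derive_package(name, is_package=False):
    """Approximate how __package__ is derived from __name__ when unset."""
    if is_package:
        # A package is its own __package__.
        return name
    # For an ordinary module, drop the final dotted component.
    return name.rpartition(".")[0]


print(repr(derive_package("package.tests.test_foo")))
print(repr(derive_package("__main__")))
```

A module whose ``__name__`` has been changed to ``__main__`` yields an empty package name, so explicit relative imports from it fail unless ``__package__`` is set directly.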