PEP: 3122
Title: Delineation of the main module
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Apr-2007
Post-History:

.. attention::
   This PEP has been rejected. Guido views running scripts within a
   package as an anti-pattern [#guido-rejection]_.

Abstract
========

Because of how name resolution works for relative imports in a world
where :pep:`328` is implemented, the ability to execute modules within a
package ceases to be possible. This failure stems from the fact that
the module being executed as the "main" module replaces its
``__name__`` attribute with ``"__main__"`` instead of leaving it as
the absolute name of the module. This breaks import's ability
to resolve relative imports from the main module into absolute names.

In order to resolve this issue, this PEP proposes to change how the
main module is delineated. Leaving the ``__name__`` attribute of
a module alone and setting ``sys.main`` to the name of the main
module will allow at least some instances of executing a module
within a package that uses relative imports.

This PEP does not address the idea of introducing a module-level
function that is automatically executed like :pep:`299` proposes.


The Problem
===========

With the introduction of :pep:`328`, relative imports became dependent on
the ``__name__`` attribute of the module performing the import. This
is because the dots in a relative import are used to strip away
parts of the calling module's name to calculate where in the package
hierarchy an import should fall (prior to :pep:`328` relative
imports could fail and would fall back on absolute imports which had a
chance of succeeding).

For instance, consider the import ``from .. import spam`` made from the
``bacon.ham.beans`` module (``bacon.ham.beans`` is not a package
itself, i.e., does not define ``__path__``). Name resolution of the
relative import takes the caller's name (``bacon.ham.beans``), splits
on dots, and then slices off the last n parts based on the level
(which is 2). In this example both ``ham`` and ``beans`` are dropped
and ``spam`` is joined with what is left (``bacon``). This leads to
the proper import of the module ``bacon.spam``.

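The split-and-slice step described above can be sketched in a few lines
of Python. This is only an illustration of the description in this
section, not the code used by import; the function name
``resolve_relative_name`` and its ``caller_is_package`` parameter are
assumptions made for the example.

```python
def resolve_relative_name(name, level, caller_name, caller_is_package=False):
    """Turn a relative import into an absolute module name.

    Hypothetical helper that follows the prose description only.
    ``level`` is the number of leading dots in the import statement.
    """
    if level < 1:
        raise ValueError("a relative import needs a level of at least 1")
    parts = caller_name.split(".")
    # A package's own name already names its package, so one fewer part
    # is dropped for it than for a plain module.
    drop = level - 1 if caller_is_package else level
    if drop >= len(parts):
        # Nothing would be left to anchor the import on.
        raise ImportError("attempted relative import in non-package")
    base = parts[:len(parts) - drop]
    return ".".join(base + [name]) if name else ".".join(base)


# ``from .. import spam`` made from the module ``bacon.ham.beans``.
print(resolve_relative_name("spam", 2, "bacon.ham.beans"))  # bacon.spam
```

A caller name without any dots (and not flagged as a package) leaves
nothing to slice off, so the sketch raises ``ImportError`` for it.
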
This reliance on the ``__name__`` attribute of a module when handling
relative imports becomes an issue when executing a script within a
package. Because the executing script has its name set to
``'__main__'``, import cannot resolve any relative imports, leading to
an ``ImportError``.

For example, assume we have a package named ``bacon`` with an
``__init__.py`` file containing::

    from . import spam

Also create a module named ``spam`` within the ``bacon`` package (it
can be an empty file). Now if you try to execute the ``bacon``
package (either through ``python bacon/__init__.py`` or
``python -m bacon``) you will get an ``ImportError`` about trying to
do a relative import from within a non-package. Obviously the import
is valid, but because ``__name__`` is set to ``'__main__'``,
import thinks that ``bacon/__init__.py`` is not in a package since no
dots exist in ``__name__``. To see how the algorithm works in more
detail, see ``importlib.Import._resolve_name()`` in the sandbox
[#importlib]_.

Currently a work-around is to remove all relative imports in the
module being executed and make them absolute. This is unfortunate,
though, as one should not be required to use a specific type of
import in order to make a module in a package executable.


The Solution
============

The solution to the problem is to not change the value of ``__name__``
in modules. But there still needs to be a way to let executing code
know it is being executed as a script. This is handled with a new
attribute in the ``sys`` module named ``main``.

When a module is being executed as a script, ``sys.main`` will be set
to the name of the module. This changes the current idiom of::

    if __name__ == '__main__':
        ...

to::

    import sys
    if __name__ == sys.main:
        ...

The newly proposed solution does introduce an added line of
boilerplate which is a module import. But as the solution does not
introduce a new built-in or module attribute (as discussed in
`Rejected Ideas`_) it has been deemed worth the extra line.

Another issue with the proposed solution (which also applies to all
rejected ideas) is that it does not directly solve the problem
of discovering the module name of a file. Consider
``python bacon/spam.py``. By the file name alone it is not obvious
whether ``bacon`` is a package. In order to properly find this out,
the current directory must be on ``sys.path`` and
``bacon/__init__.py`` must exist.

But this is the simple example. Consider ``python ../spam.py``. From
the file name alone it is not at all clear if ``spam.py`` is in a
package or not. One possible solution is to find out the
absolute name of ``..``, check if a file named ``__init__.py`` exists,
and then look if the directory is on ``sys.path``. If it is not, then
continue to walk up the directory tree until no more ``__init__.py``
files are found or the directory is found on ``sys.path``.

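As a rough sketch of the directory walk just described, the following
Python function climbs from the script's directory for as long as
``__init__.py`` files are found and stops at the first directory that
is listed on the search path. The name ``find_module_name`` and its
``search_path`` parameter are hypothetical and exist only for this
example; the PEP does not specify such a function.

```python
import os


def find_module_name(script_path, search_path):
    """Guess the dotted module name for ``script_path``.

    Hypothetical helper. Returns ``None`` when the file cannot be
    anchored on any directory in ``search_path``.
    """
    script_path = os.path.abspath(script_path)
    # An empty entry on the search path stands for the current directory.
    anchors = {os.path.abspath(entry or os.curdir) for entry in search_path}
    directory, filename = os.path.split(script_path)
    parts = [os.path.splitext(filename)[0]]
    # Each iteration costs at least one stat call on the file system.
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        parent, package = os.path.split(directory)
        if parent == directory:
            break  # reached the file system root
        parts.append(package)
        directory = parent
        if directory in anchors:
            return ".".join(reversed(parts))
    # No more __init__.py files: the script is a top-level module only
    # if its own directory is on the search path.
    if len(parts) == 1 and directory in anchors:
        return parts[0]
    return None
```

Calling ``find_module_name("../spam.py", sys.path)`` would then yield
either a dotted name or ``None``, with one file system check per
directory level visited.
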
This could potentially be an expensive process. If the package depth
happens to be deep then it could require a large amount of disk access
to discover where the package is anchored on ``sys.path``, if at all.
The stat calls alone can be expensive if the file system the executed
script is on is something like NFS.

Because of these issues, only when the ``-m`` command-line argument
(introduced by :pep:`338`) is used will ``__name__`` be left as the
module's absolute name. Otherwise the fallback semantics of setting
``__name__`` to ``"__main__"`` will occur. ``sys.main`` will still be
set to the proper value, regardless of what ``__name__`` is set to.


Implementation
==============

When the ``-m`` option is used, ``sys.main`` will be set to the
argument passed in. ``sys.argv`` will be adjusted as it is currently.
Then the equivalent of ``__import__(sys.main)`` will occur. This
differs from current semantics as the ``runpy`` module fetches the
code object for the file specified by the module name in order to
explicitly set ``__name__`` and other attributes. This is no longer
needed as import can perform its normal operation in this situation.

If a file name is specified, then ``sys.main`` will be set to
``"__main__"``. The specified file will then be read and have a code
object created and then be executed with ``__name__`` set to
``"__main__"``. This mirrors current semantics.

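The two start-up paths can be emulated in pure Python to make the
difference concrete. This is a sketch under loud assumptions:
``sys.main`` does not exist in any released Python, so the example
simply assigns the attribute itself, and ``run_as_main`` is a
hypothetical helper rather than part of the interpreter or of
``runpy``.

```python
import importlib
import sys
import types


def run_as_main(module_name=None, file_path=None):
    """Emulate the proposed start-up behaviour (hypothetical helper)."""
    if module_name is not None:
        # ``python -m module_name``: record the real name and let import
        # perform its normal operation, so ``__name__`` stays untouched.
        sys.main = module_name  # assumed attribute, proposed by this PEP
        return importlib.import_module(module_name)
    # ``python file_path``: fall back to the ``"__main__"`` semantics.
    sys.main = "__main__"
    with open(file_path, "rb") as handle:
        code = compile(handle.read(), file_path, "exec")
    module = types.ModuleType("__main__")
    module.__file__ = file_path
    exec(code, module.__dict__)
    return module
```

In both branches the proposed idiom ``if __name__ == sys.main:`` would
evaluate to true inside the executed code, while only the ``-m`` branch
keeps a ``__name__`` that relative imports can be resolved against.
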
Transition Plan
===============

In order for Python 2.6 to be able to support both the current
semantics and the proposed semantics, ``sys.main`` will always be set
to ``"__main__"``. Otherwise no change will occur for Python 2.6.
This unfortunately means that no benefit from this change will occur
in Python 2.6, but it maximizes compatibility for code that is meant
to work, as much as possible, with both 2.6 and 3.0.

To help transition to the new idiom, 2to3 [#2to3]_ will gain a rule to
transform the current ``if __name__ == '__main__': ...`` idiom to the
new one. This will not help with code that checks ``__name__``
outside of the idiom, though.


Rejected Ideas
==============

``__main__`` built-in
---------------------

A counter-proposal was to introduce a built-in named ``__main__``.
The value of the built-in would be the name of the module being
executed (just like the proposed ``sys.main``). This would lead to a
new idiom of::

    if __name__ == __main__:
        ...

A drawback is that the syntactic difference is subtle: only the quotes
around ``"__main__"`` are dropped. Some believe that existing Python
programmers will introduce bugs by putting the quotation marks back
on by accident. But one could argue that the bug would be
discovered quickly through testing as it is a very shallow bug.

While the name of the built-in could obviously be different (e.g.,
``main``) the other drawback is that it introduces a new built-in.
With a simple solution such as ``sys.main`` being possible without
adding another built-in to Python, this proposal was rejected.

``__main__`` module attribute
-----------------------------

Another proposal was to add a ``__main__`` attribute to every module.
For the one that was executing as the main module, the attribute would
have a true value while all other modules had a false value. This has
a nice consequence of simplifying the main module idiom to::

    if __main__:
        ...

The drawback was the introduction of a new module attribute. It also
required more integration with the import machinery than the proposed
solution.


Use ``__file__`` instead of ``__name__``
----------------------------------------

Any of the proposals could be changed to use the ``__file__``
attribute on modules instead of ``__name__``, including the current
semantics. The problem with this for the proposed solutions is that
some modules have no ``__file__`` attribute defined, while others have
the same value as other modules.

The problem that comes up with the current semantics is you still have
to try to resolve the file path to a module name for the import to
work.


Special string subclass for ``__name__`` that overrides ``__eq__``
------------------------------------------------------------------

One proposal was to define a subclass of ``str`` that overrode the
``__eq__`` method so that it would compare equal to ``"__main__"`` as
well as the actual name of the module. In all other respects the
subclass would be the same as ``str``.

This was rejected as it seemed like too much of a hack.

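For illustration only, such a subclass might have looked roughly like
the sketch below. The class name ``MainModuleName`` is an assumption;
the original proposal did not come with an implementation that this
document records.

```python
class MainModuleName(str):
    """Module name that also compares equal to ``"__main__"``.

    Hypothetical sketch of the rejected idea.
    """

    def __eq__(self, other):
        # Claim equality with the literal "__main__" ...
        if isinstance(other, str) and str.__eq__(other, "__main__"):
            return True
        # ... and otherwise behave like an ordinary string.
        return str.__eq__(self, other)

    def __ne__(self, other):
        # ``str`` supplies its own ``__ne__``, so it has to be overridden
        # to stay consistent with the new ``__eq__``.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Overriding ``__eq__`` would otherwise make instances unhashable.
    __hash__ = str.__hash__


name = MainModuleName("bacon.spam")
print(name == "__main__", name == "bacon.spam")  # True True
```

The sketch also hints at why the idea felt like a hack: the object
compares equal to ``"__main__"`` without sharing its hash, so it would
not behave like ``"__main__"`` when used as a dictionary key.
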
References
==========

.. [#2to3] 2to3 tool
   (http://svn.python.org/view/sandbox/trunk/2to3/) [ViewVC]

.. [#importlib] importlib
   (http://svn.python.org/view/sandbox/trunk/import_in_py/importlib.py?view=markup)
   [ViewVC]

.. [#guido-rejection] python-3000 email: "PEP to change how the main module is delineated"
   (https://mail.python.org/pipermail/python-3000/2007-April/006793.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: