diff --git a/pep-0000.txt b/pep-0000.txt index 71d6a93fd..77d23e826 100644 --- a/pep-0000.txt +++ b/pep-0000.txt @@ -259,6 +259,7 @@ Index by Category SR 363 Syntax For Dynamic Attribute Access North SR 666 Reject Foolish Indentation Creighton SR 3103 A Switch/Case Statement GvR + SR 3122 Delineation of the main module Cannon Numerical Index @@ -478,6 +479,7 @@ Numerical Index S 3119 Introducing Abstract Base Classes GvR, Talin S 3120 Using UTF-8 as the default source encoding von Löwis S 3121 Module Initialization and finalization von Löwis + SR 3122 Delineation of the main module Cannon S 3141 A Type Hierarchy for Numbers Yasskin diff --git a/pep-3122.txt b/pep-3122.txt new file mode 100644 index 000000000..30f74b27f --- /dev/null +++ b/pep-3122.txt @@ -0,0 +1,267 @@ +PEP: 3122 +Title: Delineation of the main module +Version: $Revision$ +Last-Modified: $Date$ +Author: Brett Cannon +Status: Rejected +Type: Standards Track +Content-Type: text/x-rst +Created: 27-Apr-2007 + +.. attention:: + This PEP has been rejected. Guido views running scripts within a + package as an anti-pattern [#guido-rejection]_. + +Abstract +======== + +Because of how name resolution works for relative imports in a world +where PEP 328 is implemented, the ability to execute modules within a +package ceases being possible. This failing stems from the fact that +the module being executed as the "main" module replaces its +``__name__`` attribute with ``"__main__"`` instead of leaving it as +the absolute name of the module. This breaks import's ability +to resolve relative imports from the main module into absolute names. + +In order to resolve this issue, this PEP proposes to change how the +main module is delineated. By leaving the ``__name__`` attribute in +a module alone and setting ``sys.main`` to the name of the main +module this will allow at least some instances of executing a module +within a package that uses relative imports. + +This PEP does not address the idea of introducing a module-level +function that is automatically executed like PEP 299 proposes. + + +The Problem +=========== + +With the introduction of PEP 328, relative imports became dependent on +the ``__name__`` attribute of the module performing the import. This +is because the use of dots in a relative import are used to strip away +parts of the calling module's name to calculate where in the package +hierarchy an import should fall (prior to PEP 328 relative +imports could fail and would fall back on absolute imports which had a +chance of succeeding). + +For instance, consider the import ``from .. import spam`` made from the +``bacon.ham.beans`` module (``bacon.ham.beans`` is not a package +itself, i.e., does not define ``__path__``). Name resolution of the +relative import takes the caller's name (``bacon.ham.beans``), splits +on dots, and then slices off the last n parts based on the level +(which is 2). In this example both ``ham`` and ``beans`` are dropped +and ``spam`` is joined with what is left (``bacon``). This leads to +the proper import of the module ``bacon.spam``. + +This reliance on the ``__name__`` attribute of a module when handling +relative imports becomes an issue when executing a script within a +package. Because the executing script has its name set to +``'__main__'``, import cannot resolve any relative imports, leading to +an ``ImportError``. + +For example, assume we have a package named ``bacon`` with an +``__init__.py`` file containing:: + + from . import spam + +Also create a module named ``spam`` within the ``bacon`` package (it +can be an empty file). Now if you try to execute the ``bacon`` +package (either through ``python bacon/__init__.py`` or +``python -m bacon``) you will get an ``ImportError`` about trying to +do a relative import from within a non-package. Obviously the import +is valid, but because of the setting of ``__name__`` to ``'__main__'`` +import thinks that ``bacon/__init__.py`` is not in a package since no +dots exist in ``__name__``. To see how the algorithm works in more +detail, see ``importlib.Import._resolve_name()`` in the sandbox +[#importlib]_. + +Currently a work-around is to remove all relative imports in the +module being executed and make them absolute. This is unfortunate, +though, as one should not be required to use a specific type of +resource in order to make a module in a package be able to be +executed. + + +The Solution +============ + +The solution to the problem is to not change the value of ``__name__`` +in modules. But there still needs to be a way to let executing code +know it is being executed as a script. This is handled with a new +attribute in the ``sys`` module named ``main``. + +When a module is being executed as a script, ``sys.main`` will be set +to the name of the module. This changes the current idiom of:: + + if __name__ == '__main__': + ... + +to:: + + import sys + if __name__ == sys.main: + ... + +The newly proposed solution does introduce an added line of +boilerplate which is a module import. But as the solution does not +introduce a new built-in or module attribute (as discussed in +`Rejected Ideas`_) it has been deemed worth the extra line. + +Another issue with the proposed solution (which also applies to all +rejected ideas as well) is that it does not directly solve the problem +of discovering the name of a file. Consider ``python bacon/spam.py``. +By the file name alone it is not obvious whether ``bacon`` is a +package. In order to properly find this out both the current +direction must exist on ``sys.path`` as well as ``bacon/__init__.py`` +existing. + +But this is the simple example. Consider ``python ../spam.py``. From +the file name alone it is not at all clear if ``spam.py`` is in a +package or not. One possible solution is to find out what the +absolute name of ``..``, check if a file named ``__init__.py`` exists, +and then look if the directory is on ``sys.path``. If it is not, then +continue to walk up the directory until no more ``__init__.py`` files +are found or the directory is found on ``sys.path``. + +This could potentially be an expensive process. If the package depth +happens to be deep then it could require a large amount of disk access +to discover where the package is anchored on ``sys.path``, if at all. +The stat calls alone can be expensive if the file system the executed +script is on is something like NFS. + +Because of these issues, only when the ``-m`` command-line argument +(introduced by PEP 338) is used will ``__name__`` be set. Otherwise +the fallback semantics of setting ``__name__`` to ``"__main__"`` will +occur. ``sys.main`` will still be set to the proper value, +regardless of what ``__name__`` is set to. + + +Implementation +============== + +When the ``-m`` option is used, ``sys.main`` will be set to the +argument passed in. ``sys.argv`` will be adjusted as it is currently. +Then the equivalent of ``__import__(self.main)`` will occur. This +differs from current semantics as the ``runpy`` module fetches the +code object for the file specified by the module name in order to +explicitly set ``__name__`` and other attributes. This is no longer +needed as import can perform its normal operation in this situation. + +If a file name is specified, then ``sys.main`` will be set to +``"__main__"``. The specified file will then be read and have a code +object created and then be executed with ``__name__`` set to +``"__main__"``. This mirrors current semantics. + + +Transition Plan +=============== + +In order for Python 2.6 to be able to support both the current +semantics and the proposed semantics, ``sys.main`` will always be set +to ``"__main__"``. Otherwise no change will occur for Python 2.6. +This unfortunately means that no benefit from this change will occur +in Python 2.6, but it maximizes compatibility for code that is to +work as much as possible with 2.6 and 3.0. + +To help transition to the new idiom, 2to3 [#2to3]_ will gain a rule to +transform the current ``if __name__ == '__main__': ...`` idiom to the +new one. This will not help with code that checks ``__name__`` +outside of the idiom, though. + + +Rejected Ideas +============== + +``__main__`` built-in +--------------------- + +A counter-proposal to introduce a built-in named ``__main__``. +The value of the built-in would be the name of the module being +executed (just like the proposed ``sys.main``). This would lead to a +new idiom of:: + + if __name__ == __main__: + ... + +A drawback is that the syntactic difference is subtle; the dropping +of quotes around "__main__". Some believe that for existing Python +programmers bugs will be introduced where the quotation marks will be +put on by accident. But one could argue that the bug would be +discovered quickly through testing as it is a very shallow bug. + +While the name of built-in could obviously be different (e.g., +``main``) the other drawback is that it introduces a new built-in. +With a simple solution such as ``sys.main`` being possible without +adding another built-in to Python, this proposal was rejected. + +``__main__`` module attribute +----------------------------- + +Another proposal was to add a ``__main__`` attribute to every module. +For the one that was executing as the main module, the attribute would +have a true value while all other modules had a false value. This has +a nice consequence of simplify the main module idiom to:: + + if __main__: + ... + +The drawback was the introduction of a new module attribute. It also +required more integration with the import machinery than the proposed +solution. + + +Use ``__file__`` instead of ``__name__`` +---------------------------------------- + +Any of the proposals could be changed to use the ``__file__`` +attribute on modules instead of ``__name__``, including the current +semantics. The problem with this is that with the proposed solutions +there is the issue of modules having no ``__file__`` attribute defined +or having the same value as other modules. + +The problem that comes up with the current semantics is you still have +to try to resolve the file path to a module name for the import to +work. + + +Special string subclass for ``__name__`` that overrides ``__eq__`` +------------------------------------------------------------------ + +One proposal was to define a subclass of ``str`` that overrode the +``__eq__`` method so that it would compare equal to ``"__main__"`` as +well as the actual name of the module. In all other respects the +subclass would be the same as ``str``. + +This was rejected as it seemed like too much of a hack. + + +References +========== + +.. [#2to3] 2to3 tool + (http://svn.python.org/view/sandbox/trunk/2to3/) [ViewVC] + +.. [#importlib] importlib + (http://svn.python.org/view/sandbox/trunk/import_in_py/importlib.py?view=markup) + [ViewVC] + +.. [#guido-rejection] Python-Dev email: "PEP to change how the main module is delineated" + (http://mail.python.org/pipermail/python-3000/2007-April/006793.html) + + + +Copyright +========= + +This document has been placed in the public domain. + + + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: