PEP: 338
Title: Executing modules as scripts
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-Oct-2004
Python-Version: 2.5
Post-History: 8-Nov-2004, 11-Feb-2006, 12-Feb-2006, 18-Feb-2006
Abstract
========
This PEP defines semantics for executing any Python module as a
script, either with the ``-m`` command line switch, or by invoking
it via ``runpy.run_module(modulename)``.
The ``-m`` switch implemented in Python 2.4 is quite limited. This
PEP proposes making use of the PEP 302 [4]_ import hooks to allow any
module which provides access to its code object to be executed.
Rationale
=========
Python 2.4 adds the command line switch ``-m`` to allow modules to be
located using the Python module namespace for execution as scripts.
The motivating examples were standard library modules such as ``pdb``
and ``profile``, and the Python 2.4 implementation is fine for this
limited purpose.
A number of users and developers have requested extension of the
feature to also support running modules located inside packages. One
example provided is pychecker's ``pychecker.checker`` module. This
capability was left out of the Python 2.4 implementation because the
implementation of this was significantly more complicated, and the most
appropriate strategy was not at all clear.
The opinion on python-dev was that it was better to postpone the
extension to Python 2.5, and go through the PEP process to help make
sure we got it right.

Since that time, it has also been pointed out that the current version
of ``-m`` does not support ``zipimport`` or any other kind of
alternative import behaviour (such as frozen modules).

Providing this functionality as a Python module is significantly easier
than writing it in C, and makes the functionality readily available to
all Python programs, rather than being specific to the CPython
interpreter. CPython's command line switch can then be rewritten to
make use of the new module.

Scripts which execute other scripts (e.g. ``profile``, ``pdb``) also
have the option to use the new module to provide ``-m`` style support
for identifying the script to be executed.
Scope of this proposal
==========================
In Python 2.4, a module located using ``-m`` is executed just as if
its filename had been provided on the command line. The goal of this
PEP is to get as close as possible to making that statement also hold
true for modules inside packages, or accessed via alternative import
mechanisms (such as ``zipimport``).
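
As a rough illustration of that goal, the sketch below builds a throwaway
zip archive and executes a module from it through ``runpy.run_module``.
It assumes a current Python 3 interpreter in which ``runpy`` is available
as it eventually shipped; the archive name ``bundle.zip`` and the module
name ``zipped_demo`` are invented for the example.

```python
import importlib
import os
import runpy
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a zip archive containing a single hypothetical module.
    archive = os.path.join(tmp, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_demo.py", "answer = 6 * 7\n")

    # Putting the archive on sys.path lets zipimport find the module.
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    try:
        zip_globals = runpy.run_module("zipped_demo")
    finally:
        sys.path.remove(archive)

# The loader recorded in the module globals is the zip importer,
# not the ordinary filesystem loader.
loader_type = type(zip_globals["__loader__"]).__name__
print(zip_globals["answer"], loader_type)
```

The same module could be run from the command line with ``-m`` once the
archive is on the module search path.
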
Prior discussions suggest it should be noted that this PEP is **not**
about changing the idiom for making Python modules also useful as
scripts (see PEP 299 [1]_). That issue is considered orthogonal to the
specific feature addressed by this PEP.
Current Behaviour
=================
Before describing the new semantics, it's worth covering the existing
semantics for Python 2.4 (as they are currently defined only by the
source code and the command line help).
When ``-m`` is used on the command line, it immediately terminates the
option list (like ``-c``). The argument is interpreted as the name of
a top-level Python module (i.e. one which can be found on
``sys.path``).

If the module is found, and is of type ``PY_SOURCE`` or
``PY_COMPILED``, then the command line is effectively reinterpreted
from ``python <options> -m <module> <args>`` to ``python <options>
<filename> <args>``. This includes setting ``sys.argv[0]`` correctly
(some scripts rely on this - Python's own ``regrtest.py`` is one
example).

If the module is not found, or is not of the correct type, an error
is printed.
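
The reinterpretation of the command line can be observed from a small
test harness. The sketch below is only an illustration run against a
current Python 3 interpreter (not the 2.4 implementation described
above); the module name ``show_argv`` is hypothetical, and ``PYTHONPATH``
is used simply to make the temporary directory importable.

```python
import json
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical top-level module that reports the argv it sees.
    with open(os.path.join(tmp, "show_argv.py"), "w") as f:
        f.write("import json, sys\nprint(json.dumps(sys.argv))\n")

    env = dict(os.environ, PYTHONPATH=tmp)
    out = subprocess.run(
        [sys.executable, "-m", "show_argv", "-v", "data.txt"],
        env=env, capture_output=True, text=True, check=True,
    ).stdout

argv = json.loads(out)
# argv[0] names the module's file; everything after the module name,
# including "-v", is passed through to the module untouched.
print(argv)
```

Because ``-m`` terminates the option list, the ``-v`` here reaches the
module rather than being consumed by the interpreter.
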
Proposed Semantics
==================
The semantics proposed are fairly simple: if ``-m`` is used to execute
a module, the PEP 302 import mechanisms are used to locate the module and
retrieve its compiled code, before executing the module in accordance
with the semantics for a top-level module. The interpreter does this by
invoking a new standard library function ``runpy.run_module``.

This is necessary due to the way Python's import machinery locates
modules inside packages. A package may modify its own ``__path__``
variable during initialisation. In addition, paths may be affected by
``*.pth`` files, and some packages will install custom loaders on
``sys.meta_path``. Accordingly, the only way for Python to reliably
locate the module is by importing the containing package and
using the PEP 302 import hooks to gain access to the Python code.
Note that the process of locating the module to be executed may require
importing the containing package. The effects of such a package import
that will be visible to the executed module are:

- the containing package will be in ``sys.modules``
- any external effects of the package initialisation (e.g. installed
  import hooks, loggers, atexit handlers, etc.)
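
The lookup step described in this section can be sketched as follows.
This is not the reference implementation: it uses the ``importlib.util``
API of current Python 3 as a stand-in for the PEP 302 protocol, and the
helper name ``locate_module_code`` is invented for the example.

```python
import importlib.util
import os
import sys

def locate_module_code(mod_name):
    """Return (loader, code, filename) for mod_name without running its code."""
    # find_spec imports any containing package first, so __path__ changes
    # and hooks installed during package initialisation are honoured.
    spec = importlib.util.find_spec(mod_name)
    if spec is None:
        raise ImportError("No module named %s" % mod_name)
    loader = spec.loader
    # get_code is an optional loader method; not every loader provides it.
    get_code = getattr(loader, "get_code", None)
    if get_code is None:
        raise ImportError("Loader for %s cannot supply code" % mod_name)
    code = get_code(mod_name)
    if code is None:
        raise ImportError("No code object available for %s" % mod_name)
    return loader, code, spec.origin

loader, code, filename = locate_module_code("xml.dom.minidom")
# Locating the submodule has imported its containing packages.
package_imported = "xml.dom" in sys.modules
print(os.path.basename(filename), package_imported)
```

The returned code object can then be executed in a fresh namespace,
which is the part of the job that ``run_module`` takes on.
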
Reference Implementation
========================
A reference implementation is available on SourceForge ([2]_), along
with documentation for the library reference ([5]_). There are
two parts to this implementation. The first is a proposed standard
library module ``runpy``. The second is a modification to the code
implementing the ``-m`` switch to always delegate to
``runpy.run_module`` instead of trying to run the module directly.
The delegation has the form::

    runpy.run_module(sys.argv[0], run_name="__main__", alter_sys=True)

``run_module`` is the only function ``runpy`` exposes in its public API.

``run_module(mod_name[, init_globals][, run_name][, alter_sys])``

    Execute the code of the specified module and return the resulting
    module globals dictionary. The module's code is first located using
    the standard import mechanism (refer to PEP 302 for details) and
    then executed in a fresh module namespace.

    The optional dictionary argument ``init_globals`` may be used to
    pre-populate the globals dictionary before the code is executed.
    The supplied dictionary will not be modified. If any of the special
    global variables below are defined in the supplied dictionary, those
    definitions are overridden by the ``run_module`` function.

    The special global variables ``__name__``, ``__file__``,
    ``__loader__`` and ``__builtins__`` are set in the globals dictionary
    before the module code is executed.

    ``__name__`` is set to ``run_name`` if this optional argument is
    supplied, and the original ``mod_name`` argument otherwise.

    ``__loader__`` is set to the PEP 302 module loader used to retrieve
    the code for the module (this loader may be a wrapper around the
    standard import mechanism).

    ``__file__`` is set to the name provided by the module loader. If
    the loader does not make filename information available, this
    variable is set to ``None``.

    ``__builtins__`` is automatically initialised with a reference to
    the top level namespace of the ``__builtin__`` module.

    If the argument ``alter_sys`` is supplied and evaluates to ``True``,
    then ``sys.argv[0]`` is updated with the value of ``__file__``
    and ``sys.modules[__name__]`` is updated with a temporary module
    object for the module being executed. The import lock is used to
    prevent other threads from seeing the partially initialised module
    object. Both ``sys.argv[0]`` and ``sys.modules[__name__]`` are
    restored to their original values before this function returns.
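
The effect of ``alter_sys`` can be checked with a small experiment. The
sketch below assumes the ``runpy`` module of a current Python 3
interpreter; the module name ``alter_demo`` and the run name
``alter_demo_main`` are invented for the example.

```python
import importlib
import os
import runpy
import sys
import tempfile

# A hypothetical module that records what it can see while it runs.
MODULE_SOURCE = (
    "import sys\n"
    "argv0_seen = sys.argv[0]\n"
    "registered = __name__ in sys.modules\n"
)

argv0_before = sys.argv[0]

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "alter_demo.py"), "w") as f:
        f.write(MODULE_SOURCE)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        result = runpy.run_module(
            "alter_demo", run_name="alter_demo_main", alter_sys=True
        )
    finally:
        sys.path.remove(tmp)

# While running, the module saw its own file name in sys.argv[0] and a
# temporary entry for itself in sys.modules.
print(result["argv0_seen"] == result["__file__"], result["registered"])
# Afterwards both pieces of interpreter state are back as they were.
print(sys.argv[0] == argv0_before, "alter_demo_main" in sys.modules)
```
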
When invoked as a script, the ``runpy`` module finds and executes the
module supplied as the first argument. It adjusts ``sys.argv`` by
deleting ``sys.argv[0]`` (which refers to the ``runpy`` module itself)
and then invokes ``run_module(sys.argv[0], run_name="__main__",
alter_sys=True)``.
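
One way to exercise this script mode is shown below. The sketch assumes a
current Python 3 interpreter and starts ``runpy`` itself through ``-m``,
which is only one of several ways to invoke it as a script; the module
name ``greet`` is hypothetical.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical module that reports its name and its arguments.
    with open(os.path.join(tmp, "greet.py"), "w") as f:
        f.write("import sys\nprint(__name__, sys.argv[1:])\n")

    env = dict(os.environ, PYTHONPATH=tmp)
    out = subprocess.run(
        [sys.executable, "-m", "runpy", "greet", "alice"],
        env=env, capture_output=True, text=True, check=True,
    ).stdout

# The module runs as __main__ and sees only the arguments that
# followed its own name on the command line.
print(out.strip())
```
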
Resolved Issues
================
There were some key design decisions that influenced the development of
the ``runpy`` module. These are listed below.

- The special variables ``__name__``, ``__file__`` and ``__loader__``
  are set in a module's global namespace before the module is executed.
  As ``run_module`` alters these values, it does **not** mutate the
  supplied dictionary. If it did, then passing ``globals()`` to this
  function could have nasty side effects.

- Sometimes, the information needed to populate the special variables
  simply isn't available. Rather than trying to be too clever, these
  variables are simply set to ``None`` when the relevant information
  cannot be determined.

- There is no special protection on the ``alter_sys`` argument.
  This may result in ``sys.argv[0]`` being set to ``None`` if file
  name information is not available.

- The import lock is used to avoid potential threading issues that arise
  when ``alter_sys`` is set to ``True``.
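
The first decision above can be demonstrated directly. The sketch below
assumes the ``runpy`` module of a current Python 3 interpreter and uses
the standard library's ``colorsys`` module purely as a convenient,
side-effect-free target; the dictionary keys are invented for the
example.

```python
import runpy

# A caller-supplied dictionary, including a bogus special variable.
supplied = {"__name__": "not_the_real_name", "marker": 123}
snapshot = dict(supplied)

result = runpy.run_module("colorsys", init_globals=supplied)

# The supplied dictionary is left exactly as it was passed in.
print(supplied == snapshot)
# The special variable was overridden in the returned globals, while
# the ordinary entry was carried over alongside the module's own names.
print(result["__name__"], result["marker"], "rgb_to_hsv" in result)
```
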
Alternatives
============
The first alternative implementation considered ignored packages'
``__path__`` variables, and looked only in the main package directory. A
Python script with this behaviour can be found in the discussion of
the ``execmodule`` cookbook recipe [3]_.

The ``execmodule`` cookbook recipe itself was the proposed mechanism in
an earlier version of this PEP (before the PEP's author read PEP 302).

Both approaches were rejected as they do not meet the main goal of the
``-m`` switch -- to allow the full Python namespace to be used to
locate modules for execution from the command line.

An earlier version of this PEP included some mistaken assumptions
about the way ``exec`` handled locals dictionaries and code from
function objects. These mistaken assumptions led to some unneeded
design complexity which has now been removed - ``run_code`` shares all
of the quirks of ``exec``.

Earlier versions of the PEP also exposed a broader API than just the
single ``run_module()`` function needed to implement the updates to
the ``-m`` switch. In the interests of simplicity, those extra functions
have been dropped from the proposed API.
References
==========
.. [1] Special __main__() function in modules
   (http://www.python.org/peps/pep-0299.html)

.. [2] PEP 338 implementation (runpy module and ``-m`` update)
   (http://www.python.org/sf/1429601)

.. [3] execmodule Python Cookbook Recipe
   (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/307772)

.. [4] New import hooks
   (http://www.python.org/peps/pep-0302.html)

.. [5] PEP 338 documentation (for runpy module)
   (http://www.python.org/sf/1429605)
Copyright
=========
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: