Update PEP to fully support PEP 302 import semantics

2006-02-11 10:00:00 +00:00 · 2006-02-11 10:00:00 +00:00 · d8ac0a56d0
parent fd26d618a0
commit d8ac0a56d0
1 changed files with 194 additions and 68 deletions
--- a/pep-0338.txt
+++ b/pep-0338.txt
@@ -1,24 +1,30 @@
 PEP: 338
-Title: Executing modules inside packages with '-m'
+Title: Executing modules as scripts
 Version: $Revision$
 Last-Modified: $Date$
-Author: Nick Coghlan <ncoghlan@email.com>
+Author: Nick Coghlan <ncoghlan@gmail.com>
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 16-Oct-2004
 Python-Version: 2.5
-Post-History: 8-Nov-2004
+Post-History: 8-Nov-2004, 11-Feb-2006


 Abstract
 ========

-This PEP defines semantics for executing modules inside packages as
-scripts with the ``-m`` command line switch.
+This PEP defines semantics for executing any Python module as a
+script, either with the ``-m`` command line switch, or by invoking
+it via ``runpy.run_module(modulename)``.

-The proposed semantics are that the containing package be imported
-prior to execution of the script.
+The ``-m`` switch implemented in Python 2.4 is quite limited. This
+PEP proposes making use of the PEP 302 [4]_ import hooks to allow any
+module which provides access to its code object to be executed.
+
+Additional functions are proposed to make the same convenience available
+for other references to executable Python code (strings, code objects,
+Python source files, Python compiled files).


 Rationale
@@ -27,18 +33,34 @@ Rationale
 Python 2.4 adds the command line switch ``-m`` to allow modules to be
 located using the Python module namespace for execution as scripts.
 The motivating examples were standard library modules such as ``pdb``
-and ``profile``.
+and ``profile``, and the Python 2.4 implementation is fine for this
+limited purpose.

 A number of users and developers have requested extension of the
 feature to also support running modules located inside packages.  One
 example provided is pychecker's ``pychecker.checker`` module.  This
 capability was left out of the Python 2.4 implementation because the
-appropriate semantics were not entirely clear.
+implementation of this was significantly more complicated, and the most
+appropriate strategy was not at all clear.

 The opinion on python-dev was that it was better to postpone the
 extension to Python 2.5, and go through the PEP process to help make
 sure we got it right.

+Since that time, it has also been pointed out that the current version
+of ``-m`` does not support ``zipimport`` or any other kind of
+alternative import behaviour (such as frozen modules).
+
+Providing this functionality as a Python module is significantly easier
+than writing it in C, and makes the functionality readily available to
+all Python programs, rather than being specific to the CPython
+interpreter. CPython's command line switch can then be rewritten to
+make use of the new module.
+
+Scripts which execute other scripts (e.g. ``profile``, ``pdb``) also
+have the option to use the new module to provide ``-m`` style support
+for identifying the script to be executed.
+

 Scope of this proposal
 ==========================
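The Rationale hunk above notes that scripts which execute other scripts could use the new module to offer ``-m`` style target selection. A minimal sketch of such a wrapper follows. It is written against the ``runpy`` module in current Python releases, where the keyword is ``alter_sys`` rather than the ``as_script`` name proposed in this revision; ``run_target`` and its argument handling are hypothetical names made up for illustration.

```python
import runpy
import sys


def run_target(args):
    """Run the module named after a leading '-m' as if it were a script.

    `args` is the wrapper's own argument list with the wrapper name
    already stripped, e.g. ["-m", "pkg.module", "arg1"].
    Hypothetical helper, not part of the PEP or the standard library.
    """
    if len(args) < 2 or args[0] != "-m":
        raise SystemExit("usage: wrapper -m module [args...]")
    mod_name, mod_args = args[1], args[2:]

    saved_argv = sys.argv
    # The target should see its own name and arguments, not the wrapper's.
    sys.argv = [mod_name] + mod_args
    try:
        # alter_sys=True makes runpy point sys.argv[0] at the module's file
        # and install a temporary sys.modules["__main__"] while it runs.
        return runpy.run_module(mod_name, run_name="__main__", alter_sys=True)
    finally:
        # Put the wrapper's own argument list back.
        sys.argv = saved_argv
```

The function returns the globals dictionary of the executed module, so a profiler or debugger style wrapper can inspect what the target defined after it finishes.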
@@ -46,30 +68,20 @@ Scope of this proposal
 In Python 2.4, a module located using ``-m`` is executed just as if
 its filename had been provided on the command line.  The goal of this
 PEP is to get as close as possible to making that statement also hold
-true for modules inside packages.
+true for modules inside packages, or accessed via alternative import
+mechanisms (such as ``zipimport``).

 Prior discussions suggest it should be noted that this PEP is **not**
-about any of the following:
-
- changing the idiom for making Python modules also useful as scripts
-  (see PEP 299 [1]_).
-
- lifting the restriction of ``-m`` to modules of type PY_SOURCE or
-  PY_COMPILED (i.e. ``.py``, ``.pyc``, ``.pyo``, ``.pyw``).
-
- addressing the problem of ``-m`` not understanding zip imports or
-  Python's sys.metapath.
-
-The issues listed above are considered orthogonal to the specific
-feature addressed by this PEP.
-
+about changing the idiom for making Python modules also useful as
+scripts (see PEP 299 [1]_). That issue is considered orthogonal to the
+specific feature addressed by this PEP.

 Current Behaviour
 =================

 Before describing the new semantics, it's worth covering the existing
 semantics for Python 2.4 (as they are currently defined only by the
-source code).
+source code and the command line help).

 When ``-m`` is used on the command line, it immediately terminates the
 option list (like ``-c``).  The argument is interpreted as the name of
@@ -91,20 +103,22 @@ Proposed Semantics
 ==================

 The semantics proposed are fairly simple: if ``-m`` is used to execute
-a module inside a package as a script, then the containing package is
-imported before executing the module in accordance with the semantics
-for a top-level module.
+a module, the PEP 302 import mechanisms are used to locate the module and
+retrieve its compiled code, before executing the module in accordance
+with the semantics for a top-level module. The interpreter does this by
+invoking a new standard library function ``runpy.run_module``.

 This is necessary due to the way Python's import machinery locates
 modules inside packages.  A package may modify its own __path__
-variable during initialisation.  In addition, paths may affected by
-``*.pth`` files.  Accordingly, the only way for Python to reliably
+variable during initialisation.  In addition, paths may be affected by
+``*.pth`` files, and some packages will install custom loaders on
+``sys.meta_path``.  Accordingly, the only way for Python to reliably
 locate the module is by importing the containing package and
-inspecting its __path__ variable.
+using the PEP 302 import hooks to gain access to the Python code.

-Note that the package is *not* imported into the ``__main__`` module's
-namespace.  The effects of these semantics that will be visible to the
-executed module are:
+Note that the process of locating the module to be executed may require
+importing the containing package.  The effects of such a package import
+that will be visible to the executed module are:

 - the containing package will be in sys.modules

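The lookup described in this hunk can be approximated with today's ``importlib``, which grew out of the PEP 302 protocol. The following is only a sketch: ``locate_module_code`` is a hypothetical helper, and it relies on the loader offering the optional ``get_code()`` method. ``importlib.util.find_spec`` imports the containing package as a side effect, which is the visible effect the list item above refers to.

```python
import importlib.util


def locate_module_code(mod_name):
    """Return (loader, code, origin) for `mod_name` without executing it.

    Hypothetical helper for illustration; it is not the PEP's
    reference implementation.
    """
    # For a dotted name, find_spec imports the parent package first so
    # that the package's __path__ (and any import hooks) are honoured.
    spec = importlib.util.find_spec(mod_name)
    if spec is None:
        raise ImportError("No module named %s" % mod_name)

    loader = spec.loader
    # get_code() is optional in the loader protocol, so check for it.
    if loader is None or not hasattr(loader, "get_code"):
        raise ImportError("No code object available for %s" % mod_name)

    code = loader.get_code(mod_name)
    if code is None:
        raise ImportError("No code object available for %s" % mod_name)
    return loader, code, spec.origin
```

For example, ``locate_module_code("json.tool")`` leaves ``json`` in ``sys.modules`` and hands back the module's code object, which is the situation the ``pychecker.checker`` request in the Rationale calls for.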
@@ -115,57 +129,164 @@ executed module are:
 Reference Implementation
 ========================

-A reference implementation is available on SourceForge [2]_.  In this
-implementation, if the ``-m`` switch fails to locate the requested
-module at the top level, it effectively reinterprets the command from
-``python -m <script>`` to ``python -m execmodule <script>``.  (There
-is one caveat: when reinterpreted in this way, ``sys.argv[0]`` may not
-actually contain the filename of ``execmodule``.  This only affects
-``execmodule`` itself, not the requested module).
+A reference implementation is available on SourceForge ([2]_), along
+with documentation for the library reference ([5]_).  There are
+two parts to this implementation. The first is a proposed standard
+library module ``runpy``. The second is a modification to the code
+implementing the ``-m`` switch to always delegate to
+``runpy.run_module`` instead of trying to run the module directly.
+The delegation has the form::

-``execmodule`` is a proposed standard library module that contains a
-single function (also called ``execmodule``).  When invoked as a
-script, this module finds and executes the module supplied as the
-first argument.  It adjusts ``sys.argv`` by deleting ``sys.argv[0]``
-and replacing the new ``sys.argv[0]`` with the module's filename
-instead of its Python name.
+  runpy.run_module(sys.argv[0], run_name="__main__", as_script=True)

-The function ``execmodule`` is like ``execfile``, but uses the Python
-module namespace to locate the script instead of the filesystem.  It
-has an additional optional argument ``set_argv0`` which causes the
-filename of the located module to be written to ``sys.argv[0]`` before
-the module is executed.
+``run_module`` is only one of a number of functions ``runpy`` exposes to
+make it easier to run Python code dynamically. The proposed functions
+are listed below (the descriptions are taken from the proposed
+documentation).

-A hybrid C/Python implementation is used as the Python module is much
-more flexible and extensible than the equivalent C code would be.  It
-also allows the ``execmodule`` function to be made available.  Scripts
-which execute other scripts (e.g. ``profile``, ``pdb``) have the
-option to use this function to provide ``-m`` style support for
-identifying the script to be executed.
+``run_code(code[, globals])``

-The Python code for ``execmodule`` has also been posted as a
-cookbook recipe for Python 2.4 [3]_.
+    Execute the supplied Python code object or string of source code and
+    return the resulting module globals dictionary. If supplied, the
+    optional globals dictionary is used as the module globals.
+    Otherwise, a new dictionary is used.
+    The special variable ``__builtins__`` in the globals dictionary is
+    automatically initialised with a reference to the top level
+    namespace of the ``__builtin__`` module. 
+
+``run_function_code(code[, globals][, locals])``
+
+    Execute the supplied function code object and return a tuple
+    containing the resulting globals and locals dictionaries. If
+    supplied, the optional globals dictionary is used as the module
+    globals. Otherwise, a new dictionary is used. Similarly, if
+    supplied, the optional locals dictionary is used as the function
+    locals. Otherwise, a new dictionary is used.
+
+    As for ``run_code()``, the special variable ``__builtins__`` in the
+    globals dictionary is automatically initialised with a reference to
+    the top level namespace of the ``__builtin__`` module.
+
+    A function code object is required, as module level Python code
+    cannot resolve names correctly when the locals and globals
+    dictionaries are not the same (specifically, new names are bound in
+    the locals dictionary, but this dictionary is not used when looking
+    up references to names at module level). 
+
+
+``run_module_code(code[, init_globals][, mod_name][, mod_file]\
+[, mod_loader][, as_script])``
+
+    Execute the supplied Python code object or string of source code and
+    return the resulting module globals dictionary.
+
+    The optional argument ``init_globals`` may be used to pre-populate
+    the globals dictionary before the code is executed. The supplied
+    dictionary will not be modified. If any of the special global
+    variables below are defined in the supplied dictionary, those
+    definitions are overridden.
+
+    The special global variables ``__name__``, ``__file__``,
+    ``__loader__`` and ``__builtins__`` are set in the globals
+    dictionary before the module code is executed. ``__name__``,
+    ``__file__``, ``__loader__`` are set based on the optional arguments
+    ``mod_name``, ``mod_file`` and ``mod_loader``. If the arguments are
+    omitted, the corresponding special variable is set to ``None``.
+
+    If the argument ``as_script`` is supplied and evaluates to ``True``,
+    then ``sys.argv[0]`` is updated with the value of ``mod_file``
+    before the code is executed.
+
+    The supplied code is then executed in the globals dictionary using
+    ``run_code()``. 
+
+``run_module(mod_name[, init_globals][, run_name][, as_script])``
+
+    Execute the code of the specified module and return the resulting
+    module globals dictionary. The module's code is first located using
+    the standard import mechanism (refer to PEP 302 for details) and
+    then executed using ``run_module_code()``.
+
+    The ``init_globals`` and ``as_script`` arguments are passed directly
+    down to the lower level function. The ``mod_name`` argument to the
+    lower level function is ``run_name`` if this optional argument is
+    supplied, and the original ``mod_name`` argument otherwise.
+
+    The ``mod_loader`` argument to the lower level function is set to
+    the PEP 302 module loader used to retrieve the code for the module
+    (this loader may be a wrapper around the standard import mechanism).
+    The ``mod_file`` argument is set to the name provided by the module
+    loader. If the loader does not make filename information available,
+    this argument is set to ``None``. 
+
+``run_source_file(filename[, init_globals][, run_name][, as_script])``
+    Execute the specified Python source file and return the resulting
+    module globals dictionary. The file's code is read and then executed
+    using ``run_module_code()``.
+
+    The ``init_globals`` and ``as_script`` arguments are passed directly
+    down to the lower level function. The ``mod_name`` argument to the lower
+    level function is ``run_name`` if this optional argument is supplied
+    and ``None`` otherwise. 
+
+``run_compiled_file(filename[, init_globals][, run_name]\
+[, as_script])``
+
+    Execute the specified compiled Python file and return the resulting
+    module globals dictionary. The file's code is read and then executed
+    using ``run_module_code()``.
+
+    The ``init_globals`` and ``as_script`` arguments are passed directly
+    down to the lower level function. The ``mod_name`` argument to the lower
+    level function is ``run_name`` if this optional argument is supplied
+    and ``None`` otherwise. 
+
+``run_file(filename[, init_globals][, run_name][, as_script])``
+
+    Execute the specified Python file and return the resulting module
+    globals dictionary.
+
+    This function first attempts to retrieve a code object from the file
+    by interpreting it as a compiled Python file. If this fails, then
+    the file's contents are retrieved directly, interpreting it as a
+    Python source file. The retrieved code is then executed using
+    ``run_module_code()``.
+
+    The ``init_globals`` and ``as_script`` arguments are passed directly
+    down to the lower level function. The ``mod_name`` argument to the lower
+    level function is ``run_name`` if this optional argument is supplied
+    and ``None`` otherwise. 
+
+
+When invoked as a script, the ``runpy`` module finds and executes the
+module supplied as the first argument.  It adjusts ``sys.argv`` by
+deleting ``sys.argv[0]`` (which refers to the ``runpy`` module itself)
+and then invokes ``run_module(sys.argv[0], run_name="__main__",
+as_script=True)``.


 Open Issues
 ===========

- choosing a name for the standard library module containing
-  ``execmodule``.  The reference implementation uses ``execmodule``.
-  An alternative name proposed on python-dev is ``runpy``.
+ - the ``-m`` switch really only needs the ``run_module`` function. The
+   other six functions are merely about giving the module API coverage
+   of the other sources of executable Python code.


 Alternatives
 ============

-The main alternative implementation considered ignored packages'
+The first alternative implementation considered ignored packages'
 __path__ variables, and looked only in the main package directory.  A
 Python script with this behaviour can be found in the discussion of
 the ``execmodule`` cookbook recipe [3]_.

-This approach was not used as it does not meet the main goal of the
+The ``execmodule`` cookbook recipe itself was the proposed mechanism in
+an earlier version of this PEP (before the PEP's author read PEP 302).
+
+Both approaches were rejected as they do not meet the main goal of the
 ``-m`` switch -- to allow the full Python namespace to be used to
-locate modules for execution.
+locate modules for execution from the command line.


 References
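The ``run_module_code`` behaviour described in the hunk above can be sketched in a few lines of modern Python. This is an illustration of the documented semantics, not the reference implementation from the SourceForge patch: the name ``run_module_code_sketch`` is hypothetical, ``exec()`` stands in for ``run_code()``, and restoring ``sys.argv[0]`` afterwards is a choice made here that the text does not specify.

```python
import builtins
import sys


def run_module_code_sketch(code, init_globals=None, mod_name=None,
                           mod_file=None, mod_loader=None, as_script=False):
    """Execute `code` (code object or source string) as a module body."""
    # Work on a copy so the caller's dictionary is never modified.
    run_globals = dict(init_globals) if init_globals is not None else {}
    # Special variables override anything supplied in init_globals;
    # omitted arguments leave the corresponding variable set to None.
    run_globals.update(
        __name__=mod_name,
        __file__=mod_file,
        __loader__=mod_loader,
        __builtins__=vars(builtins),
    )
    if not as_script:
        exec(code, run_globals)
        return run_globals

    saved_argv0 = sys.argv[0]
    sys.argv[0] = mod_file
    try:
        exec(code, run_globals)
    finally:
        # Assumption of this sketch: put the caller's argv[0] back.
        sys.argv[0] = saved_argv0
    return run_globals
```

Copying ``init_globals`` before updating it is what keeps the guarantee that the supplied dictionary is not modified while still letting the special variables take precedence.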
@@ -174,12 +295,17 @@ References
 .. [1] Special __main__() function in modules
   (http://www.python.org/peps/pep-0299.html)

-.. [2] Native ``-m`` execmodule support
-   (http://sourceforge.net/tracker/?func=detail&aid=1043356&group_id=5470&atid=305470 )
+.. [2] PEP 338 implementation (runpy module and ``-m`` update)
+   (http://www.python.org/sf/1429601)

 .. [3] execmodule Python Cookbook Recipe
   (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/307772)

+.. [4] New import hooks
+   (http://www.python.org/peps/pep-0302.html)
+
+.. [5] PEP 338 documentation (for runpy module)
+   (http://www.python.org/sf/1429605)

 Copyright
 =========