PEP: 499
Title: ``python -m foo`` should also bind ``'foo'`` in ``sys.modules``
Author: Cameron Simpson <cs@cskk.id.au>, Chris Angelico <rosuav@gmail.com>, Joseph Jevnik <joejev@gmail.com>
BDFL-Delegate: Alyssa Coghlan
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 07-Aug-2015
Python-Version: 3.10

PEP Deferral
============

The implementation of this PEP isn't currently expected to be ready for the
Python 3.9 feature freeze in April 2020, so it has been deferred 12 months to
Python 3.10.

Abstract
========

When a module is used as a main program on the Python command line,
such as by::

    python -m module.name ...

it is easy to accidentally end up with two independent instances
of the module if that module is again imported within the program.
This PEP proposes a way to fix this problem.

When a module is invoked via Python's ``-m`` option the module is bound
to ``sys.modules['__main__']`` and its ``.__name__`` attribute is set to
``'__main__'``.
This enables the standard "main program" boilerplate code at the
bottom of many modules, such as::

    if __name__ == '__main__':
        sys.exit(main(sys.argv))

However, when the above command line invocation is used it is a
natural inference to presume that the module is actually imported
under its official name ``module.name``,
and therefore that if the program again imports that name
then it will obtain the same module instance.

The reality is that the module was imported only as ``'__main__'``.
Another import will obtain a distinct module instance, which can
lead to confusing bugs,
all stemming from having two instances of module global objects:
one in each module.

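The double import described above can be observed with a small experiment.
The following is a minimal sketch, not part of the proposal: it writes a
hypothetical module named ``demo_mod`` to a temporary directory, runs it with
``python -m demo_mod`` in a child process, and reports whether a second
``import demo_mod`` yields the running module. The module name and the
``run_demo`` helper are assumptions made for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Source of a hypothetical module that imports itself by name when it is
# run as the main program.
MODULE_SOURCE = '''\
import sys

if __name__ == "__main__":
    import demo_mod  # a second import of the module that is running
    print(demo_mod is sys.modules["__main__"])
'''


def run_demo():
    """Run the hypothetical module via ``python -m`` and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
            f.write(MODULE_SOURCE)
        # Make the temporary directory importable in the child process.
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "demo_mod"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


# Prints "False" on interpreters that do not implement this PEP:
# the imported module is a second, distinct instance.
print(run_demo())
```
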
Examples include:

module level data structures
  Some modules provide features such as caches or registries
  as module level global variables,
  typically private.
  A second instance of a module creates a second data structure.
  If that structure is a cache
  such as in the ``re`` module
  then two caches exist, leading to wasteful memory use.
  If that structure is a shared registry
  such as a mapping of values to handlers
  then it is possible to register a handler to one registry
  and to try to use it via the other registry, where it is unknown.

sentinels
  The standard test for a sentinel value provided by a module
  is the identity comparison using ``is``,
  as this avoids unreliable "looks like" comparisons
  such as equality, which can both wrongly treat two distinct values as "equal"
  (for example being zeroish)
  or raise a ``TypeError`` when the objects are incompatible.
  When there are two instances of a module
  there are two sentinel instances
  and only one will be recognised via ``is``.

classes
  With two modules
  there are duplicate class definitions of any classes provided.
  All operations which depend on recognising these classes
  and subclasses of these are prone to failure
  depending on where the reference class
  (from one of the modules) is obtained
  and where the comparison class or instance is obtained.
  This impacts ``isinstance``, ``issubclass``
  and also ``try``/``except`` constructs.

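The three failure modes listed above can be reproduced in a single process.
The sketch below is an illustration under stated assumptions: rather than
using ``-m``, it simulates the two module instances by loading the same
source file twice under two hypothetical names (``demo_mod_as_main`` and
``demo_mod``) with ``importlib``. The names ``REGISTRY``, ``MISSING`` and
``DemoError`` are invented for the example.

```python
import importlib.util
import os
import tempfile

# Source of a hypothetical module with a registry, a sentinel and a class.
SOURCE = '''\
REGISTRY = {}        # module level registry
MISSING = object()   # sentinel compared with "is"

class DemoError(Exception):
    pass
'''


def load_twice():
    """Load the same source file as two independent module objects."""
    instances = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_mod.py")
        with open(path, "w") as f:
            f.write(SOURCE)
        for name in ("demo_mod_as_main", "demo_mod"):
            spec = importlib.util.spec_from_file_location(name, path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            instances.append(module)
    return instances


def caught_by(raised, expected):
    """Return True if an instance of ``raised`` is caught by ``except expected``."""
    try:
        raise raised("boom")
    except expected:
        return True
    except Exception:
        return False


first, second = load_twice()

# Registry: a handler registered in one instance is unknown in the other.
first.REGISTRY["key"] = "handler"
print("key" in second.REGISTRY)                        # False

# Sentinel: the identity test fails across instances.
print(first.MISSING is second.MISSING)                 # False

# Classes: issubclass and try/except do not recognise the duplicate class.
print(issubclass(first.DemoError, second.DemoError))   # False
print(caught_by(first.DemoError, second.DemoError))    # False
```
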
Proposal
========

It is suggested that to fix this situation all that is needed is a
simple change to the way the ``-m`` option is implemented: in addition
to binding the module object to ``sys.modules['__main__']``, it is also
bound to ``sys.modules['module.name']``.

Alyssa (Nick) Coghlan has suggested that this is as simple as changing the
following line in the ``runpy`` module's ``_run_module_as_main`` function::

    main_globals = sys.modules["__main__"].__dict__

to instead be::

    main_module = sys.modules["__main__"]
    sys.modules[mod_spec.name] = main_module
    main_globals = main_module.__dict__

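The effect of the extra ``sys.modules`` binding can be approximated today from
inside a module. The following sketch is an emulation written for
illustration, not the reference implementation: a hypothetical module
``demo_mod`` aliases itself under ``__spec__.name`` before importing itself by
name, which is roughly what the proposed ``runpy`` change would do on its
behalf. The ``run_demo`` helper is likewise an assumption.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical module that performs the aliasing by hand.
MODULE_SOURCE = '''\
import sys

# Emulate the proposed runpy change: bind the running module under its
# canonical name before anything imports it again. __spec__ is None when
# the file is run directly as a script, so that case is left alone.
if __name__ == "__main__" and __spec__ is not None:
    sys.modules.setdefault(__spec__.name, sys.modules["__main__"])

if __name__ == "__main__":
    import demo_mod
    print(demo_mod is sys.modules["__main__"])
'''


def run_demo():
    """Run the hypothetical module via ``python -m`` and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
            f.write(MODULE_SOURCE)
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "demo_mod"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


# Prints "True": the second import now finds the running module.
print(run_demo())
```
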
Joseph Jevnik has pointed out that modules which are packages already
do something very similar to this proposal:
the ``__init__.py`` file is bound to the module's canonical name
and the ``__main__.py`` file is bound to "__main__".
As such, the double import issue does not occur.
Therefore, this PEP proposes to affect only simple non-package modules.

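The package behaviour described above can be checked with a throwaway
package. This is a minimal sketch assuming a hypothetical package named
``demo_pkg`` with an empty ``__init__.py``; its ``__main__.py`` imports the
package by name and prints how the two files are bound.

```python
import os
import subprocess
import sys
import tempfile

# Contents of the hypothetical demo_pkg/__main__.py.
MAIN_SOURCE = '''\
import sys
import demo_pkg

print(__name__, demo_pkg.__name__, demo_pkg is sys.modules["__main__"])
'''


def run_package_demo():
    """Run ``python -m demo_pkg`` in a child process and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = os.path.join(tmp, "demo_pkg")
        os.mkdir(pkg_dir)
        # An empty __init__.py is enough to make demo_pkg a package.
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write("")
        with open(os.path.join(pkg_dir, "__main__.py"), "w") as f:
            f.write(MAIN_SOURCE)
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


# Prints "__main__ demo_pkg False": __main__.py runs as __main__ while the
# package name refers to the separate module built from __init__.py.
print(run_package_demo())
```
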
Considerations and Prerequisites
================================

Pickling Modules
----------------

Alyssa has mentioned `issue 19702`_ which proposes (quoted from the issue):

- runpy will ensure that when __main__ is executed via the import
  system, it will also be aliased in sys.modules as __spec__.name
- if __main__.__spec__ is set, pickle will use __spec__.name rather
  than __name__ to pickle classes, functions and methods defined in
  __main__
- multiprocessing is updated appropriately to skip creating __mp_main__
  in child processes when __main__.__spec__ is set in the parent
  process

The first point above covers this PEP's specific proposal.

A Normal Module's ``__name__`` Is No Longer Canonical
-----------------------------------------------------

Chris Angelico points out that it becomes possible to import a
module whose ``__name__`` is not what you gave to "import", since
"__main__" is now present at "module.name", so a subsequent
``import module.name`` finds it already present.
Therefore, ``__name__`` is no longer the canonical name for some normal imports.

Some counter arguments follow:

- As of :pep:`451` a module's canonical name is stored at ``__spec__.name``.
- Very little code should actually care about ``__name__`` being the canonical name
  and any that does should arguably be updated to consult ``__spec__.name``
  with fallback to ``__name__`` for older Pythons, should that be relevant.
  This is true even if this PEP is not approved.
- Should this PEP be approved,
  it becomes possible to introspect a module by its canonical name
  and ask "was this the main program?" by inferring from ``__name__``.
  This was not previously possible.

The glaring counter example is the standard "am I the main program?" boilerplate,
where ``__name__`` is expected to be "__main__".
This PEP explicitly preserves that semantic.

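The suggestion to consult ``__spec__.name`` with a fallback to ``__name__``
could look like the sketch below. The helper names ``canonical_name`` and
``was_main_program`` are hypothetical and are not defined by this PEP or by
the standard library.

```python
import json
import types


def canonical_name(module):
    """Return the module's canonical name, preferring ``__spec__.name``."""
    spec = getattr(module, "__spec__", None)
    if spec is not None:
        return spec.name
    # Fallback for modules without a spec, such as those created by hand
    # or by import machinery that predates PEP 451.
    return module.__name__


def was_main_program(module):
    """Infer from ``__name__`` whether the module was run as the main program."""
    return module.__name__ == "__main__"


print(canonical_name(json))          # json
legacy = types.ModuleType("legacy_mod")   # a module object with no spec
print(canonical_name(legacy))        # legacy_mod
print(was_main_program(json))        # False
```

Under this PEP, a module obtained by its canonical name from ``sys.modules``
could be passed to ``was_main_program`` to learn whether it is the running
program, while ``canonical_name`` would still report the name it was located
under.
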
Reference Implementation
========================

`BPO 36375 <https://bugs.python.org/issue36375>`_ is the issue tracker entry
for the PEP's reference implementation, with the current draft PR being
available `on GitHub <https://github.com/python/cpython/pull/12490>`_.

Open Questions
==============

This proposal does raise some backwards compatibility concerns, and these will
need to be well understood, and either a deprecation process designed, or clear
porting guidelines provided.

Pickle compatibility
--------------------

If no changes are made to the pickle module, then pickles that were previously
being written with the correct module name (due to a dual import) may start
being written with ``__main__`` as their module name instead, and hence fail
to be loaded correctly by other projects.

Scenarios to be checked:

* ``python script.py`` writing, ``python -m script`` reading
* ``python -m script`` writing, ``python script.py`` reading
* ``python -m script`` writing, ``python some_other_app.py`` reading
* ``old_python -m script`` writing, ``new_python -m script`` reading
* ``new_python -m script`` writing, ``old_python -m script`` reading

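The module name recorded in a pickle can be inspected directly. The sketch
below shows current behaviour only, under the assumption of a hypothetical
module ``demo_mod`` defining a class ``Point``: an instance created in the
running ``__main__`` module is pickled under ``__main__``, while an instance
of the class obtained through the dual import is pickled under ``demo_mod``.
It does not attempt to cover the scenarios listed above.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical module that pickles its own class from both module instances.
MODULE_SOURCE = '''\
import pickle


class Point:
    pass


if __name__ == "__main__":
    import demo_mod  # today this creates a second module instance
    own = pickle.dumps(Point())             # class defined in __main__
    other = pickle.dumps(demo_mod.Point())  # class from the dual import
    print(b"__main__" in own, b"demo_mod" in other)
'''


def run_pickle_demo():
    """Run the hypothetical module via ``python -m`` and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
            f.write(MODULE_SOURCE)
        env = dict(os.environ, PYTHONPATH=tmp)
        result = subprocess.run(
            [sys.executable, "-m", "demo_mod"],
            cwd=tmp, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


# Prints "True True" on interpreters that do not implement this PEP.
print(run_pickle_demo())
```
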
Projects that special-case ``__main__``
---------------------------------------

In order to get the regression test suite to pass, the current reference
implementation had to patch ``pdb`` to avoid destroying its own global
namespace.

This suggests there may be a broader compatibility issue where some scripts are
relying on direct execution and import giving different namespaces (just as
package execution keeps the two separate by executing the ``__main__``
submodule in the ``__main__`` namespace, while the package name references
the ``__init__`` file as usual).

Background
==========

`I tripped over this issue`_ while debugging a main program via a
module which tried to monkey patch a named module, that being the
main program module. Naturally, the monkey patching was ineffective
as it imported the main module by name and thus patched the second
module instance, not the running module instance.

However, the problem has been around as long as the ``-m`` command
line option and is encountered regularly, if infrequently, by others.

In addition to `issue 19702`_, the discrepancy around ``__main__``
is alluded to in :pep:`451` and a similar proposal (predating :pep:`451`)
is described in :pep:`395` under
:pep:`Fixing dual imports of the main module <395#fixing-dual-imports-of-the-main-module>`.

References
==========

.. _issue 19702: http://bugs.python.org/issue19702

.. _I tripped over this issue: https://mail.python.org/pipermail/python-list/2015-August/694905.html

Copyright
=========

This document has been placed in the public domain.