From ad9c99dd8da9ddd810fb4bb1c2035b758a118d6f Mon Sep 17 00:00:00 2001 From: Nick Coghlan Date: Sun, 7 Apr 2019 17:45:54 +1000 Subject: [PATCH] PEP 534: Improve title and clarify proposal (#949) The previous title gave the impression that this PEP was about splitting the standard library into separately versioned pieces, when it's actually a far more modest proposal to improve the error messages that users receive when pieces are missing, as well as to provide a standard library API to retrieve the full list of expected standard library module names. It also includes a number of other improvements that came up during the PR review: - explicitly limit metadata to top level modules - be clear that private stdlib modules need to be listed - provide more details on which modules would be optional - use tkinter and idlelib as additional examples - provide rationale for packaging modules being mandatory - provide example error messages for missing submodules - add subheadings to the design discussion section - rename Open Questions to Deferred Ideas to avoid scope creep in the initial proposal --- pep-0534.txt | 299 +++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 264 insertions(+), 35 deletions(-) diff --git a/pep-0534.txt b/pep-0534.txt index 0bac62400..32638e18f 100644 --- a/pep-0534.txt +++ b/pep-0534.txt @@ -1,5 +1,5 @@ PEP: 534 -Title: Distributing a Subset of the Standard Library +Title: Improved Errors for Missing Standard Library Modules Version: $Revision$ Last-Modified: $Date$ Author: Tomáš Orsava , @@ -9,7 +9,7 @@ Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 5-Sep-2016 -Python-Version: 3.7 +Python-Version: 3.8 Post-History: @@ -17,26 +17,48 @@ Abstract ======== Python is often being built or distributed without its full standard library. -However, there is as of yet no standard, user friendly way of properly informing the user about the failure to import such missing `stdlib` modules. This PEP proposes a mechanism for identifying standard library modules and informing the user appropriately. +However, there is as of yet no standard, user friendly way of properly +informing the user about the failure to import such missing standard library +modules. + +This PEP proposes a mechanism for identifying expected standard library modules +and providing more informative error messages to users when attempts to import +standard library modules fail. Motivation ========== There are several use cases for including only a subset of Python's standard -library. However, there is so far no user-friendly mechanism for informing the user *why* a stdlib module is missing and how to remedy the situation appropriately. +library. However, there is so far no user-friendly mechanism for informing +the user *why* a stdlib module is missing and how to remedy the situation +appropriately. + CPython ------- -When one of Python standard library modules (such as ``_sqlite3``) cannot be -compiled during a Python build because of missing dependencies (e.g. SQLite -header files), the module is simply skipped. If you then install this compiled Python and use it to try to import one of the -missing modules, Python will fail with a ModuleNotFoundError_. +When one of Python's standard library modules (such as ``_sqlite3``) cannot be +compiled during a CPython build because of missing dependencies (e.g. SQLite +header files), the module is simply skipped. If you then install this compiled +Python and use it to try to import one of the missing modules, Python will fail +with a ModuleNotFoundError_. .. _ModuleNotFoundError: https://docs.python.org/3.7/library/exceptions.html#ModuleNotFoundError +For example, after deliberately removing ``sqlite-devel`` from the local +system:: + + $ ./python -c "import sqlite3" + Traceback (most recent call last): + File "", line 1, in + File "/home/ncoghlan/devel/cpython/Lib/sqlite3/__init__.py", line 23, in + from sqlite3.dbapi2 import * + File "/home/ncoghlan/devel/cpython/Lib/sqlite3/dbapi2.py", line 27, in + from _sqlite3 import * + ModuleNotFoundError: No module named '_sqlite3' + This can confuse users who may not understand why a cleanly built Python is missing standard library modules. @@ -47,9 +69,10 @@ Linux and other distributions Many Linux and other distributions are already separating out parts of the standard library to standalone packages. Among the most commonly excluded modules are the ``tkinter`` module, since it draws in a dependency on the -graphical environment, and the ``test`` package, as it only serves to test -Python internally and is about as big as the rest of the standard library put -together. +graphical environment, ``idlelib``, since it depends on ``tkinter`` (and most +Linux desktop environments provide their own default code editor), and the +``test`` package, as it only serves to test Python internally and is about as +big as the rest of the standard library put together. The methods of omission of these modules differ. For example, Debian patches the file ``Lib/tkinter/__init__.py`` to envelop the line ``import _tkinter`` in @@ -58,46 +81,247 @@ the following to the error message: ``please install the python3-tk package`` [#debian-patch]_. Fedora and other distributions simply don't include the omitted modules, potentially leaving users baffled as to where to find them. +An example from Fedora 29:: + + $ python3 -c "import tkinter" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: No module named 'tkinter' + + Specification ============= -The `sysconfig`_ module will be extended by two functions: `sysconfig.get_stdlib_modules()`, which will provide a list of the names of all Python standard library modules, and a function `sysconfig.get_optional_modules()`, that will list optional `stdlib` module names. The results of the latter function—`sysconfig.get_optional_modules()`—as well as of the existing `sys.builtin_module_names` will both be subsets of the full list provided by `sysconfig.get_stdlib_modules()`. These added lists will be generated during the Python build process and saved in `_sysconfigdata-*.py` file along with other `sysconfig`_ values. +APIs to list expected standard library modules +---------------------------------------------- -.. _`sysconfig`: https://docs.python.org/3.7/library/sysconfig.html +To allow for easier identification of which module names are *expected* to be +resolved in the standard library, the `sysconfig`_ module will be extended +with two additional functions: -The default implementation of the `sys.excepthook`_ function will then be modified to dispense an appropriate message when it detects a failure to import a module identified by one of the two new `sysconfig`_ functions as belonging to the Python standard library. +* ``sysconfig.get_stdlib_modules()``, which will provide a list of the names of + all top level Python standard library modules (including private modules) +* ``sysconfig.get_optional_modules()``, which will list optional public top level + standard library module names -.. _`sys.excepthook`: https://docs.python.org/3.7/library/sys.html#sys.excepthook +The results of ``sysconfig.get_optional_modules()``and the existing +``sys.builtin_module_names`` will both be subsets of the full list provided by +the new ``sysconfig.get_stdlib_modules()`` function. + +These added lists will be generated during the Python build process and saved in +the ``_sysconfigdata-*.py`` file along with other `sysconfig`_ values. + +Possible reasons for modules being in the "optional" list will be: + +* the module relies on an optional build dependency (e.g. ``_sqlite3``, + ``tkinter``, ``idlelib``) +* the module is private for other reasons and hence may not be present on all + implementations (e.g. ``_freeze_importlib``, ``_collections_abc``) +* the module is platform specific and hence may not be present in all + installations (e.g. ``winreg``) +* the ``test`` package may also be freely omitted from Python runtime + installations, as it is intended for use in testing Python implementations, + not as a runtime library for Python projects to use (the public API offering + testing utilities is ``unittest``) + +(Note: the ``ensurepip``, ``venv``, and ``distutils`` modules are all considered +mandatory modules in this PEP, even though not all redistributors currently +adhere to that practice) + +.. _`sysconfig`: https://docs.python.org/3/library/sysconfig.html -Rationale -========= +Changes to the default ``sys.excepthook`` implementation +-------------------------------------------------------- -The `sys.excepthook`_ function gets called when a raised exception is uncaught and the program is about to exit or—in an interactive session—the control is being returned to the prompt. This makes it a perfect place for customized error messages, as it will not influence caught errors and thus not slow down normal execution of Python scripts. +The default implementation of the `sys.excepthook`_ function will then be +modified to dispense an appropriate message when it detects a failure to +import a module identified by one of the two new `sysconfig`_ functions as +belonging to the Python standard library. -The inclusion of the functions `sysconfig.get_stdlib_modules()` and `sysconfig.get_optional_modules()` will also provide a long sought-after way of easily listing the names of Python standard library modules [#stackoverflow-stdlib]_, which will—among other benefits—make it easier for code analysis, profiling, and error reporting tools to offer runtime `--ignore-stdlib` flags. +.. _`sys.excepthook`: https://docs.python.org/3/library/sys.html#sys.excepthook -Ideas leading up to this PEP were discussed on the `python-dev mailing list`_ and subsequently on `python-ideas`_. +Revised error message for a module that relies on an optional build dependency +or is otherwise considered optional when Python is installed:: -.. _`python-dev mailing list`: - https://mail.python.org/pipermail/python-dev/2016-July/145534.html -.. _`python-ideas`: - https://mail.python.org/pipermail/python-ideas/2016-December/043907.html + $ ./python -c "import sqlite3" + Traceback (most recent call last): + File "", line 1, in + File "/home/ncoghlan/devel/cpython/Lib/sqlite3/__init__.py", line 23, in + from sqlite3.dbapi2 import * + File "/home/ncoghlan/devel/cpython/Lib/sqlite3/dbapi2.py", line 27, in + from _sqlite3 import * + ModuleNotFoundError: Optional standard library module '_sqlite3' was not found + +Revised error message for a submodule of an optional top level package when the +entire top level package is missing:: + + $ ./python -c "import test.regrtest" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: Optional standard library module 'test' was not found + +Revised error message for a submodule of an optional top level package when the +top level package is present:: + + $ ./python -c "import test.regrtest" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: No submodule named 'test.regrtest' in optional standard library module 'test' + +Revised error message for a module that is always expected to be available:: + + $ ./python -c "import ensurepip" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: Standard library module 'ensurepip' was not found + +Revised error message for a missing submodule of a standard library package when +the top level package is present:: + + $ ./python -c "import encodings.mbcs" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: No submodule named 'encodings.mbcs' in standard library module 'encodings' + +These revised error messages make it clear that the missing modules are expected +to be available from the standard library, but are not available for some reason, +rather than being an indicator of a missing third party dependency in the current +environment. + + +Design Discussion +================= + +Modifying ``sys.excepthook`` +---------------------------- + +The `sys.excepthook`_ function gets called when a raised exception is uncaught +and the program is about to exit or (in an interactive session) the control is +being returned to the prompt. This makes it a perfect place for customized +error messages, as it will not influence caught errors and thus not slow down +normal execution of Python scripts. + + +Public API to query expected standard library module names +---------------------------------------------------------- + +The inclusion of the functions ``sysconfig.get_stdlib_modules()`` and +``sysconfig.get_optional_modules()`` will provide a long sought-after +way of easily listing the names of Python standard library modules +[#stackoverflow-stdlib]_, which will (among other benefits) make it easier for +code analysis, profiling, and error reporting tools to offer runtime +``--ignore-stdlib`` flags. + + +Only including top level module names +------------------------------------- + +This PEP proposes that only top level module and package names be reported by +the new query APIs. This is sufficient information to generate the proposed +error messages, reduces the number of required entries by an order of magnitude, +and simplifies the process of generating the related metadata during the build +process. + +If this is eventually found to be overly limiting, a new ``include_submodules`` +flag could be added to the query APIs. However, this is *not* part of the initial +proposal, as the benefits of doing so aren't currently seen as justifying the +extra complexity. + +There is one known consequence of this restriction, which is that the new +default ``excepthook`` implementation will report incorrect submodules names the +same way that it reports genuinely missing standard library submodules:: + + $ ./python -c "import unittest.muck" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: No submodule named 'unittest.muck' in standard library module 'unittest' + + +Listing private top level module names as optional standard library modules +--------------------------------------------------------------------------- + +Many of the modules that have an optional external build dependency are written +as hybrid modules, where there is a shared Python wrapper around an +implementation dependent interface to the underlying external library. In other +cases, a private top level module may simply be a CPython implementation detail, +and other implementations may not provide that module at all. + +To report import errors involving these modules appropriately, the new default +``excepthook`` implementation needs them to be reported by the new query APIs. + + +Deeming packaging related modules to be mandatory +------------------------------------------------- + +Some redistributors aren't entirely keen on installing the Python specific +packaging related modules (``distutils``, ``ensurepip``, ``venv``) by default, +preferring that developers use their platform specific tooling instead. + +This approach causes interoperability problems for developers working on +cross-platform projects and educators attempting to write platform independent +setup instructions, so this PEP takes the view that these modules should be +considered mandatory, and left out of the list of optional modules. + + +Deferred Ideas +============== + +The ideas in this section are concepts that this PEP would potentially help +enable, but they're considered out of scope for the initial proposal. + +Platform dependent modules +-------------------------- + +Some standard library modules may be missing because they're only provided on +particular platforms. For example, the ``winreg`` module is only available on +Windows:: + + $ python3 -c "import winreg" + Traceback (most recent call last): + File "", line 1, in + ModuleNotFoundError: No module named 'winreg' + +In the current proposal, these platform dependent modules will simply be +included with all the other optional modules rather than attempting to expose +the platform dependency information in a more structured way. + +However, the platform dependence is at least tracked at the level of "Windows", +"Unix", "Linux", and "FreeBSD" for the benefit of `the documentation`_, so it +seems plausible that it could potentially be exposed programmatically as well. + +.. _the documentation: https://docs.python.org/3/py-modindex.html + + +Emitting a warning when ``__main__`` shadows a standard library module +---------------------------------------------------------------------- + +Given the new query APIs, the new default ``excepthook`` implementation could +potentially detect when ``__main__.__file__`` or ``__main__.__spec__.name`` +match a standard library module, and emit a suitable warning. + +However, actually doing anything along this lines should review more cases where +uses actually encounter this problem, and the various options for potentially +offering more information to assist in debugging the situation, rather than +needing to be incorporated right now. Recommendation for Downstream Distributors ========================================== -By patching `site.py`_ [*]_ to provide their own implementation of the `sys.excepthook`_ function, Python distributors can display tailor-made error messages for any uncaught exceptions, including informing the user of a proper, distro-specific way to install missing standard library modules upon encountering a `ModuleNotFoundError`_. Some downstream distributors are already using this method of patching `sys.excepthook` to integrate with platform crash reporting mechanisms. +By patching `site.py`_ [*]_ to provide their own implementation of the +`sys.excepthook`_ function, Python distributors can display tailor-made +error messages for any uncaught exceptions, including informing the user of +a proper, distro-specific way to install missing standard library modules upon +encountering a `ModuleNotFoundError`_. + +Some downstream distributors are already using this method of patching +`sys.excepthook` to integrate with platform crash reporting mechanisms. .. _`site.py`: https://docs.python.org/3.7/library/site.html .. _`sitecustomize.py`: `site.py`_ -An `example implementation`_ is attached to this PEP. - -.. _`example implementation`: `Reference and Example Implementation`_ - Backwards Compatibility ======================= @@ -111,12 +335,8 @@ Reference and Example Implementation ==================================== TBD. The finer details will depend on what's practical given the capabilities -of the build system. - -.. Reference implementation can be found on `GitHub`_ and is also accessible in the form of a `patch`_. - -.. _`GitHub`: https://github.com/torsava/cpython/pull/1 -.. _`patch`: https://github.com/torsava/cpython/pull/1.patch +of the CPython build system (other implementations should then be able to use +the generated CPython data, rather than having to regenerate it themselves). Notes and References @@ -130,6 +350,15 @@ Notes and References http://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules +Ideas leading up to this PEP were discussed on the `python-dev mailing list`_ +and subsequently on `python-ideas`_. + +.. _`python-dev mailing list`: + https://mail.python.org/pipermail/python-dev/2016-July/145534.html +.. _`python-ideas`: + https://mail.python.org/pipermail/python-ideas/2016-December/043907.html + + Copyright =========