PEP 534: Extensive rewrite after a mailing list discussion (#195)

torsava 2017-02-01 15:06:41 +01:00 committed by Nick Coghlan
parent ccae12b0ba
commit 35ddc0c081
1 changed files with 42 additions and 62 deletions

@ -2,8 +2,9 @@ PEP: 534
Title: Distributing a Subset of the Standard Library
Version: $Revision$
Last-Modified: $Date$
Author: Tomáš Orsava <torsava@redhat.com>,
Petr Viktorin <encukou@gmail.com>
Author: Tomáš Orsava <tomas.n@orsava.cz>,
Petr Viktorin <encukou@gmail.com>,
Nick Coghlan <ncoghlan@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
@ -15,34 +16,23 @@ Post-History:
Abstract
========
Python is sometimes being distributed without its full standard library.
However, there is as of yet no standardized way of dealing with importing a
missing standard library module. This PEP proposes a mechanism for identifying
which standard library modules are missing and puts forth a method of how
attempts to import a missing standard library module should be handled.
Python is often built or distributed without its full standard library.
However, there is as yet no standard, user-friendly way of informing the user that an import failed because such a *stdlib* module is missing. This PEP proposes a mechanism for identifying standard library modules and informing the user appropriately.
Motivation
==========
There are several use cases for including only a subset of Python's standard
library. However, there is so far no formal specification of how to properly
implement distribution of a subset of the standard library. Namely, how to
safely handle attempts to import a missing *stdlib* module, and display an
informative error message.
library. However, there is so far no user-friendly mechanism for informing the user *why* a stdlib module is missing and how to remedy the situation appropriately.
CPython
-------
When one of Python's standard library modules (such as ``_sqlite3``) cannot be
compiled during a Python build because of missing dependencies (e.g. SQLite
header files), the module is simply skipped.
If you then install this compiled Python and use it to try to import one of the
missing modules, Python will go through the ``sys.path`` entries looking for
it. It won't find it among the *stdlib* modules and thus it will continue onto
``site-packages`` and fail with a ModuleNotFoundError_ if it doesn't find it.
header files), the module is simply skipped. If you then install this compiled Python and use it to try to import one of the
missing modules, Python will fail with a ModuleNotFoundError_.
.. _ModuleNotFoundError:
https://docs.python.org/3.7/library/exceptions.html#ModuleNotFoundError
@ -72,57 +62,41 @@ omitted modules, potentially leaving users baffled as to where to find them.
Specification
=============
When, for any reason, a standard library module is not to be included with the
rest, a file with its name and the extension ``.missing.py`` shall be created
and placed in the directory the module itself would have occupied. This file
can contain any Python code, however, it *should* raise a ModuleNotFoundError_
with a helpful error message.
The `sysconfig`_ module will be extended by two functions: ``sysconfig.get_stdlib_modules()``, which will provide a list of the names of all Python standard library modules, and ``sysconfig.get_optional_modules()``, which will list the names of optional *stdlib* modules. The results of ``sysconfig.get_optional_modules()`` and the contents of the existing ``sys.builtin_module_names`` will both be subsets of the full list provided by ``sysconfig.get_stdlib_modules()``. These added lists will be generated during the Python build process and saved in the ``_sysconfigdata-*.py`` file along with other `sysconfig`_ values.
Currently, when Python tries to import a module ``XYZ``, the ``FileFinder``
path hook goes through the entries in ``sys.path``, and in each location looks
for a file whose name is ``XYZ`` with one of the valid suffixes (e.g. ``.so``,
..., ``.py``, ..., ``.pyc``). The suffixes are tried in order. If none of
them are found, Python goes on to try the next directory in ``sys.path``.
.. _`sysconfig`: https://docs.python.org/3.7/library/sysconfig.html
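The subset relationship described above can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins for the proposed ``sysconfig`` additions, which do not exist in ``sysconfig`` today, and the hard-coded module names are illustrative assumptions rather than generated build data.

```python
import sys


# Hypothetical stand-in for the proposed sysconfig.get_stdlib_modules().
# A real implementation would read a list generated at build time.
def get_stdlib_modules():
    illustrative = {"os", "json", "sqlite3", "_sqlite3", "tkinter", "_tkinter"}
    return sorted(set(sys.builtin_module_names) | illustrative)


# Hypothetical stand-in for the proposed sysconfig.get_optional_modules().
def get_optional_modules():
    return ["_sqlite3", "_tkinter"]


stdlib = set(get_stdlib_modules())

# Both the optional modules and the built-in modules are expected to be
# subsets of the full standard library list.
optional_is_subset = set(get_optional_modules()) <= stdlib
builtin_is_subset = set(sys.builtin_module_names) <= stdlib
```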
The ``.missing.py`` extension will be added to the end of the list, and
configured to be handled by ``SourceFileLoader``. Thus, if a module is not
found in its proper location, the ``XYZ.missing.py`` file is found and
executed, and further locations are not searched.
The default implementation of the `sys.excepthook`_ function will then be modified to display an appropriate message when it detects a failure to import a module that one of the two new `sysconfig`_ functions identifies as belonging to the Python standard library.
The CPython build system will be modified to generate ``.missing.py`` files for
optional modules that were not built.
.. _`sys.excepthook`: https://docs.python.org/3.7/library/sys.html#sys.excepthook
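A rough sketch of such a hook follows. It is not the reference implementation: ``STDLIB_MODULES`` is a hypothetical placeholder for the list the proposed ``sysconfig`` functions would supply, and the wording of the message is an assumption. The sketch relies only on the existing ``name`` attribute of ``ModuleNotFoundError`` and on ``sys.__excepthook__``.

```python
import sys

# Hypothetical placeholder for the names that the proposed sysconfig
# functions would report; the contents are illustrative only.
STDLIB_MODULES = frozenset({"sqlite3", "_sqlite3", "tkinter", "_tkinter"})


def missing_stdlib_message(exc):
    """Return a hint if exc is a failed import of a stdlib module, else None."""
    if isinstance(exc, ModuleNotFoundError) and exc.name in STDLIB_MODULES:
        return (
            f"Standard library module {exc.name!r} is not available "
            "in this Python installation."
        )
    return None


def excepthook(exc_type, exc_value, exc_traceback):
    # Print the usual traceback first, then append the hint if one applies.
    sys.__excepthook__(exc_type, exc_value, exc_traceback)
    message = missing_stdlib_message(exc_value)
    if message is not None:
        print(message, file=sys.stderr)


def install():
    """Replace the interpreter's hook for uncaught exceptions."""
    sys.excepthook = excepthook
```

Because the hook only runs for uncaught exceptions, code that catches ``ImportError`` itself is unaffected.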
Rationale
=========
The mechanism of handling missing standard library modules through the use of
the ``.missing.py`` files was chosen due to its advantages both for CPython
itself and for Linux and other distributions that are packaging it.
The `sys.excepthook`_ function is called when a raised exception is uncaught and the program is about to exit or, in an interactive session, control is about to return to the prompt. This makes it a suitable place for customized error messages, as it does not affect caught exceptions and thus does not slow down normal execution of Python scripts.
The missing pieces of the standard library can be subsequently installed simply
by putting the module files in their appropriate location. They will then take
precedence over the corresponding ``.missing.py`` files. This makes
installation simple for Linux package managers.
The inclusion of the functions ``sysconfig.get_stdlib_modules()`` and ``sysconfig.get_optional_modules()`` will also provide a long-sought way of easily listing the names of Python standard library modules [#stackoverflow-stdlib]_. Among other benefits, this will make it easier for code analysis, profiling, and error reporting tools to offer runtime ``--ignore-stdlib`` flags.
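As a sketch of how a tool might use such a list to back an ``--ignore-stdlib`` flag, assume the standard library names are passed in as a plain list. The function name ``filter_stdlib`` and the sample data are hypothetical and used only for illustration.

```python
def filter_stdlib(module_names, stdlib_modules, ignore_stdlib=True):
    """Drop standard library modules from a report when ignore_stdlib is set."""
    if not ignore_stdlib:
        return list(module_names)
    stdlib = set(stdlib_modules)
    # Compare on the top-level package so that "os.path" matches "os".
    return [
        name for name in module_names
        if name.partition(".")[0] not in stdlib
    ]


# Hypothetical sample data standing in for a tool's report and for the
# list that sysconfig.get_stdlib_modules() is proposed to return.
stdlib_modules = ["os", "json", "sqlite3"]
profiled = ["os.path", "requests", "json", "myapp.core"]
remaining = filter_stdlib(profiled, stdlib_modules)
```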
This mechanism also solves the minor issue of importing a module from
``site-packages`` with the same name as a missing standard library module.
Now, Python will import the ``.missing.py`` file and won't ever look for a
*stdlib* module in ``site-packages``.
In addition, this method of handling missing *stdlib* modules can be
implemented in a succinct, non-intrusive way in CPython, and thus won't add to
the complexity of the existing code base.
The ``.missing.py`` file can be customized by the packager to provide any
desirable behaviour. While we strongly recommend that these files only raise a
ModuleNotFoundError_ with an appropriate message, there is no reason to limit
customization options.
Ideas leading up to this PEP were discussed on the `python-dev mailing list`_.
Ideas leading up to this PEP were discussed on the `python-dev mailing list`_ and subsequently on `python-ideas`_.
.. _`python-dev mailing list`:
https://mail.python.org/pipermail/python-dev/2016-July/145534.html
.. _`python-ideas`:
https://mail.python.org/pipermail/python-ideas/2016-December/043907.html
Recommendation for Downstream Distributors
==========================================
By patching `site.py`_ [*]_ to provide their own implementation of the `sys.excepthook`_ function, Python distributors can display tailor-made error messages for any uncaught exceptions. In particular, upon encountering a `ModuleNotFoundError`_ they can inform the user of the proper, distro-specific way to install missing standard library modules. Some downstream distributors already patch ``sys.excepthook`` in this way to integrate with platform crash reporting mechanisms.
.. _`site.py`: https://docs.python.org/3.7/library/site.html
.. _`sitecustomize.py`: `site.py`_
An `example implementation`_ is attached to this PEP.
.. _`example implementation`: `Reference and Example Implementation`_
Backwards Compatibility
@ -133,21 +107,27 @@ already patching Python modules to provide custom handling of missing
dependencies can continue to do so unhindered.
Reference Implementation
========================
Reference and Example Implementation
====================================
Reference implementation can be found on `GitHub`_ and is also accessible in
the form of a `patch`_.
TBD. The finer details will depend on what's practical given the capabilities
of the build system.
.. Reference implementation can be found on `GitHub`_ and is also accessible in the form of a `patch`_.
.. _`GitHub`: https://github.com/torsava/cpython/pull/1
.. _`patch`: https://github.com/torsava/cpython/pull/1.patch
References
==========
Notes and References
====================
.. [*] Or `sitecustomize.py`_ for organizations with their own custom
Python variant.
.. [#debian-patch]
http://bazaar.launchpad.net/~doko/python/pkg3.5-debian/view/head:/patches/tkinter-import.diff
.. [#stackoverflow-stdlib]
http://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules
Copyright