python-peps/pep-0648.rst

327 lines
14 KiB
ReStructuredText

PEP: 648
Title: Extensible customizations of the interpreter at startup
Author: Mario Corchero <mariocj89@gmail.com>
Sponsor: Pablo Galindo
BDFL-Delegate: XXXX
Discussions-To: https://discuss.python.org/t/pep-648-extensible-customizations-of-the-interpreter-at-startup/6403
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Dec-2020
Python-Version: 3.10
Post-History: python-ideas: 16th Dec. python-dev: 18th Dec.
Abstract
========
This PEP proposes supporting extensible customization of the interpreter by
allowing users to install scripts that will be executed at startup.
Motivation
==========
System administrators, tools that repackage the interpreter and some
libraries need to customize aspects of the interpreter at startup time.
This is usually achieved via ``sitecustomize.py`` for system administrators
whilst libraries rely on exploiting ``pth`` files. This PEP proposes a way of
achieving the same in a more user-friendly and structured way.
Limitations of ``pth`` files
----------------------------
If a library needs to perform any customization before an import or that
relates to the general working of the interpreter, they often rely on the
fact that ``pth`` files, which are loaded at startup and implemented via the
site module [#site]_, can include Python code that will be executed when the
``pth`` file is evaluated.
Note that ``pth`` files were originally developed to just add additional
directories to ``sys.path``, but they may also contain lines which start
with "import", which will be passed to ``exec()``. Users have exploited this
feature to allow the customizations that they needed. See setuptools
[#setuptools]_ or betterexceptions [#betterexceptions]_ as examples.
Using ``pth`` files for this purpose is far from ideal for library developers,
as they need to inject code into a single line preceded by an import, making
it rather unreadable. Library developers following that practice will usually
create a module that performs all actions on import, as done by
betterexceptions [#betterexceptions]_, but the approach is still not really
user friendly.
Additionally, it is also non-ideal for users of the interpreter if they want
to inspect what is being executed at Python startup as they need to review
all the ``pth`` files for potential code execution which can be spread across
all site paths. Most of those ``pth`` files will be "legitimate" ``pth``
files that just modify the path, answering the question of "what is changing
my interpreter at startup" a rather complex one.
Lastly, there have been multiple suggestions for removing code execution from
``pth`` files, see [#bpo-24534]_ and [#bpo-33944]_.
Limitations of ``sitecustomize.py``
-----------------------------------
Whilst sitecustomize is an acceptable solution, it assumes a single person is
in charge of the system and the interpreter. If both the system administrator
and the responsibility of provisioning the interpreter want to add
customizations at the interpreter startup they need to agree on the contents
of the file and combine all the changes. This is not a major limitation
though, and it is not the main driver of this change. Should the change
happen, it will also improve the situation for these users, as rather than
having a ``sitecustomize.py`` which performs all those actions, they can have
custom isolated files named after the features they want to enhance. As an
example, Ubuntu could change their current ``sitecustomize.py`` to just be
``ubuntu_apport_python_hook``. This not only better represents its intent but
also gives users of the interpreter a better understanding of the
modifications happening on their interpreter.
Benefits of ``__sitecustomize__``
---------------------------------
Having a structured way of injecting custom startup scripts will allow
supporting the cases presented above in a better way. It will result in both
maintainers and users having a better experience as detailed, and allow
CPython to deprecate and eventually remove code execution from ``pth`` files
as desired in the previously mentioned bpos. Additionally, these solutions
provide a unique way to support all use-cases that before have been fulfilled
via the misuse of ``pth`` files, ``sitecustomize.py`` and
``usercustomize.py``. The use of a ``__sitecustomize__`` will allow for
packages, tools and system admins to inject scripts that will be loaded at
startups through an easy-to-understand mechanism.
Rationale
=========
This PEP proposes supporting extensible customization of the interpreter at
startup by allowing users to install scripts into a folder named
``__sitecustomize__`` located in any site path. Those scripts will be executed
at startup time. The implementation will take advantage of the fact that all
site paths are already being walked to look for ``pth`` files to include also a
check for ``__sitecustomize__`` folders and execute all scripts within them.
The ``site`` module will expose an option on its main function that allows
listing all scripts that will be executed, which will allow users to quickly
see all customizations that affect an interpreter.
We will also work with packaging build backends to facilitate the installation
of these files.
Why ``__sitecustomize__``
-------------------------
The name aims to follow the already existing concept of ``sitecustomize.py``.
As the folder will be within ``sys.path``, given that it is located in site
paths, we choose to use double underscore around its name, to prevent
colliding with the already existing ``sitecustomize.py``.
Disabling start scripts
-----------------------
In some scenarios, like when the startup time is key, it might be desired to
disable this option altogether. Whilst we could add a new flag to do so, we
think that the already existing flag ``-S`` [#s-flag]_ is already good
enough, as it disables all ``site``-related manipulation. If the flag is
passed in, ``__sitecustomize__`` will not be used.
Order of execution
------------------
The scripts in ``__sitecustomize__`` will be executed in file name sorted order
after the evaluation of ``pth`` files. We considered executing them in random
order, but that could result in different results depending on how the
interpreter chooses to pick up those files. So even if it won't be a good
practice to rely on other files being executed, we think that is better than
having randomly different results on interpreter startup.
We chose to run the scripts after the ``pth`` files in case a user needs to
add items to the path before running a script.
Note the execution happens after the handling of ``pth`` files for each of the
site paths and that the ``__sitecustomize__`` folder need to be in site paths
and not in just any importable path.
Impact on startup time
----------------------
If an interpreter is not using this mechanism, the impact on performance is
expected to be minimal as this PEP just adds a check for
``__sitecustomize__`` when ``site.py`` is walking the site paths looking for
``pth`` files. This impact will be reduced in the future as we will remove
two other imports: "sitecustomize.py" and "usercustomize.py".
If the user has custom scripts, we think that the impact on the performance
of walking each of the folders is acceptable, as the user wants to use this
feature. If they need to run a time-sensitive application, they can always
use ``-S`` to disable this entirely.
Running "./python -c pass" with perf on 50 iterations, repeating 50 times the
command on each and getting the geometric mean on a commodity laptop did not
reveal any substantial raise on CPU time.
Failure handling
----------------
Any error on any of the scripts will not be logged unless the interpreter is
run in verbose mode and it should not stop the evaluation of other scripts.
The user will just receive a message in stderr saying that the script failed to
be executed and that verbose mode can be used to get more information. This
behaviour follows the one already existing for ``sitecustomize.py``.
Scripts naming convention
-------------------------
Packages will be encouraged to include the name of the package within the name
of the script to avoid collisions between packages. The only requirement on the
filename is that it ends in ``.py`` for the interpreter to execute them.
Relationship with sitecustomize and usercustomize
-------------------------------------------------
The existing logic for ``sitecustomize.py`` and ``usercustomize.py`` will be left
as is, later deprecated and scheduled for removal. Once ``__sitecustomize__`` is
supported, it will provide better integration for all existing users, and even
if it will indeed require a migration for system administrators, we expect the
effort required to be minimal, it will just require moving and renaming the
current ``sitecustomize.py`` into the new provided folder.
Identifying all installed scripts
---------------------------------
To facilitate debugging of the Python startup, a new option will be added to
the main of the site module to list all scripts that will be executed as part
of the ``__sitecustomize__`` initialization.
How to teach this
=================
This can be documented and taught as simple as saying that the interpreter
will try to look for the ``__sitecustomize__`` folder at startup in its site
paths and if it finds any scripts with ``.py`` extension, it will then
execute it one by one.
For system administrators and tools that package the interpreter, we can now
recommend placing files in ``__sitecustomize__`` as they used to place
``sitecustomize.py``. Being more comfortable on that their content won't be
overridden by the next person, as they can provide with specific files to
handle the logic they want to customize.
Library developers should be able to specify a new argument on tools like
setuptools that will inject those new files. Something like
``sitecustomize_scripts=["scripts/betterexceptions.py"]``, which allows them to
add those. Should the build backend not support that, they can manually
install them as they used to do with ``pth`` files. We will recommend them to
include the name of the package as part of the script's name.
Backward compatibility
======================
We propose to add support for ``__sitecustomize__`` in the next release of
Python, add a warning on the three next releases on the deprecation and
future removal of ``sitecustomize.py``, ``usercustomize.py`` and code execution
in ``pth`` files, and remove it after maintainers have had 4 releases to
migrate. Ignoring those lines in pth files.
Whilst the existing ``sitecutzomize.py`` mechanism was created targeting
System Administrators that placed it in a site path, the file could be
actually placed anywhere in the path at the time that the interpreter was
starting up. The new mechanism does not allow for users to place
``__sitecustomize__`` folders anywhere in the path, but only in site paths.
System administrators can recover a similar behavior to ``sitecustomize.py``
if they need it by adding a custom script in ``__sitecustomize__`` which just
imports ``sitecustomize`` as a migration path.
Reference Implementation
========================
An initial implementation that passes the CPython test suite is available for
evaluation [#reference-implementation]_.
This implementation is just for the reviewer to play with and check potential
issues that this PEP could generate.
Rejected Ideas
==============
Do nothing
----------
Whilst the current status "works" it presents the issues listed in the
motivation. After analysing the impact of this change, we believe it is worth
given the enhanced experience it brings.
Formalize using ``pth`` files
-----------------------------
Another option would be to just glorify and document the usage of ``pth`` files
to inject code at startup code, but that is a suboptimal experience for users
as listed in the motivation.
Making ``__sitecustomize__`` a namespace package
------------------------------------------------
We considered making the folder a namespace package and just import all the
modules within it, which allowed searching across all paths in ``sys.path``
at initialization time and provided a way to declare dependencies between
scripts by importing each other. This was rejected for multiple reasons:
1. This was unnecessarily broadening the list of paths where arbitrary scripts
are executed.
2. The logic brought additional complexity, like what to do if a package were
to install an ``__init__.py`` file in one of the locations.
3. It's cheaper to search for ``__sitecustomize__`` as we are looking for
``pth`` files already in the site paths compared to performing an actual
import of a namespace package.
Support for shutdown custom scripts
-----------------------------------
``init.d`` users might be tempted to implement this feature in a way that users
could also add code at shutdown, but extra support for that is not needed, as
Python users can already do that via ``atexit``.
Using entry_points
------------------
We considered extending the use of entry points to allow specifying scripts
that should be executed at startup but we discarded that solution due to two
main reasons. The first one being impact on startup time. This approach will
require scanning all packages distribution information to just execute a
handful of files. This has an impact on performance even if the user is not
using the feature and such impact growths linearly with the number of packages
installed in the environment. The second reason was that the proposed
implementation in this PEP offers a single solution for startup customization
for packages and system administrators. Additionally, if the main objective of
entry points is to make it easy for libraries to install scripts at startup,
that can still be added and make the build backends just install the files
within the ``__sitecustomize__`` folder.
Copyright
=========
This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.
References
==========
.. [#bpo-24534]
https://bugs.python.org/issue24534
.. [#bpo-33944]
https://bugs.python.org/issue33944
.. [#s-flag]
https://docs.python.org/3/using/cmdline.html#id3
.. [#setuptools]
https://github.com/pypa/setuptools/blob/b6bbe236ed0689f50b5148f1172510b975687e62/setup.py#L100
.. [#betterexceptions]
https://github.com/Qix-/better-exceptions/blob/7b417527757d555faedc354c86d3b6fe449200c2/better_exceptions_hook.pth#L1
.. [#reference-implementation]
https://github.com/mariocj89/cpython/tree/pu/__sitecustomize__
.. [#site]
https://docs.python.org/3/library/site.html