497 lines
22 KiB
ReStructuredText
497 lines
22 KiB
ReStructuredText
PEP: 648
|
|
Title: Extensible customizations of the interpreter at startup
|
|
Author: Mario Corchero <mariocj89@gmail.com>
|
|
Sponsor: Pablo Galindo
|
|
Discussions-To: https://discuss.python.org/t/pep-648-extensible-customizations-of-the-interpreter-at-startup/6403
|
|
Status: Rejected
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 30-Dec-2020
|
|
Python-Version: 3.11
|
|
Post-History: 16-Dec-2020, 18-Dec-2020
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes supporting extensible customization of the interpreter by
|
|
allowing users to install files that will be executed at startup.
|
|
|
|
PEP Rejection
|
|
=============
|
|
|
|
PEP 648 was rejected `by the steering council
|
|
<https://mail.python.org/archives/list/python-dev@python.org/message/UHODNOGISLYUKX2K2JCCBYMZFEWZDSPO/>`__
|
|
as it has a limited number of use cases and further complicates the startup sequence.
|
|
|
|
Motivation
|
|
==========
|
|
|
|
System administrators, tools that repackage the interpreter and some
|
|
libraries need to customize aspects of the interpreter at startup time.
|
|
|
|
This is usually achieved via ``sitecustomize.py`` for system administrators
|
|
whilst libraries rely on exploiting ``pth`` files. This PEP proposes a way of
|
|
achieving the same functionality in a more user-friendly and structured way.
|
|
|
|
Limitations of ``pth`` files
|
|
----------------------------
|
|
|
|
If a library needs to perform any customization before an import or that
|
|
relates to the general working of the interpreter, they often rely on the
|
|
fact that ``pth`` files, which are loaded at startup and implemented via the
|
|
site module [#site]_, can include Python code that will be executed when the
|
|
``pth`` file is evaluated.
|
|
|
|
Note that ``pth`` files were originally developed to just add additional
|
|
directories to ``sys.path``, but they may also contain lines which start
|
|
with "import", which will be passed to ``exec()``. Users have exploited this
|
|
feature to allow the customizations that they needed. See setuptools
|
|
[#setuptools]_ or betterexceptions [#betterexceptions]_ as examples.
|
|
|
|
Using ``pth`` files for this purpose is far from ideal for library developers,
|
|
as they need to inject code into a single line preceded by an import, making
|
|
it rather unreadable. Library developers following that practice will usually
|
|
create a module that performs all actions on import, as done by
|
|
betterexceptions [#betterexceptions]_, but the approach is still not really
|
|
user friendly.
|
|
|
|
Additionally, it is also non-ideal for users of the interpreter if they want
|
|
to inspect what is being executed at Python startup as they need to review
|
|
all the ``pth`` files for potential code execution which can be spread across
|
|
all site paths. Most of those ``pth`` files will be "legitimate" ``pth``
|
|
files that just modify the path, answering the question of "what is changing
|
|
my interpreter at startup" a rather complex one.
|
|
|
|
Lastly, there have been multiple suggestions for removing code execution from
|
|
``pth`` files, see [#bpo-24534]_ and [#bpo-33944]_.
|
|
|
|
Limitations of ``sitecustomize.py``
|
|
-----------------------------------
|
|
|
|
Whilst sitecustomize is an acceptable solution, it assumes a single person is
|
|
in charge of the system and the interpreter. If both the system administrator
|
|
and the responsibility of provisioning the interpreter want to add
|
|
customizations at the interpreter startup they need to agree on the contents
|
|
of the file and combine all the changes. This is not a major limitation
|
|
though, and it is not the main driver of this change. Should the change
|
|
happen, it will also improve the situation for these users, as rather than
|
|
having a ``sitecustomize.py`` which performs all those actions, they can have
|
|
custom isolated files named after the features they want to enhance. As an
|
|
example, Ubuntu could change their current ``sitecustomize.py`` to just be
|
|
``ubuntu_apport_python_hook``. This not only better represents its intent but
|
|
also gives users of the interpreter a better understanding of the
|
|
modifications happening on their interpreter.
|
|
|
|
Rationale
|
|
=========
|
|
|
|
This PEP proposes supporting extensible customization of the interpreter at
|
|
startup by executing all files discovered in directories named
|
|
``__sitecustomize__`` in sitepackages [#sitepackages-api]_ or
|
|
usersitepackages [#usersitepackages-api]_ at startup time.
|
|
|
|
Why ``__sitecustomize__``
|
|
-------------------------
|
|
|
|
The name aims to follow the already existing concept of ``sitecustomize.py``.
|
|
As the directory will be within ``sys.path``, given that it is located in
|
|
site paths, we choose to use double underscore around its name, to prevent
|
|
colliding with the already existing ``sitecustomize.py``.
|
|
|
|
Discovering the new ``__sitecustomize__`` directories
|
|
-----------------------------------------------------
|
|
|
|
The Python interpreter will look at startup for directory named
|
|
``__sitecustomize__`` within any of the standard site-packages path.
|
|
|
|
These are commonly the Python system location and the user location, but are
|
|
ultimately defined by the site module logic.
|
|
|
|
Users can use ``site.sitepackages`` [#sitepackages-api]_ and
|
|
``site.usersitepackages`` [#usersitepackages-api]_ to know the paths where
|
|
the interpreter can discover ``__sitecustomize__`` directories.
|
|
|
|
Time of ``__sitecustomize__`` discovery
|
|
---------------------------------------
|
|
|
|
The ``__sitecustomize__`` directories will be discovered exactly after ``pth``
|
|
files are discovered in a site-packages path as part of ``site.addsitedir``
|
|
[#siteaddsitedir]_.
|
|
|
|
These is repeated for each of the site-packages path in the exact same order
|
|
that is being followed today for ``pth`` files.
|
|
|
|
Order of execution within ``__sitecustomize__``
|
|
-----------------------------------------------
|
|
|
|
The implementation will execute the files within ``__sitecustomize__`` by
|
|
sorting them by name when discovering each of the ``__sitecustomize__``
|
|
directories. We discourage users to rely on the order of execution though.
|
|
|
|
We considered executing them in random order, but that could result in
|
|
different results depending on how the interpreter chooses to pick up those
|
|
files. So even if it won't be a good practice to rely on other files being
|
|
executed, we think that is better than having randomly different results on
|
|
interpreter startup. We chose to run the files after the ``pth`` files in
|
|
case a user needs to add items to the path before running a files.
|
|
|
|
Interaction with ``pth`` files
|
|
------------------------------
|
|
|
|
``pth`` files can be used to add paths into ``sys.path``, but this should not
|
|
affect the ``__sitecustomize__`` discovery process, as those directories are
|
|
looked up exclusively in site-packages paths.
|
|
|
|
Execution of files within ``__sitecustomize__``
|
|
-----------------------------------------------
|
|
|
|
When a ``__sitecustomize__`` directory is discovered, all of the files that
|
|
have a ``.py`` extension within it will be read with ``io.open_code`` and
|
|
executed by using ``exec`` [#exec]_.
|
|
|
|
An empty dictionary will be passed as ``globals`` to the ``exec`` function
|
|
to prevent unexpected interactions between different files.
|
|
|
|
Failure handling
|
|
----------------
|
|
|
|
Any error on the execution of any of the files will not be logged unless the
|
|
interpreter is run in verbose mode and it should not stop the evaluation of
|
|
other files. The user will receive a message in stderr saying that the file
|
|
failed to be executed and that verbose mode can be used to get more
|
|
information. This behaviour mimics the one existing for ``sitecustomize.py``.
|
|
|
|
Interaction with virtual environments
|
|
-------------------------------------
|
|
|
|
The customizations applied to an interpreter via the new
|
|
``__sitecustomize__`` solutions will continue to work when a user creates a
|
|
virtual environment the same way that ``sitecustomize.py``
|
|
interact with virtual environments.
|
|
|
|
This is a difference when compared to ``pth`` files, which are not propagated
|
|
into virtual environments unless ``include-system-site-packages`` is enabled.
|
|
|
|
If library maintainers have features installed via ``__sitecustomize__`` that
|
|
they do not want to propagate into virtual environments, they should detect
|
|
if they are running within a virtual environment by checking ``sys.prefix ==
|
|
sys.base_prefix``. This behavior is similar to packages that modify the global
|
|
``sitecustomize.py``.
|
|
|
|
Interaction with ``sitecustomize.py`` and ``usercustomize.py``
|
|
--------------------------------------------------------------
|
|
|
|
Until removed, ``sitecustomize`` and ``usercustomize`` will be executed after
|
|
``__sitecustomize__`` similar to pth files. See the Backward compatibility
|
|
section for information on removal plans for ``sitecustomize`` and
|
|
``usercustomize``.
|
|
|
|
Identifying all installed files
|
|
-------------------------------
|
|
|
|
To facilitate debugging of the Python startup, if the site module is invoked
|
|
it will print the ``__sitecustomize__`` directories that will be discovered
|
|
on startup.
|
|
|
|
Files naming convention
|
|
-----------------------
|
|
|
|
Packages will be encouraged to include the name of the package within the
|
|
name of the file to avoid collisions between packages. But the only
|
|
requirement on the filename is that it ends in ``.py`` for the interpreter to
|
|
execute them.
|
|
|
|
Disabling start files
|
|
---------------------
|
|
|
|
In some scenarios, like when the startup time is key, it might be desired to
|
|
disable this option altogether. The already existing flag ``-S`` [#s-flag]_
|
|
will disable all ``site``-related manipulation, including this new feature.
|
|
If the flag is passed in, ``__sitecustomize__`` directories will not be
|
|
discovered.
|
|
|
|
Additionally, to allow for starting the interpreter disabling only this new
|
|
feature a new option will be added under ``-X``: ``disablesitecustomize``,
|
|
which will disable the discovery of ``__sitecustomize__`` exclusively.
|
|
|
|
Lastly, the user can disable the discovery of ``__sitecustomize__``
|
|
directories only in the user site by disabling the user site via any of the
|
|
multiple options in the ``site.py`` module.
|
|
|
|
Support in build backends
|
|
-------------------------
|
|
|
|
Whilst build backends can choose to provide an option to facilitate the
|
|
installation of these files into a ``__sitecustomize__`` directory, this
|
|
PEP does not address that directly. Similar to ``pth`` files, build backends
|
|
can choose to not provide an easy-to-configure mechanism for
|
|
``__sitecustomize__`` files and let users hook into the installation
|
|
process to include such files. We do not think build backends enhanced
|
|
support as a requirement for this PEP.
|
|
|
|
Impact on startup time
|
|
----------------------
|
|
|
|
A concern in this implementation is how Python interpreter startup time can
|
|
be affected by this addition. We expect the performance impact to be highly
|
|
coupled to the logic in the files that a user or sysadmin installs in the
|
|
Python environment being tested.
|
|
|
|
If the interpreter has any files in their ``__sitecustomize__`` directory,
|
|
the file execution time plus a call reading the code will be added to the
|
|
startup time. This is similar to how code execution is impacting startup time
|
|
through ``sitecustomize.py``, ``usercustomize.py`` and code in ``pth`` files.
|
|
We will therefore focus here on comparing this solution against those three,
|
|
as otherwise the actual time added to startup is highly dependent on the code
|
|
that is being executed in those files.
|
|
|
|
Results were gathered by running "./python.exe -c pass" with perf on 50
|
|
iterations, repeating 50 times the command on each iteration and getting the
|
|
geometric mean of all the results. The file used to run those benchmarks is
|
|
checked in in the reference implementation [#reference-implementation]_.
|
|
|
|
The benchmark was run with 3.10 alpha 7 compiled with PGO and LTO with the
|
|
following parameters and system state:
|
|
|
|
- Perf event: Max sample rate set to 1 per second
|
|
- CPU Frequency: Minimum frequency of CPU 17,35 set to the maximum frequency
|
|
- Turbo Boost (MSR): Turbo Boost disabled on CPU 17: MSR 0x1a0 set to 0x4000850089
|
|
- IRQ affinity: Set default affinity to CPU 0-16,18-34
|
|
- IRQ affinity: Set affinity of IRQ 1,3-16,21,25-31,56-59,68-85,87,89-90,92-93,95-104 to CPU 0-16,18-34
|
|
- CPU: use 2 logical CPUs: 17,35
|
|
- Perf event: Maximum sample rate: 1 per second
|
|
- ASLR: Full randomization
|
|
- Linux scheduler: Isolated CPUs (2/36): 17,35
|
|
- Linux scheduler: RCU disabled on CPUs (2/36): 17,35
|
|
- CPU Frequency: 0-16,18-34=min=1200 MHz, max=3600 MHz; 17,35=min=max=3600 MHz
|
|
- Turbo Boost (MSR): CPU 17,35: disabled
|
|
|
|
The code placed to be executed in ``pth`` files, ``sitecustomize.py``,
|
|
``usercustomize.py`` and files within ``__sitecustomize__`` is the following:
|
|
|
|
import time; x = time.time() ** 5
|
|
|
|
The file is aimed at execution a simple operation but still expected to be
|
|
negligible. This is to put the experiment in a situation where we make
|
|
visible any hit on performance due to the mechanism whilst still making it
|
|
relatively realistic. Additionally, it starts with an import and is a single
|
|
line to be able to be used in ``pth`` files.
|
|
|
|
==== ==================== ==================== ======= ===================== ====== =====
|
|
Test # of files Time (us)
|
|
---- -------------------------------------------------------------------------- -------------
|
|
# ``sitecustomize.py`` ``usercustomize.py`` ``pth`` ``__sitecustomize__`` Run 1 Run 2
|
|
==== ==================== ==================== ======= ===================== ====== =====
|
|
1 0 0 0 Dir not created 13884 13897
|
|
2 0 0 0 0 13871 13818
|
|
3 0 0 1 0 13964 13924
|
|
4 0 0 0 1 13940 13939
|
|
5 1 1 0 0 13990 13993
|
|
6 0 0 0 2 (system + user) 14063 14040
|
|
7 0 0 50 0 16011 16014
|
|
8 0 0 0 50 15456 15448
|
|
==== ==================== ==================== ======= ===================== ====== =====
|
|
|
|
Results can be reproduced with ``run-benchmark.py`` script provided in the
|
|
reference implementation [#reference-implementation]_.
|
|
|
|
We interpret the following from these results:
|
|
|
|
- Using two ``__sitecustomize__`` scripts compared to ``sitecustomize.py``
|
|
and ``usercustomize.py`` slows down the interpreter by 0.3%. We expect this
|
|
slowdown until ``sitecustomize.py`` and ``usercustomize.py`` are removed in
|
|
a future release as even if the user does not create the files, the
|
|
interpreter will still attempt to import them.
|
|
- With the arbitrary 50 pth files with code tested, moving those to
|
|
``__sitecustomize__`` produces a speedup of ~3.5% in startup. Which is likely
|
|
related to the simpler logic to evaluate ``__sitecustomize__`` files compared
|
|
to ``pth`` file execution.
|
|
- In general all measurements show that there is a low impact on startup time
|
|
with this addition.
|
|
|
|
Audit Event
|
|
-----------
|
|
|
|
A new audit event will be added and triggered on ``__sitecustomize__``
|
|
execution to facilitate security inspection by calling ``sys.audit``
|
|
[#sysaudit]_ with "sitecustimze.exec_file" as name and the filename as
|
|
argument.
|
|
|
|
|
|
Security implications
|
|
---------------------
|
|
|
|
This PEP aims to move all code execution from ``pth`` files to files within a
|
|
``__sitecustomize__`` directory. We think this is an improvement to system admins
|
|
for the following reasons:
|
|
|
|
* Allows to quickly identify the code being executed at startup time by the
|
|
interpreter by looking into a single directory rather than having to scan
|
|
all ``pth`` files.
|
|
|
|
* Allows to track usage of this feature through the new proposed audit event.
|
|
|
|
* Gives finer grain control by allowing to tune permissions on the
|
|
``__sitecustomize__`` directory, potentially allowing users to install only
|
|
packages that does not change the interpreter startup.
|
|
|
|
In short, whilst this allows for a malicious users to drop a file that will
|
|
be executed at startup, it's an improvement compared to the existing ``pth``
|
|
files.
|
|
|
|
How to teach this
|
|
=================
|
|
|
|
This can be documented and taught as simple as saying that the interpreter
|
|
will try to look for the ``__sitecustomize__`` directory at startup in its
|
|
site paths and if it finds any files with ``.py`` extension, it will then
|
|
execute it one by one.
|
|
|
|
For system administrators and tools that package the interpreter, we can now
|
|
recommend placing files in ``__sitecustomize__`` as they used to place
|
|
``sitecustomize.py``. Being more comfortable on that their content won't be
|
|
overridden by the next person, as they can provide with specific files to
|
|
handle the logic they want to customize.
|
|
|
|
Library developers should be able to specify a new argument on tools like
|
|
setuptools that will inject those new files. Something like
|
|
``sitecustomize_files=["scripts/betterexceptions.py"]``, which allows them to
|
|
add those. Should the build backend not support that, they can manually
|
|
install them as they used to do with ``pth`` files. We will recommend them to
|
|
include the name of the package as part of the file's name.
|
|
|
|
Backward compatibility
|
|
======================
|
|
|
|
This PEP adds a deprecation warning on ``sitecustomize.py``,
|
|
``usercustomize.py`` and ``pth`` code execution in 3.11, 3.12 and 3.13. With
|
|
plans on removing those features by 3.14. The migration from those solutions
|
|
to ``__sitecustomize__`` should ideally be just moving the logic into a
|
|
different file.
|
|
|
|
Whilst the existing ``sitecustomize.py`` mechanism was created targeting
|
|
System Administrators that placed it in a site path, the file could be
|
|
actually placed anywhere in the path at the time that the interpreter was
|
|
starting up. The new mechanism does not allow for users to place
|
|
``__sitecustomize__`` directories anywhere in the path, but only in site
|
|
paths. System administrators can recover a similar behavior to
|
|
``sitecustomize.py`` by adding a custom file in ``__sitecustomize__`` which
|
|
just imports ``sitecustomize`` as a migration path.
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
An initial implementation that passes the CPython test suite is available for
|
|
evaluation [#reference-implementation]_.
|
|
|
|
This implementation is just for the reviewer to play with and check potential
|
|
issues that this PEP could generate.
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
Do nothing
|
|
----------
|
|
|
|
Whilst the current status "works" it presents the issues listed in the
|
|
motivation. After analyzing the impact of this change, we believe it is worth
|
|
it, given the enhanced experience it brings.
|
|
|
|
Formalize using ``pth`` files
|
|
-----------------------------
|
|
|
|
Another option would be to just glorify and document the usage of ``pth`` files
|
|
to inject code at startup code, but that is a suboptimal experience for users
|
|
as listed in the motivation.
|
|
|
|
Making ``__sitecustomize__`` a namespace package
|
|
------------------------------------------------
|
|
|
|
We considered making the directory a namespace package and just import all
|
|
the modules within it, which allowed searching across all paths in
|
|
``sys.path`` at initialization time and provided a way to declare
|
|
dependencies between files by importing each other. This was rejected for
|
|
multiple reasons:
|
|
|
|
1. This was unnecessarily broadening the list of paths where arbitrary files
|
|
are executed.
|
|
2. The logic brought additional complexity, like what to do if a package were
|
|
to install an ``__init__.py`` file in one of the locations.
|
|
3. It's cheaper to search for ``__sitecustomize__`` as we are looking for
|
|
``pth`` files already in the site paths compared to performing an actual
|
|
import of a namespace package.
|
|
|
|
Support for shutdown customization
|
|
----------------------------------
|
|
|
|
``init.d`` users might be tempted to implement this feature in a way that users
|
|
could also add code at shutdown, but extra support for that is not needed, as
|
|
Python users can already do that via ``atexit``.
|
|
|
|
Using entry_points
|
|
------------------
|
|
|
|
We considered extending the use of entry points to allow specifying files
|
|
that should be executed at startup but we discarded that solution due to two
|
|
main reasons. The first one being impact on startup time. This approach will
|
|
require scanning all packages distribution information to just execute a
|
|
handful of files. This has an impact on performance even if the user is not
|
|
using the feature and such impact growths linearly with the number of packages
|
|
installed in the environment. The second reason was that the proposed
|
|
implementation in this PEP offers a single solution for startup customization
|
|
for packages and system administrators. Additionally, if the main objective of
|
|
entry points is to make it easy for libraries to install files at startup,
|
|
that can still be added and make the build backends just install the files
|
|
within the ``__sitecustomize__`` directory.
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the CC0-1.0-Universal
|
|
license, whichever is more permissive.
|
|
|
|
Acknowledgements
|
|
================
|
|
|
|
Thanks Pablo Galindo for contributing to this PEP and offering his PC to run
|
|
the benchmark.
|
|
|
|
References
|
|
==========
|
|
|
|
.. [#bpo-24534]
|
|
https://bugs.python.org/issue24534
|
|
|
|
.. [#bpo-33944]
|
|
https://bugs.python.org/issue33944
|
|
|
|
.. [#s-flag]
|
|
https://docs.python.org/3/using/cmdline.html#id3
|
|
|
|
.. [#setuptools]
|
|
https://github.com/pypa/setuptools/blob/b6bbe236ed0689f50b5148f1172510b975687e62/setup.py#L100
|
|
|
|
.. [#betterexceptions]
|
|
https://github.com/Qix-/better-exceptions/blob/7b417527757d555faedc354c86d3b6fe449200c2/better_exceptions_hook.pth#L1
|
|
|
|
.. [#reference-implementation]
|
|
https://github.com/mariocj89/cpython/tree/pu/__sitecustomize__
|
|
|
|
.. [#site]
|
|
https://docs.python.org/3/library/site.html
|
|
|
|
.. [#sitepackages-api]
|
|
https://docs.python.org/3/library/site.html?highlight=site#site.getsitepackages
|
|
|
|
.. [#usersitepackages-api]
|
|
https://docs.python.org/3/library/site.html?highlight=site#site.getusersitepackages
|
|
|
|
.. [#siteaddsitedir]
|
|
https://github.com/python/cpython/blob/5787ba4a45492e232f5470c7d2e93763198e4b22/Lib/site.py#L207
|
|
|
|
.. [#exec]
|
|
https://docs.python.org/3/library/functions.html#exec
|
|
|
|
.. [#sysaudit]
|
|
https://docs.python.org/3/library/sys.html#sys.audit
|