PEP 582: Update the PEP with all feedback (#2999)
Updates the PEP with all feedback. This also adds a fuller explanation for interpreter and tool developers.
parent
6adb907aaf
commit
c5efbaca06
275
pep-0582.rst
|
@ -16,94 +16,104 @@ Python-Version: 3.12
|
|||
Abstract
|
||||
========
|
||||
|
||||
This PEP proposes to add to Python a mechanism to automatically recognize a
|
||||
``__pypackages__`` directory and prefer importing packages installed in this
|
||||
location over user or global site-packages. This will avoid the steps to create,
|
||||
activate or deactivate "virtual environments". Python will use the
|
||||
``__pypackages__`` from the base directory of the script when present.
|
||||
This PEP proposes extending the existing mechanism for setting up ``sys.path``
|
||||
to include a new ``__pypackages__`` directory, in addition to the existing
|
||||
locations. The new directory will be added at the start of ``sys.path``, after
|
||||
the current working directory and just before the system site-packages, to give
|
||||
packages installed there priority over other locations.
|
||||
|
||||
This is similar to the existing mechanism of adding the current directory (or
|
||||
the directory the script is located in), but by using a subdirectory,
|
||||
additional libraries are kept separate from the user's work.
|
||||
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
Python virtual environments have become an essential part of development and
|
||||
teaching workflow in the community, but at the same time, they create a barrier
|
||||
to entry for many. The following are a few of the issues people run into while
|
||||
being introduced to Python (or programming for the first time).
|
||||
New Python programmers can benefit from being taught the value of isolating an
|
||||
individual project's dependencies from their system environment. However, the
|
||||
existing mechanism for doing this, virtual environments, is known to be complex
|
||||
and error-prone for beginners to understand. Explaining virtual environments is
|
||||
often a distraction when trying to get a group of beginners set up - differences
|
||||
in platform and shell environments require individual assistance, and the need
|
||||
for activation in every new shell session makes it easy for students to make
|
||||
mistakes when coming back to work after a break. This proposal offers a lightweight
|
||||
solution that gives isolation without the user needing to understand more
|
||||
advanced concepts.
|
||||
|
||||
- How virtual environments work is a lot of information for anyone new. It takes
|
||||
a lot of extra time and effort to explain them.
|
||||
|
||||
- Different platforms and shell environments require different sets of commands
|
||||
to activate the virtual environments. Any workshop or teaching environment with
|
||||
people coming with different operating systems installed on their laptops creates a
|
||||
lot of confusion among the participants.
|
||||
|
||||
- Virtual environments need to be activated on each opened terminal. If someone
|
||||
creates/opens a new terminal, that by default does not get the same environment
|
||||
as in a previous terminal with virtual environment activated.
|
||||
Furthermore, standalone Python applications usually need 3rd party libraries to
|
||||
function. Typically, they are either designed to be run from a virtual environment,
|
||||
where the dependencies are installed into the environment alongside the application,
|
||||
or they bundle their dependencies in a subdirectory, and modify ``sys.path`` at
|
||||
application startup. Virtual environments, while a common and effective solution
|
||||
(used, for example, by the ``pipx`` tool), are somewhat awkward to set up and manage,
|
||||
and are not relocatable. On the other hand, manual manipulation of ``sys.path`` is
|
||||
boilerplate that developers need to get right, and (being a runtime behaviour)
|
||||
it is not understood by tools like linters and type checkers. The ``__pypackages__``
|
||||
proposal formalises the idea of a "bundled dependencies" location, avoiding the
|
||||
boilerplate and providing a standard location that development tools can be taught
|
||||
to recognise.
|
||||
|
||||
It should be noted that in general, Python libraries cannot be simply copied
|
||||
between machines, platforms, or even necessarily between Python versions. This
|
||||
proposal does nothing to change that fact, and while it is tempting to assume
|
||||
that bundling a script and its ``__pypackages__`` is a mechanism for
|
||||
distributing applications, this is explicitly *not* a goal of this proposal.
|
||||
Developers remain responsible for the portability of their code.
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
Python is a beginner-friendly programming language, but so far virtual environments
have been the most time-consuming part of the learning process for newcomers. This PEP
does not try to solve every packaging problem; it focuses on the 90% of newcomers who
struggle with virtual environments on their learning path. Creating a new directory
is still far easier than learning the details of virtual environments on various
platforms.
|
||||
While ``sys.path`` can be manipulated at runtime, the default value is important, as
|
||||
it establishes a common baseline that users and tools can agree on. The current default
|
||||
does not include a location that could be viewed as "private to the current project",
|
||||
and yet that is a useful concept.
|
||||
|
||||
This is similar to the npm ``node_modules`` directory, which is popular in the
|
||||
Javascript community, and something that developers familiar with that
|
||||
ecosystem often ask for from Python.
|
||||
|
||||
A major point for this PEP is that it is not trying to replace virtual environments.
|
||||
If one needs all the features of virtual environments, they should use proper virtual
|
||||
environments (for example, created using the :mod:`venv` module).
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
When the Python binary is executed, it attempts to determine its prefix (as
|
||||
stored in ``sys.prefix``), which is then used to find the standard library and
|
||||
other key files, and by the ``site`` module to determine the location of the
|
||||
``site-packages`` directories. Currently the prefix is found -- assuming
|
||||
``PYTHONHOME`` is not set -- by first walking up the filesystem tree looking for
|
||||
a marker file (``os.py``) that signifies the presence of the standard library,
|
||||
and if none is found, falling back to the build-time prefix hard coded in the
|
||||
binary. The result of this process is the contents of ``sys.path`` - a list of
|
||||
locations that the Python import system will search for modules.
|
||||
|
||||
This PEP proposes to add a new step in this process. If a ``__pypackages__``
|
||||
directory is found in the current working directory, then it will be included in
|
||||
``sys.path`` after the current working directory and just before the system
|
||||
site-packages. This way, if the Python executable starts in the given project
|
||||
directory, it will automatically find all the dependencies inside of
|
||||
``__pypackages__``.
|
||||
This PEP proposes to add a new step in the process of calculating ``sys.path`` at
|
||||
startup.
|
||||
|
||||
In case of Python scripts, Python will try to find ``__pypackages__`` in the
|
||||
same directory as the script. If found (along with the current Python version
|
||||
directory inside), then it will be used, otherwise Python will behave as it does
|
||||
currently.
|
||||
When the interactive interpreter starts, if a ``__pypackages__`` directory is
|
||||
found in the current working directory, then it will be included in
|
||||
``sys.path`` after the entry for current working directory and just before the
|
||||
system site-packages.
|
||||
|
||||
If any package management tool finds the same ``__pypackages__`` directory in
|
||||
the current working directory, it will install any packages there and also
|
||||
create it if required based on Python version.
|
||||
When the interpreter runs a script, Python will try to find ``__pypackages__``
|
||||
in the same directory as the script. If found (along with the current Python
|
||||
version directory inside), then it will be used, otherwise Python will behave
|
||||
as it does currently.
|
||||
|
||||
Projects that use a source management system can include a ``__pypackages__``
|
||||
directory (empty or with e.g. a file like ``.gitignore``). After doing a fresh
|
||||
checkout of the source code, a tool like ``pip`` can be used to install the
|
||||
required dependencies directly into this directory.
|
||||
The behaviour should work exactly the same as the way the existing mechanism
|
||||
for adding the current working directory or script directory to ``sys.path``
|
||||
works. For example, ``__pypackages__`` will be ignored if the ``-P`` option or
|
||||
the ``PYTHONSAFEPATH`` environment variable is set.
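The selection rules described above can be sketched roughly as follows. This is an illustration only, not the reference implementation: the function name ``find_pypackages_base`` and its parameters are hypothetical, and the ``safe_path`` argument stands in for the ``-P`` option or ``PYTHONSAFEPATH``.

```python
import os

PYPACKAGES = "__pypackages__"


def find_pypackages_base(script=None, cwd=None, safe_path=False):
    """Return the candidate __pypackages__ directory, or None.

    script: path of the script being run, or None for the interactive
            interpreter (hypothetical parameter, for illustration).
    cwd: directory to treat as the current working directory.
    safe_path: True when -P or PYTHONSAFEPATH is in effect.
    """
    if safe_path:
        # Same rule as for the cwd/script-directory entry: nothing is added.
        return None
    if script is not None:
        # Use the directory of the real script, not of a symlink to it.
        start = os.path.dirname(os.path.realpath(script))
    else:
        start = os.path.abspath(cwd if cwd is not None else os.getcwd())
    candidate = os.path.join(start, PYPACKAGES)
    # Only this one directory is checked; parent directories are not scanned.
    return candidate if os.path.isdir(candidate) else None
```

When a script is given, the current working directory plays no part in the lookup, which matches the way the script directory is handled today.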
|
||||
|
||||
But, this does not enable all features of virtual environments in a similar
|
||||
fashion. For example, if the project has multiple scripts, or helper scripts
|
||||
in a different directory to build the project, a normal virtual environment
|
||||
should be preferred over ``__pypackages__``.
|
||||
In order to be recognised, the ``__pypackages__`` directory must be laid out
|
||||
according to the "prefix" scheme in the sysconfig module. Specifically, either
|
||||
or both of the ``purelib`` and ``platlib`` directories must be present, using
|
||||
the following code to determine the locations of those directories::
|
||||
|
||||
    import sysconfig

    scheme = sysconfig.get_preferred_scheme("prefix")
    # Both base variables point at the __pypackages__ directory itself.
    variables = {"base": "__pypackages__", "platbase": "__pypackages__"}
    purelib = sysconfig.get_path("purelib", scheme, vars=variables)
    platlib = sysconfig.get_path("platlib", scheme, vars=variables)
|
||||
|
||||
These two locations will be added to ``sys.path``; other directories or files in the ``__pypackages__`` directory will be silently ignored.
|
||||
|
||||
|
||||
Example
|
||||
-------
|
||||
|
||||
The following shows an example project directory structure, and different ways
|
||||
the Python executable and any script will behave.
|
||||
the Python executable and any script will behave. The example is for Unix-like
|
||||
systems - on Windows the subdirectories will be different.
|
||||
|
||||
::
|
||||
|
||||
|
@ -138,9 +148,7 @@ We have a project directory called ``foo`` and it has a ``__pypackages__``
|
|||
inside of it. We have ``bottle`` installed in that
|
||||
``__pypackages__/lib/python3.10/site-packages/``, and have a ``myscript.py``
|
||||
file inside of the project directory. We have used whatever tool we generally
|
||||
use to install ``bottle`` in that location. This actual internal path will
|
||||
depend on the Python implementation name, as mentioned in the
|
||||
``sysconfig._INSTALL_SCHEMES['posix_prefix']`` dictionary.
|
||||
use to install ``bottle`` in that location.
|
||||
|
||||
For invoking a script, Python will try to find a ``__pypackages__`` inside of
|
||||
the directory that the script resides [1]_, ``/usr/bin``. The same will happen
|
||||
|
@ -174,33 +182,141 @@ use ``python3`` without any activation step, etc.
|
|||
resides, not the symlink pointing to the script
|
||||
|
||||
|
||||
Relationship to virtual environments
|
||||
====================================
|
||||
|
||||
At its heart, this proposal is simply to modify the calculation of the default
|
||||
value of ``sys.path``, and does not relate at all to the virtual environment
|
||||
mechanism. However, ``__pypackages__`` can be viewed as providing an isolation
|
||||
capability, and in that sense, it "competes" with virtual environments.
|
||||
|
||||
However, there are significant differences:
|
||||
|
||||
* Virtual environments are isolated from the system environment, whereas
|
||||
``__pypackages__`` simply adds to the system environment.
|
||||
* Virtual environments include a full "installation scheme", with directories
|
||||
for binaries, C header files, etc., whereas ``__pypackages__`` is solely
|
||||
for Python library code.
|
||||
* Virtual environments work most smoothly when "activated". This proposal
|
||||
needs no activation.
|
||||
|
||||
This proposal should be seen as independent of virtual environments, not competing
|
||||
with them. At best, some use cases currently only served by virtual environments
|
||||
can also be served (possibly better) by ``__pypackages__``.
|
||||
|
||||
It should be noted that libraries installed in ``__pypackages__`` will be visible
|
||||
in a virtual environment. This arguably breaks the isolation of virtual environments,
|
||||
but it is no different in principle to the presence of the current directory on
|
||||
``sys.path`` (or mechanisms like the ``PYTHONPATH`` environment variable). The only
|
||||
difference is in degree, as the expectation is that people will more commonly install
|
||||
packages in ``__pypackages__``. The alternative would be to explicitly detect virtual
|
||||
environments and disable ``__pypackages__`` in that case - however that would break
|
||||
scripts with bundled dependencies. The PEP authors believe that developers using
|
||||
virtual environments should be experienced enough to understand the issue and
|
||||
anticipate and avoid any problems.
|
||||
|
||||
Security Considerations
|
||||
=======================
|
||||
|
||||
While executing a Python script, it will not consider the ``__pypackages__`` in
|
||||
the current directory, instead if there is a ``__pypackages__`` directory in the
|
||||
same path of the script, that will be used.
|
||||
In theory, it is possible to add a library to the ``__pypackages__`` directory
|
||||
that overrides a stdlib module or an installed 3rd party library. For the
|
||||
``__pypackages__`` associated with a script, this is assumed not to be a
|
||||
significant issue, as it is unlikely that anyone would be able to write to
|
||||
``__pypackages__`` unless they also had the ability to write to the script itself.
|
||||
|
||||
For example, if we execute ``python /usr/share/myproject/fancy.py`` from the
|
||||
``/tmp`` directory and if there is a ``__pypackages__`` directory inside of
|
||||
``/usr/share/myproject/`` directory, it will be used. Any potential
|
||||
``__pypackages__`` directory in ``/tmp`` will be ignored.
|
||||
For a ``__pypackages__`` directory in the current working directory, the
|
||||
interactive interpreter could be affected. However, this is not significantly
|
||||
different than the existing issue of someone having a ``math.py`` module in their
|
||||
current directory, and while (just like that case) it can cause user confusion,
|
||||
it does not introduce any new security implications.
|
||||
|
||||
When running a script, any ``__pypackages__`` directory in the current working
|
||||
directory is ignored. This is the same approach Python uses for adding the
|
||||
current working directory to ``sys.path`` and ensures that it is not possible
|
||||
to change the behaviour of a script by modifying files in the current
|
||||
directory.
|
||||
|
||||
Also, a ``__pypackages__`` directory is only recognised in the current (or
|
||||
script) directory. The interpreter will *not* scan for ``__pypackages__`` in
|
||||
parent directories. Doing so would open up the risk of security issues if
|
||||
directory permissions on parents differ. In particular, scripts in the ``bin``
|
||||
directory of ``__pypackages__`` (the ``scripts`` location in ``sysconfig``
|
||||
terms) have no special access to the libraries installed in ``__pypackages__``.
|
||||
Putting executable scripts in a ``bin`` directory is not supported by this
|
||||
proposal.
|
||||
|
||||
How to Teach This
|
||||
=================
|
||||
|
||||
The original motivation for this proposal was to make it easier to teach Python
|
||||
to beginners. To that end, it needs to be easy to explain, and simple to use.
|
||||
|
||||
At the most basic level, this is similar to the existing mechanism where the
|
||||
script directory is added to ``sys.path`` and can be taught in a similar manner.
|
||||
However, for its intended use of "lightweight isolation", it would likely be taught
|
||||
in terms of "things you put in a ``__pypackages__`` directory are private to your
|
||||
script". The experience of the PEP authors suggests that this would be significantly
|
||||
easier to teach than the current alternative of introducing virtual environments.
|
||||
|
||||
|
||||
Impact on Tools
|
||||
===============
|
||||
|
||||
As the intended use of the feature is to install 3rd party libraries in the new
|
||||
directory, it is important that tools, particularly installers, understand how to
|
||||
manage ``__pypackages__``.
|
||||
|
||||
To minimise transition costs, the PEP proposes a layout for the
|
||||
``__pypackages__`` directory that is compatible with pip's ``--prefix`` option,
|
||||
in the most common cases, so that in the absence of any dedicated mechanism,
|
||||
``pip install --prefix __pypackages__`` should work. However, this is
|
||||
considered a transitional measure only, and there is no guarantee that it will
work in every case: in exceptional cases where a distributor has customised
things or pip has special-case handling, ``pip install --prefix`` might not work
and installation will need to be handled manually.
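For illustration, a wrapper script could build the transitional command like this. The helper name ``pip_prefix_command`` is hypothetical; only pip's existing ``--prefix`` option is assumed, and the caveats above about distributor customisations still apply.

```python
import sys


def pip_prefix_command(requirements, prefix="__pypackages__"):
    """Build (but do not run) a pip command that installs into prefix."""
    # Using "python -m pip" ties the install to the running interpreter.
    return [sys.executable, "-m", "pip", "install", "--prefix", prefix, *requirements]


command = pip_prefix_command(["bottle"])
```

The resulting list can be passed to ``subprocess.run(command, check=True)`` from the project directory.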
|
||||
|
||||
In the longer term, it is hoped that tools will introduce a dedicated
|
||||
"pypackages" installation mode that *is* guaranteed to match the expected
|
||||
layout in all cases, removing the need for interim approaches like
|
||||
``--prefix``. However, the question of how best to support the
|
||||
``__pypackages__`` layout is ultimately left to individual tool maintainers to
|
||||
consider and decide on.
|
||||
|
||||
Tools that locate packages without actually running Python code (IDEs, linters,
|
||||
type checkers, etc.) would need updating to recognise ``__pypackages__``. In the
|
||||
absence of such updates, the ``__pypackages__`` directory would work similarly
|
||||
to directories currently added to ``sys.path`` at runtime (i.e., the tool would
|
||||
probably ignore it).
|
||||
|
||||
This also means we will not scan any parent directory while executing scripts.
|
||||
If we want to execute scripts inside of the ``~/bin/`` directory, then
|
||||
the ``__pypackages__`` directory must be inside of the ``~/bin/`` directory.
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
This does not affect any older versions of Python.
|
||||
The directory name ``__pypackages__`` was chosen because it is unlikely to be in
|
||||
common use. It is true that users who have chosen to use that name for their own
|
||||
purposes will be impacted, but at the time this PEP was written, this was viewed
|
||||
as a relatively low risk.
|
||||
|
||||
Unfortunately, in the time this PEP has been under discussion, a number of tools
|
||||
have chosen to implement variations on what is being proposed here, which are not
|
||||
all compatible with the final form of the PEP. As a result, the risk of clashes is
|
||||
now higher than originally anticipated.
|
||||
|
||||
It would be possible to mitigate this by choosing a *different* name, hopefully as
|
||||
uncommon as ``__pypackages__`` originally was. But realistically, any compatibility
|
||||
issues can be viewed as simply the consequences of people trying to implement
|
||||
draft proposals, without making the effort to track changes in the proposal. As such,
|
||||
it seems reasonable to retain the ``__pypackages__`` name, and put the burden of
|
||||
addressing the compatibility issue on the tools that implemented the draft version.
|
||||
|
||||
|
||||
Impact on other Python implementations
|
||||
--------------------------------------
|
||||
|
||||
Other Python implementations will need to replicate the new behavior of the
|
||||
interpreter bootstrap, including locating the ``__pypackages__`` directory and
|
||||
adding it to ``sys.path`` just before site packages, if it is present.
|
||||
adding it to ``sys.path`` just before site packages, if it is present. This is
|
||||
no different to any other Python change.
|
||||
|
||||
|
||||
Reference Implementation
|
||||
|
@ -213,8 +329,13 @@ enable the implementation for ``Cpython`` & in ``PyPy``.
|
|||
Rejected Ideas
|
||||
==============
|
||||
|
||||
``__pylocal__`` and ``python_modules`` as directory name. We will also not
|
||||
reimplement all the features of virtual environments.
|
||||
* Alternative names, such as ``__pylocal__`` and ``python_modules``. Ultimately, the name is arbitrary and the chosen name is good enough.
|
||||
|
||||
* Additional features of virtual environments. This proposal is not a replacement for virtual environments, and such features are therefore out of scope.
|
||||
|
||||
* Raise an error if unexpected files or directories are present in ``__pypackages__``. This is considered too strict, particularly as transitional approaches like ``pip install --prefix`` can create additional files in ``__pypackages__``.
|
||||
|
||||
* Using a different ``sysconfig`` scheme, or a dedicated ``pypackages`` scheme. While this is attractive in theory, it makes transition harder, as there will be no readily-available way of installing to ``__pypackages__`` until tools implement explicit support. And while the PEP authors hope and assume that such support would be added, having the proposal dependent on such support in order to be usable seems like an unacceptable risk.
|
||||
|
||||
Copyright
|
||||
=========
|
||||