PEP: 665
Title: A file format to list Python dependencies for reproducibility of an application
Author: Brett Cannon <brett@python.org>,
        Pradyun Gedam <pradyunsg@gmail.com>,
        Tzu-ping Chung <uranusjr@gmail.com>
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: https://discuss.python.org/t/9911
Status: Rejected
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 29-Jul-2021
Post-History: 29-Jul-2021, 03-Nov-2021, 25-Nov-2021
Resolution: https://discuss.python.org/t/pep-665-take-2-a-file-format-to-list-python-dependencies-for-reproducibility-of-an-application/11736/140
|
|
.. note::
   This PEP was rejected due to a lukewarm reception from the community,
   stemming from the lack of source distribution support.
|
|
|
|
========
|
|
Abstract
|
|
========
|
|
|
|
This PEP specifies a file format for listing the Python package
installation requirements of an application, and the relation between
the specified requirements. The list of requirements is considered
exhaustive for the installation target, and thus requires no
information beyond the platform being installed for and the file
itself. The file format is flexible enough to allow installing the
requirements across different platforms, which allows for
reproducibility on multiple platforms from the same file.
|
|
|
|
|
|
===========
|
|
Terminology
|
|
===========
|
|
|
|
There are several terms whose definition must be agreed upon in order
|
|
to facilitate a discussion on the topic of this PEP.
|
|
|
|
A *package* is something you install as a dependency and use via the
|
|
import system. The packages on PyPI are an example of this.
|
|
|
|
An *application* or *app* is an end product that other external code
does not directly rely on via the import system (i.e. it is
standalone). Desktop applications, command-line tools, etc. are
examples of applications.
|
|
|
|
A *lock file* records the packages that are to be installed for an
app. Traditionally, the exact version of the package to be installed
is specified by a lock file, but specified packages are not always
installed on a given platform (according to a filtering logic described
in a later section), which enables the lock file to describe
reproducibility across multiple platforms. Examples of this are
``package-lock.json`` from npm_, ``poetry.lock`` from Poetry_, etc.
|
|
|
|
*Locking* is the act of taking the input of the packages an app
|
|
depends on and producing a lock file from that.
|
|
|
|
A *locker* is a tool which produces a lock file.
|
|
|
|
An *installer* consumes a lock file to install what the lock file
|
|
specifies.
|
|
|
|
|
|
==========
|
|
Motivation
|
|
==========
|
|
|
|
Applications want reproducible installs for a few reasons (we are not
|
|
worrying about package development, integration into larger systems
|
|
that would handle locking dependencies external to the Python
|
|
application, or other situations where *flexible* installation
|
|
requirements are desired over strict, reproducible installations).
|
|
|
|
One, reproducibility eases development. When you and your fellow
|
|
developers all end up with the same files on a specific platform, you
|
|
make sure you are all developing towards the same experience for the
|
|
application. You also want your users to install the same files as
|
|
you expect to guarantee the experience is the same as you developed
|
|
for them.
|
|
|
|
Two, you want to be able to reproduce what gets installed across
|
|
multiple platforms. Thanks to Python's portability across operating
|
|
systems, CPUs, etc., it is very easy and often desirable to create
|
|
applications that are not restricted to a single platform. Thus, you
|
|
want to be flexible enough to allow for differences in your package
|
|
dependencies between platforms, while still having consistency
|
|
and reproducibility on any one specific platform.
|
|
|
|
Three, reproducibility is more secure. When you control exactly what
|
|
files are installed, you can make sure no malicious actor is
|
|
attempting to slip nefarious code into your application (i.e. some
|
|
supply chain attacks). By using a lock file which always leads to
|
|
reproducible installs, we can avoid certain risks entirely.
|
|
|
|
Four, relying on the `wheel file`_ format provides reproducibility
|
|
without requiring build tools to support reproducibility themselves.
|
|
Thanks to wheels being static and not executing code as part of
|
|
installation, wheels always lead to a reproducible result. Compare
|
|
this to source distributions (aka sdists) or source trees which only
|
|
lead to a reproducible install if their build tool supports
|
|
reproducibility due to inherent code execution. Unfortunately the vast
|
|
majority of build tools do not support reproducible builds, so this
|
|
PEP helps alleviate that issue by only supporting wheels as a package
|
|
format.
|
|
|
|
This PEP proposes a standard for a lock file, as the current solutions
|
|
don't meet the outlined goals. Today, the closest we come to a lock
|
|
file standard is the `requirements file format`_ from pip.
|
|
Unfortunately, that format does not lead to inherently reproducible
|
|
installs (it requires optional features both in the requirements file
|
|
and the installer itself, to be discussed later).
|
|
|
|
The community itself has also shown a need for lock files based on the
|
|
fact that multiple tools have independently created their own lock
|
|
file formats:
|
|
|
|
#. PDM_
|
|
#. `pip-tools`_
|
|
#. Pipenv_
|
|
#. Poetry_
|
|
#. Pyflow_
|
|
|
|
Unfortunately, those tools all use differing lock file formats. This
means tooling built around these tools must be specific to each format.
This impacts tooling such as code editors and hosting providers, which
want to be as flexible as possible when it comes to accepting a user's
application code, but also have a limit as to how many development
resources they can spend to add support for yet another lock file
format. A standardized format would allow tools to focus their work on
a single target, and make sure that workflow decisions made by
developers outside of the lock file format are of no concern to e.g.
hosting providers.
|
|
|
|
Other programming language communities have also shown the usefulness
|
|
of lock files by developing their own solution to this problem. Some
|
|
of those communities include:
|
|
|
|
#. Dart_
|
|
#. npm_/Node
|
|
#. Go
|
|
#. Rust_
|
|
|
|
The trend in programming languages in the past decade seems to have
|
|
been toward providing a lock file solution.
|
|
|
|
|
|
=========
|
|
Rationale
|
|
=========
|
|
|
|
-----------
|
|
File Format
|
|
-----------
|
|
|
|
We wanted the file format to be easy to read as a diff when auditing
|
|
a change to the lock file. As such, and thanks to :pep:`518` and
|
|
``pyproject.toml``, we decided to go with the TOML_ file format.
|
|
|
|
|
|
-----------------
|
|
Secure by Design
|
|
-----------------
|
|
|
|
Viewing the `requirements file format`_ as the closest we have to
a lock file standard, there are a few issues with the file format when
it comes to security. First is that the file format simply does not
require you to specify the exact version of a package. This is why
tools like `pip-tools`_ exist to help manage that for users of
requirements files.
|
|
|
|
Second, you must opt into specifying what files are acceptable to be
installed by using the ``--hash`` argument for a specific dependency.
This is also optional with pip-tools as it requires specifying the
``--generate-hashes`` CLI argument. Pip in turn needs
``--require-hashes`` to make sure no dependencies lack a hash to check.
|
|
|
|
|
|
Third, even when you control what files may be installed, it does not
|
|
prevent other packages from being installed. If a dependency is not
|
|
listed in the requirements file, pip will happily go searching for a
|
|
file to meet that need. You must specify ``--no-deps`` as an
|
|
argument to pip to prevent unintended dependency resolution outside
|
|
of the requirements file.
|
|
|
|
Fourth, the format allows for installing a
|
|
`source distribution file`_ (aka "sdist"). By its very nature,
|
|
installing an sdist requires executing arbitrary Python code, meaning
|
|
that there is no control over what files may be installed. Only by
|
|
specifying ``--only-binary :all:`` can you guarantee pip to only use a
|
|
`wheel file`_ for each package.
|
|
|
|
To recap, in order for a requirements file to be as secure as what is
|
|
being proposed, a user should always do the following steps:
|
|
|
|
#. Use pip-tools and its command ``pip-compile --generate-hashes``
#. Install the requirements file using
   ``pip install --require-hashes --no-deps --only-binary :all:``
|
|
|
|
Critically, all of those flags, and both the specificity and
|
|
exhaustion of what to install that pip-tools provides, are optional
|
|
for requirements files.
|
|
|
|
As such, the proposal raised in this PEP is secure by design, which
combats some supply chain attacks. Hashes for files which would be
used to install from are **required**. You can **only** install from
wheels to unambiguously define what files will be placed in the file
system. Installers **must** lead to a deterministic installation
from a lock file for a given platform. All of this leads to a
reproducible installation which you can deem trustworthy (when you
have audited the lock file and what it lists).
|
|
|
|
|
|
--------------
|
|
Cross-Platform
|
|
--------------
|
|
|
|
Various projects which already have a lock file, like PDM_ and
Poetry_, provide a lock file which is *cross-platform*. This allows
for a single lock file to work on multiple platforms while still
leading to the exact same top-level requirements being installed
everywhere, with the installation being consistent/unambiguous on
each platform.
|
|
|
|
As to why this is useful, let's use an example involving PyWeek_
|
|
(a week-long game development competition). Assume you are developing
|
|
on Linux, while someone you choose to partner with is using macOS.
|
|
Now assume the judges are using Windows. How do you make sure everyone
|
|
is using the same top-level dependencies, while allowing for any
|
|
platform-specific requirements (e.g. a package requires a helper
|
|
package under Windows)?
|
|
|
|
With a cross-platform lock file, you can make sure that the key
|
|
requirements are met consistently across all platforms. You can then
|
|
also make sure that all users on the same platform get the same
|
|
reproducible installation.
|
|
|
|
|
|
----------------
|
|
Simple Installer
|
|
----------------
|
|
|
|
The separation of concerns between a locker and an installer allows
for an installer to have a much simpler operation to perform. As
such, it not only allows for installers to be easier to write, but
also helps make sure installers create unambiguous, reproducible
installations correctly.
|
|
|
|
The installer can also expend less computation/energy in creating the
|
|
installation. This is beneficial not only for faster installs, but
|
|
also from an energy consumption perspective, as installers are
|
|
expected to be run more often than lockers.
|
|
|
|
This has led to a design where the locker must do more work upfront
to the benefit of installers. It also means the complexity of package
dependencies is simpler and easier to comprehend in a lock file to
avoid ambiguity.
|
|
|
|
|
|
=============
|
|
Specification
|
|
=============
|
|
|
|
-------
|
|
Details
|
|
-------
|
|
|
|
Lock files MUST use the TOML_ file format. This not only prevents the
|
|
need to have another file format in the Python packaging ecosystem
|
|
thanks to its adoption by :pep:`518` for ``pyproject.toml``, but also
|
|
assists in making lock files more human-readable.
|
|
|
|
Lock files MUST end their file names with ``.pylock.toml``. The
|
|
``.toml`` part unambiguously distinguishes the format of the file,
|
|
and helps tools like code editors support the file appropriately. The
|
|
``.pylock`` part distinguishes the file from other TOML files the user
|
|
has, to make the logic easier for tools to create functionality
|
|
specific to Python lock files, instead of TOML files in general.
|
|
|
|
The following sections are the top-level keys of the TOML file data
|
|
format. Any field not listed as **required** is considered optional.
|
|
|
|
|
|
``version``
|
|
===========
|
|
|
|
This field is **required**.
|
|
|
|
The version of the lock file being used. The value MUST be a string
consisting of a number that follows the same formatting as the
``Metadata-Version`` key in the `core metadata spec`_.
|
|
|
|
The value MUST be set to ``"1.0"`` until a future PEP allows for a
|
|
different value. The introduction of a new *optional* key to the file
|
|
format SHOULD increase the minor version. The introduction of a new
|
|
required key or changing the format MUST increase the major version.
|
|
How to handle other scenarios is left as a per-PEP decision.
|
|
|
|
Installers MUST warn the user if the lock file specifies a version
|
|
whose major version is supported but whose minor version is
|
|
unsupported/unrecognized (e.g. the installer supports ``"1.0"``, but
|
|
the lock file specifies ``"1.1"``).
|
|
|
|
Installers MUST raise an error if the lock file specifies a major
|
|
version which is unsupported (e.g. the installer supports ``"1.9"``
|
|
but the lock file specifies ``"2.0"``).
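
As a rough illustration of the two rules above, an installer's check of
the ``version`` value might look like the following sketch. The
supported-version constants, the exception class, and the function name
are assumptions made for this example and are not defined by this PEP.

.. code-block:: python

    import warnings

    # Assumption for this sketch: the installer understands version 1.0.
    SUPPORTED_MAJOR = 1
    SUPPORTED_MINOR = 0


    class UnsupportedLockFileVersion(Exception):
        """Hypothetical error for an unsupported major version."""


    def check_version(version: str) -> None:
        """Warn on a newer minor version; raise on another major version."""
        major_text, _, minor_text = version.partition(".")
        major, minor = int(major_text), int(minor_text or "0")
        if major != SUPPORTED_MAJOR:
            raise UnsupportedLockFileVersion(
                f"lock file version {version!r} is not supported"
            )
        if minor > SUPPORTED_MINOR:
            warnings.warn(
                f"lock file version {version!r} is newer than the supported "
                f"{SUPPORTED_MAJOR}.{SUPPORTED_MINOR}; some keys may be ignored"
            )

With these constants, ``check_version("1.1")`` emits a warning and
``check_version("2.0")`` raises.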
|
|
|
|
|
|
``created-at``
|
|
==============
|
|
|
|
This field is **required**.
|
|
|
|
The timestamp for when the lock file was generated (using TOML's
|
|
native timestamp type). It MUST be recorded using the UTC time zone to
|
|
avoid ambiguity.
|
|
|
|
If the SOURCE_DATE_EPOCH_ environment variable is set, it MUST be used
|
|
as the timestamp by the locker. This facilitates reproducibility of
|
|
the lock file itself.
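
A locker could derive the timestamp along the lines of the sketch
below. The function name and the injectable ``environ`` mapping are
assumptions for illustration; the sketch also assumes
``SOURCE_DATE_EPOCH`` holds an integer number of seconds since the Unix
epoch.

.. code-block:: python

    import os
    from datetime import datetime, timezone


    def lock_timestamp(environ=os.environ) -> datetime:
        """Return the UTC timestamp to record as ``created-at``."""
        epoch = environ.get("SOURCE_DATE_EPOCH")
        if epoch is not None:
            # Use the externally supplied time so repeated runs of the
            # locker can produce an identical lock file.
            return datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        return datetime.now(tz=timezone.utc)

Passing a mapping explicitly, e.g.
``lock_timestamp({"SOURCE_DATE_EPOCH": "0"})``, makes the behaviour easy
to exercise without touching the real environment.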
|
|
|
|
|
|
``[tool]``
|
|
==========
|
|
|
|
Tools may create their own sub-tables under the ``tool`` table. The
|
|
rules for this table match those for ``pyproject.toml`` and its
|
|
``[tool]`` table from the `build system declaration spec`_.
|
|
|
|
|
|
``[metadata]``
|
|
==============
|
|
|
|
This table is **required**.
|
|
|
|
A table containing data applying to the overall lock file.
|
|
|
|
|
|
``metadata.marker``
|
|
-------------------
|
|
|
|
A key storing a string containing an environment marker as
|
|
specified in the `dependency specifier spec`_.
|
|
|
|
The locker MAY specify an environment marker which specifies any
|
|
restrictions the lock file was generated under.
|
|
|
|
If the installer is installing for an environment which does not
|
|
satisfy the specified environment marker, the installer MUST raise an
|
|
error as the lock file does not support the target installation
|
|
environment.
|
|
|
|
|
|
``metadata.tag``
|
|
----------------
|
|
|
|
A key storing a string specifying `platform compatibility tags`_
|
|
(i.e. wheel tags). The tag MAY be a compressed tag set.
|
|
|
|
If the installer is installing for an environment which does not
|
|
satisfy the specified tag (set), the installer MUST raise an error
|
|
as the lock file does not support the targeted installation
|
|
environment.
|
|
|
|
|
|
``metadata.requires``
|
|
---------------------
|
|
|
|
This field is **required**.
|
|
|
|
An array of strings following the `dependency specifier spec`_. This
|
|
array represents the top-level package dependencies of the lock file
|
|
and thus the root of the dependency graph.
|
|
|
|
|
|
``metadata.requires-python``
|
|
----------------------------
|
|
|
|
A string specifying the supported version(s) of Python for this lock
|
|
file. It follows the same format as that specified for the
|
|
``Requires-Python`` field in the `core metadata spec`_.
|
|
|
|
|
|
``[[package._name_._version_]]``
|
|
================================
|
|
|
|
This array is **required**.
|
|
|
|
An array per package and version containing entries for the potential
|
|
(wheel) files to install (as represented by ``_name_`` and
|
|
``_version_``, respectively).
|
|
|
|
Lockers MUST normalize a project's name according to the
|
|
`simple repository API`_. If extras are specified as part of the
|
|
project to install, the extras are to be included in the key name and
|
|
are to be sorted in lexicographic order.
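
The sketch below shows one way a locker might build the key name. The
name normalization follows the rule from the simple repository API;
the function names, and the choice to join multiple extras with a comma
inside square brackets, are assumptions made for this example.

.. code-block:: python

    import re


    def normalize_name(name: str) -> str:
        """Normalize a project name per the simple repository API."""
        return re.sub(r"[-_.]+", "-", name).lower()


    def package_key(name: str, extras=()) -> str:
        """Build a table key such as ``coveragepy[toml]``.

        Assumption: multiple extras are comma-separated after sorting.
        """
        key = normalize_name(name)
        if extras:
            key += "[" + ",".join(sorted(extras)) + "]"
        return key

For instance, ``package_key("coveragepy", ["toml"])`` yields the key
used in the example lock file in this PEP.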
|
|
|
|
Within the file, the tables for the projects SHOULD be sorted by:
|
|
|
|
#. Project/key name in lexicographic order
#. Package version, newest/highest to older/lowest according to the
   `version specifiers spec`_
#. Optional dependencies (extras) via lexicographic order
#. File name based on the ``filename`` field (discussed
   below)
|
|
|
|
These recommendations are to help minimize diff changes between tool
|
|
executions.
|
|
|
|
|
|
``package._name_._version_.filename``
|
|
-------------------------------------
|
|
|
|
This field is **required**.
|
|
|
|
A string representing the base name of the file as represented by an
|
|
entry in the array (i.e. what
|
|
``os.path.basename()``/``pathlib.PurePath.name`` represents). This
|
|
field is required to simplify installers as the file name is required
|
|
to resolve wheel tags derived from the file name. It also guarantees
|
|
that the association of the array entry to the file it is meant for is
|
|
always clear.
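
To illustrate why the file name alone is enough to resolve wheel tags,
the following sketch expands the (possibly compressed) tag set of a
wheel file name using only the standard library. The function name is
an assumption for this example; a real installer would more likely rely
on the ``packaging`` project for this.

.. code-block:: python

    from itertools import product


    def wheel_tags(filename: str) -> set[str]:
        """Expand the tag set encoded in a wheel file name."""
        if not filename.endswith(".whl"):
            raise ValueError(f"not a wheel file name: {filename!r}")
        parts = filename[: -len(".whl")].split("-")
        # name-version[-build]-python-abi-platform
        if len(parts) not in (5, 6):
            raise ValueError(f"unexpected wheel file name: {filename!r}")
        python_tags, abi_tags, platform_tags = (
            part.split(".") for part in parts[-3:]
        )
        return {
            "-".join(combo)
            for combo in product(python_tags, abi_tags, platform_tags)
        }

Each expanded tag can then be compared against the tags supported by the
target environment.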
|
|
|
|
|
|
``[package._name_._version_.hashes]``
|
|
-------------------------------------
|
|
|
|
This table is **required**.
|
|
|
|
A table with keys specifying a hash algorithm and values as the hash
|
|
for the file represented by this entry in the
|
|
``package._name_._version_`` table.
|
|
|
|
Lockers SHOULD list hashes in lexicographic order. This is to help
|
|
minimize diff sizes and the potential to overlook hash value changes.
|
|
|
|
An installer MUST only install a file which matches one of the
|
|
specified hashes.
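
A minimal sketch of such a check, using the standard library's
``hashlib``, is shown below. The function name is an assumption, as is
the decision to skip algorithm names that ``hashlib`` cannot compute
rather than treating them as an error.

.. code-block:: python

    import hashlib


    def file_matches_hashes(data: bytes, hashes: dict[str, str]) -> bool:
        """Return True if ``data`` matches any of the recorded hashes."""
        for algorithm, expected in hashes.items():
            try:
                digest = hashlib.new(algorithm, data).hexdigest()
            except (ValueError, TypeError):
                # Unknown algorithm name, or one that needs extra
                # parameters; this sketch simply skips it.
                continue
            if digest == expected.lower():
                return True
        return False

An installer following this approach would refuse to install a file for
which the function returns ``False``.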
|
|
|
|
|
|
``package._name_._version_.url``
|
|
--------------------------------
|
|
|
|
A string representing a URL where to get the file.
|
|
|
|
The installer MAY support any schemes it wants for URLs. A URL with no
|
|
scheme MUST be assumed to be a local file path (both relative paths to
|
|
the lock file and absolute paths). Installers MUST support, at
|
|
minimum, HTTPS URLs as well as local file paths.
|
|
|
|
An installer MAY choose to not use the URL to retrieve a file
|
|
if a file matching the specified hash can be found using alternative
|
|
means (e.g. on the file system in a cache directory).
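
The following sketch shows how an installer might tell local paths from
HTTPS URLs. The function name and return shape are assumptions, and
POSIX-style paths are used for simplicity; note that a Windows drive
letter such as ``C:`` would be parsed as a URL scheme by this naive
approach and would need separate handling.

.. code-block:: python

    from pathlib import PurePosixPath
    from urllib.parse import urlsplit


    def classify_location(url: str, lock_file: str):
        """Return ``("path", path)`` or ``("https", url)`` for an entry."""
        scheme = urlsplit(url).scheme
        if scheme == "":
            # No scheme: a local path. Relative paths are resolved against
            # the directory of the lock file; absolute paths are kept.
            return ("path", PurePosixPath(lock_file).parent / url)
        if scheme == "https":
            return ("https", url)
        raise ValueError(f"unsupported URL scheme in this sketch: {scheme!r}")

Support for further schemes is left to the installer, as described
above.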
|
|
|
|
|
|
``package._name_._version_.direct``
|
|
-----------------------------------
|
|
|
|
A boolean representing whether an installer should consider the
|
|
project installed "directly" as specified by the
|
|
`direct URL origin of installed distributions spec`_.
|
|
|
|
If the key is true, then the installer MUST follow the
|
|
`direct URL origin of installed distributions spec`_ for recording
|
|
the installation as "direct".
|
|
|
|
|
|
``package._name_._version_.requires-python``
|
|
--------------------------------------------
|
|
|
|
A string specifying the supported version(s) of Python for this file. It
follows the same format as that specified for the
``Requires-Python`` field in the `core metadata spec`_.
|
|
|
|
|
|
``package._name_._version_.requires``
|
|
-------------------------------------
|
|
|
|
An array of strings following the `dependency specifier spec`_ which
|
|
represent the dependencies of this file.
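
To summarize which keys are marked **required** in the sections above,
the sketch below checks a lock file that has already been parsed into
nested dictionaries and lists (for example by a TOML parser). The
function name and the format of the reported paths are assumptions made
for this example.

.. code-block:: python

    def missing_required_keys(lock: dict) -> list[str]:
        """List the required keys that are absent from a parsed lock file."""
        missing = [
            key
            for key in ("version", "created-at", "metadata", "package")
            if key not in lock
        ]
        if "metadata" in lock and "requires" not in lock["metadata"]:
            missing.append("metadata.requires")
        for name, versions in lock.get("package", {}).items():
            for version, entries in versions.items():
                for index, entry in enumerate(entries):
                    for field in ("filename", "hashes"):
                        if field not in entry:
                            missing.append(
                                f"package.{name}.{version}[{index}].{field}"
                            )
        return missing

An empty list means every required key is present; it does not validate
the values themselves.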
|
|
|
|
|
|
-------
|
|
Example
|
|
-------
|
|
|
|
::

    version = "1.0"
    created-at = 2021-10-19T22:33:45.520739+00:00

    [tool]
    # Tool-specific table.

    [metadata]
    requires = ["mousebender", "coveragepy[toml]"]
    marker = "sys_platform == 'linux'"  # As an example for coverage.
    requires-python = ">=3.7"

    [[package.attrs."21.2.0"]]
    filename = "attrs-21.2.0-py2.py3-none-any.whl"
    hashes.sha256 = "149e90d6d8ac20db7a955ad60cf0e6881a3f20d37096140088356da6c716b0b1"
    url = "https://files.pythonhosted.org/packages/20/a9/ba6f1cd1a1517ff022b35acd6a7e4246371dfab08b8e42b829b6d07913cc/attrs-21.2.0-py2.py3-none-any.whl"
    requires-python = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"

    [[package.attrs."21.2.0"]]
    # If attrs had another wheel file (e.g. that was platform-specific),
    # it could be listed here.

    [[package."coveragepy[toml]"."6.2.0"]]
    filename = "coverage-6.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
    hashes.sha256 = "c7912d1526299cb04c88288e148c6c87c0df600eca76efd99d84396cfe00ef1d"
    url = "https://files.pythonhosted.org/packages/da/64/468ca923e837285bd0b0a60bd9a287945d6b68e325705b66b368c07518b1/coverage-6.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
    requires-python = ">=3.6"
    requires = ["tomli"]

    [[package."coveragepy[toml]"."6.2.0"]]
    filename = "coverage-6.2-cp310-cp310-musllinux_1_1_x86_64.whl"
    hashes.sha256 = "276651978c94a8c5672ea60a2656e95a3cce2a3f31e9fb2d5ebd4c215d095840"
    url = "https://files.pythonhosted.org/packages/17/d6/a29f2cccacf2315150c31d8685b4842a6e7609279939a478725219794355/coverage-6.2-cp310-cp310-musllinux_1_1_x86_64.whl"
    requires-python = ">=3.6"
    requires = ["tomli"]

    # More wheel files for `coverage` could be listed for more
    # extensive support (i.e. all Linux-based wheels).

    [[package.mousebender."2.0.0"]]
    filename = "mousebender-2.0.0-py3-none-any.whl"
    hashes.sha256 = "a6f9adfbd17bfb0e6bb5de9a27083e01dfb86ed9c3861e04143d9fd6db373f7c"
    url = "https://files.pythonhosted.org/packages/f4/b3/f6fdbff6395e9b77b5619160180489410fb2f42f41272994353e7ecf5bdf/mousebender-2.0.0-py3-none-any.whl"
    requires-python = ">=3.6"
    requires = ["attrs", "packaging"]

    [[package.packaging."20.9"]]
    filename = "packaging-20.9-py2.py3-none-any.whl"
    hashes.blake-256 = "3e897ea760b4daa42653ece2380531c90f64788d979110a2ab51049d92f408af"
    hashes.sha256 = "67714da7f7bc052e064859c05c595155bd1ee9f69f76557e21f051443c20947a"
    url = "https://files.pythonhosted.org/packages/3e/89/7ea760b4daa42653ece2380531c90f64788d979110a2ab51049d92f408af/packaging-20.9-py2.py3-none-any.whl"
    requires-python = ">=3.6"
    requires = ["pyparsing"]

    [[package.pyparsing."2.4.7"]]
    filename = "pyparsing-2.4.7-py2.py3-none-any.whl"
    hashes.sha256 = "ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b"
    url = "https://files.pythonhosted.org/packages/8a/bb/488841f56197b13700afd5658fc279a2025a39e22449b7cf29864669b15d/pyparsing-2.4.7-py2.py3-none-any.whl"
    direct = true  # For demonstration purposes.
    requires-python = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*"

    [[package.tomli."2.0.0"]]
    filename = "tomli-2.0.0-py3-none-any.whl"
    hashes.sha256 = "b5bde28da1fed24b9bd1d4d2b8cba62300bfb4ec9a6187a957e8ddb9434c5224"
    url = "https://files.pythonhosted.org/packages/e2/9f/5e1557a57a7282f066351086e78f87289a3446c47b2cb5b8b2f614d8fe99/tomli-2.0.0-py3-none-any.whl"
    requires-python = ">=3.7"
|
|
|
|
|
|
------------------------
|
|
Expectations for Lockers
|
|
------------------------
|
|
|
|
Lockers MUST create lock files for which a topological sort of the
|
|
packages which qualify for installation on the specified platform
|
|
results in a graph for which only a single version of any package
|
|
qualifies for installation and there is at least one compatible file
|
|
to install for each package. This leads to a lock file for any
|
|
supported platform where the only decision an installer can make
|
|
is what the "best-fitting" wheel is to install (which is discussed
|
|
below).
|
|
|
|
Lockers are expected to utilize ``metadata.marker``, ``metadata.tag``,
|
|
and ``metadata.requires-python`` as appropriate as well as environment
|
|
markers specified via ``requires`` and Python version requirements via
|
|
``requires-python`` to enforce this result for installers. Put another
|
|
way, the information used in the lock file is not expected to be
|
|
pristine/raw from the locker's input and instead is to be changed as
|
|
necessary to the benefit of the locker's goals.
|
|
|
|
|
|
---------------------------
|
|
Expectations for Installers
|
|
---------------------------
|
|
|
|
The expected algorithm for resolving what to install is:
|
|
|
|
#. Construct a dependency graph based on the data in the lock file
   with ``metadata.requires`` as the starting/root point.
#. Eliminate all files that are unsupported by the specified platform.
#. Eliminate all irrelevant edges between packages based on marker
   evaluation for ``requires``.
#. Raise an error if a package version is still reachable from the
   root of the dependency graph but lacks any compatible file.
#. Verify that all packages left only have one version to install,
   raising an error otherwise.
#. Install the best-fitting wheel file for each package which
   remains.
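
The steps above can be sketched, under several simplifying assumptions,
as a walk over a lock file that has already been parsed into nested
dictionaries. Everything named here is hypothetical: ``rank`` stands in
for platform compatibility and "best-fitting" ordering, and
``requirement_key`` stands in for marker evaluation and requirement
parsing (by default each requirement string is treated as a bare
package key). The sketch also follows only the dependencies of the
chosen file, which is one possible reading of the algorithm, and it
selects files without installing anything.

.. code-block:: python

    class LockError(Exception):
        """Hypothetical error for a lock file unusable on the target."""


    def resolve(lock, rank, requirement_key=lambda req: req):
        """Return ``{package key: chosen file entry}`` for one platform.

        ``rank(entry)`` returns None for an unsupported file, otherwise a
        sort key where lower means a better fit. ``requirement_key(req)``
        returns the package key for a requirement, or None when the
        requirement's marker does not apply.
        """
        chosen = {}
        pending = list(lock["metadata"]["requires"])  # root of the graph
        while pending:
            key = requirement_key(pending.pop())
            if key is None or key in chosen:
                continue  # irrelevant edge, or package already handled
            versions = lock["package"].get(key, {})
            if len(versions) != 1:
                raise LockError(
                    f"{key}: expected one version, found {len(versions)}"
                )
            ((version, entries),) = versions.items()
            ranked = [(rank(e), i, e) for i, e in enumerate(entries)]
            compatible = [item for item in ranked if item[0] is not None]
            if not compatible:
                raise LockError(f"{key} {version}: no compatible file")
            # Ties are broken by position in the file to stay deterministic.
            best = min(compatible, key=lambda item: item[:2])[2]
            chosen[key] = best
            pending.extend(best.get("requires", []))
        return chosen

Packages listed in the lock file but never reached from
``metadata.requires`` are simply not part of the result.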
|
|
|
|
Installers MUST follow a deterministic algorithm to determine what the
"best-fitting wheel file" is. A simple solution for this is to
rely upon the `packaging project <https://pypi.org/p/packaging/>`__
and its ``packaging.tags`` module to determine wheel file precedence.
|
|
|
|
Installers MUST support installing into an empty environment.
|
|
Installers MAY support installing into an environment that already
|
|
contains installed packages (and whatever that would entail to be
|
|
supported).
|
|
|
|
|
|
========================
|
|
(Potential) Tool Support
|
|
========================
|
|
|
|
The pip_ team has `said <https://github.com/pypa/pip/issues/10636>`__
|
|
they are interested in supporting this PEP if accepted. The current
|
|
proposal for pip may even
|
|
`supplant the need <https://github.com/jazzband/pip-tools/issues/1526#issuecomment-961883367>`__
|
|
for `pip-tools`_.
|
|
|
|
PDM_ has also said they would
|
|
`support the PEP <https://github.com/pdm-project/pdm/issues/718>`__
|
|
if accepted.
|
|
|
|
Pyflow_ has said they
|
|
`"like the idea" <https://github.com/David-OConnor/pyflow/issues/153#issuecomment-962482058>`__
|
|
of the PEP.
|
|
|
|
Poetry_ has said they would **not** support the PEP as-is because
`"Poetry supports sdists files, directory and VCS dependencies which are not supported" <https://github.com/python-poetry/poetry/issues/4710#issuecomment-973946104>`__.
Recording requirements at the file level, which is on purpose to
better reflect what can occur when it comes to dependencies,
`"is contradictory to the design of Poetry" <https://github.com/python-poetry/poetry/issues/4710#issuecomment-973946104>`__.
This also excludes export support to this PEP's lock file as
`"Poetry exports the information present in the poetry.lock file into another format" <https://github.com/python-poetry/poetry/issues/4710#issuecomment-974551351>`__
and sdists and source trees are included in ``poetry.lock`` files.
Thus it is not a clean translation from Poetry's lock file to this
PEP's lock file format.
|
|
|
|
|
|
=======================
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
As there is no pre-existing specification regarding lock files, there
|
|
are no explicit backwards compatibility concerns.
|
|
|
|
As for pre-existing tools that have their own lock file, some updating
|
|
will be required. Most document the lock file name, but not its
|
|
contents. For projects which do not commit their lock file to
|
|
version control, they will need to update the equivalent of their
|
|
``.gitignore`` file. For projects that do commit their lock file to
|
|
version control, what file(s) get committed will need an update.
|
|
|
|
For projects which do document their lock file format, like Pipenv_,
they will very likely need a major version release which changes the
lock file format.
|
|
|
|
|
|
===============
|
|
Transition Plan
|
|
===============
|
|
|
|
In general, this PEP could be considered successful if:
|
|
|
|
#. Two pre-existing tools became lockers (e.g. `pip-tools`_, PDM_,
|
|
pip_ via ``pip freeze``).
|
|
#. Pip became an installer.
|
|
#. One major, non-Python-specific platform supported the file format
|
|
(e.g. a cloud provider).
|
|
|
|
This would show interoperability, usability, and programming
|
|
community/business acceptance.
|
|
|
|
In terms of a transition plan, there are potentially multiple steps
|
|
that could lead to this desired outcome. Below is a somewhat idealized
|
|
plan that would see this PEP being broadly used.
|
|
|
|
|
|
---------
|
|
Usability
|
|
---------
|
|
|
|
First, a ``pip freeze`` equivalent tool could be developed which
|
|
creates a lock file. While installed packages do not by themselves
|
|
provide enough information to statically create a lock file, a user
|
|
could provide local directories and index URLs to construct one. This
|
|
would then lead to lock files that are stricter than a requirements
|
|
file by limiting the lock file to the current platform. This would
|
|
also allow people to see whether their environment would be
|
|
reproducible.
|
|
|
|
Second, a stand-alone installer should be developed. As the
|
|
requirements on an installer are much simpler than what pip provides,
|
|
it should be reasonable to have an installer that is independently
|
|
developed.
|
|
|
|
Third, a tool to convert a pinned requirements file as emitted by
|
|
pip-tools could be developed. Much like the ``pip freeze`` equivalent
|
|
outlined above, some input from the user may be needed. But this tool
|
|
could act as a transitioning step for anyone who has an appropriate
|
|
requirements file. This could also act as a test before potentially
|
|
having pip-tools grow some ``--lockfile`` flag to use this PEP.
|
|
|
|
All of this could be required before the PEP transitions from
|
|
conditional acceptance to full acceptance (and give the community a
|
|
chance to test if this PEP is potentially useful).
|
|
|
|
|
|
----------------
|
|
Interoperability
|
|
----------------
|
|
|
|
At this point, the goal would be to increase interoperability between
|
|
tools.
|
|
|
|
First, pip would become an installer. By having the most widely used
|
|
installer support the format, people can innovate on the locker side
|
|
while knowing people will have the tools necessary to actually consume
|
|
a lock file.
|
|
|
|
Second, pip becomes a locker. Once again, pip's reach would make the
|
|
format accessible for the vast majority of Python users very quickly.
|
|
|
|
Third, a project with a pre-existing lock file format supports at
|
|
least exporting to the lock file format (e.g. PDM or Pyflow). This
|
|
would show that the format meets the needs of other projects.
|
|
|
|
|
|
----------
|
|
Acceptance
|
|
----------
|
|
|
|
With the tooling available throughout the community, acceptance would
be shown by parties not exclusively tied to the Python community
supporting the file format, based on what they believe their users
want.
|
|
|
|
First, tools that operate on requirements files like code editors
would have equivalent support for lock files.
|
|
|
|
Second, consumers of requirements files like cloud providers would
|
|
also accept lock files.
|
|
|
|
At this point the PEP would have permeated out far enough to be on
|
|
par with requirements files in terms of general acceptance and
|
|
potentially more if projects had dropped their own lock files for this
|
|
PEP.
|
|
|
|
|
|
=====================
|
|
Security Implications
|
|
=====================
|
|
|
|
A lock file should not introduce security issues but instead help
|
|
solve them. By requiring the recording of hashes for files, a lock
|
|
file is able to help prevent tampering with code since the hash
|
|
details were recorded. Relying on only wheel files means what files
|
|
will be installed can be known ahead of time and is reproducible. A
|
|
lock file also helps prevent unexpected package updates being
|
|
installed which may in turn be malicious.
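
As a minimal sketch of the hash check described above, an installer
could compare a downloaded file against the hash recorded in the lock
file before installing it. The function name and the way the expected
hash is passed in are hypothetical; this PEP does not define an
installer API.

```python
import hashlib
import hmac
from pathlib import Path


def file_matches_hash(path: Path, algorithm: str, expected_hex: str) -> bool:
    """Return True if the file's digest equals the recorded hex digest.

    `algorithm` is any name accepted by hashlib.new(), e.g. "sha256".
    This helper is illustrative only and is not part of the PEP.
    """
    digest = hashlib.new(algorithm)
    with path.open("rb") as f:
        # Read in chunks so large wheel files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            digest.update(chunk)
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(digest.hexdigest(), expected_hex.lower())
```

An installer following this approach would refuse to install any file
for which the check returns ``False``.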
|
|
|
|
|
|
=================
|
|
How to Teach This
|
|
=================
|
|
|
|
Teaching of this PEP will very much be dependent on the lockers and
|
|
installers in day-to-day use. Conceptually, though, users
|
|
could be taught that a lock file specifies what should be installed
|
|
for a project to work. The benefits of consistency and security should
|
|
be emphasized to help users realize why they should care about lock
|
|
files.
|
|
|
|
|
|
========================
|
|
Reference Implementation
|
|
========================
|
|
|
|
A proof-of-concept locker can be found at
|
|
https://github.com/frostming/pep665_poc . No installer has been
|
|
implemented yet, but the design of this PEP suggests the locker is the
|
|
more difficult aspect to implement.
|
|
|
|
|
|
==============
|
|
Rejected Ideas
|
|
==============
|
|
|
|
----------------------------
|
|
File Formats Other Than TOML
|
|
----------------------------
|
|
|
|
JSON_ was briefly considered, but due to:
|
|
|
|
#. TOML already being used for ``pyproject.toml``
|
|
#. TOML being more human-readable
|
|
#. TOML leading to better diffs
|
|
|
|
the decision was made to go with TOML. There was some concern over
|
|
Python's standard library lacking a TOML parser, but most packaging
|
|
tools already use a TOML parser thanks to ``pyproject.toml`` so this
|
|
issue did not seem to be a showstopper. Some have also argued against
|
|
this concern in the past on the grounds that if packaging tools abhor
|
|
installing dependencies and feel they can't vendor a package then the
|
|
packaging ecosystem has much bigger issues to rectify than the need to
|
|
depend on a third-party TOML parser.
|
|
|
|
|
|
--------------------------
|
|
Alternative Naming Schemes
|
|
--------------------------
|
|
|
|
Specifying a directory to install files to was considered, but
|
|
ultimately rejected due to people's distaste for the idea.
|
|
|
|
It was also suggested to not have a special file name suffix, but it
|
|
was decided that this would hurt discoverability by tools too much.
|
|
|
|
|
|
-----------------------------
|
|
Supporting a Single Lock File
|
|
-----------------------------
|
|
|
|
At one point the idea of only supporting a single lock file which
|
|
contained all possible lock information was considered. But it quickly
|
|
became apparent that trying to devise a data format which could
|
|
encompass both a lock file format which could support multiple
|
|
environments as well as strict lock outcomes for
|
|
reproducible builds would become quite complex and cumbersome.
|
|
|
|
The idea of supporting a directory of lock files as well as a single
|
|
lock file named ``pyproject-lock.toml`` was also considered. But any
|
|
possible simplicity from skipping the directory in the case of a
|
|
single lock file seemed unnecessary. Trying to define appropriate
|
|
logic for what should be the ``pyproject-lock.toml`` file and what
|
|
should go into ``pyproject-lock.d`` seemed unnecessarily complicated.
|
|
|
|
|
|
-----------------------------------------------
|
|
Using a Flat List Instead of a Dependency Graph
|
|
-----------------------------------------------
|
|
|
|
The first version of this PEP proposed that the lock file have no
|
|
concept of a dependency graph. Instead, the lock file would list
|
|
exactly what should be installed for a specific platform such that
|
|
installers did not have to make any decisions about *what* to install,
|
|
only validating that the lock file would work for the target platform.
|
|
|
|
This idea was eventually rejected due to the number of combinations
|
|
of potential :pep:`508` environment markers. The decision was made that
|
|
trying to have lockers generate all possible combinations as
|
|
individual lock files when a project wants to be cross-platform would
|
|
be too much.
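
To give a rough sense of the growth involved, the sketch below
enumerates every combination of a small, purely illustrative set of
environment marker values. The marker names come from :pep:`508`, but
the values chosen and the helper function are assumptions made for
this example, and real projects may care about more markers than
these.

```python
from itertools import product

# Illustrative values only; real environments report many more variants.
MARKER_VALUES = {
    "python_version": ["3.8", "3.9", "3.10"],
    "sys_platform": ["linux", "darwin", "win32"],
    "platform_machine": ["x86_64", "arm64"],
}


def environment_combinations(marker_values):
    """Return one dict per distinct combination of marker values."""
    names = list(marker_values)
    value_lists = [marker_values[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


combos = environment_combinations(MARKER_VALUES)
print(len(combos))  # 3 * 3 * 2 = 18 separate lock files under a flat-list design
```

Each additional marker multiplies the total, which is why generating
one flat lock file per combination was judged impractical.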
|
|
|
|
|
|
-------------------------------
|
|
Use Wheel Tags in the File Name
|
|
-------------------------------
|
|
|
|
Instead of having the ``metadata.tag`` field there was a suggestion
|
|
of encoding the tags into the file name. But due to the addition of
|
|
the ``metadata.marker`` field and what to do when no tags were needed,
|
|
the idea was dropped.
|
|
|
|
|
|
----------------------------------
|
|
Alternative Names for ``requires``
|
|
----------------------------------
|
|
|
|
Some other names for what became ``requires`` were ``installs``,
|
|
``needs``, and ``dependencies``. Initially this PEP chose ``needs``
|
|
after asking a Python beginner which term they preferred. But based
|
|
on feedback on an earlier draft of this PEP, ``requires`` was chosen
|
|
as the term.
|
|
|
|
|
|
-----------------
|
|
Accepting PEP 650
|
|
-----------------
|
|
|
|
:pep:`650` was an earlier attempt at trying to tackle this problem by
|
|
specifying an API for installers instead of standardizing on a lock
|
|
file format (à la :pep:`517`). The
|
|
`initial response <https://discuss.python.org/t/pep-650-specifying-installer-requirements-for-python-projects/6657/>`__
|
|
to :pep:`650` could be considered mild/lukewarm. People seemed to be
|
|
consistently confused over which tools should provide what
|
|
functionality to implement the PEP. It also potentially incurred more
|
|
overhead as it would require executing Python APIs to perform any
|
|
actions involving packaging.
|
|
|
|
This PEP chooses to standardize around an artifact instead of an API
|
|
(à la :pep:`621`). This would allow for more tool integrations as it
|
|
removes the need to specifically use Python to do things such as
|
|
create a lock file, update it, or even install packages listed in
|
|
a lock file. It also allows for easier introspection by forcing
|
|
dependency graph details to be written in a human-readable format.
|
|
It also allows for easier sharing of knowledge by standardizing what
|
|
people need to know (e.g. tutorials become more portable between
|
|
tools when it comes to understanding the artifact they produce). It's
|
|
also simply the approach other language communities have taken and
|
|
seem to be happy with.
|
|
|
|
Acceptance of this PEP would mean :pep:`650` gets rejected.
|
|
|
|
|
|
-------------------------------------------------------
|
|
Specifying Requirements per Package Instead of per File
|
|
-------------------------------------------------------
|
|
|
|
An earlier draft of this PEP specified dependencies at the package
|
|
level instead of per file. While this has traditionally been how
|
|
packaging systems work, it actually did not reflect accurately how
|
|
things are specified. As such, this PEP was subsequently updated to
|
|
reflect the granularity that dependencies can truly be specified at.
|
|
|
|
|
|
----------------------------------
|
|
Specify Where Lockers Gather Input
|
|
----------------------------------
|
|
|
|
This PEP does not specify how a locker gets its input. An initial
|
|
suggestion was to partially reuse :pep:`621`, but due to disagreements
|
|
on how flexible the potential input should be in terms of specifying
|
|
things such as indexes, etc., it was decided this would best be left
|
|
to a separate PEP.
|
|
|
|
|
|
-------------------------------------------------------------------------------------
|
|
Allowing Source Distributions and Source Trees to be an Opt-In, Supported File Format
|
|
-------------------------------------------------------------------------------------
|
|
|
|
After `extensive discussion <https://discuss.python.org/t/supporting-sdists-and-source-trees-in-pep-665/11869/>`__,
|
|
it was decided that this PEP would not support source distributions
|
|
(aka sdists) or source trees as an acceptable format for code.
|
|
Introducing sdists and source trees to this PEP would immediately undo
|
|
the reproducibility and security goals due to needing to execute code
|
|
to build the sdist or source tree. It would also greatly increase
|
|
the complexity for (at least) installers as the dynamic build nature
|
|
of sdists and source trees means the installer would need to handle
|
|
fully resolving whatever requirements the sdists produced dynamically,
|
|
both from a building and installation perspective.
|
|
|
|
Due to all of this, it was decided it was best to have a separate
|
|
discussion about supporting sdists and source trees **after**
|
|
this PEP is accepted/rejected. As the proposed file format is
|
|
versioned, introducing sdists and source tree support in a later PEP
|
|
is doable.
|
|
|
|
It should be noted, though, that this PEP does **not** stop an
|
|
out-of-band solution from being developed to be used in conjunction
|
|
with this PEP. Building wheel files from sdists and shipping them with
|
|
code upon deployment so they can be included in the lock file is one
|
|
option. Another is to use a requirements file *just* for sdists and
|
|
source trees, then relying on a lock file for all wheels.
|
|
|
|
|
|
===========
|
|
Open Issues
|
|
===========
|
|
|
|
None.
|
|
|
|
|
|
===============
|
|
Acknowledgments
|
|
===============
|
|
|
|
Thanks to Frost Ming of PDM_ and Sébastien Eustace of Poetry_ for
|
|
providing input around dynamic install-time resolution of :pep:`508`
|
|
requirements.
|
|
|
|
Thanks to Kushal Das for making sure reproducible builds stayed a
|
|
concern for this PEP.
|
|
|
|
Thanks to Andrea McInnes for initially settling the bikeshedding and
|
|
choosing the paint colour of ``needs`` (at which point people rallied
|
|
around the ``requires`` colour instead).
|
|
|
|
|
|
=========
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
.. _build system declaration spec: https://packaging.python.org/specifications/declaring-build-dependencies/
|
|
.. _core metadata spec: https://packaging.python.org/specifications/core-metadata/
|
|
.. _Dart: https://dart.dev/
|
|
.. _dependency specifier spec: https://packaging.python.org/specifications/dependency-specifiers/
|
|
.. _direct URL origin of installed distributions spec: https://packaging.python.org/specifications/direct-url/
|
|
.. _Git: https://git-scm.com/
|
|
.. _Go: https://go.dev/
|
|
.. _JSON: https://www.json.org/
|
|
.. _npm: https://www.npmjs.com/
|
|
.. _PDM: https://pypi.org/project/pdm/
|
|
.. _pip: https://pip.pypa.io/
|
|
.. _pip-tools: https://pypi.org/project/pip-tools/
|
|
.. _Pipenv: https://pypi.org/project/pipenv/
|
|
.. _platform compatibility tags: https://packaging.python.org/specifications/platform-compatibility-tags/
|
|
.. _Poetry: https://pypi.org/project/poetry/
|
|
.. _Pyflow: https://pypi.org/project/pyflow/
|
|
.. _PyWeek: https://pyweek.org/
|
|
.. _requirements file format: https://pip.pypa.io/en/latest/reference/requirements-file-format/
|
|
.. _Rust: https://www.rust-lang.org/
|
|
.. _SecureDrop: https://securedrop.org/
|
|
.. _simple repository API: https://packaging.python.org/specifications/simple-repository-api/
|
|
.. _source distribution file: https://packaging.python.org/specifications/source-distribution-format/
|
|
.. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/specs/source-date-epoch/
|
|
.. _TOML: https://toml.io
|
|
.. _version specifiers spec: https://packaging.python.org/specifications/version-specifiers/
|
|
.. _wheel file: https://packaging.python.org/specifications/binary-distribution-format/
|
|
|
|
|
|
..
|
|
Local Variables:
|
|
mode: indented-text
|
|
indent-tabs-mode: nil
|
|
sentence-end-double-space: t
|
|
fill-column: 70
|
|
coding: utf-8
|
|
End:
|