python-peps/pep-0665.rst

898 lines
34 KiB
ReStructuredText
Raw Normal View History

PEP: 665
2021-11-03 15:30:54 -04:00
Title: A file format to list Python dependencies for reproducibility of an application
Author: Brett Cannon <brett@python.org>,
Pradyun Gedam <pradyunsg@gmail.com>,
Tzu-ping Chung <uranusjr@gmail.com>
2021-11-03 15:30:54 -04:00
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: https://discuss.python.org/t/9911
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Jul-2021
2021-11-03 15:30:54 -04:00
Post-History: 29-Jul-2021, 03-Nov-2021
Resolution:
========
Abstract
========
2021-11-03 15:30:54 -04:00
This PEP specifies a file format to specify the list of Python package
installation requirements for an application, and the relation between the
specified requirements. The list of requirements is considered
exhaustive for the installation target, and thus not requiring any
information beyond the platform being installed for, and the file
itself. The file format is flexible enough to allow installing the
requirements across different platforms, which guarantees
reproducibility on multiple platforms from the same file.
===========
Terminology
===========
There are several terms whose definition must be agreed upon in order
to facilitate a discussion on the topic of this PEP.
A *package* is something you install as a dependency and use via the
import system. The packages on PyPI are an example of this.
2021-11-03 15:30:54 -04:00
An *application* or *app* is an end product that other external code
does not directly rely on via the import system (i.e. they are
standalone). Desktop applications, command-line tools, etc. are
examples.
2021-11-03 15:30:54 -04:00
A *lock file* records the packages that are to be installed for an
app. Traditionally, the exact version of the package to be installed
is specified by a lock file, but all specified packages are not always
installed on a given platform (according a filtering logic described
in a later section), which enables the lock file to describe
reproducibility across multiple platforms. Examples of this are
``package-lock.json`` from npm_, ``Poetry.lock`` from Poetry, etc.
*Locking* is the act of taking the input of the packages an app
depends on and producting a lock file from that.
A *locker* is a tool which produces a lock file.
An *installer* consumes a lock file to install what the lock file
specifies.
==========
Motivation
==========
2021-11-03 15:30:54 -04:00
Applications want reproducible installs for a few reasons (we are not
worrying about package development, integration into larger systems
that would handle locking dependencies external to the Python
application, or other situations where *flexible* installation
requirements are desired over strict, reproducible installations).
2021-11-03 15:30:54 -04:00
One, reproducibility eases development. When you and your fellow
developers all end up with the same files on a specific platform, you
make sure you are all developing towards the same experience for the
application. You also want your users to install the same files as
you expect to guarantee the experience is the same as you developed
for them.
Two, you want to be able to reproduce what gets installed across
multiple platforms. Thanks to Python's portability across operating
systems, CPUs, etc., it is very easy and often desirable to create
applications that are not restricted to a single platform. Thus, you
want to be flexible enough to allow for differences in your package
dependencies between platforms, while still having consistency
and reproducibility on any one specific platform.
Three, reproducibility is more secure. When you control exactly what
files are installed, you can make sure no malicious actor is
attempting to slip nefarious code into your application (i.e. some
supply chain attacks). By making a lock file which always leads to
reproducible installs, we can avoid certain risks entirely.
This PEP proposes a standard for a lock file, as the current solutions
don't meet the outlined goals. Today, the closest we come to a lock
file standard is the `requirements file format`_ from pip.
Unfortunately, that format does not lead to inherently reproducible
installs (it requires optional features both in the requirements file
and the installer itself, to be discussed later).
The community itself has also shown a need for lock files based on the
fact that multiple tools have independently created their own lock
file formats:
#. PDM_
#. `pip-tools`_
#. Pipenv_
#. Poetry_
#. Pyflow_
2021-11-03 15:30:54 -04:00
Unfortunately, those tools all use differing lock file formats. This
means tooling around these tools much be unique. This impacts tooling
such as code editors and hosting providers, which want to be as
flexible as possible when it comes to accepting a user's application
code, but also have a limit as to how much development resource they
can spend to add support for yet another lock file format. A
standardized format would allow tools to focus their work on a single
target, and make sure that workflow decisions made by developers
outside of the lock file format are of no concern to e.g. hosting
providers.
Other programming language communities have also shown the usefulness
of lock files by developing their own solution to this problem. Some
of those communities include:
#. Dart_
#. npm_/Node
2021-11-03 15:30:54 -04:00
#. Go
#. Rust_
2021-11-03 15:30:54 -04:00
The trend in programming languages in the past decade seems to have
been toward providing a lock file solution.
2021-11-03 15:30:54 -04:00
=========
Rationale
=========
2021-11-03 15:30:54 -04:00
-----------
File Format
-----------
2021-11-03 15:30:54 -04:00
We wanted the file format to be easy to read as a diff when auditing
a change to the lock file. As such, and thanks to PEP 518 and
``pyproject.toml``, we decided to go with the TOML_ file format.
2021-11-03 15:30:54 -04:00
-----------------
Secure by Design
-----------------
2021-11-03 15:30:54 -04:00
Viewing the `requirements file format`_ as the closest we have to
a lock file standard, there are a few issues with the file format when
it comes to security. First is that the file format simply does not
require you specify the exact version of a package. This is why
tools like `pip-tools`_ exist to help manage that for the user.
Second, you must opt into specifying what files may be installed by
using the ``--hash`` argument for a specific dependency. This is also
optional with pip-tools as it requires specifying the
``--generate-hashes`` CLI argument.
Third, even when you control what files may be installed, it does not
prevent other packages from being installed. If a dependency is not
listed in the requirements file, pip will happily go searching for a
file to meet that need, unless you specify ``--no-deps`` as an
argument.
Fourth, the format allows for installing a
`source distribution file`_ (aka "sdist"). By its very nature,
installing an sdist may imply executing arbitrary Python code, meaning
that there is no control over what files may be installed. Only by
specifying ``--only-binary :all:`` can you guarantee pip to only use a
`wheel file`_ for each package.
To recap, in order for a requirements file to be as secure as what is
being proposed, a user should always do the following steps:
#. Use pip-tools and its command ``pip-compile --generate-hashes``
#. Install the requirements file using
``pip install --no-deps --only-binary :all:``
Critically, all of those flags, and both specificity and exhaustion of
what to install that pip-tools provides, are optional.
As such, the proposal raised in this PEP is secure by design to combat
some supply chain attacks. Hashes for files which would be used to
install from are **required**. You can **only** install from wheels
to unambiguously define what files will be placed in the file system.
Installers **must** have an unambiguous installation from a lock file
for a given platform.
--------------
Cross-Platform
--------------
Various projects which already have a lock file, like PDM_ and
Poetry_, provide a lock file which is *cross-platform*. This allows
for a single lock file to work on multiple platforms while still
leading to exact same top-level requirements to be installed
everywhere while the installation being consistent/unambiguous on
each platform.
As to why this is useful, let's use an example involving PyWeek_
(a week-long game development competition). We assume you are
developing on Linux, while someone you choose to partner with is
using macOS. Now assume the judges are using Windows. How do you make
sure everyone is using the same top-level dependencies, while allowing
for any platform-specific requirements (e.g. a package requires a
helper package under Windows)?
With a cross-platform lock file, you can make sure that the key
requirements are met consistently across all platforms. You can then
also make sure that all users on the same platform get the same
reproducible installation.
----------------
Simple Installer
----------------
The separation of concerns between a locker and an installer allows
for an installer to have a much simpler operation to perform. As
such, it not only allows for installers to be easier to write, but
facilitates in making sure installers create unambiguous, reproducible
installations.
The installer can also expend less computation/energy in creating the
installation. This is beneficial not only for faster installs, but
also from an energy consumption perspective, as installers are
expected to be run more often than lockers.
This has led to a design where the locker must do more work upfront
to benefit installers. It also means the complexity of package
dependencies is simpler and easier to comprehend to avoid ambiguity.
=============
Specification
=============
-------
Details
-------
2021-11-03 15:30:54 -04:00
Lock files MUST use the TOML_ file format. This not only prevents the
need to have another file format in the Python packaging ecosystem,
thanks to its adoption by PEP 518 for ``pyproject.toml``, but also
assists in making lock files more human-readable.
2021-11-03 15:30:54 -04:00
Lock files MUST end their file names with ``.pylock.toml``. The
``.toml`` part unambiguously distinguishes the format of the file,
and helps tools like code editors support the file appropriately. The
2021-11-03 15:30:54 -04:00
``.pylock`` part distinguishes the file from other TOML files the user
has, to make logic easier for tools to create functionalities specific
to Python lock files, instead of TOML files in general.
The following sections are the top-level keys of the TOML file data
format. Any field not listed as required is considered optional.
``version``
===========
2021-11-03 15:30:54 -04:00
This field is **required**.
The version of the lock file being used. The key MUST be a string
consisting of a number that follows the same formatting as the
``Metadata-Version`` key in the `core metadata spec`_. The value MUST
be set to ``"1.0"`` until a future PEP allows for a different value.
The introduction of a new *optional* key SHOULD increase the minor
version. The introduction of a new required key or changing the
format MUST increase the major version. How to handle other scenarios
is left as a per-PEP decision.
Installers MUST warn the user if the lock file specifies a version
whose major version is support but whose minor version is
unsupported/unrecognized (e.g. the installer supports ``"1.0"``, but
the lock file specifies ``"1.1"``).
Installers MUST raise an error if the lock file specifies a major
version which is unsupported (e.g. the installer supports ``"1.9"``
but the lock file specifies ``"2.0"``).
``created-at``
==============
This field is **required**.
The timestamp for when the lock file was generated (using TOML's
native timestamp type). It MUST be recorded using the UTC time zone to
avoid ambiguity.
If the SOURCE_DATE_EPOCH_ environment variable is set, it MUST be used
as the timestamp by the locker. This faciliates reproducibility of the
lock file itself.
``[tool]``
==========
Tools may create their own sub-tables under the ``tool`` table. The
rules for this table match those for ``pyproject.toml`` and its
``[tool]`` table from the `build system declaration spec`_.
``[metadata]``
==============
2021-11-03 15:30:54 -04:00
This table is **required**.
A table containing data applying to the overall lock file.
``metadata.marker``
-------------------
2021-11-03 15:30:54 -04:00
A key storing a string containing an environment marker as
specified in the `dependency specifier spec`_.
The locker MAY specify an environment marker which specifies any
2021-11-03 15:30:54 -04:00
restrictions the lock file was generated under.
If the installer is installing for an environment which does not
satisfy the specified environment marker, the installer MUST raise an
error as the lock file does not support the environment.
2021-11-03 15:30:54 -04:00
``metadata.tag``
----------------
A key storing a string specifying `platform compatibility tags`_
(i.e. wheel tags). The tag MAY be a compressed tag set.
2021-11-03 15:30:54 -04:00
The locker MAY specify a tag (set) which specify which platform(s)
the lock file supports.
2021-11-03 15:30:54 -04:00
If the installer is installing for an environment which does not
satisfy the specified tag (set), the installer MUST raise an error
as the lock file does not support the environment.
2021-11-03 15:30:54 -04:00
``metadata.requires``
---------------------
2021-11-03 15:30:54 -04:00
This field is **required**.
2021-11-03 15:30:54 -04:00
An array of strings following the `dependency specifier spec`_. This
array represents the top-level package dependencies of the lock file
and thus the root of the dependency graph.
2021-11-03 15:30:54 -04:00
``metadata.requires-python``
----------------------------
2021-11-03 15:30:54 -04:00
A string specifying the support version(s) of Python for this lock
file. It follows the same format as that specified for the
``Requires-Python`` field in the `core metadata spec`_.
2021-11-03 15:30:54 -04:00
``[[package._name_._version_]]``
================================
This array is **required**.
2021-11-03 15:30:54 -04:00
An array per package and version containing details for the potential
(wheel) files to install (as represented by ``_name_`` and
``_version_``, respectively).
2021-11-03 15:30:54 -04:00
Lockers must MUST normalize a project's name according to the
`simple repository API`_. If extras are specified as part of the
project to install, the extras are to be included in the key name and
are to be sorted in lexicographic order.
2021-11-03 15:30:54 -04:00
Within the file, the tables for the projects SHOULD be sorted by:
#. Project/key name in lexicographic order
#. Package version, newest/highest to older/lowest according to the
`version specifiers spec`_
2021-11-03 15:30:54 -04:00
#. Optional dependencies (extras) via lexicographic order
#. File name based on the ``filename`` or ``url`` field (discussed
below)
2021-11-03 15:30:54 -04:00
All of this is to help minimize diff changes between tool executions.
2021-11-03 15:30:54 -04:00
``package._name_._version_.url``
--------------------------------
2021-11-03 15:30:54 -04:00
A string representing a URL where to get the file.
The installer MAY support any schemes it wants for URLs
(e.g. ``file:`` as well as ``https:``).
An installer MAY choose to not use the URL to retrieve a file
2021-11-03 15:30:54 -04:00
if a file matching the specified hash can be found using some
alternative means (e.g. on the file system in a cache directory).
2021-11-03 15:30:54 -04:00
``package._name_._version_.filename``
-------------------------------------
2021-11-11 22:35:37 -05:00
This field is **required**.
2021-11-11 22:35:37 -05:00
A string representing the name of the file as represented by an entry
in the array. This field is required to simplify installers as the
file name is required to resolve wheel tags derived from the file
name. It also guarantees that the association of the array entry to
the file it is meant for is always clear.
2021-11-03 15:30:54 -04:00
``package._name_._version_.direct``
-----------------------------------
2021-11-03 15:30:54 -04:00
A boolean representing whether an installer should consider the
project installed "directly" as specified by the
`direct URL origin of installed distributions spec`_.
2021-11-03 15:30:54 -04:00
If the key is true, then the installer MUST follow the
`direct URL origin of installed distributions spec`_ for recording
the installation as "direct".
2021-11-03 15:30:54 -04:00
``[package._name_._version_.hashes]``
-------------------------------------
2021-11-03 15:30:54 -04:00
This table is **required**.
2021-11-03 15:30:54 -04:00
A table with keys specifying hash algorithms and values as the hash
for the file represented by this entry in the
``package._name_._version_`` table.
2021-11-03 15:30:54 -04:00
Lockers SHOULD list hashes in lexicographic order. This is to help
minimize diff sizes and the potential to overlook hash value changes.
2021-11-03 15:30:54 -04:00
An installer MUST only install a file which matches one of the
specified hashes.
2021-11-03 15:30:54 -04:00
``package._name_._version_.requires``
-------------------------------------
2021-11-03 15:30:54 -04:00
An array of strings following the `dependency specifier spec`_ which
represent the dependencies of this file.
2021-11-03 15:30:54 -04:00
``package._name_._version_.requires-python``
--------------------------------------------
2021-11-03 15:30:54 -04:00
A string specifying the support version(s) of Python for this file. It
follows the same format as that specified for the
``Requires-Python`` field in the `core metadata spec`_.
-------
Example
-------
::
2021-11-03 15:30:54 -04:00
version = "1.0"
created-at = 2021-10-19T22:33:45.520739+00:00
[tool]
# Tool-specific table ala PEP 518's `[tool]` table.
2021-11-03 15:30:54 -04:00
[metadata]
requires = ["mousebender"]
requires-python = ">=3.6"
2021-11-03 15:30:54 -04:00
[[package.attrs."21.2.0"]]
url = "https://files.pythonhosted.org/packages/20/a9/ba6f1cd1a1517ff022b35acd6a7e4246371dfab08b8e42b829b6d07913cc/attrs-21.2.0-py2.py3-none-any.whl"
2021-11-11 22:35:37 -05:00
filename = "attrs-21.2.0-py2.py3-none-any.whl"
2021-11-03 15:30:54 -04:00
hashes.sha256 = "149e90d6d8ac20db7a955ad60cf0e6881a3f20d37096140088356da6c716b0b1"
2021-11-03 15:30:54 -04:00
[[package.mousebender."2.0.0"]]
url = "https://files.pythonhosted.org/packages/f4/b3/f6fdbff6395e9b77b5619160180489410fb2f42f41272994353e7ecf5bdf/mousebender-2.0.0-py3-none-any.whl"
2021-11-11 22:35:37 -05:00
filename = "mousebender-2.0.0-py3-none-any.whl"
2021-11-03 15:30:54 -04:00
hashes.sha256 = "a6f9adfbd17bfb0e6bb5de9a27083e01dfb86ed9c3861e04143d9fd6db373f7c"
requires = ["attrs", "packaging"]
2021-11-03 15:30:54 -04:00
[[package.packaging."20.9"]]
url = "https://files.pythonhosted.org/packages/3e/89/7ea760b4daa42653ece2380531c90f64788d979110a2ab51049d92f408af/packaging-20.9-py2.py3-none-any.whl"
2021-11-11 22:35:37 -05:00
filename = "packaging-20.9-py2.py3-none-any.whl"
2021-11-03 15:30:54 -04:00
hashes.blake-256 = "3e897ea760b4daa42653ece2380531c90f64788d979110a2ab51049d92f408af"
hashes.sha256 = "67714da7f7bc052e064859c05c595155bd1ee9f69f76557e21f051443c20947a"
requires = ["pyparsing"]
2021-11-03 15:30:54 -04:00
[[package.pyparsing."2.4.7"]]
url = "https://files.pythonhosted.org/packages/8a/bb/488841f56197b13700afd5658fc279a2025a39e22449b7cf29864669b15d/pyparsing-2.4.7-py2.py3-none-any.whl"
2021-11-11 22:35:37 -05:00
filename = "pyparsing-2.4.7-py2.py3-none-any.whl"
2021-11-03 15:30:54 -04:00
hashes.sha256="ef9d7589ef3c200abe66653d3f1ab1033c3c419ae9b9bdb1240a85b024efc88b"
2021-11-03 15:30:54 -04:00
------------------------
Expectations for Lockers
------------------------
2021-11-03 15:30:54 -04:00
Lockers MUST create lock files for which a topological sort of the
packages which qualify for installation on the specified platform
results in a graph for which only a single version of any package
is possible and there is at least one compatible file to install for
those packages. This equates to a lock file that which is acceptable
based on ``metadata.marker``, ``metadata.tag``, and
``metadata.requires-python`` will have a list of package versions
after evaluating environment markers and eliminating unsupported
files for which the only decision the installer will need to make is
which file to use for the package (which is outlined below).
This means that lockers are expected to utilize ``metadata.marker``,
``metadata.tag``, and ``metadata.requires-python`` as appropriate
as well as environment markers specified via ``requires`` and Python
version requirements via ``requires-python`` to enforce this result
for installers. Put another way, the information used in the lock
file is not expected to be pristine/raw from the locker's input and
instead is to be changed as necessary to the benefit of the locker's
goals.
---------------------------
Expectations for Installers
---------------------------
The expected algorithm for resolving what to install is:
#. Construct a dependency graph based on the data in the lock file
with ``metadata.requires`` as the starting/root point.
#. Eliminate all (wheel) files that are unsupported by the specified
platform.
#. Eliminate all irrelevant edges between packages based on marker
evaluation.
#. Raise an error if a package version is still reachable from the
root of the dependency graph but lacks any compatible (wheel)
file.
#. Verify that all packages left only have one version to install,
raising an error otherwise.
#. Install the best-fitting wheel file for each package which
remains.
What constitues the "best-fitting wheel file" is an open issue.
Installers MUST support installing into an empty environment.
Installers MAY support installing into an environment that already
conatins installed packages (and whatever that would entail).
========================
(Potential) Tool Support
========================
The pip_ team has `said <https://github.com/pypa/pip/issues/10636>`__
they are interested in supporting this PEP if accepted. The current
proposal for pip may even
`supplant the need <https://github.com/jazzband/pip-tools/issues/1526#issuecomment-961883367>`__
for `pip-tools`_.
PDM_ has also said they would
`support the PEP <https://github.com/pdm-project/pdm/issues/718>`__
if accepted.
Pyflow_ has said they
`"like the idea" <https://github.com/David-OConnor/pyflow/issues/153#issuecomment-962482058>`__
of the PEP.
=======================
Backwards Compatibility
=======================
As there is no pre-existing specification regarding lock files, there
are no explicit backwards compatibility concerns.
As for pre-existing tools that have their own lock file, some updating
will be required. Most document the lock file name, but not its
2021-11-03 15:30:54 -04:00
contents. For projects which do not commit their lock file to
version control, they will need to update the equivalent of their
``.gitignore`` file. For projects that do commit their lock file to
version control, what file(s) get committed will need an update.
For projects which do document their lock file format like pipenv_,
2021-11-03 15:30:54 -04:00
they will very likely need a major version release which changes the
lock file format.
Specifically for Poetry_, it has an
`export command <https://python-poetry.org/docs/cli/#export>`_ which
should allow Poetry to support this lock file format even if the
2021-11-03 15:30:54 -04:00
project chooses not to adopt this PEP as Poetry's primary lock file
format.
=====================
Security Implications
=====================
A lock file should not introduce security issues but instead help
2021-11-03 15:30:54 -04:00
solve them. By requiring the recording of hashes for files, a lock
file is able to help prevent tampering with code since the hash
details were recorded. A lock file also helps prevent unexpected
package updates being installed which may be malicious.
=================
How to Teach This
=================
Teaching of this PEP will very much be dependent on the lockers and
installers being used for day-to-day use. Conceptually, though, users
2021-11-03 15:30:54 -04:00
could be taught that a lock file specifies what should be installed
for a project to work. The benefits of consistency and security should
be emphasized to help users realize why they should care about lock
files.
========================
Reference Implementation
========================
2021-11-03 15:30:54 -04:00
No proof-of-concept or reference implementation currently exists. An
example locker and installer will be provided before this PEP is
fully accepted (although this is not a necessarily a requirement for
conditional acceptance).
==============
Rejected Ideas
==============
----------------------------
File Formats Other Than TOML
----------------------------
JSON_ was briefly considered, but due to:
#. TOML already being used for ``pyproject.toml``
#. TOML being more human-readable
#. TOML leading to better diffs
the decision was made to go with TOML. There was some concern over
Python's standard library lacking a TOML parser, but most packaging
tools already use a TOML parser thanks to ``pyproject.toml`` so this
issue did not seem to be a showstopper. Some have also argued against
this concern in the past by the fact that if packaging tools abhor
installing dependencies and feel they can't vendor a package then the
packaging ecosystem has much bigger issues to rectify than needing to
depend on a third-party TOML parser.
2021-11-03 15:30:54 -04:00
--------------------------
Alternative Naming Schemes
--------------------------
Specifying a directory to install file to was considered, but
ultimately rejected due to people's distaste for the idea.
2021-11-03 15:30:54 -04:00
It was also suggested to not have a special file name suffix, but it
was decided that hurt discoverability by tools too much.
-----------------------------
Supporting a Single Lock File
-----------------------------
2021-11-03 15:30:54 -04:00
At one point the idea of only supporting single lock file which
contained all possible lock information was considered. But it quickly
became apparent that trying to devise a data format which could
encompass both a lock file format which could support multiple
environments as well as strict lock outcomes for
reproducible builds would become quite complex and cumbersome.
The idea of supporting a directory of lock files as well as a single
lock file named ``pyproject-lock.toml`` was also considered. But any
possible simplicity from skipping the directory in the case of a
single lock file seemed unnecessary. Trying to define appropriate
logic for what should be the ``pyproject-lock.toml`` file and what
should go into ``pyproject-lock.d`` seemed unnecessarily complicated.
-----------------------------------------------
Using a Flat List Instead of a Dependency Graph
-----------------------------------------------
The first version of this PEP proposed that the lock file have no
concept of a dependency graph. Instead, the lock file would list
exactly what should be installed for a specific platform such that
installers did not have to make any decisions about *what* to install,
only validating that the lock file would work for the target platform.
This idea was eventually rejected due to the number of combinations
of potential PEP 508 environment markers. The decision was made that
2021-11-03 15:30:54 -04:00
trying to have lockers generate all possible combinations as
individual lock files when a project wants to be cross-platform would
be too much.
-------------------------------
Use Wheel Tags in the File Name
-------------------------------
2021-11-03 15:30:54 -04:00
Instead of having the ``metadata.tag`` field there was a suggestion
of encoding the tags into the file name. But due to the addition of
the ``metadata.marker`` field and what to do when no tags were needed,
the idea was dropped.
2021-11-03 15:30:54 -04:00
----------------------------------
Alternative Names for ``requires``
----------------------------------
2021-11-03 15:30:54 -04:00
Some other names for what became ``requires`` were ``installs``,
``needs``, and ``dependencies``. Initially this PEP chose ``needs``
after asking a Python beginner which term they preferred. But based
on feedback on an earlier draft of this PEP, ``requires`` was chosen
as the term.
-----------------
Accepting PEP 650
-----------------
PEP 650 was an earlier attempt at trying to tackle this problem by
specifying an API for installers instead of standardizing on a lock file
format (ala PEP 517). The
`initial response <https://discuss.python.org/t/pep-650-specifying-installer-requirements-for-python-projects/6657/>`__
to PEP 650 could be considered mild/lukewarm. People seemed to be
consistently confused over which tools should provide what functionality
to implement the PEP. It also potentially incurred more overhead as
it would require executing Python APIs to perform any actions involving
packaging.
2021-11-03 15:30:54 -04:00
This PEP chooses to standardize around an artifact instead of an API
(ala PEP 621). This would allow for more tool integrations as it
removes the need to specifically use Python to do things such as
create a lock file, update it, or even install packages listed in
a lock file. It also allows for easier introspection by forcing
dependency graph details to be written in a human-readable format.
It also allows for easier sharing of knowledge by standardizing what
people need to know more (e.g. tutorials become more portable between
tools when it comes to understanding the artifact they produce). It's
also simply the approach other language communities have taken and seem
to be happy with.
2021-08-13 20:25:36 -04:00
2021-11-03 15:30:54 -04:00
-------------------------------------------------------
Specifying Requirements per Package Instead of per File
-------------------------------------------------------
2021-08-13 20:25:36 -04:00
2021-11-03 15:30:54 -04:00
An earlier draft of this PEP specified dependencies at the package
level instead of per (wheel) file. While this has traditionally been
how packaging systems work, it actually did not reflect accurately
how things are specified. As such, this PEP was subsequently updated
to reflect the granularity that dependencies can truly be specified
at.
2021-08-13 20:25:36 -04:00
2021-11-03 15:30:54 -04:00
===========
Open Issues
===========
2021-08-16 17:08:49 -04:00
2021-11-03 15:30:54 -04:00
-------------------------------------------------------------------------------------
Allowing Source Distributions and Source Trees to be an Opt-In, Supported File Format
-------------------------------------------------------------------------------------
For security reproducibility reasons this PEP only considers
supporting installation from wheel files. Installing from either an
sdist or source tree requires arbitrary code execution during
installation, unknown files to be installed, and an unknown set of
dependencies. Those issues all run counter to guaranteeing users get
the same files for the same platform as well as making sure they are
receiving the expected files.
To deal with this issue, people would need to build their own wheels
from sdists and cache them. Then the lockers would record the hashes
of those wheels and the installers would then be expected to use
those wheels.
Another option is to allow sdists (and potentially source trees) be
listed as support file formats, but have them marked as insecure in
the lock file and require the installer force the user to opt into
using insecure file formats. Unfortunately because sdists which don't
necessarily follow version 2.2 of the `core metadata spec`_ for their
``PKG-INFO`` file will have unknown dependencies, breaking the
guarantee that results will be reproducible thanks to potential
arbitrary calculations of those dependencies. And even if an sdist did
follow the latest spec, they could still list their requirements as
dynamic, still making it impossible to statically know what should be
installed. As such, installers would either have to have a full
resolver to handle these dynamic cases or only sdists which follow
version 2.2 of the core metadata spec **and** statically specify
their dependencies could be listed. But at that point the project is
probably capable of providing wheels, making support for sdists that
much less important/useful.
----------------------------------
Specify Where Lockers Gather Input
----------------------------------
This PEP currently does not specify how a locker gets its input. It
could be possible to support a subset of PEP 621 such that
``project.requires-python`` and ``project.dependencies`` are read
from ``pyproject.toml`` and automatically used as input if provided.
But this or some other practice could also be left as something to
grow organically in the community and making that the standard at a
later date.
------------------------------------
What is a "best-fitting wheel file"?
------------------------------------
The expected steps of installing a package much include decided which
wheel file to install as a package may have a universal wheel on top
of very specific wheels. But as `platform compatibility tags`_ do not
specify how to determine priority and there is no way to use
environment markers to specify an exact wheel, there's no defined way
for an installer to deterministically determine what wheel file to
select.
There are two possible solutions. One is for the locker to specify a
ranking/priority order to the wheel files. That way the installer
simply sorts to the supported wheel files by that order and installs
2021-11-15 10:17:17 -05:00
the top rated/ranked wheel file. This puts the priority order under
the control of the locker.
2021-11-03 15:30:54 -04:00
The other option is to specify in this PEP how to calculate the
priority/ranking of wheel files. This is currently tool-based and
seems to have been acceptable overall by the community, but having a
specification for this would probably still be welcome. It may be
somewhat disruptive, though, as it could change what files get
installed by tools which implement the ordering outside of the
context of this PEP. And if this PEP gains traction, it is reasonable
to assume that users will expect the ordering to be consistent across
tools.
===============
Acknowledgments
===============
Thanks to Frost Ming of PDM_ and Sébastien Eustace of Poetry_ for
providing input around dynamic install-time resolution of PEP 508
requirements.
Thanks to Kushal Das for making sure reproducible builds stayed a
concern for this PEP.
2021-11-03 15:30:54 -04:00
Thanks to Andrea McInnes for initially settling the bikeshedding and
choosing the paint colour of ``needs`` (at which point that caused
people to rally around the ``requires`` colour).
=========
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
.. _build system declaration spec: https://packaging.python.org/specifications/declaring-build-dependencies/
.. _core metadata spec: https://packaging.python.org/specifications/core-metadata/
.. _Dart: https://dart.dev/
.. _dependency specifier spec: https://packaging.python.org/specifications/dependency-specifiers/
2021-08-03 20:25:57 -04:00
.. _direct URL origin of installed distributions spec: https://packaging.python.org/specifications/direct-url/
.. _Git: https://git-scm.com/
2021-11-03 15:30:54 -04:00
.. _Go: https://go.dev/
.. _JSON: https://www.json.org/
.. _npm: https://www.npmjs.com/
.. _PDM: https://pypi.org/project/pdm/
.. _pip: https://pip.pypa.io/
.. _pip-tools: https://pypi.org/project/pip-tools/
.. _Pipenv: https://pypi.org/project/pipenv/
.. _platform compatibility tags: https://packaging.python.org/specifications/platform-compatibility-tags/
.. _Poetry: https://pypi.org/project/poetry/
.. _Pyflow: https://pypi.org/project/pyflow/
2021-11-03 15:30:54 -04:00
.. _PyWeek: https://pyweek.org/
.. _requirements file format: https://pip.pypa.io/en/latest/reference/requirements-file-format/
.. _Rust: https://www.rust-lang.org/
.. _SecureDrop: https://securedrop.org/
.. _simple repository API: https://packaging.python.org/specifications/simple-repository-api/
.. _source distribution file: https://packaging.python.org/specifications/source-distribution-format/
.. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/specs/source-date-epoch/
.. _TOML: https://toml.io
.. _version specifiers spec: https://packaging.python.org/specifications/version-specifiers/
.. _wheel file: https://packaging.python.org/specifications/binary-distribution-format/
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: