1147 lines
44 KiB
ReStructuredText
1147 lines
44 KiB
ReStructuredText
PEP: 751
|
|
Title: A file format to list Python dependencies for installation reproducibility
|
|
Author: Brett Cannon <brett@python.org>
|
|
Discussions-To: https://discuss.python.org/t/59173
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Topic: Packaging
|
|
Created: 24-Jul-2024
|
|
Post-History: `25-Jul-2024 <https://discuss.python.org/t/59173>`__
|
|
Replaces: 665
|
|
|
|
========
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes a new file format for dependency specification
|
|
to enable reproducible installation in a Python environment. The format is
|
|
designed to be human-readable and machine-generated. Installers consuming the
|
|
file should be able to evaluate each package in question in isolation, with no
|
|
need for dependency resolution at install-time.
|
|
|
|
|
|
==========
|
|
Motivation
|
|
==========
|
|
|
|
Currently, no standard exists to:
|
|
|
|
- Specify what top-level dependencies should be installed into a Python
|
|
environment.
|
|
- Create an immutable record, such as a lock file, of which dependencies were
|
|
installed.
|
|
|
|
Considering there are at least five well-known solutions to this problem in the
|
|
community (``pip freeze``, pip-tools_, uv_, Poetry_, and PDM_), there seems to
|
|
be an appetite for lock files in general.
|
|
|
|
Those tools also vary in what locking scenarios they support. For instance,
|
|
``pip freeze`` and pip-tools only generate lock files for the current
|
|
environment while PDM and Poetry try to lock for *any* environment to some
|
|
degree. And none of them directly support locking to specific files to install
|
|
which can be important for some workflows. There's also concerns around the lack
|
|
of secure defaults in the face of supply chain attacks (e.g., always including
|
|
hashes for files). Finally, not all the formats are easy to audit to determine
|
|
what would be installed into an environment ahead of time.
|
|
|
|
The lack of a standard also has some drawbacks. For instance, any tooling that
|
|
wants to work with lock files must choose which format to support, potentially
|
|
leaving users unsupported (e.g., Dependabot_ only supporting select tools,
|
|
same for cloud providers who can do dependency installations on your behalf,
|
|
etc.). It also impacts portability between tools, which causes vendor lock-in.
|
|
By not having compatibility and interoperability it fractures tooling around
|
|
lock files where both users and tools have to choose what lock file format to
|
|
use upfront and making it costly to use/switch to other formats. Rallying
|
|
around a single format removes that cost/barrier.
|
|
|
|
.. note::
|
|
|
|
Much of the motivation from :pep:`665` also applies to this PEP.
|
|
|
|
|
|
=========
|
|
Rationale
|
|
=========
|
|
|
|
The format is designed so that a *locker* which produces the lock file
|
|
and an *installer* which consumes the lock file can be separate tools. This
|
|
allows for situations such as cloud hosting providers to use their own installer
|
|
that's optimized for their system which is independent of what locker the user
|
|
used to create their lock file.
|
|
|
|
The file format is designed to be human-readable. This is
|
|
so that the contents of the file can be audited by a human to make sure no
|
|
undesired dependencies end up being included in the lock file. It is also
|
|
designed to facilitate easy understanding of what would be installed from the
|
|
lock file without necessitating running a tool, once again to help with
|
|
auditing. Finally, the format is designed so that viewing a diff of the file is
|
|
easy by centralizing relevant details.
|
|
|
|
The file format is also designed to not require a resolver at install time.
|
|
Being able to analyze dependencies in isolation from one another when listed in
|
|
a lock file provides a few benefits. First, it supports auditing by making it
|
|
easy to figure out if a certain dependency would be installed for a certain
|
|
environment without needing to reference other parts of the file contextually.
|
|
It should also lead to faster installs which are much more frequent than
|
|
creating a lock file. Finally, the four tools mentioned in the Motivation_
|
|
section either already implement this approach of evaluating dependencies in
|
|
isolation or have suggested they could (in
|
|
`Poetry's case <https://discuss.python.org/t/lock-files-again-but-this-time-w-sdists/46593/83>`__).
|
|
|
|
|
|
-----------------
|
|
Locking Scenarios
|
|
-----------------
|
|
|
|
The lock file format is designed to support two locking scenarios. The format
|
|
should also be flexible enough that adding support for other locking scenarios
|
|
is possible via a separate PEP.
|
|
|
|
|
|
Per-file Locking
|
|
================
|
|
|
|
*Per-file locking* operates under the premise that one wants to install exactly
|
|
the same files in any matching environment. As such, the lock file specifies
|
|
what files to install. There can be multiple environments specified in a
|
|
single file, each with their own set of files to install. By specifying the
|
|
exact files to install, installers avoid performing any resolution to decide what
|
|
to install.
|
|
|
|
The motivation for this approach to locking is for those who have controlled
|
|
environments that they work with. For instance, if you have specific, controlled
|
|
development and production environments then you can use per-file locking to
|
|
make sure the **same** files are installed in both environments for everyone.
|
|
This is similar to what ``pip freeze`` and pip-tools_
|
|
support, but with more strictness of the exact files as well as incorporating
|
|
support to specify the locked files for multiple environments in the same file.
|
|
|
|
Per-file locking should be used when the installation attempt should fail
|
|
outright if there is no explicitly pre-approved set of installation artifacts
|
|
for the target platform. For example: locking the deployment dependencies for a
|
|
managed web service.
|
|
|
|
|
|
Package Locking
|
|
===============
|
|
|
|
*Package locking* lists the packages and their versions that *may* apply to any
|
|
environment being installed for. The list of packages and their versions are
|
|
evaluated individually and independently from any other packages and versions
|
|
listed in the file. This allows installation to be linear -- read each package
|
|
and version and make an isolated decision as to whether it should be installed.
|
|
This avoids requiring the installer to perform a *resolution* (i.e.
|
|
determine what to install based on what else is to be installed).
|
|
|
|
The motivation of this approach comes from
|
|
`PDM lock files <https://frostming.com/en/2024/pdm-lockfile/>`__. By listing the
|
|
potential packages and versions that may be installed, what's installed is
|
|
controlled in a way that's easy to reason about. This also allows for not
|
|
specifying the exact environments that would be supported by the lock file so
|
|
there's more flexibility for what environments are compatible with the lock
|
|
file. This approach supports scenarios like open-source projects that want to
|
|
lock what people should use to build the documentation without knowing upfront
|
|
what environments their contributors are working from.
|
|
|
|
As already mentioned, this approach is supported by PDM_. Poetry_ has
|
|
`shown some interest <https://discuss.python.org/t/46593/83>`__.
|
|
|
|
Per-package locking should be used when the exact set of potential target
|
|
platforms is not known when generating the lock file, as it allows installation
|
|
tools to choose the most appropriate artifacts for each platform from the
|
|
pre-approved set. For example: locking the development dependencies for an open
|
|
source project.
|
|
|
|
|
|
=============
|
|
Specification
|
|
=============
|
|
|
|
---------
|
|
File Name
|
|
---------
|
|
|
|
A lock file MUST be named :file:`pylock.toml` or match the regular expression
|
|
``r"pylock\.(.+)\.toml"`` if a name for the lock file is desired or if multiple lock files exist.
|
|
The use of the ``.toml`` file extension is to make syntax highlighting in
|
|
editors easier and to reinforce the fact that the file format is meant to be
|
|
human-readable. The prefix and suffix of a named file MUST be lowercase for easy
|
|
detection and stripping off to find the name, e.g.::
|
|
|
|
if filename.startswith("pylock.") and filename.endswith(".toml"):
|
|
name = filename.removeprefix("pylock.").removesuffix(".toml")
|
|
|
|
This PEP has no opinion as to the location of lock files (i.e. in the root or
|
|
the subdirectory of a project).
|
|
|
|
|
|
-----------
|
|
File Format
|
|
-----------
|
|
|
|
The format of the file is TOML_.
|
|
|
|
All keys listed below are required unless otherwise noted. If two keys are
|
|
mutually exclusive to one another, then one of the keys is required while the
|
|
other is disallowed.
|
|
|
|
Keys in tables -- including the top-level table -- SHOULD be emitted by
|
|
lockers in the order they are listed in this PEP when applicable unless
|
|
another sort order is specified to minimize noise in diffs. If the keys are not
|
|
explicitly specified in this PEP, then the keys SHOULD be sorted by
|
|
lexicographic order.
|
|
|
|
As well, lockers SHOULD sort arrays in lexicographic order
|
|
unless otherwise specified for the same reason.
|
|
|
|
|
|
``version``
|
|
===========
|
|
|
|
- String
|
|
- The version of the lock file format.
|
|
- This PEP specifies the initial version -- and only valid value until future
|
|
updates to the standard change it -- as ``"1.0"``.
|
|
- If an installer supports the major version but not the minor version, a tool
|
|
SHOULD warn when an unknown key is seen.
|
|
- If an installer doesn't support a major version, it MUST raise an error.
|
|
|
|
|
|
``hash-algorithm``
|
|
==================
|
|
|
|
- String
|
|
- The name of the hash algorithm used for calculating all hash values.
|
|
- Only a single hash algorithm is used for the entire file to allow the
|
|
``[[packages.files]]`` table to be written inline for readability and
|
|
compactness purposes by only listing a single hash value instead of multiple
|
|
values based on multiple hash algorithms.
|
|
- Specifying a single hash algorithm guarantees that an algorithm that the user
|
|
prefers is used consistently throughout the file without having to audit
|
|
each file hash value separately.
|
|
- Allows for updating the entire file to a new hash algorithm without running
|
|
the risk of accidentally leaving an old hash value in the file.
|
|
- :ref:`packaging:simple-repository-api-json` and the ``hashes`` dictionary of
|
|
of the ``files`` dictionary of the Project Details dictionary specifies what
|
|
values are valid and guidelines on what hash algorithms to use.
|
|
- Failure to validate any hash values for any file that is to be installed MUST
|
|
raise an error.
|
|
|
|
|
|
``dependencies``
|
|
================
|
|
|
|
- Array of strings
|
|
- A listing of the `dependency specifiers`_ that act as the input to the lock file,
|
|
representing the direct, top-level dependencies to be installed.
|
|
|
|
|
|
``[[file-locks]]``
|
|
==================
|
|
|
|
- Array of tables
|
|
- Mutually exclusive with ``[package-lock]``.
|
|
- The array's existence implies the use of the per-file locking approach.
|
|
- An environment that meets all of the specified criteria in the table will be
|
|
considered compatible with the environment that was locked for.
|
|
- Lockers MUST NOT generate multiple ``[file-locks]`` tables which would be
|
|
considered compatible for the same environment.
|
|
- In instances where there would be a conflict but the lock is still desired,
|
|
either separate lock files can be written or per-package locking can be used.
|
|
- Entries in array SHOULD be sorted by ``file-locks.name`` lexicographically.
|
|
|
|
|
|
``file-locks.name``
|
|
-------------------
|
|
|
|
- String
|
|
- A unique name within the array for the environment this table represents.
|
|
|
|
|
|
``[file-locks.marker-values]``
|
|
------------------------------
|
|
|
|
- Optional
|
|
- Table of strings
|
|
- The keys represent the names of `environment markers`_ and the values are the
|
|
values for those markers.
|
|
- Compatibility is defined by the environment's values matching what is in the
|
|
table.
|
|
|
|
|
|
``file-locks.wheel-tags``
|
|
-------------------------
|
|
|
|
- Optional
|
|
- Array of strings
|
|
- An unordered array of `wheel tags`_ for which all tags must be supported by
|
|
the environment.
|
|
- The array MAY not be exhaustive to allow for a smaller array as well as to
|
|
help prevent multiple ``[[file-locks]]`` tables being compatible with the
|
|
same environment by having one array being a strict subset of another
|
|
``file-locks.wheel-tags`` entry in the same file's
|
|
``[[file-locks]]`` tables.
|
|
- Lockers MUST NOT include
|
|
`compressed tag sets <https://packaging.python.org/en/latest/specifications/platform-compatibility-tags/#compressed-tag-sets>`__
|
|
or duplicate tags for consistency across lockers and to simplify checking for
|
|
compatibility.
|
|
|
|
|
|
``[package-lock]``
|
|
==================
|
|
|
|
- Table
|
|
- Mutually exclusive with ``[[file-locks]]``.
|
|
- Signifies the use of the package locking approach.
|
|
|
|
|
|
``package-lock.requires-python``
|
|
--------------------------------
|
|
|
|
- String
|
|
- Holds the `version specifiers`_ for Python version compatibility for the
|
|
overall package locking.
|
|
- Provides at-a-glance information to know if the lock file *may* apply to a
|
|
version of Python instead of having to scan the entire file to compile the
|
|
same information.
|
|
|
|
|
|
``[[packages]]``
|
|
================
|
|
|
|
- Array of tables
|
|
- The array contains all data on the locked package versions.
|
|
- Lockers SHOULD record packages in order by ``packages.name`` lexicographically
|
|
, ``packages.version`` by the sort order for `version specifiers`_, and
|
|
``packages.markers`` lexicographically.
|
|
- Lockers SHOULD record keys in the same order as written in this PEP to
|
|
minimize changes when updating.
|
|
- Entries are designed so that relevant details as to why a package is included
|
|
are in one place to make diff reading easier.
|
|
|
|
|
|
``packages.name``
|
|
-----------------
|
|
|
|
- String
|
|
- The `normalized name`_ of the packages.
|
|
- Part of what's required to uniquely identify this entry.
|
|
|
|
|
|
``packages.version``
|
|
--------------------
|
|
|
|
- String
|
|
- The version of the packages.
|
|
- Part of what's required to uniquely identify this entry.
|
|
|
|
|
|
``packages.multiple-entries``
|
|
-----------------------------
|
|
|
|
- Optional (defaults to ``false``)
|
|
- Boolean
|
|
- If package locking via ``[package-lock]``, then the multiple entries for the
|
|
same package MUST be mutually exclusive via ``packages.marker`` (this is not
|
|
required for per-file locking as the ``packages.*.lock`` entries imply mutual
|
|
exclusivity).
|
|
- Aids in auditing by knowing that there are multiple entries for the same
|
|
package that may need to be considered.
|
|
|
|
|
|
``packages.description``
|
|
------------------------
|
|
|
|
- Optional
|
|
- String
|
|
- The package's ``Summary`` from its `core metadata`_.
|
|
- Useful to help understand why a package was included in the file based on its
|
|
purpose.
|
|
|
|
|
|
``packages.index-url``
|
|
------------------------------------
|
|
|
|
- Optional (although mutually exclusive with
|
|
``packages.files.index-url``)
|
|
- String
|
|
- Stores the `project index`_ URL from the `Simple Repository API`_.
|
|
- Useful for generating Packaging URLs (aka PURLs).
|
|
- When possible, lockers SHOULD include this or
|
|
``packages.files.index-url`` to assist with generating
|
|
`software bill of materials`_ (aka SBOMs).
|
|
|
|
|
|
``packages.marker``
|
|
-------------------
|
|
|
|
- Optional
|
|
- String
|
|
- The `environment markers`_ expression which specifies whether this package and
|
|
version applies to the environment.
|
|
- Only applicable via ``[package-lock]`` and the package locking scenario.
|
|
- The lack of this key means this package and version is required to be
|
|
installed.
|
|
|
|
|
|
``packages.requires-python``
|
|
----------------------------
|
|
|
|
- Optional
|
|
- String
|
|
- Holds the `version specifiers`_ for Python version compatibility for the
|
|
package and version.
|
|
- Useful for documenting why this package and version was included in the file.
|
|
- Also helps document why the version restriction in
|
|
``package-lock.requires-python`` was chosen.
|
|
- It should not provide useful information for installers as it would be
|
|
captured by ``package-lock.requires-python`` and isn't relevant when
|
|
``[[file-locks]]`` is used.
|
|
|
|
|
|
``packages.dependents``
|
|
-----------------------
|
|
|
|
- Optional
|
|
- Array of strings
|
|
- A record of the packages that depend on this package and version.
|
|
- Useful for analyzing why a package happens to be listed in the file
|
|
for auditing purposes.
|
|
- This does not provide information which influences installers.
|
|
|
|
|
|
``packages.dependencies``
|
|
-------------------------
|
|
|
|
- Optional
|
|
- Array of strings
|
|
- A record of the dependencies of the package and version.
|
|
- Useful in analyzing why a package happens to be listed in the file
|
|
for auditing purposes.
|
|
- This does not provide information which influences the installer as
|
|
``[[file-locks]]`` specifies the exact files to use and ``[package-lock]``
|
|
applicability is determined by ``packages.marker``.
|
|
|
|
|
|
``packages.direct``
|
|
-------------------
|
|
|
|
- Optional (defaults to ``false``)
|
|
- Boolean
|
|
- Represents whether the installation is via a `direct URL reference`_.
|
|
|
|
|
|
``[[packages.files]]``
|
|
----------------------
|
|
|
|
- Must be specified if ``[packages.vcs]`` and ``[packages.directory]`` is not
|
|
(although may be specified simultaneously with the other options).
|
|
- Array of tables
|
|
- Tables can be written inline.
|
|
- Represents the files to potentially install for the package and version.
|
|
- Entries in ``[[packages.files]]`` SHOULD be lexicographically sorted by
|
|
``packages.files.name`` key to minimize changes in diffs.
|
|
|
|
|
|
``packages.files.name``
|
|
'''''''''''''''''''''''
|
|
|
|
- String
|
|
- The file name.
|
|
- Necessary for installers to decide what to install when using package locking.
|
|
|
|
|
|
``packages.files.lock``
|
|
'''''''''''''''''''''''
|
|
|
|
- Required when ``[[file-locks]]`` is used (does not apply under per-package
|
|
locking)
|
|
- Array of strings
|
|
- An array of ``file-locks.name`` values which signify that the file is to be
|
|
installed when the corresponding ``[[file-locks]]`` table applies to the
|
|
environment.
|
|
- There MUST only be a single file with any one ``file-locks.name`` entry per
|
|
package, regardless of version.
|
|
|
|
|
|
``packages.files.index-url``
|
|
''''''''''''''''''''''''''''''''''''''''''
|
|
|
|
- Optional (although mutually exclusive with
|
|
``packages.index-url``)
|
|
- String
|
|
- The value has the same meaning as ``packages.index-url``.
|
|
- This key is available per-file to support :pep:`708` when some files override
|
|
what's provided by another `Simple Repository API`_ index.
|
|
|
|
|
|
``packages.files.url``
|
|
''''''''''''''''''''''
|
|
|
|
- Optional (and mutually exclusive with ``packages.path``)
|
|
- String
|
|
- URL where the file was found when the lock file was generated.
|
|
- Useful for documenting where the file was originally found and potentially
|
|
where to look for the file if it is not already downloaded/available.
|
|
- Installers MUST NOT assume the URL will always work, but installers MAY use
|
|
the URL if it happens to work.
|
|
|
|
|
|
``packages.path``
|
|
'''''''''''''''''
|
|
|
|
- Optional (and mutually exclusive with ``packages.path``)
|
|
- String
|
|
- File system path to where the file was found when the lock file was generated.
|
|
- Path may be relative to the lock file's location or absolute.
|
|
- Installers MUST NOT assume the path will always work, but installers MAY use
|
|
the path if it happens to work.
|
|
|
|
|
|
``packages.files.hash``
|
|
'''''''''''''''''''''''
|
|
|
|
- String
|
|
- The hash value of the file contents using the hash algorithm specified by
|
|
``hash-algorithm``.
|
|
- Used by installers to verify the file contents match what the locker worked
|
|
with.
|
|
|
|
|
|
``[packages.vcs]``
|
|
------------------
|
|
|
|
- Must be specified if ``[[packages.files]]`` and ``[packages.directory]`` is
|
|
not (although may be specified simultaneously with the other options).
|
|
- Table representing the version control system containing the package and
|
|
version.
|
|
|
|
|
|
``packages.vcs.type``
|
|
'''''''''''''''''''''
|
|
|
|
- String
|
|
- The type of version control system used.
|
|
- The valid values are specified by the
|
|
`registered VCSs <https://packaging.python.org/en/latest/specifications/direct-url-data-structure/#registered-vcs>`__
|
|
of the direct URL data structure.
|
|
|
|
|
|
``packages.vcs.url``
|
|
'''''''''''''''''''''''
|
|
|
|
- Mutually exclusive with ``packages.vcs.path``
|
|
- String
|
|
- The URL of where the repository was located when the lock file was generated.
|
|
|
|
|
|
``packages.vcs.path``
|
|
'''''''''''''''''''''
|
|
|
|
- Mutually exclusive with ``packages.vcs.url``
|
|
- String
|
|
- The file system path where the repository was located when the lock file was
|
|
generated.
|
|
- The path may be relative to the lock file or absolute.
|
|
|
|
|
|
``packages.vcs.commit``
|
|
'''''''''''''''''''''''
|
|
|
|
- String
|
|
- The commit ID for the repository which represents the package and version.
|
|
- The value MUST be immutable for the VCS for security purposes
|
|
(e.g. no Git tags).
|
|
|
|
|
|
``packages.vcs.lock``
|
|
'''''''''''''''''''''
|
|
|
|
- Required when ``[[file-locks]]`` is used
|
|
- An array of strings
|
|
- An array of ``file-locks.name`` values which signify that the repository at the
|
|
specified commit is to be installed when the corresponding ``[[file-locks]]``
|
|
table applies to the environment.
|
|
- A name in the array may only appear if no file listed in
|
|
``packages.files.lock`` contains the name for the same package, regardless of
|
|
version.
|
|
|
|
|
|
``[packages.directory]``
|
|
------------------------
|
|
|
|
- Must be specified if ``[[packages.files]]`` and ``[packages.vcs]`` is not
|
|
and doing per-package locking.
|
|
- Table representing a source tree found on the local file system.
|
|
|
|
|
|
``packages.directory.path``
|
|
'''''''''''''''''''''''''''
|
|
|
|
- String
|
|
- A local directory where a source tree for the package and version exists.
|
|
- The path MUST use forward slashes as the path separator.
|
|
- If the path is relative it is relative to the location of the lock file.
|
|
|
|
|
|
``packages.directory.editable``
|
|
'''''''''''''''''''''''''''''''
|
|
|
|
- Boolean
|
|
- Optional (defaults to ``false``)
|
|
- Flag representing whether the source tree should be installed as an editable
|
|
install.
|
|
|
|
|
|
``[packages.tool]``
|
|
-------------------
|
|
|
|
- Optional
|
|
- Table
|
|
- Similar usage as that of the ``[tool]`` table from the
|
|
`pyproject.toml specification`_ , but at the package version level instead of
|
|
at the lock file level (which is also available via ``[tool]``).
|
|
- Useful for scoping package version/release details (e.g., recording signing
|
|
identities to then use to verify package integrity separately from where the
|
|
package is hosted, prototyping future extensions to this file format, etc.).
|
|
|
|
|
|
``[tool]``
|
|
==========
|
|
|
|
- Optional
|
|
- Table
|
|
- Same usage as that of the equivalent ``[tool]`` table from the
|
|
`pyproject.toml specification`_.
|
|
|
|
|
|
--------
|
|
Examples
|
|
--------
|
|
|
|
Per-file locking
|
|
================
|
|
|
|
.. code-block:: toml
|
|
|
|
version = '1.0'
|
|
hash-algorithm = 'sha256'
|
|
dependencies = ['cattrs', 'numpy']
|
|
|
|
[[file-locks]]
|
|
name = 'CPython 3.12 on manylinux 2.17 x86-64'
|
|
marker-values = {}
|
|
wheel-tags = ['cp312-cp312-manylinux_2_17_x86_64', 'py3-none-any']
|
|
|
|
[[file-locks]]
|
|
name = 'CPython 3.12 on Windows x64'
|
|
marker-values = {}
|
|
wheel-tags = ['cp312-cp312-win_amd64', 'py3-none-any']
|
|
|
|
[[packages]]
|
|
name = 'attrs'
|
|
version = '23.2.0'
|
|
multiple-entries = false
|
|
description = 'Classes Without Boilerplate'
|
|
requires-python = '>=3.7'
|
|
dependents = ['cattrs']
|
|
dependencies = []
|
|
direct = false
|
|
files = [
|
|
{name = 'attrs-23.2.0-py3-none-any.whl', lock = ['CPython 3.12 on manylinux 2.17 x86-64', 'CPython 3.12 on Windows x64'], url = 'https://files.pythonhosted.org/packages/e0/44/827b2a91a5816512fcaf3cc4ebc465ccd5d598c45cefa6703fcf4a79018f/attrs-23.2.0-py3-none-any.whl', hash = '99b87a485a5820b23b879f04c2305b44b951b502fd64be915879d77a7e8fc6f1'}
|
|
]
|
|
|
|
[[packages]]
|
|
name = 'cattrs'
|
|
version = '23.2.3'
|
|
multiple-entries = false
|
|
description = 'Composable complex class support for attrs and dataclasses.'
|
|
requires-python = '>=3.8'
|
|
dependents = []
|
|
dependencies = ['attrs']
|
|
direct = false
|
|
files = [
|
|
{name = 'cattrs-23.2.3-py3-none-any.whl', lock = ['CPython 3.12 on manylinux 2.17 x86-64', 'CPython 3.12 on Windows x64'], url = 'https://files.pythonhosted.org/packages/b3/0d/cd4a4071c7f38385dc5ba91286723b4d1090b87815db48216212c6c6c30e/cattrs-23.2.3-py3-none-any.whl', hash = '0341994d94971052e9ee70662542699a3162ea1e0c62f7ce1b4a57f563685108'}
|
|
]
|
|
|
|
[[packages]]
|
|
name = 'numpy'
|
|
version = '2.0.1'
|
|
multiple-entries = false
|
|
description = 'Fundamental package for array computing in Python'
|
|
requires-python = '>=3.9'
|
|
dependents = []
|
|
dependencies = []
|
|
direct = false
|
|
files = [
|
|
{name = 'numpy-2.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl', lock = ['cp312-manylinux_2_17_x86_64'], url = 'https://files.pythonhosted.org/packages/2c/f3/61eeef119beb37decb58e7cb29940f19a1464b8608f2cab8a8616aba75fd/numpy-2.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl', hash = '6790654cb13eab303d8402354fabd47472b24635700f631f041bd0b65e37298a'},
|
|
{name = 'numpy-2.0.1-cp312-cp312-win_amd64.whl', lock = ['cp312-win_amd64'], url = 'https://files.pythonhosted.org/packages/b5/59/f6ad30785a6578ad85ed9c2785f271b39c3e5b6412c66e810d2c60934c9f/numpy-2.0.1-cp312-cp312-win_amd64.whl', hash = 'bb2124fdc6e62baae159ebcfa368708867eb56806804d005860b6007388df171'}
|
|
]
|
|
|
|
|
|
Per-package locking
|
|
===================
|
|
|
|
Some values for ``packages.files.url`` left out to make creating this
|
|
example more easily as it was done by hand.
|
|
|
|
.. code-block:: toml
|
|
|
|
version = '1.0'
|
|
hash-algorithm = 'sha256'
|
|
dependencies = ['cattrs', 'numpy']
|
|
|
|
[package-lock]
|
|
requires-python = ">=3.9"
|
|
|
|
|
|
[[packages]]
|
|
name = 'attrs'
|
|
version = '23.2.0'
|
|
multiple-entries = false
|
|
description = 'Classes Without Boilerplate'
|
|
requires-python = '>=3.7'
|
|
dependents = ['cattrs']
|
|
dependencies = []
|
|
direct = false
|
|
files = [
|
|
{name = 'attrs-23.2.0-py3-none-any.whl', lock = ['cp312-manylinux_2_17_x86_64', 'cp312-win_amd64'], url = 'https://files.pythonhosted.org/packages/e0/44/827b2a91a5816512fcaf3cc4ebc465ccd5d598c45cefa6703fcf4a79018f/attrs-23.2.0-py3-none-any.whl', hash = '99b87a485a5820b23b879f04c2305b44b951b502fd64be915879d77a7e8fc6f1'}
|
|
]
|
|
|
|
[[packages]]
|
|
name = 'cattrs'
|
|
version = '23.2.3'
|
|
multiple-entries = false
|
|
description = 'Composable complex class support for attrs and dataclasses.'
|
|
requires-python = '>=3.8'
|
|
dependents = []
|
|
dependencies = ['attrs']
|
|
direct = false
|
|
files = [
|
|
{name = 'cattrs-23.2.3-py3-none-any.whl', lock = ['cp312-manylinux_2_17_x86_64', 'cp312-win_amd64'], url = 'https://files.pythonhosted.org/packages/b3/0d/cd4a4071c7f38385dc5ba91286723b4d1090b87815db48216212c6c6c30e/cattrs-23.2.3-py3-none-any.whl', hash = '0341994d94971052e9ee70662542699a3162ea1e0c62f7ce1b4a57f563685108'}
|
|
]
|
|
|
|
[[packages]]
|
|
name = 'numpy'
|
|
version = '2.0.1'
|
|
multiple-entries = false
|
|
description = 'Fundamental package for array computing in Python'
|
|
requires-python = '>=3.9'
|
|
dependents = []
|
|
dependencies = []
|
|
direct = false
|
|
files = [
|
|
{name = "numpy-2.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:6bf4e6f4a2a2e26655717a1983ef6324f2664d7011f6ef7482e8c0b3d51e82ac"},
|
|
{name = "numpy-2.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:7d6fddc5fe258d3328cd8e3d7d3e02234c5d70e01ebe377a6ab92adb14039cb4"},
|
|
{name = "numpy-2.0.1-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:5daab361be6ddeb299a918a7c0864fa8618af66019138263247af405018b04e1"},
|
|
{name = "numpy-2.0.1-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:ea2326a4dca88e4a274ba3a4405eb6c6467d3ffbd8c7d38632502eaae3820587"},
|
|
{name = "numpy-2.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:529af13c5f4b7a932fb0e1911d3a75da204eff023ee5e0e79c1751564221a5c8"},
|
|
{name = "numpy-2.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6790654cb13eab303d8402354fabd47472b24635700f631f041bd0b65e37298a"},
|
|
{name = "numpy-2.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:cbab9fc9c391700e3e1287666dfd82d8666d10e69a6c4a09ab97574c0b7ee0a7"},
|
|
{name = "numpy-2.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:99d0d92a5e3613c33a5f01db206a33f8fdf3d71f2912b0de1739894668b7a93b"},
|
|
{name = "numpy-2.0.1-cp312-cp312-win32.whl", hash = "sha256:173a00b9995f73b79eb0191129f2455f1e34c203f559dd118636858cc452a1bf"},
|
|
{name = "numpy-2.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:bb2124fdc6e62baae159ebcfa368708867eb56806804d005860b6007388df171"},
|
|
]
|
|
|
|
|
|
------------------------
|
|
Expectations for Lockers
|
|
------------------------
|
|
|
|
- When creating a lock file for ``[package-lock]``, the locker SHOULD read
|
|
the metadata of **all** files that end up being listed in
|
|
``[[packages.files]]`` to make sure all potential metadata cases are covered
|
|
- If a locker chooses not to check every file for its metadata, the tool MUST
|
|
either provide the user with the option to have all files checked (whether
|
|
that is opt-in or out is left up to the tool), or the user is somehow notified
|
|
that such a standards-violating shortcut is being taken (whether this is by
|
|
documentation or at runtime is left to the tool)
|
|
- Lockers MAY want to provide a way to let users provide the information
|
|
necessary to install for multiple environments at once when doing per-file
|
|
locking, e.g. supporting a JSON file format which specifies wheel tags and
|
|
marker values much like in ``[[file-locks]]`` for which multiple files can be
|
|
specified, which could then be directly recorded in the corresponding
|
|
``[[file-locks]]`` table (if it allowed for unambiguous per-file locking
|
|
environment selection)
|
|
|
|
.. code-block:: JSON
|
|
|
|
{
|
|
"marker-values": {"<marker>": "<value>"},
|
|
"wheel-tags": ["<tag>"]
|
|
}
|
|
|
|
|
|
---------------------------
|
|
Expectations for Installers
|
|
---------------------------
|
|
|
|
- Installers MAY support installation of non-binary files
|
|
(i.e. source distributions, source trees, and VCS), but are not required to.
|
|
- Installers MUST provide a way to avoid non-binary file installation for
|
|
reproducibility and security purposes.
|
|
- Installers SHOULD make it opt-in to use non-binary file installation to
|
|
facilitate a secure-by-default approach.
|
|
- Under per-file locking, if what to install is ambiguous then the installer
|
|
MUST raise an error.
|
|
|
|
|
|
Installing for per-file locking
|
|
===============================
|
|
|
|
- If no compatible environment is found an error MUST be raised.
|
|
- If multiple environments are found to be compatible then an error MUST be
|
|
raised.
|
|
- If a ``[[packages.files]]`` contains multiple matching entries an error MUST
|
|
be raised due to ambiguity for what is to be installed.
|
|
- If multiple ``[[packages]]`` entries for the same package have matching files
|
|
an error MUST be raised due to ambiguity for what is to be installed.
|
|
|
|
|
|
Example workflow
|
|
----------------
|
|
|
|
- Iterate through each ``[[file-locks]]`` table to find the one that applies to
|
|
the environment being installed for.
|
|
- If no compatible environment is found an error MUST be raised.
|
|
- If multiple environments are found to be compatible then an error MUST be
|
|
raised.
|
|
- For the compatible environment, iterate through each entry in
|
|
``[[packages]]``.
|
|
- For each ``[[packages]]`` entry, iterate through ``[[packages.files]]`` to
|
|
look for any files with ``file-locks.name`` listed in ``packages.files.lock``.
|
|
- If a file is found with a matching lock name, add it to the list of candidate
|
|
files to install and move on to the next ``[[packages]]`` entry.
|
|
- If no file is found then check if ``packages.vcs.lock`` contains a match (no
|
|
match is also acceptable).
|
|
- If a ``[[packages.files]]`` contains multiple matching entries an error MUST
|
|
be raised due to ambiguity for what is to be installed.
|
|
- If multiple ``[[packages]]`` entries for the same package have matching files
|
|
an error MUST be raised due to ambiguity for what is to be installed.
|
|
- Find and verify the candidate files and/or VCS entries based on their hash or
|
|
commit ID as appropriate.
|
|
- Install the candidate files.
|
|
|
|
|
|
Installing for package locking
|
|
==============================
|
|
|
|
- Verify that the environment is compatible with
|
|
``package-lock.requires-python``; if it isn't an error MUST be raised.
|
|
- If no way to install a required package is found, an error MUST be raised.
|
|
|
|
|
|
Example workflow
|
|
----------------
|
|
|
|
- Verify that the environment is compatible with
|
|
``package-lock.requires-python``; if it isn't an error MUST be raised.
|
|
- Iterate through each entry in ``[packages]]``.
|
|
- For each entry, if there's a ``packages.marker`` key, evaluate the expression.
|
|
|
|
- If the expression is false, then move on.
|
|
- Otherwise the package entry must be installed somehow.
|
|
|
|
- Iterate through the files listed in ``[[packages.files]]``, looking for the
|
|
"best" file to install.
|
|
- If no file is found, check for ``[packages.vcs]``.
|
|
- It no VCS is found, check for ``packages.directory``.
|
|
- If no match is found, an error MUST be raised.
|
|
- Find and verify the selected files and/or VCS entries based on their hash or
|
|
commit ID as appropriate.
|
|
- Install the selected files.
|
|
|
|
|
|
=======================
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
Because there is no preexisting lock file format, there are no explicit
|
|
backwards-compatibility concerns in terms of Python packaging standards.
|
|
|
|
As for packaging tools themselves, that will be a per-tool decision. For tools
|
|
that don't document their lock file format, they could choose to simply start
|
|
using the format internally and then transition to saving their lock files with
|
|
a name supported by this PEP. For tools with a preexisting, documented format,
|
|
they could provide an option to choose which format to emit.
|
|
|
|
|
|
=====================
|
|
Security Implications
|
|
=====================
|
|
|
|
The hope is that by standardizing on a lock file format that starts from a
|
|
security-first posture it will help make overall packaging installation safer.
|
|
However, this PEP does not solve all potential security concerns.
|
|
|
|
One potential concern is tampering with a lock file. If a lock file is not kept
|
|
in source control and properly audited, a bad actor could change the file in
|
|
nefarious ways (e.g. point to a malware version of a package). Tampering could
|
|
also occur in transit to e.g. a cloud provider who will perform an installation
|
|
on the user's behalf. Both could be mitigated by signing the lock file either
|
|
within the file in a ``[tool]`` entry or via a side channel external to the lock
|
|
file itself.
|
|
|
|
This PEP does not do anything to prevent a user from installing an incorrect
|
|
packages. While including many details to help in auditing a package's inclusion,
|
|
there isn't any mechanism to stop e.g. name confusion attacks via typosquatting.
|
|
Lockers may be able to provide some UX to help with this (e.g. by providing
|
|
download counts for a package).
|
|
|
|
|
|
=================
|
|
How to Teach This
|
|
=================
|
|
|
|
Users should be informed that when they ask to install some package, that
|
|
package may have its own dependencies, those dependencies may have dependencies,
|
|
and so on. Without writing down what gets installed as part of installing the
|
|
package they requested, things could change from underneath them (e.g. package
|
|
versions). Changes to the underlying dependencies can lead to accidental
|
|
breakage of their code. Lock files help deal with that by providing a way to
|
|
write down what was installed.
|
|
|
|
Having what to install written down also helps in collaborating with others. By
|
|
agreeing to a lock file's contents, everyone ends up with the same packages
|
|
installed. This helps make sure no one relies on e.g. an API that's only
|
|
available in a certain version that not everyone working on the project has
|
|
installed.
|
|
|
|
Lock files also help with security by making sure you always get the same files
|
|
installed and not a malicious one that someone may have slipped in. It also
|
|
lets one be more deliberate in upgrading their dependencies and thus making sure
|
|
the change is on purpose and not one slipped in by a bad actor.
|
|
|
|
|
|
========================
|
|
Reference Implementation
|
|
========================
|
|
|
|
A rough proof-of-concept for per-file locking can be found at
|
|
https://github.com/brettcannon/mousebender/tree/pep. An example lock file can
|
|
be seen at
|
|
https://github.com/brettcannon/mousebender/blob/pep/pylock.example.toml.
|
|
|
|
For per-package locking, PDM_ indirectly proves the approach works as this PEP
|
|
maintains equivalent data as PDM does for its lock files (whose format was
|
|
inspired by Poetry_). Some of the details of PDM's approach are covered in
|
|
https://frostming.com/en/2024/pdm-lockfile/ and
|
|
https://frostming.com/en/2024/pdm-lock-strategy/.
|
|
|
|
|
|
==============
|
|
Rejected Ideas
|
|
==============
|
|
|
|
----------------------------
|
|
Only support package locking
|
|
----------------------------
|
|
|
|
At one point it was suggested to skip per-file locking and only support package
|
|
locking as the former was not explicitly supported in the larger Python
|
|
ecosystem while the latter was. But because this PEP has taken the position
|
|
that security is important and per-file locking is the more secure of the two
|
|
options, leaving out per-file locking was never considered.
|
|
|
|
|
|
-------------------------------------------------------------------------------------
|
|
Specifying a new core metadata version that requires consistent metadata across files
|
|
-------------------------------------------------------------------------------------
|
|
|
|
At one point, to handle the issue of metadata varying between files and thus
|
|
require examining every released file for a package and version for accurate
|
|
locking results, the idea was floated to introduce a new core metadata version
|
|
which would require all metadata for all wheel files be the same for a single
|
|
version of a packages. Ultimately, though, it was deemed unnecessary as this PEP
|
|
will put pressure on people to make files consistent for performance reasons or
|
|
to make indexes provide all the metadata separate from the wheel files
|
|
themselves. As well, there's no easy enforcement mechanism, and so community
|
|
expectation would work as well as a new metadata version.
|
|
|
|
|
|
-------------------------------------------
|
|
Have the installer do dependency resolution
|
|
-------------------------------------------
|
|
|
|
In order to support a format more akin to how Poetry worked when this PEP was
|
|
drafted, it was suggested that lockers effectively record the packages and their
|
|
versions which may be necessary to make an install work in any possible
|
|
scenario, and then the installer resolves what to install. But that complicates
|
|
auditing a lock file by requiring much more mental effort to know what packages
|
|
may be installed in any given scenario. Also, one of the Poetry developers
|
|
`suggested <https://discuss.python.org/t/lock-files-again-but-this-time-w-sdists/46593/83>`__
|
|
that markers as represented in the package locking approach of this PEP may be
|
|
sufficient to cover the needs of Poetry. Not having the installer do a
|
|
resolution also simplifies their implementation, centralizing complexity in
|
|
lockers.
|
|
|
|
|
|
-----------------------------------------
|
|
Requiring specific hash algorithm support
|
|
-----------------------------------------
|
|
|
|
It was proposed to require a baseline hash algorithm for the files. This was
|
|
rejected as no other Python packaging specification requires specific hash
|
|
algorithm support. As well, the minimum hash algorithm suggested may eventually
|
|
become an outdated/unsafe suggestion, requiring further updates. In order to
|
|
promote using the best algorithm at all times, no baseline is provided to avoid
|
|
simply defaulting to the baseline in tools without considering the security
|
|
ramifications of that hash algorithm.
|
|
|
|
|
|
-----------
|
|
File naming
|
|
-----------
|
|
|
|
Using ``*.pylock.toml`` as the file name
|
|
========================================
|
|
|
|
It was proposed to put the ``pylock`` constant part of the file name after the
|
|
identifier for the purpose of the lock file. It was decided not to do this so
|
|
that lock files would sort together when looking at directory contents instead
|
|
of purely based on their purpose which could spread them out in a directory.
|
|
|
|
|
|
Using ``*.pylock`` as the file name
|
|
===================================
|
|
|
|
Not using ``.toml`` as the file extension and instead making it ``.pylock``
|
|
itself was proposed. This was decided against so that code editors would know
|
|
how to provide syntax highlighting to a lock file without having special
|
|
knowledge about the file extension.
|
|
|
|
|
|
Not having a naming convention for the file
|
|
===========================================
|
|
|
|
Having no requirements or guidance for a lock file's name was considered, but
|
|
ultimately rejected. By having a standardized naming convention it makes it easy
|
|
to identify a lock file for both a human and a code editor. This helps
|
|
facilitate discovery when e.g. a tool wants to know all of the lock files that
|
|
are available.
|
|
|
|
|
|
-----------
|
|
File format
|
|
-----------
|
|
|
|
Use JSON over TOML
|
|
==================
|
|
|
|
Since having a format that is machine-writable was a goal of this PEP, it was
|
|
suggested to use JSON. But it was deemed less human-readable than TOML while
|
|
not improving on the machine-writable aspect enough to warrant the change.
|
|
|
|
|
|
Use YAML over TOML
|
|
==================
|
|
|
|
Some argued that YAML met the machine-writable/human-readable requirement in a
|
|
better way than TOML. But as that's subjective and ``pyproject.toml`` already
|
|
existed as the human-writable file used by Python packaging standards it was
|
|
deemed more important to keep using TOML.
|
|
|
|
|
|
----------
|
|
Other keys
|
|
----------
|
|
|
|
Multiple hashes per file
|
|
========================
|
|
|
|
An initial version of this PEP proposed supporting multiple hashes per file. The
|
|
idea was to allow one to choose which hashing algorithm they wanted to go with
|
|
when installing. But upon reflection it seemed like an unnecessary complication
|
|
as there was no guarantee the hashes provided would satisfy the user's needs.
|
|
As well, if the single hash algorithm used in the lock file wasn't sufficient,
|
|
rehashing the files involved as a way to migrate to a different algorithm didn't
|
|
seem insurmountable.
|
|
|
|
|
|
Hashing the contents of the lock file itself
|
|
============================================
|
|
|
|
Hashing the contents of the bytes of the file and storing hash value within the
|
|
file itself was proposed at some point. This was removed to make it easier
|
|
when merging changes to the lock file as each merge would have to recalculate
|
|
the hash value to avoid a merge conflict.
|
|
|
|
Hashing the semantic contents of the file was also proposed, but it would lead
|
|
to the same merge conflict issue.
|
|
|
|
Regardless of which contents were hashed, either approach could have the hash
|
|
value stored outside of the file if such a hash was desired.
|
|
|
|
|
|
Recording the creation date of the lock file
|
|
============================================
|
|
|
|
To know how potentially stale the lock file was, an earlier proposal suggested
|
|
recording the creation date of the lock file. But for some same merge conflict
|
|
reasons as storing the hash of the file contents, this idea was dropped.
|
|
|
|
|
|
Recording the package indexes used
|
|
==================================
|
|
|
|
Recording what package indexes were used by the locker to decide what to lock
|
|
for was considered. In the end, though, it was rejected as it was deemed
|
|
unnecessary bookkeeping.
|
|
|
|
|
|
Locking build requirements for sdists
|
|
=====================================
|
|
|
|
An earlier version of this PEP tried to lock the build requirements for sdists
|
|
under a ``packages.build-requires`` key. Unfortunately it confused enough people
|
|
about how it was expected to operate and there were enough edge case issues to
|
|
decide it wasn't worth trying to do in this PEP upfront. Instead, a future PEP
|
|
could propose a solution.
|
|
|
|
|
|
===========
|
|
Open Issues
|
|
===========
|
|
|
|
N/A
|
|
|
|
|
|
================
|
|
Acknowledgements
|
|
================
|
|
|
|
Thanks to everyone who participated in the discussions in
|
|
https://discuss.python.org/t/lock-files-again-but-this-time-w-sdists/46593/,
|
|
especially Alyssa Coghlan who probably caused the biggest structural shifts from
|
|
the initial proposal.
|
|
|
|
Also thanks to Randy Döring, Seth Michael Larson, Paul Moore, and Ofek Lev for
|
|
providing feedback on a draft version of this PEP.
|
|
|
|
|
|
=========
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
.. _core metadata: https://packaging.python.org/en/latest/specifications/core-metadata/
|
|
.. _Dependabot: https://docs.github.com/en/code-security/dependabot
|
|
.. _dependency specifiers: https://packaging.python.org/en/latest/specifications/dependency-specifiers/
|
|
.. _direct URL reference: https://packaging.python.org/en/latest/specifications/direct-url/
|
|
.. _environment markers: https://packaging.python.org/en/latest/specifications/dependency-specifiers/#environment-markers
|
|
.. _normalized name: https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization
|
|
.. _PDM: https://pypi.org/project/pdm/
|
|
.. _pip-tools: https://pypi.org/project/pip-tools/
|
|
.. _Poetry: https://python-poetry.org/
|
|
.. _project index: https://packaging.python.org/en/latest/specifications/simple-repository-api/#project-list
|
|
.. _pyproject.toml specification: https://packaging.python.org/en/latest/specifications/pyproject-toml/#pyproject-toml-specification
|
|
.. _Simple Repository API: https://packaging.python.org/en/latest/specifications/simple-repository-api/
|
|
.. _software bill of materials: https://www.cisa.gov/sbom
|
|
.. _TOML: https://toml.io/
|
|
.. _uv: https://github.com/astral-sh/uv
|
|
.. _version specifiers: https://packaging.python.org/en/latest/specifications/version-specifiers/
|
|
.. _wheel tags: https://packaging.python.org/en/latest/specifications/platform-compatibility-tags/
|