PEP: 710
Title: Recording the provenance of installed packages
Author: Fridolín Pokorný <fridolin.pokorny at gmail.com>
Sponsor: Donald Stufft <donald@stufft.io>
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-710-recording-the-provenance-of-installed-packages/25428
Status: Draft
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 27-Mar-2023
Post-History: `03-Dec-2021 <https://discuss.python.org/t/pip-installation-reports/12316>`__,
              `30-Jan-2023 <https://discuss.python.org/t/pre-pep-recording-provenance-of-installed-packages/23340>`__,
              `14-Mar-2023 <https://discuss.python.org/t/draft-pep-recording-provenance-of-installed-packages/24838>`__,
              `03-Apr-2023 <https://discuss.python.org/t/pep-710-recording-the-provenance-of-installed-packages/25428>`__

Abstract
========

This PEP describes a way to record the provenance of installed Python distributions.
The record is created by an installer and is available to users in
the form of a JSON file ``provenance_url.json`` in the ``.dist-info`` directory.
The JSON file captures additional metadata to allow recording a URL to a
:term:`distribution package` together with the installed distribution hash. This
proposal is built on top of :pep:`610` following
:ref:`its corresponding canonical PyPA spec <packaging:direct-url>` and
complements ``direct_url.json`` with ``provenance_url.json`` for when packages
are identified by a name, and optionally a version.

Motivation
==========

Installing a Python :term:`Project` involves downloading a :term:`Distribution Package`
from a :term:`Package Index`
and extracting its content to an appropriate place. After the installation
process is done, information about the release artifact used as well as its source
is generally lost. However, there are use cases for keeping records of
distributions used for installing packages and their provenance.

Python wheels can be built with different compiler flags or can support
different wheel tags. In both cases, installers might consider multiple
wheels (possibly from different package indexes), and immediately finding
out which wheel file was actually used during the installation might be
helpful. This way, developers can use information about wheels to debug
issues and make sure the desired wheel was actually installed. Another use
case could be tools reporting installed software, such as tools producing an
SBOM (Software Bill of Materials), which might then give more accurate
reports. Yet another use case could be reconstruction of the
Python environment by pinning each installed package to a specific distribution
artifact consumed from a Python package index.

Rationale
=========

The motivation described in this PEP is an extension of that in :pep:`610`.
In addition to recording provenance information for packages installed using a direct URL,
installers should also do so for packages installed by name
(and optionally version) from Python package indexes.

The idea described in this PEP originated in a tool called `micropipenv`_
that is used to install
:term:`distribution packages <Distribution Package>` in containerized
environments (see the reported issue `thoth-station/micropipenv#206`_).
Currently, the assembled containerized application does not implicitly carry
information about the provenance of installed distribution packages
(unless these are installed from full URLs and recorded via ``direct_url.json``).
This requires container image suppliers to link
container images with the corresponding build process, its configuration and
the application source code for checking requirements files in cases when
software present in containerized environments needs to be audited.

The `subsequent discussion in the Discourse thread
<https://discuss.python.org/t/12316>`__ also brought up
pip's new ``--report`` option that can
`generate a detailed JSON report <pip_installation_report_>`__ about
the installation process. This option could help with the provenance problem
this PEP approaches. Nevertheless, this option needs to be *explicitly* passed
to pip to obtain the provenance information, and includes additional metadata that
might not be necessary for checking the provenance (such as Python version
requirements of each distribution package). Also, this option is
specific to pip as of the writing of this PEP.

Note the current :ref:`spec for recording installed packages
<packaging:recording-installed-packages>` defines a ``RECORD`` file that
records installed files, but not the distribution artifact from which these
files were obtained. Auditing installed artifacts can be performed
based on matching the entries listed in the ``RECORD`` file. However, this
technique requires a pre-computed database of files each artifact provides or a
comparison with the actual artifact content. Both approaches are relatively
expensive and time-consuming operations which could be eliminated with the
proposed ``provenance_url.json`` file.

Recording provenance information for installed distribution packages,
both those obtained from direct URLs and by name/version from an index,
can simplify auditing Python environments in general, beyond just
the specific use case for containerized applications mentioned earlier.
The community project `pip-audit
<https://github.com/pypa/pip-audit>`__ expressed possible interest in
`pypa/pip-audit#170`_.

Specification
=============

The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL”
in this document are to be interpreted as described in :rfc:`2119`.

The ``provenance_url.json`` file SHOULD be created in the ``.dist-info``
directory by installers when installing a :term:`Distribution Package`
specified by name (and optionally by :term:`Version Specifier`).

This file MUST NOT be created when installing a distribution package from a requirement
specifying a direct URL reference (including a VCS URL).

Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from :pep:`610`)
may be present in a given ``.dist-info`` directory; installers MUST NOT add both.
|
|
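Seen from the consumer side, the exclusivity rule above could be handled as in
the following minimal sketch. It is an illustration only, not part of the
specification: the helper name ``load_origin`` is hypothetical, and the sketch
assumes the caller already knows the path of a ``.dist-info`` directory.

```python
import json
from pathlib import Path

# The two origin files that must not coexist in one .dist-info directory.
ORIGIN_FILES = ("direct_url.json", "provenance_url.json")


def load_origin(dist_info: Path):
    """Return ``(file_name, parsed_json)`` or ``None`` if no origin file exists.

    ``load_origin`` is a hypothetical helper used only for illustration.
    """
    present = [name for name in ORIGIN_FILES if (dist_info / name).is_file()]
    if len(present) > 1:
        # Installers MUST NOT add both files, so treat this as an error.
        raise ValueError(f"{dist_info} contains both origin files")
    if not present:
        return None
    text = (dist_info / present[0]).read_text(encoding="utf-8")
    return present[0], json.loads(text)
```

A tool could call ``load_origin`` for each installed distribution and branch on
the returned file name to decide whether the distribution came from a direct
URL or from an index.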
|
|
The ``provenance_url.json`` JSON file MUST be a dictionary, compliant with
:rfc:`8259` and UTF-8 encoded.

If present, it MUST contain exactly two keys. The first one is ``url``, with
type ``string``. The second key MUST be ``archive_info`` with a value defined
below.

The value of the ``url`` key MUST be the URL from which the distribution package was downloaded. If a wheel is
built from a source distribution, the ``url`` value MUST be the URL from which
the source distribution was downloaded. If a wheel is downloaded and installed directly,
the ``url`` field MUST be the URL from which the wheel was downloaded.
As in the :ref:`direct URL origin specification<packaging:direct-url>`, the ``url`` value
MUST be stripped of any sensitive authentication information for security reasons.

The user:password section of the URL MAY however be composed of environment
variables, matching the following regular expression:

.. code-block:: text

    \$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})?

Additionally, the user:password section of the URL MAY be a well-known,
non-security sensitive string. A typical example is ``git`` in the case of a
URL such as ``ssh://git@gitlab.com``.
|
|
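The credential rules above can be sketched in a few lines of Python. This is a
minimal illustration rather than any installer's actual implementation: the
function name ``redact_url`` and the ``SAFE_USERS`` allow-list are assumptions
made for this example, and only the regular expression is taken from the text
above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Regular expression quoted from the specification: ${USER} or ${USER}:${PASSWORD}.
ENV_VAR_AUTH = re.compile(r"\$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})?")

# Hypothetical allow-list of well-known, non-sensitive user names.
SAFE_USERS = frozenset({"git"})


def redact_url(url: str) -> str:
    """Return ``url`` with a sensitive ``user:password`` section removed."""
    parts = urlsplit(url)
    if "@" not in parts.netloc:
        # No user information at all, nothing to strip.
        return url
    auth, _, host = parts.netloc.rpartition("@")
    if ENV_VAR_AUTH.fullmatch(auth) or auth in SAFE_USERS:
        # Environment-variable placeholders and well-known names may stay.
        return url
    return urlunsplit(parts._replace(netloc=host))
```

With these assumptions, ``https://user:secret@example.com/app-1.0.whl`` loses
its credentials, while ``ssh://git@gitlab.com`` and URLs using
``${USER}:${PASSWORD}`` placeholders are returned unchanged.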
|
|
The value of ``archive_info`` MUST be a dictionary with a single key
``hashes``. The value of ``hashes`` is a dictionary mapping hash function names to a
hex-encoded digest of the file referenced by the ``url`` value. Multiple hashes
can be included, and it is up to the consumer to decide what to do with
multiple hashes (it may validate all of them or a subset of them, or nothing at
all).
|
|
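As an illustration of the consumer's choice described above, the following
sketch checks every recorded digest against the bytes of an archive and leaves
the decision about the results to the caller. The function name
``verify_digests`` is hypothetical, and the sketch assumes the archive fits in
memory.

```python
import hashlib


def verify_digests(archive: bytes, hashes: dict) -> dict:
    """Map each recorded hash name to whether its digest matches ``archive``.

    ``hashes`` is the ``archive_info["hashes"]`` dictionary from a
    ``provenance_url.json`` file.
    """
    results = {}
    for name, expected in hashes.items():
        # hashlib.new() raises ValueError for hash names it does not know.
        actual = hashlib.new(name, archive).hexdigest()
        results[name] = actual == expected
    return results
```

A strict consumer might require ``all(results.values())``, while a more lenient
one might only look at ``results["sha256"]``.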
|
|
Each hash MUST be one of the single-argument hashes provided by
:data:`py3.11:hashlib.algorithms_guaranteed`, excluding ``sha1`` and ``md5`` which MUST NOT be used.
As of Python 3.11, with ``shake_128`` and ``shake_256`` excluded
for being multi-argument, the allowed set of hashes is:

.. code-block:: python

    >>> import hashlib
    >>> sorted(hashlib.algorithms_guaranteed - {"shake_128", "shake_256", "sha1", "md5"})
    ['blake2b', 'blake2s', 'sha224', 'sha256', 'sha384', 'sha3_224', 'sha3_256', 'sha3_384', 'sha3_512', 'sha512']

Each hash MUST be referenced by the canonical name of the hash, always lower case.

Hashes ``sha1`` and ``md5`` MUST NOT be present, due to the security
limitations of these hash algorithms. Conversely, hash ``sha256`` SHOULD
be included.
|
|
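Putting the rules above together, an installer could assemble the file content
roughly as follows. This is a sketch under stated assumptions, not the
reference implementation: the function name ``build_provenance`` is
hypothetical, the archive is assumed to be available as bytes, and the URL is
assumed to have been stripped of credentials already.

```python
import hashlib
import json

# Allowed hash names, computed as in the specification text above.
ALLOWED_HASHES = frozenset(
    hashlib.algorithms_guaranteed - {"shake_128", "shake_256", "sha1", "md5"}
)


def build_provenance(url: str, archive: bytes, algorithms=("sha256",)) -> dict:
    """Build the ``provenance_url.json`` content for a downloaded archive."""
    hashes = {}
    for name in algorithms:
        if name not in ALLOWED_HASHES:
            # Rejects sha1, md5, the shake variants and non-canonical spellings.
            raise ValueError(f"hash {name!r} is not allowed")
        hashes[name] = hashlib.new(name, archive).hexdigest()
    return {"archive_info": {"hashes": hashes}, "url": url}


def dump_provenance(record: dict) -> bytes:
    """Serialize the record as UTF-8 encoded JSON."""
    return json.dumps(record, indent=2, sort_keys=True).encode("utf-8")
```

The default of ``("sha256",)`` follows the recommendation that ``sha256``
SHOULD be included; callers can pass further allowed names to record multiple
digests.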
|
|
Installers that cache distribution packages from an index SHOULD keep
information related to the cached distribution artifact, so that
the ``provenance_url.json`` file can be created even when installing distribution packages
from the installer's cache.

Backwards Compatibility
=======================

Following the :ref:`packaging:recording-installed-packages` specification,
installers may keep additional installer-specific files in the ``.dist-info``
directory. To check that this PEP does not cause any backwards compatibility
issues, a `comprehensive survey of installers and libraries <710-tool-survey_>`_
was conducted. It found no current tools that use a similarly-named file,
and no other major feasibility concerns.

The :ref:`Wheel specification <packaging:binary-distribution-format>` lists files that can be
present in the ``.dist-info`` directory. None of these file names collide with
the proposed ``provenance_url.json`` file from this PEP.

Presence of provenance_url.json in installers and libraries
-----------------------------------------------------------

A comprehensive survey of the existing installers, libraries, and dependency
managers in the Python ecosystem analyzed the implications of adding support for
``provenance_url.json`` to each tool.
In summary, no major backwards compatibility issues, conflicts or feasibility blockers
were found as of the time of writing of this PEP. More details about the survey
can be found in the `Appendix: Survey of installers and libraries`_ section.

Compatibility with direct_url.json
----------------------------------

This proposal does not make any changes to the ``direct_url.json`` file
described in :pep:`610` and :ref:`its corresponding canonical PyPA spec
<packaging:direct-url>`.

The content of the ``provenance_url.json`` file was designed to eventually
allow installers to reuse some of the logic supporting ``direct_url.json`` when a
direct URL refers to a source archive or a wheel.

The main difference between the ``provenance_url.json`` and ``direct_url.json``
files is the set of mandatory keys and their values in the ``provenance_url.json`` file.
This helps make sure consumers of the ``provenance_url.json`` file can rely
on its content, if the file is present in the ``.dist-info`` directory.

Security Implications
=====================

One of the main security features of the ``provenance_url.json`` file is the
ability to audit installed artifacts in Python environments. Tools can check
which Python package indexes were used to install Python :term:`distribution
packages <Distribution Package>` as well as the hash digests of their release
artifacts.

As an example, we can take the recent compromised dependency chain in `the
PyTorch incident <https://pytorch.org/blog/compromised-nightly-dependency/>`__.
The PyTorch index provided a package named ``torchtriton``. An attacker
published ``torchtriton`` on PyPI, which ran a malicious binary. By checking
the URL of the installed Python distribution stated in the
``provenance_url.json`` file, tools can automatically check the source of the
installed Python distribution. In the case of the PyTorch incident, the URL of
``torchtriton`` should point to the PyTorch index, not PyPI. Tools can help
identify such malicious Python distributions by checking the
installed Python distribution URL. A more exact check can also include the hash
of the installed Python distribution stated in the ``provenance_url.json``
file. Such checks on hashes can be helpful for mirrored Python package indexes
where Python distributions are not distinguishable by their source URLs, making
sure only desired Python package distributions are installed.
|
|
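The URL check described above could look roughly like the following sketch. It
is illustrative only: the function names and the shape of the
``expected_hosts`` policy are assumptions made for this example, project names
are compared without normalization, and only ``importlib.metadata`` from the
standard library is used to locate installed distributions.

```python
import json
from importlib.metadata import distributions
from urllib.parse import urlsplit


def origin_host(record: dict) -> str:
    """Return the host name of the ``url`` stored in a provenance record."""
    return urlsplit(record["url"]).hostname or ""


def unexpected_origins(expected_hosts: dict) -> list:
    """List ``(project, host)`` pairs whose recorded host differs from policy.

    ``expected_hosts`` is a hypothetical policy mapping project names to the
    host each project is expected to be downloaded from.
    """
    findings = []
    for dist in distributions():
        text = dist.read_text("provenance_url.json")
        if text is None:
            # No provenance recorded, for example a direct URL install.
            continue
        name = dist.metadata["Name"]
        host = origin_host(json.loads(text))
        expected = expected_hosts.get(name)
        if expected is not None and host != expected:
            findings.append((name, host))
    return findings
```

For the incident above, a policy such as
``{"torchtriton": "download.pytorch.org"}`` would flag a ``torchtriton``
distribution whose recorded URL points to a different host.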
|
|
A malicious actor can intentionally adjust the content of
``provenance_url.json`` to possibly hide provenance information of the
installed Python distribution. A security check which would uncover such
malicious activity is beyond the scope of this PEP as it would require monitoring
actions on the filesystem and eventually reviewing user or file permissions.

How to Teach This
=================

The ``provenance_url.json`` metadata file is intended for tools and is not
directly visible to end users.

Examples
========

Examples of a valid provenance_url.json
---------------------------------------

A valid ``provenance_url.json`` listing multiple hashes:

.. code-block:: json

    {
      "archive_info": {
        "hashes": {
          "blake2s": "fffeaf3d0bd71dc960ca2113af890a2f2198f2466f8cd58ce4b77c1fc54601ff",
          "sha256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f",
          "sha3_256": "c856930e0f707266d30e5b48c667a843d45e79bb30473c464e92dfa158285eab",
          "sha512": "6bad5536c30a0b2d5905318a1592948929fbac9baf3bcf2e7faeaf90f445f82bc2b656d0a89070d8a6a9395761f4793c83187bd640c64b2656a112b5be41f73d"
        }
      },
      "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl"
    }

A valid ``provenance_url.json`` listing a single hash entry:

.. code-block:: json

    {
      "archive_info": {
        "hashes": {
          "sha256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f"
        }
      },
      "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl"
    }

A valid ``provenance_url.json`` listing a source distribution which was used to
build and install a wheel:

.. code-block:: json

    {
      "archive_info": {
        "hashes": {
          "sha256": "8bfe29f17c10e2f2e619de8033a07a224058d96b3bfe2ed61777596f7ffd7fa9"
        }
      },
      "url": "https://files.pythonhosted.org/packages/1d/43/ad8ae671de795ec2eafd86515ef9842ab68455009d864c058d0c3dcf680d/micropipenv-0.0.1.tar.gz"
    }

Examples of an invalid provenance_url.json
------------------------------------------

The following example includes a ``hash`` key in the ``archive_info`` dictionary
as originally designed in :pep:`610` and the data structure documented in
:ref:`packaging:direct-url`.
The ``hash`` key MUST NOT be present, to prevent any possible confusion
with ``hashes`` and to avoid the additional checks that would be required to
keep hash values in sync.

.. code-block:: json

    {
      "archive_info": {
        "hash": "sha256=236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f",
        "hashes": {
          "sha256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f"
        }
      },
      "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl"
    }

Another example demonstrates an invalid hash name. The referenced hash name does not
correspond to the canonical hash names described in this PEP and
in the Python docs under :attr:`py3.11:hashlib.hash.name`.

.. code-block:: json

    {
      "archive_info": {
        "hashes": {
          "SHA-256": "236bcb61156d76c4b8a05821b988c7b8c35bf0da28a4b614e8d6ab5212c25c6f"
        }
      },
      "url": "https://files.pythonhosted.org/packages/07/51/2c0959c5adf988c44d9e1e0d940f5b074516ecc87e96b1af25f59de9ba38/pip-23.0.1-py3-none-any.whl"
    }
|
|
|
|
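The valid and invalid examples above can be told apart with a small structural
check. The following is a minimal sketch and not a normative validator: the
function name ``validation_errors`` and the wording of its messages are
assumptions made for this example, and the digests themselves are not
inspected.

```python
import hashlib

# Allowed hash names, computed as in the Specification section.
ALLOWED_HASHES = frozenset(
    hashlib.algorithms_guaranteed - {"shake_128", "shake_256", "sha1", "md5"}
)


def validation_errors(data: object) -> list:
    """Return a list of problems found in a parsed ``provenance_url.json``."""
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    errors = []
    if set(data) != {"url", "archive_info"}:
        errors.append("expected exactly the keys 'url' and 'archive_info'")
    if not isinstance(data.get("url"), str):
        errors.append("'url' must be a string")
    info = data.get("archive_info")
    if not isinstance(info, dict) or set(info) != {"hashes"}:
        # Also rejects the PEP 610 style "hash" key next to "hashes".
        errors.append("'archive_info' must be an object with the single key 'hashes'")
        return errors
    hashes = info["hashes"]
    if not isinstance(hashes, dict):
        errors.append("'hashes' must be an object")
        return errors
    for name in hashes:
        if name not in ALLOWED_HASHES:
            errors.append(f"hash name {name!r} is not allowed")
    return errors
```

An empty list means the structure matches the rules; the two invalid examples
above each produce one message.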
|
|
Example pip commands and their effect on provenance_url.json and direct_url.json
--------------------------------------------------------------------------------

These commands generate a ``direct_url.json`` file but do not generate a
``provenance_url.json`` file. These examples follow examples from :pep:`610`:

* ``pip install https://example.com/app-1.0.tgz``
* ``pip install https://example.com/app-1.0.whl``
* ``pip install "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"``
* ``pip install ./app``
* ``pip install file:///home/user/app``
* ``pip install --editable "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"``
  (in which case, ``url`` will be the local directory where the git repository
  has been cloned to, and ``dir_info`` will be present with ``"editable": true``
  and no ``vcs_info`` will be set)
* ``pip install -e ./app``

Commands that generate a ``provenance_url.json`` file but do not generate
a ``direct_url.json`` file:

* ``pip install app``
* ``pip install app~=2.2.0``
* ``pip install app --no-index --find-links "https://example.com/"``

This behaviour can be tested using changes to pip implemented in the PR
`pypa/pip#11865`_.

Reference Implementation
========================

A proof-of-concept for creating the ``provenance_url.json`` metadata file when
installing a Python :term:`Distribution Package` is available in the PR to pip
`pypa/pip#11865`_. It reuses the already available implementation for the
:ref:`direct URL data structure <packaging:direct-url-data-structure>` to provide
the ``provenance_url.json`` metadata file for cases when ``direct_url.json`` is not
created.

A prototype called `pip-preserve <pip_preserve_>`_ was developed to
demonstrate creation of ``requirements.txt`` files considering ``direct_url.json``
and ``provenance_url.json`` metadata files. This tool mimics the ``pip
freeze`` functionality, but the listing of installed packages also includes
the hashes of the Python distribution artifacts.
|
|
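To give a feel for the kind of output described above, the following sketch
formats a hash-pinned requirement line from a provenance record. It is not
pip-preserve's actual implementation: the function name ``requirement_line`` is
hypothetical, and the sketch only uses the ``sha256`` digest together with pip's
``--hash=sha256:<digest>`` requirements file syntax.

```python
def requirement_line(name: str, version: str, record: dict) -> str:
    """Format a pinned requirement line from a ``provenance_url.json`` record."""
    digest = record["archive_info"]["hashes"].get("sha256")
    if digest is None:
        # Without a sha256 digest this sketch falls back to a plain pin.
        return f"{name}=={version}"
    return f"{name}=={version} --hash=sha256:{digest}"
```

Applied to the single-hash example from this PEP, the function returns a line
that starts with ``pip==23.0.1 --hash=sha256:236bcb61``.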
|
|
To further support this proposal, `pip-sbom <pip_sbom_>`_ demonstrates creation
of an SBOM in the SPDX format. The tool uses information stored in the ``provenance_url.json``
file.

Rejected Ideas
==============

Naming the file direct_url.json instead of provenance_url.json
--------------------------------------------------------------

To preserve backwards compatibility with the
:ref:`Direct URL Origin specification <packaging:direct-url>`,
the file cannot be named ``direct_url.json``, as per the text of that specification:

    This file MUST NOT be created when installing a distribution from an other
    type of requirement (i.e. name plus version specifier).

Such a change might introduce backwards compatibility issues for consumers of
``direct_url.json`` who rely on its presence only when distributions are
installed using a direct URL reference.

Deprecating direct_url.json and using only provenance_url.json
--------------------------------------------------------------

The ``direct_url.json`` file is already well established, with :pep:`610` accepted, and is
already used by installers. For example, ``pip`` uses ``direct_url.json`` to
report a direct URL reference on ``pip freeze``. Deprecating
``direct_url.json`` would require additional changes to the ``pip freeze``
implementation in pip (see PR `fridex/pip#2`_) and could introduce backwards compatibility
issues for already existing ``direct_url.json`` consumers.

Keeping the hash key in the archive_info dictionary
---------------------------------------------------

:pep:`610` and :ref:`its corresponding canonical PyPA spec <packaging:direct-url>`
discuss the possibility of including the ``hash`` key alongside the ``hashes`` key in the
``archive_info`` dictionary. This PEP explicitly does not include the ``hash`` key in
the ``provenance_url.json`` file and allows only the ``hashes`` key to be present.
By doing so we eliminate possible redundancy in the file, possible confusion,
and any additional checks that would need to be done to make sure the hashes are in
sync.

Making the hashes key optional
------------------------------

:pep:`610` and :ref:`its corresponding canonical PyPA spec <packaging:direct-url>`
recommend including the ``hashes`` key of the ``archive_info`` in the
``direct_url.json`` file but it is not required (per the :rfc:`2119` language):

    A hashes key SHOULD be present as a dictionary mapping a hash name to a hex
    encoded digest of the file.

This PEP requires the ``hashes`` key be included in ``archive_info``
in the ``provenance_url.json`` file if that file is created; per this PEP:

    The value of ``archive_info`` MUST be a dictionary with a single key
    ``hashes``.

By doing so, consumers of ``provenance_url.json`` can check
artifact digests when the ``provenance_url.json`` file is created by installers.

Open Issues
===========

Availability of the provenance_url.json file in Conda
-----------------------------------------------------

We would like to get feedback on the ``provenance_url.json`` file from the Conda
maintainers. It is not clear whether Conda would like to adopt the
``provenance_url.json`` file. Conda already stores provenance-related
information (similar to the provenance information proposed in this PEP) in
JSON files located in the ``conda-meta`` directory `following its actions
during installation
<https://conda.io/projects/conda/en/latest/dev-guide/deep-dives/install.html>`__.

Using provenance_url.json in downstream installers
--------------------------------------------------

The proposed ``provenance_url.json`` file was meant to be adopted primarily by
Python installers. Other installers, such as APT or DNF, might record the
provenance of the installed downstream Python distributions in their own
way specific to downstream package management. The proposed file is
not expected to be created by these downstream package installers and thus they
were intentionally left out of this PEP. However, any input from developers or
maintainers of these installers is valuable and could enrich the
``provenance_url.json`` file with information useful to them.

.. _710-tool-survey:

Appendix: Survey of installers and libraries
============================================

pip
---

The function from pip's internal API responsible for installing wheels, named
`_install_wheel
<https://github.com/pypa/pip/blob/10d9cbc601e5cadc45163452b1bc463d8ad2c1f7/src/pip/_internal/operations/install/wheel.py#L432>`__,
does not store any ``provenance_url.json`` file in the ``.dist-info``
directory. Additionally, a prototype introducing the mentioned file to pip in
`pypa/pip#11865`_ demonstrates incorporating logic for handling the
``provenance_url.json`` file in pip's source code.

As pip is used by some of the tools mentioned below to install Python package
distributions, findings for pip apply to these tools as well, since pip does not
allow parametrizing the creation of files in the ``.dist-info`` directory in its
internal API. Most of the tools mentioned below that use pip invoke pip as a
subprocess, which has no effect on the eventual presence of the
``provenance_url.json`` file in the ``.dist-info`` directory.

distlib
-------

`distlib`_ implements low-level functionality to manipulate the
``.dist-info`` directory. The database of installed distributions does not use
any file named ``provenance_url.json``, based on `distlib's source code
<https://github.com/pypa/distlib/blob/05375908c1b2d6b0e74bdeb574569d3609db9f56/distlib/database.py#L39-L40>`__.

Pipenv
------

`Pipenv`_ uses pip `to install Python package distributions
<https://github.com/pypa/pipenv/blob/babd428d8ee3c5caeb818d746f715c02f338839b/pipenv/routines/install.py#L262>`__.
No additional logic was identified that would cause backwards
compatibility issues when introducing the ``provenance_url.json`` file in the
``.dist-info`` directory.

installer
---------

`installer`_ does not create a ``provenance_url.json`` file explicitly.
Nevertheless, as per the :ref:`Recording Installed Projects <packaging:recording-installed-packages>`
specification, installer allows passing the ``additional_metadata`` argument to
create a file in the ``.dist-info`` directory (see `the source code
<https://github.com/pypa/installer/blob/f89b5d93a643ef5e9858a6e3f450c83a57bbe1f1/src/installer/_core.py#L67>`__).
To avoid any backwards compatibility issues, any library or tool using
installer must not request creating the ``provenance_url.json`` file using the
mentioned ``additional_metadata`` argument.

Poetry
------

The installation logic in `Poetry`_ depends on the
``installer.modern-installation`` configuration option (`see docs
<https://python-poetry.org/docs/configuration#installermodern-installation>`__).

For cases when the ``installer.modern-installation`` configuration option is set
to ``false``, Poetry uses `pip for installing Python package distributions
<https://github.com/python-poetry/poetry/blob/2b15ce10f02b0c6347fe2f12ae902488edeaaf7c/src/poetry/installation/executor.py#L543-L544>`__.

On the other hand, when the ``installer.modern-installation`` configuration option is
set to ``true``, Poetry uses `installer to install Python package distributions
<https://github.com/python-poetry/poetry/blob/2b15ce10f02b0c6347fe2f12ae902488edeaaf7c/src/poetry/installation/wheel_installer.py#L99-L109>`__.
As can be seen from the linked sources, no additional
metadata file named ``provenance_url.json`` is passed that would cause compatibility
issues with this PEP.

Conda
-----

`Conda`_ does not create any ``provenance_url.json`` file
`when Python package distributions are installed
<https://github.com/conda/conda/blob/86e83925e17c68233ac659633bdc4d76b05a245a/conda/common/pkg_formats/python.py#L370-L390>`__.

Hatch
-----

`Hatch`_ uses pip `to install project dependencies
<https://github.com/pypa/hatch/blob/dd6e9545a355a0b5b58e065b489c1ef087e3bcaf/src/hatch/env/system.py#L28-L29>`__.

micropipenv
-----------

As `micropipenv`_ is a wrapper on top of pip, it uses
pip to install Python distributions, both for `lock files
<https://github.com/thoth-station/micropipenv/blob/8176862ec96df23e152938659d6f45645246e398/micropipenv.py#L393>`__
and `for requirements files
<https://github.com/thoth-station/micropipenv/blob/8176862ec96df23e152938659d6f45645246e398/micropipenv.py#L977>`__.

Thamos
------

`Thamos`_ uses micropipenv `to install Python package
distributions
<https://github.com/thoth-station/thamos/blob/234351025c77cfe28b0df07f7ee017469b57d3f4/thamos/lib.py#L1290>`__,
hence any findings for micropipenv apply to Thamos.

PDM
---

`PDM`_ uses installer `to install binary distributions
<https://github.com/pdm-project/pdm/blob/d39a8e5b36c37093ea31e666d0e55fe21b38c16b/src/pdm/installers/installers.py#L241>`__.
The only additional metadata file it eventually creates in the ``.dist-info``
directory is `the REFER_TO file
<https://github.com/pdm-project/pdm/blob/d39a8e5b36c37093ea31e666d0e55fe21b38c16b/src/pdm/installers/installers.py#L197>`__.

References
==========

.. _pypa/pip#11865: https://github.com/pypa/pip/pull/11865

.. _fridex/pip#2: https://github.com/fridex/pip/pull/2/

.. _pip_preserve: https://pypi.org/project/pip-preserve/

.. _pip_sbom: https://github.com/sethmlarson/pip-sbom

.. _thoth-station/micropipenv#206: https://github.com/thoth-station/micropipenv/issues/206

.. _pypa/pip-audit#170: https://github.com/pypa/pip-audit/issues/170

.. _pip_installation_report: https://pip.pypa.io/en/stable/reference/installation-report/

.. _distlib: https://distlib.readthedocs.io/

.. _Pipenv: https://pipenv.pypa.io/

.. _installer: https://github.com/pypa/installer

.. _Poetry: https://python-poetry.org/

.. _Conda: https://docs.conda.io/

.. _Hatch: https://hatch.pypa.io/

.. _micropipenv: https://github.com/thoth-station/micropipenv

.. _Thamos: https://github.com/thoth-station/thamos/

.. _PDM: https://pdm.fming.dev/

Acknowledgements
================

Thanks to Dustin Ingram, Brett Cannon, and Paul Moore for the initial discussion in
which this idea originated.

Thanks to Donald Stufft, Ofek Lev, and Trishank Kuppusamy for early feedback
and support to work on this PEP.

Thanks to Gregory P. Smith, Stéphane Bidoul, and C.A.M. Gerlach for
reviewing this PEP and providing valuable suggestions.

Thanks to Seth Michael Larson for providing valuable suggestions and for
the proposed pip-sbom prototype.

Thanks to Stéphane Bidoul and Chris Jerdonek for :pep:`610`.

Last, but not least, thanks to Donald Stufft for sponsoring this PEP.

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.