PEP 710: Link specifications instead of PEP 610 (#3887)

Signed-off-by: Fridolin Pokorny <fridolin.pokorny@gmail.com>
Fridolín Pokorný 2024-08-03 13:39:19 +02:00 committed by GitHub
parent 40e8ff87a5
commit 64bbebc786
1 changed files with 57 additions and 47 deletions


@ -21,11 +21,11 @@ This PEP describes a way to record the provenance of installed Python distributi
The record is created by an installer and is available to users in
the form of a JSON file ``provenance_url.json`` in the ``.dist-info`` directory.
The mentioned JSON file captures additional metadata to allow recording a URL to a
:term:`distribution package` together with the installed distribution hash. This
proposal is built on top of :pep:`610` following
:ref:`its corresponding canonical PyPA spec <packaging:direct-url>` and
complements ``direct_url.json`` with ``provenance_url.json`` for when packages
are identified by a name, and optionally a version.
:term:`distribution package` together with the installed distribution hash.
This proposal is built on top of :pep:`610` following :ref:`its corresponding
canonical PyPA spec <packaging:direct-url>` and complements ``direct_url.json``
with ``provenance_url.json`` for when packages are identified by a name, and
optionally a version.
Motivation
==========
@ -38,7 +38,7 @@ is generally lost. However, there are use cases for keeping records of
distributions used for installing packages and their provenance.
Python wheels can be built with different compiler flags or supporting
different wheel tags. In both cases, users might get into a situation in which
different wheel tags. In both cases, users might get into a situation in which
multiple wheels might be considered by installers (possibly from different
package indexes) and immediately finding out which wheel file was actually used
during the installation might be helpful. This way, developers can use
@ -52,10 +52,11 @@ artifact consumed from a Python package index.
Rationale
=========
The motivation described in this PEP is an extension of that in :pep:`610`.
In addition to recording provenance information for packages installed using a direct URL,
installers should also do so for packages installed by name
(and optionally version) from Python package indexes.
The motivation described in this PEP is an extension of that in the
:ref:`Recording the Direct URL Origin of installed distributions
<packaging:direct-url>` specification. In addition to recording provenance
information for packages installed using a direct URL, installers should also
do so for packages installed by name (and optionally version) from Python
package indexes.
The idea described in this PEP originated in a tool called `micropipenv`_
that is used to install
@ -112,22 +113,28 @@ specified by name (and optionally by :term:`Version Specifier`).
This file MUST NOT be created when installing a distribution package from a requirement
specifying a direct URL reference (including a VCS URL).
Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from :pep:`610`),
may be present in a given ``.dist-info`` directory; installers MUST NOT add both.
Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from
the :ref:`Recording the Direct URL Origin of installed distributions
<packaging:direct-url>` specification and the corresponding :ref:`Direct URL
Data Structure <packaging:direct-url-data-structure>` specification) may be
present in a given ``.dist-info`` directory; installers MUST NOT add both.
The ``provenance_url.json`` JSON file MUST be a dictionary, compliant with
:rfc:`8259` and UTF-8 encoded.
If present, it MUST contain exactly two keys. The first MUST be ``url``, with
type ``string``. The second key MUST be ``archive_info`` with a value defined
type ``string``. The second key MUST be ``archive_info`` with a value defined
below.
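The top-level rules above can be illustrated with a minimal sketch. The function name ``check_top_level`` and the constant ``REQUIRED_KEYS`` are hypothetical and not part of the PEP or of any installer; the sketch checks only the outer shape and leaves the ``archive_info`` value unchecked.

```python
import json

# Hypothetical constant for this sketch: the two keys the file MUST contain.
REQUIRED_KEYS = {"url", "archive_info"}


def check_top_level(raw: bytes) -> dict:
    """Parse the bytes of a provenance_url.json file and check its outer shape."""
    # The file MUST be UTF-8 encoded JSON.
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    # Exactly two keys are allowed: no more, no fewer.
    if set(data) != REQUIRED_KEYS:
        raise ValueError(
            f"expected exactly the keys {sorted(REQUIRED_KEYS)}, got {sorted(data)}"
        )
    if not isinstance(data["url"], str):
        raise ValueError("'url' must be a string")
    return data
```

A consumer could call ``check_top_level`` before looking at any nested values, so that a malformed file is rejected early.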
The value of the ``url`` key MUST be the URL from which the distribution package was downloaded. If a wheel is
built from a source distribution, the ``url`` value MUST be the URL from which
the source distribution was downloaded. If a wheel is downloaded and installed directly,
the ``url`` field MUST be the URL from which the wheel was downloaded.
As in the :ref:`direct URL origin specification<packaging:direct-url>`, the ``url`` value
MUST be stripped of any sensitive authentication information for security reasons.
The value of the ``url`` key MUST be the URL from which the distribution
package was downloaded. If a wheel is built from a source distribution, the
``url`` value MUST be the URL from which the source distribution was
downloaded. If a wheel is downloaded and installed directly, the ``url`` field
MUST be the URL from which the wheel was downloaded. As in the :ref:`Direct URL
Data Structure <packaging:direct-url-data-structure>` specification, the ``url``
value MUST be stripped of any sensitive authentication information for security
reasons.
The user:password section of the URL MAY however be composed of environment
variables, matching the following regular expression:
@ -141,7 +148,7 @@ non-security sensitive string. A typical example is ``git`` in the case of an
URL such as ``ssh://git@gitlab.com``.
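As a rough illustration of the stripping rule, the sketch below removes the ``user:password`` part of a URL with the standard library. The names ``strip_credentials`` and ``WELL_KNOWN_USERS`` are hypothetical, and treating ``git`` as the only well-known user name is an assumption of this sketch. It does not implement the environment-variable form mentioned above and conservatively strips those values too.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption of this sketch: "git" is the only user name kept as non-sensitive.
WELL_KNOWN_USERS = {"git"}


def strip_credentials(url: str) -> str:
    """Return the URL without its user:password part (sketch, not normative)."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to strip
    # Rebuild the network location from host and port only.
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets back
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Keep a bare, well-known user name such as "git" (no password attached).
    if parts.password is None and parts.username in WELL_KNOWN_USERS:
        host = f"{parts.username}@{host}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

Note that ``urlsplit`` lower-cases the host name, so the result may differ from the input in letter case as well.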
The value of ``archive_info`` MUST be a dictionary with a single key
``hashes``. The value of ``hashes`` is a dictionary mapping hash function
``hashes``. The value of ``hashes`` is a dictionary mapping hash function
names to a hex-encoded digest of the file referenced by the ``url`` value. At
least one hash MUST be recorded. Multiple hashes MAY be included, and it is up
to the consumer to decide what to do with multiple hashes (it may validate all
@ -174,7 +181,7 @@ Backwards Compatibility
Following the :ref:`packaging:recording-installed-packages` specification,
installers may keep additional installer-specific files in the ``.dist-info``
directory. To make sure this PEP does not cause any backwards compatibility
directory. To make sure this PEP does not cause any backwards compatibility
issues, a `comprehensive survey of installers and libraries <710-tool-survey_>`_
found no current tools that are using a similarly-named file,
or other major feasibility concerns.
@ -204,7 +211,7 @@ The content of ``provenance_url.json`` file was designed in a way to eventually
allow installers to reuse some of the logic supporting ``direct_url.json`` when a
direct URL refers to a source archive or a wheel.
The main difference between the ``provenance_url.json`` and ``direct_url.json``
The main difference between the ``provenance_url.json`` and ``direct_url.json``
files lies in the mandatory keys and their values in the ``provenance_url.json`` file.
This helps make sure consumers of the ``provenance_url.json`` file can rely
on its content, if the file is present in the ``.dist-info`` directory.
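For illustration, a consumer could read the file through ``importlib.metadata``, whose ``Distribution.read_text`` returns ``None`` when a metadata file is absent. The helper name ``read_provenance`` is hypothetical; this is a sketch of one possible consumer, not code from any of the tools discussed in this PEP.

```python
import json
from importlib.metadata import Distribution
from typing import Optional


def read_provenance(dist: Distribution) -> Optional[dict]:
    """Return the parsed provenance_url.json of a distribution, or None."""
    # read_text() yields None if the .dist-info directory has no such file,
    # for example when the package was installed from a direct URL.
    text = dist.read_text("provenance_url.json")
    if text is None:
        return None
    return json.loads(text)
```

For an installed package, the distribution object can be obtained with ``importlib.metadata.distribution("some-package")``, where ``some-package`` is a placeholder name.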
@ -297,12 +304,11 @@ build and install a wheel:
Examples of an invalid provenance_url.json
------------------------------------------
The following example includes a ``hash`` key in the ``archive_info`` dictionary
as originally designed in :pep:`610` and the data structure documented in
:ref:`packaging:direct-url`.
The ``hash`` key MUST NOT be present to prevent from any possible confusion
with ``hashes`` and additional checks that would be required to keep hash
values in sync.
The following example includes a ``hash`` key in the ``archive_info``
dictionary as originally designed in the data structure documented in
:ref:`packaging:direct-url`. The ``hash`` key MUST NOT be present, to prevent
any possible confusion with ``hashes`` and to avoid the additional checks that
would be required to keep hash values in sync.
.. code-block:: json
@ -347,7 +353,8 @@ Example pip commands and their effect on provenance_url.json and direct_url.json
--------------------------------------------------------------------------------
These commands generate a ``direct_url.json`` file but do not generate a
``provenance_url.json`` file. These examples follow examples from :pep:`610`:
``provenance_url.json`` file. These examples follow examples from the
:ref:`Direct URL Data Structure <packaging:direct-url-data-structure>`
specification:
* ``pip install https://example.com/app-1.0.tgz``
* ``pip install https://example.com/app-1.0.whl``
@ -373,16 +380,16 @@ Reference Implementation
A proof-of-concept for creating the ``provenance_url.json`` metadata file when
installing a Python :term:`Distribution Package` is available in the PR to pip
`pypa/pip#11865`_. It reuses the already available implementation for the
:ref:`direct URL data structure <packaging:direct-url-data-structure>` to provide
the ``provenance_url.json`` metadata file for cases when ``direct_url.json`` is not
created.
:ref:`direct URL data structure <packaging:direct-url-data-structure>` to
provide the ``provenance_url.json`` metadata file for cases when
``direct_url.json`` is not created.
A reference implementation for supporting the ``provenance_url.json`` file
in PDM is available in `pdm-project/pdm#3013`_.
A prototype called `pip-preserve <pip_preserve_>`_ was developed to
demonstrate creation of ``requirements.txt`` files considering ``direct_url.json``
and ``provenance_url.json`` metadata files. This tool mimics the ``pip
and ``provenance_url.json`` metadata files. This tool mimics the ``pip
freeze`` functionality, but the listing of installed packages also includes
the hashes of the Python distribution artifacts.
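The idea of a hash-carrying listing can be sketched as follows. This is not the code of pip-preserve; the function name ``requirement_line`` is hypothetical, and the sketch only assumes the ``archive_info``/``hashes`` layout described in this PEP together with pip's per-requirement ``--hash=<algorithm>:<digest>`` option.

```python
def requirement_line(name: str, version: str, provenance: dict) -> str:
    """Format one pinned requirement with the hashes recorded at install time."""
    hashes = provenance["archive_info"]["hashes"]
    # One --hash option per recorded digest, sorted for a stable output.
    options = [f"--hash={algo}:{digest}" for algo, digest in sorted(hashes.items())]
    return " ".join([f"{name}=={version}", *options])
```

The name and version are passed in separately because ``provenance_url.json`` itself records only the URL and the hashes.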
@ -396,9 +403,8 @@ Rejected Ideas
Naming the file direct_url.json instead of provenance_url.json
--------------------------------------------------------------
To preserve backwards compatibility with the
:ref:`Direct URL Origin specification <packaging:direct-url>`,
the file cannot be named ``direct_url.json``, as per the text of that specification:
To preserve backwards compatibility with the :ref:`Recording the Direct URL
Origin of installed distributions <packaging:direct-url>` specification, the
file cannot be named ``direct_url.json``, as per the text of that
specification:
This file MUST NOT be created when installing a distribution from an other
type of requirement (i.e. name plus version specifier).
@ -410,23 +416,24 @@ installed using a direct URL reference.
Deprecating direct_url.json and using only provenance_url.json
--------------------------------------------------------------
File ``direct_url.json`` is already well established with :pep:`610` being accepted and is
File ``direct_url.json`` is already well established by the :ref:`Direct URL
Data Structure <packaging:direct-url-data-structure>` specification and is
already used by installers. For example, ``pip`` uses ``direct_url.json`` to
report a direct URL reference on ``pip freeze``. Deprecating
``direct_url.json`` would require additional changes to the ``pip freeze``
implementation in pip (see PR `fridex/pip#2`_) and could introduce backwards compatibility
issues for already existing ``direct_url.json`` consumers.
implementation in pip (see PR `fridex/pip#2`_) and could introduce backwards
compatibility issues for already existing ``direct_url.json`` consumers.
Keeping the hash key in the archive_info dictionary
---------------------------------------------------
:pep:`610` and :ref:`its corresponding canonical PyPA spec <packaging:direct-url>`
discuss the possibility to include the ``hash`` key alongside the ``hashes`` key in the
``archive_info`` dictionary. This PEP explicitly does not include the ``hash`` key in
the ``provenance_url.json`` file and allows only the ``hashes`` key to be present.
By doing so we eliminate possible redundancy in the file, possible confusion,
and any additional checks that would need to be done to make sure the hashes are in
sync.
The :ref:`Direct URL Data Structure <packaging:direct-url-data-structure>`
specification discusses the possibility of including the ``hash`` key alongside
the ``hashes`` key in the ``archive_info`` dictionary. This PEP explicitly does
not include the ``hash`` key in the ``provenance_url.json`` file and allows
only the ``hashes`` key to be present. By doing so we eliminate possible
redundancy in the file, possible confusion, and any additional checks that
would need to be done to make sure the hashes are in sync.
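A check for this rule might look like the sketch below. The name ``check_archive_info`` is hypothetical; the sketch assumes only what this PEP states about ``archive_info``: a single ``hashes`` key, at least one entry, and hex-encoded digests.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def check_archive_info(archive_info: dict) -> dict:
    """Validate an archive_info value and return its hashes mapping."""
    # Only "hashes" is allowed; a legacy "hash" key makes the value invalid.
    if not isinstance(archive_info, dict) or set(archive_info) != {"hashes"}:
        raise ValueError("archive_info must contain exactly one key: 'hashes'")
    hashes = archive_info["hashes"]
    # At least one hash MUST be recorded.
    if not isinstance(hashes, dict) or not hashes:
        raise ValueError("'hashes' must be a non-empty mapping")
    for name, digest in hashes.items():
        if not isinstance(digest, str) or not digest or not set(digest) <= HEX_DIGITS:
            raise ValueError(f"digest for {name!r} is not hex-encoded")
    return hashes
```

The sketch does not verify that a digest has the right length for its hash function; a stricter consumer could add that check.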
Allowing no hashes stated
-------------------------
@ -670,7 +677,10 @@ reviewing this PEP and providing valuable suggestions.
Thanks to Seth Michael Larson for providing valuable suggestions and for
the proposed pip-sbom prototype.
Thanks to Stéphane Bidoul and Chris Jerdonek for :pep:`610`.
Thanks to Stéphane Bidoul and Chris Jerdonek for :pep:`610` and the related
:ref:`Recording the Direct URL Origin of installed distributions
<packaging:direct-url>` and :ref:`Direct URL Data Structure
<packaging:direct-url-data-structure>` specifications.
Thanks to Frost Ming for raising a possible concern about storing the index URL
in the ``provenance_url.json`` file and for the initial PEP 710 support in PDM.