PEP 710: Link specifications instead of PEP 610 (#3887)

Signed-off-by: Fridolin Pokorny <fridolin.pokorny@gmail.com>
Fridolín Pokorný 2024-08-03 13:39:19 +02:00 committed by GitHub
parent 40e8ff87a5
commit 64bbebc786
1 changed file with 57 additions and 47 deletions


@ -21,11 +21,11 @@ This PEP describes a way to record the provenance of installed Python distributi
The record is created by an installer and is available to users in The record is created by an installer and is available to users in
the form of a JSON file ``provenance_url.json`` in the ``.dist-info`` directory. the form of a JSON file ``provenance_url.json`` in the ``.dist-info`` directory.
The mentioned JSON file captures additional metadata to allow recording a URL to a The mentioned JSON file captures additional metadata to allow recording a URL to a
:term:`distribution package` together with the installed distribution hash. This :term:`distribution package` together with the installed distribution hash.
proposal is built on top of :pep:`610` following This proposal is built on top of :pep:`610` following :ref:`its corresponding
:ref:`its corresponding canonical PyPA spec <packaging:direct-url>` and canonical PyPA spec <packaging:direct-url>` and complements ``direct_url.json``
complements ``direct_url.json`` with ``provenance_url.json`` for when packages with ``provenance_url.json`` for when packages are identified by a name, and
are identified by a name, and optionally a version. optionally a version.
Motivation Motivation
========== ==========
@ -38,7 +38,7 @@ is generally lost. However, there are use cases for keeping records of
distributions used for installing packages and their provenance. distributions used for installing packages and their provenance.
Python wheels can be built with different compiler flags or supporting Python wheels can be built with different compiler flags or supporting
different wheel tags. In both cases, users might get into a situation in which different wheel tags. In both cases, users might get into a situation in which
multiple wheels might be considered by installers (possibly from different multiple wheels might be considered by installers (possibly from different
package indexes) and immediately finding out which wheel file was actually used package indexes) and immediately finding out which wheel file was actually used
during the installation might be helpful. This way, developers can use during the installation might be helpful. This way, developers can use
@ -52,10 +52,11 @@ artifact consumed from a Python package index.
Rationale Rationale
========= =========
The motivation described in this PEP is an extension of that in :pep:`610`. The motivation described in this PEP is an extension of :ref:`Recording the
In addition to recording provenance information for packages installed using a direct URL, Direct URL Origin of installed distributions <packaging:direct-url>`
installers should also do so for packages installed by name specification. In addition to recording provenance information for packages
(and optionally version) from Python package indexes. installed using a direct URL, installers should also do so for packages
installed by name (and optionally version) from Python package indexes.
The idea described in this PEP originated in a tool called `micropipenv`_ The idea described in this PEP originated in a tool called `micropipenv`_
that is used to install that is used to install
@ -112,22 +113,28 @@ specified by name (and optionally by :term:`Version Specifier`).
This file MUST NOT be created when installing a distribution package from a requirement This file MUST NOT be created when installing a distribution package from a requirement
specifying a direct URL reference (including a VCS URL). specifying a direct URL reference (including a VCS URL).
Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from :pep:`610`), Only one of the files ``provenance_url.json`` and ``direct_url.json`` (from
may be present in a given ``.dist-info`` directory; installers MUST NOT add both. :ref:`Recording the Direct URL Origin of installed distributions
<packaging:direct-url>` specification and the corresponding specification of
the :ref:`Direct URL Data Structure <packaging:direct-url-data-structure>`),
may be present in a given ``.dist-info`` directory; installers MUST NOT add
both.
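A minimal sketch of how a consumer might check the "only one of the files" rule above for a single ``.dist-info`` directory. The function names ``origin_files_present`` and ``check_single_origin_file`` are hypothetical and are not part of the PEP or of any installer.

```python
from pathlib import Path

# The two mutually exclusive origin records named in the text above.
ORIGIN_FILES = ("direct_url.json", "provenance_url.json")


def origin_files_present(dist_info: Path) -> list[str]:
    """Return the origin record files that exist in a .dist-info directory."""
    return [name for name in ORIGIN_FILES if (dist_info / name).is_file()]


def check_single_origin_file(dist_info: Path) -> None:
    """Raise ValueError if both origin records are present (hypothetical helper)."""
    present = origin_files_present(dist_info)
    if len(present) > 1:
        raise ValueError(
            f"{dist_info} contains both {present[0]} and {present[1]}; "
            "only one of them may be present"
        )
```

The check is deliberately read-only: it reports a violation and leaves any decision about repairing the directory to the caller.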
The ``provenance_url.json`` JSON file MUST be a dictionary, compliant with The ``provenance_url.json`` JSON file MUST be a dictionary, compliant with
:rfc:`8259` and UTF-8 encoded. :rfc:`8259` and UTF-8 encoded.
If present, it MUST contain exactly two keys. The first MUST be ``url``, with If present, it MUST contain exactly two keys. The first MUST be ``url``, with
type ``string``. The second key MUST be ``archive_info`` with a value defined type ``string``. The second key MUST be ``archive_info`` with a value defined
below. below.
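The top-level requirements in the paragraph above (a UTF-8 encoded JSON dictionary with exactly the keys ``url`` and ``archive_info``, where ``url`` is a string) could be checked roughly as follows. This is an illustrative sketch only; ``load_provenance_url`` is a hypothetical name, and the contents of ``archive_info`` are not validated here.

```python
import json


def load_provenance_url(raw: bytes) -> dict:
    """Parse provenance_url.json bytes and check the top-level shape."""
    # The file is required to be UTF-8 encoded JSON.
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("provenance_url.json must be a JSON dictionary")
    # Exactly two keys are allowed: "url" and "archive_info".
    if set(data) != {"url", "archive_info"}:
        raise ValueError(f"unexpected set of keys: {sorted(data)}")
    if not isinstance(data["url"], str):
        raise ValueError("the 'url' value must be a string")
    return data
```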
The value of the ``url`` key MUST be the URL from which the distribution package was downloaded. If a wheel is The value of the ``url`` key MUST be the URL from which the distribution
built from a source distribution, the ``url`` value MUST be the URL from which package was downloaded. If a wheel is built from a source distribution, the
the source distribution was downloaded. If a wheel is downloaded and installed directly, ``url`` value MUST be the URL from which the source distribution was
the ``url`` field MUST be the URL from which the wheel was downloaded. downloaded. If a wheel is downloaded and installed directly, the ``url`` field
As in the :ref:`direct URL origin specification<packaging:direct-url>`, the ``url`` value MUST be the URL from which the wheel was downloaded. As in the :ref:`Direct URL
MUST be stripped of any sensitive authentication information for security reasons. Data Structure <packaging:direct-url-data-structure>` specification, the ``url``
value MUST be stripped of any sensitive authentication information for security
reasons.
The user:password section of the URL MAY however be composed of environment The user:password section of the URL MAY however be composed of environment
variables, matching the following regular expression: variables, matching the following regular expression:
@ -141,7 +148,7 @@ non-security sensitive string. A typical example is ``git`` in the case of an
URL such as ``ssh://git@gitlab.com``. URL such as ``ssh://git@gitlab.com``.
The value of ``archive_info`` MUST be a dictionary with a single key The value of ``archive_info`` MUST be a dictionary with a single key
``hashes``. The value of ``hashes`` is a dictionary mapping hash function ``hashes``. The value of ``hashes`` is a dictionary mapping hash function
names to a hex-encoded digest of the file referenced by the ``url`` value. At names to a hex-encoded digest of the file referenced by the ``url`` value. At
least one hash MUST be recorded. Multiple hashes MAY be included, and it is up least one hash MUST be recorded. Multiple hashes MAY be included, and it is up
to the consumer to decide what to do with multiple hashes (it may validate all to the consumer to decide what to do with multiple hashes (it may validate all
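As an illustration of the ``archive_info`` shape described in this hunk, the sketch below builds a dictionary with the single key ``hashes``, mapping hash function names to hex-encoded digests. It assumes the names are ones accepted by Python's ``hashlib``; ``build_archive_info`` is a hypothetical helper, not code from the reference implementation.

```python
import hashlib


def build_archive_info(payload: bytes, algorithms=("sha256",)) -> dict:
    """Return an archive_info dictionary for the downloaded file contents."""
    if not algorithms:
        # At least one hash MUST be recorded.
        raise ValueError("at least one hash algorithm is required")
    hashes = {
        name: hashlib.new(name, payload).hexdigest()  # hex-encoded digest
        for name in algorithms
    }
    # archive_info carries only the "hashes" key (no legacy "hash" key).
    return {"hashes": hashes}
```

For example, ``build_archive_info(data, ("sha256", "sha512"))`` records two digests of the same file, leaving it to the consumer to decide which of them to validate.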
@ -174,7 +181,7 @@ Backwards Compatibility
Following the :ref:`packaging:recording-installed-packages` specification, Following the :ref:`packaging:recording-installed-packages` specification,
installers may keep additional installer-specific files in the ``.dist-info`` installers may keep additional installer-specific files in the ``.dist-info``
directory. To make sure this PEP does not cause any backwards compatibility directory. To make sure this PEP does not cause any backwards compatibility
issues, a `comprehensive survey of installers and libraries <710-tool-survey_>`_ issues, a `comprehensive survey of installers and libraries <710-tool-survey_>`_
found no current tools that are using a similarly-named file, found no current tools that are using a similarly-named file,
or other major feasibility concerns. or other major feasibility concerns.
@ -204,7 +211,7 @@ The content of ``provenance_url.json`` file was designed in a way to eventually
allow installers reuse some of the logic supporting ``direct_url.json`` when a allow installers reuse some of the logic supporting ``direct_url.json`` when a
direct URL refers to a source archive or a wheel. direct URL refers to a source archive or a wheel.
The main difference between the ``provenance_url.json`` and ``direct_url.json`` The main difference between the ``provenance_url.json`` and ``direct_url.json``
files are the mandatory keys and their values in the ``provenance_url.json`` file. files are the mandatory keys and their values in the ``provenance_url.json`` file.
This helps make sure consumers of the ``provenance_url.json`` file can rely This helps make sure consumers of the ``provenance_url.json`` file can rely
on its content, if the file is present in the ``.dist-info`` directory. on its content, if the file is present in the ``.dist-info`` directory.
@ -297,12 +304,11 @@ build and install a wheel:
Examples of an invalid provenance_url.json Examples of an invalid provenance_url.json
------------------------------------------ ------------------------------------------
The following example includes a ``hash`` key in the ``archive_info`` dictionary The following example includes a ``hash`` key in the ``archive_info``
as originally designed in :pep:`610` and the data structure documented in dictionary as originally designed in the data structure documented in
:ref:`packaging:direct-url`. :ref:`packaging:direct-url`. The ``hash`` key MUST NOT be present to prevent
The ``hash`` key MUST NOT be present to prevent from any possible confusion from any possible confusion with ``hashes`` and additional checks that would be
with ``hashes`` and additional checks that would be required to keep hash required to keep hash values in sync.
values in sync.
.. code-block:: json .. code-block:: json
@ -347,7 +353,8 @@ Example pip commands and their effect on provenance_url.json and direct_url.json
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------
These commands generate a ``direct_url.json`` file but do not generate a These commands generate a ``direct_url.json`` file but do not generate a
``provenance_url.json`` file. These examples follow examples from :pep:`610`: ``provenance_url.json`` file. These examples follow examples from :ref:`Direct
URL Data Structure <packaging:direct-url-data-structure>` specification:
* ``pip install https://example.com/app-1.0.tgz`` * ``pip install https://example.com/app-1.0.tgz``
* ``pip install https://example.com/app-1.0.whl`` * ``pip install https://example.com/app-1.0.whl``
@ -373,16 +380,16 @@ Reference Implementation
A proof-of-concept for creating the ``provenance_url.json`` metadata file when A proof-of-concept for creating the ``provenance_url.json`` metadata file when
installing a Python :term:`Distribution Package` is available in the PR to pip installing a Python :term:`Distribution Package` is available in the PR to pip
`pypa/pip#11865`_. It reuses the already available implementation for the `pypa/pip#11865`_. It reuses the already available implementation for the
:ref:`direct URL data structure <packaging:direct-url-data-structure>` to provide :ref:`direct URL data structure <packaging:direct-url-data-structure>` to
the ``provenance_url.json`` metadata file for cases when ``direct_url.json`` is not provide the ``provenance_url.json`` metadata file for cases when
created. ``direct_url.json`` is not created.
A reference implementation for supporting the ``provenance_url.json`` file A reference implementation for supporting the ``provenance_url.json`` file
in PDM is available in `pdm-project/pdm#3013`_. in PDM is available in `pdm-project/pdm#3013`_.
A prototype called `pip-preserve <pip_preserve_>`_ was developed to A prototype called `pip-preserve <pip_preserve_>`_ was developed to
demonstrate creation of ``requirements.txt`` files considering ``direct_url.json`` demonstrate creation of ``requirements.txt`` files considering ``direct_url.json``
and ``provenance_url.json`` metadata files. This tool mimics the ``pip and ``provenance_url.json`` metadata files. This tool mimics the ``pip
freeze`` functionality, but the listing of installed packages also includes freeze`` functionality, but the listing of installed packages also includes
the hashes of the Python distribution artifacts. the hashes of the Python distribution artifacts.
@ -396,9 +403,8 @@ Rejected Ideas
Naming the file direct_url.json instead of provenance_url.json Naming the file direct_url.json instead of provenance_url.json
-------------------------------------------------------------- --------------------------------------------------------------
To preserve backwards compatibility with the To preserve backwards compatibility with the :ref:`Recording the Direct URL Origin of installed distributions <packaging:direct-url>`, the file cannot be named
:ref:`Direct URL Origin specification <packaging:direct-url>`, ``direct_url.json``, as per the text of that specification:
the file cannot be named ``direct_url.json``, as per the text of that specification:
This file MUST NOT be created when installing a distribution from an other This file MUST NOT be created when installing a distribution from an other
type of requirement (i.e. name plus version specifier). type of requirement (i.e. name plus version specifier).
@ -410,23 +416,24 @@ installed using a direct URL reference.
Deprecating direct_url.json and using only provenance_url.json Deprecating direct_url.json and using only provenance_url.json
-------------------------------------------------------------- --------------------------------------------------------------
File ``direct_url.json`` is already well established with :pep:`610` being accepted and is File ``direct_url.json`` is already well established by the :ref:`Direct URL
Data Structure <packaging:direct-url-data-structure>` specification and is
already used by installers. For example, ``pip`` uses ``direct_url.json`` to already used by installers. For example, ``pip`` uses ``direct_url.json`` to
report a direct URL reference on ``pip freeze``. Deprecating report a direct URL reference on ``pip freeze``. Deprecating
``direct_url.json`` would require additional changes to the ``pip freeze`` ``direct_url.json`` would require additional changes to the ``pip freeze``
implementation in pip (see PR `fridex/pip#2`_) and could introduce backwards compatibility implementation in pip (see PR `fridex/pip#2`_) and could introduce backwards
issues for already existing ``direct_url.json`` consumers. compatibility issues for already existing ``direct_url.json`` consumers.
Keeping the hash key in the archive_info dictionary Keeping the hash key in the archive_info dictionary
--------------------------------------------------- ---------------------------------------------------
:pep:`610` and :ref:`its corresponding canonical PyPA spec <packaging:direct-url>` :ref:`Direct URL Data Structure <packaging:direct-url-data-structure>`
discuss the possibility to include the ``hash`` key alongside the ``hashes`` key in the specification discusses the possibility to include the ``hash`` key alongside
``archive_info`` dictionary. This PEP explicitly does not include the ``hash`` key in the ``hashes`` key in the ``archive_info`` dictionary. This PEP explicitly does
the ``provenance_url.json`` file and allows only the ``hashes`` key to be present. not include the ``hash`` key in the ``provenance_url.json`` file and allows
By doing so we eliminate possible redundancy in the file, possible confusion, only the ``hashes`` key to be present. By doing so we eliminate possible
and any additional checks that would need to be done to make sure the hashes are in redundancy in the file, possible confusion, and any additional checks that
sync. would need to be done to make sure the hashes are in sync.
Allowing no hashes stated Allowing no hashes stated
------------------------- -------------------------
@ -670,7 +677,10 @@ reviewing this PEP and providing valuable suggestions.
Thanks to Seth Michael Larson for providing valuable suggestions and for Thanks to Seth Michael Larson for providing valuable suggestions and for
the proposed pip-sbom prototype. the proposed pip-sbom prototype.
Thanks to Stéphane Bidoul and Chris Jerdonek for :pep:`610`. Thanks to Stéphane Bidoul and Chris Jerdonek for :pep:`610`, and related
:ref:`Recording the Direct URL Origin of installed distributions
<packaging:direct-url>` and :ref:`Direct URL Data Structure
<packaging:direct-url-data-structure>` specifications.
Thanks to Frost Ming for raising possible concern around storing index URL in Thanks to Frost Ming for raising possible concern around storing index URL in
the ``provenance_url.json`` file and initial PEP 710 support in PDM. the ``provenance_url.json`` file and initial PEP 710 support in PDM.