python-peps/peps/pep-0740.rst

413 lines
17 KiB
ReStructuredText
Raw Normal View History

PEP: 740
Title: Index support for digital attestations
Author: William Woodruff <william@yossarian.net>,
Facundo Tuesca <facundo.tuesca@trailofbits.com>
Sponsor: Donald Stufft <donald@stufft.io>
PEP-Delegate: Donald Stufft <donald@stufft.io>
2024-01-29 18:47:43 -05:00
Discussions-To: https://discuss.python.org/t/pep-740-index-support-for-digital-attestations/44498
Status: Draft
Type: Informational
Topic: Packaging
Created: 08-Jan-2024
2024-01-29 18:47:43 -05:00
Post-History: `02-Jan-2024 <https://discuss.python.org/t/pre-pep-exposing-trusted-publisher-provenance-on-pypi/42337>`__,
`29-Jan-2024 <https://discuss.python.org/t/pep-740-index-support-for-digital-attestations/44498>`__
Abstract
========
This PEP proposes a collection of changes related to the upload and distribution
of digitally signed attestations and metadata used to verify them on a Python
package repository, such as PyPI.
These changes have two subcomponents:
* Changes to the currently unstandardized PyPI upload API, allowing clients
to upload digital attestations;
* Changes to the :pep:`503` and :pep:`691` "simple" APIs, allowing clients
to retrieve both digital attestations and
`Trusted Publishing <https://docs.pypi.org/trusted-publishers/>`_ metadata
for individual release files.
This PEP does not recommend a specific digital attestation format, nor does
it make a policy recommendation around mandatory digital attestations on
release uploads or their subsequent verification by installing clients like
``pip``.
Rationale and Motivation
========================
Desire for digital signatures on Python packages has been repeatedly
expressed by both package maintainers and downstream users:
* Maintainers wish to demonstrate the integrity and authenticity of their
package uploads;
* Individual downstream users wish to verify package integrity and authenticity
without placing additional trust in their index's honesty;
* "Bulk" downstream users (such as Operating System distributions) wish to
perform similar verifications and potentially re-expose or countersign
for their own downstream packaging ecosystems.
This proposal seeks to accommodate each of the above use cases.
Additionally, this proposal identifies the following motivations:
* Verifiable provenance for Python package distributions: many Python
packages currently contain *unauthenticated* provenance metadata, such
as URLs for source hosts. A cryptographic attestation format could enable
strong *authenticated* links between these packages and their source hosts,
allowing both the index and downstream users to cryptographically verify that
a package originates from its claimed source repository.
* Raising attacker requirements: an attacker who seeks to take
over a Python package can be described along *sophistication*
(unsophisticated to sophisticated) and *targeting* dimensions
(opportunistic to targeted).
Digital attestations impose additional sophistication requirements: the
attacker must be sufficiently sophisticated to access private signing material
(or signing identities). They also impose additional targeting requirements:
the release consistency requirement (mentioned below) means that the attacker
cannot upload *any* attestation, but only one of a type already seen for a
particular release. In the future, this could be further "ratcheted" down
by allowing project maintainers to disable releases without attestations
entirely.
* Release consistency: in the status quo, the only attestation provided by the
index is an optional PGP signature per release file
(see :ref:`PGP signatures <pgp-signatures>`). These signatures are not
checked by the index either for well-formedness or for validity, since
the index has no mechanism for identifying the right public key for the
signature.
Additionally, the index does not have an "all or none" requirement
for PGP signatures, meaning that there is no consistency requirement
between distributions within a release (and that maintainers may
accidentally forget to upload signatures when adding additional
release distributions).
While this PEP does not recommend a specific digital attestation format,
it does recognize the utility of Trusted Publishing as a pre-existing,
"zero-configuration" source of strong provenance for Python packages.
Consequently this PEP includes a proposed scheme for exposing each release
file's Trusted Publisher metadata, with the expectation that a future digital
attestation format will likely make use of it.
Design Considerations
---------------------
This PEP identifies the following design considerations when evaluating
both its own proposed changes and previous work in the same or adjacent
areas of Python packaging:
1. Index accessibility: digital attestations for Python packages
are ideally retrievable directly from the index itself, as "detached"
resources.
This both simplifies some compatibility concerns (by avoiding
the need to modify the distribution formats themselves) and also simplifies
the behavior of potential installing clients (by allowing them to
retrieve each attestation before its corresponding package without needing
to do streaming decompression).
2. Verification by the index itself: in addition to enabling verification
by installing clients, each digital attestation is *ideally* verifiable
in some form by the index itself.
This both increases the overall quality
of attestations uploaded to the index (preventing, for example, users
from accidentally uploading incorrect or invalid attestations) and also
enables UI and UX refinements on the index itself (such as a "provenance"
view for each uploaded package).
3. General applicability: digital attestations should be applicable to
*any and every* package uploaded to the index, regardless of its format
(sdist or wheel) or interior contents.
4. Metadata support: this PEP refers to "digital attestations" rather than
just "digital signatures" to emphasize the ideal presence of additional
metadata within the cryptographic envelope.
For example, to prevent domain separation between a distribution's name and
its contents, the digital attestation could be performed over
``HASH(name || HASH(contents))`` rather than just ``HASH(contents)``.
5. Consistent release attestations: if a file belonging to a release has a
set of digital attestations, then all of the other files belonging to that
release should also have the same types of attestations.
This simplifies the downstream use story for digital attestations, and
prevents potentially vulnerable "swiss cheese" release patterns (where
a verifier checks for a valid attestation on ``HolyGrail-1.0.tar.gz``
but their installing client actually resolves an attacker-controlled,
platform-specific ``.whl`` instead).
Previous Work
-------------
.. _pgp-signatures:
PGP signatures
^^^^^^^^^^^^^^
PyPI and other indices have historically supported PGP signatures on uploaded
distributions. These could be supplied during upload, and could be retrieved
by installing clients via the ``data-gpg-sig`` attribute in the :pep:`503`
API, the ``gpg-sig`` key on the :pep:`691` API, or via an adjacent
``.asc``-suffixed URL.
PGP signature uploads have been disabled on PyPI since
`May 2023 <https://blog.pypi.org/posts/2023-05-23-removing-pgp/>`_, after
`an investigation <https://blog.yossarian.net/2023/05/21/PGP-signatures-on-PyPI-worse-than-useless>`_
determined that the majority of signatures (which, themselves, constituted a
tiny percentage of overall uploads) could not be associated with a public key or
otherwise meaningfully verified.
In their previously supported form on PyPI, PGP signatures satisfied
considerations (1) and (3) above but not (2) (owing to the need for external
keyservers and key distribution) or (4) (due to PGP signatures typically being
constructed over just an input file, without any associated signed metadata).
Similarly, PyPI's historical implementation of PGP did not satisfy consideration
(5), due to a lack of consistency checks between different release files
(and an inability to perform those checks due to no access to the signer's
public key).
Wheel signatures
^^^^^^^^^^^^^^^^
:pep:`427` (and its :ref:`living PyPA counterpart <packaging:binary-distribution-format>`)
specify the :term:`wheel format <packaging:Wheel>`.
This format includes accommodations for digital signatures embedded directly
into the wheel, in either JWS or S/MIME format. These signatures are specified
over a :pep:`376` RECORD, which is modified to include a cryptographic digest
for each recorded file in the wheel.
While wheel signatures are fully specified, they do not appear to be broadly
used; the official `wheel tooling <https://github.com/pypa/wheel>`_ deprecated
signature generation and verification support
`in 0.32.0 <https://wheel.readthedocs.io/en/stable/news.html>`_, which was
released in 2018.
Additionally, wheel signatures do not satisfy any of
the above considerations (due to the "attached" nature of the signatures,
non-verifiability on the index itself, and support for wheels only).
Specification
=============
.. _upload-endpoint:
Upload endpoint changes
-----------------------
The current upload API is not standardized. However, we propose the following
changes to it:
* In addition to the current top-level ``content`` and ``gpg_signature`` fields,
the index **SHALL** accept ``attestations`` as an additional multipart form
field.
* The new ``attestations`` field **SHALL** be a JSON object.
* The JSON object **SHALL** have one or more keys, each identifying an
attestation format known to the index. If any key does not identify an
attestation format known to the index, the index **MUST** reject the upload.
* The value associated with each well-known key **SHALL** be a JSON object.
* Each attestation value **MUST** be verifiable by the index. If the index fails
to verify any attestation in ``attestations``, it **MUST** reject the upload.
In addition to the above, the index **SHALL** enforce a consistency
policy for release attestations via the following:
* If the first file under a new release is supplied with ``attestations``,
then all subsequently uploaded files under the same release **MUST** also
have ``attestations``. Conversely, if the first file under a new release
does not have any ``attestations``, then all subsequent uploads under the
same release **MUST NOT** have ``attestations``.
* All files under the same release **MUST** have the same set of well-known
attestation format keys.
The index **MUST** reject any file upload that does not satisfy these
consistency properties.
Index changes
-------------
.. _provenance-object:
Provenance objects
^^^^^^^^^^^^^^^^^^
The index will serve uploaded attestations along with metadata that can assist
in verifying them in the form of JSON serialized objects.
These "provenance objects" will be available via both the :pep:`503` Simple Index
and :pep:`691` JSON-based Simple API as described below, and will have the
following structure:
.. code-block:: json
{
"publisher": {
"type": "important-ci-service",
"claims": {},
"vendor-property": "foo",
"another-property": 123
},
"attestations": {
"some-attestation": {/* ... */},
"another-attestation": {/* ... */}
}
}
* ``publisher`` is an **optional** JSON object, containing a
representation of the file's Trusted Publisher configuration at the time
the file was uploaded to the package index. The keys within the ``publisher``
object are specific to each Trusted Publisher but include, at minimum:
* A ``type`` key, which **MUST** be a JSON string that uniquely identifies the
kind of Trusted Publisher.
* A ``claims`` key, which **MUST** be a JSON object containing any context-specific
claims retained by the index during Trusted Publisher authentication.
All other keys in the ``publisher`` object are publisher-specific. A full
illustrative example of a ``publisher`` object is provided in :ref:`appendix-2`.
* ``attestations`` is a **required** JSON object, containing one or
more attestation objects as identified by their keys. This object is
a superset of ``attestations`` object supplied by the uploader through the
``attestations`` field, as described in :ref:`upload-endpoint`.
Because ``attestations`` is a superset of the file's original uploaded attestations,
the index **MAY** chose to embed additional attestations of its own.
Simple Index
^^^^^^^^^^^^
* When an uploaded file has one or more attestations, the index **MAY** include a
``data-provenance`` attribute on its file link, with a value of either
``true`` or ``false``.
* When ``data-provenance`` is ``true``, the index **MUST** serve a
:ref:`provenance object <provenance-object>` at the same URL, but with
``.provenance`` appended to it. For example, if ``HolyGrail-1.0.tar.gz``
exists and has associated attestations, those attestations would be located
within the provenance object hosted at ``HolyGrail-1.0.tar.gz.provenance``.
JSON-based Simple API
^^^^^^^^^^^^^^^^^^^^^
* When an uploaded file has one or more attestations, the index **MAY** include a
``provenance`` object in the ``file`` dictionary for that file.
* ``provenance``, when present, **MUST** be a :ref:`provenance object <provenance-object>`.
These changes require a version change to the JSON API:
* The ``api-version`` must specify version 1.2 or later.
Security Implications
=====================
This PEP is "mechanical" in nature; it provides only the plumbing for future
digital attestations on package indices, without specifying their concrete
cryptographic details.
As such, we do not identify any positive or negative security implications
for this PEP.
Index trust
-----------
This PEP does **not** increase (or decrease) trust in the index itself:
the index is still effectively trusted to honestly deliver unmodified package
distributions, since a dishonest index capable of modifying package
contents could also dishonestly modify or omit package attestations.
As a result, this PEP's presumption of index trust is equivalent to the
unstated presumption with earlier mechanisms, like PGP and Wheel signatures.
This PEP does not preclude or exclude future index trust mechanisms, such
as :pep:`458` and/or :pep:`480`.
Recommendations
===============
This PEP does not recommend specific attestation formats. It does,
however, make the following recommendations to package indices seeking
to create new or implement pre-existing attestation formats:
1. Consult the :ref:`living PyPA specifications <packaging:packaging-specifications>`
first, and determine if any currently defined attestation formats suit
your purpose.
2. If no suitable attestation format is defined under the PyPA specifications,
consider submitting it to the PyPA specifications for longevity and reuse
purposes.
When designing a new attestation format, we make the following recommendations:
1. Pick a short, but unique name for your attestation format; this name will
serve as the attestation's identifier in the upload and index APIs.
When appropriate for an attestation format, we recommend using ``:`` as a
domain separator. For example, an attestation format that provides publish
provenance using `Sigstore <https://www.sigstore.dev/>`_ might have the
name ``sigstore:publish``.
2. Prefer parsimony in your format: avoid optional fields and functionality,
avoid unnecessary cryptographic agility and message malleability, and ensure
that verifying the attestation communicates something meaningful beyond a
basic integrity check (since the index itself already supplies cryptographic
digests for this purpose).
.. _appendix-1:
Appendix 1: Example Uploaded Attestations
=========================================
This appendix provides a fictional example of the ``attestations`` field
submitted on file upload, with two fictional attestations (``publish`` and
``timestamp``):
.. code-block:: json
{
"publish": {
"mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
"verificationMaterial": { /* omitted for brevity */ },
"messageSignature": {
"messageDigest": {
"algorithm": "some-hash-algo",
"digest": "digest-here"
},
"signature": "signature-here"
}
},
"timestamp": {
"cms": "some-long-blob-here"
}
}
The payloads of these fictional attestations are purely illustrative.
.. _appendix-2:
Appendix 2: Example Trusted Publisher Representation
====================================================
This appendix provides a fictional example of a ``publisher`` key within
a :pep:`691` ``project.files[].provenance`` listing:
.. code-block:: json
"publisher": {
"type": "GitHub",
"claims": {
"ref": "refs/tags/v1.0.0",
"sha": "da39a3ee5e6b4b0d3255bfef95601890afd80709"
},
"repository_name": "HolyGrail",
"repository_owner": "octocat",
"repository_owner_id": "1",
"workflow_filename": "publish.yml",
"environment": null
}
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.