python-peps/peps/pep-0658.rst

180 lines
6.8 KiB
ReStructuredText
Raw Normal View History

PEP: 658
Title: Serve Distribution Metadata in the Simple Repository API
Author: Tzu-ping Chung <uranusjr@gmail.com>
Sponsor: Brett Cannon <brett@python.org>
PEP-Delegate: Donald Stufft <donald@stufft.io>
Discussions-To: https://discuss.python.org/t/8651
2021-08-17 18:37:50 -04:00
Status: Accepted
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 10-May-2021
Post-History: 10-May-2021
2021-08-17 18:37:50 -04:00
Resolution: https://discuss.python.org/t/8651/48
Abstract
========
This PEP proposes adding an anchor tag to expose the ``METADATA`` file
from distributions in the :pep:`503` "simple" repository API. A
2021-08-23 15:14:11 -04:00
``data-dist-info-metadata`` attribute is introduced to indicate that
the file from a given distribution can be independently fetched.
Motivation
==========
Package management workflows made popular by recent tooling increase
the need to inspect distribution metadata without intending to install
the distribution, and download multiple distributions of a project to
choose from based on their metadata. This means they end up discarding
much downloaded data, which is inefficient and results in a bad user
experience.
Rationale
=========
Tools have been exploring methods to reduce the download size by
partially downloading wheels with HTTP range requests. This, however,
adds additional run-time requirements to the repository server. It
also still adds additional overhead, since a separate request is
needed to fetch the wheel's file listing to find the correct offset to
fetch the metadata file. It is therefore desired to make the server
extract the metadata file in advance, and serve it as an independent
file to avoid the need to perform additional requests and ZIP
inspection.
The metadata file defined by the Core Metadata Specification
[core-metadata]_ will be served directly by repositories since it
contains the necessary information for common use cases. The metadata
must only be served for standards-compliant distributions such as
wheels [wheel]_ and sdists [sdist]_, and must be identical to the
distribution's canonical metadata file, such as a wheel's ``METADATA``
file in the ``.dist-info`` directory [dist-info]_.
An HTML attribute on the distribution file's anchor link is needed to
indicate whether a client is able to choose the separately served
metadata file. The attribute is also used to provide the metadata
content's hash for client-side verification. The attribute's absence
indicates that a separate metadata entry is not available for the
distribution, either because of the distribution's content, or lack of
repository support.
Specification
=============
In a simple repository's project page, each anchor tag pointing to a
distribution **MAY** have a ``data-dist-info-metadata`` attribute. The
presence of the attribute indicates the distribution represented by
the anchor tag **MUST** contain a Core Metadata file that will not be
modified when the distribution is processed and/or installed.
If a ``data-dist-info-metadata`` attribute is present, the repository
**MUST** serve the distribution's Core Metadata file alongside the
distribution with a ``.metadata`` appended to the distribution's file
name. For example, the Core Metadata of a distribution served at
``/files/distribution-1.0-py3.none.any.whl`` would be located at
``/files/distribution-1.0-py3.none.any.whl.metadata``. This is similar
to how :pep:`503` specifies the GPG signature file's location.
The repository **SHOULD** provide the hash of the Core Metadata file
as the ``data-dist-info-metadata`` attribute's value using the syntax
``<hashname>=<hashvalue>``, where ``<hashname>`` is the lower cased
name of the hash function used, and ``<hashvalue>`` is the hex encoded
digest. The repository **MAY** use ``true`` as the attribute's value
if a hash is unavailable.
Backwards Compatibility
=======================
If an anchor tag lacks the ``data-dist-info-metadata`` attribute,
tools are expected to revert to their current behaviour of downloading
the distribution to inspect the metadata.
Older tools not supporting the new ``data-dist-info-metadata``
attribute are expected to ignore the attribute and maintain their
current behaviour of downloading the distribution to inspect the
metadata. This is similar to how prior ``data-`` attribute additions
expect existing tools to operate.
Rejected Ideas
==============
Put metadata content on the project page
----------------------------------------
2021-05-12 10:14:28 -04:00
Since tools generally only need dependency information from a
distribution in addition to what's already available on the project
page, it was proposed that repositories may directly include the
information on the project page, like the ``data-requires-python``
attribute specified in :pep:`503`.
This approach was abandoned since a distribution may contain
arbitrarily long lists of dependencies (including required and
optional), and it is unclear whether including the information for
every distribution in a project would result in net savings since the
information for most distributions generally ends up unneeded. By
serving the metadata separately, performance can be better estimated
since data usage will be more proportional to the number of
distributions inspected.
Expose more files in the distribution
-------------------------------------
It was proposed to provide the entire ``.dist-info`` directory as a
2023-10-13 01:15:59 -04:00
separate part, instead of only the metadata file. However, serving
multiple files in one entity through HTTP requires re-archiving them
separately after they are extracted from the original distribution
by the repository server, and there are no current use cases for files
other than ``METADATA`` when the distribution itself is not going to
be installed.
It should also be noted that the approach taken here does not
preclude other files from being introduced in the future, whether we
want to serve them together or individually.
Explicitly specify the metadata file's URL on the project page
--------------------------------------------------------------
An early version of this draft proposed putting the metadata file's
URL in the ``data-dist-info-metadata`` attribute. But people feel it
is better for discoverability to require the repository to serve the
metadata file at a determined location instead. The current approach
also has an additional benefit of making the project page smaller.
References
==========
.. [core-metadata] https://packaging.python.org/specifications/core-metadata/
.. [dist-info] https://packaging.python.org/specifications/recording-installed-packages/
.. [wheel] https://packaging.python.org/specifications/binary-distribution-format/
.. [sdist] https://packaging.python.org/specifications/source-distribution-format/
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: