180 lines
6.8 KiB
ReStructuredText
180 lines
6.8 KiB
ReStructuredText
PEP: 658
|
|
Title: Serve Distribution Metadata in the Simple Repository API
|
|
Author: Tzu-ping Chung <uranusjr@gmail.com>
|
|
Sponsor: Brett Cannon <brett@python.org>
|
|
PEP-Delegate: Donald Stufft <donald@stufft.io>
|
|
Discussions-To: https://discuss.python.org/t/8651
|
|
Status: Accepted
|
|
Type: Standards Track
|
|
Topic: Packaging
|
|
Content-Type: text/x-rst
|
|
Created: 10-May-2021
|
|
Post-History: 10-May-2021
|
|
Resolution: https://discuss.python.org/t/8651/48
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes adding an anchor tag to expose the ``METADATA`` file
|
|
from distributions in the :pep:`503` "simple" repository API. A
|
|
``data-dist-info-metadata`` attribute is introduced to indicate that
|
|
the file from a given distribution can be independently fetched.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
Package management workflows made popular by recent tooling increase
|
|
the need to inspect distribution metadata without intending to install
|
|
the distribution, and download multiple distributions of a project to
|
|
choose from based on their metadata. This means they end up discarding
|
|
much downloaded data, which is inefficient and results in a bad user
|
|
experience.
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
Tools have been exploring methods to reduce the download size by
|
|
partially downloading wheels with HTTP range requests. This, however,
|
|
adds additional run-time requirements to the repository server. It
|
|
also still adds additional overhead, since a separate request is
|
|
needed to fetch the wheel's file listing to find the correct offset to
|
|
fetch the metadata file. It is therefore desired to make the server
|
|
extract the metadata file in advance, and serve it as an independent
|
|
file to avoid the need to perform additional requests and ZIP
|
|
inspection.
|
|
|
|
The metadata file defined by the Core Metadata Specification
|
|
[core-metadata]_ will be served directly by repositories since it
|
|
contains the necessary information for common use cases. The metadata
|
|
must only be served for standards-compliant distributions such as
|
|
wheels [wheel]_ and sdists [sdist]_, and must be identical to the
|
|
distribution's canonical metadata file, such as a wheel's ``METADATA``
|
|
file in the ``.dist-info`` directory [dist-info]_.
|
|
|
|
An HTML attribute on the distribution file's anchor link is needed to
|
|
indicate whether a client is able to choose the separately served
|
|
metadata file. The attribute is also used to provide the metadata
|
|
content's hash for client-side verification. The attribute's absence
|
|
indicates that a separate metadata entry is not available for the
|
|
distribution, either because of the distribution's content, or lack of
|
|
repository support.
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
In a simple repository's project page, each anchor tag pointing to a
|
|
distribution **MAY** have a ``data-dist-info-metadata`` attribute. The
|
|
presence of the attribute indicates the distribution represented by
|
|
the anchor tag **MUST** contain a Core Metadata file that will not be
|
|
modified when the distribution is processed and/or installed.
|
|
|
|
If a ``data-dist-info-metadata`` attribute is present, the repository
|
|
**MUST** serve the distribution's Core Metadata file alongside the
|
|
distribution with a ``.metadata`` appended to the distribution's file
|
|
name. For example, the Core Metadata of a distribution served at
|
|
``/files/distribution-1.0-py3.none.any.whl`` would be located at
|
|
``/files/distribution-1.0-py3.none.any.whl.metadata``. This is similar
|
|
to how :pep:`503` specifies the GPG signature file's location.
|
|
|
|
The repository **SHOULD** provide the hash of the Core Metadata file
|
|
as the ``data-dist-info-metadata`` attribute's value using the syntax
|
|
``<hashname>=<hashvalue>``, where ``<hashname>`` is the lower cased
|
|
name of the hash function used, and ``<hashvalue>`` is the hex encoded
|
|
digest. The repository **MAY** use ``true`` as the attribute's value
|
|
if a hash is unavailable.
|
|
|
|
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
If an anchor tag lacks the ``data-dist-info-metadata`` attribute,
|
|
tools are expected to revert to their current behaviour of downloading
|
|
the distribution to inspect the metadata.
|
|
|
|
Older tools not supporting the new ``data-dist-info-metadata``
|
|
attribute are expected to ignore the attribute and maintain their
|
|
current behaviour of downloading the distribution to inspect the
|
|
metadata. This is similar to how prior ``data-`` attribute additions
|
|
expect existing tools to operate.
|
|
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
Put metadata content on the project page
|
|
----------------------------------------
|
|
|
|
Since tools generally only need dependency information from a
|
|
distribution in addition to what's already available on the project
|
|
page, it was proposed that repositories may directly include the
|
|
information on the project page, like the ``data-requires-python``
|
|
attribute specified in :pep:`503`.
|
|
|
|
This approach was abandoned since a distribution may contain
|
|
arbitrarily long lists of dependencies (including required and
|
|
optional), and it is unclear whether including the information for
|
|
every distribution in a project would result in net savings since the
|
|
information for most distributions generally ends up unneeded. By
|
|
serving the metadata separately, performance can be better estimated
|
|
since data usage will be more proportional to the number of
|
|
distributions inspected.
|
|
|
|
|
|
Expose more files in the distribution
|
|
-------------------------------------
|
|
|
|
It was proposed to provide the entire ``.dist-info`` directory as a
|
|
separate part, instead of only the metadata file. However, serving
|
|
multiple files in one entity through HTTP requires re-archiving them
|
|
separately after they are extracted from the original distribution
|
|
by the repository server, and there are no current use cases for files
|
|
other than ``METADATA`` when the distribution itself is not going to
|
|
be installed.
|
|
|
|
It should also be noted that the approach taken here does not
|
|
preclude other files from being introduced in the future, whether we
|
|
want to serve them together or individually.
|
|
|
|
|
|
Explicitly specify the metadata file's URL on the project page
|
|
--------------------------------------------------------------
|
|
|
|
An early version of this draft proposed putting the metadata file's
|
|
URL in the ``data-dist-info-metadata`` attribute. But people feel it
|
|
is better for discoverability to require the repository to serve the
|
|
metadata file at a determined location instead. The current approach
|
|
also has an additional benefit of making the project page smaller.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
.. [core-metadata] https://packaging.python.org/specifications/core-metadata/
|
|
|
|
.. [dist-info] https://packaging.python.org/specifications/recording-installed-packages/
|
|
|
|
.. [wheel] https://packaging.python.org/specifications/binary-distribution-format/
|
|
|
|
.. [sdist] https://packaging.python.org/specifications/source-distribution-format/
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
..
|
|
Local Variables:
|
|
mode: indented-text
|
|
indent-tabs-mode: nil
|
|
sentence-end-double-space: t
|
|
fill-column: 70
|
|
coding: utf-8
|
|
End:
|