PEP: 714
Title: Rename dist-info-metadata in the Simple API
Author: Donald Stufft <donald@stufft.io>
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: https://discuss.python.org/t/27471
Status: Accepted
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 06-Jun-2023
Post-History: `06-Jun-2023 <https://discuss.python.org/t/27471>`__
Resolution: https://discuss.python.org/t/27471/19
Abstract
========
This PEP renames the metadata provided by :pep:`658` in both HTML and JSON
formats of the Simple API and provides guidelines for both clients and servers
in how to handle the renaming.
Motivation
==========
:pep:`658` specified a mechanism to host the core metadata files from an
artifact available through the Simple API such that a client could fetch the
metadata and use it without having to download the entire artifact. Later
:pep:`691` was written to add the ability to use JSON rather than HTML on the
Simple API, which included support for the :pep:`658` metadata.
Unfortunately, PyPI did not support :pep:`658` until just
`recently <https://github.com/pypi/warehouse/pull/13649>`__, and that support was
released with a `bug <https://github.com/pypi/warehouse/issues/13705>`__ where the
``dist-info-metadata`` key from :pep:`658` was incorrectly named
``data-dist-info-metadata`` in the JSON representation. However, when
attempting to fix that bug, it was discovered that pip *also* had a
`bug <https://github.com/pypa/pip/issues/12042>`__, where any use of
``dist-info-metadata`` in the JSON representation would cause pip to hard fail
with an exception.
The bug in pip has existed since at least ``v22.3``, which means that it has
been released for approximately 8 months, long enough to have been pulled into
Python releases, downstream Linux releases, baked into containers, virtual
environments, etc.
This puts us in the awkward position of having a bug on PyPI that cannot be fixed
without breaking pip, because of a bug in pip, and the affected versions of pip
are old enough to have been widely deployed. To make matters worse, a version of
pip that is broken in this way cannot install *anything* from PyPI once PyPI
fixes its bug, including a new, fixed version of pip.
Rationale
=========
There are 3 main options for a path forward for fixing these bugs:

1. Do not change the spec, fix the bug in pip, wait some amount of time, then
   fix the bug in PyPI, breaking anyone using an unfixed pip such that they
   cannot even install a new pip from PyPI.
2. Do the same as (1), but special case PyPI so it does not emit the :pep:`658`
   metadata for pip, even if it is available. This allows people to upgrade pip
   if they're on a broken version, but nothing else.
3. Change the spec to avoid the key that pip can't handle currently, allowing
   PyPI to emit that key and a new version of pip to be released to take
   advantage of that key.

This PEP chooses (3), but goes a little further and also renames the key in the
HTML representation.
Typically we do not change specs because of bugs that only affect one particular
implementation, unless the spec itself is at fault, which isn't the case here:
the spec is fine and these are just genuine bugs in pip and PyPI.
However, we choose to do this for 4 reasons:

1. Bugs that affect pip and PyPI together represent an outsized amount of impact
   compared to any other client or repository combination.
2. The impact of being broken is that installs do not function, at all, rather
   than degrading gracefully in some way.
3. The feature that is being blocked by these bugs is of large importance to
   the ability to quickly and efficiently resolve dependencies from PyPI with
   pip, and having to delay it for a long period of time while we wait for the
   broken versions of pip to fall out of use would be of detriment to the entire
   ecosystem.
4. The downsides of changing the spec are fairly limited, given that we do not
   believe that support for this is widespread, so it affects only a limited
   number of projects.

Specification
=============
The key words "**MUST**", "**MUST NOT**", "**REQUIRED**", "**SHALL**",
"**SHALL NOT**", "**SHOULD**", "**SHOULD NOT**", "**RECOMMENDED**", "**MAY**",
and "**OPTIONAL**" in this document are to be interpreted as described in
:rfc:`RFC 2119 <2119>`.
Servers
-------
The :pep:`658` metadata, when used in the HTML representation of the Simple API,
**MUST** be emitted using the attribute name ``data-core-metadata``, with the
supported values remaining the same.
The :pep:`658` metadata, when used in the :pep:`691` JSON representation of the
Simple API, **MUST** be emitted using the key ``core-metadata``, with the
supported values remaining the same.
To support clients that used the previous key names, the HTML representation
**MAY** also emit the legacy ``data-dist-info-metadata`` attribute, and if it
does so the value **MUST** match the value of ``data-core-metadata``.
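
As an illustration only, the following minimal sketch shows one way a server
might emit the new names. The function names and the placeholder file entry
are hypothetical and not part of any real repository implementation. The value
formats assume the rules of :pep:`658` (``<hashname>=<hashvalue>`` or ``true``
in HTML) and :pep:`691` (a boolean or a dictionary of hashes in JSON), which
this PEP leaves unchanged.

```python
from html import escape


def render_anchor(filename, url, sha256=None):
    """Render one file link for the HTML representation (hypothetical helper).

    The attribute value is "sha256=<hex>" when a hash of the metadata file
    is known, and "true" otherwise.
    """
    value = f"sha256={sha256}" if sha256 else "true"
    return (
        f'<a href="{escape(url, quote=True)}" '
        f'data-core-metadata="{escape(value, quote=True)}">'
        f"{escape(filename)}</a>"
    )


def render_file_entry(filename, url, sha256=None):
    """Build one file entry for the JSON representation (hypothetical helper).

    The value is a dictionary of hashes when a hash is known, and True
    otherwise. The empty "hashes" mapping is a placeholder for the hashes of
    the artifact itself.
    """
    return {
        "filename": filename,
        "url": url,
        "hashes": {},
        "core-metadata": {"sha256": sha256} if sha256 else True,
    }
```

Neither helper emits a legacy name. A server that chooses to keep the legacy
HTML attribute would add ``data-dist-info-metadata`` with the identical value.
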
Clients
-------
Clients consuming any of the HTML representations of the Simple API **MUST**
read the :pep:`658` metadata from the key ``data-core-metadata`` if it is
present. They **MAY** optionally use the legacy ``data-dist-info-metadata`` if
it is present but ``data-core-metadata`` is not.
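
The lookup order for HTML could be sketched as follows, using the standard
library ``html.parser`` module. The class name and the shape of the collected
tuples are hypothetical choices made for this example. The sketch checks for
the *presence* of each attribute rather than its value, so that a present
``data-core-metadata`` always wins over the legacy attribute.

```python
from html.parser import HTMLParser


class CoreMetadataParser(HTMLParser):
    """Collect (href, attribute name used, attribute value) for each anchor."""

    # Newer name first, legacy name only as a fallback.
    NAMES = ("data-core-metadata", "data-dist-info-metadata")

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        for name in self.NAMES:
            if name in attributes:
                self.links.append((href, name, attributes[name]))
                return
        # Neither attribute is present: no metadata is advertised.
        self.links.append((href, None, None))
```

A client that does not want to support the legacy attribute at all would
simply drop the second entry from ``NAMES``.
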
Clients consuming the JSON representation of the Simple API **MUST** read the
:pep:`658` metadata from the key ``core-metadata`` if it is present. They
**MAY** optionally use the legacy ``dist-info-metadata`` key if it is present
but ``core-metadata`` is not.
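
For JSON, the same rule might look like the sketch below. The function name is
hypothetical, and ``file_entry`` is assumed to be one already-decoded element
of the ``files`` list from a :pep:`691` response. A sentinel object is used so
that an explicit ``"core-metadata": false`` still takes precedence over a
legacy key.

```python
_MISSING = object()


def core_metadata(file_entry):
    """Return the PEP 658 metadata value for one file entry.

    Prefers "core-metadata" and falls back to the legacy
    "dist-info-metadata" key only when the newer key is absent.
    Returns False when neither key is present.
    """
    for key in ("core-metadata", "dist-info-metadata"):
        value = file_entry.get(key, _MISSING)
        if value is not _MISSING:
            return value
    return False
```
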
Backwards Compatibility
=======================
There is a minor compatibility break in this PEP, in that clients that currently
correctly handle the existing metadata keys will not automatically understand
the newer metadata keys, but they should degrade gracefully, and simply act
as if the :pep:`658` metadata does not exist.
Otherwise there should be no compatibility problems with this PEP.
Rejected Ideas
==============
Leave the spec unchanged, and cope with fixing in PyPI and/or pip
-----------------------------------------------------------------
We believe that the improvements brought by :pep:`658` are very important to
improving the performance of resolving dependencies from PyPI, and would like to
be able to deploy it as quickly as we can.
Unfortunately the nature of these bugs is that we cannot deploy those
improvements as is without breaking widely deployed and used versions of pip. The breakages in
this case would be bad enough that affected users would not even be able to
directly upgrade their version of pip to fix it, but would have to manually
fetch pip another way first (e.g. ``get-pip.py``).
This is something that PyPI would be unwilling to do without some way to
mitigate those breakages for those users. Without some reasonable mitigation
strategy, we would have to wait until those versions of pip are no longer in use
on PyPI, which would likely be 5+ years from now.
There are a few possible mitigation strategies that we could use, but we've
rejected them as well.
Mitigation: Special Case pip
++++++++++++++++++++++++++++
The breakages are particularly bad in that they prevent users from even upgrading
pip to get an unbroken version of pip, so a command like
``pip install --upgrade pip`` would fail. We could mitigate this by having PyPI
special case pip itself, so that the JSON endpoint never returns the :pep:`658`
metadata and the above still works.
This PEP rejects this idea because while the simple command that only upgrades
pip would work, if the user included *anything* else in that command to upgrade
then the command would go back to failing, which we consider to be still too
large of a breakage.
Additionally, while this bug happens to be getting exposed right now with PyPI,
it is really a bug that would happen with any :pep:`691` repository that
correctly exposed the :pep:`658` metadata. This would mean that every repository
would have to carry this special case for pip.
Mitigation: Have the server use User-Agent Detection
++++++++++++++++++++++++++++++++++++++++++++++++++++
pip puts its version number into its ``User-Agent``, which means that the server
could detect the version number and serve different responses based on that
version number so that we don't serve the :pep:`658` metadata to versions of pip
that are broken.
This PEP rejects this idea because supporting ``User-Agent`` detection is too
difficult to implement in a reasonable way:

1. On PyPI we rely heavily on caching the Simple API in our CDN. If we varied
   the responses based on ``User-Agent``, then our CDN cache would have an
   explosion of cache keys for the same content, which would make it more likely
   that any particular request would not be cached and fall back to hitting
   our backend servers, which would have to scale much higher to support the
   load.
2. PyPI *could* support the ``User-Agent`` detection idea by mutating the
   ``Accept`` header of the request so that those versions appear to only
   accept the HTML version, allowing us to maintain the CDN's cache keys. This
   doesn't affect any downstream caches of PyPI though, including pip's HTTP
   cache, which could have JSON versions cached for those requests. We
   wouldn't emit a ``Vary`` on ``User-Agent``, so those caches could not know
   that it isn't acceptable to share the cached responses. Adding a
   ``Vary: User-Agent`` for downstream caches would have the same problem as
   (1), but for downstream caches instead of our CDN cache.
3. The pip bug ultimately isn't PyPI specific; it affects any repository that
   implements :pep:`691` and :pep:`658` together. This would mean that
   workarounds that rely on implementation specific fixes have to be replicated
   for each repository that implements both, which may not be easy or possible
   in all cases (static mirrors may not be able to do this ``User-Agent``
   detection for instance).

Only change the JSON key
------------------------
The bug in pip only affects the JSON representation of the Simple API, so we only
*need* to actually change the key in the JSON, and we could leave the existing
HTML keys alone.
This PEP rejects doing that because we believe that in the long term, having
the HTML and JSON key names diverge would make mistakes like this more likely
and make implementing and understanding the spec more confusing.
The main reason that we would want to not change the HTML keys is to not lose
:pep:`658` support in any HTML only clients or repositories that might already
support it. This PEP mitigates that breakage by allowing both clients and
servers to continue to support both keys, with a recommendation of when and
how to do that.
Recommendations
===============
The recommendations in this section, other than this notice itself, are
non-normative, and represent what the PEP authors believe to be the best default
implementation decisions for something implementing this PEP, but it does not
represent any sort of requirement to match these decisions.
Servers
-------
We recommend that servers *only* emit the newer keys, particularly for the JSON
representation of the Simple API since the bug itself only affected JSON.
Servers that wish to support :pep:`658` for HTML clients that have already
implemented it can safely emit both keys, but *only* in HTML.
Servers should not emit the old keys in JSON unless they know that no broken
versions of pip will be used to access their server.
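
One possible way to encode this policy is sketched below. The function name
and the ``emit_legacy_html`` flag are hypothetical. The sketch deliberately
offers no option to emit the old key in JSON, which a server that fully
controls its client population could choose to add.

```python
def metadata_fields(representation, value, emit_legacy_html=False):
    """Return the name -> value pairs to emit for one file.

    representation: "html" or "json".
    value: the already formatted PEP 658 value for that representation.
    emit_legacy_html: also emit the old attribute name, in HTML only.
    """
    if representation == "json":
        # The old JSON key is what breaks affected pip versions, so it is
        # never emitted here, whatever the flag says.
        return {"core-metadata": value}
    if representation == "html":
        fields = {"data-core-metadata": value}
        if emit_legacy_html:
            # The legacy attribute must carry the same value as the new one.
            fields["data-dist-info-metadata"] = value
        return fields
    raise ValueError(f"unknown representation: {representation!r}")
```
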
Clients
-------
We recommend that clients support both keys, for both HTML and JSON, preferring
the newer key as this PEP requires. This will allow clients to support
repositories that already have correctly implemented :pep:`658` and :pep:`691`
but have not implemented this PEP.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.