PEP 691: Refactor based on Feedback (#2592)
* Describe JSON format first and more clearly
* Clarify versioning
* Move content types down and flesh them out more
* Expand the conneg section
* Expand upon the impact to PEP 458
* Add alternative mechanisms for conneg
* Provide recommendations to clients and servers
* Add a FAQ for implications for static file serving
* Add a FAQ to clarify that TUF targets != URLs
* Add a FAQ for application/json
* Add a FAQ about PyPI
* Add support for PEP 629 in the JSON response body
* Add PyPI to the appendix
* Recommend ;q=0 on text/html
* Rename the dist-info-metadata-available field
* Update the PEP-Delegate to Brett Cannon
parent
7557f1959f
commit
178afaf170
644
pep-0691.rst
644
pep-0691.rst
|
@ -7,7 +7,7 @@ Author: Donald Stufft <donald@stufft.io>,
|
||||||
Status: Draft
|
Status: Draft
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
BDFL-Delegate: Donald Stufft <donald@stufft.io>
|
PEP-Delegate: Brett Cannon <brett@python.org>
|
||||||
Discussions-To: https://discuss.python.org/t/pep-691-json-based-simple-api-for-python-package-indexes/15553
|
Discussions-To: https://discuss.python.org/t/pep-691-json-based-simple-api-for-python-package-indexes/15553
|
||||||
Created: 04-May-2022
|
Created: 04-May-2022
|
||||||
Post-History: `05-May-2022 <https://discuss.python.org/t/pep-691-json-based-simple-api-for-python-package-indexes/15553>`__
|
Post-History: `05-May-2022 <https://discuss.python.org/t/pep-691-json-based-simple-api-for-python-package-indexes/15553>`__
|
||||||
|
@ -53,10 +53,11 @@ that effort has not gained much if any traction beyond people thinking that it
|
||||||
would be nice to do it.
|
would be nice to do it.
|
||||||
|
|
||||||
This PEP attempts to take a different route. It doesn't fundamentally change
|
This PEP attempts to take a different route. It doesn't fundamentally change
|
||||||
the overall API structure, but instead specifies a new representation of the
|
the overall API structure, but instead specifies a new serialization of the
|
||||||
existing data contained in existing :pep:`503` responses in a format that is
|
existing data contained in existing :pep:`503` responses in a format that is
|
||||||
easier for software to parse rather than using a human centric document format.
|
easier for software to parse rather than using a human centric document format.
|
||||||
|
|
||||||
|
|
||||||
Goals
|
Goals
|
||||||
=====
|
=====
|
||||||
|
|
||||||
|
@ -98,109 +99,111 @@ Specification
|
||||||
|
|
||||||
To enable parsing responses with only the standard library, this PEP specifies that
|
To enable parsing responses with only the standard library, this PEP specifies that
|
||||||
all responses (besides the files themselves, and the HTML responses from
|
all responses (besides the files themselves, and the HTML responses from
|
||||||
:pep:`503`) should be encoded using `JSON <https://www.json.org/>`_.
|
:pep:`503`) should be serialized using `JSON <https://www.json.org/>`_.
|
||||||
|
|
||||||
To enable zero configuration discovery and to minimize the amount of additional HTTP
|
To enable zero configuration discovery and to minimize the amount of additional HTTP
|
||||||
requests, this PEP extends :pep:`503` such that all of the API endpoints (other than the
|
requests, this PEP extends :pep:`503` such that all of the API endpoints (other than the
|
||||||
files themselves) will utilize HTTP content negotiation to allow client and server to
|
files themselves) will utilize HTTP content negotiation to allow client and server to
|
||||||
select the correct format to serve, i.e. either HTML or JSON.
|
select the correct serialization format to serve, i.e. either HTML or JSON.
|
||||||
|
|
||||||
Format Selection
|
|
||||||
----------------
|
|
||||||
|
|
||||||
A HTML response will be the default when requesting in version 1.0:
|
|
||||||
|
|
||||||
- ``/simple/``
|
|
||||||
- ``/simple/foo/``
|
|
||||||
- Like :pep:`503`, the trailing ``/`` is expected
|
|
||||||
|
|
||||||
To request a JSON response, the ``Accept`` header will need to be added to the
|
|
||||||
request to specify the response type and version. For version 1.0 this will look like:
|
|
||||||
|
|
||||||
``Accept: application/vnd.pypi.simple.v1+json``
|
|
||||||
|
|
||||||
The version is also optional and will then always return the latest version:
|
|
||||||
|
|
||||||
``Accept: application/vnd.pypi.simple+json``
|
|
||||||
|
|
||||||
This is for clients who always want latest and should expect potential
|
|
||||||
breakages. Additionally, it is a potentially useful way to run integration tests
|
|
||||||
against a possibly breaking version.
|
|
||||||
|
|
||||||
Specifying HTML is also allowed so clients can be explicit to backends (e.g if we
|
|
||||||
switch to JSON default in the future):
|
|
||||||
|
|
||||||
``Accept: application/vnd.pypi.simple.v1+html``
|
|
||||||
|
|
||||||
Using ``text/html`` will also work, which will serve the latest API version. To
|
|
||||||
be explicit, clients should use specific HTML ``Accept``. If no
|
|
||||||
``Accept`` is specified, the latest HTML version will be returned unless
|
|
||||||
the backend *only* supports JSON. Backends may default to returning JSON in the
|
|
||||||
future.
|
|
||||||
|
|
||||||
The ``Accept:`` header also allows you to say that you prefer the V1 Simple JSON API,
|
|
||||||
if that's not available then you prefer the V1 HTML API, and if that's not available,
|
|
||||||
just ``text/html``. To do this would look like:
|
|
||||||
|
|
||||||
``Accept: application/vnd.pypi.simple.v1+json, application/vnd.pypi.simple.v1+html, text/html``
|
|
||||||
|
|
||||||
Versioning
|
Versioning
|
||||||
----------
|
----------
|
||||||
|
|
||||||
Versioning will adhere to :pep:`629` format (``Major.Minor``) and will be
|
Versioning will adhere to :pep:`629` format (``Major.Minor``), which has defined the
|
||||||
included in the ``Accept`` request that clients add to obtain a JSON
|
existing HTML responses to be ``1.0``. Since this PEP does not introduce new features
|
||||||
response. We don't foresee the use of *Minor* versioning but will support it if
|
into the API, rather it describes a different serialization format for the existing
|
||||||
the need does arise.
|
features, this PEP does not change the existing ``1.0`` version, and instead just
|
||||||
|
describes how to serialize that into JSON.
|
||||||
|
|
||||||
The header for clients accessing version 1.0 of the API will be:
|
Similarly to :pep:`629`, the major version number **MUST** be incremented if any
|
||||||
|
changes to the new format would result in no longer being able to expect existing
|
||||||
|
clients to meaningfully understand the format.
|
||||||
|
|
||||||
``application/vnd.pypi.simple.index.v1+json``
|
Likewise, the minor version **MUST** be incremented if features are
|
||||||
|
added or removed from the format, but existing clients would be expected to continue
|
||||||
|
to meaningfully understand the format.
|
||||||
|
|
||||||
An example for Accept values that a newer APIs could support **would** look like:
|
Changes that would not result in existing clients being unable to meaningfully
|
||||||
|
understand the format and which do not represent features being added or removed
|
||||||
|
may occur without changing the version number.
|
||||||
|
|
||||||
``application/vnd.pypi.simple.index.v2+json``
|
This is intentionally vague, as this PEP believes it is best left up to future PEPs
|
||||||
|
that make any changes to the API to investigate and decide whether or not that
|
||||||
|
change should increment the major or minor version.
|
||||||
|
|
||||||
If a version that does not exist is requested, the server will explicitly return a
|
Future versions of the API may add things that can only be represented in a subset
|
||||||
`406 Not Acceptable
|
of the available serializations of that version. The version numbers of all serializations
|
||||||
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/406>`_ HTTP status
|
**SHOULD** be kept in sync, but the specifics of how a feature serializes into each
|
||||||
code. The response will also indicate available API versions and links to
|
format may differ, including whether or not that feature is present at all.
|
||||||
version formats.
|
|
||||||
|
It is the intent of this PEP that the API should be thought of as URL endpoints that
|
||||||
|
return data, whose interpretation is defined by the version of that data, and then
|
||||||
|
serialized into the target serialization format.
|
||||||
|
|
||||||
|
|
||||||
TUF Support - PEP 458
|
JSON Serialization
|
||||||
---------------------
|
------------------
|
||||||
|
|
||||||
:pep:`458` states that the "Simple Index" needs to be hashable. To adhere to the TUF
|
The URL structure from :pep:`503` still applies, as this PEP only adds an additional
|
||||||
standard, we will need a target for each response, i.e. the HTML and JSON (plus any
|
serialization format for the already existing API.
|
||||||
future type) response. To provide this we could have two targets per API endpoint:
|
|
||||||
|
|
||||||
- ``/simple/foo/vnd.pypi.simple.v1.html``
|
The following constraints apply to all JSON serialized responses described in this
|
||||||
- ``/simple/foo/vnd.pypi.simple.v1.json``
|
PEP:
|
||||||
|
|
||||||
Additionally, when calculating the digest of a JSON response, indices should
|
* All JSON responses will *always* be a JSON object rather than an array or other
|
||||||
use the `Canonical JSON <https://wiki.laptop.org/go/Canonical_JSON>`_ format.
|
type.
|
||||||
|
|
||||||
|
* While JSON doesn't natively support an URL type, any value that represents an
|
||||||
|
URL in this API may be either absolute or relative as long as they point to
|
||||||
|
the correct location. If relative, they are relative to the current URL as if
|
||||||
|
it were HTML.
|
||||||
|
|
||||||
|
* Additional keys may be added to any dictionary objects in the API responses
|
||||||
|
and clients **MUST** ignore keys that they don't understand.
|
||||||
|
|
||||||
|
* All JSON responses will have a ``meta`` key, which contains information related to
|
||||||
|
the response itself, rather than the content of the response.
|
||||||
|
|
||||||
|
* All JSON responses will have a ``meta.api-version`` key, which will be a string that
|
||||||
|
contains the :pep:`629` ``Major.Minor`` version number, with the same fail/warn
|
||||||
|
semantics as in :pep:`629`.
|
||||||
|
|
||||||
|
* All requirements of :pep:`503` that are not HTML specific still apply.
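
As a minimal sketch of how a client might apply the ``meta.api-version`` constraint above, the following assumes the :pep:`629` semantics of failing on a newer major version and warning on a newer minor version. The names ``check_api_version``, ``UnsupportedAPIVersion``, and the ``SUPPORTED_*`` constants are hypothetical and not defined by this PEP.

```python
import warnings

# Hypothetical: the newest API version this client was written against.
SUPPORTED_MAJOR = 1
SUPPORTED_MINOR = 0


class UnsupportedAPIVersion(Exception):
    """Raised when the response uses a major version newer than we support."""


def check_api_version(response_body):
    """Apply PEP 629 style fail/warn semantics to ``meta.api-version``.

    ``response_body`` is the already decoded JSON object.
    Returns the (major, minor) tuple when the response is usable.
    """
    version = response_body["meta"]["api-version"]
    major_text, minor_text = version.split(".")
    major, minor = int(major_text), int(minor_text)

    if major > SUPPORTED_MAJOR:
        # A newer major version may not be meaningfully understood: fail.
        raise UnsupportedAPIVersion(f"unsupported api-version: {version}")
    if major == SUPPORTED_MAJOR and minor > SUPPORTED_MINOR:
        # A newer minor version is still understandable, but features
        # may be present that this client does not know about: warn.
        warnings.warn(f"api-version {version} is newer than supported")
    return major, minor
```

Unknown keys elsewhere in the response are simply never looked up, which is one straightforward way to satisfy the requirement that clients ignore keys they don't understand.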
|
||||||
|
|
||||||
|
|
||||||
Root URL
|
Project List
|
||||||
--------
|
~~~~~~~~~~~~
|
||||||
|
|
||||||
The root URL ``/`` for this PEP (which represents the base URL) will be a JSON encoded
|
The root URL ``/`` for this PEP (which represents the base URL) will be a JSON encoded
|
||||||
dictionary where each key is a string of the normalized project name, and the value is
|
dictionary which has a single key, ``projects``, which is itself a dictionary where each
|
||||||
a dictionary with a single key, ``url``, which represents the URL that the project can
|
key is a string of the normalized project name, and the value is a dictionary with a
|
||||||
be fetched from. As an example::
|
single key, ``url``, which represents the URL that the project can be fetched from. As
|
||||||
|
an example:
|
||||||
|
|
||||||
|
.. code-block:: json
|
||||||
|
|
||||||
{
|
{
|
||||||
"frob": {"url": "/frob/"},
|
"meta": {
|
||||||
"spamspamspam": {"url": "/spamspamspam/"}
|
"api-version": "1.0"
|
||||||
|
},
|
||||||
|
"projects": {
|
||||||
|
"frob": {"url": "/frob/"},
|
||||||
|
"spamspamspam": {"url": "/spamspamspam/"}
|
||||||
|
}
|
||||||
}
|
}
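
A rough sketch of how a client could consume the project list as it is shaped in this revision (a ``projects`` dictionary keyed by normalized name), resolving each ``url`` against the URL the response was fetched from. The function name ``parse_project_list`` and the ``example.com`` URLs are illustrative assumptions only.

```python
import json
from urllib.parse import urljoin


def parse_project_list(body, base_url):
    """Map each normalized project name to an absolute project URL.

    ``body`` is the raw JSON text and ``base_url`` is the URL that the
    response was fetched from. Relative URLs are resolved against it,
    the same way a link in an HTML page would be.
    """
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("a JSON response must be an object")
    return {
        name: urljoin(base_url, info["url"])
        for name, info in data["projects"].items()
    }


body = """
{
  "meta": {"api-version": "1.0"},
  "projects": {
    "frob": {"url": "/frob/"},
    "spamspamspam": {"url": "spamspamspam/"}
  }
}
"""
projects = parse_project_list(body, "https://example.com/simple/")
```

Here ``/frob/`` resolves against the host root, while the relative ``spamspamspam/`` resolves underneath ``/simple/``; both forms are allowed as long as they point to the correct location.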
|
||||||
|
|
||||||
Below the root URL is another URL for each individual project contained within
|
|
||||||
a repository. The format of this URL is ``/<project>/`` where the ``<project>``
|
Project Detail
|
||||||
is replaced by the :pep:`503`-canonicalized name for that project, so a project named
|
~~~~~~~~~~~~~~
|
||||||
"Holy_Grail" would have a URL like ``/holy-grail/``. This URL must respond with a
|
|
||||||
JSON encoded dictionary that has two keys, ``name``, which represents the normalized
|
The format of this URL is ``/<project>/`` where the ``<project>`` is replaced by the
|
||||||
name of the project and ``files``. The ``files`` key is a list of dictionaries,
|
:pep:`503`-canonicalized name for that project, so a project named "Holy_Grail" would
|
||||||
each one representing an individual file.
|
have a URL like ``/holy-grail/``.
|
||||||
|
|
||||||
|
This URL must respond with a JSON encoded dictionary that has two keys, ``name``, which
|
||||||
|
represents the normalized name of the project and ``files``. The ``files`` key is a
|
||||||
|
list of dictionaries, each one representing an individual file.
|
||||||
|
|
||||||
Each individual file dictionary has the following keys:
|
Each individual file dictionary has the following keys:
|
||||||
|
|
||||||
|
@ -214,25 +217,52 @@ Each individual file dictionary has the following keys:
|
||||||
The ``hashes`` dictionary **MUST** be present, even if no hashes are available
|
The ``hashes`` dictionary **MUST** be present, even if no hashes are available
|
||||||
for the file, however it is **HIGHLY** recommended that at least one secure,
|
for the file, however it is **HIGHLY** recommended that at least one secure,
|
||||||
guaranteed to be available hash is always included.
|
guaranteed to be available hash is always included.
|
||||||
|
|
||||||
|
By default, any hash algorithm available via `hashlib
|
||||||
|
<https://docs.python.org/3/library/hashlib.html>`_ (specifically any that can
|
||||||
|
be passed to ``hashlib.new()`` and do not require additional parameters) can
|
||||||
|
be used as a key for the hashes dictionary. At least one secure algorithm from
|
||||||
|
``hashlib.algorithms_guaranteed`` **SHOULD** always be included. At the time
|
||||||
|
of this PEP, ``sha256`` specifically is recommended.
|
||||||
- ``requires-python``: An **optional** key that exposes the *Requires-Python*
|
- ``requires-python``: An **optional** key that exposes the *Requires-Python*
|
||||||
metadata field, specified in :pep:`345`. Where this is present, installer tools
|
metadata field, specified in :pep:`345`. Where this is present, installer tools
|
||||||
**SHOULD** ignore the download when installing to a Python version that
|
**SHOULD** ignore the download when installing to a Python version that
|
||||||
doesn't satisfy the requirement.
|
doesn't satisfy the requirement.
|
||||||
- ``dist-info-metadata-available``: An **optional** key that indicates
|
|
||||||
|
Unlike ``data-requires-python`` in :pep:`503`, the ``requires-python`` key does not
|
||||||
|
require any special escaping other than anything JSON does naturally.
|
||||||
|
- ``dist-info-metadata``: An **optional** key that indicates
|
||||||
that metadata for this file is available, via the same location as specified in
|
that metadata for this file is available, via the same location as specified in
|
||||||
:pep:`658` (`{file_url}.metadata`). Where this is present, it **MUST** be true,
|
:pep:`658` (``{file_url}.metadata``). Where this is present, it **MUST** be
|
||||||
or a dictionary mapping a hash name to a hex encoded digest of the metadata hash.
|
boolean to indicate if the file has an associated metadata file, or a dictionary
|
||||||
|
mapping hash names to a hex encoded digest of the metadata's hash.
|
||||||
|
|
||||||
|
When this is a dictionary of hashes, then all the same requirements and
|
||||||
|
recommendations as the ``hashes`` key hold true for this key as well.
|
||||||
|
|
||||||
|
If this key is missing then the metadata file may or may not exist. If the key
|
||||||
|
value is truthy, then the metadata file is present, and if it is falsey then it
|
||||||
|
is not.
|
||||||
|
|
||||||
|
It is recommended that servers make the hashes of the metadata file available if
|
||||||
|
possible.
|
||||||
- ``gpg-sig``: An **optional** key that acts as a boolean to indicate if the file has
|
- ``gpg-sig``: An **optional** key that acts as a boolean to indicate if the file has
|
||||||
an associated GPG signature or not. If this key does not exist, then the signature
|
an associated GPG signature or not. If this key does not exist, then the signature
|
||||||
may or may not exist.
|
may or may not exist.
|
||||||
- ``yanked``: An **optional** key which may have no value, or may have an
|
- ``yanked``: An **optional** key which may be a boolean to indicate if the file
|
||||||
arbitrary string as a value. The presence of a ``yanked`` key SHOULD
|
has been yanked, or a non empty, but otherwise arbitrary, string to indicate that
|
||||||
be interpreted as indicating that the file pointed to by the ``url`` field
|
a file has been yanked with a specific reason. If the ``yanked`` key is present
|
||||||
has been "Yanked" as per :pep:`592`.
|
and is a truthy value, then it **SHOULD** be interpreted as indicating that the
|
||||||
|
file pointed to by the ``url`` field has been "Yanked" as per :pep:`592`.
|
||||||
|
|
||||||
As an example::
|
As an example:
|
||||||
|
|
||||||
|
.. code-block:: json
|
||||||
|
|
||||||
{
|
{
|
||||||
|
"meta": {
|
||||||
|
"api-version": "1.0"
|
||||||
|
},
|
||||||
"name": "holygrail",
|
"name": "holygrail",
|
||||||
"files": [
|
"files": [
|
||||||
{
|
{
|
||||||
|
@ -247,42 +277,358 @@ As an example::
|
||||||
"url": "https://example.com/files/holygrail-1.0-py3-none-any.whl",
|
"url": "https://example.com/files/holygrail-1.0-py3-none-any.whl",
|
||||||
"hashes": {"sha256": "...", "blake2b": "..."},
|
"hashes": {"sha256": "...", "blake2b": "..."},
|
||||||
"requires-python": ">=3.7",
|
"requires-python": ">=3.7",
|
||||||
"dist-info-metadata-available": true
|
"dist-info-metadata": true
|
||||||
},
|
}
|
||||||
]
|
]
|
||||||
}
|
}
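
The per-file keys described above could be interpreted along the following lines. This is only a sketch under stated assumptions: the helper names (``verify_hashes``, ``yanked_state``, ``metadata_state``) are hypothetical, ``sha256`` is preferred because it is the recommended algorithm, and the ``shake_*`` algorithms are skipped because they require an extra length parameter.

```python
import hashlib


def verify_hashes(content, hashes):
    """Check ``content`` (bytes) against the ``hashes`` dictionary.

    Returns True or False for the first usable algorithm (preferring
    sha256), or None when no usable hash is listed.
    """
    names = sorted(hashes, key=lambda name: (name != "sha256", name))
    for name in names:
        if name.startswith("shake_") or name not in hashlib.algorithms_available:
            continue
        return hashlib.new(name, content).hexdigest() == hashes[name].lower()
    return None


def yanked_state(file_entry):
    """Return (is_yanked, reason) for a file dictionary."""
    value = file_entry.get("yanked", False)
    if isinstance(value, str) and value:
        return True, value  # yanked with a specific reason
    return bool(value), None


def metadata_state(file_entry):
    """Return (present, hashes) for ``dist-info-metadata``.

    ``present`` is None when the key is missing, because the metadata
    file then may or may not exist.
    """
    if "dist-info-metadata" not in file_entry:
        return None, {}
    value = file_entry["dist-info-metadata"]
    return bool(value), value if isinstance(value, dict) else {}
```

When ``metadata_state`` returns a dictionary of hashes, the downloaded ``{file_url}.metadata`` content can be passed through ``verify_hashes`` in the same way as the file itself.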
|
||||||
|
|
||||||
In addition to the above, the following constraints are placed on the API:
|
|
||||||
|
|
||||||
* While JSON doesn't natively support an URL type, any value that represents an
|
Content-Types
|
||||||
URL in this API may be either absolute or relative as long as they point to
|
-------------
|
||||||
the correct location. If relative, they are relative to the current URL as if
|
|
||||||
it were HTML.
|
|
||||||
|
|
||||||
* Additional keys may be added to any dictionary objects in the API responses
|
This PEP proposes that all responses from the Simple API will have a standard
|
||||||
and clients **MUST** ignore keys that they don't understand.
|
content type that describes what the response is (a simple api response), what
|
||||||
|
version of the API it represents, and what serialization format has been used.
|
||||||
|
|
||||||
* By default, any hash algorithm available via `hashlib
|
The structure of this content type will be::
|
||||||
<https://docs.python.org/3/library/hashlib.html>`_ (specifically any that can
|
|
||||||
be passed to ``hashlib.new()`` and do not require additional parameters) can
|
|
||||||
be used as a key for the hashes dictionary. At least one secure algorithm from
|
|
||||||
``hashlib.algorithms_guaranteed`` **SHOULD** always be included. At the time
|
|
||||||
of this PEP, ``sha256`` specifically is recommended.
|
|
||||||
|
|
||||||
* Unlike ``data-requires-python`` in :pep:`503`, the ``requires-python`` key does not
|
application/vnd.pypi.simple.$version+format
|
||||||
require any special escaping other than anything JSON does naturally.
|
|
||||||
|
|
||||||
* Future features **MAY** be implemented or only supported when operating under JSON.
|
Since only major versions should be disruptive to clients attempting to
|
||||||
This would be decided on a case by case basis depending on how important the feature
|
understand one of these API responses, only the major version will be included
|
||||||
is, how widely used HTML is at that point, and how difficult representing the feature
|
in the content type, and will be prefixed with a ``v`` to clarify that it is a
|
||||||
in HTML would be.
|
version number.
|
||||||
|
|
||||||
* All requirements of :pep:`503` that are not HTML specific still apply.
|
Which means that for the existing 1.0 API, the content types would be:
|
||||||
|
|
||||||
|
- **JSON:** ``application/vnd.pypi.simple.v1+json``
|
||||||
|
- **HTML:** ``application/vnd.pypi.simple.v1+html``
|
||||||
|
|
||||||
|
In addition to the above, a special "meta" version is supported named ``latest``,
|
||||||
|
whose purpose is to allow clients to request the absolute latest version, without
|
||||||
|
having to know ahead of time what that version is. It is recommended however,
|
||||||
|
that clients be explicit about what versions they support.
|
||||||
|
|
||||||
|
To support existing clients which expect the existing :pep:`503` API responses to
|
||||||
|
use the ``text/html`` content type, this PEP further defines ``text/html`` as an alias
|
||||||
|
for the ``application/vnd.pypi.simple.v1+html`` content type.
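
To illustrate the content type structure, here is a small sketch that splits a content type into its version and serialization format, treating ``text/html`` as the alias described above. The function name and the regular expression are assumptions made for illustration, not part of the specification.

```python
import re

# Matches application/vnd.pypi.simple.$version+format, where $version is
# either "v" followed by the major version, or the meta version "latest".
_SIMPLE_TYPE = re.compile(
    r"^application/vnd\.pypi\.simple\.(?P<version>v[0-9]+|latest)\+(?P<format>[a-z]+)$"
)


def parse_simple_content_type(content_type):
    """Return (version, format), or None if this is not a Simple API type."""
    # Drop parameters such as "; charset=utf-8" before matching.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        # Alias for application/vnd.pypi.simple.v1+html.
        return "v1", "html"
    match = _SIMPLE_TYPE.match(media_type)
    if match is None:
        return None
    return match["version"], match["format"]
```

For example, ``parse_simple_content_type("application/vnd.pypi.simple.v1+json")`` yields ``("v1", "json")``, while an unrelated type such as ``application/json`` yields ``None``.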
|
||||||
|
|
||||||
|
|
||||||
|
Version + Format Selection
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
Now that there are multiple possible serializations, we need a mechanism to allow
|
||||||
|
clients to indicate which serialization formats they're able to understand. In
|
||||||
|
addition, it would be a benefit if any possible new major version to the API can
|
||||||
|
be added without disrupting existing clients expecting the previous API version.
|
||||||
|
|
||||||
|
To enable this, this PEP standardizes on the use of HTTP's
|
||||||
|
`Server-Driven Content Negotiation <https://developer.mozilla.org/en-US/docs/Web/HTTP/Content_negotiation>`_.
|
||||||
|
|
||||||
|
While this PEP won't fully describe the entirety of server-driven content
|
||||||
|
negotiation, the flow is roughly:
|
||||||
|
|
||||||
|
1. The client makes an HTTP request containing an ``Accept`` header listing all
|
||||||
|
of the version+format content types that they are able to understand.
|
||||||
|
2. The server inspects that header, selects one of the listed content types,
|
||||||
|
then returns a response using that content type.
|
||||||
|
3. If the server does not support any of the content types in the ``Accept``
|
||||||
|
header or if the client did not provide an ``Accept`` header at all, then
|
||||||
|
they are able to choose between 3 different options for how to respond:
|
||||||
|
|
||||||
|
a. Select a default content type other than what the client has requested
|
||||||
|
and return a response with that.
|
||||||
|
b. Return a HTTP ``406 Not Acceptable`` response to indicate that none of
|
||||||
|
the requested content types were available, and the server was unable
|
||||||
|
or unwilling to select a default content type to respond with.
|
||||||
|
c. Return a HTTP ``300 Multiple Choices`` response that contains a list of
|
||||||
|
all of the possible responses that could have been chosen.
|
||||||
|
4. The client interprets the response, handling the different types of responses
|
||||||
|
that the server may have responded with.
|
||||||
|
|
||||||
|
This PEP does not specify which choices the server makes in regards to handling
|
||||||
|
a content type that it isn't able to return, and clients **SHOULD** be prepared
|
||||||
|
to handle all of the possible responses in whatever way makes the most sense for
|
||||||
|
that client.
|
||||||
|
|
||||||
|
However, as there is no standard format for how a ``300 Multiple Choices``
|
||||||
|
response can be interpreted, this PEP highly discourages servers from utilizing
|
||||||
|
that option, as clients will have no way to understand and select a different
|
||||||
|
content-type to request. In addition, it's unlikely that the client *could*
|
||||||
|
understand a different content type anyways, so at best this response would
|
||||||
|
likely just be treated the same as a ``406 Not Acceptable`` error.
|
||||||
|
|
||||||
|
This PEP **does** require that if the meta version ``latest`` is being used, the
|
||||||
|
server **MUST** respond with the content type for the actual version that is
|
||||||
|
contained in the response
|
||||||
|
(i.e. an ``Accept: application/vnd.pypi.simple.latest+json`` request that returns
|
||||||
|
a v1.x response should have a ``Content-Type`` of
|
||||||
|
``application/vnd.pypi.simple.v1+json``).
|
||||||
|
|
||||||
|
The ``Accept`` header is a comma separated list of content types that the client
|
||||||
|
understands and is able to process. It supports three different formats for each
|
||||||
|
content type that is being requested:
|
||||||
|
|
||||||
|
- ``$type/$subtype``
|
||||||
|
- ``$type/*``
|
||||||
|
- ``*/*``
|
||||||
|
|
||||||
|
For the use of selecting a version+format, the most useful of these is
|
||||||
|
``$type/$subtype``, as that is the only way to actually specify the version
|
||||||
|
and format you want.
|
||||||
|
|
||||||
|
The order of the content types listed in the ``Accept`` header does not have any
|
||||||
|
specific meaning, and the server **SHOULD** consider all of them to be equally
|
||||||
|
valid to respond with. If a client wishes to specify that they prefer a specific
|
||||||
|
content type over another, they may use the ``Accept`` header's
|
||||||
|
`quality value <https://developer.mozilla.org/en-US/docs/Glossary/Quality_values>`_
|
||||||
|
syntax.
|
||||||
|
|
||||||
|
This allows a client to specify a priority for a specific entry in their
|
||||||
|
``Accept`` header, by appending a ``;q=`` followed by a value between ``0`` and
|
||||||
|
``1`` inclusive, with up to 3 decimal digits. When interpreting this value,
|
||||||
|
an entry with a higher quality has priority over an entry with a lower quality,
|
||||||
|
and any entry without a quality present will default to a quality of ``1``.
|
||||||
|
|
||||||
|
However, clients should keep in mind that a server is free to select **any** of
|
||||||
|
the content types they've asked for, regardless of their requested priority, and
|
||||||
|
it may even return a content type that they did **not** ask for.
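
For the server side of this flow, a simplified sketch of parsing an ``Accept`` header and selecting a content type might look like the following. It is one possible policy rather than a required one: the names ``parse_accept``, ``select_content_type``, ``SUPPORTED``, and ``DEFAULT`` are hypothetical, only the ``*/*`` wildcard is handled, and entries with a quality of ``0`` are treated as not acceptable.

```python
SUPPORTED = (
    "application/vnd.pypi.simple.v1+json",
    "application/vnd.pypi.simple.v1+html",
)
DEFAULT = "application/vnd.pypi.simple.v1+html"


def parse_accept(header):
    """Return [(media_type, quality)] sorted by descending quality."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [piece.strip() for piece in part.split(";")]
        quality = 1.0  # entries without a quality default to 1
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        entries.append((media_type.lower(), quality))
    # sorted() is stable, so entries of equal quality keep header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def select_content_type(accept_header):
    """Return the content type to respond with, or None for a 406."""
    if not accept_header:
        return DEFAULT  # no Accept header: fall back to the default
    for media_type, quality in parse_accept(accept_header):
        if quality <= 0:
            continue  # q=0 marks a type as not acceptable
        if media_type == "text/html":
            media_type = "application/vnd.pypi.simple.v1+html"  # alias
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return DEFAULT
    return None
```

A server using this sketch would answer with ``406 Not Acceptable`` when ``None`` is returned; returning a default content type instead is equally permitted by the flow above.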
|
||||||
|
|
||||||
|
To aid clients in determining the content type of the response that they have
|
||||||
|
received from an API request, this PEP requires that servers always include a
|
||||||
|
``Content-Type`` header indicating the content type of the response. This is
|
||||||
|
technically a backwards incompatible change, however in practice
|
||||||
|
`pip has been enforcing this requirement <https://github.com/pypa/pip/blob/cf3696a81b341925f82f20cb527e656176987565/src/pip/_internal/index/collector.py#L123-L150>`_
|
||||||
|
so the risk of actual breakage is low.
|
||||||
|
|
||||||
|
An example of how a client can operate would look like:
|
||||||
|
|
||||||
|
.. code-block:: python3
|
||||||
|
|
||||||
|
import cgi
|
||||||
|
import requests
|
||||||
|
|
||||||
|
# Construct our list of acceptable content types, we want to prefer
|
||||||
|
# that we get a v1 response serialized using JSON, however we also
|
||||||
|
# can support a v1 response serialized using HTML. For compatibility
|
||||||
|
# we also request text/html, but we prefer it least of all since we
|
||||||
|
# don't know if it's actually a Simple API response, or just some
|
||||||
|
# random HTML page that we've gotten due to a misconfiguration.
|
||||||
|
CONTENT_TYPES = [
|
||||||
|
"application/vnd.pypi.simple.v1+json",
|
||||||
|
"application/vnd.pypi.simple.v1+html",
|
||||||
|
"text/html;q=0", # For legacy compatibility
|
||||||
|
]
|
||||||
|
ACCEPT = ", ".join(CONTENT_TYPES)
|
||||||
|
|
||||||
|
|
||||||
|
# Actually make our request to the API, requesting all of the content
|
||||||
|
# types that we find acceptable, and letting the server select one of
|
||||||
|
# them out of the list.
|
||||||
|
resp = requests.get("https://pypi.org/simple/", headers={"Accept": ACCEPT})
|
||||||
|
|
||||||
|
# If the server does not support any of the content types you requested,
|
||||||
|
# AND it has chosen to return a HTTP 406 error instead of a default
|
||||||
|
# response then this will raise an exception for the 406 error.
|
||||||
|
resp.raise_for_status()
|
||||||
|
|
||||||
|
|
||||||
|
# Determine what kind of response we've gotten to ensure that it is one
|
||||||
|
# that we can support, and if it is, dispatch to a function that will
|
||||||
|
# understand how to interpret that particular version+serialization. If
|
||||||
|
# we don't understand the content type we've gotten, then we'll raise
|
||||||
|
# an exception.
|
||||||
|
content_type, _ = cgi.parse_header(resp.headers.get("content-type", ""))
|
||||||
|
match content_type:
|
||||||
|
case "application/vnd.pypi.simple.v1+json":
|
||||||
|
handle_v1_json(resp)
|
||||||
|
case "application/vnd.pypi.simple.v1+html" | "text/html":
|
||||||
|
handle_v1_html(resp)
|
||||||
|
case _:
|
||||||
|
raise Exception(f"Unknown content type: {content_type}")
|
||||||
|
|
||||||
|
If a client wishes to only support HTML or only support JSON, then they would
|
||||||
|
just remove the content types that they do not want from the ``Accept`` header,
|
||||||
|
and turn receiving them into an error.
|
||||||
|
|
||||||
|
|
||||||
|
Alternative Negotiation Mechanisms
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
While using HTTP's Content negotiation is considered the standard way for a client
|
||||||
|
and server to coordinate to ensure that the client is getting an HTTP response that
|
||||||
|
it is able to understand, there are situations where that mechanism may not be
|
||||||
|
sufficient. For those cases this PEP has alternative negotiation mechanisms that
|
||||||
|
may *optionally* be used instead.
|
||||||
|
|
||||||
|
|
||||||
|
URL Parameter
|
||||||
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Servers that implement the Simple API may choose to support a URL parameter named
|
||||||
|
``format`` to allow clients to request a specific version+format of the response.
|
||||||
|
|
||||||
|
The value of the ``format`` parameter should be **one** of the valid content types.
|
||||||
|
Passing multiple content types, wild cards, quality values, etc is **not** supported.
|
||||||
|
|
||||||
|
Supporting this parameter is optional, and clients **SHOULD NOT** rely on it for
|
||||||
|
interacting with the API. This negotiation mechanism is intended to allow for easier
|
||||||
|
human based exploration of the API within a browser, or to allow documentation or
|
||||||
|
notes to link to a specific version+format.
|
||||||
|
|
||||||
|
Servers that do not support this parameter may choose to return an error when it is
|
||||||
|
present, or they may simply ignore its presence.
|
||||||
|
|
||||||
|
When a server does implement this parameter, it **SHOULD** take precedence over any
|
||||||
|
values in the client's ``Accept`` header, and if the server does not support the
|
||||||
|
requested format, it may choose to fall back to the ``Accept`` header, or choose any
|
||||||
|
of the error conditions that standard server-driven content negotiation typically
|
||||||
|
has (e.g. ``406 Not Acceptable``, ``300 Multiple Choices``, or selecting a default
|
||||||
|
type to return).
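
One way a server might give the ``format`` parameter precedence over the ``Accept`` header is sketched below. The helper names and the ``example.com`` URL are hypothetical, and falling back to the ``Accept``-based choice for an unsupported or repeated ``format`` value is just one of the permitted behaviours. Note that a literal ``+`` in a query string decodes to a space, so links should percent-encode it as ``%2B``, which ``urlencode`` does.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SUPPORTED = (
    "application/vnd.pypi.simple.v1+json",
    "application/vnd.pypi.simple.v1+html",
)


def url_with_format(base_url, content_type):
    """Build a link that asks for exactly one content type."""
    # urlencode() percent-encodes "/" and "+" so the value survives decoding.
    return f"{base_url}?{urlencode({'format': content_type})}"


def select_for_request(url, accept_choice):
    """Pick the content type for a request.

    ``accept_choice`` is whatever Accept-based negotiation selected
    (possibly None). A single supported ``format`` value wins over it.
    """
    values = parse_qs(urlsplit(url).query).get("format", [])
    if len(values) == 1 and values[0] in SUPPORTED:
        return values[0]
    # Unsupported, repeated, or missing: fall back to the Accept header.
    return accept_choice


link = url_with_format(
    "https://example.com/simple/", "application/vnd.pypi.simple.v1+json"
)
```

Such a link is convenient for browsing or documentation, but clients should still rely on the ``Accept`` header for normal operation.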
|
||||||
|
|
||||||
|
|
||||||
|
Endpoint Configuration
^^^^^^^^^^^^^^^^^^^^^^

This option technically is not a special option at all, it is just a natural
consequence of using content negotiation and allowing servers to select which of the
available content types is their default.

If a server is unwilling or unable to implement the server-driven content negotiation,
and would rather require users to explicitly configure their client to select
the version they want, then that is a supported configuration.

To enable this, a server should provide a separate endpoint (for instance,
``/simple/v1+html/`` and/or ``/simple/v1+json/``) for each version+format that it
wishes to support. Under each endpoint, it can host a copy of its repository that
only supports one (or a subset) of the content types. When a client makes a request
using the ``Accept`` header, the server can ignore it and return the content type
that corresponds to that endpoint.

Clients that wish to require specific configuration can keep track of
which version+format a specific repository URL was configured for, and when making
a request to that server, emit an ``Accept`` header that *only* includes the correct
content type.

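
A minimal client-side sketch of this bookkeeping follows. The
``CONFIGURED_FORMATS`` mapping, the ``headers_for`` helper, and the
``example.invalid`` URLs are hypothetical and only illustrate the idea of pinning
one content type per configured repository URL.

.. code-block:: python

    # Hypothetical user configuration: repository URL -> the one content type
    # that the user said this endpoint serves.
    CONFIGURED_FORMATS = {
        "https://example.invalid/simple/v1+json/": "application/vnd.pypi.simple.v1+json",
        "https://example.invalid/simple/v1+html/": "application/vnd.pypi.simple.v1+html",
    }


    def headers_for(repository_url):
        """Build request headers that name only the configured content type."""
        return {"Accept": CONFIGURED_FORMATS[repository_url]}
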
TUF Support - PEP 458
---------------------

:pep:`458` requires that all API responses are hashable and that they can be uniquely
identified by a path relative to the repository root. For a Simple API repository, the
target path is the root of our API (e.g. ``/simple/`` on PyPI). This creates
challenges when accessing the API using a TUF client instead of directly using a
standard HTTP client, as the TUF client cannot handle the fact that a target could
have multiple different representations that all hash differently.

:pep:`458` does not specify what the target path should be for the Simple API, but I
believe that TUF requires that the target paths be "file-like", in other words, a
path like ``simple/PROJECT/`` is not acceptable, because it technically points to a
directory.

The saving grace is that the target path does not *have* to actually match the URL
being fetched from the Simple API, and it can just be a sigil that the fetching code
knows how to transform into the actual URL that needs to be fetched. This same thing
can hold true for other aspects of the actual HTTP request, such as the ``Accept``
header.

Ultimately figuring out how to map a directory to a filename is out of scope for this
PEP (but it would be in scope for :pep:`458`), and this PEP defers making a decision
about how exactly to represent this inside of :pep:`458` metadata.

However, it appears that the current WIP branch against pip that attempts to implement
:pep:`458` is using a target path like ``simple/PROJECT/index.html``. This could be
modified to include the API version and serialization format using something like
``simple/PROJECT/vnd.pypi.simple.vN.FORMAT``. So the v1 HTML format would be
``simple/PROJECT/vnd.pypi.simple.v1.html`` and the v1 JSON format would be
``simple/PROJECT/vnd.pypi.simple.v1.json``.

In this case, since ``text/html`` is an alias for ``application/vnd.pypi.simple.v1+html``
when interacting through TUF, it will likely make the most sense to normalize to the
more explicit name.

Likewise the ``latest`` metaversion should not be included in the targets; only
explicitly declared versions should be supported.

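
The naming scheme suggested above could be sketched as follows. This is only an
illustration of one possible convention: the ``target_path`` helper is
hypothetical, and the actual representation is left to :pep:`458`.

.. code-block:: python

    import re

    # Matches only explicitly versioned content types, so ``latest`` is rejected.
    _CONTENT_TYPE = re.compile(r"application/vnd\.pypi\.simple\.v(\d+)\+(html|json)")


    def target_path(project, content_type):
        """Map a project and content type to an illustrative TUF target path."""
        if content_type == "text/html":
            # Normalize the legacy alias to the more explicit name.
            content_type = "application/vnd.pypi.simple.v1+html"
        match = _CONTENT_TYPE.fullmatch(content_type)
        if match is None:
            raise ValueError(f"no target for content type {content_type!r}")
        version, fmt = match.groups()
        return f"simple/{project}/vnd.pypi.simple.v{version}.{fmt}"
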
Recommendations
===============

This section is non-normative, and represents what the PEP authors believe to be
the best default implementation decisions for something implementing this PEP, but
it does **not** represent any sort of requirement to match these decisions.

These decisions have been chosen to maximize the number of requests that can be
moved onto the newest version of an API, while maintaining the greatest amount
of compatibility. In addition, they try to provide guardrails that push clients
into making the best choices they can.

It is recommended that servers:

- Support all 3 content types described in this PEP, using server-driven
  content negotiation, for as long as they reasonably can, or at least as
  long as they're receiving non-trivial traffic that uses the HTML responses.

- When encountering an ``Accept`` header that does not contain any content types
  that it knows how to work with, the server should never return a
  ``300 Multiple Choices`` response, and should prefer to return a
  ``406 Not Acceptable`` response.

  - However, if choosing to use the endpoint configuration, you should prefer to
    return a ``200 OK`` response in the expected content type for that endpoint.

- When selecting an acceptable version, the server should choose the highest
  version that the client supports, with the most expressive/featureful
  serialization format, taking into account the specificity of the client requests
  as well as any quality priority values they have expressed, and it should only
  use the ``text/html`` content type as a last resort.

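
One way a server could apply these recommendations is sketched below. It is a
deliberately simplified selection routine, not a complete ``Accept`` parser: it
understands exact media types, ``*/*``, and ``q`` values, and ignores partial
wildcards such as ``application/*``. The ``PREFERENCE`` order and both function
names are assumptions of this example.

.. code-block:: python

    # Assumed server preference, most preferred first; text/html is last.
    PREFERENCE = [
        "application/vnd.pypi.simple.v1+json",
        "application/vnd.pypi.simple.v1+html",
        "text/html",
    ]


    def parse_accept(header):
        """Return {media_range: q} for a simple ``Accept`` header value."""
        ranges = {}
        for item in header.split(","):
            media_range, *params = [part.strip() for part in item.split(";")]
            if not media_range:
                continue
            q = 1.0
            for param in params:
                name, _, value = param.partition("=")
                if name.strip().lower() == "q":
                    try:
                        q = float(value)
                    except ValueError:
                        q = 0.0  # treat a malformed weight as "not acceptable"
            ranges[media_range.lower()] = q
        return ranges


    def select_content_type(header):
        """Pick a response content type, or None for ``406 Not Acceptable``."""
        ranges = parse_accept(header)
        best, best_q = None, 0.0
        for content_type in PREFERENCE:
            # An exact entry is more specific than ``*/*`` and wins over it.
            q = ranges.get(content_type, ranges.get("*/*", 0.0))
            if q > best_q:  # strict, so ties keep the more preferred type
                best, best_q = content_type, q
        return best

Because the comparison is strict and starts from zero, a type sent with ``q=0``
is never selected, and ``text/html`` is only chosen when the client weights it
above every other type or offers nothing else.
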
It is recommended that clients:

- Support all 3 content types described in this PEP, using server-driven
  content negotiation, for as long as they reasonably can.

- When constructing an ``Accept`` header, include all of the content types
  that you support.

  You should generally *not* include a quality priority value for your content
  types, unless you have implementation specific reasons that you want the
  server to take into account (for example, if you're using the stdlib html
  parser and you're worried that there may be some kinds of HTML responses that
  you're unable to parse in some edge cases).

  The one exception to this recommendation is that you *should* include a
  ``;q=0`` value on the legacy ``text/html`` content type, unless it is the
  only content type that you are requesting.

- Explicitly select what versions they are looking for, rather than using the
  ``latest`` meta version during normal operation.

- Check the ``Content-Type`` of the response and ensure it matches something
  that you were expecting.

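
The client recommendations might translate into code along these lines. The
helper names and the ``SUPPORTED`` tuple are hypothetical, and the sketch only
covers header construction and the ``Content-Type`` check, not the request itself.

.. code-block:: python

    # Explicitly versioned content types this hypothetical client can parse.
    SUPPORTED = (
        "application/vnd.pypi.simple.v1+json",
        "application/vnd.pypi.simple.v1+html",
    )
    LEGACY = "text/html"


    def build_accept(content_types=SUPPORTED, legacy=True):
        """Build an ``Accept`` header value listing every supported type."""
        parts = list(content_types)
        if legacy:
            # ``;q=0`` is added unless text/html is the only type requested.
            parts.append(f"{LEGACY};q=0" if parts else LEGACY)
        return ", ".join(parts)


    def check_content_type(header_value, expected):
        """Return the bare media type, raising if it was not expected."""
        media_type = header_value.split(";", 1)[0].strip().lower()
        if media_type not in expected:
            raise ValueError(f"unexpected Content-Type: {header_value!r}")
        return media_type

Note that under standard HTTP semantics a weight of 0 marks a media type as not
acceptable, so a server that honours the header strictly will not answer this
request with ``text/html``.
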
FAQ
===

Does this mean PyPI is planning to drop support for HTML/PEP 503?
-----------------------------------------------------------------

No, PyPI has no plans at this time to drop support for :pep:`503` or HTML
responses.

While this PEP does give repositories the flexibility to do that, that largely
exists to ensure that mechanisms like Endpoint Configuration are
able to work, and to ensure that clients do not make any assumptions that would
prevent, at some point in the future, gracefully dropping support for HTML.

The existing HTML responses incur almost no maintenance burden on PyPI and
there is no pressing need to remove them. The only real benefit to dropping them
would be to reduce the number of items cached in our CDN.

If in the future PyPI *does* wish to drop support for them, doing so would
almost certainly be the topic of a PEP, or at a minimum a public, open discussion,
and would be informed by metrics showing any impact to end users.

Why JSON instead of X format?
-----------------------------

[...] using separate API routes a less desirable solution than relying on content
negotiation to select the most ideal representation of the data.

Does this mean that static servers are no longer supported?
-----------------------------------------------------------

In short, no, static servers are still (almost) fully supported by this PEP.

The specifics of how they are supported will depend on the static server in
question. For example:

- **S3:** S3 fully supports custom content types, however it does not support
  any form of content negotiation. In order to have a server hosted on S3, you
  would have to use the "Endpoint configuration" style of negotiation, and
  users would have to configure their clients explicitly.
- **GitHub Pages:** GitHub Pages does not support custom content types, so the
  S3 solution is not currently workable, which means that only ``text/html``
  repositories would function.
- **Apache:** Apache fully supports server-driven content negotiation, and would
  just need to be configured to map the custom content types to specific extensions.

Doesn't TUF support require having different URLs for each representation?
--------------------------------------------------------------------------

While in TUF each target can only have a single representation, and by default
that is assumed to map exactly to the target path that is being referenced
within TUF, there is actually no requirement that the target path is the same
as the server path, and nothing prevents the same data from being represented by
multiple targets.

In fact, TUF doesn't support the Simple API URLs as they are already, because
TUF assumes that a target points to a filename, but all of the Simple API URLs
are directories. Thus regardless of this PEP, there is going to have to be
something that translates between the naming of the targets within the TUF
metadata, and the actual requests being made to the server.

Currently the WIP TUF implementation for pip maps a target like
``simple/PROJECT/index.html`` to an HTTP request to fetch ``/simple/PROJECT/``.
However there is no reason that it could not be extended to map a target
like ``simple/PROJECT/vnd.pypi.simple.v1.html`` to an HTTP request to
fetch ``/simple/PROJECT/`` with an ``Accept`` header of
``application/vnd.pypi.simple.v1+html``.

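
The translation described in this answer could look roughly like the sketch
below. The ``request_for`` helper, the target naming pattern, and the
``example.invalid`` index root are assumptions for illustration; they do not
describe the WIP pip implementation.

.. code-block:: python

    import re
    from urllib.parse import urljoin

    # Hypothetical target naming: simple/PROJECT/vnd.pypi.simple.vN.FORMAT
    _TARGET = re.compile(
        r"simple/(?P<project>[^/]+)/"
        r"vnd\.pypi\.simple\.v(?P<version>\d+)\.(?P<format>html|json)"
    )


    def request_for(target, index_root="https://example.invalid/"):
        """Translate a TUF target path into a URL and request headers."""
        match = _TARGET.fullmatch(target)
        if match is None:
            raise ValueError(f"unrecognised target path: {target!r}")
        # The directory-style URL is fetched; the file-like name is only a label.
        url = urljoin(index_root, f"simple/{match['project']}/")
        accept = f"application/vnd.pypi.simple.v{match['version']}+{match['format']}"
        return url, {"Accept": accept}
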
Why not add an ``application/json`` alias like ``text/html``?
-------------------------------------------------------------

This PEP believes that it is best for both clients and servers to be explicit
about the types of the API responses that are being used, and a content type
like ``application/json`` is the exact opposite of explicit.

The ``text/html`` alias exists as a compromise, primarily to
ensure that existing consumers of the API continue to function as they already
do. There is no such expectation of existing clients using the Simple API with
an ``application/json`` content type.

In addition, ``application/json`` has no versioning in it, which means that
if there is ever a ``2.0`` version of the Simple API, we will be forced to make
a decision. Should ``application/json`` preserve backwards compatibility and
continue to be an alias for ``application/vnd.pypi.simple.v1+json``, or should
it be updated to be an alias for ``application/vnd.pypi.simple.v2+json``?

This problem doesn't exist for ``text/html``, because the assumption is that
HTML will remain a legacy format, and will likely not gain *any* new features,
much less features that require breaking compatibility. So having it be an
alias for ``application/vnd.pypi.simple.v1+html`` is effectively the same as
having it be an alias for ``application/vnd.pypi.simple.latest+html``, since
``1.0`` will likely be the only HTML version to exist.

The largest benefit to adding the ``application/json`` content type is that
some hosting services do not allow custom content types, and require
you to select one of their preset content types. The main example of this is
GitHub Pages: the lack of ``application/json`` support in this PEP means
that static JSON repositories cannot be hosted on GitHub Pages
unless GitHub adds the ``application/vnd.pypi.simple.v1+json`` content type.

This PEP believes that the benefits are not large enough to add that content
type alias at this time, and that its inclusion would likely be a footgun
waiting for unsuspecting people to accidentally pick it up, especially given
that we can always add it in the future, whereas removing things is a lot harder
to do.

Appendix 1: Survey of use cases to cover
========================================

This was done through a discussion between ``pip``, ``PyPI``, and ``bandersnatch``
maintainers, who are the three first potential users for the new API. This is
how they use the Simple + JSON APIs today: