From ce38a96f432f388e8a990c44d2497145acd84246 Mon Sep 17 00:00:00 2001 From: William Woodruff Date: Tue, 10 Sep 2024 15:07:31 -0400 Subject: [PATCH] PEP 753: Uniform URLs in core metadata (#3936) Signed-off-by: William Woodruff Co-authored-by: Alyssa Coghlan Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> --- .github/CODEOWNERS | 1 + peps/pep-0753.rst | 327 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 328 insertions(+) create mode 100644 peps/pep-0753.rst diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 89aca1b8b..f3cd9391a 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -632,6 +632,7 @@ peps/pep-0749.rst @JelleZijlstra peps/pep-0750.rst @gvanrossum @lysnikolaou peps/pep-0751.rst @brettcannon peps/pep-0752.rst @warsaw +peps/pep-0753.rst @warsaw # ... # peps/pep-0754.rst # ... diff --git a/peps/pep-0753.rst b/peps/pep-0753.rst new file mode 100644 index 000000000..0034ac5b4 --- /dev/null +++ b/peps/pep-0753.rst @@ -0,0 +1,327 @@ +PEP: 753 +Title: Uniform project URLs in core metadata +Author: William Woodruff , + Facundo Tuesca +Sponsor: Barry Warsaw +PEP-Delegate: Paul Moore +Discussions-To: https://discuss.python.org/t/pep-753-uniform-urls-in-core-metadata/62792 +Status: Draft +Type: Standards Track +Topic: Packaging +Created: 29-Aug-2024 +Post-History: `26-Aug-2024 `__, + `03-Sep-2024 `__ + +Abstract +======== + +This PEP recommends two discrete changes to handling of core metadata by +indices, such as PyPI: + +* Deprecation of the ``Home-page`` and ``Download-URL`` fields in favor of + their ``Project-URL`` equivalents; +* A set of conventions for normalizing and assigning semantics to + ``Project-URL`` labels. + +Rationale and Motivation +======================== + +Python's standard :ref:`core metadata ` has gone +through many years of revisions, with various standardized milestone versions. + +These revisions of the core metadata have introduced various mechanisms +for expressing a package's relationship to external resources, via URLs: + +1. Metadata 1.0 introduced ``Home-page``, a single-use field containing + a URL to the distribution's home page. + + .. code-block:: + + Home-page: https://example.com/sampleproject + +2. Metadata 1.1 introduced ``Download-URL``, a complementary single-use field + containing a URL suitable for downloading the current distribution. + + .. code-block:: + + Download-URL: https://example.com/sampleproject/sampleproject-1.2.3.tar.gz + +3. Metadata 1.2 introduced ``Project-URL``, a **multiple-use** field containing + a label-and-URL pair. Each label is free text conveying the URL's semantics. + + .. code-block:: + + Project-URL: Homepage, https://example.com/sampleproject + Project-URL: Download, https://example.com/sampleproject/sampleproject-1.2.3.tar.gz + Project-URL: Documentation, https://example.com/sampleproject/docs + +Metadata 2.1, 2.2, and 2.3 leave the behavior of these fields as originally +specified. + +Because ``Project-URL`` allows free text labels and is multiple-use, informal +conventions have arisen for representing the values of +``Home-page`` and ``Download-URL`` within ``Project-URL`` instead. + +These conventions have seen significant adoption, with :pep:`621` explicitly +choosing to provide only a ``project.urls`` table rather than a +``project.home-page`` field. From :pep:`621`'s rejected ideas: + +.. pull-quote:: + + While the core metadata supports it, having a single field for a project's + URL while also supporting a full table seemed redundant and confusing. + +This PEP exists to formalize the informal conventions that have arisen, as well +as explicitly document ``Home-page`` and ``Download-URL`` as deprecated in +favor of equivalent ``Project-URL`` representations. + +Specification +============= + +This PEP proposes that ``Home-page`` and ``Download-URL`` be considered +**deprecated**. This deprecation has implications for both package metadata +producers (e.g. build backends and packaging tools) and package indices +(e.g. PyPI). + +.. _metadata-producers: + +Metadata producers +------------------ + +This PEP stipulates the following for metadata producers: + +* When generating metadata 1.2 or later, producers **SHOULD** emit only + ``Project-URL``, and **SHOULD NOT** emit ``Home-page`` or ``Download-URL`` + fields. +* When generating ``Project-URL`` equivalents for ``Home-page`` and + ``Download-URL``, producers **SHOULD** use the + :ref:`label conventions ` described below. + +These stipulations do not change the optionality of URL fields in core metadata. +In other words, producers **MAY** choose to omit ``Project-URL`` entirely +at their discretion. + +This PEP does **not** propose the outright removal of support for ``Home-page`` +or ``Download-URL``. However, see :ref:`future-considerations` for +thoughts on how a new (as of yet unspecified) major core metadata version +could complete the deprecation cycle via removal of these deprecated fields. + +.. _package-indices: + +Package indices +--------------- + +This PEP stipulates the following for package indices: + +* When interpreting a distribution's metadata of version 1.2 or later + (e.g. for rendering on a web page), the index **MUST** prefer + ``Project-URL`` fields as a source of URLs over ``Home-page`` and + ``Download-URL``, even if the latter are explicitly provided. + +* If a distribution's metadata contains only the ``Home-page`` and + ``Download-URL`` fields, the index **MAY** choose to ignore those fields + and behave as though no URLs were provided in the metadata. In this case, + the index **SHOULD** present an appropriate warning or notification to + the uploading party. + + * The mechanism for presenting this warning or notification is not + specified, since it will vary by index. By way of example, an index may + choose to present a warning in the HTTP response to an upload request, or + send an email or other notification to the maintainer(s) of the project. + +* If a distribution's metadata contains both sets of fields, the index **MAY** + choose to reject the distribution outright. However, this is + **NOT RECOMMENDED** until a future unspecified major metadata version + formally removes support for ``Home-page`` and ``Download-URL``. + +* Any changes to the interpretation of metadata of version 1.2 or later that + result in previously recognized URLs no longer being recognized + **SHOULD NOT** be retroactively applied to previously uploaded packages. + +These stipulations do not change the optionality of URL processing by indices. +In other words, an index that does not process URLs within uploaded +distributions may continue to ignore all URL fields entirely. + +.. _conventions-for-labels: + +Conventions for ``Project-URL`` labels +====================================== + +The deprecations proposed above require a formalization of the currently +informal relationship between ``Home-page``, ``Download-URL``, and their +``Project-URL`` equivalents. + +This formalization has two parts: + +1. A set of rules for canonicalizing ``Project-URL`` labels; +2. A set of "well-known" canonical label values that indices may specialize + URL presentation for. + +Label canonicalization +---------------------- + +The core metadata specification stipulates that ``Project-URL`` labels are +free text, limited to 32 characters. + +This PEP proposes adding the concept of a "canonicalized" label to the core +metadata specification. Label canonicalization is defined via the following +Python function: + +.. code-block:: python + + import string + def canonicalize_label(label: str) -> str: + chars_to_remove = string.punctuation + string.whitespace + removal_map = str.maketrans("", "", chars_to_remove) + return label.translate(removal_map).lower() + +In plain language: a label is *canonicalized* by deleting all ASCII punctuation and +whitespace, and then converting the result to lowercase. + +The following table shows examples of labels before (raw) and after +canonicalization: + +.. csv-table:: + :header: "Raw", "Canonicalized" + + "``Homepage``", "``homepage``" + "``Home-page``", "``homepage``" + "``Home page``", "``homepage``" + "``Change_Log``", "``changelog``" + "``What's New?``", "``whatsnew``" + +Metadata producers **SHOULD** emit the canonicalized form of a user +specified label, but **MAY** choose to emit the un-canonicalized form so +long as it adheres to the existing 32 character constraint. + +Package indices **SHOULD NOT** use the canonicalized labels belonging to the set +of well-known labels directly as UI elements (instead replacing them with +appropriately capitalized text labels). Labels not belonging to the well-known +set **MAY** be used directly as UI elements. + +Well-known labels +----------------- + +In addition to the canonicalization rules above, this PEP proposes a +fixed (but extensible) set of "well-known" ``Project-URL`` labels, +as well as equivalent aliases. + +The following table lists these labels, in canonical form: + +.. csv-table:: + :header: "Label", "Description", "Aliases" + :widths: 20, 50, 30 + + "``homepage``", "The project's home page", "*(none)*" + "``download``", "A download URL for the current distribution, equivalent to ``Download-URL``", "*(none)*" + "``changelog``", "The project's changelog", "``changes``, ``releasenotes``, ``whatsnew``, ``history``" + "``documentation``", "The project's online documentation", "``docs``" + "``issues``", "The project's bug tracker", "``bugs``, ``issue``, ``bug``, ``tracker``, ``report``" + "``sponsor``", "Sponsoring information", "``funding``, ``donate``, ``donation``" + +Packagers and metadata producers **MAY** choose to use these well-known +labels to communicate specific URL intents to package indices and downstreams. + +Packagers and metadata producers **SHOULD** produce the canonicalized version +of the well-known labels in package metadata. + +Similarly, indices **MAY** choose to specialize their rendering or presentation +of URLs with these labels, e.g. by presenting an appropriate icon or tooltip +for each label. + +Indices **MAY** also specialize the rendering or presentation of additional labels or URLs, +including (but not limited to), labels that start with a well-known label, and URLs that refer +to a known service provider domain (e.g. for documentation hosting or issue tracking). + +This PEP recognizes that the list of well-known labels is unlikely to remain +static, and that subsequent additions to it should not require the overhead +associated with a formal PEP process or new metadata version. As the primary +expected use case for this information is to control the way project URLs are +displayed on the Python Package Index, this PEP proposes that the list above +become a "living" list within PyPI's documentation (at time of writing, the +documentation for influencing PyPI's URL display can be found +`here `__). + +Backwards Compatibility +======================= + +Limited Impact +-------------- + +This PEP is expected to have little to no impact on existing packaging tooling +or package indices: + +* Packaging tooling: no changes to the correctness or well-formedness + of the core metadata. This PEP proposes deprecations as well as behavioral + refinements, but all currently (and historically) produced metadata will + continue to be valid per the rules of its respective version. +* Package indices: indices will continue to expect well-formed core metadata, + with no behavioral changes. Indices **MAY** choose to emit warnings or + notifications on the presence of now-deprecated fields, + :ref:`per above `. + +.. _future-considerations: + +Future Considerations +===================== + +This PEP does not stipulate or require any future metadata changes. + +However, per :ref:`metadata-producers` and :ref:`conventions-for-labels`, +we identify the following potential future goals for a new major release of +the core metadata standards: + +* Outright removal of support for ``Home-page`` and ``Download-URL`` in the + next major core metadata version. If removed, package indices and consumers + **MUST** reject metadata containing these fields when said metadata is of + the new major version. +* Enforcement of label canonicalization. If enforced, package producers + **MUST** emit only canonicalized ``Project-URL`` labels when generating + distribution metadata, and package indices and consumers **MUST** reject + distributions containing non-canonicalized labels. Note: requiring + canonicalization merely restricts labels to lowercase text, and excludes + whitespace and punctuation. It does NOT restrict project URLs solely to + the use of "well-known" labels. + +These potential changes would be backwards incompatible, hence their +inclusion only in this section. Acceptance of this PEP does NOT commit +any future metadata revision to actually making these changes. + +Security Implications +===================== + +This PEP does not identify any positive or negative security implications +associated with deprecating ``Home-page`` and ``Download-URL`` or with +label canonicalization. + +How To Teach This +================= + +The changes in this PEP should be transparent to the majority of the packaging +ecosystem's userbase; the primary beneficiaries of this PEP's changes are +packaging tooling authors and index maintainers, who will be able to reduce the +number of unique URL fields produced and checked. + +A small number of package maintainers may observe new warnings or notifications +from their index of choice, should the index choose to ignore ``Home-page`` +and ``Download-URL`` as suggested. Similarly, a small number of package +maintainers may observe that their index of choice no longer renders +their URLs, if only present in the deprecated fields. However, no package +maintainers should observe rejected package uploads or other breaking +changes to packaging workflows due to this PEP's proposed changes. + +Anybody who observes warnings or changes to the presentation of +URLs on indices can be taught about this PEP's behavior via official +packaging resources, such as the +:ref:`Python Packaging User Guide ` +and `PyPI's user documentation `__, the latter of which +already contains an informal description of PyPI's URL handling behavior. + +If this PEP is accepted, the authors of this PEP will coordinate to update +and cross-link the resources mentioned above. + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive.