PEP: 753 Title: Uniform project URLs in core metadata Author: William Woodruff , Facundo Tuesca Sponsor: Barry Warsaw PEP-Delegate: Paul Moore Discussions-To: https://discuss.python.org/t/pep-753-uniform-urls-in-core-metadata/62792 Status: Accepted Type: Standards Track Topic: Packaging Created: 29-Aug-2024 Post-History: `26-Aug-2024 `__, `03-Sep-2024 `__ Resolution: `10-Oct-2024 `__ .. canonical-pypa-spec:: :ref:`packaging:well-known-project-urls` Abstract ======== This PEP recommends two discrete changes to the processing of core metadata by indices (such as PyPI) and other core metadata consumers: * Deprecation of the ``Home-page`` and ``Download-URL`` fields in favor of their ``Project-URL`` equivalents; * A set of :ref:`conventions ` for normalizing and assigning semantics to ``Project-URL`` labels during consumer-side metadata processing. Rationale and Motivation ======================== Python's standard :ref:`core metadata ` has gone through many years of revisions, with various standardized milestone versions. These revisions of the core metadata have introduced various mechanisms for expressing a package's relationship to external resources, via URLs: 1. Metadata 1.0 introduced ``Home-page``, a single-use field containing a URL to the distribution's home page. .. code-block:: email Home-page: https://example.com/sampleproject 2. Metadata 1.1 introduced ``Download-URL``, a complementary single-use field containing a URL suitable for downloading the current distribution. .. code-block:: email Download-URL: https://example.com/sampleproject/sampleproject-1.2.3.tar.gz 3. Metadata 1.2 introduced ``Project-URL``, a **multiple-use** field containing a label-and-URL pair. Each label is free text conveying the URL's semantics. .. code-block:: email Project-URL: Homepage, https://example.com/sampleproject Project-URL: Download, https://example.com/sampleproject/sampleproject-1.2.3.tar.gz Project-URL: Documentation, https://example.com/sampleproject/docs Metadata 2.1, 2.2, and 2.3 leave the behavior of these fields as originally specified. Because ``Project-URL`` allows free text labels and is multiple-use, informal conventions have arisen for representing the values of ``Home-page`` and ``Download-URL`` within ``Project-URL`` instead. These conventions have seen significant adoption, with :pep:`621` explicitly choosing to provide only a ``project.urls`` table rather than a ``project.home-page`` field. From :pep:`621`'s rejected ideas: .. pull-quote:: While the core metadata supports it, having a single field for a project's URL while also supporting a full table seemed redundant and confusing. This PEP exists to formalize the informal conventions that have arisen, as well as explicitly document ``Home-page`` and ``Download-URL`` as deprecated in favor of equivalent ``Project-URL`` representations. Specification ============= This PEP proposes that ``Home-page`` and ``Download-URL`` be considered **deprecated**. This deprecation has implications for both package metadata producers (e.g. build backends and packaging tools) and package indices (e.g. PyPI). .. _metadata-producers: Metadata producers ------------------ This PEP stipulates the following for metadata producers: * When generating metadata 1.2 or later, producers **SHOULD** emit only ``Project-URL``, and **SHOULD NOT** emit ``Home-page`` or ``Download-URL`` fields. These stipulations do not change the optionality of URL fields in core metadata. In other words, producers **MAY** choose to omit ``Project-URL`` entirely at their discretion. This PEP does **not** propose the outright removal of support for ``Home-page`` or ``Download-URL``. However, see :ref:`future-considerations` for thoughts on how a new (as of yet unspecified) major core metadata version could complete the deprecation cycle via removal of these deprecated fields. Similarly, this PEP does **not** propose that metadata producers emit :ref:`normalized labels `. Label normalization is performed solely on the processing and consumption side (i.e. within indices and other consumers of distribution metadata). .. _package-indices: Package indices --------------- This PEP stipulates the following for package indices: * When interpreting a distribution's metadata of version 1.2 or later (e.g. for rendering on a web page), the index **MUST** prefer ``Project-URL`` fields as a source of URLs over ``Home-page`` and ``Download-URL``, even if the latter are explicitly provided. * If a distribution's metadata contains only the ``Home-page`` and ``Download-URL`` fields, the index **MAY** choose to ignore those fields and behave as though no URLs were provided in the metadata. In this case, the index **SHOULD** present an appropriate warning or notification to the uploading party. * The mechanism for presenting this warning or notification is not specified, since it will vary by index. By way of example, an index may choose to present a warning in the HTTP response to an upload request, or send an email or other notification to the maintainer(s) of the project. * If a distribution's metadata contains both sets of fields, the index **MAY** choose to reject the distribution outright. However, this is **NOT RECOMMENDED** until a future unspecified major metadata version formally removes support for ``Home-page`` and ``Download-URL``. * Any changes to the interpretation of metadata of version 1.2 or later that result in previously recognized URLs no longer being recognized **SHOULD NOT** be retroactively applied to previously uploaded packages. These stipulations do not change the optionality of URL processing by indices. In other words, an index that does not process URLs within uploaded distributions may continue to ignore all URL fields entirely. Similarly, these stipulations do **not** imply that the index should persist any normalizations that it performs to a distribution's metadata. In other words, this PEP stipulates label normalization for the purpose of user presentation, not for the purpose of modifying an uploaded distribution or its "sidecar" :pep:`658` metadata file. .. _conventions-for-labels: Conventions for ``Project-URL`` labels ====================================== The deprecations proposed above require a formalization of the currently informal relationship between ``Home-page``, ``Download-URL``, and their ``Project-URL`` equivalents. This formalization has two parts: 1. A set of rules for normalizing ``Project-URL`` labels during index-side processing; 2. A set of "well-known" normalized label values that indices may specialize URL presentation for. .. _label-normalization: Label normalization ------------------- The core metadata specification stipulates that ``Project-URL`` labels are free text, limited to 32 characters. This PEP proposes adding the concept of a "normalized" label to the core metadata specification. Label normalization is defined via the following Python function: .. code-block:: python import string def normalize_label(label: str) -> str: chars_to_remove = string.punctuation + string.whitespace removal_map = str.maketrans("", "", chars_to_remove) return label.translate(removal_map).lower() In plain language: a label is *normalized* by deleting all ASCII punctuation and whitespace, and then converting the result to lowercase. The following table shows examples of labels before (raw) and after normalization: .. csv-table:: :header: "Raw", "Normalized" "``Homepage``", "``homepage``" "``Home-page``", "``homepage``" "``Home page``", "``homepage``" "``Change_Log``", "``changelog``" "``What's New?``", "``whatsnew``" When processing distribution metadata, package indices **SHOULD** perform label normalization to determine if a given label is :ref:`well known ` for subsequent special processing. Labels that are not well-known **MUST** be processed in their un-normalized form. Normalization does **not** change pre-existing semantics around duplicated ``Project-URL`` labels. In other words, normalization may result in duplicate labels in the project's metadata, but only in the same manner that was already permitted (since the :ref:`core metadata specification ` does not stipulate that labels are unique). Excerpted examples of normalized metadata fields are provided in :ref:`Appendix A `. .. _well-known: Well-known labels ----------------- In addition to the normalization rules above, this PEP proposes a fixed (but extensible) set of "well-known" ``Project-URL`` labels, as well as aliases and human-readable equivalents. The following table lists these labels, in normalized form: .. list-table:: :header-rows: 1 * - Label (Human-readable equivalent) - Description - Aliases * - ``homepage`` (Homepage) - The project's home page - *(none)* * - ``source`` (Source Code) - The project's hosted source code or repository - ``repository``, ``sourcecode``, ``github`` * - ``download`` (Download) - A download URL for the current distribution, equivalent to ``Download-URL`` - *(none)* * - ``changelog`` (Changelog) - The project's comprehensive changelog - ``changes``, ``whatsnew``, ``history`` * - ``releasenotes`` (Release Notes) - The project's curated release notes - *(none)* * - ``documentation`` (Documentation) - The project's online documentation - ``docs`` * - ``issues`` (Issue Tracker) - The project's bug tracker - ``bugs``, ``issue``, ``tracker``, ``issuetracker``, ``bugtracker`` * - ``funding`` (Funding) - Funding Information - ``sponsor``, ``donate``, ``donation`` Indices **MAY** choose to use the human-readable equivalents suggested above in their UI elements, if appropriate. Alternatively, indices **MAY** choose their own appropriate human-readable equivalents for UI elements. Packagers and metadata producers **MAY** choose to use any label that normalizes to these labels (or their aliases) to communicate specific URL intents to package indices and downstreams. Similarly, indices **MAY** choose to specialize their rendering or presentation of URLs with these labels, e.g. by presenting an appropriate icon or tooltip for each label. Indices **MAY** also specialize the rendering or presentation of additional labels or URLs, including (but not limited to), labels that start with a well-known label, and URLs that refer to a known service provider domain (e.g. for documentation hosting or issue tracking). This PEP recognizes that the list of well-known labels is unlikely to remain static, and that subsequent additions to it should not require the overhead associated with a formal PEP process or new metadata version. As such, this PEP proposes that the list above become a "living" list within the :ref:`PyPA specifications `. Backwards Compatibility ======================= Limited Impact -------------- This PEP is expected to have little to no impact on existing packaging tooling or package indices: * Packaging tooling: no changes to the correctness or well-formedness of the core metadata. This PEP proposes deprecations as well as behavioral refinements, but all currently (and historically) produced metadata will continue to be valid per the rules of its respective version. * Package indices: indices will continue to expect well-formed core metadata, with no behavioral changes. Indices **MAY** choose to emit warnings or notifications on the presence of now-deprecated fields, :ref:`per above `. .. _future-considerations: Future Considerations ===================== This PEP does not stipulate or require any future metadata changes. However, per :ref:`metadata-producers` and :ref:`conventions-for-labels`, we identify the following potential future goals for a new major release of the core metadata standards: * Outright removal of support for ``Home-page`` and ``Download-URL`` in the next major core metadata version. If removed, package indices and consumers **MUST** reject metadata containing these fields when said metadata is of the new major version. * Enforcement of label normalization. If enforced, package producers **MUST** emit only normalized ``Project-URL`` labels when generating distribution metadata, and package indices and consumers **MUST** reject distributions containing non-normalized labels. Note: requiring normalization merely restricts labels to lowercase text, and excludes whitespace and punctuation. It does NOT restrict project URLs solely to the use of "well-known" labels. These potential changes would be backwards incompatible, hence their inclusion only in this section. Acceptance of this PEP does NOT commit any future metadata revision to actually making these changes. Security Implications ===================== This PEP does not identify any positive or negative security implications associated with deprecating ``Home-page`` and ``Download-URL`` or with label normalization. How To Teach This ================= The changes in this PEP should be transparent to the majority of the packaging ecosystem's userbase; the primary beneficiaries of this PEP's changes are packaging tooling authors and index maintainers, who will be able to reduce the number of unique URL fields produced and checked. A small number of package maintainers may observe new warnings or notifications from their index of choice, should the index choose to ignore ``Home-page`` and ``Download-URL`` as suggested. Similarly, a small number of package maintainers may observe that their index of choice no longer renders their URLs, if only present in the deprecated fields. However, no package maintainers should observe rejected package uploads or other breaking changes to packaging workflows due to this PEP's proposed changes. Anybody who observes warnings or changes to the presentation of URLs on indices can be taught about this PEP's behavior via official packaging resources, such as the :ref:`Python Packaging User Guide ` and `PyPI's user documentation `__, the latter of which already contains an informal description of PyPI's URL handling behavior. If this PEP is accepted, the authors of this PEP will coordinate to update and cross-link the resources mentioned above. .. _appendix-a: Appendix A: Label normalization examples ======================================== This appendix provides an illustrative excerpt of a distribution's metadata, before and after index-side processing: Before: .. code-block:: email Project-URL: Home-page, https://example.com Project-URL: Homepage, https://another.example.com Project-URL: Source, https://github.com/example/example Project-URL: GitHub, https://github.com/example/example Project-URL: Another Service, https://custom.example.com After: .. code-block:: email Project-URL: homepage, https://example.com Project-URL: homepage, https://another.example.com Project-URL: source, https://github.com/example/example Project-URL: github, https://github.com/example/example Project-URL: Another Service, https://custom.example.com In particular, observe: * Normalized duplicates are preserved (both ``Home-page`` and ``Homepage`` become ``homepage``); * ``Source`` and ``GitHub`` are both normalized into their respective forms, but ``github`` is **not** transformed into ``source``. * ``Another Service`` is **not** normalized, since its normal form (``anotherservice``) is not a :ref:`well-known label `. Copyright ========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.