PEP: 753
Title: Uniform project URLs in core metadata
Author: William Woodruff <william@yossarian.net>,
        Facundo Tuesca <facundo.tuesca@trailofbits.com>
Sponsor: Barry Warsaw <barry at python.org>
PEP-Delegate: Paul Moore <p.f.moore at gmail.com>
Discussions-To: https://discuss.python.org/t/pep-753-uniform-urls-in-core-metadata/62792
Status: Accepted
Type: Standards Track
Topic: Packaging
Created: 29-Aug-2024
Post-History: `26-Aug-2024 <https://discuss.python.org/t/core-metadata-should-home-page-and-download-url-be-deprecated/62037>`__,
              `03-Sep-2024 <https://discuss.python.org/t/pep-753-uniform-urls-in-core-metadata/62792>`__
Resolution: https://discuss.python.org/t/62792/30

Abstract
========

This PEP recommends two discrete changes to the processing of core metadata by
indices (such as PyPI) and other core metadata consumers:

* Deprecation of the ``Home-page`` and ``Download-URL`` fields in favor of
  their ``Project-URL`` equivalents;
* A set of :ref:`conventions <conventions-for-labels>` for normalizing and
  assigning semantics to ``Project-URL`` labels during consumer-side metadata
  processing.

Rationale and Motivation
========================

Python's standard :ref:`core metadata <packaging:core-metadata>` has gone
through many years of revisions, with various standardized milestone versions.

These revisions of the core metadata have introduced various mechanisms
for expressing a package's relationship to external resources, via URLs:

1. Metadata 1.0 introduced ``Home-page``, a single-use field containing
   a URL to the distribution's home page.

   .. code-block:: email

       Home-page: https://example.com/sampleproject

2. Metadata 1.1 introduced ``Download-URL``, a complementary single-use field
   containing a URL suitable for downloading the current distribution.

   .. code-block:: email

       Download-URL: https://example.com/sampleproject/sampleproject-1.2.3.tar.gz

3. Metadata 1.2 introduced ``Project-URL``, a **multiple-use** field containing
   a label-and-URL pair. Each label is free text conveying the URL's semantics.

   .. code-block:: email

       Project-URL: Homepage, https://example.com/sampleproject
       Project-URL: Download, https://example.com/sampleproject/sampleproject-1.2.3.tar.gz
       Project-URL: Documentation, https://example.com/sampleproject/docs

Metadata 2.1, 2.2, and 2.3 leave the behavior of these fields as originally
specified.

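As a rough, non-normative illustration of the label-and-URL structure, the
sketch below reads every ``Project-URL`` field from a metadata document using
the standard library's ``email`` parser. The ``project_urls`` helper and the
sample metadata are assumptions made for this example and are not part of any
specification.

.. code-block:: python

    from email import message_from_string

    # Illustrative metadata only; the field values mirror the examples above.
    SAMPLE_METADATA = """\
    Metadata-Version: 1.2
    Name: sampleproject
    Version: 1.2.3
    Home-page: https://example.com/sampleproject
    Project-URL: Homepage, https://example.com/sampleproject
    Project-URL: Documentation, https://example.com/sampleproject/docs
    """.replace("    ", "")


    def project_urls(metadata_text: str) -> list[tuple[str, str]]:
        """Return (label, url) pairs from every Project-URL field, in order."""
        message = message_from_string(metadata_text)
        pairs = []
        for value in message.get_all("Project-URL", []):
            # Split on the first comma only: the label precedes it and the
            # URL follows it.
            label, _, url = value.partition(",")
            pairs.append((label.strip(), url.strip()))
        return pairs

Because the field is multiple-use, the helper returns a list rather than a
mapping, so repeated labels are kept in document order.
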
Because ``Project-URL`` allows free text labels and is multiple-use, informal
conventions have arisen for representing the values of
``Home-page`` and ``Download-URL`` within ``Project-URL`` instead.

These conventions have seen significant adoption, with :pep:`621` explicitly
choosing to provide only a ``project.urls`` table rather than a
``project.home-page`` field. From :pep:`621`'s rejected ideas:

.. pull-quote::

    While the core metadata supports it, having a single field for a project's
    URL while also supporting a full table seemed redundant and confusing.

This PEP exists to formalize the informal conventions that have arisen, as well
as to explicitly document ``Home-page`` and ``Download-URL`` as deprecated in
favor of equivalent ``Project-URL`` representations.

Specification
=============

This PEP proposes that ``Home-page`` and ``Download-URL`` be considered
**deprecated**. This deprecation has implications for both package metadata
producers (e.g. build backends and packaging tools) and package indices
(e.g. PyPI).

.. _metadata-producers:

Metadata producers
------------------

This PEP stipulates the following for metadata producers:

* When generating metadata 1.2 or later, producers **SHOULD** emit only
  ``Project-URL``, and **SHOULD NOT** emit ``Home-page`` or ``Download-URL``
  fields.

These stipulations do not change the optionality of URL fields in core metadata.
In other words, producers **MAY** choose to omit ``Project-URL`` entirely
at their discretion.

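One way a producer might follow this guidance is sketched below. The
``render_url_fields`` helper is hypothetical, and its handling of legacy
inputs is an assumption: it folds a configured home page or download URL into
``Project-URL`` entries under the conventional ``Homepage`` and ``Download``
labels, and lets an explicit ``Project-URL`` entry win when both are given.

.. code-block:: python

    from typing import Optional


    def render_url_fields(
        project_urls: dict[str, str],
        home_page: Optional[str] = None,
        download_url: Optional[str] = None,
    ) -> list[str]:
        """Render URL metadata lines using only Project-URL (hypothetical)."""
        merged = dict(project_urls)
        # Assumption: legacy inputs map onto the conventional "Homepage" and
        # "Download" labels, and an existing entry takes precedence.
        if home_page is not None:
            merged.setdefault("Homepage", home_page)
        if download_url is not None:
            merged.setdefault("Download", download_url)
        # Labels are written exactly as supplied by the packager.
        return [f"Project-URL: {label}, {url}" for label, url in merged.items()]

With no URLs configured at all, the helper returns an empty list, which
matches the optionality described above.
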
This PEP does **not** propose the outright removal of support for ``Home-page``
or ``Download-URL``. However, see :ref:`future-considerations` for
thoughts on how a new (as of yet unspecified) major core metadata version
could complete the deprecation cycle via removal of these deprecated fields.

Similarly, this PEP does **not** propose that metadata producers emit
:ref:`normalized labels <label-normalization>`. Label normalization is performed
solely on the processing and consumption side (i.e. within indices and other
consumers of distribution metadata).

.. _package-indices:

Package indices
---------------

This PEP stipulates the following for package indices:

* When interpreting a distribution's metadata of version 1.2 or later
  (e.g. for rendering on a web page), the index **MUST** prefer
  ``Project-URL`` fields as a source of URLs over ``Home-page`` and
  ``Download-URL``, even if the latter are explicitly provided.

* If a distribution's metadata contains only the ``Home-page`` and
  ``Download-URL`` fields, the index **MAY** choose to ignore those fields
  and behave as though no URLs were provided in the metadata. In this case,
  the index **SHOULD** present an appropriate warning or notification to
  the uploading party.

  * The mechanism for presenting this warning or notification is not
    specified, since it will vary by index. By way of example, an index may
    choose to present a warning in the HTTP response to an upload request, or
    send an email or other notification to the maintainer(s) of the project.

* If a distribution's metadata contains both sets of fields, the index **MAY**
  choose to reject the distribution outright. However, this is
  **NOT RECOMMENDED** until a future unspecified major metadata version
  formally removes support for ``Home-page`` and ``Download-URL``.

* Any changes to the interpretation of metadata of version 1.2 or later that
  result in previously recognized URLs no longer being recognized
  **SHOULD NOT** be retroactively applied to previously uploaded packages.

These stipulations do not change the optionality of URL processing by indices.
In other words, an index that does not process URLs within uploaded
distributions may continue to ignore all URL fields entirely.

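The first two stipulations above can be sketched as follows. This is one
possible reading, not a reference implementation: the ``urls_for_display``
helper and its warning text are assumptions, the sketch takes the optional
path of ignoring the deprecated fields, and it omits the metadata version
check for brevity.

.. code-block:: python

    from email import message_from_string

    LEGACY_FIELDS = ("Home-page", "Download-URL")


    def urls_for_display(metadata_text: str) -> tuple[list[tuple[str, str]], list[str]]:
        """Return (urls, warnings) for rendering; a hypothetical index helper."""
        message = message_from_string(metadata_text)

        urls = []
        for value in message.get_all("Project-URL", []):
            label, _, url = value.partition(",")
            urls.append((label.strip(), url.strip()))
        if urls:
            # Project-URL is preferred even when legacy fields are present.
            return urls, []

        legacy = [name for name in LEGACY_FIELDS if message.get(name)]
        if legacy:
            # Only deprecated fields were supplied: this sketch ignores them
            # and reports a warning for the uploading party.
            return [], ["ignoring deprecated field(s): " + ", ".join(legacy)]
        return [], []

How the returned warnings reach the uploader (an HTTP response, an email, or
something else) is left to the index, as the nested item above notes.
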
Similarly, these stipulations do **not** imply that the index should
persist any normalizations that it performs to a distribution's metadata.
In other words, this PEP stipulates label normalization for the purpose
of user presentation, not for the purpose of modifying an uploaded distribution
or its "sidecar" :pep:`658` metadata file.

.. _conventions-for-labels:

Conventions for ``Project-URL`` labels
======================================

The deprecations proposed above require a formalization of the currently
informal relationship between ``Home-page``, ``Download-URL``, and their
``Project-URL`` equivalents.

This formalization has two parts:

1. A set of rules for normalizing ``Project-URL`` labels during index-side
   processing;
2. A set of "well-known" normalized label values that indices may specialize
   URL presentation for.

.. _label-normalization:

Label normalization
-------------------

The core metadata specification stipulates that ``Project-URL`` labels are
free text, limited to 32 characters.

This PEP proposes adding the concept of a "normalized" label to the core
metadata specification. Label normalization is defined via the following
Python function:

.. code-block:: python

    import string

    def normalize_label(label: str) -> str:
        chars_to_remove = string.punctuation + string.whitespace
        removal_map = str.maketrans("", "", chars_to_remove)
        return label.translate(removal_map).lower()

In plain language: a label is *normalized* by deleting all ASCII punctuation and
whitespace, and then converting the result to lowercase.

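As a quick, non-normative check, the function can be run over a handful of raw
labels. The block repeats the definition so that it runs on its own; the
``samples`` list and the non-ASCII label are chosen purely for illustration.

.. code-block:: python

    import string


    def normalize_label(label: str) -> str:
        # Same definition as in the specification text above.
        chars_to_remove = string.punctuation + string.whitespace
        removal_map = str.maketrans("", "", chars_to_remove)
        return label.translate(removal_map).lower()


    samples = ["Homepage", "Home-page", "Home page", "Change_Log", "What's New?"]
    for raw in samples:
        print(f"{raw!r} -> {normalize_label(raw)!r}")

    # Only ASCII punctuation and whitespace are deleted; other characters are
    # kept and lowercased.
    print(normalize_label("Café Menu"))  # cafémenu

The first three samples all collapse to ``homepage``, which is why
normalization can produce repeated labels within one distribution's metadata.
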
The following table shows examples of labels before (raw) and after
normalization:

.. csv-table::
    :header: "Raw", "Normalized"

    "``Homepage``", "``homepage``"
    "``Home-page``", "``homepage``"
    "``Home page``", "``homepage``"
    "``Change_Log``", "``changelog``"
    "``What's New?``", "``whatsnew``"

When processing distribution metadata, package indices **SHOULD** perform
label normalization to determine if a given label is
:ref:`well known <well-known>` for subsequent special processing.
Labels that are not well-known **MUST** be processed in their un-normalized
form.

Normalization does **not** change pre-existing semantics around duplicated
``Project-URL`` labels. In other words, normalization may result in duplicate
labels in the project's metadata, but only in the same manner that was already
permitted (since the :ref:`core metadata specification <packaging:core-metadata>`
does not stipulate that labels are unique).

Excerpted examples of normalized metadata fields are provided in
:ref:`Appendix A <appendix-a>`.

.. _well-known:

Well-known labels
-----------------

In addition to the normalization rules above, this PEP proposes a
fixed (but extensible) set of "well-known" ``Project-URL`` labels,
as well as aliases and human-readable equivalents.

The following table lists these labels, in normalized form:

.. list-table::
   :header-rows: 1

   * - Label (Human-readable equivalent)
     - Description
     - Aliases
   * - ``homepage`` (Homepage)
     - The project's home page
     - *(none)*
   * - ``source`` (Source Code)
     - The project's hosted source code or repository
     - ``repository``, ``sourcecode``, ``github``
   * - ``download`` (Download)
     - A download URL for the current distribution, equivalent to ``Download-URL``
     - *(none)*
   * - ``changelog`` (Changelog)
     - The project's comprehensive changelog
     - ``changes``, ``whatsnew``, ``history``
   * - ``releasenotes`` (Release Notes)
     - The project's curated release notes
     - *(none)*
   * - ``documentation`` (Documentation)
     - The project's online documentation
     - ``docs``
   * - ``issues`` (Issue Tracker)
     - The project's bug tracker
     - ``bugs``, ``issue``, ``tracker``, ``issuetracker``, ``bugtracker``
   * - ``funding`` (Funding)
     - Funding information
     - ``sponsor``, ``donate``, ``donation``

Indices **MAY** choose to use the human-readable equivalents suggested above
in their UI elements, if appropriate. Alternatively, indices **MAY** choose
their own appropriate human-readable equivalents for UI elements.

Packagers and metadata producers **MAY** choose to use any label that normalizes
to these labels (or their aliases) to communicate specific URL intents to
package indices and downstreams.

Similarly, indices **MAY** choose to specialize their rendering or presentation
of URLs with these labels, e.g. by presenting an appropriate icon or tooltip
for each label.

Indices **MAY** also specialize the rendering or presentation of additional
labels or URLs, including (but not limited to) labels that start with a
well-known label, and URLs that refer to a known service provider domain (e.g.
for documentation hosting or issue tracking).

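A lookup over the table above might be sketched as follows. Treating an alias
like its primary label when choosing an icon or tooltip is an assumption made
for this example, as are the ``WELL_KNOWN``, ``PRIMARY``, and
``well_known_label`` names; an index remains free to present aliases
differently.

.. code-block:: python

    import string
    from typing import Optional

    # Normalized well-known labels and their aliases, copied from the table.
    WELL_KNOWN = {
        "homepage": (),
        "source": ("repository", "sourcecode", "github"),
        "download": (),
        "changelog": ("changes", "whatsnew", "history"),
        "releasenotes": (),
        "documentation": ("docs",),
        "issues": ("bugs", "issue", "tracker", "issuetracker", "bugtracker"),
        "funding": ("sponsor", "donate", "donation"),
    }

    # Reverse index: every well-known label or alias maps to its primary label.
    PRIMARY = {
        name: primary
        for primary, aliases in WELL_KNOWN.items()
        for name in (primary, *aliases)
    }


    def normalize_label(label: str) -> str:
        # Same definition as in the label normalization section.
        chars_to_remove = string.punctuation + string.whitespace
        removal_map = str.maketrans("", "", chars_to_remove)
        return label.translate(removal_map).lower()


    def well_known_label(raw_label: str) -> Optional[str]:
        """Return the primary well-known label for raw_label, or None."""
        return PRIMARY.get(normalize_label(raw_label))

A ``None`` result signals that the label is not well-known, in which case the
raw label is the one to display.
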
This PEP recognizes that the list of well-known labels is unlikely to remain
static, and that subsequent additions to it should not require the overhead
associated with a formal PEP process or new metadata version. As such,
this PEP proposes that the list above become a "living" list within
the :ref:`PyPA specifications <packaging:packaging-specifications>`.

Backwards Compatibility
=======================

Limited Impact
--------------

This PEP is expected to have little to no impact on existing packaging tooling
or package indices:

* Packaging tooling: no changes to the correctness or well-formedness
  of the core metadata. This PEP proposes deprecations as well as behavioral
  refinements, but all currently (and historically) produced metadata will
  continue to be valid per the rules of its respective version.
* Package indices: indices will continue to expect well-formed core metadata,
  with no behavioral changes. Indices **MAY** choose to emit warnings or
  notifications on the presence of now-deprecated fields,
  :ref:`per above <package-indices>`.

.. _future-considerations:

Future Considerations
=====================

This PEP does not stipulate or require any future metadata changes.

However, per :ref:`metadata-producers` and :ref:`conventions-for-labels`,
we identify the following potential future goals for a new major release of
the core metadata standards:

* Outright removal of support for ``Home-page`` and ``Download-URL`` in the
  next major core metadata version. If removed, package indices and consumers
  **MUST** reject metadata containing these fields when said metadata is of
  the new major version.
* Enforcement of label normalization. If enforced, package producers
  **MUST** emit only normalized ``Project-URL`` labels when generating
  distribution metadata, and package indices and consumers **MUST** reject
  distributions containing non-normalized labels. Note: requiring
  normalization merely restricts labels to lowercase text, and excludes
  whitespace and punctuation. It does NOT restrict project URLs solely to
  the use of "well-known" labels.

These potential changes would be backwards incompatible, hence their
inclusion only in this section. Acceptance of this PEP does NOT commit
any future metadata revision to actually making these changes.

Security Implications
=====================

This PEP does not identify any positive or negative security implications
associated with deprecating ``Home-page`` and ``Download-URL`` or with
label normalization.

How To Teach This
=================

The changes in this PEP should be transparent to the majority of the packaging
ecosystem's userbase; the primary beneficiaries of this PEP's changes are
packaging tooling authors and index maintainers, who will be able to reduce the
number of unique URL fields produced and checked.

A small number of package maintainers may observe new warnings or notifications
from their index of choice, should the index choose to ignore ``Home-page``
and ``Download-URL`` as suggested. Similarly, a small number of package
maintainers may observe that their index of choice no longer renders
their URLs, if only present in the deprecated fields. However, no package
maintainers should observe rejected package uploads or other breaking
changes to packaging workflows due to this PEP's proposed changes.

Anybody who observes warnings or changes to the presentation of
URLs on indices can be taught about this PEP's behavior via official
packaging resources, such as the
:ref:`Python Packaging User Guide <packaging>`
and `PyPI's user documentation <https://docs.pypi.org/>`__, the latter of which
already contains an informal description of PyPI's URL handling behavior.

If this PEP is accepted, the authors of this PEP will coordinate to update
and cross-link the resources mentioned above.

.. _appendix-a:

Appendix A: Label normalization examples
========================================

This appendix provides an illustrative excerpt of a distribution's
metadata, before and after index-side processing:

Before:

.. code-block:: email

    Project-URL: Home-page, https://example.com
    Project-URL: Homepage, https://another.example.com
    Project-URL: Source, https://github.com/example/example
    Project-URL: GitHub, https://github.com/example/example
    Project-URL: Another Service, https://custom.example.com

After:

.. code-block:: email

    Project-URL: homepage, https://example.com
    Project-URL: homepage, https://another.example.com
    Project-URL: source, https://github.com/example/example
    Project-URL: github, https://github.com/example/example
    Project-URL: Another Service, https://custom.example.com

In particular, observe:

* Normalized duplicates are preserved (both ``Home-page`` and ``Homepage``
  become ``homepage``);
* ``Source`` and ``GitHub`` are both normalized into their respective forms,
  but ``github`` is **not** transformed into ``source``;
* ``Another Service`` is **not** normalized, since its normal form
  (``anotherservice``) is not a :ref:`well-known label <well-known>`.

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.