193 lines
7.1 KiB
ReStructuredText
193 lines
7.1 KiB
ReStructuredText
PEP: 685
|
|
Title: Comparison of extra names for optional distribution dependencies
|
|
Author: Brett Cannon <brett@python.org>
|
|
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
|
|
Discussions-To: https://discuss.python.org/t/14141
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 08-Mar-2022
|
|
Post-History: `08-Mar-2022 <https://discuss.python.org/t/14141>`__
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP specifies how to normalize `distribution extra <Provides-Extra_>`_
|
|
names when performing comparisons.
|
|
This prevents tools from either failing to find an extra name, or
|
|
accidentally matching against an unexpected name.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
The `Provides-Extra`_ core metadata specification states that an extra's
|
|
name "must be a valid Python identifier".
|
|
:pep:`508` specifies that the value of an ``extra`` marker may contain a
|
|
letter, digit, or any one of ``.``, ``-``, or ``_`` after the initial character.
|
|
Otherwise, there is no other `PyPA specification
|
|
<https://packaging.python.org/en/latest/specifications/>`_
|
|
which outlines how extra names should be written or normalized for comparison.
|
|
Due to the amount of packaging-related code in existence,
|
|
it is important to evaluate current practices by the community and
|
|
standardize on one that doesn't break most existing code, while being
|
|
something tool authors can agree to following.
|
|
|
|
The issue of there being no consistent standard was brought forward by an
|
|
`initial discussion <https://discuss.python.org/t/7614>`__
|
|
noting that the extra ``adhoc-ssl`` was not considered equal to the name
|
|
``adhoc_ssl`` by pip 22.
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
:pep:`503` specifies how to normalize distribution names::
|
|
|
|
re.sub(r"[-_.]+", "-", name).lower()
|
|
|
|
This collapses any run of the characters ``-``, ``_`` and ``.``
|
|
down to a single ``-``.
|
|
For example, ``---`` ``.`` and ``__`` all get converted to just ``-``.
|
|
This does **not** produce a valid Python identifier, per
|
|
the core metadata 2.2 specification for extra names.
|
|
|
|
`Setuptools 60 performs normalization <https://github.com/pypa/setuptools/blob/b2f7b8f92725c63b164d5776f85e67cc560def4e/pkg_resources/__init__.py#L1324-L1330>`__
|
|
via::
|
|
|
|
re.sub(r'[^A-Za-z0-9-.]+', '_', name).lower()
|
|
|
|
The use of an underscore/``_`` differs from PEP 503's use of a hyphen/``-``,
|
|
and it also normalizes characters outside of those allowed by :pep`508`.
|
|
Runs of ``.`` and ``-``, unlike PEP 503, do **not** get normalized to one ``_``,
|
|
e.g. ``..`` stays the same. To note, this is inconsistent with this function's
|
|
docstring, which *does* specify that all non-alphanumeric characters
|
|
(which would include ``-`` and ``.``) are normalized and collapsed.
|
|
|
|
For pip 22, its
|
|
"extra normalisation behaviour is quite convoluted and erratic" [pip-erratic]_
|
|
and so its use is not considered.
|
|
|
|
.. [pip-erratic] Tzu-ping Chung on Python Discourse <https://discuss.python.org/t/7614/10
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
When comparing extra names, tools MUST normalize the names being compared
|
|
using the semantics outlined in `PEP 503 for names <https://peps.python.org/pep-0503/#normalized-names>`__::
|
|
|
|
re.sub(r"[-_.]+", "-", name).lower()
|
|
|
|
The `core metadata`_ specification will be updated such that the allowed
|
|
names for `Provides-Extra`_ matches what :pep:`508` specifies for names.
|
|
This will bring extra naming in line with that of the Name_ field.
|
|
Because this changes what is considered valid, it will lead to a core
|
|
metadata version increase to ``2.3``.
|
|
|
|
For tools writing `core metadata`_,
|
|
they MUST write out extra names in their normalized form.
|
|
This applies to the `Provides-Extra`_ field and the `extra marker`_
|
|
when used in the `Requires-Dist`_ field.
|
|
|
|
Tools generating metadata MUST raise an error if a user specified
|
|
two or more extra names which would normalize to the same name.
|
|
Tools generating metadata MUST raise an error if an invalid extra
|
|
name is provided as appropriate for the specified core metadata version.
|
|
If an older core metadata version is specified and the name would be
|
|
invalid with newer core metadata versions,
|
|
tools SHOULD warn the user.
|
|
Tools SHOULD warn users when an invalid extra name is read and SHOULD not use
|
|
the name to avoid ambiguity.
|
|
Tools MAY raise an error instead of a warning when reading an
|
|
invalid name, if they so desire.
|
|
|
|
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
Moving to :pep:`503` normalization and :pep:`508` name acceptance
|
|
allows for all preexisting, valid names to continue to be valid.
|
|
|
|
Based on research looking at a collection of wheels on PyPI [pypi-results]_,
|
|
the risk of extra name clashes is limited to 73 instances when considering
|
|
all extras names on PyPI, valid or not (not just those within a single package)
|
|
while *only* looking at valid names leads to only 3 clashes:
|
|
|
|
* ``dev-test``: ``dev_test``, ``dev-test``, ``dev.test``
|
|
* ``dev-lint``: ``dev-lint``, ``dev.lint``, ``dev_lint``
|
|
* ``apache-beam``: ``apache-beam``, ``apache.beam``
|
|
|
|
By requiring tools writing core metadata to only record the normalized name,
|
|
the issue of preexisting, invalid extra names should diminish over time.
|
|
|
|
.. [pypi-results] Paul Moore on Python Discourse https://discuss.python.org/t/14141/17
|
|
|
|
|
|
Security Implications
|
|
=====================
|
|
|
|
It is possible that for a distribution that has conflicting extra names, a
|
|
tool ends up installing dependencies that somehow weaken the security
|
|
of the system.
|
|
This is only hypothetical and if it were to occur,
|
|
it would probably be more of a security concern for the distributions
|
|
specifying such extras names rather than the distribution that pulled
|
|
them in together.
|
|
|
|
|
|
How to Teach This
|
|
=================
|
|
|
|
This should be transparent to users on a day-to-day basis.
|
|
It will be up to tools to educate/stop users when they select extra
|
|
names which conflict.
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
No reference implementation is provided aside from the code above,
|
|
but the expectation is the `packaging project`_ will provide a
|
|
function in its ``packaging.utils`` module that will implement extra name
|
|
normalization.
|
|
It will also implement extra name comparisons appropriately.
|
|
Finally, if the project ever gains the ability to write out metadata,
|
|
it will also implement this PEP.
|
|
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
Using setuptools 60's normalization
|
|
-----------------------------------
|
|
|
|
Initially, this PEP proposed using setuptools ``safe_extra()`` for normalization
|
|
to try to minimize backwards-compatibility issues.
|
|
However, after checking various wheels on PyPI,
|
|
it became clear that standardizing **all** naming on :pep:`508` and
|
|
:pep:`503` semantics was easier and better long-term,
|
|
while causing minimal backwards compatibility issues.
|
|
|
|
|
|
Open Issues
|
|
===========
|
|
|
|
N/A
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
.. _core metadata: https://packaging.python.org/en/latest/specifications/core-metadata/
|
|
.. _extra marker: https://peps.python.org/pep-0508/#extras
|
|
.. _Name: https://packaging.python.org/en/latest/specifications/core-metadata/#name
|
|
.. _packaging project: https://packaging.pypa.io
|
|
.. _Provides-Extra: https://packaging.python.org/en/latest/specifications/core-metadata/#provides-extra-multiple-use
|
|
.. _Requires-Dist: https://packaging.python.org/en/latest/specifications/core-metadata/#requires-dist-multiple-use
|