PEP: 685
Title: Comparison of extra names for optional distribution dependencies
Author: Brett Cannon
Discussions-To: https://discuss.python.org/t/pep-685-comparison-of-extra-names-for-optional-distribution-dependencies/14141
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Mar-2022
Post-History: 08-Mar-2022


Abstract
========

This PEP specifies how to normalize distribution *extra* names when
performing comparisons. This prevents tools from either failing to find an
extra name or accidentally matching against an unexpected name.


Motivation
==========

The `Provides-Extra`_ core metadata specification says that an extra's name
"must be a valid Python identifier". :pep:`508` says that the value of an
``extra`` marker may contain a letter, digit, or any one of ``.``, ``-``, or
``_`` after the initial character. Beyond that, no specification at
https://packaging.python.org outlines how extra names should be written or
normalized for comparison. Due to the amount of packaging-related code out
there, it is important to evaluate current practices by the community and
standardize on a practice that doesn't break most code while being something
tool authors can agree to follow.

The lack of a standard was brought forward in the discussion at
https://discuss.python.org/t/what-extras-names-are-treated-as-equal-and-why/7614
where the extra ``adhoc-ssl`` was not considered equal to the name
``adhoc_ssl`` by pip.


Rationale
=========

:pep:`503` specifies how to normalize distribution names:
``re.sub(r"[-_.]+", "-", name).lower()``. This collapses any run of the
substitution character down to a single character, e.g. ``---`` gets
collapsed down to ``-``. This does not produce a valid Python identifier as
specified by the core metadata specification for extra names.

Setuptools does normalization via
``re.sub('[^A-Za-z0-9.-]+', '_', name).lower()``. The use of an
underscore/``_`` differs from PEP 503's use of a hyphen/``-``.
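
To make the difference between the two expressions concrete, the following
minimal sketch wraps each one in a function. The function names
``normalize_pep503`` and ``normalize_setuptools_style`` are hypothetical and
chosen only for illustration; the regular expressions themselves are the ones
quoted above.

```python
import re


def normalize_pep503(name: str) -> str:
    """Apply the distribution-name normalization quoted from PEP 503.

    Hypothetical helper name; only the expression comes from the text.
    """
    # Any run of "-", "_" or "." becomes a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def normalize_setuptools_style(name: str) -> str:
    """Apply the setuptools expression quoted in the text.

    Hypothetical helper name; only the expression comes from the text.
    """
    # Any run of characters other than ASCII letters, digits, "." and "-"
    # becomes a single "_", then lowercase.
    return re.sub("[^A-Za-z0-9.-]+", "_", name).lower()
```

For instance, under this sketch ``normalize_pep503("Adhoc_SSL")`` gives
``adhoc-ssl``, while ``normalize_setuptools_style("Adhoc_SSL")`` gives
``adhoc_ssl``.
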
Runs of ``.`` and ``-``, unlike PEP 503, do **not** get collapsed, e.g.
``---`` stays the same.

For pip, its "extra normalisation behaviour is quite convoluted and erratic",
and so its use is not considered.


Specification
=============

When comparing extra names, tools MUST normalize the names being compared
using the equivalent semantics of
``re.sub('[^A-Za-z0-9.-]+', '_', name).lower()``. This normalizes any extra
name previously allowed by :pep:`508` in a fashion consistent with
setuptools.

Tools writing `core metadata`_ MUST write out extra names in their
normalized form. This applies to the ``Provides-Extra`` field and to the
``Requires-Dist`` field, both when specifying extras for a distribution and
in the ``extra`` marker. This will also help enforce the current requirement
from the core metadata specification that extra names be valid Python
identifiers.

Tools generating metadata MUST also raise an error if a user specified two
or more extra names which would normalize to the same name.


Backwards Compatibility
=======================

Older distributions which contain conflicting names when normalized will no
longer have all of their extra names made available to users as independent
options, but instead as a single extra. It is hoped that relying on
setuptools' algorithm for normalization will minimize the breakage from
this.

As distributions make new releases using tools which implement this PEP, the
backwards-compatibility issues will become less of a concern.


Security Implications
=====================

It is possible that a distribution has conflicting extra names and a tool
ends up installing distributions that somehow weaken the security of the
system. This is only hypothetical, and if it were to occur it would probably
be more of a security concern for the distributions involved than for the
distribution that pulled them in together.
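
Conflicting extra names of the kind discussed here can be detected
mechanically by grouping the declared names by their normalized form. The
following is a minimal sketch, assuming the normalization expression from the
Specification section; ``normalize_extra`` and ``find_conflicting_extras``
are hypothetical names and are not part of any existing tool.

```python
import re
from collections import defaultdict


def normalize_extra(name: str) -> str:
    """Normalize an extra name with the expression from the Specification.

    Hypothetical helper, not an existing API.
    """
    return re.sub("[^A-Za-z0-9.-]+", "_", name).lower()


def find_conflicting_extras(names):
    """Return {normalized name: sorted distinct originals} for every
    normalized name shared by two or more distinct declared names.

    Hypothetical helper, not an existing API.
    """
    groups = defaultdict(set)
    for name in names:
        # A set is used so that an exact duplicate is not reported
        # as a conflict with itself.
        groups[normalize_extra(name)].add(name)
    return {
        normalized: sorted(originals)
        for normalized, originals in groups.items()
        if len(originals) > 1
    }
```

A tool generating metadata could raise an error whenever the returned mapping
is non-empty, for example when a project declares both ``Test`` and ``test``.
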


How to Teach This
=================

This should be transparent to users on a day-to-day basis. It will be up to
tools to educate/stop users when they select extra names which conflict.


Reference Implementation
========================

No reference implementation is provided, but the expectation is the
`packaging project`_ will provide a function in its ``packaging.utils`` that
will implement extra name normalization. It will also implement extra name
comparisons appropriately. Finally, if the project ever gains the ability to
write out metadata, it will also implement this PEP.


Rejected Ideas
==============

Normalize names according to PEP 503
------------------------------------

For backwards-compatibility concerns, it was decided not to follow :pep:`503`
and how it normalizes distribution names.


Open Issues
===========

N/A


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.


.. _core metadata: https://packaging.python.org/en/latest/specifications/core-metadata/
.. _packaging project: https://packaging.pypa.io
.. _Provides-Extra: https://packaging.python.org/en/latest/specifications/core-metadata/#provides-extra-multiple-use