diff --git a/pep-0440.txt b/pep-0440.txt
index 099816f3d..c663c1361 100644
--- a/pep-0440.txt
+++ b/pep-0440.txt
@@ -110,7 +110,8 @@ Public version identifiers are separated into up to five segments:
 Any given release will be a "final release", "pre-release", "post-release" or
 "developmental release" as defined in the following sections.
 
-All numeric components MUST be non-negative integers.
+All numeric components MUST be non-negative integers represented as sequences
+of ASCII digits.
 
 All numeric components MUST be interpreted and ordered according to their
 numeric value, not as text strings.
@@ -1533,6 +1534,13 @@ the initial reference implementation was released in setuptools 8.0 and pip
   This change was based on user feedback received when setuptools 8.0
   started applying normalisation to the release metadata generated when
   preparing packages for publication on PyPI [8]_.
+
+* The PEP text and the ``is_canonical`` regex were updated to be explicit
+  that numeric components are specifically required to be represented as
+  sequences of ASCII digits, not arbitrary Unicode [Nd] code points. This
+  was previously implied by the version parsing regex in Appendix B, but
+  not stated explicitly [10]_.
+
 
 
 References
@@ -1567,6 +1575,9 @@ justifications for needing such a standard can be found in PEP 386.
 
 .. [9] Changing the status of PEP 440 to Provisional
    https://mail.python.org/pipermail/distutils-sig/2014-December/025412.html
+
+.. [10] PEP 440: regex should not permit Unicode [Nd] characters
+   https://github.com/python/peps/pull/966
 
 Appendix A
 ==========
@@ -1594,7 +1605,7 @@ the following function::
 
     import re
     def is_canonical(version):
-        return re.match(r'^([1-9]\d*!)?(0|[1-9]\d*)(\.(0|[1-9]\d*))*((a|b|rc)(0|[1-9]\d*))?(\.post(0|[1-9]\d*))?(\.dev(0|[1-9]\d*))?$', version) is not None
+        return re.match(r'^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$', version) is not None
 
 To extract the components of a version identifier, use the following regular
 expression (as defined by the `packaging `_