PEP 440: regex should not permit Unicode [Nd] characters (#966)

Numeric components in version numbers are expected to be sequences
of ASCII digits, but the PEP didn't actually say that explicitly.

Frazer McLean picked up the omission and corrected the `is_canonical`
regex in Appendix B.
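
The practical difference is easy to check. In Python 3, `\d` in a str pattern matches any Unicode decimal digit (category Nd) unless `re.ASCII` is set, whereas `[0-9]` matches ASCII digits only. The sketch below is illustrative rather than part of the change: it copies the two patterns from the diff, and the names `OLD_PATTERN`, `NEW_PATTERN`, `is_canonical_old`, `is_canonical_new` and `sample` are made up for the comparison.

```python
import re

# Pattern before this change: uses \d, which also matches non-ASCII Nd digits.
OLD_PATTERN = (
    r'^([1-9]\d*!)?(0|[1-9]\d*)(\.(0|[1-9]\d*))*'
    r'((a|b|rc)(0|[1-9]\d*))?(\.post(0|[1-9]\d*))?(\.dev(0|[1-9]\d*))?$'
)

# Pattern after this change: uses [0-9], which matches ASCII digits only.
NEW_PATTERN = (
    r'^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*'
    r'((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$'
)


def is_canonical_old(version):
    """Hypothetical wrapper around the pre-change regex."""
    return re.match(OLD_PATTERN, version) is not None


def is_canonical_new(version):
    """Hypothetical wrapper around the post-change regex."""
    return re.match(NEW_PATTERN, version) is not None


# "1.1" followed by ARABIC-INDIC DIGIT TWO (U+0662), a non-ASCII Nd character.
sample = "1.1\u0662"

print(is_canonical_old(sample))  # accepted by the old pattern
print(is_canonical_new(sample))  # rejected by the new pattern
```

Passing `re.ASCII` to the old pattern would also restrict `\d` to ASCII digits, but spelling out `[0-9]` keeps the regex correct regardless of the flags a caller uses.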
Frazer McLean 2019-04-07 09:26:48 +02:00 committed by Nick Coghlan
parent f25f585af9
commit 1dd991e900
1 changed file with 13 additions and 2 deletions

@@ -110,7 +110,8 @@ Public version identifiers are separated into up to five segments:
Any given release will be a "final release", "pre-release", "post-release" or
"developmental release" as defined in the following sections.
-All numeric components MUST be non-negative integers.
+All numeric components MUST be non-negative integers represented as sequences
+of ASCII digits.
All numeric components MUST be interpreted and ordered according to their
numeric value, not as text strings.
@@ -1533,6 +1534,13 @@ the initial reference implementation was released in setuptools 8.0 and pip
This change was based on user feedback received when setuptools 8.0
started applying normalisation to the release metadata generated when
preparing packages for publication on PyPI [8]_.
+* The PEP text and the ``is_canonical`` regex were updated to be explicit
+that numeric components are specifically required to be represented as
+sequences of ASCII digits, not arbitrary Unicode [Nd] code points. This
+was previously implied by the version parsing regex in Appendix B, but
+not stated explicitly [10]_.
References
@@ -1567,6 +1575,9 @@ justifications for needing such a standard can be found in PEP 386.
.. [9] Changing the status of PEP 440 to Provisional
https://mail.python.org/pipermail/distutils-sig/2014-December/025412.html
+.. [10] PEP 440: regex should not permit Unicode [Nd] characters
+https://github.com/python/peps/pull/966
Appendix A
==========
@@ -1594,7 +1605,7 @@ the following function::
import re
def is_canonical(version):
-    return re.match(r'^([1-9]\d*!)?(0|[1-9]\d*)(\.(0|[1-9]\d*))*((a|b|rc)(0|[1-9]\d*))?(\.post(0|[1-9]\d*))?(\.dev(0|[1-9]\d*))?$', version) is not None
+    return re.match(r'^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$', version) is not None
To extract the components of a version identifier, use the following regular
expression (as defined by the `packaging <https://github.com/pypa/packaging>`_