PEP 538: Fix footnotes (#2714)

Hugo van Kemenade 2022-07-21 00:50:22 +03:00 committed by GitHub
parent 3a792e4a1d
commit 99c0bda3df
1 changed file with 28 additions and 38 deletions

@ -124,7 +124,7 @@ can cause problems in some situations (for example, when using the GNU readline
module [16_]).
On non-Apple and non-Android \*nix systems, these operations are handled using
the C locale system in glibc, which has the following characteristics [4_]:
the C locale system in glibc, which has the following characteristics [4]_:
* by default, all processes start in the ``C`` locale, which uses ``ASCII``
for these conversions. This is almost never what anyone doing multilingual
@ -136,7 +136,7 @@ the C locale system in glibc, which has the following characteristics [4_]:
The specific locale category that covers the APIs that CPython depends on is
``LC_CTYPE``, which applies to "classification and conversion of characters,
and to multibyte and wide characters" [5_]. Accordingly, CPython includes the
and to multibyte and wide characters" [5]_. Accordingly, CPython includes the
following key calls to ``setlocale``:
* in the main ``python`` binary, CPython calls ``setlocale(LC_ALL, "")`` to
@ -183,7 +183,7 @@ Mac OS X and other \*BSD systems have taken a different approach: instead of
offering a ``C.UTF-8`` locale, they offer a partial ``UTF-8`` locale that only
defines the ``LC_CTYPE`` category. On such systems, the preferred
environmental locale adjustment is to set ``LC_CTYPE=UTF-8`` rather than to set
``LC_ALL`` or ``LANG``. [17_]
``LC_ALL`` or ``LANG``. [17]_
In the specific case of Docker containers and similar technologies, the
appropriate locale setting can be specified directly in the container image
@ -247,8 +247,8 @@ Motivation
While Linux container technologies like Docker, Kubernetes, and OpenShift are
best known for their use in web service development, the related container
formats and execution models are also being adopted for Linux command line
application development. Technologies like Gnome Flatpak [7_] and
Ubuntu Snappy [8_] further aim to bring these same techniques to Linux GUI
application development. Technologies like Gnome Flatpak [7]_ and
Ubuntu Snappy [8]_ further aim to bring these same techniques to Linux GUI
application development.
When using Python 3 for application development in these contexts, it isn't
@ -327,7 +327,7 @@ with this problem automatically rather than relying on redistributors or end
users to handle it through system configuration changes.
While the glibc developers are working towards making the C.UTF-8 locale
universally available for use by glibc based applications like CPython [6_],
universally available for use by glibc based applications like CPython [6]_,
this unfortunately doesn't help on platforms that ship older versions of glibc
without that feature, and also don't provide C.UTF-8 (or an equivalent) as an
on-disk locale the way Debian and Fedora do. These platforms are considered
@ -649,7 +649,7 @@ Defaulting to "surrogateescape" error handling on the standard IO streams
By coercing the locale away from the legacy C default and its assumption of
ASCII as the preferred text encoding, this PEP also disables the implicit use
of the "surrogateescape" error handler on the standard IO streams that was
introduced in Python 3.5 ([15_]), as well as the automatic use of
introduced in Python 3.5 ([15]_), as well as the automatic use of
``surrogateescape`` when operating in :pep:`540`'s proposed UTF-8 mode.
Rather than introducing yet another configuration option to adjust that
@ -662,7 +662,7 @@ provided text values are typically able to be transparently passed through a
Python 3 application even if it is incorrect in assuming that that text has
been encoded as UTF-8.
In particular, GB 18030 [12_] is a Chinese national text encoding standard
In particular, GB 18030 [12]_ is a Chinese national text encoding standard
that handles all Unicode code points, that is formally incompatible with both
ASCII and UTF-8, but will nevertheless often tolerate processing as surrogate
escaped data - the points where GB 18030 reuses ASCII byte values in an
@ -672,7 +672,7 @@ the relevant ASCII code points. Operations that don't involve splitting on or
searching for particular ASCII or Unicode code point values are almost
certain to work correctly.
Similarly, Shift-JIS [13_] and ISO-2022-JP [14_] remain in widespread use in
Similarly, Shift-JIS [13]_ and ISO-2022-JP [14]_ remain in widespread use in
Japan, and are incompatible with both ASCII and UTF-8, but will tolerate text
processing operations that don't involve splitting on or searching for
particular ASCII or Unicode code point values.
@ -908,7 +908,7 @@ This was later removed on the grounds that setting only ``LC_CTYPE`` is
sufficient to handle all of the problematic scenarios that the PEP aimed
to resolve, while setting ``LANG`` as well would break cases where ``LANG``
was set correctly, and the locale problems were solely due to an incorrect
``LC_CTYPE`` setting ([22_]).
``LC_CTYPE`` setting ([22]_).
For example, consider a Python application that called the Linux ``date``
utility in a subprocess rather than doing its own date formatting::
@ -1077,7 +1077,7 @@ be entirely redundant.
However, that assumption turned out to be incorrect, as subsequent
investigations showed that if you explicitly configure ``LANG=C`` on
these platforms, extension modules like GNU readline will misbehave in much the
same way as they do on other \*nix systems. [21_]
same way as they do on other \*nix systems. [21]_
In addition, Mac OS X is also frequently used as a development and testing
platform for Python software intended for deployment to other \*nix environments
@ -1093,12 +1093,12 @@ Implementation
==============
The reference implementation is being developed in the
``pep538-coerce-c-locale`` feature branch [18_] in Nick Coghlan's fork of the
CPython repository on GitHub. A work-in-progress PR is available at [20_].
``pep538-coerce-c-locale`` feature branch [18]_ in Nick Coghlan's fork of the
CPython repository on GitHub. A work-in-progress PR is available at [20]_.
This reference implementation covers not only the enhancement request in
issue 28180 [1_], but also the Android compatibility fixes needed to resolve
issue 28997 [16_].
issue 28180 [1]_, but also the Android compatibility fixes needed to resolve
issue 28997 [16]_.
Backporting to earlier Python 3 releases
@ -1115,7 +1115,7 @@ default, or else specifically for platforms where such a locale is already
consistently available.
At least the Fedora project is planning to pursue this approach for the
upcoming Fedora 26 release [19_].
upcoming Fedora 26 release [19]_.
Backporting to other 3.x releases
@ -1139,7 +1139,7 @@ Acknowledgements
The locale coercion approach proposed in this PEP is inspired directly by
Armin Ronacher's handling of this problem in the ``click`` command line
utility development framework [2_]::
utility development framework [2]_::
$ LANG=C python3 -c 'import click; cli = click.command()(lambda:None); cli()'
Traceback (most recent call last):
@ -1157,18 +1157,18 @@ utility development framework [2_]::
export LANG=C.UTF-8
The change was originally proposed as a downstream patch for Fedora's
system Python 3.6 package [3_], and then reformulated as a PEP for Python 3.7
system Python 3.6 package [3]_, and then reformulated as a PEP for Python 3.7
with a section allowing for backports to earlier versions by redistributors.
In parallel with the development of the upstream patch, Charalampos Stratakis
has been working on the Fedora 26 backport and providing feedback on the
practical viability of the proposed changes.
The initial draft was posted to the Python Linux SIG for discussion [10_] and
The initial draft was posted to the Python Linux SIG for discussion [10]_ and
then amended based on both that discussion and Victor Stinner's work in
:pep:`540` [11_].
:pep:`540` [11]_.
The "ℙƴ☂ℌøἤ" string used in the Unicode handling examples throughout this PEP
is taken from Ned Batchelder's excellent "Pragmatic Unicode" presentation [9_].
is taken from Ned Batchelder's excellent "Pragmatic Unicode" presentation [9]_.
Stephen Turnbull has long provided valuable insight into the text encoding
handling challenges he regularly encounters at the University of Tsukuba
@ -1179,16 +1179,16 @@ References
==========
.. [1] CPython: sys.getfilesystemencoding() should default to utf-8
(http://bugs.python.org/issue28180)
(https://bugs.python.org/issue28180)
.. [2] Locale configuration required for click applications under Python 3
(http://click.pocoo.org/5/python3/#python-3-surrogate-handling)
(https://click.palletsprojects.com/en/5.x/python3/#python-3-surrogate-handling)
.. [3] Fedora: force C.UTF-8 when Python 3 is run under the C locale
(https://bugzilla.redhat.com/show_bug.cgi?id=1404918)
.. [4] GNU C: How Programs Set the Locale
( https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html)
(https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html)
.. [5] GNU C: Locale Categories
(https://www.gnu.org/software/libc/manual/html_node/Locale-Categories.html)
@ -1197,13 +1197,13 @@ References
(https://sourceware.org/glibc/wiki/Proposals/C.UTF-8)
.. [7] GNOME Flatpak
(http://flatpak.org/)
(https://flatpak.org/)
.. [8] Ubuntu Snappy
(https://www.ubuntu.com/desktop/snappy)
.. [9] Pragmatic Unicode
(http://nedbatchelder.com/text/unipain.html)
(https://nedbatchelder.com/text/unipain.html)
.. [10] linux-sig discussion of initial PEP draft
(https://mail.python.org/pipermail/linux-sig/2017-January/000014.html)
@ -1224,10 +1224,10 @@ References
(https://bugs.python.org/issue19977)
.. [16] test_readline.test_nonascii fails on Android
(http://bugs.python.org/issue28997)
(https://bugs.python.org/issue28997)
.. [17] UTF-8 locale discussion on "locale.getdefaultlocale() fails on Mac OS X with default language set to English"
(http://bugs.python.org/issue18378#msg215215)
(https://bugs.python.org/issue18378#msg215215)
.. [18] GitHub branch diff for ``ncoghlan:pep538-coerce-c-locale``
(https://github.com/python/cpython/compare/master...ncoghlan:pep538-coerce-c-locale)
@ -1250,13 +1250,3 @@ Copyright
This document has been placed in the public domain under the terms of the
CC0 1.0 license: https://creativecommons.org/publicdomain/zero/1.0/
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: