PEP 538: Fix footnotes (#2714)
Commit 99c0bda3df (parent 3a792e4a1d)

pep-0538.txt (66 lines changed)
@@ -124,7 +124,7 @@ can cause problems in some situations (for example, when using the GNU readline
 module [16_]).
 
 On non-Apple and non-Android \*nix systems, these operations are handled using
-the C locale system in glibc, which has the following characteristics [4_]:
+the C locale system in glibc, which has the following characteristics [4]_:
 
 * by default, all processes start in the ``C`` locale, which uses ``ASCII``
 for these conversions. This is almost never what anyone doing multilingual
@@ -136,7 +136,7 @@ the C locale system in glibc, which has the following characteristics [4_]:
 
 The specific locale category that covers the APIs that CPython depends on is
 ``LC_CTYPE``, which applies to "classification and conversion of characters,
-and to multibyte and wide characters" [5_]. Accordingly, CPython includes the
+and to multibyte and wide characters" [5]_. Accordingly, CPython includes the
 following key calls to ``setlocale``:
 
 * in the main ``python`` binary, CPython calls ``setlocale(LC_ALL, "")`` to
@@ -183,7 +183,7 @@ Mac OS X and other \*BSD systems have taken a different approach: instead of
 offering a ``C.UTF-8`` locale, they offer a partial ``UTF-8`` locale that only
 defines the ``LC_CTYPE`` category. On such systems, the preferred
 environmental locale adjustment is to set ``LC_CTYPE=UTF-8`` rather than to set
-``LC_ALL`` or ``LANG``. [17_]
+``LC_ALL`` or ``LANG``. [17]_
 
 In the specific case of Docker containers and similar technologies, the
 appropriate locale setting can be specified directly in the container image
@@ -247,8 +247,8 @@ Motivation
 While Linux container technologies like Docker, Kubernetes, and OpenShift are
 best known for their use in web service development, the related container
 formats and execution models are also being adopted for Linux command line
-application development. Technologies like Gnome Flatpak [7_] and
-Ubuntu Snappy [8_] further aim to bring these same techniques to Linux GUI
+application development. Technologies like Gnome Flatpak [7]_ and
+Ubuntu Snappy [8]_ further aim to bring these same techniques to Linux GUI
 application development.
 
 When using Python 3 for application development in these contexts, it isn't
@@ -327,7 +327,7 @@ with this problem automatically rather than relying on redistributors or end
 users to handle it through system configuration changes.
 
 While the glibc developers are working towards making the C.UTF-8 locale
-universally available for use by glibc based applications like CPython [6_],
+universally available for use by glibc based applications like CPython [6]_,
 this unfortunately doesn't help on platforms that ship older versions of glibc
 without that feature, and also don't provide C.UTF-8 (or an equivalent) as an
 on-disk locale the way Debian and Fedora do. These platforms are considered
@@ -649,7 +649,7 @@ Defaulting to "surrogateescape" error handling on the standard IO streams
 By coercing the locale away from the legacy C default and its assumption of
 ASCII as the preferred text encoding, this PEP also disables the implicit use
 of the "surrogateescape" error handler on the standard IO streams that was
-introduced in Python 3.5 ([15_]), as well as the automatic use of
+introduced in Python 3.5 ([15]_), as well as the automatic use of
 ``surrogateescape`` when operating in :pep:`540`'s proposed UTF-8 mode.
 
 Rather than introducing yet another configuration option to adjust that
@@ -662,7 +662,7 @@ provided text values are typically able to be transparently passed through a
 Python 3 application even if it is incorrect in assuming that that text has
 been encoded as UTF-8.
 
-In particular, GB 18030 [12_] is a Chinese national text encoding standard
+In particular, GB 18030 [12]_ is a Chinese national text encoding standard
 that handles all Unicode code points, that is formally incompatible with both
 ASCII and UTF-8, but will nevertheless often tolerate processing as surrogate
 escaped data - the points where GB 18030 reuses ASCII byte values in an
@@ -672,7 +672,7 @@ the relevant ASCII code points. Operations that don't involve splitting on or
 searching for particular ASCII or Unicode code point values are almost
 certain to work correctly.
 
-Similarly, Shift-JIS [13_] and ISO-2022-JP [14_] remain in widespread use in
+Similarly, Shift-JIS [13]_ and ISO-2022-JP [14]_ remain in widespread use in
 Japan, and are incompatible with both ASCII and UTF-8, but will tolerate text
 processing operations that don't involve splitting on or searching for
 particular ASCII or Unicode code point values.
@@ -908,7 +908,7 @@ This was later removed on the grounds that setting only ``LC_CTYPE`` is
 sufficient to handle all of the problematic scenarios that the PEP aimed
 to resolve, while setting ``LANG`` as well would break cases where ``LANG``
 was set correctly, and the locale problems were solely due to an incorrect
-``LC_CTYPE`` setting ([22_]).
+``LC_CTYPE`` setting ([22]_).
 
 For example, consider a Python application that called the Linux ``date``
 utility in a subprocess rather than doing its own date formatting::
@@ -1077,7 +1077,7 @@ be entirely redundant.
 However, that assumption turned out to be incorrect, as subsequent
 investigations showed that if you explicitly configure ``LANG=C`` on
 these platforms, extension modules like GNU readline will misbehave in much the
-same way as they do on other \*nix systems. [21_]
+same way as they do on other \*nix systems. [21]_
 
 In addition, Mac OS X is also frequently used as a development and testing
 platform for Python software intended for deployment to other \*nix environments
@@ -1093,12 +1093,12 @@ Implementation
 ==============
 
 The reference implementation is being developed in the
-``pep538-coerce-c-locale`` feature branch [18_] in Nick Coghlan's fork of the
-CPython repository on GitHub. A work-in-progress PR is available at [20_].
+``pep538-coerce-c-locale`` feature branch [18]_ in Nick Coghlan's fork of the
+CPython repository on GitHub. A work-in-progress PR is available at [20]_.
 
 This reference implementation covers not only the enhancement request in
-issue 28180 [1_], but also the Android compatibility fixes needed to resolve
-issue 28997 [16_].
+issue 28180 [1]_, but also the Android compatibility fixes needed to resolve
+issue 28997 [16]_.
 
 
 Backporting to earlier Python 3 releases
@@ -1115,7 +1115,7 @@ default, or else specifically for platforms where such a locale is already
 consistently available.
 
 At least the Fedora project is planning to pursue this approach for the
-upcoming Fedora 26 release [19_].
+upcoming Fedora 26 release [19]_.
 
 
 Backporting to other 3.x releases
@@ -1139,7 +1139,7 @@ Acknowledgements
 
 The locale coercion approach proposed in this PEP is inspired directly by
 Armin Ronacher's handling of this problem in the ``click`` command line
-utility development framework [2_]::
+utility development framework [2]_::
 
 $ LANG=C python3 -c 'import click; cli = click.command()(lambda:None); cli()'
 Traceback (most recent call last):
@@ -1157,18 +1157,18 @@ utility development framework [2_]::
 export LANG=C.UTF-8
 
 The change was originally proposed as a downstream patch for Fedora's
-system Python 3.6 package [3_], and then reformulated as a PEP for Python 3.7
+system Python 3.6 package [3]_, and then reformulated as a PEP for Python 3.7
 with a section allowing for backports to earlier versions by redistributors.
 In parallel with the development of the upstream patch, Charalampos Stratakis
 has been working on the Fedora 26 backport and providing feedback on the
 practical viability of the proposed changes.
 
-The initial draft was posted to the Python Linux SIG for discussion [10_] and
+The initial draft was posted to the Python Linux SIG for discussion [10]_ and
 then amended based on both that discussion and Victor Stinner's work in
-:pep:`540` [11_].
+:pep:`540` [11]_.
 
 The "ℙƴ☂ℌøἤ" string used in the Unicode handling examples throughout this PEP
-is taken from Ned Batchelder's excellent "Pragmatic Unicode" presentation [9_].
+is taken from Ned Batchelder's excellent "Pragmatic Unicode" presentation [9]_.
 
 Stephen Turnbull has long provided valuable insight into the text encoding
 handling challenges he regularly encounters at the University of Tsukuba
@@ -1179,16 +1179,16 @@ References
 ==========
 
 .. [1] CPython: sys.getfilesystemencoding() should default to utf-8
-(http://bugs.python.org/issue28180)
+(https://bugs.python.org/issue28180)
 
 .. [2] Locale configuration required for click applications under Python 3
-(http://click.pocoo.org/5/python3/#python-3-surrogate-handling)
+(https://click.palletsprojects.com/en/5.x/python3/#python-3-surrogate-handling)
 
 .. [3] Fedora: force C.UTF-8 when Python 3 is run under the C locale
 (https://bugzilla.redhat.com/show_bug.cgi?id=1404918)
 
 .. [4] GNU C: How Programs Set the Locale
-( https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html)
+(https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html)
 
 .. [5] GNU C: Locale Categories
 (https://www.gnu.org/software/libc/manual/html_node/Locale-Categories.html)
@@ -1197,13 +1197,13 @@ References
 (https://sourceware.org/glibc/wiki/Proposals/C.UTF-8)
 
 .. [7] GNOME Flatpak
-(http://flatpak.org/)
+(https://flatpak.org/)
 
 .. [8] Ubuntu Snappy
 (https://www.ubuntu.com/desktop/snappy)
 
 .. [9] Pragmatic Unicode
-(http://nedbatchelder.com/text/unipain.html)
+(https://nedbatchelder.com/text/unipain.html)
 
 .. [10] linux-sig discussion of initial PEP draft
 (https://mail.python.org/pipermail/linux-sig/2017-January/000014.html)
@@ -1224,10 +1224,10 @@ References
 (https://bugs.python.org/issue19977)
 
 .. [16] test_readline.test_nonascii fails on Android
-(http://bugs.python.org/issue28997)
+(https://bugs.python.org/issue28997)
 
 .. [17] UTF-8 locale discussion on "locale.getdefaultlocale() fails on Mac OS X with default language set to English"
-(http://bugs.python.org/issue18378#msg215215)
+(https://bugs.python.org/issue18378#msg215215)
 
 .. [18] GitHub branch diff for ``ncoghlan:pep538-coerce-c-locale``
 (https://github.com/python/cpython/compare/master...ncoghlan:pep538-coerce-c-locale)
@@ -1250,13 +1250,3 @@ Copyright
 
 This document has been placed in the public domain under the terms of the
 CC0 1.0 license: https://creativecommons.org/publicdomain/zero/1.0/
-
-
-..
-Local Variables:
-mode: indented-text
-indent-tabs-mode: nil
-sentence-end-double-space: t
-fill-column: 70
-coding: utf-8
-End: