* Add "POSIX locale used by mistake" section
* Add a lot of issues in the Links section
Victor Stinner 2017-01-06 13:57:10 +01:00
parent 5b6b25f5d9
commit 3c6b56f10c
1 changed files with 67 additions and 3 deletions


@@ -98,6 +98,20 @@ See the `POSIX locale (2016 Edition)
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html>`_.
POSIX locale used by mistake
----------------------------
In many cases, users do not actually want the POSIX locale: they get
it by mistake. Examples:
* Program started in an empty environment
* User forcing LANG=C to get messages in English
* LANG=C set for bad reasons, without being aware that it implies the
ASCII encoding
* SSH shell
* User locale set to a non-existing locale, for example because of a
typo in the locale name
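As a rough illustration of how these cases end up in the POSIX locale, the sketch below applies the usual precedence of the locale environment variables (``LC_ALL``, then ``LC_CTYPE``, then ``LANG``) to a mapping of variables. The function names are hypothetical, and the fallback to ``C`` when nothing is set is an assumption: POSIX leaves that default implementation-defined, although ``C`` is what common C libraries use.

```python
def effective_ctype_locale(env):
    """Return the locale name selected for LC_CTYPE by the usual precedence.

    `env` is any mapping of environment variables, for example os.environ.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # an unset variable and an empty one both fall through
            return value
    # Assumption: nothing is set, so the implementation default applies,
    # taken here to be the C (POSIX) locale.
    return "C"


def is_posix_locale(env):
    """Tell whether the selected LC_CTYPE locale is the POSIX locale."""
    return effective_ctype_locale(env) in ("C", "POSIX")
```

Calling ``is_posix_locale(os.environ)`` early in a program would be one way to log a hint for the user. The check only looks at names: a locale that does not exist (the typo case) is not detected here, because that failure only shows up when ``setlocale()`` is called.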
C.UTF-8 and C.utf8 locales
--------------------------
@@ -484,12 +498,15 @@ PEPs:
* PEP 529: "Change Windows filesystem encoding to UTF-8"
* PEP 383: "Non-decodable Bytes in System Character Interfaces"
Python issues:
Main Python issues:
* `Issue #28180: sys.getfilesystemencoding() should default to utf-8
<http://bugs.python.org/issue28180>`_
* `Issue #19846: Python 3 raises Unicode errors with the C locale
<http://bugs.python.org/issue19846>`_
* `Issue #19977: Use "surrogateescape" error handler for sys.stdin and
sys.stdout on UNIX for the C locale
<http://bugs.python.org/issue19977>`_
* `Issue #19847: Setting the default filesystem-encoding
<http://bugs.python.org/issue19847>`_
* `Issue #8622: Add PYTHONFSENCODING environment variable
<https://bugs.python.org/issue8622>`_: added but reverted because of
many issues, read the `Inconsistencies if locale and filesystem
@@ -497,6 +514,53 @@ Python issues:
<https://mail.python.org/pipermail/python-dev/2010-October/104509.html>`_
thread on the python-dev mailing list
Incomplete list of Python issues related to Unicode errors, especially
with the POSIX locale:
* 2016-12-22: `LANG=C python3 -c "import os; os.path.exists('\xff')"
<http://bugs.python.org/issue29042#msg283821>`_
* 2014-07-20: `Issue #22016: Add a new 'surrogatereplace' output only error handler
<http://bugs.python.org/issue22016>`_
* 2014-04-27: `Issue #21368: Check for systemd locale on startup if current
locale is set to POSIX <http://bugs.python.org/issue21368>`_ -- proposes
reading /etc/locale.conf manually when the locale is POSIX
* 2014-01-21: `Issue #20329: zipfile.extractall fails in Posix shell with utf-8
filename
<http://bugs.python.org/issue20329>`_
* 2013-11-30: `Issue #19846: Python 3 raises Unicode errors with the C locale
<http://bugs.python.org/issue19846>`_
* 2010-05-04: `Issue #8610: Python3/POSIX: errors if file system encoding is None
<http://bugs.python.org/issue8610>`_
* 2013-08-12: `Issue #18713: Clearly document the use of PYTHONIOENCODING to
set surrogateescape <http://bugs.python.org/issue18713>`_
* 2013-09-27: `Issue #19100: Use backslashreplace in pprint
<http://bugs.python.org/issue19100>`_
* 2012-01-05: `Issue #13717: os.walk() + print fails with UnicodeEncodeError
<http://bugs.python.org/issue13717>`_
* 2011-12-20: `Issue #13643: 'ascii' is a bad filesystem default encoding
<http://bugs.python.org/issue13643>`_
* 2011-03-16: `Issue #11574: TextIOWrapper should use UTF-8 by default for the
POSIX locale
<http://bugs.python.org/issue11574>`_, thread on python-dev:
`Low-Level Encoding Behavior on Python 3
<https://mail.python.org/pipermail/python-dev/2011-March/109361.html>`_
* 2010-04-26: `Issue #8533: regrtest: use backslashreplace error handler for
stdout <http://bugs.python.org/issue8533>`_, regrtest fails with Unicode
encode error if the locale is POSIX
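Several of the issues above revolve around the ``surrogateescape`` and ``backslashreplace`` error handlers. The following minimal sketch shows what each handler does with the ASCII codec, which is the encoding implied by the POSIX locale. The helper names are hypothetical and are not taken from any of the linked issues.

```python
def printable_ascii(text):
    """Render text as pure ASCII, escaping characters that do not fit.

    With the default "strict" handler the encode step would raise
    UnicodeEncodeError on any non-ASCII character.
    """
    return text.encode("ascii", "backslashreplace").decode("ascii")


def roundtrip_undecodable(raw):
    """Decode bytes as ASCII without losing the undecodable ones.

    "surrogateescape" maps each undecodable byte to a lone surrogate
    (0xff becomes U+DCFF), and encoding with the same handler restores
    the original byte.
    """
    text = raw.decode("ascii", "surrogateescape")
    return text, text.encode("ascii", "surrogateescape")
```

``backslashreplace`` is lossy but safe for display, which is why it suits output such as test logs. ``surrogateescape`` keeps the original bytes recoverable, which matters for file names that must be passed back to the operating system unchanged.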
Some issues are real bugs in applications, which must set the encoding
explicitly. Such code works in the common case, when the locale is
configured correctly, so the bug goes unnoticed. The program then
"suddenly" fails when the POSIX locale is used (probably for bad
reasons). Users do not understand this kind of bug well. Examples of
such issues:
* 2013-11-21: `pip: open() uses the locale encoding to parse Python
script, instead of the encoding cookie
<http://bugs.python.org/issue19685>`_ -- pip must use the encoding
cookie to read a Python source code file
* 2011-01-21: `IDLE 3.x can crash decoding recent file list
<http://bugs.python.org/issue10974>`_
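To illustrate what "setting the encoding explicitly" can look like, here is a minimal sketch with two hypothetical helpers. The first reads a Python source file with ``tokenize.open()``, which honors the encoding cookie (or a BOM) instead of the locale encoding. The second reads a text file with a fixed encoding; UTF-8 is only an assumption here and should be whatever the file format actually specifies.

```python
import tokenize


def read_python_source(path):
    """Read a Python source file using its declared encoding.

    tokenize.open() detects the encoding from the coding cookie or a
    BOM, so the result does not depend on the current locale.
    """
    with tokenize.open(path) as source_file:
        return source_file.read()


def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the locale.

    Without the encoding argument, open() falls back to the locale
    encoding, which is ASCII under the POSIX locale.
    """
    with open(path, encoding="utf-8") as text_file:
        return text_file.read()
```

Both helpers behave the same under ``LANG=C`` and under a UTF-8 locale, which is the point: the result no longer depends on how the environment happens to be configured.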
Prior Art
=========