diff --git a/pep-0540.txt b/pep-0540.txt
index df3631ca3..07e2b19e2 100644
--- a/pep-0540.txt
+++ b/pep-0540.txt
@@ -98,6 +98,20 @@
 See the `POSIX locale (2016 Edition)
 `_.
 
+POSIX locale used by mistake
+----------------------------
+
+In many cases, the POSIX locale is not really expected by users who get
+it by mistake. Examples:
+
+* program started in an empty environment
+* User forcing LANG=C to get messages in english
+* LANG=C used for bad reasons, without being aware of the ASCII encoding
+* SSH shell
+* User locale set to a non-existing locale, typo in the locale name for
+  example
+
+
 C.UTF-8 and C.utf8 locales
 --------------------------
 
@@ -484,12 +498,15 @@ PEPs:
 
 * PEP 529: "Change Windows filesystem encoding to UTF-8"
 * PEP 383: "Non-decodable Bytes in System Character Interfaces"
 
-Python issues:
+Main Python issues:
 
 * `issue #28180: sys.getfilesystemencoding() should default to utf-8
   `_
-* `Issue #19846: Python 3 raises Unicode errors with the C locale
-  `_
+* `Issue #19977: Use "surrogateescape" error handler for sys.stdin and
+  sys.stdout on UNIX for the C locale
+  `_
+* `Issue #19847: Setting the default filesystem-encoding
+  `_
 * `Issue #8622: Add PYTHONFSENCODING environment variable
   `_: added but reverted because of many issues, read the
   `Inconsistencies if locale and filesystem
@@ -497,6 +514,53 @@ Python issues:
   `_
   thread on the python-dev mailing list
 
+Incomplete list of Python issues related to Unicode errors, especially
+with the POSIX locale:
+
+* 2016-12-22: `LANG=C python3 -c "import os; os.path.exists('\xff')"
+  `_
+* 2014-07-20: `issue #22016: Add a new 'surrogatereplace' output only error handler
+  `_
+* 2014-04-27: `Issue #21368: Check for systemd locale on startup if current
+  locale is set to POSIX `_ -- read manually
+  /etc/locale.conf when the locale is POSIX
+* 2014-01-21: `Issue #20329: zipfile.extractall fails in Posix shell with utf-8
+  filename
+  `_
+* 2013-11-30: `Issue #19846: Python 3 raises Unicode errors with the C locale
+  `_
+* 2010-05-04: `Issue #8610: Python3/POSIX: errors if file system encoding is None
+  `_
+* 2013-08-12: `Issue #18713: Clearly document the use of PYTHONIOENCODING to
+  set surrogateescape `_
+* 2013-09-27: `Issue #19100: Use backslashreplace in pprint
+  `_
+* 2012-01-05: `Issue #13717: os.walk() + print fails with UnicodeEncodeError
+  `_
+* 2011-12-20: `Issue #13643: 'ascii' is a bad filesystem default encoding
+  `_
+* 2011-03-16: `issue #11574: TextIOWrapper should use UTF-8 by default for the
+  POSIX locale
+  `_, thread on python-dev:
+  `Low-Level Encoding Behavior on Python 3
+  `_
+* 2010-04-26: `Issue #8533: regrtest: use backslashreplace error handler for
+  stdout `_, regrtest fails with Unicode
+  encode error if the locale is POSIX
+
+Some issues are real bug in applications which must set explicitly the
+encoding. Well, it just works in the common case (locale configured
+correctly), so what? But the program "suddenly" fails when the POSIX
+locale is used (probably for bad reasons). Such bug is not well
+understood by users. Example of such issue:
+
+* 2013-11-21: `pip: open() uses the locale encoding to parse Python
+  script, instead of the encoding cookie
+  `_ -- pip must use the encoding
+  cookie to read a Python source code file
+* 2011-01-21: `IDLE 3.x can crash decoding recent file list
+  `_
+
 Prior Art
 =========
 