PEP 686: Update (#2470)
parent
4d8bc00d99
commit
6ccd6bc428
82
pep-0686.rst
|
@ -54,22 +54,37 @@ Users can still disable UTF-8 mode by setting ``PYTHONUTF8=0`` or
|
||||||
``-X utf8=0``.
|
``-X utf8=0``.
|
||||||
|
|
||||||
|
|
||||||
``locale.get_encoding()``
|
``locale.getencoding()``
|
||||||
-------------------------
|
------------------------
|
||||||
|
|
||||||
Currently, ``TextIOWrapper`` uses ``locale.getpreferredencoding(False)``
|
Since UTF-8 mode affects ``locale.getpreferredencoding(False)``,
|
||||||
when ``encoding="locale"`` option is specified. It is ``"UTF-8"`` in UTF-8 mode.
|
we need an API to get locale encoding regardless of UTF-8 mode.
|
||||||
|
|
||||||
This behavior is inconsistent with the :pep:`597` motivation.
|
``locale.getencoding()`` will be added for this purpose.
|
||||||
|
It returns locale encoding too, but ignores UTF-8 mode.
|
||||||
|
|
||||||
|
When ``warn_default_encoding`` option is specified,
|
||||||
|
``locale.getpreferredencoding()`` will emit ``EncodingWarning`` like
|
||||||
|
``open()`` (see also :pep:`597`).
|
||||||
|
|
||||||
|
|
||||||
|
Fixing ``encoding="locale"`` option
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
|
:pep:`597` added the ``encoding="locale"`` option to the ``TextIOWrapper``.
|
||||||
|
This option is used to specify the locale encoding explicitly.
|
||||||
|
``TextIOWrapper`` should use locale encoding when the option is specified,
|
||||||
|
regardless of default text encoding.
|
||||||
|
|
||||||
|
But ``TextIOWrapper`` uses ``"UTF-8"`` in UTF-8 mode even if
|
||||||
|
``encoding="locale"`` is specified for now.
|
||||||
|
This behavior is inconsistent with the :pep:`597` motivation.
|
||||||
|
This is because we did not expect UTF-8 mode to become the default when Python
|
||||||
|
changes its default text encoding.
|
||||||
|
|
||||||
|
This inconsistency should be fixed before making UTF-8 mode default.
|
||||||
``TextIOWrapper`` should use locale encoding when ``encoding="locale"`` is
|
``TextIOWrapper`` should use locale encoding when ``encoding="locale"`` is
|
||||||
passed before/after the default encoding is changed to UTF-8.
|
passed even in UTF-8 mode.
|
||||||
|
|
||||||
To fix this inconsistency, we will add ``locale.get_encoding()``.
|
|
||||||
It is the same as ``locale.getpreferredencoding(False)`` but it ignores
|
|
||||||
the UTF-8 mode.
|
|
||||||
|
|
||||||
This change will be released in Python 3.11 so that users can use UTF-8 mode
|
|
||||||
that is the same as Python 3.13.
|
|
||||||
|
|
||||||
|
|
||||||
Backward Compatibility
|
Backward Compatibility
|
||||||
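The hunk above describes ``locale.getencoding()`` as returning the locale encoding while ignoring UTF-8 mode. A minimal sketch of how calling code might use it follows. It assumes Python 3.11 or later for ``locale.getencoding()``; the helper name ``current_locale_encoding`` and the fallback for older interpreters are illustrative assumptions, not part of the PEP.

```python
import locale


def current_locale_encoding() -> str:
    """Return the locale encoding, ignoring UTF-8 mode where the API allows it.

    Hypothetical helper for illustration only.
    """
    # locale.getencoding() exists on Python 3.11+ and ignores UTF-8 mode.
    getencoding = getattr(locale, "getencoding", None)
    if getencoding is not None:
        return getencoding()
    # Fallback for older versions: this call IS affected by UTF-8 mode.
    return locale.getpreferredencoding(False)


encoding_name = current_locale_encoding()
print(encoding_name)
```

On interpreters older than 3.11 the fallback still reports ``"UTF-8"`` (or ``"utf-8"``, depending on the version) when UTF-8 mode is enabled, so the two branches are not equivalent.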
|
@ -83,16 +98,18 @@ When a Python program depends on the default encoding, this change may cause
|
||||||
``UnicodeError``, mojibake, or even silent data corruption.
|
``UnicodeError``, mojibake, or even silent data corruption.
|
||||||
So this change should be announced loudly.
|
So this change should be announced loudly.
|
||||||
|
|
||||||
To resolve this backward incompatibility, users can do:
|
This is the guideline to fix this backward compatibility issue:
|
||||||
|
|
||||||
* Disable UTF-8 mode.
|
1. Disable UTF-8 mode.
|
||||||
* Use ``EncodingWarning`` to find where the default encoding is used and use
|
2. Use ``EncodingWarning`` (:pep:`597`) to find every place UTF-8 mode
|
||||||
``encoding="locale"`` option if locale encoding should be used
|
affects.
|
||||||
(as defined in :pep:`597`).
|
|
||||||
* Find every occurrence of ``locale.getpreferredencoding(False)`` in the
|
* If the ``encoding`` option is omitted, consider using ``encoding="utf-8"``
|
||||||
application, and replace it with ``locale.get_locale_encoding()`` if
|
or ``encoding="locale"``.
|
||||||
locale encoding should be used.
|
* If ``locale.getpreferredencoding()`` is used, consider using
|
||||||
* Test the application with UTF-8 mode.
|
``"utf-8"`` or ``locale.getencoding()``.
|
||||||
|
|
||||||
|
3. Test the application with UTF-8 mode.
|
||||||
|
|
||||||
|
|
||||||
Preceding examples
|
Preceding examples
|
||||||
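Step 2 of the guideline in the hunk above suggests passing ``encoding`` explicitly instead of relying on the default. A minimal sketch of that change is shown here. The helper ``read_text`` and the file name are hypothetical, and ``encoding="locale"`` assumes Python 3.10 or later (:pep:`597`).

```python
import os
import tempfile


def read_text(path, *, use_locale=False):
    """Read a text file with an explicit encoding (hypothetical helper)."""
    # "locale" explicitly requests the locale encoding (Python 3.10+);
    # "utf-8" pins the encoding regardless of locale or UTF-8 mode.
    encoding = "locale" if use_locale else "utf-8"
    with open(path, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    # Write with an explicit encoding too, so the round trip is deterministic.
    with open(path, "w", encoding="utf-8") as f:
        f.write("naïve café\n")
    text = read_text(path)

print(text)
```

Running the application with ``python -X warn_default_encoding`` makes ``open()`` calls that omit ``encoding`` emit ``EncodingWarning``, which helps locate the remaining call sites.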
|
@ -122,10 +139,31 @@ Additionally, such warnings are not useful for non-cross platform applications
|
||||||
run on Unix.
|
run on Unix.
|
||||||
|
|
||||||
So forcing users to specify the ``encoding`` everywhere is too painful.
|
So forcing users to specify the ``encoding`` everywhere is too painful.
|
||||||
|
Emitting a lot of ``DeprecationWarning`` will lead users to ignore warnings.
|
||||||
|
|
||||||
|
:pep:`387` requires adding a warning for backward incompatible changes.
|
||||||
|
But it doesn't require using ``DeprecationWarning``.
|
||||||
|
So using optional ``EncodingWarning`` doesn't violate :pep:`387`.
|
||||||
|
|
||||||
Java also rejected this idea in `JEP 400`_.
|
Java also rejected this idea in `JEP 400`_.
|
||||||
|
|
||||||
|
|
||||||
|
Use ``PYTHONIOENCODING`` for PIPEs
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
|
To ease the backward compatibility issue, using ``PYTHONIOENCODING`` as the
|
||||||
|
default encoding of PIPEs in the ``subprocess`` module is considered.
|
||||||
|
|
||||||
|
With this idea, users can use legacy encoding for
|
||||||
|
``subprocess.Popen(text=True)`` even in UTF-8 mode.
|
||||||
|
|
||||||
|
But this idea makes "default encoding" complicated.
|
||||||
|
And this idea is also backward incompatible.
|
||||||
|
|
||||||
|
So this idea is rejected. Users can disable UTF-8 mode until they replace
|
||||||
|
``text=True`` with ``encoding="utf-8"`` or ``encoding="locale"``.
|
||||||
|
|
||||||
|
|
||||||
How to teach this
|
How to teach this
|
||||||
=================
|
=================
|
||||||
|
|
||||||
|
|
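The rejected-idea section above recommends replacing ``text=True`` with an explicit ``encoding`` for ``subprocess`` pipes. A minimal sketch of that replacement follows. It assumes the child is a Python interpreter started with ``-X utf8`` so that its output is known to be UTF-8; the child program itself is purely illustrative.

```python
import subprocess
import sys

# Child program that prints a non-ASCII string ("café") to stdout.
child_code = "print('caf\\u00e9')"

result = subprocess.run(
    # -X utf8 forces the child's stdout to UTF-8 regardless of locale.
    [sys.executable, "-X", "utf8", "-c", child_code],
    stdout=subprocess.PIPE,
    # Explicit encoding instead of text=True: the pipe is decoded as UTF-8
    # whether or not UTF-8 mode is enabled in the parent process.
    encoding="utf-8",
    check=True,
)
output = result.stdout.strip()
print(output)
```

When the child writes in a legacy locale encoding instead, ``encoding="locale"`` (Python 3.10+) would be the corresponding explicit choice.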