PEP 528: Readability and style updates
parent a73c6dfc9c
commit 37567a1d72
79 pep-0528.txt
@@ -21,8 +21,7 @@ the active code page.
 This PEP proposes changing the default standard stream implementation on Windows
 to use the Unicode APIs. This will allow users to print and input the full range
 of Unicode characters at the default Windows console. This also requires a
-subtle change to how the tokenizer parses text from readline hooks, that should
-have no backwards compatibility issues.
+subtle change to how the tokenizer parses text from readline hooks.

 Specific Changes
 ================
@@ -46,7 +45,7 @@ utf-16-le and converted into utf-8 when returned to Python.

 The use of an ASCII compatible encoding is required to maintain compatibility
 with code that bypasses the ``TextIOWrapper`` and directly writes ASCII bytes to
-the standard streams (for example, [process_stdinreader.py]_). Code that assumes
+the standard streams (for example, `Twisted's process_stdinreader.py`_). Code that assumes
 a particular encoding for the standard streams other than ASCII will likely
 break.

@@ -78,8 +77,9 @@ behaviour.
 Alternative Approaches
 ======================

-The ``win_unicode_console`` package [win_unicode_console]_ is a pure-Python
-alternative to changing the default behaviour of the console.
+The `win_unicode_console package`_ is a pure-Python alternative to changing the
+default behaviour of the console. It implements essentially the same
+modifications as described here using pure Python code.

 Code that may break
 ===================
@@ -94,21 +94,21 @@ Assuming stdin/stdout encoding

 Code that assumes that the encoding required by ``sys.stdin.buffer`` or
 ``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be
-working by chance, but could encounter issues under this change. For example::
+working by chance, but could encounter issues under this change. For example:

-sys.stdout.buffer.write(text.encode('mbcs'))
-r = sys.stdin.buffer.read(16).decode('cp437')
+>>> sys.stdout.buffer.write(text.encode('mbcs'))
+>>> r = sys.stdin.buffer.read(16).decode('cp437')

 To correct this code, the encoding specified on the ``TextIOWrapper`` should be
-used, either implicitly or explicitly::
+used, either implicitly or explicitly:

-# Fix 1: Use wrapper correctly
-sys.stdout.write(text)
-r = sys.stdin.read(16)
+>>> # Fix 1: Use wrapper correctly
+>>> sys.stdout.write(text)
+>>> r = sys.stdin.read(16)

-# Fix 2: Use encoding explicitly
-sys.stdout.buffer.write(text.encode(sys.stdout.encoding))
-r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding)
+>>> # Fix 2: Use encoding explicitly
+>>> sys.stdout.buffer.write(text.encode(sys.stdout.encoding))
+>>> r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding)

 Incorrectly using the raw object
 --------------------------------
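As a sketch of the second fix in the hunk above (using the wrapper's encoding explicitly), the helper below encodes text with whatever encoding the text stream reports instead of a hard-coded ``'mbcs'``. The name ``write_via_buffer``, the ``utf-8`` fallback, and the in-memory stand-in for ``sys.stdout`` are illustrative assumptions and are not part of the PEP.

```python
import io


def write_via_buffer(stream, text):
    """Write text through stream.buffer using the wrapper's own encoding."""
    # Flush pending text first so output written through the wrapper and
    # output written through the buffer stay in order.
    stream.flush()
    encoding = stream.encoding or "utf-8"  # fallback is an assumption
    errors = stream.errors or "strict"
    stream.buffer.write(text.encode(encoding, errors))
    stream.buffer.flush()


# Demonstration against an in-memory stream standing in for sys.stdout.
fake_stdout = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
write_via_buffer(fake_stdout, "caf\u00e9\n")
print(fake_stdout.buffer.getvalue())
```

The same call with ``sys.stdout`` in place of ``fake_stdout`` would pick up whichever encoding the console wrapper is configured with, which is the point of the fix.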
@@ -117,32 +117,57 @@ Code that uses the raw IO object and does not correctly handle partial reads and
 writes may be affected. This is particularly important for reads, where the
 number of characters read will never exceed one-fourth of the number of bytes
 allowed, as there is no feasible way to prevent input from encoding as much
-longer utf-8 strings::
+longer utf-8 strings.

->>> stdin = open(sys.stdin.fileno(), 'rb')
->>> data = stdin.raw.read(15)
+>>> raw_stdin = sys.stdin.buffer.raw
+>>> data = raw_stdin.read(15)
 abcdefghijklm
 b'abc'
 # data contains at most 3 characters, and never more than 12 bytes
 # error, as "defghijklm\r\n" is passed to the interactive prompt

 To correct this code, the buffered reader/writer should be used, or the caller
-should continue reading until its buffer is full.::
+should continue reading until its buffer is full.

-# Fix 1: Use the buffered reader/writer
->>> stdin = open(sys.stdin.fileno(), 'rb')
+>>> # Fix 1: Use the buffered reader/writer
+>>> stdin = sys.stdin.buffer
 >>> data = stdin.read(15)
 abcedfghijklm
 b'abcdefghijklm\r\n'

-# Fix 2: Loop until enough bytes have been read
->>> stdin = open(sys.stdin.fileno(), 'rb')
+>>> # Fix 2: Loop until enough bytes have been read
+>>> raw_stdin = sys.stdin.buffer.raw
 >>> b = b''
 >>> while len(b) < 15:
-...     b += stdin.raw.read(15)
+...     b += raw_stdin.read(15)
 abcedfghijklm
 b'abcdefghijklm\r\n'

+Using the raw object with small buffers
+---------------------------------------
+
+Code that uses the raw IO object and attempts to read less than four characters
+will now receive an error. Because it's possible that any single character may
+require up to four bytes when represented in utf-8, requests must fail.
+
+>>> raw_stdin = sys.stdin.buffer.raw
+>>> data = raw_stdin.read(3)
+Traceback (most recent call last):
+File "<stdin>", line 1, in <module>
+ValueError: must read at least 4 bytes
+
+The only workaround is to pass a larger buffer.
+
+>>> # Fix: Request at least four bytes
+>>> raw_stdin = sys.stdin.buffer.raw
+>>> data = raw_stdin.read(4)
+a
+b'a'
+>>> >>>
+
+(The extra ``>>>`` is due to the newline remaining in the input buffer and is
+expected in this situation.)
+
 Copyright
 =========

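The two raw-object caveats in the hunk above (partial reads, and the minimum request of four bytes) can be handled together in one helper. This is a minimal sketch under stated assumptions: ``read_at_least`` and ``QuarterReader`` are hypothetical names, and ``QuarterReader`` only imitates the behaviour the PEP text describes (at most one-fourth of the requested bytes per call, ``ValueError`` below four bytes). It is not the real console reader. Unlike the plain ``while`` loop in Fix 2, the helper also stops at end of input instead of spinning on empty reads.

```python
import io


def read_at_least(raw, n):
    """Read from a raw stream until at least n bytes arrive or input ends."""
    chunks = []
    total = 0
    while total < n:
        # Never request fewer than 4 bytes, per the small-buffer rule above.
        chunk = raw.read(max(n - total, 4))
        if not chunk:
            # b'' signals EOF; None means a non-blocking stream has no data.
            break
        chunks.append(chunk)
        total += len(chunk)
    return b"".join(chunks)


class QuarterReader(io.RawIOBase):
    """Hypothetical stand-in for the console reader, for demonstration only."""

    def __init__(self, data):
        self._source = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, b):
        if len(b) < 4:
            raise ValueError("must read at least 4 bytes")
        # Return at most one-fourth of the bytes the caller allowed.
        chunk = self._source.read(len(b) // 4)
        b[:len(chunk)] = chunk
        return len(chunk)


raw = QuarterReader(b"abcdefghijklm\r\n")
print(read_at_least(raw, 15))
```

Because every call requests at least four bytes, the result may be up to three bytes longer than ``n``; callers that need an exact length would have to keep the surplus for the next read.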
@@ -151,7 +176,5 @@ This document has been placed in the public domain.
 References
 ==========

-.. [process_stdinreader.py] Twisted's process_stdinreader.py
-(https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py)
-.. [win_unicode_console] win_unicode_console package
-(https://pypi.org/project/win_unicode_console/)
+.. _Twisted's process_stdinreader.py: https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py
+.. _win_unicode_console package: https://pypi.org/project/win_unicode_console/