PEP 528: Readability and style updates
commit 37567a1d72
parent a73c6dfc9c

pep-0528.txt: 79 lines changed
@@ -21,8 +21,7 @@ the active code page.
 This PEP proposes changing the default standard stream implementation on Windows
 to use the Unicode APIs. This will allow users to print and input the full range
 of Unicode characters at the default Windows console. This also requires a
-subtle change to how the tokenizer parses text from readline hooks, that should
-have no backwards compatibility issues.
+subtle change to how the tokenizer parses text from readline hooks.
 
 Specific Changes
 ================
@@ -46,7 +45,7 @@ utf-16-le and converted into utf-8 when returned to Python.
 
 The use of an ASCII compatible encoding is required to maintain compatibility
 with code that bypasses the ``TextIOWrapper`` and directly writes ASCII bytes to
-the standard streams (for example, [process_stdinreader.py]_). Code that assumes
+the standard streams (for example, `Twisted's process_stdinreader.py`_). Code that assumes
 a particular encoding for the standard streams other than ASCII will likely
 break.
 
@@ -78,8 +77,9 @@ behaviour.
 Alternative Approaches
 ======================
 
-The ``win_unicode_console`` package [win_unicode_console]_ is a pure-Python
-alternative to changing the default behaviour of the console.
+The `win_unicode_console package`_ is a pure-Python alternative to changing the
+default behaviour of the console. It implements essentially the same
+modifications as described here using pure Python code.
 
 Code that may break
 ===================
@@ -94,21 +94,21 @@ Assuming stdin/stdout encoding
 
 Code that assumes that the encoding required by ``sys.stdin.buffer`` or
 ``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be
-working by chance, but could encounter issues under this change. For example::
+working by chance, but could encounter issues under this change. For example:
 
-    sys.stdout.buffer.write(text.encode('mbcs'))
-    r = sys.stdin.buffer.read(16).decode('cp437')
+    >>> sys.stdout.buffer.write(text.encode('mbcs'))
+    >>> r = sys.stdin.buffer.read(16).decode('cp437')
 
 To correct this code, the encoding specified on the ``TextIOWrapper`` should be
-used, either implicitly or explicitly::
+used, either implicitly or explicitly:
 
-    # Fix 1: Use wrapper correctly
-    sys.stdout.write(text)
-    r = sys.stdin.read(16)
+    >>> # Fix 1: Use wrapper correctly
+    >>> sys.stdout.write(text)
+    >>> r = sys.stdin.read(16)
 
-    # Fix 2: Use encoding explicitly
-    sys.stdout.buffer.write(text.encode(sys.stdout.encoding))
-    r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding)
+    >>> # Fix 2: Use encoding explicitly
+    >>> sys.stdout.buffer.write(text.encode(sys.stdout.encoding))
+    >>> r = sys.stdin.buffer.read(16).decode(sys.stdin.encoding)
 
 Incorrectly using the raw object
 --------------------------------
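Review note on the hunk above: the two fixes can be exercised away from a Windows console with an in-memory stream. The following is only a sketch under stated assumptions: an ``io.TextIOWrapper`` over ``io.BytesIO`` stands in for ``sys.stdout``, and ``make_stream`` and ``write_via_buffer`` are hypothetical helper names that do not appear in the PEP.

```python
import io


def make_stream(encoding):
    """Build an in-memory stand-in for sys.stdout (hypothetical helper)."""
    return io.TextIOWrapper(io.BytesIO(), encoding=encoding, newline="")


def write_via_buffer(stream, text):
    """Fix 2: encode with the wrapper's own encoding, not a hard-coded one."""
    stream.flush()  # push pending wrapper output first so ordering is kept
    stream.buffer.write(text.encode(stream.encoding))
    stream.buffer.flush()


out = make_stream("utf-8")
out.write("caf\u00e9 ")          # Fix 1: let the wrapper do the encoding
write_via_buffer(out, "\u20ac")  # Fix 2: encode explicitly via stream.encoding
payload = out.buffer.getvalue()  # bytes that reached the underlying buffer
```

Because both paths use the wrapper's declared encoding, ``payload`` decodes cleanly with that same encoding; bytes produced with a hard-coded ``'cp437'`` or ``'mbcs'`` would not be guaranteed to.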
@@ -117,32 +117,57 @@ Code that uses the raw IO object and does not correctly handle partial reads and
 writes may be affected. This is particularly important for reads, where the
 number of characters read will never exceed one-fourth of the number of bytes
 allowed, as there is no feasible way to prevent input from encoding as much
-longer utf-8 strings::
+longer utf-8 strings.
 
-    >>> stdin = open(sys.stdin.fileno(), 'rb')
-    >>> data = stdin.raw.read(15)
+    >>> raw_stdin = sys.stdin.buffer.raw
+    >>> data = raw_stdin.read(15)
     abcdefghijklm
     b'abc'
     # data contains at most 3 characters, and never more than 12 bytes
     # error, as "defghijklm\r\n" is passed to the interactive prompt
 
 To correct this code, the buffered reader/writer should be used, or the caller
-should continue reading until its buffer is full.::
+should continue reading until its buffer is full.
 
-    # Fix 1: Use the buffered reader/writer
-    >>> stdin = open(sys.stdin.fileno(), 'rb')
+    >>> # Fix 1: Use the buffered reader/writer
+    >>> stdin = sys.stdin.buffer
     >>> data = stdin.read(15)
     abcedfghijklm
     b'abcdefghijklm\r\n'
 
-    # Fix 2: Loop until enough bytes have been read
-    >>> stdin = open(sys.stdin.fileno(), 'rb')
+    >>> # Fix 2: Loop until enough bytes have been read
+    >>> raw_stdin = sys.stdin.buffer.raw
     >>> b = b''
     >>> while len(b) < 15:
-    ...     b += stdin.raw.read(15)
+    ...     b += raw_stdin.read(15)
     abcedfghijklm
     b'abcdefghijklm\r\n'
 
+Using the raw object with small buffers
+---------------------------------------
+
+Code that uses the raw IO object and attempts to read less than four characters
+will now receive an error. Because it's possible that any single character may
+require up to four bytes when represented in utf-8, requests must fail.
+
+    >>> raw_stdin = sys.stdin.buffer.raw
+    >>> data = raw_stdin.read(3)
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+    ValueError: must read at least 4 bytes
+
+The only workaround is to pass a larger buffer.
+
+    >>> # Fix: Request at least four bytes
+    >>> raw_stdin = sys.stdin.buffer.raw
+    >>> data = raw_stdin.read(4)
+    a
+    b'a'
+    >>> >>>
+
+(The extra ``>>>`` is due to the newline remaining in the input buffer and is
+expected in this situation.)
+
 Copyright
 =========
 
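Review note on the hunk above: the "Fix 2" loop never terminates if input ends before 15 bytes arrive. A possible variant that stops at end of input is sketched below. It is illustrative only: ``ChunkyRaw`` is a hypothetical stand-in for the console's raw object that hands back short reads, and ``read_at_least`` is a hypothetical helper name; neither comes from the PEP. The four-byte floor on ``request`` mirrors the limit described in the new "small buffers" section.

```python
import io


class ChunkyRaw(io.RawIOBase):
    """Hypothetical raw stream that returns short reads, like a console."""

    def __init__(self, data, chunk):
        super().__init__()
        self._data = data
        self._chunk = chunk

    def readable(self):
        return True

    def readinto(self, b):
        # Hand back at most `chunk` bytes per call, however many were requested.
        n = min(len(b), self._chunk, len(self._data))
        b[:n] = self._data[:n]
        self._data = self._data[n:]
        return n


def read_at_least(raw, size, request=16):
    """Loop over partial reads until `size` bytes arrive or input ends.

    May return more than `size` bytes, because every call asks for `request`.
    """
    if request < 4:
        raise ValueError("request must be at least 4 bytes")
    buf = bytearray()
    while len(buf) < size:
        chunk = raw.read(request)
        if not chunk:  # b'' at end of input, None if no data is available
            break
        buf += chunk
    return bytes(buf)


data = read_at_least(ChunkyRaw(b"abcdefghijklm\r\n", chunk=3), 15, request=15)
short = read_at_least(ChunkyRaw(b"ab", chunk=3), 15)
```

Here ``data`` is assembled from five three-byte reads, while ``short`` returns what was available instead of looping forever.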
@@ -151,7 +176,5 @@ This document has been placed in the public domain.
 References
 ==========
 
-.. [process_stdinreader.py] Twisted's process_stdinreader.py
-   (https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py)
-.. [win_unicode_console] win_unicode_console package
-   (https://pypi.org/project/win_unicode_console/)
+.. _Twisted's process_stdinreader.py: https://github.com/twisted/twisted/blob/trunk/src/twisted/test/process_stdinreader.py
+.. _win_unicode_console package: https://pypi.org/project/win_unicode_console/