PEP 528, 529: Formatting fixes
parent 5055d0ce94
commit 7a67822873

pep-0528.txt | 12
@@ -94,13 +94,13 @@ Assuming stdin/stdout encoding

 Code that assumes that the encoding required by ``sys.stdin.buffer`` or
 ``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be
-working by chance, but could encounter issues under this change. For example:
+working by chance, but could encounter issues under this change. For example::

     >>> sys.stdout.buffer.write(text.encode('mbcs'))
     >>> r = sys.stdin.buffer.read(16).decode('cp437')

 To correct this code, the encoding specified on the ``TextIOWrapper`` should be
-used, either implicitly or explicitly:
+used, either implicitly or explicitly::

     >>> # Fix 1: Use wrapper correctly
     >>> sys.stdout.write(text)
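
The hunk above shows only the implicit fix. For the explicit route, where the caller still writes bytes but takes the encoding from the ``TextIOWrapper`` instead of hard-coding ``'mbcs'``, a minimal sketch might look like the following. The helper name ``write_encoded`` is hypothetical and not part of the PEP, and the UTF-8 fallback is an assumption for wrappers that report no encoding.

```python
import io
import sys


def write_encoded(text, stream=None):
    """Write text to stream.buffer using the wrapper's own encoding.

    Hypothetical helper: it avoids hard-coding 'mbcs' or a code page.
    """
    stream = sys.stdout if stream is None else stream
    # Assumption: fall back to UTF-8 if the wrapper reports no encoding.
    encoding = stream.encoding or "utf-8"
    data = text.encode(encoding, errors=stream.errors or "strict")
    # Flush pending text-layer output first so the bytes stay in order.
    stream.flush()
    return stream.buffer.write(data)
```

Because the encoding is read from the stream at call time, the same call should keep working whether the console wrapper reports a code page or UTF-8.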
@@ -117,7 +117,7 @@ Code that uses the raw IO object and does not correctly handle partial reads and
 writes may be affected. This is particularly important for reads, where the
 number of characters read will never exceed one-fourth of the number of bytes
 allowed, as there is no feasible way to prevent input from encoding as much
-longer utf-8 strings.
+longer utf-8 strings::

     >>> raw_stdin = sys.stdin.buffer.raw
     >>> data = raw_stdin.read(15)
@@ -127,7 +127,7 @@ longer utf-8 strings.
     # error, as "defghijklm\r\n" is passed to the interactive prompt

 To correct this code, the buffered reader/writer should be used, or the caller
-should continue reading until its buffer is full.
+should continue reading until its buffer is full::

     >>> # Fix 1: Use the buffered reader/writer
     >>> stdin = sys.stdin.buffer
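
The second option, continuing to read until the buffer is full, could be sketched roughly as below. ``read_at_least`` and its ``min_request`` parameter are hypothetical names. The default of four reflects the minimum request size described under "Using the raw object with small buffers" in this diff, which is why the helper may return a few bytes more than asked for rather than issue a request that is too small.

```python
import io


def read_at_least(raw, size, min_request=4):
    """Call raw.read() repeatedly until at least `size` bytes arrive or EOF.

    Hypothetical helper. Each request asks for at least `min_request` bytes,
    so the result may hold up to `min_request - 1` bytes more than `size`.
    """
    data = bytearray()
    while len(data) < size:
        chunk = raw.read(max(size - len(data), min_request))
        if not chunk:
            # b"" signals EOF; None means a non-blocking stream had no data.
            break
        data += chunk
    return bytes(data)
```

A caller that needs exactly ``size`` bytes would have to keep the surplus for its next read; the sketch leaves that bookkeeping out.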
@@ -148,7 +148,7 @@ Using the raw object with small buffers

 Code that uses the raw IO object and attempts to read less than four characters
 will now receive an error. Because it's possible that any single character may
-require up to four bytes when represented in utf-8, requests must fail.
+require up to four bytes when represented in utf-8, requests must fail::

     >>> raw_stdin = sys.stdin.buffer.raw
     >>> data = raw_stdin.read(3)
@@ -156,7 +156,7 @@ require up to four bytes when represented in utf-8, requests must fail.
       File "<stdin>", line 1, in <module>
     ValueError: must read at least 4 bytes

-The only workaround is to pass a larger buffer.
+The only workaround is to pass a larger buffer::

     >>> # Fix: Request at least four bytes
     >>> raw_stdin = sys.stdin.buffer.raw

@@ -68,6 +68,7 @@ cannot represent all Unicode characters, the conversion of a path into bytes can
 lose information without warning or any available indication.

 As a demonstration of this::
+
     >>> open('test\uAB00.txt', 'wb').close()
     >>> import glob
     >>> glob.glob('test*')
@@ -320,13 +321,13 @@ Not managing encodings across boundaries

 Code that does not manage encodings when crossing protocol boundaries may
 currently be working by chance, but could encounter issues when either encoding
-changes. For example:
+changes. For example::

     >>> filename = open('filename_in_mbcs.txt', 'rb').read()
     >>> text = open(filename, 'r').read()

 To correct this code, the encoding of the bytes in ``filename`` should be
-specified, either when reading from the file or before using the value:
+specified, either when reading from the file or before using the value::

     >>> # Fix 1: Open file as text
     >>> filename = open('filename_in_mbcs.txt', 'r', encoding='mbcs').read()
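
The hunk above ends at the first fix. The other route the text names, specifying the encoding "before using the value", might be sketched as follows. ``read_listed_file`` is a hypothetical helper; stripping surrounding whitespace from the stored name and reading the target as UTF-8 are both assumptions made for the illustration.

```python
import os
import tempfile


def read_listed_file(list_path, list_encoding):
    """Read the file whose name is stored, in a known encoding, in list_path."""
    with open(list_path, "rb") as f:
        raw_name = f.read()
    # Decode at the boundary with the encoding the bytes were written in,
    # then pass a str (not bytes) to the file system API.
    # Assumption: surrounding whitespace is not part of the file name.
    filename = raw_name.decode(list_encoding).strip()
    # Assumption: the target file's contents are UTF-8 text.
    with open(filename, "r", encoding="utf-8") as f:
        return f.read()
```

Passing the encoding as a parameter keeps the choice visible at the call site, for example ``read_listed_file('filename_in_mbcs.txt', 'mbcs')`` on Windows.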
@@ -341,13 +342,13 @@ Explicitly using 'mbcs'
 -----------------------

 Code that explicitly encodes text using 'mbcs' before passing to file system
-APIs is now passing incorrectly encoded bytes. For example:
+APIs is now passing incorrectly encoded bytes. For example::

     >>> filename = open('files.txt', 'r').readline()
     >>> text = open(filename.encode('mbcs'), 'r')

 To correct this code, the string should be passed without explicit encoding, or
-should use ``os.fsencode()``:
+should use ``os.fsencode()``::

     >>> # Fix 1: Do not encode the string
     >>> filename = open('files.txt', 'r').readline()
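
The ``os.fsencode()`` alternative mentioned in the hunk above is not shown in this excerpt. A minimal sketch, assuming the caller genuinely needs a bytes path, could be the following. ``open_via_bytes_path`` is a hypothetical name, and opening the target as UTF-8 text is an assumption.

```python
import os
import tempfile


def open_via_bytes_path(filename):
    """Open filename through its bytes form, encoded the way the OS expects."""
    # os.fsencode() applies the file system encoding and its error handler,
    # so the bytes remain correct if that encoding changes.
    # Assumption: the file's contents are UTF-8 text.
    return open(os.fsencode(filename), "r", encoding="utf-8")
```

``os.fsdecode()`` is the inverse, so a bytes path produced this way can be turned back into the original string.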