PEP 528, 529: Formatting fixes

Steve Dower 2016-09-06 11:03:28 -07:00
parent 5055d0ce94
commit 7a67822873
2 changed files with 11 additions and 10 deletions


@ -94,13 +94,13 @@ Assuming stdin/stdout encoding
Code that assumes that the encoding required by ``sys.stdin.buffer`` or
``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be
-working by chance, but could encounter issues under this change. For example:
+working by chance, but could encounter issues under this change. For example::
>>> sys.stdout.buffer.write(text.encode('mbcs'))
>>> r = sys.stdin.buffer.read(16).decode('cp437')
To correct this code, the encoding specified on the ``TextIOWrapper`` should be
-used, either implicitly or explicitly:
+used, either implicitly or explicitly::
>>> # Fix 1: Use wrapper correctly
>>> sys.stdout.write(text)
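Reviewer note: the explicit variant of this fix can be sketched as below. The helper name ``write_encoded`` is hypothetical and not part of the PEP; it assumes only that the stream is a ``TextIOWrapper``-like object exposing ``encoding``, ``errors`` and ``buffer``.

```python
import io


def write_encoded(stream, text):
    """Write text to stream.buffer using the encoding the wrapper declares.

    Hypothetical helper: instead of guessing 'mbcs' or a code page, ask the
    text wrapper which encoding its underlying buffer expects.
    """
    stream.flush()  # keep ordering with anything already written as text
    data = text.encode(stream.encoding, stream.errors or "strict")
    return stream.buffer.write(data)


# Usage with an in-memory stand-in for sys.stdout
demo = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
write_encoded(demo, "caf\u00e9\n")
```

Because the encoding is taken from the wrapper, the same call should keep working if the console encoding changes.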
@ -117,7 +117,7 @@ Code that uses the raw IO object and does not correctly handle partial reads and
writes may be affected. This is particularly important for reads, where the
number of characters read will never exceed one-fourth of the number of bytes
allowed, as there is no feasible way to prevent input from encoding as much
-longer utf-8 strings.
+longer utf-8 strings::
>>> raw_stdin = sys.stdin.buffer.raw
>>> data = raw_stdin.read(15)
@ -127,7 +127,7 @@ longer utf-8 strings.
# error, as "defghijklm\r\n" is passed to the interactive prompt
To correct this code, the buffered reader/writer should be used, or the caller
-should continue reading until its buffer is full.
+should continue reading until its buffer is full::
>>> # Fix 1: Use the buffered reader/writer
>>> stdin = sys.stdin.buffer
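Reviewer note: "continue reading until its buffer is full" might look like the sketch below. ``read_full`` and the ``TrickleRaw`` test double are hypothetical names used only for illustration. On the console raw object described in this PEP, a request for fewer than four bytes raises ``ValueError``, so the buffered reader is usually the simpler fix there.

```python
import io


def read_full(raw, size):
    """Read from a raw stream until `size` bytes arrive or EOF is reached."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = raw.read(remaining)
        if not chunk:  # b"" means EOF; None means no data available
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class TrickleRaw(io.RawIOBase):
    """Stand-in raw stream that returns at most `step` bytes per read."""

    def __init__(self, data, step):
        self._data = data
        self._step = step

    def readable(self):
        return True

    def readinto(self, b):
        n = min(len(b), self._step, len(self._data))
        b[:n] = self._data[:n]
        self._data = self._data[n:]
        return n


raw = TrickleRaw(b"abcdefghijklm\r\n", 4)
first = raw.read(15)        # a single raw read may return fewer bytes
rest = read_full(raw, 11)   # looping collects the remainder
```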
@ -148,7 +148,7 @@ Using the raw object with small buffers
Code that uses the raw IO object and attempts to read less than four characters
will now receive an error. Because it's possible that any single character may
-require up to four bytes when represented in utf-8, requests must fail.
+require up to four bytes when represented in utf-8, requests must fail::
>>> raw_stdin = sys.stdin.buffer.raw
>>> data = raw_stdin.read(3)
@ -156,7 +156,7 @@ require up to four bytes when represented in utf-8, requests must fail.
File "<stdin>", line 1, in <module>
ValueError: must read at least 4 bytes
-The only workaround is to pass a larger buffer.
+The only workaround is to pass a larger buffer::
>>> # Fix: Request at least four bytes
>>> raw_stdin = sys.stdin.buffer.raw
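Reviewer note: callers that really do want fewer than four bytes at a time could request a larger buffer and keep the surplus for the next call. The ``MinReadAdapter`` class below is a hypothetical sketch of that idea, not something the PEP defines, and an in-memory stream stands in for the console.

```python
import io


class MinReadAdapter:
    """Serve small reads from a raw stream that rejects small requests."""

    def __init__(self, raw, min_read=4):
        self._raw = raw
        self._min_read = min_read  # smallest request the raw stream accepts
        self._pending = b""        # bytes read but not yet handed out

    def read(self, size):
        if len(self._pending) < size:
            # Never ask the raw stream for fewer than min_read bytes.
            want = max(size - len(self._pending), self._min_read)
            chunk = self._raw.read(want)
            if chunk:
                self._pending += chunk
        out = self._pending[:size]
        self._pending = self._pending[size:]
        return out


adapter = MinReadAdapter(io.BytesIO(b"abcdefgh"))
head = adapter.read(3)  # the underlying request is for four bytes
```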


@ -68,6 +68,7 @@ cannot represent all Unicode characters, the conversion of a path into bytes can
lose information without warning or any available indication.
As a demonstration of this::
>>> open('test\uAB00.txt', 'wb').close()
>>> import glob
>>> glob.glob('test*')
@ -320,13 +321,13 @@ Not managing encodings across boundaries
Code that does not manage encodings when crossing protocol boundaries may
currently be working by chance, but could encounter issues when either encoding
-changes. For example:
+changes. For example::
>>> filename = open('filename_in_mbcs.txt', 'rb').read()
>>> text = open(filename, 'r').read()
To correct this code, the encoding of the bytes in ``filename`` should be
-specified, either when reading from the file or before using the value:
+specified, either when reading from the file or before using the value::
>>> # Fix 1: Open file as text
>>> filename = open('filename_in_mbcs.txt', 'r', encoding='mbcs').read()
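Reviewer note: the alternative of decoding "before using the value" can be sketched as follows. ``read_name_list`` is a hypothetical helper, and ``cp1252`` is used in the demo only because ``mbcs`` exists on Windows alone; the real encoding must come from whatever produced the file.

```python
import os
import tempfile


def read_name_list(path, encoding):
    """Return the file names stored in `path`, decoded with a known encoding."""
    with open(path, "rb") as f:
        raw = f.read()
    # Decode explicitly at the boundary so later file-system calls get str.
    return raw.decode(encoding).splitlines()


with tempfile.TemporaryDirectory() as tmp:
    listing = os.path.join(tmp, "names.txt")
    with open(listing, "wb") as f:
        f.write("caf\u00e9.txt\n".encode("cp1252"))
    names = read_name_list(listing, "cp1252")
```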
@ -341,13 +342,13 @@ Explicitly using 'mbcs'
-----------------------
Code that explicitly encodes text using 'mbcs' before passing to file system
-APIs is now passing incorrectly encoded bytes. For example:
+APIs is now passing incorrectly encoded bytes. For example::
>>> filename = open('files.txt', 'r').readline()
>>> text = open(filename.encode('mbcs'), 'r')
To correct this code, the string should be passed without explicit encoding, or
-should use ``os.fsencode()``:
+should use ``os.fsencode()``::
>>> # Fix 1: Do not encode the string
>>> filename = open('files.txt', 'r').readline()
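Reviewer note: both corrections can be exercised side by side, as in the sketch below. ``read_both_ways`` is a hypothetical name; ``os.fsencode()`` and ``os.fsdecode()`` are the standard-library functions the text refers to.

```python
import os
import tempfile


def read_both_ways(path):
    """Read a file via a str path and via bytes produced by os.fsencode()."""
    with open(path, "r") as f:               # pass the str unchanged
        via_str = f.read()
    with open(os.fsencode(path), "r") as f:  # let Python pick the encoding
        via_bytes = f.read()
    return via_str, via_bytes


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data.txt")
    with open(target, "w") as f:
        f.write("hello")
    result = read_both_ways(target)
    round_trip = os.fsdecode(os.fsencode(target))
```

Neither form names an encoding, so the code does not depend on which file system encoding the platform uses.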