diff --git a/pep-0528.txt b/pep-0528.txt
index ab9ef94ae..c11e70069 100644
--- a/pep-0528.txt
+++ b/pep-0528.txt
@@ -94,13 +94,13 @@ Assuming stdin/stdout encoding
 
 Code that assumes that the encoding required by ``sys.stdin.buffer`` or
 ``sys.stdout.buffer`` is ``'mbcs'`` or a more specific encoding may currently be
-working by chance, but could encounter issues under this change. For example:
+working by chance, but could encounter issues under this change. For example::
 
     >>> sys.stdout.buffer.write(text.encode('mbcs'))
     >>> r = sys.stdin.buffer.read(16).decode('cp437')
 
 To correct this code, the encoding specified on the ``TextIOWrapper`` should be
-used, either implicitly or explicitly:
+used, either implicitly or explicitly::
 
     >>> # Fix 1: Use wrapper correctly
     >>> sys.stdout.write(text)
@@ -117,7 +117,7 @@ Code that uses the raw IO object and does not correctly handle partial reads and
 writes may be affected. This is particularly important for reads, where the
 number of characters read will never exceed one-fourth of the number of bytes
 allowed, as there is no feasible way to prevent input from encoding as much
-longer utf-8 strings.
+longer utf-8 strings::
 
     >>> raw_stdin = sys.stdin.buffer.raw
     >>> data = raw_stdin.read(15)
@@ -127,7 +127,7 @@ longer utf-8 strings.
     # error, as "defghijklm\r\n" is passed to the interactive prompt
 
 To correct this code, the buffered reader/writer should be used, or the caller
-should continue reading until its buffer is full.
+should continue reading until its buffer is full::
 
     >>> # Fix 1: Use the buffered reader/writer
     >>> stdin = sys.stdin.buffer
@@ -148,7 +148,7 @@ Using the raw object with small buffers
 
 Code that uses the raw IO object and attempts to read less than four characters
 will now receive an error. Because it's possible that any single character may
-require up to four bytes when represented in utf-8, requests must fail.
+require up to four bytes when represented in utf-8, requests must fail::
 
     >>> raw_stdin = sys.stdin.buffer.raw
     >>> data = raw_stdin.read(3)
@@ -156,7 +156,7 @@ require up to four bytes when represented in utf-8, requests must fail.
       File "<stdin>", line 1, in <module>
     ValueError: must read at least 4 bytes
 
-The only workaround is to pass a larger buffer.
+The only workaround is to pass a larger buffer::
 
     >>> # Fix: Request at least four bytes
     >>> raw_stdin = sys.stdin.buffer.raw
diff --git a/pep-0529.txt b/pep-0529.txt
index 360b28aa9..bf32a0436 100644
--- a/pep-0529.txt
+++ b/pep-0529.txt
@@ -68,6 +68,7 @@ cannot represent all Unicode characters, the conversion of a path into bytes can
 lose information without warning or any available indication.
 
 As a demonstration of this::
+
     >>> open('test\uAB00.txt', 'wb').close()
     >>> import glob
     >>> glob.glob('test*')
@@ -320,13 +321,13 @@ Not managing encodings across boundaries
 
 Code that does not manage encodings when crossing protocol boundaries may
 currently be working by chance, but could encounter issues when either encoding
-changes. For example:
+changes. For example::
 
     >>> filename = open('filename_in_mbcs.txt', 'rb').read()
     >>> text = open(filename, 'r').read()
 
 To correct this code, the encoding of the bytes in ``filename`` should be
-specified, either when reading from the file or before using the value:
+specified, either when reading from the file or before using the value::
 
     >>> # Fix 1: Open file as text
     >>> filename = open('filename_in_mbcs.txt', 'r', encoding='mbcs').read()
@@ -341,13 +342,13 @@ Explicitly using 'mbcs'
 -----------------------
 
 Code that explicitly encodes text using 'mbcs' before passing to file system
-APIs is now passing incorrectly encoded bytes. For example:
+APIs is now passing incorrectly encoded bytes. For example::
 
     >>> filename = open('files.txt', 'r').readline()
     >>> text = open(filename.encode('mbcs'), 'r')
 
 To correct this code, the string should be passed without explicit encoding, or
-should use ``os.fsencode()``:
+should use ``os.fsencode()``::
 
     >>> # Fix 1: Do not encode the string
     >>> filename = open('files.txt', 'r').readline()