From def841b431b27dc82166f446c9c889b728cd5729 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20v=2E=20L=C3=B6wis?=
Date: Sun, 3 May 2009 08:16:57 +0000
Subject: [PATCH] Remove utf-8b codec. Rename error handler to utf8b.

---
 pep-0383.txt | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/pep-0383.txt b/pep-0383.txt
index 4fa50ca04..4fc729d1f 100644
--- a/pep-0383.txt
+++ b/pep-0383.txt
@@ -72,27 +72,17 @@ be decoded. With this PEP, non-decodable bytes >128 will be
 represented as lone half surrogate codes U+DC80..U+DCFF. Bytes below
 128 will produce exceptions; see the discussion below.
 
-To convert non-decodable bytes, a new error handler ([2])
-"python-escape" is introduced, which produces these half
-surrogates. On encoding, the error handler converts the half surrogate
-back to the corresponding byte. This error handler will be used in any
-API that receives or produces file names, command line arguments, or
-environment variables.
+To convert non-decodable bytes, a new error handler ([2]) "utf8b" is
+introduced, which produces these half surrogates. On encoding, the
+error handler converts the half surrogate back to the corresponding
+byte. This error handler will be used in any API that receives or
+produces file names, command line arguments, or environment variables.
 
 The error handler interface is extended to allow the encode error
 handler to return byte strings immediately, in addition to returning
 Unicode strings which then get encoded again (also see the discussion
 below).
 
-If the locale's encoding is UTF-8, the file system encoding is set to
-a new encoding "utf-8b", as the regular UTF-8 codec would not
-re-encode half surrogates as single bytes. The UTF-8b codec decodes
-invalid bytes (which must be >= 0x80) into half surrogate codes
-U+DC80..U+DCFF. Unlike the utf-8 codec, the utf-8b codec follows the
-strict definition of UTF-8 to determine what an invalid byte is
-(which, among other restrictions, disallows to encode surrogate codes
-in UTF-8).
-
 Byte-orientied interfaces that already exist in Python 3.0 are not
 affected by this specification. They are neither enhanced nor
 deprecated.
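
Reviewer note: the round trip this hunk describes (undecodable bytes >= 0x80 become lone surrogates U+DC80..U+DCFF, and encoding turns them back into the original bytes) can be sketched as below. This is an illustration only, not part of the patch. It assumes a released Python 3 interpreter, where the handler is registered as "surrogateescape" rather than the "utf8b" name used in this revision. The helper names `decode_name` and `encode_name` are hypothetical.

```python
# Sketch only: "surrogateescape" is the handler name in released
# Python 3 (3.1+); the "utf8b" name from this patch revision is not
# assumed to be registered. Helper names are hypothetical.

def decode_name(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, mapping each undecodable byte 0x80..0xFF to U+DC80..U+DCFF."""
    return raw.decode(encoding, errors="surrogateescape")


def encode_name(name: str, encoding: str = "utf-8") -> bytes:
    """Encode text, turning lone surrogates U+DC80..U+DCFF back into single bytes."""
    return name.encode(encoding, errors="surrogateescape")


raw = b"caf\xe9.txt"            # Latin-1 bytes; 0xE9 is not valid UTF-8 here
name = decode_name(raw)         # 0xE9 is represented as the lone surrogate U+DCE9
roundtrip = encode_name(name)   # the surrogate is converted back to byte 0xE9

# The strict UTF-8 codec refuses to encode the lone surrogate, which is
# why the error handler is needed on the encoding side as well.
try:
    name.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True
```

Because the handler works per byte, the same two helpers apply to any locale encoding, which matches the patch's removal of a separate "utf-8b" codec in favour of a single error handler.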