Restrict escapable bytes to the 128..255 range.

Martin v. Löwis 2009-04-30 09:50:16 +00:00
parent ba7f822765
commit a67b456afe
1 changed file with 8 additions and 2 deletions
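
The rule this commit specifies can be sketched in a few lines of Python. This is an illustrative sketch rather than the PEP's reference implementation: `escape_byte` is a hypothetical helper introduced here, and the built-in handler is called by the name it carries in released Python 3 (`surrogateescape`), not the draft name "python-escape" used in this revision of the text.

```python
def escape_byte(byte: int) -> str:
    """Hypothetical helper: map one non-decodable byte to its escape code."""
    if not 0 <= byte <= 255:
        raise ValueError("not a byte value")
    if byte < 128:
        # ASCII-range bytes are never escaped; the text calls for an exception.
        raise ValueError(f"byte 0x{byte:02x} is below 128 and cannot be escaped")
    return chr(0xDC00 + byte)  # 0x80 -> U+DC80, 0xFF -> U+DCFF


# Boundary cases of the 128..255 range.
assert escape_byte(0x80) == "\udc80"
assert escape_byte(0xFF) == "\udcff"

# The built-in handler applies the same mapping and round-trips the bytes.
raw = b"caf\xe9"  # 0xE9 is an incomplete sequence in UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
assert text == "caf" + escape_byte(0xE9)
assert text.encode("utf-8", errors="surrogateescape") == raw
```

Because every escaped byte lands in U+DC80..U+DCFF, encoding with the same handler can recover the original byte by subtracting 0xDC00, which is what makes the round trip lossless.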

@@ -68,8 +68,9 @@ environmental data to Python str objects ([1]).
On POSIX systems, Python currently applies the locale's encoding to
convert the byte data to Unicode, failing for characters that cannot
-be decoded. With this PEP, non-decodable bytes will be represented as
-lone half surrogate codes U+DCxx.
+be decoded. With this PEP, non-decodable bytes >= 128 will be
+represented as lone half surrogate codes U+DC80..U+DCFF. Bytes below
+128 will produce exceptions; see the discussion below.
To convert non-decodable bytes, a new error handler ([2])
"python-escape" is introduced, which produces these half
@@ -109,6 +110,11 @@ will produce non-sensical data.
Data obtained from other sources may conflict with data produced
by this PEP. Dealing with such conflicts is out of scope of the PEP.
+Encodings that are not compatible with ASCII are not supported by
+this specification; bytes in the ASCII range that fail to decode
+will cause an exception. It is widely agreed that such encodings
+should not be used as locale charsets.
For most applications, we assume that they eventually pass data
received from a system interface back into the same system
interfaces. For example, an application invoking os.listdir() will