Restrict escapable bytes into the 128..255 range.
parent ba7f822765
commit a67b456afe
pep-0383.txt | 10
@@ -68,8 +68,9 @@ environmental data to Python str objects ([1]).
 
 On POSIX systems, Python currently applies the locale's encoding to
 convert the byte data to Unicode, failing for characters that cannot
-be decoded. With this PEP, non-decodable bytes will be represented as
-lone half surrogate codes U+DCxx.
+be decoded. With this PEP, non-decodable bytes >128 will be
+represented as lone half surrogate codes U+DC80..U+DCFF. Bytes below
+128 will produce exceptions; see the discussion below.
 
 To convert non-decodable bytes, a new error handler ([2])
 "python-escape" is introduced, which produces these half
@@ -109,6 +110,11 @@ will produce non-sensical data.
 Data obtained from other sources may conflict with data produced
 by this PEP. Dealing with such conflicts is out of scope of the PEP.
 
+Encodings that are not compatible with ASCII are not supported by
+this specification; bytes in the ASCII range that fail to decode
+will cause an exception. It is widely agreed that such encodings
+should not be used as locale charsets.
+
 For most applications, we assume that they eventually pass data
 received from a system interface back into the same system
 interfaces. For example, an application invoking os.listdir() will
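The behaviour this change describes can be tried out in current Python. This is a minimal sketch, not part of the commit: it assumes the handler's released name "surrogateescape" (this draft still calls it "python-escape"), and the helpers decode_filename and encode_filename are hypothetical names used only for illustration. It shows a byte in the 128..255 range being escaped to U+DC80..U+DCFF and restored on encoding, and an undecodable ASCII-range byte under an ASCII-incompatible encoding raising an exception instead.

```python
# Sketch only: "surrogateescape" is the name under which this handler was
# eventually released; the draft in this diff calls it "python-escape".
# decode_filename/encode_filename are hypothetical helpers for illustration.

def decode_filename(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, mapping undecodable bytes 0x80..0xFF to U+DC80..U+DCFF."""
    return raw.decode(encoding, "surrogateescape")


def encode_filename(name: str, encoding: str = "utf-8") -> bytes:
    """Encode a str, turning escaped surrogates back into the original bytes."""
    return name.encode(encoding, "surrogateescape")


# 0xE9 is not valid UTF-8 on its own, so it is escaped as U+DCE9.
raw = b"caf\xe9.txt"
name = decode_filename(raw)
round_trip = encode_filename(name)

# A lone ASCII-range byte is truncated UTF-16 data. Because the byte is
# below 128, the handler refuses to escape it and the error propagates.
try:
    decode_filename(b"a", "utf-16")
except UnicodeDecodeError:
    ascii_byte_rejected = True
else:
    ascii_byte_rejected = False
```

Refusing to escape bytes below 128 means ASCII characters are never represented as lone surrogates, which is why encodings that are not ASCII-compatible fall outside the specification.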