Add discussion of error handlers proposed by Glen
Linderman.
parent d99932829d
commit d0df1270f4
14
pep-0383.txt
@@ -80,7 +80,8 @@ environment variables.
 
 The error handler interface is extended to allow the encode error
 handler to return byte strings immediately, in addition to returning
-Unicode strings which then get encoded again.
+Unicode strings which then get encoded again (also see the discussion
+below).
 
 If the locale's encoding is UTF-8, the file system encoding is set to
 a new encoding "utf-8b", as the regular UTF-8 codec would not
@@ -123,6 +124,17 @@ for accepting and returning bytes, would be written as::
         # fn is now a str object
         yield fn.encode(fse, "python-escape")
 
+The encode error handler interface presently requires replacement
+Unicode to be provided in lieu of the non-encodable Unicode from the
+source string. It promptly encodes that replacement Unicode. In some
+error handlers, such as the python-escape proposed here, it is simpler
+and more efficient for the error handler to provide a pre-encoded
+replacement byte string, rather than forcing it to calculate Unicode
+from which the encoder would create the desired bytes. In fact, with
+python-escape, there are required byte sequences which cannot be
+generated from replacement Unicode.
+
+
 References
 ==========
 
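The paragraph added in the second hunk contrasts two handler styles, and a small sketch may make the difference concrete. It is illustrative only and not taken from the PEP: the handler names `demo-question-mark` and `demo-escape` are made up, and the mapping of U+DC80..U+DCFF to the bytes 0x80..0xFF is an assumption about how an escape handler of this kind could work. It relies on `codecs.register_error` in current Python 3, where an encode error handler may return either `str` or `bytes` as the replacement.

```python
import codecs


def question_mark(exc):
    """Text style: return replacement Unicode, which the codec encodes again."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    return "?" * (exc.end - exc.start), exc.end


def demo_escape(exc):
    """Bytes style: hand back pre-encoded bytes (assumed mapping, see lead-in)."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    raw = bytearray()
    for ch in exc.object[exc.start:exc.end]:
        code = ord(ch)
        if not 0xDC80 <= code <= 0xDCFF:
            raise exc  # not an escaped byte, so re-raise the original error
        raw.append(code - 0xDC00)
    return bytes(raw), exc.end


# Both handler names are hypothetical and exist only for this example.
codecs.register_error("demo-question-mark", question_mark)
codecs.register_error("demo-escape", demo_escape)

name = "caf\udce9.txt"  # a str carrying one escaped byte, 0xE9
as_text = name.encode("utf-8", "demo-question-mark")
as_bytes = name.encode("utf-8", "demo-escape")
```

Here `as_bytes` contains a lone 0xE9 byte, which is not valid UTF-8, so no replacement Unicode string could have produced it through the UTF-8 encoder. That is the situation the last sentence of the new paragraph describes.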