Replace error handler discussion with text from Stephen Turnbull.
Martin v. Löwis 2009-05-06 07:53:21 +00:00
parent 8f3e8ce98e
commit eeac57f9fa
1 changed file with 11 additions and 9 deletions

@@ -126,15 +126,17 @@ and returning bytes, would be written as::
 # fn is now a str object
 yield fn.encode(fse, "utf8b")
 
-The encode error handler interface presently requires replacement
-Unicode to be provide in lieu of the non-encodable Unicode from the
-source string. It promptly encodes that replacement Unicode. In some
-error handlers, such as the utf8 proposed here, it is simpler
-and more efficient for the error handler to provide a pre-encoded
-replacement byte string, rather than forcing it to calculating Unicode
-from which the encoder would create the desired bytes. In fact, with
-utf8b, there are required byte sequences which cannot be
-generated from replacement Unicode.
+The extension to the encode error handler interface proposed by this
+PEP is necessary to implement the 'utf8b' error handler, because there
+are required byte sequences which cannot be generated from replacement
+Unicode. However, the encode error handler interface presently
+requires replacement Unicode to be provided in lieu of the
+non-encodable Unicode from the source string. Then it promptly
+encodes that replacement Unicode. In some error handlers, such as the
+'utf8b' proposed here, it is also simpler and more efficient for the
+error handler to provide a pre-encoded replacement byte string, rather
+than forcing it to calculating Unicode from which the encoder would
+create the desired bytes.
 
 A few alternative approaches have been proposed:
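
Note on the changed paragraph: the point about byte sequences that cannot be generated from replacement Unicode can be illustrated with a small sketch. It assumes a current Python 3, where encode error handlers may return bytes and where the handler this PEP calls 'utf8b' is available under the name `surrogateescape`. The handler name `demo_utf8b` and the function `escape_to_bytes` are hypothetical, introduced only for illustration; this is not the PEP's reference implementation.

```python
import codecs


def escape_to_bytes(exc):
    """Hypothetical encode error handler returning pre-encoded bytes.

    Maps each lone surrogate U+DC80..U+DCFF back to the byte 0x80..0xFF.
    """
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    out = bytearray()
    for ch in exc.object[exc.start:exc.end]:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            out.append(code - 0xDC00)
        else:
            # Not an escaped byte: leave the original error in place.
            raise exc
    # Returning bytes (not str) asks the encoder to copy them verbatim.
    return bytes(out), exc.end


# "demo_utf8b" is an illustrative name, not a registered standard handler.
codecs.register_error("demo_utf8b", escape_to_bytes)

raw = b"caf\xe9.txt"  # Latin-1 bytes; not valid UTF-8
name = raw.decode("utf-8", "surrogateescape")  # undecodable byte -> U+DCE9
round_tripped = name.encode("utf-8", "demo_utf8b")
```

Here the lone byte 0xE9 has to appear in the output, but every str a handler could return would be encoded to well-formed UTF-8, so only a bytes replacement can reproduce it.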