From eeac57f9fa9671339ebc4d737a8306fd00b4030b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20v=2E=20L=C3=B6wis?=
Date: Wed, 6 May 2009 07:53:21 +0000
Subject: [PATCH] Replace error handler discussion with text from Stephen Turnbull.

---
 pep-0383.txt | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/pep-0383.txt b/pep-0383.txt
index 09d8042c5..6e580b472 100644
--- a/pep-0383.txt
+++ b/pep-0383.txt
@@ -126,15 +126,17 @@ and returning bytes, would be written as::
             # fn is now a str object
             yield fn.encode(fse, "utf8b")
 
-The encode error handler interface presently requires replacement
-Unicode to be provide in lieu of the non-encodable Unicode from the
-source string. It promptly encodes that replacement Unicode. In some
-error handlers, such as the utf8 proposed here, it is simpler
-and more efficient for the error handler to provide a pre-encoded
-replacement byte string, rather than forcing it to calculating Unicode
-from which the encoder would create the desired bytes. In fact, with
-utf8b, there are required byte sequences which cannot be
-generated from replacement Unicode.
+The extension to the encode error handler interface proposed by this
+PEP is necessary to implement the 'utf8b' error handler, because there
+are required byte sequences which cannot be generated from replacement
+Unicode. However, the encode error handler interface presently
+requires replacement Unicode to be provided in lieu of the
+non-encodable Unicode from the source string. Then it promptly
+encodes that replacement Unicode. In some error handlers, such as the
+'utf8b' proposed here, it is also simpler and more efficient for the
+error handler to provide a pre-encoded replacement byte string, rather
+than forcing it to calculating Unicode from which the encoder would
+create the desired bytes.
 
 A few alternative approaches have been proposed: