Specify NFKC instead of NFC.
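
For illustration, the practical difference between the two forms can be
sketched in Python with the standard unicodedata module. The helper
normalize_identifier below is hypothetical and is not part of the PEP or
the parser. It only mirrors the order of operations the patch describes
(normalize to NFKC, then check the identifier syntax), and it assumes
str.isidentifier is an acceptable stand-in for that syntax check.

```python
import unicodedata


def normalize_identifier(name: str) -> str:
    """Hypothetical helper: NFKC-normalize an identifier, then validate it."""
    if not name.isascii():
        # Only non-ASCII identifiers take the normalization path;
        # pure-ASCII identifiers are left exactly as written.
        name = unicodedata.normalize("NFKC", name)
    if not name.isidentifier():
        raise ValueError(f"not a valid identifier: {name!r}")
    return name


# U+FB01 (LATIN SMALL LIGATURE FI) is a compatibility character:
# NFC leaves it alone, while NFKC folds it into the two letters "fi".
ligature = "\ufb01le"
nfc_form = unicodedata.normalize("NFC", ligature)
nfkc_form = unicodedata.normalize("NFKC", ligature)

# "e" followed by U+0301 (COMBINING ACUTE ACCENT) composes to U+00E9
# under both forms, so the two only differ for compatibility characters.
composed = normalize_identifier("cafe\u0301")
```

Under NFC the ligature spelling and the plain spelling "file" would be
two distinct identifiers; under NFKC they compare equal.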

2007-08-14 16:24:05 +00:00 · 512559f6a1
parent 10a3df9553
commit 512559f6a1
1 changed file with 3 additions and 7 deletions
--- a/pep-3131.txt
+++ b/pep-3131.txt
@@ -82,8 +82,8 @@ nonspacing marks (Mn), spacing combining marks (Mc), decimal number
 (Nd), connector punctuations (Pc), and characters carryig the
 Other_ID_Continue property.

-All identifiers are converted into the normal form NFC while parsing;
-comparison of identifiers is based on NFC.
+All identifiers are converted into the normal form NFKC while parsing;
+comparison of identifiers is based on NFKC.

 A non-normative HTML file listing all valid identifier characters for
 Unicode 4.1 can be found at
@@ -117,7 +117,7 @@ The following changes will need to be made to the parser:
   non-identifier character (e.g. a space or punctuation character)

 2. The entire UTF-8 string is passed to a function to normalize the
-   string to NFC, and then verify that it follows the identifier
+   string to NFKC, and then verify that it follows the identifier
   syntax. No such callout is made for pure-ASCII identifiers, which
   continue to be parsed the way they are today. The Unicode database
   must start including the Other_ID_{Start|Continue} property.
@@ -154,10 +154,6 @@ C#. It's not clear whether this would improve things (it might
 for RTL languages); if there is a need, these can be added
 later.

-Another open issue is the choice of normalization form: some
-people suggest to use NFKC instead of NFC, others suggest to
-ban compatibility characters.
-
 Some people would like to see an option on selecting support
 for this PEP at run-time; opinions vary on what precisely
 that option should be, and what precisely its default value