Use NFKC, and XID_Start/Continue.

Martin v. Löwis 2007-08-14 16:35:39 +00:00
parent 512559f6a1
commit 0801f23a05
1 changed file with 5 additions and 5 deletions

@@ -69,18 +69,18 @@ characters from outside the ASCII range. For other characters, the
 classification uses the version of the Unicode Character Database as included in
 the ``unicodedata`` module.
-The identifier syntax is ``<ID_Start> <ID_Continue>*``.
-``ID_Start`` is defined as all characters having one of the general
+The identifier syntax is ``<XID_Start> <XID_Continue>*``.
+``XID_Start`` is defined as all characters having one of the general
 categories uppercase letters (Lu), lowercase letters (Ll), titlecase
 letters (Lt), modifier letters (Lm), other letters (Lo), letter
 numbers (Nl), the underscore, and characters carrying the
-Other_ID_Start property.
-``ID_Continue`` is defined as all characters in ``ID_Start``, plus
+Other_ID_Start property (XXX adjust for XID_Start).
+``XID_Continue`` is defined as all characters in ``XID_Start``, plus
 nonspacing marks (Mn), spacing combining marks (Mc), decimal number
 (Nd), connector punctuations (Pc), and characters carryig the
-Other_ID_Continue property.
+Other_ID_Continue property (XXX adjust for XID_Continue).
 All identifiers are converted into the normal form NFKC while parsing;
 comparison of identifiers is based on NFKC.