Slight grammar fixes.
parent d46cbd81f8
commit 310b007a32
10
pep-0393.txt
@@ -41,7 +41,7 @@ One problem with the approach is support for existing applications
 may be computed. Applications are encouraged to phase out reliance on
 a specific internal representation if possible. As interaction with
 other libraries will often require some sort of internal
-representation, the specification choses UTF-8 as the recommended way
+representation, the specification chooses UTF-8 as the recommended way
 of exposing strings to C code.
 
 For many strings (e.g. ASCII), multiple representations may actually
@@ -69,7 +69,7 @@ The Unicode object structure is changed to this definition::
 These fields have the following interpretations:
 
 - length: number of code points in the string (result of sq_length)
-- str: shortest-form representation of the unicode string
+- str: shortest-form representation of the unicode string.
 The string is null-terminated (in its respective representation).
 - hash: same as in Python 3.2
 - state:
@@ -145,7 +145,7 @@ String Access
 
 The canonical representation can be accessed using two macros
 PyUnicode_Kind and PyUnicode_Data. PyUnicode_Kind gives one of the
-value PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
+values PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
 (3). PyUnicode_Data gives the void pointer to the data, masking out
 the pointer kind. All these functions call PyUnicode_Ready
 in case the canonical representation hasn't been computed yet.
@@ -156,7 +156,7 @@ _PyUnicode_AsString, which is removed. The function will compute the
 utf8 representation when first called. Since this representation will
 consume memory until the string object is released, applications
 should use the existing PyUnicode_AsUTF8String where possible
-(which generates a new string object every time). API that implicitly
+(which generates a new string object every time). APIs that implicitly
 converts a string to a char* (such as the ParseTuple functions) will
 use PyUnicode_AsUTF8 to compute a conversion.
 
@@ -187,7 +187,7 @@ Discussion
 Several concerns have been raised about the approach presented here:
 
 It makes the implementation more complex. That's true, but considered
-worth given the gains.
+worth it given the benefits.
 
 The Py_Unicode representation is not instantaneously available,
 slowing down applications that request it. While this is also true,