Add open issues section, add macros, adjust macro names to implementation.
parent 310b007a32
commit 6ec6d73ccd
pep-0393.txt (46 lines changed)
@@ -145,10 +145,17 @@ String Access
 
 The canonical representation can be accessed using two macros
 PyUnicode_Kind and PyUnicode_Data. PyUnicode_Kind gives one of the
-values PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
-(3). PyUnicode_Data gives the void pointer to the data, masking out
-the pointer kind. All these functions call PyUnicode_Ready
-in case the canonical representation hasn't been computed yet.
+values PyUnicode_WCHAR_KIND (0), PyUnicode_1BYTE_KIND (1),
+PyUnicode_2BYTE_KIND (2), or PyUnicode_4BYTE_KIND (3). PyUnicode_DATA
+gives the void pointer to the data. All these functions call
+PyUnicode_Ready in case the canonical representation hasn't been
+computed yet. Access to individual characters should use
+PyUnicode_{READ|WRITE}[_CHAR]:
+
+- PyUnicode_READ(kind, data, index)
+- PyUnicode_WRITE(kind, data, index, value)
+- PyUnicode_READ_CHAR(unicode, index)
+- PyUnicode_WRITE_CHAR(unicode, index, value)
 
 A new function PyUnicode_AsUTF8 is provided to access the UTF-8
 representation. It is thus identical to the existing
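Note on the hunk above: the kind/data pair lets one accessor serve all three element widths. The following is a minimal sketch of that dispatch in plain C, assuming nothing about the real implementation. The names sketch_kind, sketch_read and sketch_write are hypothetical stand-ins, not CPython macros, and the legacy PyUnicode_WCHAR_KIND (0) case is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kind values named in the PEP text.
   These are NOT the CPython definitions. */
enum sketch_kind {
    SKETCH_1BYTE_KIND = 1,
    SKETCH_2BYTE_KIND = 2,
    SKETCH_4BYTE_KIND = 3
};

/* Read the code point at `index`, picking the element width from `kind`. */
static uint32_t sketch_read(enum sketch_kind kind, const void *data,
                            size_t index)
{
    switch (kind) {
    case SKETCH_1BYTE_KIND: return ((const uint8_t *)data)[index];
    case SKETCH_2BYTE_KIND: return ((const uint16_t *)data)[index];
    case SKETCH_4BYTE_KIND: return ((const uint32_t *)data)[index];
    }
    assert(!"unknown kind");
    return 0;
}

/* Store `value` at `index`; the value must fit the width implied by `kind`. */
static void sketch_write(enum sketch_kind kind, void *data, size_t index,
                         uint32_t value)
{
    switch (kind) {
    case SKETCH_1BYTE_KIND:
        assert(value <= 0xFFu);
        ((uint8_t *)data)[index] = (uint8_t)value;
        return;
    case SKETCH_2BYTE_KIND:
        assert(value <= 0xFFFFu);
        ((uint16_t *)data)[index] = (uint16_t)value;
        return;
    case SKETCH_4BYTE_KIND:
        ((uint32_t *)data)[index] = value;
        return;
    }
    assert(!"unknown kind");
}
```

A caller would fetch the kind and data pointer once and then loop over indices. That is presumably why the PEP lists READ/WRITE variants taking (kind, data) alongside the _CHAR variants that take the object itself.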
@@ -163,13 +170,6 @@ use PyUnicode_AsUTF8 to compute a conversion.
 PyUnicode_AsUnicode is deprecated; it computes the wstr representation
 on first use.
 
-String Operations
------------------
-
-Various convenience functions will be provided to deal with the
-canonical representation, in particular with respect to concatenation
-and slicing.
-
 Stable ABI
 ----------
 
@@ -181,6 +181,30 @@ Tools/gdb/libpython.py contains debugging hooks that embed knowledge
 about the internals of CPython's data types, include PyUnicodeObject
 instances. It will need to be slightly updated to track the change.
 
+Open Issues
+===========
+
+- When an application uses the legacy API, it may hold onto
+  the Py_UNICODE* representation, and yet start calling Unicode
+  APIs, which would call PyUnicode_Ready, invalidating the
+  Py_UNICODE* representation; this would be an incompatible change.
+  The following solutions can be considered:
+
+  * accept it as an incompatible change. Applications using the
+    legacy API will have to fill out the Py_UNICODE buffer completely
+    before calling any API on the string under construction.
+  * require explicit PyUnicode_Ready calls in such applications;
+    fail with a fatal error if a non-ready string is ever read.
+    This would also be an incompatible change, but one that is
+    more easily detected during testing.
+  * as a compromise between these approaches, implicit PyUnicode_Ready
+    calls (i.e. those not deliberately following the construction of
+    a PyUnicode object) could produce a warning if they convert an
+    object.
+
+- Which of the APIs created during the development of the PEP should
+  be public?
+
 Discussion
 ==========
 
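Note on the first open issue: the hazard can be modelled without CPython at all. The sketch below is only an illustration under stated assumptions. sketch_str and its functions are hypothetical, the layout is not the real PyUnicodeObject, and only a 1-byte canonical form is modelled. It shows the second proposed option: conversion happens only through an explicit ready call, and a read of a non-ready string is rejected. The PEP text proposes a fatal error; the sketch returns -1 instead so the behaviour can be checked.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of a string under construction.
   This is NOT CPython's object layout. */
typedef struct {
    uint32_t *legacy;    /* stand-in for the Py_UNICODE* buffer */
    uint8_t  *canonical; /* stand-in for a 1-byte canonical form */
    size_t    length;
} sketch_str;

/* Create a non-ready string with a zero-filled legacy buffer. */
static sketch_str sketch_new(size_t length)
{
    sketch_str s;
    s.legacy = calloc(length ? length : 1, sizeof *s.legacy);
    s.canonical = NULL;
    s.length = length;
    return s;
}

/* Convert to the canonical form and release the legacy buffer.
   Any copy of s->legacy the caller saved earlier dangles afterwards,
   which is the incompatibility the open issue describes.
   Returns 0 on success, -1 on failure. */
static int sketch_ready(sketch_str *s)
{
    size_t i;
    uint8_t *out;

    if (s->canonical != NULL)
        return 0;               /* already ready */
    if (s->legacy == NULL)
        return -1;
    out = malloc(s->length ? s->length : 1);
    if (out == NULL)
        return -1;
    for (i = 0; i < s->length; i++) {
        if (s->legacy[i] > 0xFFu) {   /* does not fit the 1-byte model */
            free(out);
            return -1;
        }
        out[i] = (uint8_t)s->legacy[i];
    }
    free(s->legacy);
    s->legacy = NULL;
    s->canonical = out;
    return 0;
}

/* Strict read: never converts implicitly. Returns -1 for a non-ready
   string or an out-of-range index. */
static long sketch_read_strict(const sketch_str *s, size_t index)
{
    if (s->canonical == NULL || index >= s->length)
        return -1;
    return s->canonical[index];
}

/* Release whichever buffers the string still owns. */
static void sketch_free(sketch_str *s)
{
    free(s->legacy);
    free(s->canonical);
    s->legacy = NULL;
    s->canonical = NULL;
}
```

Under the first option, sketch_read_strict would instead call sketch_ready itself, silently invalidating any saved legacy pointer. Under the third, it would do the same but emit a warning when a conversion actually takes place.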