Add open issues section, add macros, adjust macro names to implementation.

Martin v. Löwis 2011-08-27 07:22:44 +02:00
parent 310b007a32
commit 6ec6d73ccd
1 changed files with 35 additions and 11 deletions

@@ -145,10 +145,17 @@ String Access
 The canonical representation can be accessed using two macros
 PyUnicode_Kind and PyUnicode_Data. PyUnicode_Kind gives one of the
-values PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
-(3). PyUnicode_Data gives the void pointer to the data, masking out
-the pointer kind. All these functions call PyUnicode_Ready
-in case the canonical representation hasn't been computed yet.
+values PyUnicode_WCHAR_KIND (0), PyUnicode_1BYTE_KIND (1),
+PyUnicode_2BYTE_KIND (2), or PyUnicode_4BYTE_KIND (3). PyUnicode_DATA
+gives the void pointer to the data. All these functions call
+PyUnicode_Ready in case the canonical representation hasn't been
+computed yet. Access to individual characters should use
+PyUnicode_{READ|WRITE}[_CHAR]:
+- PyUnicode_READ(kind, data, index)
+- PyUnicode_WRITE(kind, data, index, value)
+- PyUnicode_READ_CHAR(unicode, index)
+- PyUnicode_WRITE_CHAR(unicode, index, value)
 A new function PyUnicode_AsUTF8 is provided to access the UTF-8
 representation. It is thus identical to the existing
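
Note on the character access macros in the hunk above: passing kind and data separately lets a caller fetch them once and then index the buffer in a loop. The following is a stand-alone C sketch of that kind-based dispatch. It is not CPython's implementation. Every `sketch_*` name is hypothetical, the kind values only mirror the numbers quoted in the hunk, and the wchar kind is deliberately left unsupported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kind values quoted in the hunk. */
enum sketch_kind {
    SKETCH_WCHAR_KIND = 0,
    SKETCH_1BYTE_KIND = 1,
    SKETCH_2BYTE_KIND = 2,
    SKETCH_4BYTE_KIND = 3
};

/* Read the code point at `index` from a buffer of the given kind. */
static uint32_t sketch_read(enum sketch_kind kind, const void *data,
                            size_t index)
{
    switch (kind) {
    case SKETCH_1BYTE_KIND: return ((const uint8_t *)data)[index];
    case SKETCH_2BYTE_KIND: return ((const uint16_t *)data)[index];
    case SKETCH_4BYTE_KIND: return ((const uint32_t *)data)[index];
    default:
        assert(0 && "unsupported kind in this sketch");
        return 0;
    }
}

/* Store `value` at `index`; the value must fit the buffer's kind. */
static void sketch_write(enum sketch_kind kind, void *data,
                         size_t index, uint32_t value)
{
    switch (kind) {
    case SKETCH_1BYTE_KIND:
        assert(value <= 0xFFu);
        ((uint8_t *)data)[index] = (uint8_t)value;
        break;
    case SKETCH_2BYTE_KIND:
        assert(value <= 0xFFFFu);
        ((uint16_t *)data)[index] = (uint16_t)value;
        break;
    case SKETCH_4BYTE_KIND:
        ((uint32_t *)data)[index] = value;
        break;
    default:
        assert(0 && "unsupported kind in this sketch");
        break;
    }
}

/* Count occurrences of `ch`; kind and data are looked up once by the
   caller and reused for every index. */
static size_t sketch_count(enum sketch_kind kind, const void *data,
                           size_t length, uint32_t ch)
{
    size_t hits = 0;
    for (size_t i = 0; i < length; i++) {
        if (sketch_read(kind, data, i) == ch)
            hits++;
    }
    return hits;
}
```

A caller would fill a `uint16_t` buffer through `sketch_write(SKETCH_2BYTE_KIND, buf, i, value)` and then scan it with `sketch_count`. The `_CHAR` variants listed in the hunk take the string object itself, so they would presumably look up kind and data on every call.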
@@ -163,13 +170,6 @@ use PyUnicode_AsUTF8 to compute a conversion.
 PyUnicode_AsUnicode is deprecated; it computes the wstr representation
 on first use.
-String Operations
------------------
-Various convenience functions will be provided to deal with the
-canonical representation, in particular with respect to concatenation
-and slicing.
 Stable ABI
 ----------
@@ -181,6 +181,30 @@ Tools/gdb/libpython.py contains debugging hooks that embed knowledge
 about the internals of CPython's data types, including PyUnicodeObject
 instances. It will need to be slightly updated to track the change.
+Open Issues
+===========
+- When an application uses the legacy API, it may hold onto
+  the Py_UNICODE* representation, and yet start calling Unicode
+  APIs, which would call PyUnicode_Ready, invalidating the
+  Py_UNICODE* representation; this would be an incompatible change.
+  The following solutions can be considered:
+  * accept it as an incompatible change. Applications using the
+    legacy API will have to fill out the Py_UNICODE buffer completely
+    before calling any API on the string under construction.
+  * require explicit PyUnicode_Ready calls in such applications;
+    fail with a fatal error if a non-ready string is ever read.
+    This would also be an incompatible change, but one that is
+    more easily detected during testing.
+  * as a compromise between these approaches, implicit PyUnicode_Ready
+    calls (i.e. those not deliberately following the construction of
+    a PyUnicode object) could produce a warning if they convert an
+    object.
+- Which of the APIs created during the development of the PEP should
+  be public?
 Discussion
 ==========
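
Note on the first open issue in the hunk above: the three proposed behaviours for reading a string that is not yet ready can be compared in a small model. The C sketch below is self-contained and does not use or reproduce the CPython API. All `sketch_*` names are hypothetical. It assumes that readying frees the legacy buffer and that the canonical form is always 1-byte, and it reports the "fatal error" of the second option as a `-1` return so that the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The three candidate behaviours for reading a non-ready string. */
enum sketch_policy {
    SKETCH_SILENT,  /* option 1: convert implicitly, say nothing   */
    SKETCH_STRICT,  /* option 2: refuse to read a non-ready string */
    SKETCH_WARN     /* option 3: convert implicitly, record a warning */
};

typedef struct {
    uint32_t *legacy;    /* stand-in for the Py_UNICODE* buffer; NULL once ready */
    uint8_t *canonical;  /* assumed 1-byte canonical form; NULL until ready */
    size_t length;
    int warnings;        /* implicit conversions seen under SKETCH_WARN */
} sketch_str;

/* Create a string under construction with a zeroed legacy buffer. */
static sketch_str *sketch_new(size_t length)
{
    sketch_str *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->legacy = calloc(length + 1, sizeof *s->legacy);
    if (s->legacy == NULL) {
        free(s);
        return NULL;
    }
    s->canonical = NULL;
    s->length = length;
    s->warnings = 0;
    return s;
}

static void sketch_free(sketch_str *s)
{
    if (s != NULL) {
        free(s->legacy);
        free(s->canonical);
        free(s);
    }
}

/* Build the canonical form and drop the legacy buffer. Any pointer the
   caller still holds into `legacy` is invalid afterwards. Returns 0 on
   success, -1 on allocation failure or a character above 0xFF. */
static int sketch_ready(sketch_str *s)
{
    if (s->canonical != NULL)
        return 0;
    uint8_t *out = malloc(s->length + 1);
    if (out == NULL)
        return -1;
    for (size_t i = 0; i < s->length; i++) {
        if (s->legacy[i] > 0xFFu) {
            free(out);
            return -1;
        }
        out[i] = (uint8_t)s->legacy[i];
    }
    out[s->length] = 0;
    free(s->legacy);
    s->legacy = NULL;
    s->canonical = out;
    return 0;
}

/* Read one character under the given policy; -1 signals an error. */
static long sketch_read(sketch_str *s, size_t index,
                        enum sketch_policy policy)
{
    if (s->canonical == NULL) {
        if (policy == SKETCH_STRICT)
            return -1;
        if (policy == SKETCH_WARN)
            s->warnings++;
        if (sketch_ready(s) != 0)
            return -1;
    }
    if (index >= s->length)
        return -1;
    return s->canonical[index];
}
```

Under `SKETCH_STRICT` the caller has to invoke `sketch_ready` itself once the legacy buffer is completely filled, which is why a missing call shows up on the first read during testing. Under `SKETCH_WARN` the read succeeds but leaves a trace in `warnings`.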