PEP 756: Remove Open Questions (#3968)
parent 80f7aadb73
commit b6cf6d47f3
@@ -102,7 +102,12 @@ longer rationale.
 PyUnicode_Export()
 ------------------
 
-API: ``int32_t PyUnicode_Export(PyObject *unicode, int32_t requested_formats, Py_buffer *view)``.
+API::
+
+    int32_t PyUnicode_Export(
+        PyObject *unicode,
+        int32_t requested_formats,
+        Py_buffer *view)
 
 Export the contents of the *unicode* string in one of the *requested_formats*.
 
@@ -116,6 +121,10 @@ The contents of the buffer are valid until they are released.
 
 The buffer is read-only and must not be modified.
 
+The ``view->len`` member must be used to get the string length. The
+buffer should end with a trailing NUL character, but it's not
+recommended to rely on that because of embedded NUL characters.
+
 *unicode* and *view* must not be NULL.
 
 Available formats:
@@ -152,7 +161,7 @@ needed. There are cases when a copy is needed, *O*\ (*n*) complexity:
 * If only UTF-8 is requested: the string is encoded to UTF-8 at the
   first call, and then the encoded UTF-8 string is cached.
 
-To have an *O*\ (1) complexity on CPython and PyPy, it's recommended to
+To get the best performance on CPython and PyPy, it's recommended to
 support these 4 formats::
 
     (PyUnicode_FORMAT_UCS1 \
@@ -160,6 +169,10 @@ support these 4 formats::
      | PyUnicode_FORMAT_UCS4 \
      | PyUnicode_FORMAT_UTF8)
 
+PyPy uses UTF-8 natively and so the ``PyUnicode_FORMAT_UTF8`` format is
+recommended. It requires a memory copy, since PyPy ``str`` objects can
+be moved in memory (PyPy uses a moving garbage collector).
+
 
 Py_buffer format and item size
 ------------------------------
@@ -181,7 +194,12 @@ Export format Buffer format Item size
 PyUnicode_Import()
 ------------------
 
-API: ``PyObject* PyUnicode_Import(const void *data, Py_ssize_t nbytes, int32_t format)``.
+API::
+
+    PyObject* PyUnicode_Import(
+        const void *data,
+        Py_ssize_t nbytes,
+        int32_t format)
 
 Create a Unicode string object from a buffer in a supported format.
 
@@ -224,10 +242,6 @@ example, the UTF-8 format uses the ``surrogatepass`` error handler.
 
 Embedded NUL characters are allowed: they can be imported and exported.
 
-An exported string does not end with a trailing NUL character: the
-``PyUnicode_Export()`` caller must use ``Py_buffer.len`` to get the
-string length.
-
 
 Implementation
 ==============
@@ -242,19 +256,6 @@ There is no impact on the backward compatibility, only new C API
 functions are added.
 
 
-Open Questions
-==============
-
-* Should we guarantee that the exported buffer always ends with a NUL
-  character? Is it possible to implement it in *O*\ (1) complexity
-  in all Python implementations?
-* Is it ok to allow surrogate characters?
-* Should we add a flag to disallow embedded NUL characters? It would
-  have an *O*\ (*n*) complexity.
-* Should we add a flag to disallow surrogate characters? It would
-  have an *O*\ (*n*) complexity.
-
-
 Usage of PEP 393 C APIs
 =======================
 
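
The paragraph this commit adds under ``PyUnicode_Export()`` says that ``view->len`` must be used to get the string length, because embedded NUL characters make a trailing NUL unreliable. The following is a minimal sketch of that point in plain C. It does not call the proposed API: ``demo_view`` is a hypothetical stand-in for the two ``Py_buffer`` members involved, both helper names are made up for illustration, and the UCS1 format is assumed, so one byte is one character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two Py_buffer members used here
   (buf and len). This is not the real Py_buffer structure. */
typedef struct {
    const void *buf;
    ptrdiff_t len;   /* length in bytes, playing the role of view->len */
} demo_view;

/* Count the characters by trusting the explicit length.
   Assumes UCS1 data: one byte per character. */
static size_t count_chars_by_len(const demo_view *view)
{
    assert(view != NULL && view->buf != NULL);
    return (size_t)view->len;
}

/* Count the characters by scanning for the first NUL byte.
   This stops early when the string has an embedded NUL character. */
static size_t count_chars_by_nul(const demo_view *view)
{
    assert(view != NULL && view->buf != NULL);
    return strlen((const char *)view->buf);
}
```

For a five-character string whose third character is NUL, the length-based helper reports all five characters, while the NUL-scanning helper stops after the first two.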