PEP 756: Remove Open Questions (#3968)

2024-09-17 15:34:14 +02:00 · 2024-09-17 15:34:14 +02:00 · b6cf6d47f3
parent 80f7aadb73
commit b6cf6d47f3
1 changed file with 21 additions and 20 deletions
--- a/peps/pep-0756.rst
+++ b/peps/pep-0756.rst
@@ -102,7 +102,12 @@ longer rationale.
 PyUnicode_Export()
 ------------------

-API: ``int32_t PyUnicode_Export(PyObject *unicode, int32_t requested_formats, Py_buffer *view)``.
+API::
+
+    int32_t PyUnicode_Export(
+        PyObject *unicode,
+        int32_t requested_formats,
+        Py_buffer *view)

 Export the contents of the *unicode* string in one of the *requested_formats*.

@@ -116,6 +121,10 @@ The contents of the buffer are valid until they are released.

 The buffer is read-only and must not be modified.

+The ``view->len`` member must be used to get the string length. The
+buffer should end with a trailing NUL character, but it's not
+recommended to rely on that because of embedded NUL characters.
+
 *unicode* and *view* must not be NULL.

 Available formats:
@@ -152,7 +161,7 @@ needed. There are cases when a copy is needed, *O*\ (*n*) complexity:
 * If only UTF-8 is requested: the string is encoded to UTF-8 at the
  first call, and then the encoded UTF-8 string is cached.

-To have an *O*\ (1) complexity on CPython and PyPy, it's recommended to
+To get the best performance on CPython and PyPy, it's recommended to
 support these 4 formats::

    (PyUnicode_FORMAT_UCS1 \
@@ -160,6 +169,10 @@ support these 4 formats::
     | PyUnicode_FORMAT_UCS4 \
     | PyUnicode_FORMAT_UTF8)

+PyPy uses UTF-8 natively and so the ``PyUnicode_FORMAT_UTF8`` format is
+recommended. It requires a memory copy, since PyPy ``str`` objects can
+be moved in memory (PyPy uses a moving garbage collector).
+

 Py_buffer format and item size
 ------------------------------
@@ -181,7 +194,12 @@ Export format               Buffer format       Item size
 PyUnicode_Import()
 ------------------

-API: ``PyObject* PyUnicode_Import(const void *data, Py_ssize_t nbytes, int32_t format)``.
+API::
+
+    PyObject* PyUnicode_Import(
+        const void *data,
+        Py_ssize_t nbytes,
+        int32_t format)

 Create a Unicode string object from a buffer in a supported format.

@@ -224,10 +242,6 @@ example, the UTF-8 format uses the ``surrogatepass`` error handler.

 Embedded NUL characters are allowed: they can be imported and exported.

-An exported string does not end with a trailing NUL character: the
-``PyUnicode_Export()`` caller must use ``Py_buffer.len`` to get the
-string length.
-

 Implementation
 ==============
@@ -242,19 +256,6 @@ There is no impact on the backward compatibility, only new C API
 functions are added.


-Open Questions
-==============
-
-* Should we guarantee that the exported buffer always ends with a NUL
-  character? Is it possible to implement it in *O*\ (1) complexity
-  in all Python implementations?
-* Is it ok to allow surrogate characters?
-* Should we add a flag to disallow embedded NUL characters? It would
-  have an *O*\ (*n*) complexity.
-* Should we add a flag to disallow surrogate characters? It would
-  have an *O*\ (*n*) complexity.
-
-
 Usage of PEP 393 C APIs
 =======================