PEP 757: Rename endian to endianness (#3973)

* Rephrase "Optimize small integers" section for import.
* Elaborate on PyLong_GetNativeLayout() validity.
* Mention PyLong_FreeExport() in the specification.
* Update benchmarks.
Victor Stinner 2024-09-18 13:59:58 +02:00 committed by GitHub
parent 40bbaaa84a
commit 9475fa0aa4
1 changed file with 35 additions and 31 deletions

@ -63,10 +63,10 @@ API::
// * -1 for least significant digit first
int8_t digits_order;
-// Endian:
+// Endianness:
// * 1 for most significant byte first (big endian)
// * -1 for least significant byte first (little endian)
-int8_t endian;
+int8_t endianness;
} PyLongLayout;
PyAPI_FUNC(const PyLongLayout*) PyLong_GetNativeLayout(void);
@ -82,6 +82,11 @@ API::
Get the native layout of Python :class:`int` objects.
+The function must not be called before Python initialization nor after
+Python finalization. The returned layout is valid until Python is
+finalized. The layout is the same for all Python sub-interpreters and
+so it can be cached.
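
To make the layout fields concrete, here is a minimal sketch of how a consumer might interpret them. It does not include `Python.h`: `DemoLayout` is a hypothetical stand-in for `PyLongLayout`, the `uint8_t` types of `bits_per_digit` and `digit_size` are assumed, and both helper functions are invented for illustration. It also assumes 4-byte digits stored in native byte order, so `endianness` is carried in the struct but not exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyLongLayout so the sketch builds without
 * Python.h. Field names follow the diff; the type name and the uint8_t
 * field types are assumptions. */
typedef struct {
    uint8_t bits_per_digit;  /* value bits stored in each digit */
    uint8_t digit_size;      /* size of one digit, in bytes */
    int8_t digits_order;     /* 1: most significant digit first, -1: least */
    int8_t endianness;       /* 1: big endian, -1: little endian */
} DemoLayout;

/* Unused high bits in each digit ("nails" in GMP terms): the same
 * digit_size*8 - bits_per_digit expression passed to mpz_import(). */
static unsigned demo_nail_bits(const DemoLayout *layout)
{
    return (unsigned)layout->digit_size * 8 - layout->bits_per_digit;
}

/* Combine a short array of 4-byte digits (native byte order) into one
 * 64-bit value, honouring digits_order. Only valid while the total
 * number of value bits fits in 64. */
static uint64_t demo_digits_to_u64(const DemoLayout *layout,
                                   const uint32_t *digits, size_t ndigits)
{
    assert(layout->digit_size == sizeof(uint32_t));
    assert(ndigits * layout->bits_per_digit <= 64);

    uint64_t value = 0;
    for (size_t i = 0; i < ndigits; i++) {
        /* Visit digits from most significant to least significant. */
        size_t idx = (layout->digits_order == 1) ? i : ndigits - 1 - i;
        value = (value << layout->bits_per_digit) | digits[idx];
    }
    return value;
}
```

With 30 value bits in a 4-byte digit (typical of CPython builds, but an assumption here), `demo_nail_bits()` returns 2. Because the real layout stays valid until finalization, a caller could fetch it once and reuse the pointer for every conversion.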
Export API
----------
@ -106,6 +111,7 @@ Export a Python integer as a digits array::
} PyLongExport;
int PyLong_Export(PyObject *obj, PyLongExport *array);
+void PyLong_FreeExport(PyLongExport *array);
On CPython 3.14, no memory copy is needed, it's just a thin wrapper to
expose Python int internal digits array.
@ -215,21 +221,19 @@ API::
Discard the internal object and destroy the writer instance.
-Optimize small integers
-=======================
+Optimize import for small integers
+==================================
-Proposed API are efficient for large integers. Compared to accessing
-directly Python internals, the proposed API can have a significant
-performance overhead on small integers.
+Proposed import API is efficient for large integers. Compared to
+accessing directly Python internals, the proposed import API can have a
+significant performance overhead on small integers.
For small integers of a few digits (for example, 1 or 2 digits), existing APIs
-can be used
+can be used:
-* :external+py3.14:c:func:`PyLong_FromUInt64()` / :external+py3.14:c:func:`PyLong_AsUInt64()`;
-* :c:func:`PyLong_FromLong()` / :c:func:`PyLong_AsLong()` or :c:func:`PyLong_AsInt()`;
-* :external+py3.13:c:func:`PyUnstable_Long_IsCompact()` and
-:external+py3.13:c:func:`PyUnstable_Long_CompactValue()`;
-* :c:func:`PyLong_FromNativeBytes()` / :c:func:`PyLong_AsNativeBytes()`;
+* :external+py3.14:c:func:`PyLong_FromUInt64()`;
+* :c:func:`PyLong_FromLong()`;
+* :c:func:`PyLong_FromNativeBytes()`.
Implementation
@ -262,7 +266,7 @@ Code::
PyLong_Export(obj, &long_export);
if (long_export.digits) {
mpz_import(z, long_export.ndigits, layout->digits_order,
-layout->digit_size, layout->endian,
+layout->digit_size, layout->endianness,
layout->digit_size*8 - layout->bits_per_digit,
long_export.digits);
if (long_export.negative) {
@ -307,15 +311,15 @@ mode:
 +----------------+---------+-----------------------+
 | Benchmark      | ref     | pep757                |
 +================+=========+=======================+
-| 1<<7           | 94.3 ns | 96.8 ns: 1.03x slower |
+| 1<<7           | 91.3 ns | 89.9 ns: 1.02x faster |
 +----------------+---------+-----------------------+
-| 1<<38          | 127 ns  | 99.7 ns: 1.28x faster |
+| 1<<38          | 120 ns  | 94.9 ns: 1.27x faster |
 +----------------+---------+-----------------------+
-| 1<<300         | 209 ns  | 222 ns: 1.06x slower  |
+| 1<<300         | 196 ns  | 203 ns: 1.04x slower  |
 +----------------+---------+-----------------------+
-| 1<<3000        | 955 ns  | 963 ns: 1.01x slower  |
+| 1<<3000        | 939 ns  | 945 ns: 1.01x slower  |
 +----------------+---------+-----------------------+
-| Geometric mean | (ref)   | 1.04x faster          |
+| Geometric mean | (ref)   | 1.05x faster          |
 +----------------+---------+-----------------------+
@ -341,7 +345,7 @@ Code::
return NULL;
}
-mpz_export(digits, NULL, layout->endian,
+mpz_export(digits, NULL, layout->endianness,
layout->digit_size, layout->digits_order,
layout->digit_size*8 - layout->bits_per_digit,
obj->z);
@ -365,17 +369,17 @@ Benchmark:
Results on Linux Fedora 40 with CPU isolation, Python built in release
mode:
-+----------------+--------+----------------------+
-| Benchmark      | ref    | pep757               |
-+================+========+======================+
-| 1<<300         | 193 ns | 215 ns: 1.11x slower |
-+----------------+--------+----------------------+
-| 1<<3000        | 927 ns | 943 ns: 1.02x slower |
-+----------------+--------+----------------------+
-| Geometric mean | (ref)  | 1.03x slower         |
-+----------------+--------+----------------------+
++----------------+---------+-----------------------+
+| Benchmark      | ref     | pep757                |
++================+=========+=======================+
+| 1<<7           | 56.7 ns | 56.2 ns: 1.01x faster |
++----------------+---------+-----------------------+
+| 1<<300         | 191 ns  | 213 ns: 1.12x slower  |
++----------------+---------+-----------------------+
+| Geometric mean | (ref)   | 1.03x slower          |
++----------------+---------+-----------------------+
-Benchmark hidden because not significant (2): 1<<7, 1<<38.
+Benchmark hidden because not significant (2): 1<<38, 1<<3000.
Backwards Compatibility
@ -388,7 +392,7 @@ added.
Open Questions
==============
-* Should we add *digits_order* and *endian* members to :data:`sys.int_info`
+* Should we add *digits_order* and *endianness* members to :data:`sys.int_info`
and remove ``PyLong_GetNativeLayout()``? The
``PyLong_GetNativeLayout()`` function returns a C structure
which is more convenient to use in C than :data:`sys.int_info` which uses