python-peps/pep-0670.rst

482 lines
15 KiB
ReStructuredText
Raw Normal View History

PEP: 670
Title: Convert macros to functions in the Python C API
Author: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>,
Victor Stinner <vstinner@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Oct-2021
Python-Version: 3.11
Abstract
========
Convert macros to static inline functions or regular functions.
Remove the return value of macros having a return value, whereas they
should not, to aid detecting bugs in C extensions when the C API is
misused.
2021-10-19 19:21:46 -04:00
Some function arguments are still cast to ``PyObject*`` to prevent
emitting new compiler warnings.
Rationale
=========
The use of macros may have unintended adverse effects that are hard to
avoid, even for experienced C developers. Some issues have been known
for years, while others have been discovered recently in Python.
Working around macro pitfalls makes the macro coder harder to read and
to maintain.
2021-10-19 18:15:07 -04:00
Converting macros to functions has multiple advantages:
* By design, functions don't have macro pitfalls.
* Arguments type and return type are well defined.
* Debuggers and profilers can retrieve the name of inlined functions.
* Debuggers can put breakpoints on inlined functions.
* Variables have a well defined scope.
2021-10-19 18:15:07 -04:00
* Code is usually easier to read and to maintain than similar macro
code. Functions don't need the following workarounds for macro
pitfalls:
2021-10-19 18:15:07 -04:00
* Add parentheses around arguments.
* Use line continuation characters if the function is written on
multiple lines.
2021-10-19 18:15:07 -04:00
* Add commas to execute multiple expressions.
* Use ``do { ... } while (0)`` to write multiple statements.
Converting macros and static inline functions to regular functions makes
these regular functions accessible to projects which use Python but
cannot use macros and static inline functions.
Macro Pitfalls
==============
The `GCC documentation
<https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html>`_ lists several
common macro pitfalls:
2021-11-24 13:23:36 -05:00
- Misnesting
- Operator precedence problems
- Swallowing the semicolon
- Duplication of side effects
- Self-referential macros
- Argument prescan
- Newlines in arguments
Performance and inlining
========================
Static inline functions is a feature added to the C99 standard. Modern C
compilers have efficient heuristics to decide if a function should be
inlined or not.
When a C compiler decides to not inline, there is likely a good reason.
For example, inlining would reuse a register which requires to
save/restore the register value on the stack and so increases the stack
memory usage, or be less efficient.
Debug build
-----------
Benchmarks must not be run on a Python debug build, only on release
build. Moreover, using LTO and PGO optimizations is recommended for best
performances and reliable benchmarks. PGO helps the compiler to decide
if function should be inlined or not.
``./configure --with-pydebug`` uses the ``-Og`` compiler option if it's
2021-11-24 13:23:36 -05:00
supported by the compiler (GCC and LLVM Clang support it): optimize
debugging experience. Otherwise, the ``-O0`` compiler option is used:
disable most optimizations.
With GCC 11, ``gcc -Og`` can inline static inline functions, whereas
``gcc -O0`` does not inline static inline functions. Examples:
* Call ``Py_INCREF()`` in ``PyBool_FromLong()``:
* ``gcc -Og``: inlined
* ``gcc -O0``: not inlined, call ``Py_INCREF()`` function
* Call ``_PyErr_Occurred()`` in ``_Py_CheckFunctionResult()``:
* ``gcc -Og``: inlined
* ``gcc -O0``: not inlined, call ``_PyErr_Occurred()`` function
On Windows, when Python is built in debug mode by Visual Studio, static
inline functions are not inlined.
Force inlining
--------------
The ``Py_ALWAYS_INLINE`` macro can be used to force inlining. This macro
uses ``__attribute__((always_inline))`` with GCC and Clang, and
``__forceinline`` with MSC.
2021-11-24 13:23:36 -05:00
Previous attempts to use ``Py_ALWAYS_INLINE`` didn't show any benefit, and were
abandoned. See for example: `bpo-45094 <https://bugs.python.org/issue45094>`_:
"Consider using ``__forceinline`` and ``__attribute__((always_inline))`` on
static inline functions (``Py_INCREF``, ``Py_TYPE``) for debug build".
When the ``Py_INCREF()`` macro was converted to a static inline
functions in 2018 (`commit
<https://github.com/python/cpython/commit/2aaf0c12041bcaadd7f2cc5a54450eefd7a6ff12>`__),
it was decided not to force inlining. The machine code was analyzed with
multiple C compilers and compiler options: ``Py_INCREF()`` was always
inlined without having to force inlining. The only case where it was not
inlined was the debug build. See discussion in the `bpo-35059
<https://bugs.python.org/issue35059>`_: "Convert ``Py_INCREF()`` and
``PyObject_INIT()`` to inlined functions".
Disable inlining
----------------
On the other side, the ``Py_NO_INLINE`` macro can be used to disable
2021-11-24 13:23:36 -05:00
inlining. It can be used to reduce the stack memory usage, or to prevent
inlining on LTO+PGO builds, which are generally more aggressive to inline
code: see `bpo-33720 <https://bugs.python.org/issue33720>`_. The
``Py_NO_INLINE`` macro uses ``__attribute__ ((noinline))`` with GCC and
Clang, and ``__declspec(noinline)`` with MSC.
Specification
=============
Convert macros to static inline functions
-----------------------------------------
Most macros should be converted to static inline functions to prevent
`macro pitfalls`_.
The following macros should not be converted:
* Empty macros. Example: ``#define Py_HAVE_CONDVAR``.
* Macros only defining a number, even if a constant with a well defined
type can better. Example: ``#define METH_VARARGS 0x0001``.
* Compatibility layer for different C compilers, C language extensions,
or recent C features.
Example: ``#define Py_ALWAYS_INLINE __attribute__((always_inline))``.
2021-11-24 13:23:36 -05:00
* Macros that need the stringification or concatenation feature of the C preprocessor.
Convert static inline functions to regular functions
----------------------------------------------------
The performance impact of converting static inline functions to regular
functions should be measured with benchmarks. If there is a significant
slowdown, there should be a good reason to do the conversion. One reason
can be hiding implementation details.
To avoid any risk of performance slowdown on Python built without LTO,
it is possible to keep a private static inline function in the internal
C API and use it in Python, but expose a regular function in the public
C API.
Using static inline functions in the internal C API is fine: the
2021-11-15 10:36:12 -05:00
internal C API exposes implementation details by design and should not be
used outside Python.
Cast to PyObject*
-----------------
2021-10-19 18:15:07 -04:00
When a macro is converted to a function and the macro casts its
arguments to ``PyObject*``, the new function comes with a new macro
which cast arguments to ``PyObject*`` to prevent emitting new compiler
2021-11-24 13:23:36 -05:00
warnings. This implies that a converted function will accept pointers to
structures inheriting from ``PyObject`` (ex: ``PyTupleObject``).
For example, the ``Py_TYPE(obj)`` macro casts its ``obj`` argument to
2021-10-19 18:15:07 -04:00
``PyObject*``::
2021-10-19 18:15:07 -04:00
#define _PyObject_CAST_CONST(op) ((const PyObject*)(op))
static inline PyTypeObject* _Py_TYPE(const PyObject *ob) {
return ob->ob_type;
}
#define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST_CONST(ob))
The undocumented private ``_Py_TYPE()`` function must not be called
directly. Only the documented public ``Py_TYPE()`` macro must be used.
Later, the cast can be removed on a case by case basis, but that is out
of scope for this PEP.
Remove the return value
-----------------------
2021-10-19 18:15:07 -04:00
When a macro is implemented as an expression, it has an implicit return
2021-11-24 13:23:36 -05:00
value. This macro pitfall can be misused in third party C extensions. See
`bpo-30459 <https://bugs.python.org/issue30459>`_ regarding the misuse of the
``PyList_SET_ITEM()`` and ``PyCell_SET()`` macros. Such pitfalls are hard to
catch while reviewing macro code. Removing the return value aids detecting
bugs in C extensions when the C API is misused.
Backwards Compatibility
=======================
Removing the return value of macros is an incompatible API change made
on purpose: see the `Remove the return value`_ section.
Rejected Ideas
==============
Keep macros, but fix some macro issues
--------------------------------------
2021-10-19 18:15:07 -04:00
Converting macros to functions is not needed to `remove the return
value`_: casting a macro return value to ``void`` also fix the issue.
For example, the ``PyList_SET_ITEM()`` macro was already fixed like
that.
Macros are always "inlined" with any C compiler.
The duplication of side effects can be worked around in the caller of
the macro.
People using macros should be considered "consenting adults". People who
feel unsafe with macros should simply not use them.
2021-11-24 13:23:36 -05:00
These ideas are rejected because macros _are_ error prone, and it is too easy
to miss a macro pitfall when writing and reviewing macro code. Moreover, macros
are harder to read and maintain than functions.
2021-10-19 19:21:46 -04:00
Examples of hard to read macros
===============================
PyObject_INIT()
---------------
Example showing the usage of commas in a macro which has a return value.
Python 3.7 macro::
#define PyObject_INIT(op, typeobj) \
( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), (op) )
Python 3.8 function (simplified code)::
static inline PyObject*
_PyObject_INIT(PyObject *op, PyTypeObject *typeobj)
{
Py_TYPE(op) = typeobj;
_Py_NewReference(op);
return op;
}
#define PyObject_INIT(op, typeobj) \
_PyObject_INIT(_PyObject_CAST(op), (typeobj))
* The function doesn't need the line continuation character ``"\"``.
* It has an explicit ``"return op;"`` rather than the surprising
``", (op)"`` syntax at the end of the macro.
* It uses short statements on multiple lines, rather than being written
as a single long line.
* Inside the function, the *op* argument has the well defined type
``PyObject*`` and so doesn't need casts like ``(PyObject *)(op)``.
* Arguments don't need to be put inside parenthesis: use ``typeobj``,
rather than ``(typeobj)``.
_Py_NewReference()
------------------
Example showing the usage of an ``#ifdef`` inside a macro.
Python 3.7 macro (simplified code)::
#ifdef COUNT_ALLOCS
# define _Py_INC_TPALLOCS(OP) inc_count(Py_TYPE(OP))
# define _Py_COUNT_ALLOCS_COMMA ,
#else
# define _Py_INC_TPALLOCS(OP)
# define _Py_COUNT_ALLOCS_COMMA
#endif /* COUNT_ALLOCS */
#define _Py_NewReference(op) ( \
_Py_INC_TPALLOCS(op) _Py_COUNT_ALLOCS_COMMA \
Py_REFCNT(op) = 1)
Python 3.8 function (simplified code)::
static inline void _Py_NewReference(PyObject *op)
{
_Py_INC_TPALLOCS(op);
Py_REFCNT(op) = 1;
}
2021-11-24 13:23:36 -05:00
PyUnicode_READ_CHAR()
---------------------
This macro reuses arguments, and possibly calls ``PyUnicode_KIND`` multiple
times::
#define PyUnicode_READ_CHAR(unicode, index) \
(assert(PyUnicode_Check(unicode)), \
assert(PyUnicode_IS_READY(unicode)), \
(Py_UCS4) \
(PyUnicode_KIND((unicode)) == PyUnicode_1BYTE_KIND ? \
((const Py_UCS1 *)(PyUnicode_DATA((unicode))))[(index)] : \
(PyUnicode_KIND((unicode)) == PyUnicode_2BYTE_KIND ? \
((const Py_UCS2 *)(PyUnicode_DATA((unicode))))[(index)] : \
((const Py_UCS4 *)(PyUnicode_DATA((unicode))))[(index)] \
) \
))
Possible implementation as a static inlined function::
static inline Py_UCS4
PyUnicode_READ_CHAR(PyObject *unicode, Py_ssize_t index)
{
assert(PyUnicode_Check(unicode));
assert(PyUnicode_IS_READY(unicode));
switch (PyUnicode_KIND(unicode)) {
case PyUnicode_1BYTE_KIND:
return (Py_UCS4)((const Py_UCS1 *)(PyUnicode_DATA(unicode)))[index];
case PyUnicode_2BYTE_KIND:
return (Py_UCS4)((const Py_UCS2 *)(PyUnicode_DATA(unicode)))[index];
case PyUnicode_4BYTE_KIND:
default:
return (Py_UCS4)((const Py_UCS4 *)(PyUnicode_DATA(unicode)))[index];
}
}
Macros converted to functions since Python 3.8
==============================================
Macros converted to static inline functions
-------------------------------------------
Python 3.8:
* ``Py_DECREF()``
* ``Py_INCREF()``
* ``Py_XDECREF()``
* ``Py_XINCREF()``
* ``PyObject_INIT()``
* ``PyObject_INIT_VAR()``
* ``_PyObject_GC_UNTRACK()``
* ``_Py_Dealloc()``
Python 3.10:
* ``Py_REFCNT()``
Python 3.11:
* ``Py_TYPE()``
* ``Py_SIZE()``
Macros converted to regular functions
-------------------------------------
Python 3.9:
* ``PyIndex_Check()``
* ``PyObject_CheckBuffer()``
* ``PyObject_GET_WEAKREFS_LISTPTR()``
* ``PyObject_IS_GC()``
* ``PyObject_NEW()``: alias to ``PyObject_New()``
* ``PyObject_NEW_VAR()``: alias to ``PyObjectVar_New()``
To avoid any risk of performance slowdown on Python built without LTO,
private static inline functions have been added to the internal C API:
* ``_PyIndex_Check()``
* ``_PyObject_IS_GC()``
* ``_PyType_HasFeature()``
* ``_PyType_IS_GC()``
Static inline functions converted to regular functions
-------------------------------------------------------
Python 3.11:
* ``PyObject_CallOneArg()``
* ``PyObject_Vectorcall()``
* ``PyVectorcall_Function()``
* ``_PyObject_FastCall()``
To avoid any risk of performance slowdown on Python built without LTO, a
private static inline function has been added to the internal C API:
* ``_PyVectorcall_FunctionInline()``
Benchmarks
==========
Benchmarks run on Fedora 35 (Linux) with GCC 11 on a laptop with 8
logical CPUs (4 physical CPU cores).
gcc -O0 versus gcc -Og
----------------------
Benchmark of the ``./python -m test -j10`` command on a Python debug
build:
* ``gcc -Og``: 220 sec ± 3 sec
* ``gcc -O0``: 360 sec ± 6 sec
Python built with ``gcc -O0`` is **1.6x slower** than Python built with
``gcc -Og``.
Replace macros with static inline functions
-------------------------------------------
The `PR 29728 <https://github.com/python/cpython/pull/29728>`_ replaces
existing the following static inline functions with macros:
* ``PyObject_TypeCheck()``
* ``PyType_Check()``, ``PyType_CheckExact()``
* ``PyType_HasFeature()``
* ``PyVectorcall_NARGS()``
* ``Py_DECREF()``, ``Py_XDECREF()``
* ``Py_INCREF()``, ``Py_XINCREF()``
* ``Py_IS_TYPE()``
* ``Py_NewRef()``
* ``Py_REFCNT()``, ``Py_TYPE()``, ``Py_SIZE()``
Benchmark of the ``./python -m test -j10`` command on a Python debug
build:
* Macros (PR 29728), ``gcc -O0``: 345 sec ± 5 sec
* Static inline functions (reference), ``gcc -O0``: 360 sec ± 6 sec
Replacing macros with static inline functions makes Python
**1.04x slower** when the compiler **does not inline** static inline
functions.
References
==========
* `bpo-45490 <https://bugs.python.org/issue45490>`_:
[meta][C API] Avoid C macro pitfalls and usage of static inline
functions (October 2021).
* `What to do with unsafe macros
<https://discuss.python.org/t/what-to-do-with-unsafe-macros/7771>`_
(March 2021).
* `bpo-43502 <https://bugs.python.org/issue43502>`_:
[C-API] Convert obvious unsafe macros to static inline functions
(March 2021).
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.