2020-06-21 21:01:17 -04:00
|
|
|
PEP: 620
|
|
|
|
Title: Hide implementation details from the C API
|
2020-06-21 22:10:12 -04:00
|
|
|
Author: Victor Stinner <vstinner@python.org>
|
2020-06-21 21:01:17 -04:00
|
|
|
Status: Draft
|
|
|
|
Type: Standards Track
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
Created: 19-June-2020
|
|
|
|
Python-Version: 3.10
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
========
|
|
|
|
|
2020-06-21 22:04:07 -04:00
|
|
|
Introduce C API incompatible changes to hide implementation details.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
Once most implementation details will be hidden, evolution of CPython
|
|
|
|
internals would be less limited by C API backward compatibility issues.
|
|
|
|
It will be way easier to add new features.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 07:06:55 -04:00
|
|
|
It becomes possible to experiment with more advanced optimizations in
|
|
|
|
CPython than just micro-optimizations, like tagged pointers.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Define a process to reduce the number of broken C extensions.
|
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
The implementation of this PEP is expected to be done carefully over
|
2020-06-21 21:01:17 -04:00
|
|
|
multiple Python versions. It already started in Python 3.7 and most
|
2020-06-22 07:01:12 -04:00
|
|
|
changes are already completed. The `Process to reduce the number of
|
2020-06-22 07:06:55 -04:00
|
|
|
broken C extensions`_ dictates the rhythm.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
|
|
|
|
Motivation
|
|
|
|
==========
|
|
|
|
|
|
|
|
The C API blocks CPython evolutions
|
|
|
|
-----------------------------------
|
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
Adding or removing members of C structures is causing multiple backward
|
|
|
|
compatibility issues.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Adding a new member breaks the stable ABI (PEP 384), especially for
|
2020-06-21 22:04:07 -04:00
|
|
|
types declared statically (e.g. ``static PyTypeObject MyType =
|
2020-06-22 04:09:57 -04:00
|
|
|
{...};``). In Python 3.4, the PEP 442 "Safe object finalization" added
|
2020-06-22 07:01:12 -04:00
|
|
|
the ``tp_finalize`` member at the end of the ``PyTypeObject`` structure.
|
|
|
|
For ABI backward compatibility, a new ``Py_TPFLAGS_HAVE_FINALIZE`` type
|
|
|
|
flag was required to announce if the type structure contains the
|
2020-06-21 21:01:17 -04:00
|
|
|
``tp_finalize`` member. The flag was removed in Python 3.8 (`bpo-32388
|
|
|
|
<https://bugs.python.org/issue32388>`_).
|
|
|
|
|
|
|
|
The ``PyTypeObject.tp_print`` member, deprecated since Python 3.0
|
|
|
|
released in 2009, has been removed in the Python 3.8 development cycle.
|
|
|
|
But the change broke too many C extensions and had to be reverted before
|
|
|
|
3.8 final release. Finally, the member was removed again in Python 3.9.
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
C extensions rely on the ability to access directly structure members,
|
2020-06-22 04:01:09 -04:00
|
|
|
indirectly through the C API, or even directly. Modifying structures
|
|
|
|
like ``PyListObject`` cannot be even considered.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
The ``PyTypeObject`` structure is the one which evolved the most, simply
|
|
|
|
because there was no other way to evolve CPython than modifying it.
|
|
|
|
|
2020-06-22 04:01:09 -04:00
|
|
|
In the C API, all Python objects are passed as ``PyObject*``: a pointer
|
|
|
|
to a ``PyObject`` structure. Experimenting tagged pointers in CPython is
|
|
|
|
blocked by the fact that a C extension can technically dereference a
|
|
|
|
``PyObject*`` pointer and access ``PyObject`` members. Small "objects"
|
|
|
|
can be stored as a tagged pointer with no concrete ``PyObject``
|
|
|
|
structure.
|
|
|
|
|
|
|
|
Replacing Python garbage collector with a tracing garbage collector
|
|
|
|
would also need to remove ``PyObject.ob_refcnt`` reference counter,
|
|
|
|
whereas currently ``Py_INCREF()`` and ``Py_DECREF()`` macros access
|
|
|
|
directly to ``PyObject.ob_refcnt``.
|
|
|
|
|
2020-06-21 21:01:17 -04:00
|
|
|
Same CPython design since 1990: structures and reference counting
|
|
|
|
-----------------------------------------------------------------
|
|
|
|
|
|
|
|
When the CPython project was created, it was written with one principle:
|
|
|
|
keep the implementation simple enough so it can be maintained by a
|
|
|
|
single developer. CPython complexity grew a lot and many
|
|
|
|
micro-optimizations have been implemented, but CPython core design has
|
|
|
|
not changed.
|
|
|
|
|
|
|
|
Members of ``PyObject`` and ``PyTupleObject`` structures have not
|
|
|
|
changed since the "Initial revision" commit (1990)::
|
|
|
|
|
|
|
|
#define OB_HEAD \
|
|
|
|
unsigned int ob_refcnt; \
|
|
|
|
struct _typeobject *ob_type;
|
|
|
|
|
|
|
|
typedef struct _object {
|
|
|
|
OB_HEAD
|
|
|
|
} object;
|
|
|
|
|
|
|
|
typedef struct {
|
2020-06-22 04:01:09 -04:00
|
|
|
OB_VARHEAD
|
|
|
|
object *ob_item[1];
|
2020-06-21 21:01:17 -04:00
|
|
|
} tupleobject;
|
|
|
|
|
|
|
|
Only names changed: ``object`` was renamed to ``PyObject`` and
|
|
|
|
``tupleobject`` was renamed to ``PyTupleObject``.
|
|
|
|
|
|
|
|
CPython still tracks Python objects lifetime using reference counting
|
|
|
|
internally and for third party C extensions (through the Python C API).
|
|
|
|
|
|
|
|
All Python objects must be allocated on the heap and cannot be moved.
|
|
|
|
|
|
|
|
Why is PyPy more efficient than CPython?
|
|
|
|
----------------------------------------
|
|
|
|
|
|
|
|
The PyPy project is a Python implementation which is 4.2x faster than
|
2020-06-22 07:01:12 -04:00
|
|
|
CPython on average. PyPy developers chose to not fork CPython, but start
|
2020-06-21 21:01:17 -04:00
|
|
|
from scratch to have more freedom in terms of optimization choices.
|
|
|
|
|
|
|
|
PyPy does not use reference counting, but a tracing garbage collector
|
2020-06-22 04:01:09 -04:00
|
|
|
which moves objects. Objects can be allocated on the stack (or even not
|
|
|
|
at all), rather than always having to be allocated on the heap.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Objects layouts are designed with performance in mind. For example, a
|
|
|
|
list strategy stores integers directly as integers, rather than objects.
|
|
|
|
|
|
|
|
Moreover, PyPy also has a JIT compiler which emits fast code thanks to
|
|
|
|
the efficient PyPy design.
|
|
|
|
|
|
|
|
PyPy bottleneck: the Python C API
|
|
|
|
---------------------------------
|
|
|
|
|
|
|
|
While PyPy is way more efficient than CPython to run pure Python code,
|
|
|
|
it is as efficient or slower than CPython to run C extensions.
|
|
|
|
|
|
|
|
Since the C API requires ``PyObject*`` and allows to access directly
|
|
|
|
structure members, PyPy has to associate a CPython object to PyPy
|
|
|
|
objects and maintain both consistent. Converting a PyPy object to a
|
|
|
|
CPython object is inefficient. Moreover, reference counting also has to
|
2020-06-22 07:01:12 -04:00
|
|
|
be implemented on top of PyPy tracing garbage collector.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
These conversions are required because the Python C API is too close to
|
|
|
|
the CPython implementation: there is no high-level abstraction.
|
2020-06-21 22:04:07 -04:00
|
|
|
For example, structures members are part of the public C API and nothing
|
2020-06-21 21:01:17 -04:00
|
|
|
prevents a C extension to get or set directly
|
|
|
|
``PyTupleObject.ob_item[0]`` (the first item of a tuple).
|
|
|
|
|
|
|
|
See `Inside cpyext: Why emulating CPython C API is so Hard
|
|
|
|
<https://morepypy.blogspot.com/2018/09/inside-cpyext-why-emulating-cpython-c.html>`_
|
|
|
|
(Sept 2018) by Antonio Cuni for more details.
|
|
|
|
|
|
|
|
|
|
|
|
Rationale
|
|
|
|
=========
|
|
|
|
|
|
|
|
Hide implementation details
|
|
|
|
---------------------------
|
|
|
|
|
|
|
|
Hiding implementation details from the C API has multiple advantages:
|
|
|
|
|
2020-06-22 07:06:55 -04:00
|
|
|
* It becomes possible to experiment with more advanced optimizations in
|
|
|
|
CPython than just micro-optimizations. For example, tagged pointers,
|
|
|
|
and replace the garbage collector with a tracing garbage collector
|
|
|
|
which can move objects.
|
2020-06-21 21:01:17 -04:00
|
|
|
* Adding new features in CPython becomes easier.
|
|
|
|
* PyPy should be able to avoid conversions to CPython objects in more
|
|
|
|
cases: keep efficient PyPy objects.
|
|
|
|
* It becomes easier to implement the C API for a new Python
|
|
|
|
implementation.
|
|
|
|
* More C extensions will be compatible with Python implementations other
|
|
|
|
than CPython.
|
|
|
|
|
|
|
|
Relationship with the limited C API
|
|
|
|
-----------------------------------
|
|
|
|
|
|
|
|
The PEP 384 "Defining a Stable ABI" is in Python 3.4. It introduces the
|
|
|
|
"limited C API": a subset of the C API. When the limited C API is used,
|
2020-06-22 07:01:12 -04:00
|
|
|
it becomes possible to build a C extension only once and use it on
|
2020-06-21 21:01:17 -04:00
|
|
|
multiple Python versions: that's the stable ABI.
|
|
|
|
|
|
|
|
The main limitation of the PEP 384 is that C extensions have to opt-in
|
|
|
|
for the limited C API. Only very few projects made this choice,
|
|
|
|
usually to ease distribution of binaries, especially on Windows.
|
|
|
|
|
|
|
|
This PEP moves the C API towards the limited C API.
|
|
|
|
|
|
|
|
Ideally, the C API will become the limited C API and all C extensions
|
|
|
|
will use the stable ABI, but this is out of this PEP scope.
|
|
|
|
|
|
|
|
|
|
|
|
Specification
|
|
|
|
=============
|
|
|
|
|
|
|
|
Summary
|
|
|
|
-------
|
|
|
|
|
|
|
|
* (**Completed**) Reorganize the C API header files: create ``Include/cpython/`` and
|
|
|
|
``Include/internal/`` subdirectories.
|
|
|
|
* (**Completed**) Move private functions exposing implementation details to the internal
|
|
|
|
C API.
|
|
|
|
* (**Completed**) Convert macros to static inline functions.
|
|
|
|
* (**Completed**) Add new functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
|
|
|
|
``Py_SET_SIZE()``. The ``Py_TYPE()``, ``Py_REFCNT()`` and
|
|
|
|
``Py_SIZE()`` macros become functions which cannot be used as l-value.
|
|
|
|
* (**Completed**) New C API functions must not return borrowed
|
|
|
|
references.
|
|
|
|
* (**In Progress**) Provide ``pythoncapi_compat.h`` header file.
|
|
|
|
* (**In Progress**) Make structures opaque, add getter and setter
|
|
|
|
functions.
|
|
|
|
* (**Not Started**) Deprecate ``PySequence_Fast_ITEMS()``.
|
|
|
|
* (**Not Started**) Convert ``PyTuple_GET_ITEM()`` and
|
|
|
|
``PyList_GET_ITEM()`` macros to static inline functions.
|
|
|
|
|
|
|
|
Reorganize the C API header files
|
|
|
|
---------------------------------
|
|
|
|
|
|
|
|
The first consumer of the C API was Python itself. There is no clear
|
|
|
|
separation between APIs which must not be used outside Python, and API
|
|
|
|
which are public on purpose.
|
|
|
|
|
|
|
|
Header files must be reorganized in 3 API:
|
|
|
|
|
|
|
|
* ``Include/`` directory is the limited C API: no implementation
|
|
|
|
details, structures are opaque. C extensions using it get a stable
|
|
|
|
ABI.
|
|
|
|
* ``Include/cpython/`` directory is the CPython C API: less "portable"
|
|
|
|
API, depends more on the Python version, expose some implementation
|
|
|
|
details, few incompatible changes can happen.
|
2020-06-22 12:17:18 -04:00
|
|
|
* ``Include/internal/`` directory is the internal C API: implementation
|
2020-06-21 21:01:17 -04:00
|
|
|
details, incompatible changes are likely at each Python release.
|
|
|
|
|
|
|
|
The creation of the ``Include/cpython/`` directory is fully backward
|
|
|
|
compatible. ``Include/cpython/`` header files cannot be included
|
|
|
|
directly and are included automatically by ``Include/`` header files
|
|
|
|
when the ``Py_LIMITED_API`` macro is not defined.
|
|
|
|
|
|
|
|
The internal C API is installed and can be used for specific usage like
|
|
|
|
debuggers and profilers which must access structures members without
|
|
|
|
executing code. C extensions using the internal C API are tightly
|
|
|
|
coupled to a Python version and must be recompiled at each Python
|
|
|
|
version.
|
|
|
|
|
|
|
|
**STATUS**: Completed (in Python 3.8)
|
|
|
|
|
|
|
|
The reorganization of header files started in Python 3.7 and was
|
|
|
|
completed in Python 3.8:
|
|
|
|
|
|
|
|
* `bpo-35134 <https://bugs.python.org/issue35134>`_: Add a new
|
|
|
|
Include/cpython/ subdirectory for the "CPython API" with
|
|
|
|
implementation details.
|
|
|
|
* `bpo-35081 <https://bugs.python.org/issue35081>`_: Move internal
|
|
|
|
headers to ``Include/internal/``
|
|
|
|
|
|
|
|
Move private functions to the internal C API
|
|
|
|
--------------------------------------------
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
Private functions which expose implementation details must be moved to
|
2020-06-21 21:01:17 -04:00
|
|
|
the internal C API.
|
|
|
|
|
|
|
|
If a C extension relies on a CPython private function which exposes
|
|
|
|
CPython implementation details, other Python implementations have to
|
|
|
|
re-implement this private function to support this C extension.
|
|
|
|
|
|
|
|
**STATUS**: Completed (in Python 3.9)
|
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
Private functions moved to the internal C API in Python 3.8:
|
|
|
|
|
|
|
|
* ``_PyObject_GC_TRACK()``, ``_PyObject_GC_UNTRACK()``
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
Macros and functions excluded from the limited C API in Python 3.9:
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
* ``_PyObject_SIZE()``, ``_PyObject_VAR_SIZE()``
|
|
|
|
* ``PyThreadState_DeleteCurrent()``
|
|
|
|
* ``PyFPE_START_PROTECT()``, ``PyFPE_END_PROTECT()``
|
|
|
|
* ``_Py_NewReference()``, ``_Py_ForgetReference()``
|
|
|
|
* ``_PyTraceMalloc_NewReference()``
|
|
|
|
* ``_Py_GetRefTotal()``
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
Private functions moved to the internal C API in Python 3.9:
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
* GC functions like ``_Py_AS_GC()``, ``_PyObject_GC_IS_TRACKED()``
|
|
|
|
and ``_PyGCHead_NEXT()``
|
|
|
|
* ``_Py_AddToAllObjects()`` (not exported)
|
|
|
|
* ``_PyDebug_PrintTotalRefs()``, ``_Py_PrintReferences()``,
|
|
|
|
``_Py_PrintReferenceAddresses()`` (not exported)
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 12:17:18 -04:00
|
|
|
Public "clear free list" functions moved to the internal C API and
|
|
|
|
renamed to private functions in Python 3.9:
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 04:09:57 -04:00
|
|
|
* ``PyAsyncGen_ClearFreeLists()``
|
|
|
|
* ``PyContext_ClearFreeList()``
|
|
|
|
* ``PyDict_ClearFreeList()``
|
|
|
|
* ``PyFloat_ClearFreeList()``
|
|
|
|
* ``PyFrame_ClearFreeList()``
|
|
|
|
* ``PyList_ClearFreeList()``
|
|
|
|
* ``PyTuple_ClearFreeList()``
|
|
|
|
* Functions simply removed:
|
|
|
|
|
|
|
|
* ``PyMethod_ClearFreeList()`` and ``PyCFunction_ClearFreeList()``:
|
|
|
|
bound method free list removed in Python 3.9.
|
|
|
|
* ``PySet_ClearFreeList()``: set free list removed in Python 3.4.
|
|
|
|
* ``PyUnicode_ClearFreeList()``: Unicode free list removed
|
|
|
|
in Python 3.3.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Convert macros to static inline functions
|
|
|
|
-----------------------------------------
|
|
|
|
|
2020-06-22 12:17:18 -04:00
|
|
|
Converting macros to static inline functions has multiple advantages:
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
* Functions have well defined parameter types and return type.
|
|
|
|
* Functions can use variables with a well defined scope (the function).
|
|
|
|
* Debugger can be put breakpoints on functions and profilers can display
|
|
|
|
the function name in the call stacks. In most cases, it works even
|
|
|
|
when a static inline function is inlined.
|
|
|
|
* Functions don't have `macros pitfalls
|
|
|
|
<https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html>`_.
|
|
|
|
|
|
|
|
Converting macros to static inline functions should only impact very few
|
2020-06-22 12:17:18 -04:00
|
|
|
C extensions that use macros in unusual ways.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
For backward compatibility, functions must continue to accept any type,
|
|
|
|
not only ``PyObject*``, to avoid compiler warnings, since most macros
|
|
|
|
cast their parameters to ``PyObject*``.
|
|
|
|
|
|
|
|
Python 3.6 requires C compilers to support static inline functions: the
|
|
|
|
PEP 7 requires a subset of C99.
|
|
|
|
|
2020-06-22 04:01:09 -04:00
|
|
|
**STATUS**: Completed (in Python 3.9)
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Macros converted to static inline functions in Python 3.8:
|
|
|
|
|
|
|
|
* ``Py_INCREF()``, ``Py_DECREF()``
|
|
|
|
* ``Py_XINCREF()``, ``Py_XDECREF()``
|
|
|
|
* ``PyObject_INIT()``, ``PyObject_INIT_VAR()``
|
|
|
|
* ``_PyObject_GC_TRACK()``, ``_PyObject_GC_UNTRACK()``, ``_Py_Dealloc()``
|
|
|
|
|
|
|
|
Macros converted to regular functions in Python 3.9:
|
|
|
|
|
|
|
|
* ``Py_EnterRecursiveCall()``, ``Py_LeaveRecursiveCall()``
|
2020-06-22 04:09:57 -04:00
|
|
|
(added to the limited C API)
|
2020-06-21 21:01:17 -04:00
|
|
|
* ``PyObject_INIT()``, ``PyObject_INIT_VAR()``
|
2020-06-22 04:09:57 -04:00
|
|
|
* ``PyObject_GET_WEAKREFS_LISTPTR()``
|
|
|
|
* ``PyObject_CheckBuffer()``
|
|
|
|
* ``PyIndex_Check()``
|
|
|
|
* ``PyObject_IS_GC()``
|
|
|
|
* ``PyObject_NEW()`` (alias to ``PyObject_New()``),
|
|
|
|
``PyObject_NEW_VAR()`` (alias to ``PyObject_NewVar()``)
|
|
|
|
* ``PyType_HasFeature()`` (always call ``PyType_GetFlags()``)
|
|
|
|
* ``Py_TRASHCAN_BEGIN_CONDITION()`` and ``Py_TRASHCAN_END()`` macros
|
|
|
|
now call functions which hide implementation details, rather than
|
|
|
|
accessing directly members of the ``PyThreadState`` structure.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Make structures opaque
|
|
|
|
----------------------
|
|
|
|
|
|
|
|
All structures of the C API should become opaque: C extensions must
|
|
|
|
use getter or setter functions to get or set structure members. For
|
|
|
|
example, ``tuple->ob_item[0]`` must be replaced with
|
|
|
|
``PyTuple_GET_ITEM(tuple, 0)``.
|
|
|
|
|
|
|
|
To be able to move away from reference counting, ``PyObject`` must
|
|
|
|
become opaque. Currently, the reference counter ``PyObject.ob_refcnt``
|
|
|
|
is exposed in the C API. All structures must become opaque, since they
|
|
|
|
"inherit" from PyObject. For, ``PyFloatObject`` inherits from
|
|
|
|
``PyObject``::
|
|
|
|
|
|
|
|
typedef struct {
|
|
|
|
PyObject ob_base;
|
|
|
|
double ob_fval;
|
|
|
|
} PyFloatObject;
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
Making ``PyObject`` fully opaque requires converting ``Py_INCREF()`` and
|
2020-06-21 21:01:17 -04:00
|
|
|
``Py_DECREF()`` macros to function calls. This change has an impact on
|
2020-06-22 07:06:55 -04:00
|
|
|
performance. It is likely to be one of the very last changes when making
|
2020-06-21 21:01:17 -04:00
|
|
|
structures opaque.
|
|
|
|
|
|
|
|
Making ``PyTypeObject`` structure opaque breaks C extensions declaring
|
|
|
|
types statically (e.g. ``static PyTypeObject MyType = {...};``). C
|
|
|
|
extensions must use ``PyType_FromSpec()`` to allocate types on the heap
|
2020-06-22 07:01:12 -04:00
|
|
|
instead. Using heap types has other advantages like being compatible
|
2020-06-21 21:01:17 -04:00
|
|
|
with subinterpreters. Combined with PEP 489 "Multi-phase extension
|
|
|
|
module initialization", it makes a C extension behavior closer to a
|
|
|
|
Python module, like allowing to create more than one module instance.
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
Making ``PyThreadState`` structure opaque requires adding getter and
|
2020-06-21 21:01:17 -04:00
|
|
|
setter functions for members used by C extensions.
|
|
|
|
|
|
|
|
**STATUS**: In Progress (started in Python 3.8)
|
|
|
|
|
|
|
|
The ``PyInterpreterState`` structure was made opaque in Python 3.8
|
|
|
|
(`bpo-35886 <https://bugs.python.org/issue35886>`_) and the
|
|
|
|
``PyGC_Head`` structure (`bpo-40241
|
|
|
|
<https://bugs.python.org/issue40241>`_) was made opaque in Python 3.9.
|
|
|
|
|
|
|
|
Issues tracking the work to prepare the C API to make following
|
|
|
|
structures opaque:
|
|
|
|
|
|
|
|
* ``PyObject``: `bpo-39573 <https://bugs.python.org/issue39573>`_
|
|
|
|
* ``PyTypeObject``: `bpo-40170 <https://bugs.python.org/issue40170>`_
|
2020-06-22 04:09:57 -04:00
|
|
|
* ``PyFrameObject``: `bpo-40421 <https://bugs.python.org/issue40421>`_
|
|
|
|
|
|
|
|
* Python 3.9 adds ``PyFrame_GetCode()`` and ``PyFrame_GetBack()``
|
|
|
|
getter functions, and moves ``PyFrame_GetLineNumber`` to the limited
|
|
|
|
C API.
|
|
|
|
|
|
|
|
* ``PyThreadState``: `bpo-39947 <https://bugs.python.org/issue39947>`_
|
|
|
|
|
|
|
|
* Python 3.9 adds 3 getter functions: ``PyThreadState_GetFrame()``,
|
|
|
|
``PyThreadState_GetID()``, ``PyThreadState_GetInterpreter()``.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
Disallow using Py_TYPE() as l-value
|
|
|
|
-----------------------------------
|
|
|
|
|
|
|
|
The ``Py_TYPE()`` function gets an object type, its ``PyObject.ob_type``
|
|
|
|
member. It is implemented as a macro which can be used as an l-value to
|
|
|
|
set the type: ``Py_TYPE(obj) = new_type``. This code relies on the
|
|
|
|
assumption that ``PyObject.ob_type`` can be modified directly. It
|
2020-06-22 07:01:12 -04:00
|
|
|
prevents making the ``PyObject`` structure opaque.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
New setter functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
|
|
|
|
``Py_SET_SIZE()`` are added and must be used instead.
|
|
|
|
|
|
|
|
The ``Py_TYPE()``, ``Py_REFCNT()`` and ``Py_SIZE()`` macros must be
|
|
|
|
converted to static inline functions which can not be used as l-value.
|
|
|
|
|
|
|
|
For example, the ``Py_TYPE()`` macro::
|
|
|
|
|
|
|
|
#define Py_TYPE(ob) (((PyObject*)(ob))->ob_type)
|
|
|
|
|
|
|
|
becomes::
|
|
|
|
|
|
|
|
#define _PyObject_CAST_CONST(op) ((const PyObject*)(op))
|
|
|
|
|
|
|
|
static inline PyTypeObject* _Py_TYPE(const PyObject *ob) {
|
|
|
|
return ob->ob_type;
|
|
|
|
}
|
|
|
|
|
|
|
|
#define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST_CONST(ob))
|
|
|
|
|
|
|
|
**STATUS**: Completed (in Python 3.10)
|
|
|
|
|
|
|
|
New functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
|
|
|
|
``Py_SET_SIZE()`` were added to Python 3.9.
|
|
|
|
|
|
|
|
In Python 3.10, ``Py_TYPE()``, ``Py_REFCNT()`` and ``Py_SIZE()`` can no
|
|
|
|
longer be used as l-value and the new setter functions must be used
|
|
|
|
instead.
|
|
|
|
|
|
|
|
New C API functions must not return borrowed references
|
|
|
|
-------------------------------------------------------
|
|
|
|
|
|
|
|
When a function returns a borrowed reference, Python cannot track when
|
|
|
|
the caller stops using this reference.
|
|
|
|
|
|
|
|
For example, if the Python ``list`` type is specialized for small
|
|
|
|
integers, store directly "raw" numbers rather than Python objects,
|
|
|
|
``PyList_GetItem()`` has to create a temporary Python object. The
|
|
|
|
problem is to decide when it is safe to delete the temporary object.
|
|
|
|
|
|
|
|
The general guidelines is to avoid returning borrowed references for new
|
|
|
|
C API functions.
|
|
|
|
|
2020-06-22 12:17:18 -04:00
|
|
|
No function returning borrowed references is scheduled for removal by
|
2020-06-21 21:01:17 -04:00
|
|
|
this PEP.
|
|
|
|
|
|
|
|
**STATUS**: Completed (in Python 3.9)
|
|
|
|
|
|
|
|
In Python 3.9, new C API functions returning Python objects only return
|
|
|
|
strong references:
|
|
|
|
|
|
|
|
* ``PyFrame_GetBack()``
|
|
|
|
* ``PyFrame_GetCode()``
|
|
|
|
* ``PyObject_CallNoArgs()``
|
|
|
|
* ``PyObject_CallOneArg()``
|
|
|
|
* ``PyThreadState_GetFrame()``
|
|
|
|
|
|
|
|
Avoid functions returning PyObject**
|
|
|
|
------------------------------------
|
|
|
|
|
|
|
|
The ``PySequence_Fast_ITEMS()`` function gives a direct access to an
|
|
|
|
array of ``PyObject*`` objects. The function is deprecated in favor of
|
|
|
|
``PyTuple_GetItem()`` and ``PyList_GetItem()``.
|
|
|
|
|
|
|
|
``PyTuple_GET_ITEM()`` can be abused to access directly the
|
|
|
|
``PyTupleObject.ob_item`` member::
|
|
|
|
|
|
|
|
PyObject **items = &PyTuple_GET_ITEM(0);
|
|
|
|
|
|
|
|
The ``PyTuple_GET_ITEM()`` and ``PyList_GET_ITEM()`` macros are
|
|
|
|
converted to static inline functions to disallow that.
|
|
|
|
|
|
|
|
**STATUS**: Not Started
|
|
|
|
|
|
|
|
New pythoncapi_compat.h header file
|
|
|
|
-----------------------------------
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
Making structures opaque requires modifying C extensions to
|
2020-06-21 22:04:07 -04:00
|
|
|
use getter and setter functions. The practical issue is how to keep
|
|
|
|
support for old Python versions which don't have these functions.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-21 22:04:07 -04:00
|
|
|
For example, in Python 3.10, it is no longer possible to use
|
|
|
|
``Py_TYPE()`` as an l-value. The new ``Py_SET_TYPE()`` function must be
|
|
|
|
used instead::
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
#if PY_VERSION_HEX >= 0x030900A4
|
|
|
|
Py_SET_TYPE(&MyType, &PyType_Type);
|
|
|
|
#else
|
|
|
|
Py_TYPE(&MyType) = &PyType_Type;
|
|
|
|
#endif
|
|
|
|
|
|
|
|
This code may ring a bell to developers who ported their Python code
|
|
|
|
base from Python 2 to Python 3.
|
|
|
|
|
|
|
|
Python will distribute a new ``pythoncapi_compat.h`` header file which
|
|
|
|
provides new C API functions to old Python versions. Example::
|
|
|
|
|
|
|
|
#if PY_VERSION_HEX < 0x030900A4
|
|
|
|
static inline void
|
|
|
|
_Py_SET_TYPE(PyObject *ob, PyTypeObject *type)
|
|
|
|
{
|
|
|
|
ob->ob_type = type;
|
|
|
|
}
|
|
|
|
#define Py_SET_TYPE(ob, type) _Py_SET_TYPE((PyObject*)(ob), type)
|
|
|
|
#endif // PY_VERSION_HEX < 0x030900A4
|
|
|
|
|
|
|
|
Using this header file, ``Py_SET_TYPE()`` can be used on old Python
|
|
|
|
versions as well.
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
Developers can copy this file in their project, or even to only
|
2020-06-21 22:04:07 -04:00
|
|
|
copy/paste the few functions needed by their C extension.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
**STATUS**: In Progress (implemented but not distributed by CPython yet)
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
The ``pythoncapi_compat.h`` header file is currently developer at:
|
|
|
|
https://github.com/pythoncapi/pythoncapi_compat
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
Process to reduce the number of broken C extensions
|
|
|
|
===================================================
|
|
|
|
|
2020-06-22 07:06:55 -04:00
|
|
|
Process to reduce the number of broken C extensions when introducing C
|
|
|
|
API incompatible changes listed in this PEP:
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
* Estimate how many popular C extensions are affected by the
|
|
|
|
incompatible change.
|
|
|
|
* Coordinate with maintainers of broken C extensions to prepare their
|
|
|
|
code for the future incompatible change.
|
|
|
|
* Introduce the incompatible changes in Python. The documentation must
|
|
|
|
explain how to port existing code. It is recommended to merge such
|
2020-06-21 22:04:07 -04:00
|
|
|
changes at the beginning of a development cycle to have more time for
|
|
|
|
tests.
|
2020-06-21 21:01:17 -04:00
|
|
|
* Changes which are the most likely to break a large number of C
|
|
|
|
extensions should be announced on the capi-sig mailing list to notify
|
|
|
|
C extensions maintainers to prepare their project for the next Python.
|
|
|
|
* If the change breaks too many projects, reverting the change should be
|
|
|
|
discussed, taking in account the number of broken packages, their
|
2020-06-22 07:06:55 -04:00
|
|
|
importance in the Python community, and the importance of the change.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
The coordination usually means reporting issues to the projects, or even
|
2020-06-22 07:06:55 -04:00
|
|
|
proposing changes. It does not require waiting for a new release including
|
2020-06-21 21:01:17 -04:00
|
|
|
fixes for every broken project.
|
|
|
|
|
2020-06-21 22:04:07 -04:00
|
|
|
Since more and more C extensions are written using Cython, rather
|
|
|
|
directly using the C API, it is important to ensure that Cython is
|
2020-06-22 07:01:12 -04:00
|
|
|
prepared in advance for incompatible changes. It gives more time for C
|
2020-06-21 22:04:07 -04:00
|
|
|
extension maintainers to release a new version with code generated with
|
2020-06-22 07:01:12 -04:00
|
|
|
the updated Cython (for C extensions distributing the code generated by
|
2020-06-21 22:04:07 -04:00
|
|
|
Cython).
|
|
|
|
|
2020-06-21 21:01:17 -04:00
|
|
|
Future incompatible changes can be announced by deprecating a function
|
|
|
|
in the documentation and by annotating the function with
|
2020-06-21 22:04:07 -04:00
|
|
|
``Py_DEPRECATED()``. But making a structure opaque and preventing the
|
|
|
|
usage of a macro as l-value cannot be deprecated with
|
|
|
|
``Py_DEPRECATED()``.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
The important part is coordination and finding a balance between CPython
|
|
|
|
evolutions and backward compatibility. For example, breaking a random,
|
|
|
|
old, obscure and unmaintained C extension on PyPI is less severe than
|
|
|
|
breaking numpy.
|
2020-06-21 21:01:17 -04:00
|
|
|
|
|
|
|
If a change is reverted, we move back to the coordination step to better
|
|
|
|
prepare the change. Once more C extensions are ready, the incompatible
|
|
|
|
change can be reconsidered.
|
|
|
|
|
|
|
|
|
2020-06-22 06:47:57 -04:00
|
|
|
Version History
|
|
|
|
===============
|
|
|
|
|
2020-06-22 07:01:12 -04:00
|
|
|
* Version 3, June 2020: PEP rewritten from scratch. Python now
|
|
|
|
distributes a new ``pythoncapi_compat.h`` header and a process is
|
2020-06-22 07:06:55 -04:00
|
|
|
defined to reduce the number of broken C extensions when introducing C
|
2020-06-22 07:01:12 -04:00
|
|
|
API incompatible changes listed in this PEP.
|
2020-06-22 06:47:57 -04:00
|
|
|
* Version 2, April 2020:
|
|
|
|
`PEP: Modify the C API to hide implementation details
|
|
|
|
<https://mail.python.org/archives/list/python-dev@python.org/thread/HKM774XKU7DPJNLUTYHUB5U6VR6EQMJF/#TKHNENOXP6H34E73XGFOL2KKXSM4Z6T2>`_.
|
|
|
|
* Version 1, July 2017:
|
|
|
|
`PEP: Hide implementation details in the C API
|
|
|
|
<https://mail.python.org/archives/list/python-ideas@python.org/thread/6XATDGWK4VBUQPRHCRLKQECTJIPBVNJQ/#HFBGCWVLSM47JEP6SO67MRFT7Y3EOC44>`_
|
|
|
|
sent to python-ideas
|
|
|
|
|
|
|
|
|
2020-06-21 21:01:17 -04:00
|
|
|
Copyright
|
|
|
|
=========
|
|
|
|
|
|
|
|
This document has been placed in the public domain.
|