PEP 733: An Evaluation of Python's Public C API (#3491)
This commit is contained in:
parent
150fbe9368
commit
02da28b2bd
|
@ -610,6 +610,7 @@ peps/pep-0729.rst @JelleZijlstra @hauntsaninja
|
|||
peps/pep-0730.rst @ned-deily
|
||||
peps/pep-0731.rst @gvanrossum @encukou @vstinner @zooba @iritkatriel
|
||||
peps/pep-0732.rst @Mariatta
|
||||
peps/pep-0733.rst @encukou @vstinner @zooba @iritkatriel
|
||||
# ...
|
||||
# peps/pep-0754.rst
|
||||
# ...
|
||||
|
|
|
@ -0,0 +1,685 @@
|
|||
PEP: 733
|
||||
Title: An Evaluation of Python's Public C API
|
||||
Author: Erlend Egeberg Aasland <erlend@python.org>,
|
||||
Domenico Andreoli <domenico.andreoli@linux.com>,
|
||||
Stefan Behnel <stefan_ml@behnel.de>,
|
||||
Carl Friedrich Bolz-Tereick <cfbolz@gmx.de>,
|
||||
Simon Cross <hodgestar@gmail.com>,
|
||||
Steve Dower <steve.dower@python.org>,
|
||||
Tim Felgentreff <tim.felgentreff@oracle.com>,
|
||||
David Hewitt <1939362+davidhewitt@users.noreply.github.com>,
|
||||
Shantanu Jain <hauntsaninja at gmail.com>,
|
||||
Wenzel Jakob <wenzel.jakob@epfl.ch>,
|
||||
Irit Katriel <irit@python.org>,
|
||||
Marc-Andre Lemburg <mal@lemburg.com>,
|
||||
Donghee Na <donghee.na@python.org>,
|
||||
Karl Nelson <nelson85@llnl.gov>,
|
||||
Ronald Oussoren <ronaldoussoren@mac.com>,
|
||||
Antoine Pitrou <solipsis@pitrou.net>,
|
||||
Neil Schemenauer <nas@arctrix.com>,
|
||||
Mark Shannon <mark@hotpy.org>,
|
||||
Stepan Sindelar <stepan.sindelar@oracle.com>,
|
||||
Gregory P. Smith <greg@krypto.org>,
|
||||
Eric Snow <ericsnowcurrently@gmail.com>,
|
||||
Victor Stinner <vstinner@python.org>,
|
||||
Guido van Rossum <guido@python.org>,
|
||||
Petr Viktorin <encukou@gmail.com>,
|
||||
Carol Willing <willingc@gmail.com>,
|
||||
William Woodruff <william@yossarian.net>,
|
||||
David Woods <dw-git@d-woods.co.uk>,
|
||||
Jelle Zijlstra <jelle.zijlstra@gmail.com>
|
||||
Status: Draft
|
||||
Type: Informational
|
||||
Content-Type: text/x-rst
|
||||
Created: 16-Oct-2023
|
||||
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This **informational** PEP describes our shared view of the public C API. The
|
||||
document defines:
|
||||
|
||||
* purposes of the C API
|
||||
* stakeholders and their particular use cases and requirements
|
||||
* strengths of the C API
|
||||
* problems of the C API categorized into nine areas of weakness
|
||||
|
||||
This document does not propose solutions to any of the identified problems. By
|
||||
creating a shared list of C API issues, this document will help to guide
|
||||
continuing discussion about change proposals and to identify evaluation
|
||||
criteria.
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
Python's C API was not designed for the different purposes it currently
|
||||
fulfills. It evolved from what was initially the internal API between
|
||||
the C code of the interpreter and the Python language and libraries.
|
||||
In its first incarnation, it was exposed to make it possible to embed
|
||||
Python into C/C++ applications and to write extension modules in C/C++.
|
||||
These capabilities were instrumental to the growth of Python's ecosystem.
|
||||
Over the decades, the C API grew to provide different tiers of stability,
|
||||
conventions changed, and new usage patterns have emerged, such as bindings
|
||||
to languages other than C/C++. In the next few years, new developments
|
||||
are expected to further test the C API, such as the removal of the GIL
|
||||
and the development of a JIT compiler. However, this growth was not
|
||||
supported by clearly documented guidelines, resulting in inconsistent
|
||||
approaches to API design in different subsystems of CPython. In addition,
|
||||
CPython is no longer the only implementation of Python, and some of the
|
||||
design decisions made when it was are difficult for alternative
|
||||
implementations to work with
|
||||
[`Issue 64 <https://github.com/capi-workgroup/problems/issues/64>`__].
|
||||
In the meantime, lessons were learned and mistakes in both the design
|
||||
and the implementation of the C API were identified.
|
||||
|
||||
Evolving the C API is hard due to the combination of backwards
|
||||
compatibility constraints and its inherent complexity, both
|
||||
technical and social. Different types of users bring different,
|
||||
sometimes conflicting, requirements. The tradeoff between stability
|
||||
and progress is an ongoing, highly contentious topic of discussion
|
||||
when suggestions are made for incremental improvements.
|
||||
Several proposals have been put forward for improvement, redesign
|
||||
or replacement of the C API, each representing a deep analysis of
|
||||
the problems. At the 2023 Language Summit, three back-to-back
|
||||
sessions were devoted to different aspects of the C API. There is
|
||||
general agreement that a new design can remedy the problems that
|
||||
the C API has accumulated over the last 30 years, while at the
|
||||
same time updating it for use cases that it was not originally
|
||||
designed for.
|
||||
|
||||
However, there was also a sense at the Language Summit that we are
|
||||
trying to discuss solutions without a clear common understanding
|
||||
of the problems that we are trying to solve. We decided that
|
||||
we need to agree on the current problems with the C API, before
|
||||
we are able to evaluate any of the proposed solutions. We
|
||||
therefore created the
|
||||
`capi-workgroup <https://github.com/capi-workgroup/problems/issues/>`__
|
||||
repository on GitHub in order to collect everyone's ideas on that
|
||||
question.
|
||||
|
||||
Over 60 different issues were created on that repository, each
|
||||
describing a problem with the C API. We categorized them and
|
||||
identified a number of recurring themes. The sections below
|
||||
mostly correspond to these themes, and each contains a combined
|
||||
description of the issues raised in that category, along with
|
||||
links to the individual issues. In addition, we included a section
|
||||
that aims to identify the different stakeholders of the C API,
|
||||
and the particular requirements that each of them has.
|
||||
|
||||
|
||||
C API Stakeholders
|
||||
==================
|
||||
|
||||
As mentioned in the introduction, the C API was originally
|
||||
created as the internal interface between CPython's
|
||||
interpreter and the Python layer. It was later exposed as
|
||||
a way for third-party developers to extend and embed Python
|
||||
programs. Over the years, new types of stakeholders emerged,
|
||||
with different requirements and areas of focus. This section
|
||||
describes this complex state of affairs in terms of the
|
||||
actions that different stakeholders need to perform through
|
||||
the C API.
|
||||
|
||||
Common Actions for All Stakeholders
|
||||
-----------------------------------
|
||||
|
||||
There are actions which are generic, and required by
|
||||
all types of API users:
|
||||
|
||||
* Define functions and call them
|
||||
* Define new types
|
||||
* Create instances of builtin and user-defined types
|
||||
* Perform operations on object instances
|
||||
* Introspect objects, including types, instances, and functions
|
||||
* Raise and handle exceptions
|
||||
* Import modules
|
||||
* Access Python's OS interface
|
||||
|
||||
The following sections look at the unique requirements of various stakeholders.
|
||||
|
||||
Extension Writers
|
||||
-----------------
|
||||
|
||||
Extension writers are the traditional users of the C API. Their requirements
|
||||
are the common actions listed above. They also commonly need to:
|
||||
|
||||
* Create new modules
|
||||
* Efficiently interface between modules at the C level
|
||||
|
||||
|
||||
Authors of Embedded Python Applications
|
||||
---------------------------------------
|
||||
|
||||
These are applications with an embedded Python interpreter. Examples are
|
||||
`Blender <https://docs.blender.org/api/current/info_overview.html>`__ and
|
||||
`OBS <https://obsproject.com/wiki/Getting-Started-With-OBS-Scripting>`__.
|
||||
|
||||
They need to be able to:
|
||||
|
||||
* Configure the interpreter (import paths, inittab, ``sys.argv``, memory
|
||||
allocator, etc.).
|
||||
* Interact with the execution model and program lifetime, including
|
||||
clean interpreter shutdown and restart.
|
||||
* Represent complex data models in a way Python can use without
|
||||
having to create deep copies.
|
||||
* Provide and import frozen modules.
|
||||
* Run and manage multiple independent interpreters (in particular, when
|
||||
embedded in a library that wants to avoid global effects).
|
||||
|
||||
Python Implementations
|
||||
----------------------
|
||||
|
||||
Python implementations such as
|
||||
`CPython <https://www.python.org>`__,
|
||||
`PyPy <https://www.pypy.org>`__,
|
||||
`GraalPy <https://www.graalvm.org/python/>`__,
|
||||
`IronPython <https://ironpython.net>`__,
|
||||
`RustPython <https://github.com/RustPython/RustPython>`__,
|
||||
`MicroPython <https://micropython.org>`__,
|
||||
and `Jython <https://www.jython.org>`__ may take
|
||||
very different approaches for the implementation of
|
||||
different subsystems. They need:
|
||||
|
||||
* The API to be abstract and hide implementation details.
|
||||
* A specification of the API, ideally with a test suite
|
||||
that ensures compatibility.
|
||||
* Ideally, an ABI that can be shared
|
||||
across Python implementations.
|
||||
|
||||
Alternative APIs and Binding Generators
|
||||
---------------------------------------
|
||||
|
||||
There are several projects that implement alternatives to the
|
||||
C API, which offer extension users advantages over programming
|
||||
directly with the C API. These APIs are implemented with the
|
||||
C API, and in some cases by using CPython internals.
|
||||
|
||||
There are also libraries that create bindings between Python and
|
||||
other object models, paradigms or languages.
|
||||
|
||||
There is overlap between these categories: binding generators
|
||||
usually provide alternative APIs, and vice versa.
|
||||
|
||||
Examples are
|
||||
`Cython <https://cython.org>`__,
|
||||
`cffi <https://cffi.readthedocs.io/>`__,
|
||||
`pybind11 <https://pybind11.readthedocs.io/en/stable/>`__ and
|
||||
`nanobind <https://github.com/wjakob/nanobind>`__ for C++,
|
||||
`PyO3 <https://github.com/PyO3/pyo3>`__ for Rust,
|
||||
`PySide <https://pypi.org/project/PySide/>`__ for Qt,
|
||||
`PyGObject <https://pygobject.readthedocs.io/en/latest/>`__ for GTK,
|
||||
`Pygolo <https://gitlab.com/pygolo/py>`__ for Go,
|
||||
`JPype <https://github.com/jpype-project/jpype/>`__ for Java,
|
||||
`PyJNIus <https://github.com/kivy/pyjnius/>`__ for Android,
|
||||
`PyObjC <https://pyobjc.readthedocs.io>`__ for Objective-C,
|
||||
`SWIG <https://swig.org/>`__ for C/C++,
|
||||
`Python.NET <https://github.com/pythonnet/pythonnet>`__ for .NET (C#),
|
||||
`HPy <https://hpyproject.org>`__,
|
||||
`Mypyc <https://mypyc.readthedocs.io/en/latest/introduction.html>`__,
|
||||
`Pythran <https://pythran.readthedocs.io>`__ and
|
||||
`pythoncapi-compat <https://pythoncapi-compat.readthedocs.io/en/latest/>`__.
|
||||
CPython's DSL for parsing function arguments, the
|
||||
`Argument Clinic <https://devguide.python.org/development-tools/clinic/>`__,
|
||||
can also be seen as belonging to this category of stakeholders.
|
||||
|
||||
Alternative APIs need minimal building blocks for accessing CPython
|
||||
efficiently. They don't necessarily need an ergonomic API, because
|
||||
they typically generate code that is not intended to be read
|
||||
by humans. But they do need it to be comprehensive enough so that
|
||||
they can avoid accessing internals, without sacrificing performance.
|
||||
|
||||
Binding generators often need to:
|
||||
|
||||
* Create custom objects (e.g. function/module objects
|
||||
and traceback entries) that match the behavior of equivalent
|
||||
Python code as closely as possible.
|
||||
* Dynamically create objects which are static in traditional
|
||||
C extensions (e.g. classes/modules), and need CPython to manage
|
||||
their state and lifetime.
|
||||
* Dynamically adapt foreign objects (strings, GC'd containers), with
|
||||
low overhead.
|
||||
* Adapt external mechanisms, execution models and guarantees to the
|
||||
Python way (stackful coroutines, continuations,
|
||||
one-writer-or-multiple-readers semantics, virtual multiple inheritance,
|
||||
1-based indexing, super-long inheritance chains, goroutines, channels,
|
||||
etc.).
|
||||
|
||||
These tools might also benefit from a choice between a more stable
|
||||
and a faster (possibly lower-level) API. Their users could
|
||||
then decide whether they can afford to regenerate the code often or
|
||||
trade some performance for more stability and less maintenance work.
|
||||
|
||||
|
||||
Strengths of the C API
|
||||
======================
|
||||
|
||||
While the bulk of this document is devoted to problems with the
|
||||
C API that we would like to see fixed in any new design, it is
|
||||
also important to point out the strengths of the C API, and to
|
||||
make sure that they are preserved.
|
||||
|
||||
As mentioned in the introduction, the C API enabled the
|
||||
development and growth of the Python ecosystem over the last
|
||||
three decades, while evolving to support use cases that it was
|
||||
not originally designed for. This track record in itself is
|
||||
an indication of how effective and valuable it has been.
|
||||
|
||||
A number of specific strengths were mentioned in the
|
||||
capi-workgroup discussions. Heap types were identified
|
||||
as much safer and easier to use than static types
|
||||
[`Issue 4 <https://github.com/capi-workgroup/problems/issues/4#issuecomment-1542324451>`__].
|
||||
|
||||
API functions that take a C string literal for lookups based
|
||||
on a Python string are very convenient
|
||||
[`Issue 30 <https://github.com/capi-workgroup/problems/issues/30#issuecomment-1550098113>`__].
|
||||
|
||||
The limited API demonstrates that an API which hides implementation
|
||||
details makes it easier to evolve Python
|
||||
[`Issue 30 <https://github.com/capi-workgroup/problems/issues/30#issuecomment-1560083258>`__].
|
||||
|
||||
C API problems
|
||||
==============
|
||||
|
||||
The remainder of this document summarizes and categorizes the problems that were reported on
|
||||
the `capi-workgroup <https://github.com/capi-workgroup/problems/issues/>`__ repository.
|
||||
The issues are grouped into several categories.
|
||||
|
||||
|
||||
API Evolution and Maintenance
|
||||
-----------------------------
|
||||
|
||||
The difficulty of making changes in the C API is central to this report. It is
|
||||
implicit in many of the issues we discuss here, particularly when we need to
|
||||
decide whether an incremental bugfix can resolve the issue, or whether it can
|
||||
only be addressed as part of an API redesign
|
||||
[`Issue 44 <https://github.com/capi-workgroup/problems/issues/44>`__]. The
|
||||
benefit of each incremental change is often viewed as too small to justify the
|
||||
disruption. Over time, this implies that every mistake we make in an API's
|
||||
design or implementation remains with us indefinitely.
|
||||
|
||||
We can take two views on this issue. One is that this is a problem and the
|
||||
solution needs to be baked into any new C API we design, in the form of a
|
||||
process for incremental API evolution, which includes deprecation and
|
||||
removal of API elements. The other possible approach is that this is not
|
||||
a problem to be solved, but rather a feature of any API. In this
|
||||
view, API evolution should not be incremental, but rather through large
|
||||
redesigns, each of which learns from the mistakes of the past and is not
|
||||
shackled by backwards compatibility requirements (in the meantime, new
|
||||
API elements may be added, but nothing can ever be removed). A compromise
|
||||
approach is somewhere between these two extremes, fixing issues which are
|
||||
easy or important enough to tackle incrementally, and leaving others alone.
|
||||
|
||||
The problem we have in CPython is that we don't have an agreed, official
|
||||
approach to API evolution. Different members of the core team are pulling in
|
||||
different directions and this is an ongoing source of disagreements.
|
||||
Any new C API needs to come with a clear decision about the model
|
||||
that its maintenance will follow, as well as the technical and
|
||||
organizational processes by which this will work.
|
||||
|
||||
If the model does include provisions for incremental evolution of the API,
|
||||
it will include processes for managing the impact of the change on users
|
||||
[`Issue 60 <https://github.com/capi-workgroup/problems/issues/60>`__],
|
||||
perhaps through introducing an external backwards compatibility module
|
||||
[`Issue 62 <https://github.com/capi-workgroup/problems/issues/62>`__],
|
||||
or a new API tier of "blessed" functions
|
||||
[`Issue 55 <https://github.com/capi-workgroup/problems/issues/55>`__].
|
||||
|
||||
|
||||
API Specification and Abstraction
|
||||
---------------------------------
|
||||
|
||||
The C API does not have a formal specification; it is currently defined
|
||||
as whatever the reference implementation (CPython) contains in a
|
||||
particular version. The documentation acts as an incomplete description,
|
||||
which is not sufficient for verifying the correctness of either the full
|
||||
API, the limited API, or the stable ABI. As a result, the C API may
|
||||
change significantly between releases without needing a more visible
|
||||
specification update, and this leads to a number of problems.
|
||||
|
||||
Bindings for languages other than C/C++ must parse C code
|
||||
[`Issue 7 <https://github.com/capi-workgroup/problems/issues/7>`__].
|
||||
Some C language features are hard to handle in this way, because
|
||||
they produce compiler-dependent output (such as enums) or require
|
||||
a C preprocessor/compiler rather than just a parser (such as macros)
|
||||
[`Issue 35 <https://github.com/capi-workgroup/problems/issues/35>`__].
|
||||
|
||||
Furthermore, C header files tend to expose more than what is intended
|
||||
to be part of the public API
|
||||
[`Issue 34 <https://github.com/capi-workgroup/problems/issues/34>`__].
|
||||
In particular, implementation details such as the precise memory
|
||||
layouts of internal data structures can be exposed
|
||||
[`Issue 22 <https://github.com/capi-workgroup/problems/issues/22>`__
|
||||
and :pep:`620`].
|
||||
This can make API evolution very difficult, in particular when it
|
||||
occurs in the stable ABI as in the case of ``ob_refcnt`` and ``ob_type``,
|
||||
which are accessed via the reference counting macros
|
||||
[`Issue 45 <https://github.com/capi-workgroup/problems/issues/45>`__].
|
||||
|
||||
We identified a deeper issue in relation to the way that reference
|
||||
counting is exposed. The way that C extensions are required to
|
||||
manage references with calls to ``Py_INCREF`` and ``Py_DECREF`` is
|
||||
specific to CPython's memory model, and is hard for alternative
|
||||
Python implementations to emulate
|
||||
[`Issue 12 <https://github.com/capi-workgroup/problems/issues/12>`__].
|
||||
|
||||
Another set of problems arises from the fact that a ``PyObject*`` is
|
||||
exposed in the C API as an actual pointer rather than a handle. The
|
||||
address of an object serves as its ID and is used for comparison,
|
||||
and this complicates matters for alternative Python implementations
|
||||
that move objects during GC
|
||||
[`Issue 37 <https://github.com/capi-workgroup/problems/issues/37>`__].
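
The pointer-as-identity behaviour can be observed from Python itself. The following is a minimal sketch, assuming CPython, where ``id()`` happens to return the address of the underlying ``PyObject``; the variable names are illustrative only.

```python
import ctypes

obj = object()

# CPython implementation detail: id() is the memory address of the PyObject.
address = id(obj)

# Reinterpret the raw address as a PyObject* and read the object back.
recovered = ctypes.cast(address, ctypes.py_object).value

# The address alone was enough to recover the very same object.
same_object = recovered is obj
print(hex(address), same_object)
```

An implementation whose garbage collector moves objects cannot hand out such an address without pinning the object or adding an indirection layer.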
|
||||
|
||||
A separate issue is that object references are opaque to the runtime,
|
||||
discoverable only through calls to ``tp_traverse``/``tp_clear``,
|
||||
which have their own purposes. If there were a way for the runtime to
|
||||
know the structure of the object graph, and keep up with changes in it,
|
||||
this would make it possible for alternative implementations to implement
|
||||
different memory management schemes
|
||||
[`Issue 33 <https://github.com/capi-workgroup/problems/issues/33>`__].
|
||||
|
||||
Object Reference Management
|
||||
---------------------------
|
||||
|
||||
There is no consistent naming convention that makes the reference
|
||||
semantics of functions obvious, which makes functions that deviate
|
||||
from the typical behaviour error prone.
|
||||
When a C API function returns a ``PyObject*``, the
|
||||
caller typically gains ownership of a reference to the object.
|
||||
However, there are exceptions where a function returns a
|
||||
"borrowed" reference, which the caller can use but does not
|
||||
own. Similarly, functions typically do not change the
|
||||
ownership of references to their arguments, but there are
|
||||
exceptions where a function "steals" a reference, i.e., the
|
||||
ownership of the reference is permanently transferred from the
|
||||
caller to the callee by the call
|
||||
[`Issue 8 <https://github.com/capi-workgroup/problems/issues/8>`__
|
||||
and `Issue 52 <https://github.com/capi-workgroup/problems/issues/52>`__].
|
||||
The terminology used to describe these situations in the documentation
|
||||
can also be improved
|
||||
[`Issue 11 <https://github.com/capi-workgroup/problems/issues/11>`__].
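
The difference between a borrowed and an owned reference can be made visible through reference counts. This is a sketch under stated assumptions: it runs on CPython, calls the C API through ``ctypes.pythonapi``, and declares ``c_void_p`` return types so that ``ctypes`` itself does not touch the counts. ``PyList_GetItem`` returns a borrowed reference and ``PySequence_GetItem`` returns a new one; the variable names are illustrative.

```python
import ctypes
import sys

api = ctypes.pythonapi

# Raw addresses (c_void_p) keep ctypes from adjusting reference counts.
api.PyList_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PyList_GetItem.restype = ctypes.c_void_p       # borrowed reference
api.PySequence_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PySequence_GetItem.restype = ctypes.c_void_p   # new (owned) reference
api.Py_DecRef.argtypes = [ctypes.c_void_p]
api.Py_DecRef.restype = None

item = object()
items = [item]

baseline = sys.getrefcount(item)

borrowed = api.PyList_GetItem(items, 0)
after_borrowed = sys.getrefcount(item)   # unchanged: nothing was acquired

owned = api.PySequence_GetItem(items, 0)
after_owned = sys.getrefcount(item)      # one higher: the caller owns a reference

api.Py_DecRef(owned)                     # the caller must release what it owns
after_release = sys.getrefcount(item)

print(baseline, after_borrowed, after_owned, after_release)
```

Nothing in the two function names indicates which of them hands out ownership, which is the naming problem described above.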
|
||||
|
||||
A more radical change is necessary in the case of functions that
|
||||
return "borrowed" references (such as ``PyList_GetItem``)
|
||||
[`Issue 5 <https://github.com/capi-workgroup/problems/issues/5>`__ and
|
||||
`Issue 21 <https://github.com/capi-workgroup/problems/issues/21>`__]
|
||||
or pointers to parts of the internal structure of an object
|
||||
(such as ``PyBytes_AsString``)
|
||||
[`Issue 57 <https://github.com/capi-workgroup/problems/issues/57>`__].
|
||||
In both cases, the reference/pointer is valid for as long as the
|
||||
owning object holds the reference, but this time is hard to reason about.
|
||||
Such functions should not exist in the API without a mechanism that can
|
||||
make them safe.
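
As an illustration of a pointer into an object's internals, the sketch below calls ``PyBytes_AsString`` through ``ctypes.pythonapi`` and checks where the returned address lies. It assumes CPython's current layout, in which the character data is stored inline in the ``bytes`` object; the variable names are illustrative.

```python
import ctypes
import sys

api = ctypes.pythonapi
api.PyBytes_AsString.argtypes = [ctypes.py_object]
api.PyBytes_AsString.restype = ctypes.c_void_p   # raw char* into the object

payload = b"hello"
buffer_address = api.PyBytes_AsString(payload)

# Copy the bytes out while `payload` is still alive; afterwards the
# address would dangle.
copied = ctypes.string_at(buffer_address, len(payload))

# On CPython the buffer sits inside the memory of the bytes object itself.
start = id(payload)
inside_object = start < buffer_address < start + sys.getsizeof(payload)

print(copied, inside_object)
```

The returned pointer carries no ownership of its own, so its validity depends entirely on how long something else keeps ``payload`` alive.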
|
||||
|
||||
For containers, the API is currently missing bulk operations on the
|
||||
references of contained objects. This is particularly important for
|
||||
a stable ABI where ``INCREF`` and ``DECREF`` cannot be macros, making
|
||||
bulk operations expensive when implemented as a sequence of function
|
||||
calls
|
||||
[`Issue 15 <https://github.com/capi-workgroup/problems/issues/15>`__].
|
||||
|
||||
Type Definition and Object Creation
|
||||
-----------------------------------
|
||||
|
||||
The C API has functions that make it possible to create incomplete
|
||||
or inconsistent Python objects, such as ``PyTuple_New`` and
|
||||
``PyUnicode_New``. This causes problems when the object is tracked
|
||||
by GC or its ``tp_traverse``/``tp_clear`` functions are called.
|
||||
A related issue is with functions such as ``PyTuple_SetItem``
|
||||
which is used to modify a partially initialized tuple (tuples
|
||||
are immutable once fully initialized)
|
||||
[`Issue 56 <https://github.com/capi-workgroup/problems/issues/56>`__].
|
||||
|
||||
We identified a few issues with type definition APIs. For legacy
|
||||
reasons, there is often a significant amount of code duplication
|
||||
between ``tp_new`` and ``tp_vectorcall``
|
||||
[`Issue 24 <https://github.com/capi-workgroup/problems/issues/24>`__].
|
||||
The type slot functions should be called indirectly, so that their
|
||||
signatures can change to include context information
|
||||
[`Issue 13 <https://github.com/capi-workgroup/problems/issues/13>`__].
|
||||
Several aspects of the type definition and creation process are not
|
||||
well defined, such as which stage of the process is responsible for
|
||||
initializing and clearing different fields of the type object
|
||||
[`Issue 49 <https://github.com/capi-workgroup/problems/issues/49>`__].
|
||||
|
||||
Error Handling
|
||||
--------------
|
||||
|
||||
Error handling in the C API is based on the error indicator which is stored
|
||||
on the thread state (in global scope). The design intention was that each
|
||||
API function returns a value indicating whether an error has occurred (by
|
||||
convention, ``-1`` or ``NULL``). When the program knows that an error
|
||||
occurred, it can fetch the exception object which is stored in the
|
||||
error indicator. We identified a number of problems which are related
|
||||
to error handling, pointing at APIs which are too easy to use incorrectly.
|
||||
|
||||
There are functions that do not report all errors that occur while they
|
||||
execute. For example, ``PyDict_GetItem`` clears any errors that occur
|
||||
when it calls the key's hash function, or while performing a lookup
|
||||
in the dictionary
|
||||
[`Issue 51 <https://github.com/capi-workgroup/problems/issues/51>`__].
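
The swallowed error can be reproduced with an unhashable key. This sketch assumes CPython and uses ``ctypes.pythonapi``, which checks the error indicator after every call and raises the pending exception, so a silent ``NULL`` shows up as a plain ``None``. ``PyDict_GetItemWithError`` is shown for contrast. Some newer CPython versions may additionally log the suppressed error on stderr.

```python
import ctypes

api = ctypes.pythonapi
api.PyDict_GetItem.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyDict_GetItem.restype = ctypes.c_void_p
api.PyDict_GetItemWithError.argtypes = [ctypes.py_object, ctypes.py_object]
api.PyDict_GetItemWithError.restype = ctypes.c_void_p

table = {"a": 1}
unhashable_key = []   # hashing a list raises TypeError

# NULL (None here) comes back, and the hashing error has been discarded:
# the result is indistinguishable from "key not present".
silent_result = api.PyDict_GetItem(table, unhashable_key)

# The error-reporting variant leaves the exception set, so ctypes raises it.
try:
    api.PyDict_GetItemWithError(table, unhashable_key)
    reported = None
except TypeError as exc:
    reported = exc

print(silent_result, repr(reported))
```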
|
||||
|
||||
Python code never executes with an in-flight exception (by definition),
|
||||
and typically code using native functions should also be interrupted by
|
||||
an error being raised. This is not checked in most C API functions, and
|
||||
there are places in the interpreter where error handling code calls a C API
|
||||
function while an exception is set. For example, see the call to
|
||||
``PyUnicode_FromString`` in the error handler of ``_PyErr_WriteUnraisableMsg``
|
||||
[`Issue 2 <https://github.com/capi-workgroup/problems/issues/2>`__].
|
||||
|
||||
|
||||
There are functions that do not return a value, so a caller is forced to
|
||||
query the error indicator in order to identify whether an error has occurred.
|
||||
An example is ``PyBuffer_Release``
|
||||
[`Issue 20 <https://github.com/capi-workgroup/problems/issues/20>`__].
|
||||
There are other functions which do have a return value, but this return value
|
||||
does not unambiguously indicate whether an error has occurred. For example,
|
||||
``PyLong_AsLong`` returns ``-1`` in case of error, or when the value of the
|
||||
argument is indeed ``-1``
|
||||
[`Issue 1 <https://github.com/capi-workgroup/problems/issues/1>`__].
|
||||
In both cases, the API is error prone because it is possible that the
|
||||
error indicator was already set before the function was called, and the
|
||||
error is incorrectly attributed. The fact that the error was not detected
|
||||
before the call is a bug in the calling code, but the behaviour of the
|
||||
program in this case doesn't make it easy to identify and debug the
|
||||
problem.
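
The ambiguity of ``-1`` can be sketched as follows, assuming CPython and ``ctypes.pythonapi``. Note that ``ctypes`` inspects the error indicator after each call on the caller's behalf, which is precisely the extra ``PyErr_Occurred()`` check that C code has to remember to perform by hand.

```python
import ctypes

api = ctypes.pythonapi
api.PyLong_AsLong.argtypes = [ctypes.py_object]
api.PyLong_AsLong.restype = ctypes.c_long

# A perfectly valid conversion whose result is -1.
legitimate = api.PyLong_AsLong(-1)

# A failing conversion: in C this call also returns -1, and only the
# error indicator tells the two cases apart.
try:
    api.PyLong_AsLong(2 ** 100)
    overflow_error = None
except OverflowError as exc:
    overflow_error = exc

print(legitimate, repr(overflow_error))
```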
|
||||
|
||||
There are functions that take a ``PyObject*`` argument, with special meaning
|
||||
when it is ``NULL``. For example, if ``PyObject_SetAttr`` receives ``NULL`` as
|
||||
the value to set, this means that the attribute should be cleared. This is error
|
||||
prone because it could be that ``NULL`` indicates an error in the construction
|
||||
of the value, and the program failed to check for this error. The program will
|
||||
misinterpret the ``NULL`` to mean something other than an error
|
||||
[`Issue 47 <https://github.com/capi-workgroup/problems/issues/47>`__].
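
A minimal sketch of the ``NULL``-means-delete behaviour, assuming CPython and ``ctypes.pythonapi``. The value parameter is declared as ``c_void_p`` so that Python's ``None`` is passed as a C ``NULL``; the ``Box`` class is a hypothetical stand-in.

```python
import ctypes

api = ctypes.pythonapi
api.PyObject_SetAttr.argtypes = [
    ctypes.py_object,   # object
    ctypes.py_object,   # attribute name
    ctypes.c_void_p,    # value, declared as a raw pointer so NULL can be passed
]
api.PyObject_SetAttr.restype = ctypes.c_int


class Box:
    """Hypothetical container used only for this illustration."""


box = Box()
box.value = 42

# Passing NULL as the value succeeds and silently deletes the attribute.
status = api.PyObject_SetAttr(box, "value", None)
still_there = hasattr(box, "value")

print(status, still_there)
```

If the ``NULL`` had come from a failed value construction, the call would still report success, and the original error would go unnoticed at this point.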
|
||||
|
||||
|
||||
API Tiers and Stability Guarantees
|
||||
----------------------------------
|
||||
|
||||
The different API tiers provide different tradeoffs of stability vs
|
||||
API evolution, and sometimes performance.
|
||||
|
||||
The stable ABI was identified as an area that needs to be looked into. At
|
||||
the moment it is incomplete and not widely adopted. At the same time, its
|
||||
existence is making it hard to make changes to some implementation
|
||||
details, because it exposes struct fields such as ``ob_refcnt``,
|
||||
``ob_type`` and ``ob_size``. There was some discussion about whether
|
||||
the stable ABI is worth keeping. Arguments on both sides can be
|
||||
found in [`Issue 4 <https://github.com/capi-workgroup/problems/issues/4>`__]
|
||||
and [`Issue 9 <https://github.com/capi-workgroup/problems/issues/9>`__].
|
||||
|
||||
Alternatively, it was suggested that in order to be able to evolve
|
||||
the stable ABI, we need a mechanism to support multiple versions of
|
||||
it in the same Python binary. It was pointed out that versioning
|
||||
individual functions within a single ABI version is not enough
|
||||
because it may be necessary to evolve, together, a group of functions
|
||||
that interoperate with each other
|
||||
[`Issue 39 <https://github.com/capi-workgroup/problems/issues/39>`__].
|
||||
|
||||
The limited API was introduced in 3.2 as a blessed subset of the C API
|
||||
which is recommended for users who would like to restrict themselves
|
||||
to high quality APIs which are not likely to change often. The
|
||||
``Py_LIMITED_API`` flag allows users to restrict their program to older
|
||||
versions of the limited API, but we now need the opposite option, to
|
||||
exclude older versions. This would make it possible to evolve the
|
||||
limited API by replacing flawed elements in it
|
||||
[`Issue 54 <https://github.com/capi-workgroup/problems/issues/54>`__].
|
||||
More generally, in a redesign we should revisit the way that API
|
||||
tiers are specified and consider designing a method that will unify the
|
||||
way we currently select between the different tiers
|
||||
[`Issue 59 <https://github.com/capi-workgroup/problems/issues/59>`__].
|
||||
|
||||
API elements whose names begin with an underscore are considered
|
||||
private, essentially an API tier with no stability guarantees.
|
||||
However, this was only clarified recently, in :pep:`689`. It is
|
||||
not clear what the change policy should be with respect to such
|
||||
API elements that predate PEP 689
|
||||
[`Issue 58 <https://github.com/capi-workgroup/problems/issues/58>`__].
|
||||
|
||||
There are API functions which have an unsafe (but fast) version as well as
|
||||
a safe version which performs error checking (for example,
|
||||
``PyTuple_GET_ITEM`` vs ``PyTuple_GetItem``). It may help to
|
||||
be able to group them into their own tiers - the "unsafe API" tier and
|
||||
the "safe API" tier
|
||||
[`Issue 61 <https://github.com/capi-workgroup/problems/issues/61>`__].
|
||||
|
||||
Use of the C Language
|
||||
---------------------
|
||||
|
||||
A number of issues were raised with respect to the way that CPython
|
||||
uses the C language. First there is the issue of which C dialect
|
||||
we use, and how we test our compatibility with it, as well as API
|
||||
header compatibility with C++ dialects
|
||||
[`Issue 42 <https://github.com/capi-workgroup/problems/issues/42>`__].
|
||||
|
||||
Usage of ``const`` in the API is currently sparse, but it is not
|
||||
clear whether this is something that we should consider changing
|
||||
[`Issue 38 <https://github.com/capi-workgroup/problems/issues/38>`__].
|
||||
|
||||
We currently use the C types ``long`` and ``int``, where fixed-width integers
|
||||
such as ``int32_t`` and ``int64_t`` may now be better choices
|
||||
[`Issue 27 <https://github.com/capi-workgroup/problems/issues/27>`__].
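
The platform dependence of ``long`` is easy to check. The sketch below uses ``ctypes`` to report the sizes that the local C compiler conventions assign to each type; only the fixed-width types are guaranteed to have the same size everywhere.

```python
import ctypes

# Size in bytes of each C type on the current platform.
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "int32_t": ctypes.sizeof(ctypes.c_int32),
    "int64_t": ctypes.sizeof(ctypes.c_int64),
}

for name, size in sizes.items():
    print(f"{name:8s} {size} bytes")
```

On common 64-bit platforms ``long`` is 8 bytes on Linux and macOS but 4 bytes on Windows, so an API expressed in terms of ``long`` has a different value range depending on where it is built.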
|
||||
|
||||
We are using C language features which are hard for other languages
|
||||
to interact with, such as macros, variadic arguments, enums, bitfields,
|
||||
and non-function symbols
|
||||
[`Issue 35 <https://github.com/capi-workgroup/problems/issues/35>`__].
|
||||
|
||||
There are API functions that take a ``PyObject*`` arg which must be
|
||||
of a more specific type (such as ``PyTuple_Size``, which fails if
|
||||
its arg is not a ``PyTupleObject*``). It is an open question whether this
|
||||
is a good pattern to have, or whether the API should expect the
|
||||
more specific type
|
||||
[`Issue 31 <https://github.com/capi-workgroup/problems/issues/31>`__].
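
The runtime type check can be observed with the following sketch, which assumes CPython and ``ctypes.pythonapi``. The declared parameter type is a generic object in both calls; the mismatch is only detected when the function runs.

```python
import ctypes

api = ctypes.pythonapi
api.PyTuple_Size.argtypes = [ctypes.py_object]
api.PyTuple_Size.restype = ctypes.c_ssize_t

# Works as expected for a real tuple.
size = api.PyTuple_Size((1, 2, 3))

# A list is also a PyObject*, so the call is accepted and fails only at
# run time, with an internal-error exception.
try:
    api.PyTuple_Size([1, 2, 3])
    type_failure = None
except SystemError as exc:
    type_failure = exc

print(size, repr(type_failure))
```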
|
||||
|
||||
There are functions in the API that take concrete types, such as
|
||||
``PyDict_GetItemString`` which performs a dictionary lookup for a key
|
||||
specified as a C string rather than ``PyObject*``. At the same time,
|
||||
for ``PyDict_ContainsString`` it is not considered appropriate to
|
||||
add a concrete type alternative. The principle around this should
|
||||
be documented in the guidelines
|
||||
[`Issue 23 <https://github.com/capi-workgroup/problems/issues/23>`__].
|
||||
|
||||
Implementation Flaws
|
||||
--------------------
|
||||
|
||||
Below is a list of localized implementation flaws. Most of these can
|
||||
probably be fixed incrementally, if we choose to do so. They should,
|
||||
in any case, be avoided in any new API design.
|
||||
|
||||
There are functions that don't follow the convention of
|
||||
returning ``0`` for success and ``-1`` for failure. For
|
||||
example, ``PyArg_ParseTuple`` returns non-zero (true) for success and
|
||||
0 for failure
|
||||
[`Issue 25 <https://github.com/capi-workgroup/problems/issues/25>`__].
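
A minimal sketch of why mixed return conventions are error-prone is shown below. The functions ``store_value`` and ``parse_digit`` are hypothetical stand-ins, not part of the C API: one uses the ``0``/``-1`` convention and the other reports success with a true value and failure with ``0``. A caller who habitually checks ``< 0`` misses failures of the second kind.

```c
#include <stddef.h>

/* Hypothetical function using the common convention:
   0 on success, -1 on failure. */
int store_value(int *slot, int value)
{
    if (slot == NULL) {
        return -1;
    }
    *slot = value;
    return 0;
}

/* Hypothetical function using the boolean convention:
   1 (true) on success, 0 (false) on failure. */
int parse_digit(char c, int *out)
{
    if (c < '0' || c > '9') {
        return 0;
    }
    *out = c - '0';
    return 1;
}

/* Applies the `< 0` check to both functions and returns how many
   failures that check detects. Both calls fail, but only one is seen. */
int failures_seen_by_negative_check(void)
{
    int value = 0;
    int seen = 0;
    if (store_value(NULL, 1) < 0) {
        seen++;  /* detected: store_value returned -1 */
    }
    if (parse_digit('x', &value) < 0) {
        seen++;  /* never reached: parse_digit returned 0 */
    }
    return seen;
}
```

Here both calls fail, yet ``failures_seen_by_negative_check`` counts only one of them, which is the kind of silent mistake a single, uniform convention would rule out.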
|
||||
|
||||
The macros ``Py_CLEAR`` and ``Py_SETREF`` access their arg more than
|
||||
once, so if the arg is an expression with side effects, those side
|
||||
effects are duplicated
|
||||
[`Issue 3 <https://github.com/capi-workgroup/problems/issues/3>`__].
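
The hazard can be reproduced with a small stand-in macro. ``CLEAR_SLOT`` below is hypothetical and is not CPython's definition of ``Py_CLEAR``; it only assumes the same general shape, in which the macro reads its argument once and then assigns to it.

```c
#include <stddef.h>

/* Hypothetical stand-in: reads `op`, then assigns to `op`, so the
   argument expression appears (and is evaluated) twice. */
#define CLEAR_SLOT(op)            \
    do {                          \
        void *tmp_ = (op);        \
        if (tmp_ != NULL) {       \
            (op) = NULL;          \
        }                         \
    } while (0)

static int dummy;

/* Passes an argument with a side effect (`i++`) and returns the final
   value of `i`. A function call would increment `i` exactly once. */
int count_index_advances(void)
{
    void *slots[2] = { &dummy, &dummy };
    int i = 0;
    CLEAR_SLOT(slots[i++]);  /* reads slots[0], but clears slots[1] */
    return i;
}
```

In this sketch ``i`` is incremented twice, and the slot that gets cleared is not the one that was read. Writing the operation as a function, or as a macro that first takes the address of its argument, avoids the double evaluation.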
|
||||
|
||||
The meaning of ``Py_SIZE`` depends on the type and is not always
|
||||
reliable
|
||||
[`Issue 10 <https://github.com/capi-workgroup/problems/issues/10>`__].
|
||||
|
||||
Some API functions do not have the same behaviour as their Python
|
||||
equivalents. The behaviour of ``PyIter_Next`` is different from
|
||||
``tp_iternext``
|
||||
[`Issue 29 <https://github.com/capi-workgroup/problems/issues/29>`__].
|
||||
The behaviour of ``PySet_Contains`` is different from ``set.__contains__``
|
||||
[`Issue 6 <https://github.com/capi-workgroup/problems/issues/6>`__].
|
||||
|
||||
The fact that ``PyArg_ParseTupleAndKeywords`` takes a non-const
|
||||
``char*`` array as argument makes it more difficult to use
|
||||
[`Issue 28 <https://github.com/capi-workgroup/problems/issues/28>`__].
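
The inconvenience can be sketched with a stand-in function. ``count_keywords`` and ``count_demo_keywords`` below are hypothetical; the sketch only assumes a keyword-list parameter of the non-const ``char **`` shape described above. A table of string literals that is never modified is most naturally declared ``const``, but it cannot then be passed without a cast.

```c
#include <stddef.h>

/* Hypothetical stand-in with a non-const `char **` keyword list.
   It only reads the array, yet its signature does not say so. */
int count_keywords(char **kwlist)
{
    int n = 0;
    while (kwlist[n] != NULL) {
        n++;
    }
    return n;
}

int count_demo_keywords(void)
{
    /* Natural declaration for a read-only, NULL-terminated table. */
    static const char *const names[] = { "width", "height", NULL };

    /* Passing `names` directly would discard qualifiers, so callers
       must add a cast like this one or give up on `const`. */
    return count_keywords((char **)names);
}
```

The cast is safe here only because the stand-in never writes through the pointer; a const-qualified parameter would let the compiler enforce that instead.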
|
||||
|
||||
``Python.h`` does not expose the whole API. Some headers (like ``marshal.h``)
|
||||
are not included from ``Python.h``
|
||||
[`Issue 43 <https://github.com/capi-workgroup/problems/issues/43>`__].
|
||||
|
||||
**Naming**
|
||||
|
||||
``PyLong`` and ``PyUnicode`` use names which no longer match the Python
|
||||
types they represent (``int``/``str``). This could be fixed in a new API
|
||||
[`Issue 14 <https://github.com/capi-workgroup/problems/issues/14>`__].
|
||||
|
||||
There are identifiers in the API which are lacking a ``Py``/``_Py``
|
||||
prefix
|
||||
[`Issue 46 <https://github.com/capi-workgroup/problems/issues/46>`__].
|
||||
|
||||
Missing Functionality
|
||||
---------------------
|
||||
|
||||
This section consists of a list of feature requests, i.e., functionality
|
||||
that was identified as missing in the current C API.
|
||||
|
||||
Debug Mode
|
||||
~~~~~~~~~~
|
||||
|
||||
A debug mode that can be activated without recompilation and which
|
||||
enables checks that can help detect various types of errors
|
||||
[`Issue 36 <https://github.com/capi-workgroup/problems/issues/36>`__].
|
||||
|
||||
Introspection
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
There aren't currently reliable introspection capabilities for objects
|
||||
defined in C in the same way as there are for Python objects
|
||||
[`Issue 32 <https://github.com/capi-workgroup/problems/issues/32>`__].
|
||||
|
||||
Efficient type checking for heap types
|
||||
[`Issue 17 <https://github.com/capi-workgroup/problems/issues/17>`__].
|
||||
|
||||
Improved Interaction with Other Languages
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Interfacing with other GC based languages, and integrating their
|
||||
GC with Python's GC
|
||||
[`Issue 19 <https://github.com/capi-workgroup/problems/issues/19>`__].
|
||||
|
||||
Inject foreign stack frames to the traceback
|
||||
[`Issue 18 <https://github.com/capi-workgroup/problems/issues/18>`__].
|
||||
|
||||
Concrete strings that can be used in other languages
|
||||
[`Issue 16 <https://github.com/capi-workgroup/problems/issues/16>`__].
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
1. `Python/C API Reference Manual <https://docs.python.org/3/c-api/index.html>`__
|
||||
2. `2023 Language Summit Blog Post: Three Talks on the C API <https://pyfound.blogspot.com/2023/05/the-python-language-summit-2023-three.html>`__
|
||||
3. `capi-workgroup on GitHub <https://github.com/capi-workgroup>`__
|
||||
4. `Irit's Core Sprint 2023 slides about C API workgroup <https://github.com/iritkatriel/talks/blob/main/2023_Sprint_Brno_C_API.pdf>`__
|
||||
5. `Petr's Core Sprint 2023 slides <https://drive.google.com/file/d/148NLRPXGZGI1SXfKLMFvQc_iv67hPJQS/view?usp=sharing>`__
|
||||
6. `HPy team's Core Sprint 2023 slides for Things to Learn from HPy <https://hpyproject.org/talks/2023/10/things_to_learn_from_hpy.pdf>`__
|
||||
7. `Victor's slides of Core Sprint 2023 Python C API talk <https://github.com/vstinner/talks/blob/main/2023-CoreDevSprint-Brno/c-api.pdf>`__
|
||||
8. `The Python's stability promise — Cristián Maureira-Fredes, PySide maintainer <https://www.youtube.com/watch?v=iiBJF0kM-P8>`__
|
||||
9. `Report on the issues PySide had 5 years ago when switching to the stable ABI <https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst>`__
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document is placed in the public domain or under the
|
||||
CC0-1.0-Universal license, whichever is more permissive.
|