PEP: 697
Title: Limited C API for Extending Opaque Types
Author: Petr Viktorin <encukou@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Aug-2022
Python-Version: 3.12


Abstract
========

Add `Limited C API <https://docs.python.org/3.11/c-api/stable.html#stable-application-binary-interface>`__
for extending types with opaque data,
by allowing code to only deal with data specific to a particular (sub)class.

Make the mechanism usable with ``PyHeapTypeObject``.


Motivation
==========

The motivating problem this PEP solves is creating metaclasses (subclasses of
:py:class:`python:type`) in “wrappers” – projects that expose another type
system (e.g. C++, Java, Rust) as Python classes.
These systems typically need to attach information about the “wrapped”
non-Python class to the Python type object -- that is, extend
``PyHeapTypeObject``.

This should be possible to do in the Limited API, so that these generators
can be used to create Stable ABI extensions. (See :pep:`652` for the benefits
of providing a stable ABI.)

Extending ``type`` is an instance of a more general problem:
extending a class while maintaining loose coupling – that is,
not depending on the memory layout used by the superclass.
(That's a lot of jargon; see Rationale for a concrete example of extending
``list``.)


Rationale
=========

Extending opaque types
----------------------

In the Limited API, most ``struct``\ s are opaque: their size and memory layout
are not exposed, so they can be changed in new versions of CPython (or
alternate implementations of the C API).

This means that the usual subclassing pattern -- making the ``struct``
used for instances of the *base* type be the first element of the ``struct``
used for instances of the *derived* type -- does not work.
To illustrate with code, the `example from the tutorial <https://docs.python.org/3.11/extending/newtypes_tutorial.html#subclassing-other-types>`_
extends :external+python:c:type:`PyListObject` (:py:class:`python:list`)
using the following ``struct``:

.. code-block:: c

    typedef struct {
        PyListObject list;
        int state;
    } SubListObject;

This won't compile in the Limited API, since ``PyListObject`` is opaque (to
allow changes as features and optimizations are implemented).

Instead, this PEP proposes using a ``struct`` with only the state needed
in the subclass, that is:

.. code-block:: c

    typedef struct {
        int state;
    } SubListState;

    // (or just `typedef int SubListState;` in this case)

The subclass can now be completely decoupled from the memory layout (and size)
of the superclass.

This is possible today. To use such a struct:

* when creating the class, use ``PyList_Type.tp_basicsize + sizeof(SubListState)``
  as ``PyType_Spec.basicsize``;
* when accessing the data, use ``PyList_Type.tp_basicsize`` as the offset
  into the instance (``PyObject*``).

However, this has disadvantages:

* The base's ``basicsize`` may not be properly aligned, causing issues
  on some architectures if not mitigated. (These issues can be particularly
  nasty if alignment changes in a new release.)
* ``PyTypeObject.tp_basicsize`` is not exposed in the
  Limited API, so extensions that support the Limited API need to
  use ``PyObject_GetAttrString(obj, "__basicsize__")``.
  This is cumbersome, and unsafe in edge cases (the Python attribute can
  be overridden).
* Variable-size types are not handled (see `var-sized`_ below).

To make this easy (and even *best practice* for projects that choose loose
coupling over maximum performance), this PEP proposes an API to:

1. During class creation, specify that ``SubListState``
   should be “appended” to ``PyListObject``, without passing any additional
   details about ``list``. (The interpreter itself gets all necessary info,
   like ``tp_basicsize``, from the base.)

   This will be specified by a negative ``PyType_Spec.basicsize``:
   ``-sizeof(SubListState)``.

2. Given an instance, and the subclass ``PyTypeObject*``,
   get a pointer to the ``SubListState``.
   A new function will be added for this.

The base class is not limited to ``PyListObject``, of course: the mechanism
can be used to extend any base class whose instance ``struct`` is opaque,
unstable across releases, or not exposed at all -- including
:py:class:`python:type` (``PyHeapTypeObject``) mentioned earlier, but also
other extensions (for example, NumPy arrays [#f1]_).

For cases where no additional state is needed, a zero ``basicsize`` will be
allowed: in that case, the base's ``tp_basicsize`` will be inherited.
(With the current API, the base's ``basicsize`` needs to be passed in.)

The ``tp_basicsize`` of the new class will be set to the computed total size,
so code that inspects classes will continue working as before.


.. _var-sized:

Extending variable-size objects
-------------------------------

Additional considerations are needed to subclass
:external+python:c:type:`variable-sized objects <PyVarObject>`
while maintaining loose coupling as much as possible.

Unfortunately, in this case we cannot decouple the subclass from its superclass
entirely.
There are two main memory layouts for variable-sized objects, and the
subclass's author needs to know which one the superclass uses.

In types such as ``int`` or ``tuple``, the variable data is stored at a fixed
offset.
If subclasses need additional space, it must be added after any variable-sized
data::

      PyTupleObject:
      ┌───────────────────┬───┬───┬╌╌╌╌┐
      │ PyObject_VAR_HEAD │var. data   │
      └───────────────────┴───┴───┴╌╌╌╌┘

      tuple subclass:
      ┌───────────────────┬───┬───┬╌╌╌╌┬─────────────┐
      │ PyObject_VAR_HEAD │var. data   │subclass data│
      └───────────────────┴───┴───┴╌╌╌╌┴─────────────┘

In other types, like ``PyHeapTypeObject``, variable-sized data always lives at
the end of the instance's memory area::

      heap type:
      ┌───────────────────┬──────────────┬───┬───┬╌╌╌╌┐
      │ PyObject_VAR_HEAD │Heap type data│var. data   │
      └───────────────────┴──────────────┴───┴───┴╌╌╌╌┘

      type subclass:
      ┌───────────────────┬──────────────┬─────────────┬───┬───┬╌╌╌╌┐
      │ PyObject_VAR_HEAD │Heap type data│subclass data│var. data   │
      └───────────────────┴──────────────┴─────────────┴───┴───┴╌╌╌╌┘

The first layout enables fast access to the items array.
The second allows subclasses to ignore the variable-sized array (assuming
they use offsets from the start of the object to access their data).
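
To make the difference between the two layouts concrete, here is a minimal
sketch of the offset arithmetic each one implies. It is not CPython code:
``layout_info`` and the three helper functions are hypothetical names
introduced only for this illustration, and alignment padding is ignored.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical description of a variable-sized base type
     * (illustration only, not a CPython structure). */
    typedef struct {
        size_t basicsize;   /* fixed part, including the object header */
        size_t itemsize;    /* size of one variable-sized item */
    } layout_info;

    /* tuple-like layout: items start at a fixed offset (the base's
     * basicsize), so subclass data has to follow the last item.
     * Its offset depends on the number of items. */
    static size_t
    tuple_like_subclass_data_offset(layout_info base, size_t n_items)
    {
        return base.basicsize + n_items * base.itemsize;
    }

    /* PyHeapTypeObject-like layout: subclass data sits at a fixed offset,
     * right after the base's fixed part ... */
    static size_t
    heap_type_like_subclass_data_offset(layout_info base)
    {
        return base.basicsize;
    }

    /* ... and the items move to the end, after the subclass data. */
    static size_t
    heap_type_like_items_offset(layout_info base, size_t subclass_data_size)
    {
        return base.basicsize + subclass_data_size;
    }

In the second layout the subclass never needs the item count to find its own
data, which is what lets it ignore the variable-sized array.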

Which layout is used is, unfortunately, an implementation detail that the
subclass code must take into account.
Correspondingly, if a variable-sized type is designed to be extended in C,
its documentation should note the mechanism used.
Since this PEP focuses on ``PyHeapTypeObject``, it proposes API for the second
variant.

Like with fixed-size types, extending a variable-sized type is already
possible: when creating the class, ``base->tp_itemsize`` needs to be passed
as ``PyType_Spec.itemsize``.
This is cumbersome in the Limited API, where one needs to resort to
``PyObject_GetAttrString(obj, "__itemsize__")``, with the same caveats as for
``__basicsize__`` above.

This PEP proposes a mechanism to instruct the interpreter to do this on its
own, without the extension needing to read ``base->tp_itemsize``.

Several alternatives for this mechanism were rejected:

* The easiest way to do this would be to allow leaving ``itemsize`` as 0 to
  mean “inherit”.
  However, unlike ``basicsize``, zero is a valid value for ``itemsize`` --
  it marks fixed-sized types.
  Also, in C, zero is the default value used when ``itemsize`` is not specified.
  Since extending a variable-sized type requires *some* knowledge of the
  superclass, it would be a good idea to require a more explicit way
  to request it.
* It would be possible to reserve a special negative value like ``itemsize=-1``
  to mean “inherit”.
  But this would rule out a possible future where negative ``itemsize``
  more closely matches negative ``basicsize`` -- a request for
  additional space.
* A new flag would also work, but ``tp_flags`` is running out of free bits.
  Reserving one for a flag only used in type creation seems wasteful.

So, this PEP proposes a new :external+python:c:type:`PyType_Slot` to mark
that ``tp_itemsize`` should be inherited.
When this slot is used, ``itemsize`` must be set to zero.
Like with ``tp_basicsize``, ``tp_itemsize`` will be set to the computed value
as the class is created.


Normalizing the ``PyHeapTypeObject``-like layout
''''''''''''''''''''''''''''''''''''''''''''''''

Additionally, this PEP proposes a helper function to get the variable-sized
data of a given instance, assuming it uses the ``PyHeapTypeObject``-like layout.
This is mainly to make it easier to define and document such types.

This function will not be exposed in the Limited API.


Relative member offsets
-----------------------

One more piece of the puzzle is ``PyMemberDef.offset``.
Extensions that use a subclass-specific ``struct`` (``SubListState`` above)
will get a way to specify “relative” offsets -- offsets based on this ``struct``
-- rather than “absolute” ones (based on ``PyObject*``).

One way to do it would be to automatically assume “relative” offsets
if this PEP's API is used to create a class.
However, this implicit assumption may be too surprising.

To be more explicit, this PEP proposes a new flag for “relative” offsets.
At least initially, this flag will serve only as a check against misuse
(and a hint for reviewers).
It must be present when members are defined with the new API, and must not
be used otherwise.


Specification
=============

In the code blocks below, only function headers are part of the specification.
Other code (the size/offset calculations) is a detail of the initial CPython
implementation, and subject to change.

Relative ``basicsize``
----------------------

The ``basicsize`` member of ``PyType_Spec`` will be allowed to be zero or
negative.
In that case, its absolute value will specify the *extra* storage space that
instances of the new class require, in addition to the basicsize of the
base class.
That is, the basicsize of the resulting class will be:

.. code-block:: c

    type->tp_basicsize = _align(base->tp_basicsize) + _align(-spec->basicsize);

where ``_align`` rounds up to a multiple of ``alignof(max_align_t)``.
When ``spec->basicsize`` is zero, ``base->tp_basicsize`` will be inherited
directly instead (i.e. set to ``base->tp_basicsize`` without aligning).
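
The size rules above can be sketched as self-contained C. This is only an
illustration, not the CPython implementation: ``align_up`` stands in for
``_align``, and ``resolve_basicsize`` is a hypothetical helper that takes
the base's size and the ``PyType_Spec.basicsize`` value as plain integers.

.. code-block:: c

    #include <stddef.h>

    /* Round size up to a multiple of alignof(max_align_t) (C11). */
    static size_t
    align_up(size_t size)
    {
        size_t a = _Alignof(max_align_t);
        return (size + a - 1) / a * a;
    }

    /* Hypothetical helper: compute the resulting tp_basicsize from the
     * base's tp_basicsize and the value given in PyType_Spec.basicsize. */
    static ptrdiff_t
    resolve_basicsize(ptrdiff_t base_basicsize, ptrdiff_t spec_basicsize)
    {
        if (spec_basicsize > 0) {
            /* Positive: an absolute size, as in the existing API. */
            return spec_basicsize;
        }
        if (spec_basicsize == 0) {
            /* Zero: inherit the base's size directly, without aligning. */
            return base_basicsize;
        }
        /* Negative: -spec_basicsize bytes of extra, subclass-specific data. */
        return (ptrdiff_t)(align_up((size_t)base_basicsize)
                           + align_up((size_t)(-spec_basicsize)));
    }

Note that the negative case always yields a multiple of
``alignof(max_align_t)``, since both terms are rounded up separately.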

On an instance, the memory area specific to a subclass -- that is, the
“extra space” that subclass reserves in addition to its base -- will be
available through a new function, ``PyObject_GetTypeData``.
In CPython, this function will be defined as:

.. code-block:: c

    void *
    PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls) {
        return (char *)obj + _align(cls->tp_base->tp_basicsize);
    }

Another function will be added to retrieve the size of this memory area:

.. code-block:: c

    Py_ssize_t
    PyObject_GetTypeDataSize(PyTypeObject *cls) {
        return cls->tp_basicsize - _align(cls->tp_base->tp_basicsize);
    }
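
To see how the two definitions fit together, the following sketch replays the
same arithmetic on stand-in types, so it can be compiled without
``Python.h``. Everything prefixed with ``mock_`` is hypothetical, as is
``align_up`` (a stand-in for ``_align``). The sketch assumes ``cls`` was
created with a negative ``basicsize``, so ``cls->tp_base`` is never ``NULL``.

.. code-block:: c

    #include <stddef.h>

    /* Stand-in for the two PyTypeObject fields used by the calculation. */
    typedef struct mock_type {
        struct mock_type *tp_base;
        ptrdiff_t tp_basicsize;
    } mock_type;

    /* Round size up to a multiple of alignof(max_align_t) (C11). */
    static size_t
    align_up(size_t size)
    {
        size_t a = _Alignof(max_align_t);
        return (size + a - 1) / a * a;
    }

    /* Same arithmetic as PyObject_GetTypeData: the subclass data starts
     * at the aligned end of the base's fixed-size part. */
    static void *
    mock_get_type_data(void *obj, const mock_type *cls)
    {
        return (char *)obj + align_up((size_t)cls->tp_base->tp_basicsize);
    }

    /* Same arithmetic as PyObject_GetTypeDataSize: whatever lies between
     * that start and the end of cls's own fixed-size part. */
    static ptrdiff_t
    mock_get_type_data_size(const mock_type *cls)
    {
        return cls->tp_basicsize
               - (ptrdiff_t)align_up((size_t)cls->tp_base->tp_basicsize);
    }

Because the requested size is itself rounded up when the class is created,
the reported size may be larger than the ``sizeof`` that was requested, but
never smaller.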

The new ``*Get*`` functions come with an important caveat, which will be
pointed out in documentation: They may only be used for classes created using
negative ``PyType_Spec.basicsize``. For other classes, their behavior is
undefined.
(Note that this allows the above code to assume ``cls->tp_base`` is not
``NULL``.)


Inheriting ``itemsize``
-----------------------

If a new slot, ``Py_tp_inherit_itemsize``, is present in
``PyType_Spec.slots``, the new class will inherit
the base's ``tp_itemsize``.

If this is the case, CPython will assert that:

* ``PyType_Spec.itemsize`` is set to zero.
* The ``Py_tp_inherit_itemsize`` slot's
  ``PyType_Slot.pfunc`` is set to ``NULL``.

A new function, ``PyObject_GetItemData``, will be added to safely access the
memory reserved for items, taking subclasses that extend ``tp_basicsize``
into account.
In CPython it will be defined as:

.. code-block:: c

    void *
    PyObject_GetItemData(PyObject *obj) {
        return (char *)obj + Py_TYPE(obj)->tp_basicsize;
    }

This function will *not* be added to the Limited API.
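
As a rough illustration of why the item data is located through the
instance's *own* type, the sketch below repeats the calculation on a stand-in
structure. The ``mock_`` names are hypothetical, and the type is passed in
explicitly where the real function would use ``Py_TYPE(obj)``.

.. code-block:: c

    #include <stddef.h>

    /* Stand-in for the PyTypeObject fields relevant to item storage. */
    typedef struct {
        ptrdiff_t tp_basicsize;
        ptrdiff_t tp_itemsize;
    } mock_vartype;

    /* Same arithmetic as PyObject_GetItemData: items start where the
     * fixed-size part of the instance's own type ends. */
    static void *
    mock_get_item_data(void *obj, const mock_vartype *type)
    {
        return (char *)obj + type->tp_basicsize;
    }

    /* Address of the item at `index`, assuming items are stored
     * contiguously with a stride of tp_itemsize. */
    static void *
    mock_get_item(void *obj, const mock_vartype *type, ptrdiff_t index)
    {
        return (char *)mock_get_item_data(obj, type)
               + index * type->tp_itemsize;
    }

If a subclass grows ``tp_basicsize`` from, say, 48 to 64 bytes while
inheriting an ``itemsize`` of 8, its items start at offset 64 rather than 48.
Code that hard-codes the base's offset would then read subclass data instead
of items.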

Note that it **is not safe** to use **any** of the functions added in this PEP
unless **all classes in the inheritance hierarchy** only use
``PyObject_GetItemData`` (or an equivalent) for per-item memory, or don't
use per-item memory at all.
(This issue already exists for most current classes that use variable-length
arrays in the instance struct, but it's much less obvious if the base struct
layout is unknown.)

The documentation for all API added in this PEP will mention
the caveat.


Relative member offsets
-----------------------

In types defined using negative ``PyType_Spec.basicsize``, the offsets of
members defined via ``Py_tp_members`` must be relative to the
extra subclass data, rather than the full ``PyObject`` struct.
This will be indicated by a new flag, ``PY_RELATIVE_OFFSET``.

In the initial implementation, the new flag will be redundant. It only serves
to make the offset's changed meaning clear, and to help avoid mistakes.
It will be an error to *not* use ``PY_RELATIVE_OFFSET`` with negative
``basicsize``, and it will be an error to use it in any other context
(i.e. direct or indirect calls to ``PyDescr_NewMember``, ``PyMember_GetOne``,
``PyMember_SetOne``).
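
The rule above, together with the offset adjustment made when a type is
created, might be sketched as follows. This is an assumption-laden
illustration rather than the CPython implementation: ``mock_member`` stands
in for ``PyMemberDef``, ``MOCK_RELATIVE_OFFSET`` has an arbitrary value, and
``mock_fixup_member`` is a hypothetical helper.

.. code-block:: c

    #include <stddef.h>

    /* Arbitrary value chosen for this sketch; the PEP does not fix one. */
    #define MOCK_RELATIVE_OFFSET 0x8

    /* Stand-in for the PyMemberDef fields that matter here. */
    typedef struct {
        const char *name;
        ptrdiff_t offset;
        int flags;
    } mock_member;

    /* Validate and normalize one member definition.
     * The flag must be set if and only if the type was created with a
     * negative basicsize. Returns 0 on success and -1 on misuse; on
     * success a relative offset is turned into an absolute one and the
     * flag is cleared. */
    static int
    mock_fixup_member(mock_member *m, int basicsize_is_negative,
                      ptrdiff_t type_data_offset)
    {
        int is_relative = (m->flags & MOCK_RELATIVE_OFFSET) != 0;
        if (is_relative != (basicsize_is_negative != 0)) {
            return -1;
        }
        if (is_relative) {
            m->offset += type_data_offset;
            m->flags &= ~MOCK_RELATIVE_OFFSET;
        }
        return 0;
    }

Here ``type_data_offset`` plays the role of the aligned base size, that is,
the offset at which the subclass-specific data begins.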

CPython will adjust the offset and clear the ``PY_RELATIVE_OFFSET`` flag when
initializing a type.
This means that the created type's ``tp_members`` will not match the input
definition's ``Py_tp_members`` slot, and that any code that reads
``tp_members`` will not need to handle the flag.


Changes to ``PyTypeObject``
---------------------------

Internally in CPython, access to ``PyTypeObject`` “items”
(``_PyHeapType_GET_MEMBERS``) will be changed to use ``PyObject_GetItemData``.
Note that the current implementation is equivalent: it only lacks the
alignment adjustment.
The macro is only used a few times in type creation, so no measurable
performance impact is expected.
Public API for this data, ``tp_members``, will not be affected.


List of new API
===============

The following new functions/values are proposed.

These will be added to the Limited API/Stable ABI:

* ``void *PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls)``
* ``Py_ssize_t PyObject_GetTypeDataSize(PyTypeObject *cls)``
* ``Py_tp_inherit_itemsize`` slot for ``PyType_Spec.slots``

These will be added to the public C API only:

* ``void *PyObject_GetItemData(PyObject *obj)``


Backwards Compatibility
=======================

No backwards compatibility concerns are known.


Assumptions
===========

The implementation assumes that an instance's memory
between ``type->tp_base->tp_basicsize`` and ``type->tp_basicsize`` offsets
“belongs” to ``type`` (except variable-length types).
This is not documented explicitly, but CPython up to version 3.11 relied on it
when adding ``__dict__`` to subclasses, so it should be safe.


Security Implications
=====================

None known.


Endorsements
============

XXX: The PEP mentions wrapper libraries, so it should get review/endorsement
from nanobind, PyO3, JPype, PySide &c.

XXX: HPy devs might also want to chime in.


How to Teach This
=================

The initial implementation will include reference documentation
and a What's New entry, which should be enough for the target audience
-- authors of C extension libraries.


Reference Implementation
========================

XXX: Not quite ready yet


Possible Future Enhancements
============================

Alignment
---------

The proposed implementation may waste some space if instance structs
need smaller alignment than ``alignof(max_align_t)``.
Also, dealing with alignment makes the calculation slower than it could be
if we could rely on ``base->tp_basicsize`` being properly aligned for the
subtype.

In other words, the proposed implementation focuses on safety and ease of use,
and trades space and time for it.
If it turns out that this is a problem, the implementation can be adjusted
without breaking the API:

- The offset to the type-specific buffer can be stored, so
  ``PyObject_GetTypeData`` effectively becomes
  ``(char *)obj + cls->ht_typedataoffset``, possibly speeding things up at
  the cost of an extra pointer in the class.
- Then, a new ``PyType_Slot`` can specify the desired alignment, to
  reduce space requirements for instances.
- Alternatively, it might be possible to align ``tp_basicsize`` up at class
  creation/readying time.


Rejected Ideas
==============

None yet.


Open Issues
===========

Is negative basicsize the way to go? Should this be enabled by a flag instead?


Footnotes
=========

.. [#f1] This PEP does not make it “safe” to subclass NumPy arrays specifically.
   NumPy publishes `an extensive list of caveats <https://numpy.org/doc/1.23/user/basics.subclassing.html>`__
   for subclassing its arrays from Python, and extending in C might need
   a similar list.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.