PEP: 697 Title: C API for Extending Opaque Types Author: Petr Viktorin Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 23-Aug-2022 Python-Version: 3.12 Abstract ======== Add limited C API for extending types whose ``struct`` is opaque, by allowing code to only deal with data specific to a particular (sub)class. Make the mechanism usable with ``PyHeapType``. Motivation ========== Extending opaque types ---------------------- In order to allow changing/optimizing CPython, and allow freedom for alternate implementations of the C API, best practice is to not expose memory layout (C structs) in public API, and instead rely on accessor functions. (When this hurts performance, direct struct access can be allowed in a less stable API tier, at the expense of compatibility with diferent versions/implementations of the interpreter.) However, when a particular type's instance struct is hidden, it becomes difficult to subclass it. The usual subclassing pattern, explained `in the tutorial `_, is to put the base class ``struct`` as the first member of the subclass ``struct``. The tutorial shows this on a ``list`` subtype with extra state; adapted to a heap type (``PyType_Spec``) the example reads: .. code-block:: c typedef struct { PyListObject list; int state; } SubListObject; static PyType_Spec Sublist_spec = { .name = "sublist.SubList", .basicsize = sizeof(SubListObject), .itemsize = 0, .flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, .slots = SubList_slots }; Since the superclass struct (``PyListObject``) is part of the subclass struct (``SubListObject``): - ``PyListObject`` size must be known at compile time, and - the size must be the same across all interpreters/versions the compiled extension is ABI-compatible with. But in limited API/stable ABI, we do not expose the size of ``PyListObject``, so that it can vary between CPython versions (and even between possible alternate ABI-compatible C API implementations). With the size not available, limited API users must resort to workarounds such as querying ``__basicsize__`` and plugging it into ``PyType_Spec`` at runtime, and divining the correct offset for their extra data. This requires making assumptions about memory layout, which the limited API is supposed to hide. Extending variable-size objects ------------------------------- Another scenario where the traditional way to extend an object does not work is variable-sized objects, i.e. ones with non-zero ``tp_itemsize``. If the instance struct ends with a variable-length array (such as in ``tuple`` or ``int``), subclasses cannot add their own extra data without detailed knowledge about how the superclass allocates and uses its memory. Some types, such as CPython's ``PyHeapType``, handle this by storing variable-sized data after the fixed-size struct. This means that any subclass can add its own fixed-size data. (Only one class in the inheritance hierarchy can use variable-sized data, though.) This PEP proposes API that makes this practice easier, and ensures the variable-sized data is properly aligned. Note that many variable-size types, like ``int`` or ``tuple``, do not use this mechanism. This PEP does not propose any changes to existing variable-size types (like ``int`` or ``tuple``) except ``PyHeapType``. Extending ``PyHeapType`` specifically ------------------------------------- The motivating problem this PEP solves is creating metaclasses, that is, subclasses of ``type``. The underlying ``PyHeapTypeObject`` struct is both variable-sized and opaque in the limited API. Projects such as language bindings and frameworks that need to attach custom data to metaclasses currently resort to questionable workarounds. The situation is worse in projects that target the Limited API. For an example of the currently necessary workarounds, see: `nb_type_data_static `_ in the not-yet-released limited-API branch of ``nanobind`` (a spiritual successor of the popular C++ binding generator ``pybind11``). Rationale ========= This PEP proposes a different model: instead of the superclass data being part of the subclass data, the extra space a subclass needs is specified and accessed separately. (How base class data is accessed is left to whomever implements the base class: they can for example provide accessor functions, expose a part of its ``struct`` for better performance, or do both.) The proposed mechanism allows using static, read-only ``PyType_Spec`` even if the superclass struct is opaque, like ``PyTypeObject`` in the Limited API. Combined with a way to create class from ``PyType_Spec`` and a custom metaclass, this will allow libraries like nanobind or JPype to create metaclasses without making assumptions about ``PyTypeObject``'s memory layout. The approach generalizes to non-metaclass types as well. Specification ============= In the code blocks below, only function headers are part of the specification. Other code (the size/offset calculations) are details of the initial CPython implementation, and subject to change. Relative ``basicsize`` ---------------------- The ``basicsize`` member of ``PyType_Spec`` will be allowed to be zero or negative. In that case, its absolute value will specify the amount of *extra* storage space instances of the new class require, in addition to the basicsize of the base class. That is, the basicsize of the resulting class will be: .. code-block:: c type->tp_basicsize = _align(base->tp_basicsize) + _align(-spec->basicsize); where ``_align`` rounds up to a multiple of ``alignof(max_align_t)``. When ``spec->basicsize`` is zero, ``base->tp_basicsize`` will be inherited directly instead (i.e. set to ``base->tp_basicsize`` without aligning). On an instance, the memory area specific to a subclass -- that is, the “extra space” that subclass reserves in addition its base -- will be available using a new function, ``PyObject_GetTypeData``. In CPython, this function will be defined as: .. code-block:: c void * PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls) { return (char *)obj + _align(cls->tp_base->tp_basicsize); } Another function will be added to retreive the size of this memory area: .. code-block:: c Py_ssize_t PyObject_GetTypeDataSize(PyTypeObject *cls) { return cls->tp_basicsize - _align(cls->tp_base->tp_basicsize); } The functionality comes with two important caveats, which will be pointed out in documentation: - The new functions may only be used for classes created using negative ``PyType_Spec.basicsize``. For other classes, the behavior is undefined. (Note that this allows the above code to assume ``cls->tp_base`` is not ``NULL``.) - Classes of variable-length objects (those with non-zero ``tp_itemsize``) can only be meaningfully extended using negative ``basicsize`` if all superclasses cooperate (see below). Of types defined by Python, initially only ``PyTypeObject`` will do so, others (including ``int`` or ``tuple``) will not. Inheriting ``itemsize`` ----------------------- If the ``itemsize`` member of ``PyType_Spec`` is set to zero, the itemsize will be inherited from the base class . .. note:: This PEP does not propose specifying “relative” ``itemsize`` (using a negative number). There is a lack of motivating use cases, and there's no obvious best memory layout for sharing item storage across classes in the inheritance hierarchy. A new function, ``PyObject_GetItemData``, will be added to safely access the memory reserved for items, taking subclasses that extend ``tp_basicsize`` into account. In CPython it will be defined as: .. code-block:: c void * PyObject_GetItemData(PyObject *obj) { return (char *)obj + Py_TYPE(obj)->tp_basicsize; } This function will *not* be added to the Limited API. Note that it **is not safe** to use **any** of the functions added in this PEP unless **all classes in the inheritance hierarchy** only use ``PyObject_GetItemData`` (or an equivalent) for per-item memory, or don't use per-item memory at all. (This issue already exists for most current classes that use variable-length arrays in the instance struct, but it's much less obvious if the base struct layout is unknown.) The documentation for all API added in this PEP will mention the caveat. Relative member offsets ----------------------- In types defined using negative ``PyType_Spec.basicsize``, the offsets of members defined via ``Py_tp_members`` must be “relative” -- to the extra subclass data, rather than the full ``PyObject`` struct. This will be indicated by a new flag, ``PY_RELATIVE_OFFSET``. In the initial implementation, the new flag will be redundant -- it only serves to make the offset's changed meaning clear. It is an error to *not* use ``PY_RELATIVE_OFFSET`` with negative ``basicsize``, and it is an error to use it in any other context (i.e. direct or indirect calls to ``PyDescr_NewMember``, ``PyMember_GetOne``, ``PyMember_SetOne``). CPython will adjust the offset and clear the ``PY_RELATIVE_OFFSET`` flag when intitializing a type. This means that the created type's ``tp_members`` will not match the input definition's ``Py_tp_members`` slot, and that any code that reads ``tp_members`` does not need to handle the flag. Changes to ``PyTypeObject`` --------------------------- Internally in CPython, access to ``PyTypeObject`` “items” (``_PyHeapType_GET_MEMBERS``) will be changed to use ``PyObject_GetItemData``. Note that the current implementation is equivalent except it lacks the alignment adjustment. The macro is used a few times in type creation, so no measurable performance impact is expected. Public API for this data, ``tp_members``, will not be affected. List of new API =============== The following new functions are proposed. These will be added to the Limited API/Stable ABI: * ``void * PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls)`` * ``Py_ssize_t PyObject_GetTypeDataSize(PyTypeObject *cls)`` These will be added to the public C API only: * ``void *PyObject_GetItemData(PyObject *obj)`` Backwards Compatibility ======================= There are no known backwards compatibility concerns. Security Implications ===================== None known. Endorsements ============ XXX: The PEP mentions nanobind -- make sure they agree! XXX: HPy, JPype, PySide might also want to chime in. How to Teach This ================= The initial implementation will include reference documentation and a What's New entry, which should be enough for the target audience -- authors of C extension libraries. Reference Implementation ======================== XXX: Not quite ready yet Possible Future Enhancements ============================ Alignment --------- The proposed implementation may waste some space if instance structs need smaller alignment than ``alignof(max_align_t)``. Also, dealing with alignment makes the calculation slower than it could be if we could rely on ``base->tp_basicsize`` being properly aligned for the subtype. In other words, the proposed implementation focuses on safety and ease of use, and trades space and time for it. If it turns out that this is a problem, the implementation can be adjusted without breaking the API: - The offset to the type-specific buffer can be stored, so ``PyObject_GetTypeData`` effectively becomes ``(char *)obj + cls->ht_typedataoffset``, possibly speeding things up at the cost of an extra pointer in the class. - Then, a new ``PyType_Slot`` can specify the desired alignment, to reduce space requirements for instances. - Alternatively, it might be possible to align ``tp_basicsize`` up at class creation/readying time. Rejected Ideas ============== None yet. Open Issues =========== Is negative basicsize the way to go? Should this be enabled by a flag instead? Copyright ========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.