PEP: 697
Title: Limited C API for Extending Opaque Types
Author: Petr Viktorin
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Aug-2022
Python-Version: 3.12


Abstract
========

Add `Limited C API `__
for extending types with opaque data,
by allowing code to only deal with data specific to a particular (sub)class.

Make the mechanism usable with ``PyHeapTypeObject``.


Motivation
==========

The motivating problem this PEP solves is creating metaclasses (subclasses of
:py:class:`python:type`) in “wrappers” – projects that expose another type
system (e.g. C++, Java, Rust) as Python classes.
These systems typically need to attach information about the “wrapped”
non-Python class to the Python type object -- that is, extend
``PyHeapTypeObject``.

This should be possible to do in the Limited API, so that these generators
can be used to create Stable ABI extensions. (See :pep:`652` for the benefits
of providing a stable ABI.)

Extending ``type`` is an instance of a more general problem:
extending a class while maintaining loose coupling – that is,
not depending on the memory layout used by the superclass.
(That's a lot of jargon; see Rationale for a concrete example of extending
``list``.)


Rationale
=========

Extending opaque types
----------------------

In the Limited API, most ``struct``\ s are opaque: their size and memory layout
are not exposed, so they can be changed in new versions of CPython (or
alternate implementations of the C API).

This means that the usual subclassing pattern -- making the ``struct``
used for instances of the *base* type be the first element of the ``struct``
used for instances of the *derived* type -- does not work.
To illustrate with code, the `example from the tutorial `_
extends :external+python:c:type:`PyListObject` (:py:class:`python:list`)
using the following ``struct``:

.. code-block:: c

    typedef struct {
        PyListObject list;
        int state;
    } SubListObject;

This won't compile in the Limited API, since ``PyListObject`` is opaque (to
allow changes as features and optimizations are implemented).

Instead, this PEP proposes using a ``struct`` with only the state needed
in the subclass, that is:

.. code-block:: c

    typedef struct {
        int state;
    } SubListState;

    // (or just `typedef int SubListState;` in this case)

The subclass can now be completely decoupled from the memory layout (and size)
of the superclass.

This is possible today. To use such a struct:

* when creating the class, use
  ``PyList_Type.tp_basicsize + sizeof(SubListState)``
  as ``PyType_Spec.basicsize``;
* when accessing the data, use ``PyList_Type.tp_basicsize`` as the offset
  into the instance (``PyObject*``).

However, this has disadvantages:

* The base's ``basicsize`` may not be properly aligned, causing issues
  on some architectures if not mitigated. (These issues can be particularly
  nasty if alignment changes in a new release.)
* ``PyTypeObject.tp_basicsize`` is not exposed in the
  Limited API, so extensions that support the Limited API need to
  use ``PyObject_GetAttrString(obj, "__basicsize__")``.
  This is cumbersome, and unsafe in edge cases (the Python attribute can
  be overridden).
* Variable-size types are not handled (see `var-sized`_ below).

To make this easy (and even *best practice* for projects that choose loose
coupling over maximum performance), this PEP proposes an API to:

1. During class creation, specify that ``SubListState``
   should be “appended” to ``PyListObject``, without passing any additional
   details about ``list``. (The interpreter itself gets all necessary info,
   like ``tp_basicsize``, from the base).

   This will be specified by a negative ``PyType_Spec.basicsize``:
   ``-sizeof(SubListState)``.

2. Given an instance, and the subclass ``PyTypeObject*``, get a pointer
   to the ``SubListState``.
   A new function will be added for this.
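
To make the first disadvantage of the manual approach concrete, the layout
can be modelled in plain C, without CPython.
This is only a sketch: ``manual_basicsize``, ``manual_state_offset`` and
``offset_is_aligned`` are hypothetical helpers that are not part of any API,
and ``base_basicsize`` stands in for a value read from the base type at
run time.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* The only struct the subclass author defines (from the text above). */
    typedef struct {
        int state;
    } SubListState;

    /* Manual step 1: the value passed as PyType_Spec.basicsize today.
       base_basicsize models the base's tp_basicsize (an assumption of this
       sketch; it is known only at run time). */
    size_t
    manual_basicsize(size_t base_basicsize)
    {
        return base_basicsize + sizeof(SubListState);
    }

    /* Manual step 2: byte offset of SubListState inside an instance. */
    size_t
    manual_state_offset(size_t base_basicsize)
    {
        return base_basicsize;
    }

    /* Nonzero if `offset` is a multiple of `alignment`. */
    int
    offset_is_aligned(size_t offset, size_t alignment)
    {
        assert(alignment != 0);
        return offset % alignment == 0;
    }

With a base size of 40 bytes, the state lands on a 4-byte boundary.
A (hypothetical) base size of 42 bytes would put it at offset 42, which is
misaligned for any type that needs 4-byte alignment; this is the case the
proposed API is meant to handle on the extension's behalf.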

The base class is not limited to ``PyListObject``, of course: it can be used to
extend any base class whose instance ``struct`` is opaque, unstable across
releases, or not exposed at all -- including :py:class:`python:type`
(``PyHeapTypeObject``) mentioned earlier, but also other extensions
(for example, NumPy arrays [#f1]_).

For cases where no additional state is needed, a zero ``basicsize`` will be
allowed: in that case, the base's ``tp_basicsize`` will be inherited.
(With the current API, the base's ``basicsize`` needs to be passed in.)

The ``tp_basicsize`` of the new class will be set to the computed total size,
so code that inspects classes will continue working as before.


.. _var-sized:

Extending variable-size objects
-------------------------------

Additional considerations are needed to subclass
:external+python:c:type:`variable-sized objects `
while maintaining loose coupling as much as possible.

Unfortunately, in this case we cannot decouple the subclass from its superclass
entirely.
There are two main memory layouts for variable-sized objects, and the
subclass's author needs to know which one the superclass uses.

In types such as ``int`` or ``tuple``, the variable data is stored at a fixed
offset.
If subclasses need additional space, it must be added after any variable-sized
data::

      PyTupleObject:
      ┌───────────────────┬───┬───┬╌╌╌╌┐
      │ PyObject_VAR_HEAD │var. data   │
      └───────────────────┴───┴───┴╌╌╌╌┘

      tuple subclass:
      ┌───────────────────┬───┬───┬╌╌╌╌┬─────────────┐
      │ PyObject_VAR_HEAD │var. data   │subclass data│
      └───────────────────┴───┴───┴╌╌╌╌┴─────────────┘

In other types, like ``PyHeapTypeObject``, variable-sized data always lives at
the end of the instance's memory area::

      heap type:
      ┌───────────────────┬──────────────┬───┬───┬╌╌╌╌┐
      │ PyObject_VAR_HEAD │Heap type data│var. data   │
      └───────────────────┴──────────────┴───┴───┴╌╌╌╌┘

      type subclass:
      ┌───────────────────┬──────────────┬─────────────┬───┬───┬╌╌╌╌┐
      │ PyObject_VAR_HEAD │Heap type data│subclass data│var. data   │
      └───────────────────┴──────────────┴─────────────┴───┴───┴╌╌╌╌┘

The first layout enables fast access to the items array.
The second allows subclasses to ignore the variable-sized array (assuming
they use offsets from the start of the object to access their data).

Which layout is used is, unfortunately, an implementation detail that the
subclass code must take into account.
Correspondingly, if a variable-sized type is designed to be extended in C,
its documentation should note the mechanism used.

Since this PEP focuses on ``PyHeapTypeObject``, it proposes API for the second
variant.

Like with fixed-size types, extending a variable-sized type is already
possible: when creating the class, ``base->tp_itemsize`` needs to be passed
as ``PyType_Spec.itemsize``.
This is cumbersome in the Limited API, where one needs to resort to
``PyObject_GetAttrString(obj, "__itemsize__")``, with the same caveats as for
``__basicsize__`` above.

This PEP proposes a mechanism to instruct the interpreter to do this on its
own, without the extension needing to read ``base->tp_itemsize``.

Several alternatives for this mechanism were rejected:

* The easiest way to do this would be to allow leaving ``itemsize`` as 0 to
  mean “inherit”.
  However, unlike ``basicsize``, zero is a valid value for ``itemsize`` --
  it marks fixed-sized types.
  Also, in C, zero is the default value used when ``itemsize`` is not
  specified.
  Since extending a variable-sized type requires *some* knowledge of the
  superclass, it would be a good idea to require a more explicit way
  to request it.
* It would be possible to reserve a special negative value like
  ``itemsize=-1`` to mean “inherit”.
  But this would rule out a possible future where negative ``itemsize``
  more closely matches negative ``basicsize`` -- a request for
  additional space.
* A new flag would also work, but ``tp_flags`` is running out of free bits.
  Reserving one for a flag only used in type creation seems wasteful.

So, this PEP proposes a new :external+python:c:type:`PyType_Slot` to mark
that ``tp_itemsize`` should be inherited.
When this slot is used, ``itemsize`` must be set to zero.
Like with ``tp_basicsize``, ``tp_itemsize`` will be set to the computed value
as the class is created.


Normalizing the ``PyHeapTypeObject``-like layout
''''''''''''''''''''''''''''''''''''''''''''''''

Additionally, this PEP proposes a helper function to get the variable-sized
data of a given instance, assuming it uses the ``PyHeapTypeObject``-like
layout.
This is mainly to make it easier to define and document such types.

This function will not be exposed in the Limited API.


Relative member offsets
-----------------------

One more piece of the puzzle is ``PyMemberDef.offset``.
Extensions that use a subclass-specific ``struct`` (``SubListState`` above)
will get a way to specify “relative” offsets -- offsets based on this
``struct`` -- rather than “absolute” ones (based on ``PyObject*``).

One way to do it would be to automatically assume “relative” offsets
if this PEP's API is used to create a class.
However, this implicit assumption may be too surprising.

To be more explicit, this PEP proposes a new flag for “relative” offsets.
At least initially, this flag will serve only as a check against misuse
(and a hint for reviewers).
It must be present if used with the new API, and must not be used otherwise.


Specification
=============

In the code blocks below, only function headers are part of the specification.
The rest of the code (the size/offset calculations) is a detail of the
initial CPython implementation, and subject to change.


Relative ``basicsize``
----------------------

The ``basicsize`` member of ``PyType_Spec`` will be allowed to be zero or
negative.
In that case, it will specify the negative of the *extra* storage space that
instances of the new class require, in addition to the basicsize of the
base class.
That is, the basicsize of the resulting class will be:

.. code-block:: c

    type->tp_basicsize = _align(base->tp_basicsize) + _align(-spec->basicsize);

where ``_align`` rounds up to a multiple of ``alignof(max_align_t)``.
When ``spec->basicsize`` is zero, ``base->tp_basicsize`` will be inherited
directly instead (i.e. set to ``base->tp_basicsize`` without aligning).

On an instance, the memory area specific to a subclass -- that is, the
“extra space” that the subclass reserves in addition to its base -- will be
available through a new function, ``PyObject_GetTypeData``.
In CPython, this function will be defined as:

.. code-block:: c

    void *
    PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls) {
        return (char *)obj + _align(cls->tp_base->tp_basicsize);
    }

Another function will be added to retrieve the size of this memory area:

.. code-block:: c

    Py_ssize_t
    PyObject_GetTypeDataSize(PyTypeObject *cls) {
        return cls->tp_basicsize - _align(cls->tp_base->tp_basicsize);
    }

The new ``*Get*`` functions come with an important caveat, which will be
pointed out in documentation: They may only be used for classes created using
negative ``PyType_Spec.basicsize``. For other classes, their behavior is
undefined.
(Note that this allows the above code to assume ``cls->tp_base`` is not
``NULL``.)


Inheriting ``itemsize``
-----------------------

If a new slot, ``Py_tp_inherit_itemsize``, is present in
``PyType_Spec.slots``, the new class will inherit
the base's ``tp_itemsize``.

If this is the case, CPython will assert that:

* ``PyType_Spec.itemsize`` must be set to zero.
* The ``Py_tp_inherit_itemsize`` slot's
  ``PyType_Slot.pfunc`` must be set to NULL.
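
The size rules of the two sections above can be modelled together in plain C.
The following is a sketch, not the CPython implementation: ``ModelSizes``,
``model_align``, ``model_new_sizes`` and ``model_type_data_size`` are
hypothetical names, the ``pfunc`` check is left out, and a violated rule is
reported with a ``-1`` return value rather than an assertion.

.. code-block:: c

    #include <assert.h>
    #include <stdalign.h>
    #include <stddef.h>

    /* Hypothetical model of the two sizes tracked per type. */
    typedef struct {
        ptrdiff_t basicsize;   /* models tp_basicsize */
        ptrdiff_t itemsize;    /* models tp_itemsize */
    } ModelSizes;

    /* Round up to a multiple of alignof(max_align_t), like _align above. */
    ptrdiff_t
    model_align(ptrdiff_t size)
    {
        ptrdiff_t a = (ptrdiff_t)alignof(max_align_t);
        return (size + a - 1) / a * a;
    }

    /* Compute the sizes of a new class from its base and its spec.
       Returns 0 on success, -1 if the itemsize rule is violated. */
    int
    model_new_sizes(ModelSizes base, ptrdiff_t spec_basicsize,
                    ptrdiff_t spec_itemsize, int inherit_itemsize,
                    ModelSizes *result)
    {
        if (inherit_itemsize && spec_itemsize != 0) {
            return -1;  /* itemsize must be zero when it is inherited */
        }
        if (spec_basicsize > 0) {
            /* Positive: an absolute size, as with the current API. */
            result->basicsize = spec_basicsize;
        }
        else if (spec_basicsize == 0) {
            /* Zero: inherit the base's size without aligning it. */
            result->basicsize = base.basicsize;
        }
        else {
            /* Negative: append -spec_basicsize bytes to the aligned base. */
            result->basicsize = model_align(base.basicsize)
                                + model_align(-spec_basicsize);
        }
        result->itemsize = inherit_itemsize ? base.itemsize : spec_itemsize;
        return 0;
    }

    /* Size of the subclass-specific area (models PyObject_GetTypeDataSize). */
    ptrdiff_t
    model_type_data_size(ModelSizes base, ModelSizes cls)
    {
        return cls.basicsize - model_align(base.basicsize);
    }

Because both terms are rounded up separately, a request for a single extra
byte still reserves a full ``alignof(max_align_t)`` bytes in this model.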

A new function, ``PyObject_GetItemData``, will be added to safely access the
memory reserved for items, taking subclasses that extend ``tp_basicsize``
into account.
In CPython it will be defined as:

.. code-block:: c

    void *
    PyObject_GetItemData(PyObject *obj) {
        return (char *)obj + Py_TYPE(obj)->tp_basicsize;
    }

This function will *not* be added to the Limited API.

Note that it **is not safe** to use **any** of the functions added in this PEP
unless **all classes in the inheritance hierarchy** only use
``PyObject_GetItemData`` (or an equivalent) for per-item memory, or don't use
per-item memory at all.
(This issue already exists for most current classes that use variable-length
arrays in the instance struct, but it's much less obvious if the base struct
layout is unknown.)

The documentation for all API added in this PEP will mention the caveat.


Relative member offsets
-----------------------

In types defined using negative ``PyType_Spec.basicsize``, the offsets of
members defined via ``Py_tp_members`` must be relative to the
extra subclass data, rather than the full ``PyObject`` struct.
This will be indicated by a new flag, ``PY_RELATIVE_OFFSET``.

In the initial implementation, the new flag will be redundant. It only serves
to make the offset's changed meaning clear, and to help avoid mistakes.
It will be an error to *not* use ``PY_RELATIVE_OFFSET`` with negative
``basicsize``, and it will be an error to use it in any other context
(i.e. direct or indirect calls to ``PyDescr_NewMember``, ``PyMember_GetOne``,
``PyMember_SetOne``).

CPython will adjust the offset and clear the ``PY_RELATIVE_OFFSET`` flag when
initializing a type.
This means that the created type's ``tp_members`` will not match the input
definition's ``Py_tp_members`` slot, and that any code that reads
``tp_members`` will not need to handle the flag.
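
The offset adjustment described above can be sketched in plain C.
Everything named ``Model...`` or ``model_...`` here is hypothetical:
``MODEL_RELATIVE_OFFSET`` stands in for ``PY_RELATIVE_OFFSET`` (this PEP does
not give it a numeric value), ``ModelMemberDef`` is a reduced stand-in for
``PyMemberDef``, and errors are reported with a ``-1`` return value.

.. code-block:: c

    #include <assert.h>
    #include <stdalign.h>
    #include <stddef.h>

    /* Hypothetical flag bit standing in for PY_RELATIVE_OFFSET. */
    #define MODEL_RELATIVE_OFFSET 0x8

    /* Reduced model of PyMemberDef: only the fields relevant here. */
    typedef struct {
        const char *name;
        ptrdiff_t offset;
        int flags;
    } ModelMemberDef;

    /* Round up to a multiple of alignof(max_align_t), like _align above. */
    ptrdiff_t
    model_align(ptrdiff_t size)
    {
        ptrdiff_t a = (ptrdiff_t)alignof(max_align_t);
        return (size + a - 1) / a * a;
    }

    /* Convert one member definition while a type is being initialized.
       Returns 0 on success, or -1 (leaving the member untouched) if the
       flag does not match the sign of the spec's basicsize. */
    int
    model_resolve_member(ModelMemberDef *member, ptrdiff_t base_basicsize,
                         ptrdiff_t spec_basicsize)
    {
        int relative = (member->flags & MODEL_RELATIVE_OFFSET) != 0;
        if (relative != (spec_basicsize < 0)) {
            return -1;  /* flag required with negative basicsize only */
        }
        if (relative) {
            /* Make the offset absolute and drop the now-redundant flag. */
            member->offset += model_align(base_basicsize);
            member->flags &= ~MODEL_RELATIVE_OFFSET;
        }
        return 0;
    }

After a successful call the member holds an absolute offset and no flag,
which is why code reading ``tp_members`` would not need to know about
relative offsets.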


Changes to ``PyTypeObject``
---------------------------

Internally in CPython, access to ``PyTypeObject`` “items”
(``_PyHeapType_GET_MEMBERS``) will be changed to use ``PyObject_GetItemData``.
Note that the current implementation is equivalent: it only lacks the
alignment adjustment.
The macro is used a few times in type creation, so no measurable
performance impact is expected.
Public API for this data, ``tp_members``, will not be affected.


List of new API
===============

The following new functions/values are proposed.

These will be added to the Limited API/Stable ABI:

* ``void * PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls)``
* ``Py_ssize_t PyObject_GetTypeDataSize(PyTypeObject *cls)``
* ``Py_tp_inherit_itemsize`` slot for ``PyType_Spec.slots``

These will be added to the public C API only:

* ``void *PyObject_GetItemData(PyObject *obj)``


Backwards Compatibility
=======================

No backwards compatibility concerns are known.


Assumptions
===========

The implementation assumes that an instance's memory
between ``type->tp_base->tp_basicsize`` and ``type->tp_basicsize`` offsets
“belongs” to ``type`` (except variable-length types).
This is not documented explicitly, but CPython up to version 3.11 relied on it
when adding ``__dict__`` to subclasses, so it should be safe.


Security Implications
=====================

None known.


Endorsements
============

XXX: The PEP mentions wrapper libraries, so it should get review/endorsement
from nanobind, PyO3, JPype, PySide &c.

XXX: HPy devs might also want to chime in.


How to Teach This
=================

The initial implementation will include reference documentation
and a What's New entry, which should be enough for the target audience
-- authors of C extension libraries.


Reference Implementation
========================

XXX: Not quite ready yet


Possible Future Enhancements
============================

Alignment
---------

The proposed implementation may waste some space if instance structs
need smaller alignment than ``alignof(max_align_t)``.
Also, dealing with alignment makes the calculation slower than it could be
if we could rely on ``base->tp_basicsize`` being properly aligned for the
subtype.

In other words, the proposed implementation focuses on safety and ease of use,
and trades space and time for it.
If it turns out that this is a problem, the implementation can be adjusted
without breaking the API:

- The offset to the type-specific buffer can be stored, so
  ``PyObject_GetTypeData`` effectively becomes
  ``(char *)obj + cls->ht_typedataoffset``, possibly speeding things up at
  the cost of an extra pointer in the class.
- Then, a new ``PyType_Slot`` can specify the desired alignment, to
  reduce space requirements for instances.
- Alternatively, it might be possible to align ``tp_basicsize`` up at class
  creation/readying time.


Rejected Ideas
==============

None yet.


Open Issues
===========

Is negative basicsize the way to go? Should this be enabled by a flag instead?


Footnotes
=========

.. [#f1] This PEP does not make it “safe” to subclass NumPy arrays
   specifically.
   NumPy publishes `an extensive list of caveats `__
   for subclassing its arrays from Python, and extending in C might need
   a similar list.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.