python-peps/pep-0697.rst

562 lines
21 KiB
ReStructuredText
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 697
Title: Limited C API for Extending Opaque Types
Author: Petr Viktorin <encukou@gmail.com>
Discussions-To: https://discuss.python.org/t/19743
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Aug-2022
Python-Version: 3.12
Post-History: `24-May-2022 <https://mail.python.org/archives/list/capi-sig@python.org/thread/SIP3VP7JU4OBWP62KBOYGOYCVIOTXEFH/>`__,
`06-Oct-2022 <https://discuss.python.org/t/19743>`__,
Resolution: https://discuss.python.org/t/19743/30
.. canonical-doc::
:external+py3.12:c:member:`PyType_Spec.basicsize`,
:external+py3.12:c:func:`PyObject_GetTypeData`,
:external+py3.12:data:`Py_TPFLAGS_ITEMS_AT_END`,
:external+py3.12:c:macro:`Py_RELATIVE_OFFSET`,
:external+py3.12:c:func:`PyObject_GetItemData`
Abstract
========
Add `Limited C API <https://docs.python.org/3.11/c-api/stable.html#stable-application-binary-interface>`__
support for extending some types with opaque data
by allowing code to only deal with data specific to a particular (sub)class.
This mechanism is required to be usable with ``PyHeapTypeObject``.
This PEP does not propose allowing to extend non-dynamically sized variable
sized objects such as ``tuple`` or ``int`` due to their different memory layout
and perceived lack of demand for doing so. This PEP leaves room to do so in
the future via the same mechanism if ever desired.
Motivation
==========
The motivating problem this PEP solves is attaching C-level state
to custom types --- i.e. metaclasses (subclasses of
:py:class:`python:type`).
This is often needed in “wrappers” that expose another type
system (e.g. C++, Java, Rust) as Python classes.
These typically need to attach information about the “wrapped” non-Python
class to the Python type object.
This should be possible to do in the Limited API, so that the language wrappers
or code generators can be used to create Stable ABI extensions.
(See :pep:`652` for the benefits of providing a stable ABI.)
Extending ``type`` is an instance of a more general problem:
extending a class while maintaining loose coupling that is,
not depending on the memory layout used by the superclass.
(That's a lot of jargon; see Rationale for a concrete example of extending
``list``.)
Rationale
=========
Extending opaque types
----------------------
In the Limited API, most ``struct``\ s are opaque: their size and memory layout
are not exposed, so they can be changed in new versions of CPython (or
alternate implementations of the C API).
This means that the usual subclassing pattern -- making the ``struct``
used for instances of the *base* type be the first element of the ``struct``
used for instances of the *derived* type -- does not work.
To illustrate with code, the `example from the tutorial <https://docs.python.org/3.11/extending/newtypes_tutorial.html#subclassing-other-types>`_
extends :external+python:c:type:`PyListObject` (:py:class:`python:list`)
using the following ``struct``:
.. code-block:: c
typedef struct {
PyListObject list;
int state;
} SubListObject;
This won't compile in the Limited API, since ``PyListObject`` is opaque (to
allow changes as features and optimizations are implemented).
Instead, this PEP proposes using a ``struct`` with only the state needed
in the subclass, that is:
.. code-block:: c
typedef struct {
int state;
} SubListState;
// (or just `typedef int SubListState;` in this case)
The subclass can now be completely decoupled from the memory layout (and size)
of the superclass.
This is possible today. To use such a struct:
* when creating the class, use ``PyListObject->tp_basicsize + sizeof(SubListState)``
as ``PyType_Spec.basicsize``;
* when accessing the data, use ``PyListObject->tp_basicsize`` as the offset
into the instance (``PyObject*``).
However, this has disadvantages:
* The base's ``basicsize`` may not be properly aligned, causing issues
on some architectures if not mitigated. (These issues can be particularly
nasty if alignment changes in a new release.)
* ``PyTypeObject.tp_basicsize`` is not exposed in the
Limited API, so extensions that support Limited API need to
use ``PyObject_GetAttrString(obj, "__basicsize__")``.
This is cumbersome, and unsafe in edge cases (the Python attribute can
be overridden).
* Variable-size objects are not handled (see :ref:`697-var-sized` below).
To make this easy (and even *best practice* for projects that choose loose
coupling over maximum performance), this PEP proposes an API to:
1. During class creation, specify that ``SubListState``
should be “appended” to ``PyListObject``, without passing any additional
details about ``list``. (The interpreter itself gets all necessary info,
like ``tp_basicsize``, from the base).
This will be specified by a negative ``PyType_Spec.basicsize``:
``-sizeof(SubListState)``.
2. Given an instance, and the subclass ``PyTypeObject*``,
get a pointer to the ``SubListState``.
A new function, ``PyObject_GetTypeData``, will be added for this.
The base class is not limited to ``PyListObject``, of course: it can be used to
extend any base class whose instance ``struct`` is opaque, unstable across
releases, or not exposed at all -- including :py:class:`python:type`
(``PyHeapTypeObject``) or third-party extensions
(for example, NumPy arrays [#f1]_).
For cases where no additional state is needed, a zero ``basicsize`` will be
allowed: in that case, the base's ``tp_basicsize`` will be inherited.
(This currently works, but lacks explicit documentation and tests.)
The ``tp_basicsize`` of the new class will be set to the computed total size,
so code that inspects classes will continue working as before.
.. _697-var-sized:
Extending variable-size objects
-------------------------------
Additional considerations are needed to subclass
:external+python:c:type:`variable-sized objects <PyVarObject>`
while maintaining loose coupling:
the variable-sized data can collide with subclass data (``SubListState`` in
the example above).
Currently, CPython doesn't provide a way to prevent such collisions.
So, the proposed mechanism of extending opaque classes (negative
``base->tp_itemsize``) will *fail* by default.
We could stop there, but since the motivating type --- ``PyHeapTypeObject`` ---
is variable sized, we need a safe way to allow subclassing it.
A bit of background first:
Variable-size layouts
.....................
There are two main memory layouts for variable-sized objects.
In types such as ``int`` or ``tuple``, the variable data is stored at a fixed
offset.
If subclasses need additional space, it must be added after any variable-sized
data::
PyTupleObject:
┌───────────────────┬───┬───┬╌╌╌╌┐
│ PyObject_VAR_HEAD │var. data │
└───────────────────┴───┴───┴╌╌╌╌┘
tuple subclass:
┌───────────────────┬───┬───┬╌╌╌╌┬─────────────┐
│ PyObject_VAR_HEAD │var. data │subclass data│
└───────────────────┴───┴───┴╌╌╌╌┴─────────────┘
In other types, like ``PyHeapTypeObject``, variable-sized data always lives at
the end of the instance's memory area::
heap type:
┌───────────────────┬──────────────┬───┬───┬╌╌╌╌┐
│ PyObject_VAR_HEAD │Heap type data│var. data │
└───────────────────┴──────────────┴───┴───┴╌╌╌╌┘
type subclass:
┌───────────────────┬──────────────┬─────────────┬───┬───┬╌╌╌╌┐
│ PyObject_VAR_HEAD │Heap type data│subclass data│var. data │
└───────────────────┴──────────────┴─────────────┴───┴───┴╌╌╌╌┘
The first layout enables fast access to the items array.
The second allows subclasses to ignore the variable-sized array (assuming
they use offsets from the start of the object to access their data).
Since this PEP focuses on ``PyHeapTypeObject``, it proposes an API to allow
subclassing for the second variant.
Support for the first can be added later *as an API-compatible change*
(though your PEP author doubts it'd be worth the effort).
Extending classes with the ``PyHeapTypeObject``-like layout
...........................................................
This PEP proposes a type flag, ``Py_TPFLAGS_ITEMS_AT_END``, which will indicate
the ``PyHeapTypeObject``-like layout.
This can be set in two ways:
* the superclass can set the flag, allowing subclass authors to not care about
the fact that ``itemsize`` is involved, or
* the new subclass sets the flag, asserting that the author knows the
superclass is suitable (but perhaps hasn't been updated to use the flag yet).
This flag will be necessary to extend a variable-sized type using negative
``basicsize``.
An alternative to a flag would be to require subclass authors to know that the
base uses a compatible layout (e.g. from documentation).
A past version of this PEP proposed a new
``PyType_Slot`` for it.
This turned out to be hard to explain, and goes against the idea of decoupling
the subclass from the base layout.
The new flag will be used to allow safely extending variable-sized types:
creating a type with ``spec->basicsize < 0`` and ``base->tp_itemsize > 0``
will require the flag.
Additionally, this PEP proposes a helper function to get the variable-sized
data of a given instance, if it uses the new ``Py_TPFLAGS_ITEMS_AT_END`` flag.
This hides the necessary pointer arithmetic behind an API
that can potentially be adapted to other layouts in the future (including,
potentially, a VM-managed layout).
Big picture
...........
To make it easier to verify that all cases are covered, here's a scary-looking
big-picture decision tree.
.. note::
The individual cases are easier to explain in isolation (see the
:ref:`reference implementation <697-ref-impl>` for draft docs).
* ``spec->basicsize > 0``: No change to the status quo. (The base
class layout is known.)
* ``spec->basicsize == 0``: (Inheriting the basicsize)
* ``base->tp_itemsize == 0``: The item size is set to ``spec->tp_itemsize``.
(No change to status quo.)
* ``base->tp_itemsize > 0``: (Extending a variable-size class)
* ``spec->itemsize == 0``: The item size is inherited.
(No change to status quo.)
* ``spec->itemsize > 0``: The item size is set. (This is hard to use safely,
but it's CPython's current behavior.)
* ``spec->basicsize < 0``: (Extending the basicsize)
* ``base->tp_itemsize == 0``: (Extending a fixed-size class)
* ``spec->itemsize == 0``: The item size is set to 0.
* ``spec->itemsize > 0``: Fail. (We'd need to add an ``ob_size``, which is
only possible for trivial types -- and the trivial layout must be known.)
* ``base->tp_itemsize > 0``: (Extending a variable-size class)
* ``spec->itemsize == 0``: (Inheriting the itemsize)
* ``Py_TPFLAGS_ITEMS_AT_END`` used: itemsize is inherited.
* ``Py_TPFLAGS_ITEMS_AT_END`` not used: Fail. (Possible conflict.)
* ``spec->itemsize > 0``: Fail. (Changing/extending the item size can't be
done safely.)
Setting ``spec->itemsize < 0`` is always an error.
This PEP does not propose any mechanism to *extend* ``tp->itemsize``
rather than just inherit it.
Relative member offsets
-----------------------
One more piece of the puzzle is ``PyMemberDef.offset``.
Extensions that use a subclass-specific ``struct`` (``SubListState`` above)
will get a way to specify “relative” offsets (offsets based from this
``struct``) rather than “absolute” ones (based off the ``PyObject`` struct).
One way to do it would be to automatically assume “relative” offsets
when creating a class using the new API.
However, this implicit assumption would be too surprising.
To be more explicit, this PEP proposes a new flag for “relative” offsets.
At least initially, this flag will serve only as a check against misuse
(and a hint for reviewers).
It must be present if used with the new API, and must not be used otherwise.
Specification
=============
In the code blocks below, only function headers are part of the specification.
Other code (the size/offset calculations) are details of the initial CPython
implementation, and subject to change.
Relative ``basicsize``
----------------------
The ``basicsize`` member of ``PyType_Spec`` will be allowed to be zero or
negative.
In that case, its absolute value will specify how much *extra* storage space
instances of the new class require, in addition to the basicsize of the
base class.
That is, the basicsize of the resulting class will be:
.. code-block:: c
type->tp_basicsize = _align(base->tp_basicsize) + _align(-spec->basicsize);
where ``_align`` rounds up to a multiple of ``alignof(max_align_t)``.
When ``spec->basicsize`` is zero, basicsize will be inherited
directly instead, i.e. set to ``base->tp_basicsize`` without aligning.
(This already works; explicit tests and documentation will be added.)
On an instance, the memory area specific to a subclass -- that is, the
“extra space” that subclass reserves in addition its base -- will be available
through a new function, ``PyObject_GetTypeData``.
In CPython, this function will be defined as:
.. code-block:: c
void *
PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls) {
return (char *)obj + _align(cls->tp_base->tp_basicsize);
}
Another function will be added to retrieve the size of this memory area:
.. code-block:: c
Py_ssize_t
PyType_GetTypeDataSize(PyTypeObject *cls) {
return cls->tp_basicsize - _align(cls->tp_base->tp_basicsize);
}
The result may be higher than requested by ``-basicsize``. It is safe to
use all of it (e.g. with ``memset``).
The new ``*Get*`` functions come with an important caveat, which will be
pointed out in documentation: They may only be used for classes created using
negative ``PyType_Spec.basicsize``. For other classes, their behavior is
undefined.
(Note that this allows the above code to assume ``cls->tp_base`` is not
``NULL``.)
Inheriting ``itemsize``
-----------------------
When ``spec->itemsize`` is zero, ``tp_itemsize`` will be inherited
from the base.
(This already works; explicit tests and documentation will be added.)
A new type flag, ``Py_TPFLAGS_ITEMS_AT_END``, will be added.
This flag can only be set on types with non-zero ``tp_itemsize``.
It indicates that the variable-sized portion of an instance
is stored at the end of the instance's memory.
The default metatype (``PyType_Type``) will set this flag.
A new function, ``PyObject_GetItemData``, will be added to access the
memory reserved for variable-sized content of types with the new flag.
In CPython it will be defined as:
.. code-block:: c
void *
PyObject_GetItemData(PyObject *obj) {
if (!PyType_HasFeature(Py_TYPE(obj), Py_TPFLAGS_ITEMS_AT_END) {
<fail with TypeError>
}
return (char *)obj + Py_TYPE(obj)->tp_basicsize;
}
This function will initially *not* be added to the Limited API.
Extending a class with positive ``base->itemsize`` using
negative ``spec->basicsize`` will fail unless ``Py_TPFLAGS_ITEMS_AT_END``
is set, either on the base or in ``spec->flags``.
(See :ref:`697-var-sized` for a full explanation.)
Extending a class with positive ``spec->itemsize`` using negative
``spec->basicsize`` will fail.
Relative member offsets
-----------------------
In types defined using negative ``PyType_Spec.basicsize``, the offsets of
members defined via ``Py_tp_members`` must be relative to the
extra subclass data, rather than the full ``PyObject`` struct.
This will be indicated by a new flag in ``PyMemberDef.flags``:
``Py_RELATIVE_OFFSET``.
In the initial implementation, the new flag will be redundant. It only serves
to make the offset's changed meaning clear, and to help avoid mistakes.
It will be an error to *not* use ``Py_RELATIVE_OFFSET`` with negative
``basicsize``, and it will be an error to use it in any other context
(i.e. direct or indirect calls to ``PyDescr_NewMember``, ``PyMember_GetOne``,
``PyMember_SetOne``).
CPython will adjust the offset and clear the ``Py_RELATIVE_OFFSET`` flag when
intitializing a type.
This means that:
* the created type's ``tp_members`` will not match the input
definition's ``Py_tp_members`` slot, and
* any code that reads ``tp_members`` will not need to handle the flag.
List of new API
===============
The following new functions/values are proposed.
These will be added to the Limited API/Stable ABI:
* ``void * PyObject_GetTypeData(PyObject *obj, PyTypeObject *cls)``
* ``Py_ssize_t PyType_GetTypeDataSize(PyTypeObject *cls)``
* ``Py_TPFLAGS_ITEMS_AT_END`` flag for ``PyTypeObject.tp_flags``
* ``Py_RELATIVE_OFFSET`` flag for ``PyMemberDef.flags``
These will be added to the public C API only:
* ``void *PyObject_GetItemData(PyObject *obj)``
Backwards Compatibility
=======================
No backwards compatibility concerns are known.
Assumptions
===========
The implementation assumes that an instance's memory
between ``type->tp_base->tp_basicsize`` and ``type->tp_basicsize`` offsets
“belongs” to ``type`` (except variable-length types).
This is not documented explicitly, but CPython up to version 3.11 relied on it
when adding ``__dict__`` to subclasses, so it should be safe.
Security Implications
=====================
None known.
Endorsements
============
The author of ``pybind11`` originally requested solving the issue
(see point 2 in `this list <https://discuss.python.org/t/15993>`__),
and `has been verifying the implementation <https://discuss.python.org/t/19743/14>`__.
Florian from the HPy project `said <https://discuss.python.org/t/19743/3>`__
that the API looks good in general.
(See :ref:`below <697-alignment-performance>` for a possible solution to
performance concerns.)
How to Teach This
=================
The initial implementation will include reference documentation
and a What's New entry, which should be enough for the target audience
-- authors of C extension libraries.
.. _697-ref-impl:
Reference Implementation
========================
A reference implementation is in the `extend-opaque branch <https://github.com/python/cpython/compare/main...encukou:cpython:extend-opaque>`__
in the ``encukou/cpython`` GitHub repo.
Possible Future Enhancements
============================
.. _697-alignment-performance:
Alignment & Performance
-----------------------
The proposed implementation may waste some space if instance structs
need smaller alignment than ``alignof(max_align_t)``.
Also, dealing with alignment makes the calculation slower than it could be
if we could rely on ``base->tp_basicsize`` being properly aligned for the
subtype.
In other words, the proposed implementation focuses on safety and ease of use,
and trades space and time for it.
If it turns out that this is a problem, the implementation can be adjusted
without breaking the API:
- The offset to the type-specific buffer can be stored, so
``PyObject_GetTypeData`` effectively becomes
``(char *)obj + cls->ht_typedataoffset``, possibly speeding things up at
the cost of an extra pointer in the class.
- Then, a new ``PyType_Slot`` can specify the desired alignment, to
reduce space requirements for instances.
Other layouts for variable-size types
-------------------------------------
A flag like ``Py_TPFLAGS_ITEMS_AT_END`` could be added to signal the
“tuple-like” layout described in :ref:`697-var-sized`, and all mechanisms
this PEP proposes could be adapted to support it.
Other layouts could be added as well.
However, it seems there'd be very little practical benefit,
so it's just a theoretical possibility.
Rejected Ideas
==============
Instead of a negative ``spec->basicsize``, a new ``PyType_Spec`` flag could've
been added. The effect would be the same to any existing code accessing these
internals without up to date knowledge of the change as the meaning of the
field value is changing in this situation.
Footnotes
=========
.. [#f1] This PEP does not make it “safe” to subclass NumPy arrays specifically.
NumPy publishes `an extensive list of caveats <https://numpy.org/doc/1.23/user/basics.subclassing.html>`__
for subclassing its arrays from Python, and extending in C might need
a similar list.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.