PEP 630: add disclaimers re. heap types & conversion, add PyType_GetModuleByDef (GH-2319)

* PEP 630: add disclaimers re. heap types & conversion, and PyType_GetModuleByDef

Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
This commit is contained in:
Petr Viktorin 2022-02-14 16:30:46 +01:00 committed by GitHub
parent 88c65146e7
commit bdc0e44c67
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 93 additions and 24 deletions

View File

@ -21,9 +21,9 @@ describes problems of such per-process state and efforts to make
per-module state, a better default, possible and easy to use.
The document also describes how to switch to per-module state where
possible. The switch involves allocating space for that state, switching
from static types to heap types, and—perhaps most importantly—accessing
per-module state from code.
possible. The switch involves allocating space for that state, potentially
switching from static types to heap types, and—perhaps most
importantly—accessing per-module state from code.
About this document
-------------------
@ -40,9 +40,15 @@ development, gaps identified in this text can show where to focus
the effort, and the text can be updated as new features are implemented
Whenever this PEP mentions *extension modules*, the advice also
applies to *built-in* modules, such as the C parts of the standard
library. The standard library is expected to switch to per-module state
early.
applies to *built-in* modules.
.. note::
This PEP contains generic advice. When following it, always take into
account the specifics of your project.
For example, while much of this advice applies to the C parts of
Python's standard library, the PEP does not factor in stdlib specifics
(unusual backward compatibility issues, access to private API, etc.).
PEPs related to this effort are:
@ -101,7 +107,7 @@ testing the isolation, multiple module objects corresponding to a single
extension can even be loaded in a single interpreter.
Per-module state provides an easy way to think about lifetime and
resource ownership: the extension module author will set up when a
resource ownership: the extension module will initialize when a
module object is created, and clean up when it's freed. In this regard,
a module is just like any other ``PyObject *``; there are no “on
interpreter shutdown” hooks to think about—or forget about.
@ -235,7 +241,8 @@ Set ``PyModuleDef.m_size`` to a positive number to request that many
bytes of storage local to the module. Usually, this will be set to the
size of some module-specific ``struct``, which can store all of the
module's C-level state. In particular, it is where you should put
pointers to classes (including exceptions) and settings (e.g. ``csv``'s
pointers to classes (including exceptions, but excluding static types)
and settings (e.g. ``csv``'s
`field_size_limit <https://docs.python.org/3.8/library/csv.html#csv.field_size_limit>`__)
which the C code needs to function.
@ -302,9 +309,9 @@ zero. In your own module, you're in control of ``m_size``, so this is
easy to prevent.)
Heap types
~~~~~~~~~~
----------
Traditionally, types defined in C code were *static*, that is,
Traditionally, types defined in C code are *static*, that is,
``static PyTypeObject`` structures defined directly in code and
initialized using ``PyType_Ready()``.
@ -321,10 +328,27 @@ the Python level: for example, you can't set ``str.myattribute = 123``.
that shares any Python objects across interpreters implicitly depends
on CPython's current, process-wide GIL.
An alternative to static types is *heap-allocated types*, or heap types
Because they are immutable and process-global, static types cannot access
“their” module state.
If any method of such a type requires access to module state,
the type must be converted to a *heap-allocated type*, or *heap type*
for short. These correspond more closely to classes created by Pythons
``class`` statement.
For new modules, using heap types by default is a good rule of thumb.
Static types can be converted to heap types, but note that
the heap type API was not designed for “lossless” conversion
from static types -- that is, creating a type that works exactly like a given
static type. Unlike static types, heap type objects are mutable by default.
Also, when rewriting the class definition in a new API,
you are likely to unintentionally change a few details (e.g. pickle-ability
or inherited slots). Always test the details that are important to you.
Defining Heap Types
~~~~~~~~~~~~~~~~~~~
Heap types can be created by filling a ``PyType_Spec`` structure, a
description or “blueprint” of a class, and calling
``PyType_FromModuleAndSpec()`` to construct a new class object.
@ -370,7 +394,7 @@ is called on a *subclass* of your type, ``Py_TYPE(self)`` will refer to
that subclass, which may be defined in different module than yours.
.. note::
The following Python code. can illustrate the concept.
The following Python code can illustrate the concept.
``Base.get_defining_class`` returns ``Base`` even
if ``type(self) == Sub``::
@ -424,6 +448,52 @@ For example::
{NULL},
}
Module State Access from Slot Methods, Getters and Setters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. note::
This is new in Python 3.11.
.. After adding to limited API:
If you use the `limited API <https://docs.python.org/3/c-api/stable.html>__,
you must update ``Py_LIMITED_API`` to ``0x030b0000``, losing ABI
compatibility with earlier versions.
Slot methods -- the fast C equivalents for special methods, such as
`nb_add <https://docs.python.org/3/c-api/typeobj.html#c.PyNumberMethods.nb_add>`__
for ``__add__`` or `tp_new <https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_new>`__
for initialization -- have a very simple API that doesn't allow
passing in the defining class as in ``PyCMethod``.
The same goes for getters and setters defined with
`PyGetSetDef <https://docs.python.org/3/c-api/structures.html#c.PyGetSetDef>`__.
To access the module state in these cases, use the
`PyType_GetModuleByDef <https://docs.python.org/typeobj.html#c.PyType_GetModuleByDef>`__
function, and pass in the module definition.
Once you have the module, call `PyModule_GetState <https://docs.python.org/3/c-api/module.html?highlight=pymodule_getstate#c.PyModule_GetState>`__
to get the state::
PyObject *module = PyType_GetModuleByDef(Py_TYPE(self), &module_def);
my_struct *state = (my_struct*)PyModule_GetState(module);
if (state === NULL) {
return NULL;
}
``PyType_GetModuleByDef`` works by searching the `MRO <https://docs.python.org/3/glossary.html#term-method-resolution-order>`__
(i.e. all superclasses) for the first superclass that has a corresponding
module.
.. note::
In very exotic cases (inheritance chains spanning multiple modules
created from the same definition), ``PyType_GetModuleByDef`` might not
return the module of the true defining class. However, it will always
return a module with the same definition, ensuring a compatible
C memory layout.
Open Issues
-----------
@ -432,20 +502,10 @@ Several issues around per-module state and heap types are still open.
Discussions about improving the situation are best held on the `capi-sig
mailing list <https://mail.python.org/mailman3/lists/capi-sig.python.org/>`__.
Module State Access from Slot Methods, Getters and Setters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Currently (as of Python 3.9), there is no API to access the module state
from:
- slot methods (meaning type slots, such as ``tp_new``, ``nb_add`` or
``tp_iternext``)
- getters and setters defined with ``tp_getset``
Type Checking
~~~~~~~~~~~~~
Currently (as of Python 3.9), heap types have no good API to write
Currently (as of Python 3.10), heap types have no good API to write
``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a
static type), and so it is not easy to ensure whether instances have a
particular C layout.
@ -453,7 +513,7 @@ particular C layout.
Metaclasses
~~~~~~~~~~~
Currently (as of Python 3.9), there is no good API to specify the
Currently (as of Python 3.10), there is no good API to specify the
*metaclass* of a heap type, that is, the ``ob_type`` field of the type
object.
@ -466,6 +526,15 @@ its variable-size storage is currently consumed by slots. Fixing this
is complicated by the fact that several classes in an inheritance
hierarchy may need to reserve some state.
Lossless conversion to heap types
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The heap type API was not designed for “lossless” conversion from static types,
that is, creating a type that works exactly like a given static type.
The best way to address it would probably be to write a guide that covers
known “gotchas”.
Copyright
---------