This PEP proposes to add a way for CPython extension methods to access context such as
the state of the modules they are defined in.
This will allow extension methods to use direct pointer dereferences
rather than PyState_FindModule for looking up module state, reducing or eliminating the
performance cost of using module-scoped state over process global state.
This fixes one of the remaining roadblocks for adoption of PEP 3121 (Extension
module initialization and finalization) and PEP 489
(Multi-phase extension module initialization).
Additionaly, support for easier creation of immutable exception classes is added.
This removes the need for keeping per-module state if it would only be used
for exception classes.
While this PEP takes an additional step towards fully solving the problems that PEP 3121 and PEP 489 started
tackling, it does not attempt to resolve *all* remaining concerns. In particular, accessing the module state from slot methods (``nb_add``, etc) remains slower than accessing that state from other extension methods.
Terminology
===========
Process-Global State
--------------------
C-level static variables. Since this is very low-level
memory storage, it must be managed carefully.
Per-module State
----------------
State local to a module object, allocated dynamically as part of a
module object's initialization. This isolates the state from other
instances of the module (including those in other subinterpreters).
Accessed by ``PyModule_GetState()``.
Static Type
-----------
A type object defined as a C-level static variable, i.e. a compiled-in type object.
A static type needs to be shared between module instances and has no
information of what module it belongs to.
Static types do not have ``__dict__`` (although their instances might).
Heap Type
---------
A type object created at run time.
Rationale
=========
PEP 489 introduced a new way to initialize extension modules, which brings
several advantages to extensions that implement it:
* The extension modules behave more like their Python counterparts.
* The extension modules can easily support loading into pre-existing
module objects, which paves the way for extension module support for
``runpy`` or for systems that enable extension module reloading.
* Loading multiple modules from the same extension is possible, which
makes testing module isolation (a key feature for proper sub-interpreter
support) possible from a single interpreter.
The biggest hurdle for adoption of PEP 489 is allowing access to module state
from methods of extension types.
Currently, the way to access this state from extension methods is by looking up the module via
``PyState_FindModule`` (in contrast to module level functions in extension modules, which
receive a module reference as an argument).
However, ``PyState_FindModule`` queries the thread-local state, making it relatively
costly compared to C level process global access and consequently deterring module authors from using it.
Also, ``PyState_FindModule`` relies on the assumption that in each
subinterpreter, there is at most one module corresponding to
a given ``PyModuleDef``. This does not align well with Python's import
machinery. Since PEP 489 aimed to fix that, the assumption does
not hold for modules that use multi-phase initialization, so
``PyState_FindModule`` is unavailable for these modules.
A faster, safer way of accessing module-level state from extension methods
is needed.
Immutable Exception Types
-------------------------
For isolated modules to work, any class whose methods touch module state
must be a heap type, so that each instance of a module can have its own
type object. With the changes proposed in this PEP, heap type instances will
have access to module state without global registration. But, to create
instances of heap types, one will need the module state in order to
get the type object corresponding to the appropriate module.
In short, heap types are "viral" – anything that “touches” them must itself be
a heap type.
Curently, most exception types, apart from the ones in ``builtins``, are
heap types. This is likely simply because there is a convenient way
to create them: ``PyErr_NewException``.
Heap types generally have a mutable ``__dict__``.
In most cases, this mutability is harmful. For example, exception types
from the ``sqlite`` module are mutable and shared across subinterpreters.
This allows "smuggling" values to other subinterpreters via attributes of
``sqlite3.Error``.
Moreover, since raising exceptions is a common operation, and heap types
will be "viral", ``PyErr_NewException`` will tend to "infect" the module
By contrast, extension methods are typically implemented as normal C functions.
This means that they only have access to their arguments and C level thread-local
and process-global states. Traditionally, many extension modules have stored
their shared state in C-level process globals, causing problems when:
* running multiple initialize/finalize cycles in the same process
* reloading modules (e.g. to test conditional imports)
* loading extension modules in subinterpreters
PEP 3121 attempted to resolve this by offering the ``PyState_FindModule`` API, but this still has significant problems when it comes to extension methods (rather than module level functions):
* it is markedly slower than directly accessing C-level process-global state
* there is still some inherent reliance on process global state that means it still doesn't reliably handle module reloading
It's also the case that when looking up a C-level struct such as module state, supplying
an unexpected object layout can crash the interpreter, so it's significantly more important to ensure that extension
methods receive the kind of object they expect.
Proposal
========
Currently, a bound extension method (``PyCFunction`` or ``PyCFunctionWithKeywords``) receives only
``self``, and (if applicable) the supplied positional and keyword arguments.
While module-level extension functions already receive access to the defining module object via their
``self`` argument, methods of extension types don't have that luxury: they receive the bound instance
via ``self``, and hence have no direct access to the defining class or the module level state.
The additional module level context described above can be made available with two changes.
Both additions are optional; extension authors need to opt in to start
using them:
* Add a pointer to the module to heap type objects.
* Pass the defining class to the underlying C function.
The defining class is readily available at the time built-in
method object (``PyCFunctionObject``) is created, so it can be stored
in a new struct that extends ``PyCFunctionObject``.
The module state can then be retrieved from the module object via
``PyModule_GetState``.
Note that this proposal implies that any type whose method needs to access
per-module state must be a heap type, rather than a static type.
This is necessary to support loading multiple module objects from a single
extension: a static type, as a C-level global, has no information about
which module it belongs to.
Slot methods
------------
The above changes don't cover slot methods, such as ``tp_iter`` or ``nb_add``.
The problem with slot methods is that their C API is fixed, so we can't
simply add a new argument to pass in the defining class.
Two possible solutions have been proposed to this problem:
* Look up the class through walking the MRO.
This is potentially expensive, but will be useful if performance is not
a problem (such as when raising a module-level exception).
* Storing a pointer to the defining class of each slot in a separate table,
``__typeslots__``[#typeslots-mail]_. This is technically feasible and fast,