PEP 630: Format and copyedit prior to conversion into a docs HOWTO guide (GH-2459)
parent
53d6d5987b
commit
dfdd865c05
274
pep-0630.rst
|
@ -9,35 +9,36 @@ Created: 25-Aug-2020
|
||||||
Post-History: 16-Jul-2020
|
Post-History: 16-Jul-2020
|
||||||
|
|
||||||
|
|
||||||
Isolating Extension Modules
|
.. highlight:: c
|
||||||
===========================
|
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
--------
|
========
|
||||||
|
|
||||||
Traditionally, state of Python extension modules was kept in C
|
Traditionally, state belonging to Python extension modules was kept in C
|
||||||
``static`` variables, which have process-wide scope. This document
|
``static`` variables, which have process-wide scope. This document
|
||||||
describes problems of such per-process state and efforts to make
|
describes problems of such per-process state and efforts to make
|
||||||
per-module state, a better default, possible and easy to use.
|
per-module state—a better default—possible and easy to use.
|
||||||
|
|
||||||
The document also describes how to switch to per-module state where
|
The document also describes how to switch to per-module state where
|
||||||
possible. The switch involves allocating space for that state, potentially
|
possible. This transition involves allocating space for that state, potentially
|
||||||
switching from static types to heap types, and—perhaps most
|
switching from static types to heap types, and—perhaps most
|
||||||
importantly—accessing per-module state from code.
|
importantly—accessing per-module state from code.
|
||||||
|
|
||||||
About this document
|
|
||||||
-------------------
|
About This Document
|
||||||
|
===================
|
||||||
|
|
||||||
As an :pep:`informational PEP <1#pep-types>`,
|
As an :pep:`informational PEP <1#pep-types>`,
|
||||||
this document does not introduce any changes: those should be done in
|
this document does not introduce any changes; those should be done in
|
||||||
their own PEPs (or issues, if small enough). Rather, it covers the
|
their own PEPs (or issues, if small enough). Rather, it covers the
|
||||||
motivation behind an effort that spans multiple releases, and instructs
|
motivation behind an effort that spans multiple releases, and instructs
|
||||||
early adopters on how to use the finished features.
|
early adopters on how to use the finished features.
|
||||||
|
|
||||||
Once support is reasonably complete, the text can be moved to Python's
|
Once support is reasonably complete, this content can be moved to Python's
|
||||||
documentation as a HOWTO. Meanwhile, in the spirit of documentation-driven
|
documentation as a `HOWTO <https://docs.python.org/3/howto/index.html>`__.
|
||||||
development, gaps identified in this text can show where to focus
|
Meanwhile, in the spirit of documentation-driven development,
|
||||||
the effort, and the text can be updated as new features are implemented
|
gaps identified in this PEP can show where to focus the effort,
|
||||||
|
and it can be updated as new features are implemented.
|
||||||
|
|
||||||
Whenever this PEP mentions *extension modules*, the advice also
|
Whenever this PEP mentions *extension modules*, the advice also
|
||||||
applies to *built-in* modules.
|
applies to *built-in* modules.
|
||||||
|
@ -52,7 +53,7 @@ applies to *built-in* modules.
|
||||||
|
|
||||||
PEPs related to this effort are:
|
PEPs related to this effort are:
|
||||||
|
|
||||||
- :pep:`384` -- *Defining a Stable ABI*, which added C API for creating
|
- :pep:`384` -- *Defining a Stable ABI*, which added a C API for creating
|
||||||
heap types
|
heap types
|
||||||
- :pep:`489` -- *Multi-phase extension module initialization*
|
- :pep:`489` -- *Multi-phase extension module initialization*
|
||||||
- :pep:`573` -- *Module State Access from C Extension Methods*
|
- :pep:`573` -- *Module State Access from C Extension Methods*
|
||||||
|
@ -64,8 +65,9 @@ specific to CPython.
|
||||||
As with any Informational PEP, this text does not necessarily represent
|
As with any Informational PEP, this text does not necessarily represent
|
||||||
a Python community consensus or recommendation.
|
a Python community consensus or recommendation.
|
||||||
|
|
||||||
|
|
||||||
Motivation
|
Motivation
|
||||||
----------
|
==========
|
||||||
|
|
||||||
An *interpreter* is the context in which Python code runs. It contains
|
An *interpreter* is the context in which Python code runs. It contains
|
||||||
configuration (e.g. the import path) and runtime state (e.g. the set of
|
configuration (e.g. the import path) and runtime state (e.g. the set of
|
||||||
|
@ -76,13 +78,13 @@ two cases to think about—users may run interpreters:
|
||||||
|
|
||||||
- in sequence, with several ``Py_InitializeEx``/``Py_FinalizeEx``
|
- in sequence, with several ``Py_InitializeEx``/``Py_FinalizeEx``
|
||||||
cycles, and
|
cycles, and
|
||||||
- in parallel, managing “sub-interpreters” using
|
- in parallel, managing "sub-interpreters" using
|
||||||
``Py_NewInterpreter``/``Py_EndInterpreter``.
|
``Py_NewInterpreter``/``Py_EndInterpreter``.
|
||||||
|
|
||||||
Both cases (and combinations of them) would be most useful when
|
Both cases (and combinations of them) would be most useful when
|
||||||
embedding Python within a library. Libraries generally shouldn't make
|
embedding Python within a library. Libraries generally shouldn't make
|
||||||
assumptions about the application that uses them, which includes
|
assumptions about the application that uses them, which includes
|
||||||
assumptions about a process-wide “main Python interpreter”.
|
assuming a process-wide "main Python interpreter".
|
||||||
|
|
||||||
Currently, CPython doesn't handle this use case well. Many extension
|
Currently, CPython doesn't handle this use case well. Many extension
|
||||||
modules (and even some stdlib modules) use *per-process* global state,
|
modules (and even some stdlib modules) use *per-process* global state,
|
||||||
|
@ -90,34 +92,36 @@ because C ``static`` variables are extremely easy to use. Thus, data
|
||||||
that should be specific to an interpreter ends up being shared between
|
that should be specific to an interpreter ends up being shared between
|
||||||
interpreters. Unless the extension developer is careful, it is very easy
|
interpreters. Unless the extension developer is careful, it is very easy
|
||||||
to introduce edge cases that lead to crashes when a module is loaded in
|
to introduce edge cases that lead to crashes when a module is loaded in
|
||||||
more than one interpreter.
|
more than one interpreter in the same process.
|
||||||
|
|
||||||
Unfortunately, *per-interpreter* state is not easy to achieve: extension
|
Unfortunately, *per-interpreter* state is not easy to achieve—extension
|
||||||
authors tend to not keep multiple interpreters in mind when developing,
|
authors tend to not keep multiple interpreters in mind when developing,
|
||||||
and it is currently cumbersome to test the behavior.
|
and it is currently cumbersome to test the behavior.
|
||||||
|
|
||||||
|
|
||||||
Rationale for Per-module State
|
Rationale for Per-module State
|
||||||
------------------------------
|
==============================
|
||||||
|
|
||||||
Instead of focusing on per-interpreter state, Python's C API is evolving
|
Instead of focusing on per-interpreter state, Python's C API is evolving
|
||||||
to better support the more granular *per-module* state. By default,
|
to better support the more granular *per-module* state. By default,
|
||||||
C-level data will be attached to a *module object*. Each interpreter
|
C-level data will be attached to a *module object*. Each interpreter
|
||||||
will then create its own module object, keeping data separate. For
|
will then create its own module object, keeping the data separate. For
|
||||||
testing the isolation, multiple module objects corresponding to a single
|
testing the isolation, multiple module objects corresponding to a single
|
||||||
extension can even be loaded in a single interpreter.
|
extension can even be loaded in a single interpreter.
|
||||||
|
|
||||||
Per-module state provides an easy way to think about lifetime and
|
Per-module state provides an easy way to think about lifetime and
|
||||||
resource ownership: the extension module will initialize when a
|
resource ownership: the extension module will initialize when a
|
||||||
module object is created, and clean up when it's freed. In this regard,
|
module object is created, and clean up when it's freed. In this regard,
|
||||||
a module is just like any other ``PyObject *``; there are no “on
|
a module is just like any other ``PyObject *``; there are no "on
|
||||||
interpreter shutdown” hooks to think about—or forget about.
|
interpreter shutdown" hooks to think—or forget—about.
|
||||||
|
|
||||||
Goal: Easy-to-use Module State
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
Goal: Easy-to-Use Module State
|
||||||
|
------------------------------
|
||||||
|
|
||||||
It is currently cumbersome or impossible to do everything the C API
|
It is currently cumbersome or impossible to do everything the C API
|
||||||
offers while keeping modules isolated. Enabled by :pep:`384`, changes in
|
offers while keeping modules isolated. Enabled by :pep:`384`, changes in
|
||||||
PEPs 489 and 573 (and future planned ones) aim to first make it
|
:pep:`489` and :pep:`573` (and future planned ones) aim to first make it
|
||||||
*possible* to build modules this way, and then to make it *easy* to
|
*possible* to build modules this way, and then to make it *easy* to
|
||||||
write new modules this way and to convert old ones, so that it can
|
write new modules this way and to convert old ones, so that it can
|
||||||
become a natural default.
|
become a natural default.
|
||||||
|
@ -128,20 +132,22 @@ per-thread or per-task state. The goal is to treat these as exceptional
|
||||||
cases: they should be possible, but extension authors will need to
|
cases: they should be possible, but extension authors will need to
|
||||||
think more carefully about them.
|
think more carefully about them.
|
||||||
|
|
||||||
|
|
||||||
Non-goals: Speedups and the GIL
|
Non-goals: Speedups and the GIL
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
-------------------------------
|
||||||
|
|
||||||
There is some effort to speed up CPython on multi-core CPUs by making the GIL
|
There is some effort to speed up CPython on multi-core CPUs by making the GIL
|
||||||
per-interpreter. While isolating interpreters helps that effort,
|
per-interpreter. While isolating interpreters helps that effort,
|
||||||
defaulting to per-module state will be beneficial even if no speedup is
|
defaulting to per-module state will be beneficial even if no speedup is
|
||||||
achieved, as it makes supporting multiple interpreters safer by default.
|
achieved, as it makes supporting multiple interpreters safer by default.
|
||||||
|
|
||||||
How to make modules safe with multiple interpreters
|
|
||||||
---------------------------------------------------
|
Making Modules Safe with Multiple Interpreters
|
||||||
|
==============================================
|
||||||
|
|
||||||
There are many ways to correctly support multiple interpreters in
|
There are many ways to correctly support multiple interpreters in
|
||||||
extension modules. The rest of this text describes the preferred way to
|
extension modules. The rest of this text describes the preferred way to
|
||||||
write such a module, or to convert an existing module.
|
write such a module, or to convert an existing one.
|
||||||
|
|
||||||
Note that support is a work in progress; the API for some features your
|
Note that support is a work in progress; the API for some features your
|
||||||
module needs might not yet be ready.
|
module needs might not yet be ready.
|
||||||
|
@ -149,15 +155,17 @@ module needs might not yet be ready.
|
||||||
A full example module is available as
|
A full example module is available as
|
||||||
`xxlimited <https://github.com/python/cpython/blob/master/Modules/xxlimited.c>`__.
|
`xxlimited <https://github.com/python/cpython/blob/master/Modules/xxlimited.c>`__.
|
||||||
|
|
||||||
This section assumes that “*you*” are an extension module author.
|
This section assumes that "*you*" are an extension module author.
|
||||||
|
|
||||||
|
|
||||||
Isolated Module Objects
|
Isolated Module Objects
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~
|
-----------------------
|
||||||
|
|
||||||
The key point to keep in mind when developing an extension module is
|
The key point to keep in mind when developing an extension module is
|
||||||
that several module objects can be created from a single shared library.
|
that several module objects can be created from a single shared library.
|
||||||
For example::
|
For example:
|
||||||
|
|
||||||
|
.. code-block:: pycon
|
||||||
|
|
||||||
>>> import sys
|
>>> import sys
|
||||||
>>> import binascii
|
>>> import binascii
|
||||||
|
@ -171,7 +179,7 @@ As a rule of thumb, the two modules should be completely independent.
|
||||||
All objects and state specific to the module should be encapsulated
|
All objects and state specific to the module should be encapsulated
|
||||||
within the module object, not shared with other module objects, and
|
within the module object, not shared with other module objects, and
|
||||||
cleaned up when the module object is deallocated. Exceptions are
|
cleaned up when the module object is deallocated. Exceptions are
|
||||||
possible (see “Managing global state” below), but they will need more
|
possible (see `Managing Global State`_), but they will need more
|
||||||
thought and attention to edge cases than code that follows this rule of
|
thought and attention to edge cases than code that follows this rule of
|
||||||
thumb.
|
thumb.
|
||||||
|
|
||||||
|
@ -179,14 +187,18 @@ While some modules could do with less stringent restrictions, isolated
|
||||||
modules make it easier to set clear expectations (and guidelines) that
|
modules make it easier to set clear expectations (and guidelines) that
|
||||||
work across a variety of use cases.
|
work across a variety of use cases.
|
||||||
|
|
||||||
|
|
||||||
Surprising Edge Cases
|
Surprising Edge Cases
|
||||||
~~~~~~~~~~~~~~~~~~~~~
|
---------------------
|
||||||
|
|
||||||
Note that isolated modules do create some surprising edge cases. Most
|
Note that isolated modules do create some surprising edge cases. Most
|
||||||
notably, each module object will typically not share its classes and
|
notably, each module object will typically not share its classes and
|
||||||
exceptions with other similar modules. Continuing from the example
|
exceptions with other similar modules. Continuing from the
|
||||||
above, note that ``old_binascii.Error`` and ``binascii.Error`` are
|
`example above <Isolated Module Objects_>`__,
|
||||||
separate objects. In the following code, the exception is *not* caught::
|
note that ``old_binascii.Error`` and ``binascii.Error`` are
|
||||||
|
separate objects. In the following code, the exception is *not* caught:
|
||||||
|
|
||||||
|
.. code-block:: pycon
|
||||||
|
|
||||||
>>> old_binascii.Error == binascii.Error
|
>>> old_binascii.Error == binascii.Error
|
||||||
False
|
False
|
||||||
|
@ -194,7 +206,7 @@ separate objects. In the following code, the exception is *not* caught::
|
||||||
... old_binascii.unhexlify(b'qwertyuiop')
|
... old_binascii.unhexlify(b'qwertyuiop')
|
||||||
... except binascii.Error:
|
... except binascii.Error:
|
||||||
... print('boo')
|
... print('boo')
|
||||||
...
|
...
|
||||||
Traceback (most recent call last):
|
Traceback (most recent call last):
|
||||||
File "<stdin>", line 2, in <module>
|
File "<stdin>", line 2, in <module>
|
||||||
binascii.Error: Non-hexadecimal digit found
|
binascii.Error: Non-hexadecimal digit found
|
||||||
|
@ -203,14 +215,15 @@ This is expected. Notice that pure-Python modules behave the same way:
|
||||||
it is a part of how Python works.
|
it is a part of how Python works.
|
||||||
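The pure-Python case can be reproduced with a short sketch. It is illustrative only: ``configparser`` is chosen here as an assumption, simply because it defines its exception classes in Python, and any pure-Python module with its own classes should behave similarly.

.. code-block:: python

    import sys

    import configparser as old_configparser

    # Drop the cached module so that the next import re-executes the module
    # body and creates a second, independent module object.
    del sys.modules["configparser"]
    import configparser as new_configparser

    # The two module objects do not share their classes.
    same_class = old_configparser.Error is new_configparser.Error

    caught_by_new = False
    try:
        raise old_configparser.Error("demo")
    except new_configparser.Error:
        caught_by_new = True
    except old_configparser.Error:
        # The exception belongs to the old module's class hierarchy,
        # so only this handler matches.
        pass

Here ``same_class`` and ``caught_by_new`` both end up ``False``, mirroring the ``binascii`` session.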
|
|
||||||
The goal is to make extension modules safe at the C level, not to make
|
The goal is to make extension modules safe at the C level, not to make
|
||||||
hacks behave intuitively. Mutating ``sys.modules`` “manually” counts
|
hacks behave intuitively. Mutating ``sys.modules`` "manually" counts
|
||||||
as a hack.
|
as a hack.
|
||||||
|
|
||||||
|
|
||||||
Managing Global State
|
Managing Global State
|
||||||
~~~~~~~~~~~~~~~~~~~~~
|
---------------------
|
||||||
|
|
||||||
Sometimes, state of a Python module is not specific to that module, but
|
Sometimes, state of a Python module is not specific to that module, but
|
||||||
to the entire process (or something else “more global” than a module).
|
to the entire process (or something else "more global" than a module).
|
||||||
For example:
|
For example:
|
||||||
|
|
||||||
- The ``readline`` module manages *the* terminal.
|
- The ``readline`` module manages *the* terminal.
|
||||||
|
@ -226,14 +239,15 @@ If that is not possible, consider explicit locking.
|
||||||
|
|
||||||
If it is necessary to use process-global state, the simplest way to
|
If it is necessary to use process-global state, the simplest way to
|
||||||
avoid issues with multiple interpreters is to explicitly prevent a
|
avoid issues with multiple interpreters is to explicitly prevent a
|
||||||
module from being loaded more than once per process—see “Opt-Out:
|
module from being loaded more than once per process—see
|
||||||
Limiting to One Module Object per Process” below.
|
`Opt-Out: Limiting to One Module Object per Process`_.
|
||||||
|
|
||||||
|
|
||||||
Managing Per-Module State
|
Managing Per-Module State
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
-------------------------
|
||||||
|
|
||||||
To use per-module state, use `multi-phase extension module
|
To use per-module state, use `multi-phase extension module initialization
|
||||||
initialization <https://docs.python.org/3/c-api/module.html#multi-phase-initialization>`__
|
<https://docs.python.org/3/c-api/module.html#multi-phase-initialization>`__
|
||||||
introduced in :pep:`489`. This signals that your module supports multiple
|
introduced in :pep:`489`. This signals that your module supports multiple
|
||||||
interpreters correctly.
|
interpreters correctly.
|
||||||
|
|
||||||
|
@ -242,8 +256,8 @@ bytes of storage local to the module. Usually, this will be set to the
|
||||||
size of some module-specific ``struct``, which can store all of the
|
size of some module-specific ``struct``, which can store all of the
|
||||||
module's C-level state. In particular, it is where you should put
|
module's C-level state. In particular, it is where you should put
|
||||||
pointers to classes (including exceptions, but excluding static types)
|
pointers to classes (including exceptions, but excluding static types)
|
||||||
and settings (e.g. ``csv``'s
|
and settings (e.g. ``csv``'s `field_size_limit
|
||||||
`field_size_limit <https://docs.python.org/3.8/library/csv.html#csv.field_size_limit>`__)
|
<https://docs.python.org/3/library/csv.html#csv.field_size_limit>`__)
|
||||||
which the C code needs to function.
|
which the C code needs to function.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
@ -253,9 +267,9 @@ which the C code needs to function.
|
||||||
which is easy to get wrong and hard to test sufficiently.
|
which is easy to get wrong and hard to test sufficiently.
|
||||||
|
|
||||||
If the module state includes ``PyObject`` pointers, the module object
|
If the module state includes ``PyObject`` pointers, the module object
|
||||||
must hold references to those objects and implement module-level hooks
|
must hold references to those objects and implement the module-level hooks
|
||||||
``m_traverse``, ``m_clear``, ``m_free``. These work like
|
``m_traverse``, ``m_clear`` and ``m_free``. These work like
|
||||||
``tp_traverse``, ``tp_clear``, ``tp_free`` of a class. Adding them will
|
``tp_traverse``, ``tp_clear`` and ``tp_free`` of a class. Adding them will
|
||||||
require some work and make the code longer; this is the price for
|
require some work and make the code longer; this is the price for
|
||||||
modules which can be unloaded cleanly.
|
modules which can be unloaded cleanly.
|
||||||
|
|
||||||
|
@ -265,7 +279,7 @@ example module initialization shown at the bottom of the file.
|
||||||
|
|
||||||
|
|
||||||
Opt-Out: Limiting to One Module Object per Process
|
Opt-Out: Limiting to One Module Object per Process
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
--------------------------------------------------
|
||||||
|
|
||||||
A non-negative ``PyModuleDef.m_size`` signals that a module supports
|
A non-negative ``PyModuleDef.m_size`` signals that a module supports
|
||||||
multiple interpreters correctly. If this is not yet the case for your
|
multiple interpreters correctly. If this is not yet the case for your
|
||||||
|
@ -286,12 +300,13 @@ process. For example::
|
||||||
// ... rest of initialization
|
// ... rest of initialization
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
Module State Access from Functions
|
Module State Access from Functions
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
----------------------------------
|
||||||
|
|
||||||
Accessing the state from module-level functions is straightforward.
|
Accessing the state from module-level functions is straightforward.
|
||||||
Functions get the module object as their first argument; for extracting
|
Functions get the module object as their first argument; for extracting
|
||||||
the state there is ``PyModule_GetState``::
|
the state, you can use ``PyModule_GetState``::
|
||||||
|
|
||||||
static PyObject *
|
static PyObject *
|
||||||
func(PyObject *module, PyObject *args)
|
func(PyObject *module, PyObject *args)
|
||||||
|
@ -303,15 +318,17 @@ the state there is ``PyModule_GetState``::
|
||||||
// ... rest of logic
|
// ... rest of logic
|
||||||
}
|
}
|
||||||
|
|
||||||
(Note that ``PyModule_GetState`` may return NULL without setting an
|
.. note::
|
||||||
exception if there is no module state, i.e. ``PyModuleDef.m_size`` was
|
``PyModule_GetState`` may return NULL without setting an
|
||||||
zero. In your own module, you're in control of ``m_size``, so this is
|
exception if there is no module state, i.e. ``PyModuleDef.m_size`` was
|
||||||
easy to prevent.)
|
zero. In your own module, you're in control of ``m_size``, so this is
|
||||||
|
easy to prevent.
|
||||||
|
|
||||||
Heap types
|
|
||||||
----------
|
|
||||||
|
|
||||||
Traditionally, types defined in C code are *static*, that is,
|
Heap Types
|
||||||
|
==========
|
||||||
|
|
||||||
|
Traditionally, types defined in C code are *static*; that is,
|
||||||
``static PyTypeObject`` structures defined directly in code and
|
``static PyTypeObject`` structures defined directly in code and
|
||||||
initialized using ``PyType_Ready()``.
|
initialized using ``PyType_Ready()``.
|
||||||
|
|
||||||
|
@ -322,23 +339,23 @@ the Python level: for example, you can't set ``str.myattribute = 123``.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
Sharing truly immutable objects between interpreters is fine,
|
Sharing truly immutable objects between interpreters is fine,
|
||||||
as long as they don't provide access to mutable objects. But, every
|
as long as they don't provide access to mutable objects.
|
||||||
Python object has a mutable implementation detail: the reference
|
However, in CPython, every Python object has a mutable implementation
|
||||||
count. Changes to the refcount are guarded by the GIL. Thus, code
|
detail: the reference count. Changes to the refcount are guarded by the GIL.
|
||||||
that shares any Python objects across interpreters implicitly depends
|
Thus, code that shares any Python objects across interpreters implicitly
|
||||||
on CPython's current, process-wide GIL.
|
depends on CPython's current, process-wide GIL.
|
||||||
|
|
||||||
Because they are immutable and process-global, static types cannot access
|
Because they are immutable and process-global, static types cannot access
|
||||||
“their” module state.
|
"their" module state.
|
||||||
If any method of such a type requires access to module state,
|
If any method of such a type requires access to module state,
|
||||||
the type must be converted to a *heap-allocated type*, or *heap type*
|
the type must be converted to a *heap-allocated type*, or *heap type*
|
||||||
for short. These correspond more closely to classes created by Python’s
|
for short. These correspond more closely to classes created by Python's
|
||||||
``class`` statement.
|
``class`` statement.
|
||||||
|
|
||||||
For new modules, using heap types by default is a good rule of thumb.
|
For new modules, using heap types by default is a good rule of thumb.
|
||||||
|
|
||||||
Static types can be converted to heap types, but note that
|
Static types can be converted to heap types, but note that
|
||||||
the heap type API was not designed for “lossless” conversion
|
the heap type API was not designed for "lossless" conversion
|
||||||
from static types -- that is, creating a type that works exactly like a given
|
from static types -- that is, creating a type that works exactly like a given
|
||||||
static type. Unlike static types, heap type objects are mutable by default.
|
static type. Unlike static types, heap type objects are mutable by default.
|
||||||
Also, when rewriting the class definition in a new API,
|
Also, when rewriting the class definition in a new API,
|
||||||
|
@ -347,10 +364,10 @@ or inherited slots). Always test the details that are important to you.
|
||||||
|
|
||||||
|
|
||||||
Defining Heap Types
|
Defining Heap Types
|
||||||
~~~~~~~~~~~~~~~~~~~
|
-------------------
|
||||||
|
|
||||||
Heap types can be created by filling a ``PyType_Spec`` structure, a
|
Heap types can be created by filling a ``PyType_Spec`` structure, a
|
||||||
description or “blueprint” of a class, and calling
|
description or "blueprint" of a class, and calling
|
||||||
``PyType_FromModuleAndSpec()`` to construct a new class object.
|
``PyType_FromModuleAndSpec()`` to construct a new class object.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
@ -364,10 +381,10 @@ Python code).
|
||||||
|
|
||||||
|
|
||||||
Garbage Collection Protocol
|
Garbage Collection Protocol
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
---------------------------
|
||||||
|
|
||||||
Instances of heap types hold a reference to their type.
|
Instances of heap types hold a reference to their type.
|
||||||
This ensures that the type isn't destroyed before its instance,
|
This ensures that the type isn't destroyed before all its instances are,
|
||||||
but may result in reference cycles that need to be broken by the
|
but may result in reference cycles that need to be broken by the
|
||||||
garbage collector.
|
garbage collector.
|
||||||
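As a rough Python-level analogy (not C API code), classes created by the ``class`` statement show the same kind of cycle. The sketch below assumes CPython's reference-counting behaviour; the names ``make_type_and_instance``, ``Local`` and ``cached`` are hypothetical and exist only for this illustration.

.. code-block:: python

    import gc
    import weakref


    def make_type_and_instance():
        class Local:
            pass

        instance = Local()       # instance -> type: an instance refers to its type
        Local.cached = instance  # type -> instance: this closes the cycle
        return weakref.ref(Local)


    gc.disable()  # keep the automatic collector from running mid-example
    try:
        type_ref = make_type_and_instance()
        # Reference counting alone cannot free objects that are part of a cycle.
        alive_before = type_ref() is not None
        gc.collect()
        # The cyclic garbage collector finds and breaks the cycle.
        alive_after = type_ref() is not None
    finally:
        gc.enable()

A heap type written in C needs to take part in the garbage collection protocol so that the collector can see and break this kind of cycle.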
|
|
||||||
|
@ -375,14 +392,18 @@ To avoid memory leaks, instances of heap types must implement the
|
||||||
garbage collection protocol.
|
garbage collection protocol.
|
||||||
That is, heap types should:
|
That is, heap types should:
|
||||||
|
|
||||||
- Have the ``Py_TPFLAGS_HAVE_GC`` flag,
|
- Have the ``Py_TPFLAGS_HAVE_GC`` flag.
|
||||||
- Define a traverse function using ``Py_tp_traverse``, which
|
- Define a traverse function using ``Py_tp_traverse``, which
|
||||||
visits the type (e.g. using ``Py_VISIT(Py_TYPE(self));``).
|
visits the type (e.g. using ``Py_VISIT(Py_TYPE(self));``).
|
||||||
|
|
||||||
Please refer to the documentation of ``Py_TPFLAGS_HAVE_GC`` and
|
Please refer to the `documentation
|
||||||
``tp_traverse`` for additional considerations.
|
<https://docs.python.org/3/c-api/typeobj.html>`__ of `Py_TPFLAGS_HAVE_GC
|
||||||
|
<https://docs.python.org/3/c-api/typeobj.html#Py_TPFLAGS_HAVE_GC>`__ and
|
||||||
|
`tp_traverse
|
||||||
|
<https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_traverse>`__
|
||||||
|
for additional considerations.
|
||||||
|
|
||||||
If your traverse function delegates to ``tp_traverse`` of its base class
|
If your traverse function delegates to the ``tp_traverse`` of its base class
|
||||||
(or another type), ensure that ``Py_TYPE(self)`` is visited only once.
|
(or another type), ensure that ``Py_TYPE(self)`` is visited only once.
|
||||||
Note that only heap types are expected to visit the type in ``tp_traverse``.
|
Note that only heap types are expected to visit the type in ``tp_traverse``.
|
||||||
|
|
||||||
|
@ -403,30 +424,31 @@ and ``tp_clear``.
|
||||||
|
|
||||||
|
|
||||||
Module State Access from Classes
|
Module State Access from Classes
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
--------------------------------
|
||||||
|
|
||||||
If you have a type object defined with ``PyType_FromModuleAndSpec()``,
|
If you have a type object defined with ``PyType_FromModuleAndSpec()``,
|
||||||
you can call ``PyType_GetModule`` to get the associated module, then
|
you can call ``PyType_GetModule`` to get the associated module, and then
|
||||||
``PyModule_GetState`` to get the module's state.
|
``PyModule_GetState`` to get the module's state.
|
||||||
|
|
||||||
To save some tedious error-handling boilerplate code, you can combine
|
To save some tedious error-handling boilerplate code, you can combine
|
||||||
these two steps with ``PyType_GetModuleState``, resulting in::
|
these two steps with ``PyType_GetModuleState``, resulting in::
|
||||||
|
|
||||||
my_struct *state = (my_struct*)PyType_GetModuleState(type);
|
my_struct *state = (my_struct*)PyType_GetModuleState(type);
|
||||||
if (state == NULL) {
|
if (state == NULL) {
|
||||||
return NULL;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
Module State Access from Regular Methods
|
Module State Access from Regular Methods
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
----------------------------------------
|
||||||
|
|
||||||
Accessing the module-level state from methods of a class is somewhat
|
Accessing the module-level state from methods of a class is somewhat more
|
||||||
more complicated, but possible thanks to changes introduced in :pep:`573`.
|
complicated, but is possible thanks to the changes introduced in :pep:`573`.
|
||||||
To get the state, you need to first get the *defining class*, and then
|
To get the state, you need to first get the *defining class*, and then
|
||||||
get the module state from it.
|
get the module state from it.
|
||||||
|
|
||||||
The largest roadblock is getting *the class a method was defined in*, or
|
The largest roadblock is getting *the class a method was defined in*, or
|
||||||
that method's “defining class” for short. The defining class can have a
|
that method's "defining class" for short. The defining class can have a
|
||||||
reference to the module it is part of.
|
reference to the module it is part of.
|
||||||
|
|
||||||
Do not confuse the defining class with ``Py_TYPE(self)``. If the method
|
Do not confuse the defining class with ``Py_TYPE(self)``. If the method
|
||||||
|
@ -436,7 +458,9 @@ that subclass, which may be defined in different module than yours.
|
||||||
.. note::
|
.. note::
|
||||||
The following Python code can illustrate the concept.
|
The following Python code can illustrate the concept.
|
||||||
``Base.get_defining_class`` returns ``Base`` even
|
``Base.get_defining_class`` returns ``Base`` even
|
||||||
if ``type(self) == Sub``::
|
if ``type(self) == Sub``:
|
||||||
|
|
||||||
|
.. code-block:: python
|
||||||
|
|
||||||
class Base:
|
class Base:
|
||||||
def get_defining_class(self):
|
def get_defining_class(self):
|
||||||
|
@ -445,12 +469,11 @@ that subclass, which may be defined in different module than yours.
|
||||||
class Sub(Base):
|
class Sub(Base):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
|
For a method to get its "defining class", it must use the
|
||||||
For a method to get its “defining class”, it must use the
|
``METH_METHOD | METH_FASTCALL | METH_KEYWORDS`` `calling convention
|
||||||
``METH_METHOD | METH_FASTCALL | METH_KEYWORDS`` `calling
|
<https://docs.python.org/3/c-api/structures.html#c.PyMethodDef>`__
|
||||||
convention <https://docs.python.org/3.9/c-api/structures.html?highlight=meth_o#c.PyMethodDef>`__
|
and the corresponding `PyCMethod signature
|
||||||
and the corresponding `PyCMethod
|
<https://docs.python.org/3/c-api/structures.html#c.PyCMethod>`__::
|
||||||
signature <https://docs.python.org/3.9/c-api/structures.html#c.PyCMethod>`__::
|
|
||||||
|
|
||||||
PyObject *PyCMethod(
|
PyObject *PyCMethod(
|
||||||
PyObject *self, // object the method was called on
|
PyObject *self, // object the method was called on
|
||||||
|
@ -488,8 +511,9 @@ For example::
|
||||||
{NULL},
|
{NULL},
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
Module State Access from Slot Methods, Getters and Setters
|
Module State Access from Slot Methods, Getters and Setters
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
----------------------------------------------------------
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
|
@ -501,18 +525,20 @@ Module State Access from Slot Methods, Getters and Setters
|
||||||
you must update ``Py_LIMITED_API`` to ``0x030b0000``, losing ABI
|
you must update ``Py_LIMITED_API`` to ``0x030b0000``, losing ABI
|
||||||
compatibility with earlier versions.
|
compatibility with earlier versions.
|
||||||
|
|
||||||
Slot methods -- the fast C equivalents for special methods, such as
|
Slot methods -- the fast C equivalents for special methods, such as `nb_add
|
||||||
`nb_add <https://docs.python.org/3/c-api/typeobj.html#c.PyNumberMethods.nb_add>`__
|
<https://docs.python.org/3/c-api/typeobj.html#c.PyNumberMethods.nb_add>`__
|
||||||
for ``__add__`` or `tp_new <https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_new>`__
|
for ``__add__`` or `tp_new
|
||||||
|
<https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_new>`__
|
||||||
for initialization -- have a very simple API that doesn't allow
|
for initialization -- have a very simple API that doesn't allow
|
||||||
passing in the defining class as in ``PyCMethod``.
|
passing in the defining class, unlike with ``PyCMethod``.
|
||||||
The same goes for getters and setters defined with
|
The same goes for getters and setters defined with
|
||||||
`PyGetSetDef <https://docs.python.org/3/c-api/structures.html#c.PyGetSetDef>`__.
|
`PyGetSetDef <https://docs.python.org/3/c-api/structures.html#c.PyGetSetDef>`__.
|
||||||
|
|
||||||
To access the module state in these cases, use the
|
To access the module state in these cases, use the `PyType_GetModuleByDef
|
||||||
`PyType_GetModuleByDef <https://docs.python.org/typeobj.html#c.PyType_GetModuleByDef>`__
|
<https://docs.python.org/3/c-api/typeobj.html#c.PyType_GetModuleByDef>`__
|
||||||
function, and pass in the module definition.
|
function, and pass in the module definition.
|
||||||
Once you have the module, call `PyModule_GetState <https://docs.python.org/3/c-api/module.html?highlight=pymodule_getstate#c.PyModule_GetState>`__
|
Once you have the module, call `PyModule_GetState
|
||||||
|
<https://docs.python.org/3/c-api/module.html#c.PyModule_GetState>`__
|
||||||
to get the state::
|
to get the state::
|
||||||
|
|
||||||
PyObject *module = PyType_GetModuleByDef(Py_TYPE(self), &module_def);
|
PyObject *module = PyType_GetModuleByDef(Py_TYPE(self), &module_def);
|
||||||
|
@ -521,7 +547,8 @@ to get the state::
|
||||||
return NULL;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
``PyType_GetModuleByDef`` works by searching the `MRO <https://docs.python.org/3/glossary.html#term-method-resolution-order>`__
|
``PyType_GetModuleByDef`` works by searching the `MRO
|
||||||
|
<https://docs.python.org/3/glossary.html#term-method-resolution-order>`__
|
||||||
(i.e. all superclasses) for the first superclass that has a corresponding
|
(i.e. all superclasses) for the first superclass that has a corresponding
|
||||||
module.
|
module.
|
||||||
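The MRO search described above can be modelled in pure Python. This is only a conceptual sketch under stated assumptions: ``FakeModule``, ``get_module_by_def`` and the ``_module`` class attribute are hypothetical stand-ins and do not correspond to real CPython names or internals.

```python
class FakeModule:
    """Hypothetical stand-in for a module object created from a definition."""

    def __init__(self, definition, state):
        self.definition = definition
        self.state = state


def get_module_by_def(tp, module_def):
    # Walk the MRO in order and return the module of the first class
    # that was created for the given module definition.
    for klass in tp.__mro__:
        # Look only at the class's own namespace, not inherited attributes.
        module = vars(klass).get("_module")
        if module is not None and module.definition is module_def:
            return module
    raise TypeError("no class in the MRO belongs to this module definition")


my_def = object()  # stands in for a module definition; only identity matters
my_module = FakeModule(my_def, state={"counter": 0})


class Base:
    _module = my_module  # Base was "defined" by my_module


class Sub(Base):  # a subclass defined elsewhere, with no module of its own
    pass


# The search starts at Sub, skips it, and finds Base's module.
assert get_module_by_def(Sub, my_def) is my_module
```

Because the lookup starts from the runtime type and walks up, it still finds the right module when called on an instance of a subclass defined in another module.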
|
|
||||||
|
@ -535,7 +562,7 @@ module.
|
||||||
|
|
||||||
|
|
||||||
Lifetime of the Module State
|
Lifetime of the Module State
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
----------------------------
|
||||||
|
|
||||||
When a module object is garbage-collected, its module state is freed.
|
When a module object is garbage-collected, its module state is freed.
|
||||||
For each pointer to (a part of) the module state, you must hold a reference
|
For each pointer to (a part of) the module state, you must hold a reference
|
||||||
|
@ -550,30 +577,33 @@ libraries.
|
||||||
|
|
||||||
|
|
||||||
Open Issues
|
Open Issues
|
||||||
-----------
|
===========
|
||||||
|
|
||||||
Several issues around per-module state and heap types are still open.
|
Several issues around per-module state and heap types are still open.
|
||||||
|
|
||||||
Discussions about improving the situation are best held on the `capi-sig
|
Discussions about improving the situation are best held on the `capi-sig
|
||||||
mailing list <https://mail.python.org/mailman3/lists/capi-sig.python.org/>`__.
|
mailing list <https://mail.python.org/mailman3/lists/capi-sig.python.org/>`__.
|
||||||
|
|
||||||
|
|
||||||
Type Checking
|
Type Checking
|
||||||
~~~~~~~~~~~~~
|
-------------
|
||||||
|
|
||||||
Currently (as of Python 3.10), heap types have no good API to write
|
Currently (as of Python 3.10), heap types have no good API to write
|
||||||
``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a
|
``Py*_Check`` functions (like ``PyUnicode_Check`` exists for ``str``, a
|
||||||
static type), and so it is not easy to ensure whether instances have a
|
static type), and so it is not easy to ensure that instances have a
|
||||||
particular C layout.
|
particular C layout.
|
||||||
|
|
||||||
|
|
||||||
Metaclasses
|
Metaclasses
|
||||||
~~~~~~~~~~~
|
-----------
|
||||||
|
|
||||||
Currently (as of Python 3.10), there is no good API to specify the
|
Currently (as of Python 3.10), there is no good API to specify the
|
||||||
*metaclass* of a heap type, that is, the ``ob_type`` field of the type
|
*metaclass* of a heap type; that is, the ``ob_type`` field of the type
|
||||||
object.
|
object.
|
||||||
|
|
||||||
Per-Class scope
|
|
||||||
~~~~~~~~~~~~~~~
|
Per-Class Scope
|
||||||
|
---------------
|
||||||
|
|
||||||
It is also not possible to attach state to *types*. While
|
It is also not possible to attach state to *types*. While
|
||||||
``PyHeapTypeObject`` is a variable-size object (``PyVarObject``),
|
``PyHeapTypeObject`` is a variable-size object (``PyVarObject``),
|
||||||
|
@ -581,26 +611,18 @@ its variable-size storage is currently consumed by slots. Fixing this
|
||||||
is complicated by the fact that several classes in an inheritance
|
is complicated by the fact that several classes in an inheritance
|
||||||
hierarchy may need to reserve some state.
|
hierarchy may need to reserve some state.
|
||||||
|
|
||||||
Lossless conversion to heap types
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
The heap type API was not designed for “lossless” conversion from static types,
|
Lossless Conversion to Heap Types
|
||||||
|
---------------------------------
|
||||||
|
|
||||||
|
The heap type API was not designed for "lossless" conversion from static types;
|
||||||
that is, creating a type that works exactly like a given static type.
|
that is, creating a type that works exactly like a given static type.
|
||||||
The best way to address it would probably be to write a guide that covers
|
The best way to address it would probably be to write a guide that covers
|
||||||
known “gotchas”.
|
known "gotchas".
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
---------
|
=========
|
||||||
|
|
||||||
This document is placed in the public domain or under the
|
This document is placed in the public domain or under the
|
||||||
CC0-1.0-Universal license, whichever is more permissive.
|
CC0-1.0-Universal license, whichever is more permissive.
|
||||||
|
|
||||||
..
|
|
||||||
Local Variables:
|
|
||||||
mode: indented-text
|
|
||||||
indent-tabs-mode: nil
|
|
||||||
sentence-end-double-space: t
|
|
||||||
fill-column: 70
|
|
||||||
coding: utf-8
|
|
||||||
End:
|
|
||||||
|
|