PEP: 3121
Title: Extension Module Initialization and Finalization
Version: $Revision$
Last-Modified: $Date$
Author: Martin von Löwis <martin@v.loewis.de>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Apr-2007
Python-Version: 3.0
Post-History:

Abstract
========

Extension module initialization currently has a few deficiencies.
There is no cleanup for modules, the entry point name might cause
naming conflicts, the entry functions don't follow the usual calling
convention, and multiple interpreters are not supported well. This PEP
addresses these issues.

Problems
========

Module Finalization
-------------------

Currently, extension modules are usually initialized once and then
"live" forever. The only exception is when Py_Finalize() is called
and the interpreter is later initialized again: then the
initialization routine is invoked a second time. This is bad
from a resource management point of view: memory and other resources
might get allocated each time initialization is called, but there is
no way to reclaim them. As a result, there is currently no way to
completely release all resources Python has allocated.

Entry point name conflicts
--------------------------

The entry point is currently called init<module>. This might conflict
with other symbols also called init<something>. In particular,
initsocket is known to have conflicted in the past (this specific
problem got resolved as a side effect of renaming the module to
_socket).

Entry point signature
---------------------

The entry point is currently a procedure (returning void). This
deviates from the usual calling conventions; callers can find out
whether there was an error during initialization only by checking
PyErr_Occurred. The entry point should return a PyObject*, which will
be the module created, or NULL in case of an exception.

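The difference between the two conventions can be illustrated with a
small standalone sketch. It does not use the Python C API: toy_module,
toy_error_pending, toy_init_old and toy_init_new are hypothetical names
invented for this illustration, with toy_error_pending standing in for
the interpreter's pending-exception indicator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a module object; not a CPython type. */
typedef struct {
    const char *name;
} toy_module;

/* Hypothetical stand-in for the pending-exception indicator. */
int toy_error_pending = 0;

/* Old convention: returns void, so a failure is only visible
   through the side channel. */
void toy_init_old(int should_fail)
{
    if (should_fail)
        toy_error_pending = 1;
}

/* New convention: returns the created module, or NULL on failure
   (with the error indicator set as well). */
toy_module *toy_init_new(const char *name, int should_fail)
{
    toy_module *m;

    assert(name != NULL);
    if (should_fail) {
        toy_error_pending = 1;
        return NULL;
    }
    m = malloc(sizeof *m);
    if (m == NULL) {
        toy_error_pending = 1;
        return NULL;
    }
    m->name = name;
    return m;
}
```

With the second form, a caller can test the return value directly
instead of querying global error state after every call.
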
Multiple Interpreters
---------------------

Currently, extension modules share their state across all
interpreters. This allows for undesirable information leakage across
interpreters: one script could permanently corrupt objects in an
extension module, possibly breaking all scripts in other interpreters.

Specification
=============

The module initialization routines change their signature
to::

  PyObject *PyInit_<modulename>()

The initialization routine will be invoked once per
interpreter, when the module is imported. It should
return a new module object each time.

In order to store per-module state in C variables,
each module object will contain a block of memory
that is interpreted only by the module. The amount
of memory used for the module is specified at
the point of creation of the module.

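As a rough illustration of this idea, the following standalone sketch
models a module object that owns a private, zero-filled block whose
size is fixed at creation time. It is a toy model under stated
assumptions, not interpreter code: toy_moduledef, toy_module,
toy_module_new, toy_module_state, toy_module_free and counter_state are
all hypothetical names, and state_size merely plays the role of the
per-module size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical miniature of a module definition. */
typedef struct {
    const char *name;
    size_t state_size;   /* size of the per-module block */
} toy_moduledef;

/* Hypothetical module object owning a private block of memory. */
typedef struct {
    const toy_moduledef *def;
    void *state;         /* interpreted only by the module's own code */
} toy_module;

/* Create a module; the block is sized by the definition and zeroed. */
toy_module *toy_module_new(const toy_moduledef *def)
{
    toy_module *m;

    assert(def != NULL);
    m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->def = def;
    m->state = NULL;
    if (def->state_size > 0) {
        m->state = calloc(1, def->state_size);
        if (m->state == NULL) {
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Return the raw state block; the caller casts it to its own type. */
void *toy_module_state(toy_module *m)
{
    return m->state;
}

/* Release the module together with its state block. */
void toy_module_free(toy_module *m)
{
    if (m != NULL) {
        free(m->state);
        free(m);
    }
}

/* Example of a module-specific state layout. */
struct counter_state {
    long calls;
};
```

Creating two module objects from the same definition yields two
independent blocks, so a change made through one object is not visible
through the other. This is the property that keeps interpreters from
interfering with each other.
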
In addition to the initialization function, a module
may implement a number of additional callback
functions, which are invoked when the module's
tp_traverse, tp_clear, and tp_free functions are
invoked, and when the module is reloaded.

The entire module definition is combined in a struct
PyModuleDef::

  struct PyModuleDef{
    PyModuleDef_Base m_base;  /* To be filled out by the interpreter */
    Py_ssize_t m_size; /* Size of per-module data */
    PyMethodDef *m_methods;
    inquiry m_reload;
    traverseproc m_traverse;
    inquiry m_clear;
    freefunc m_free;
  };

Creation of a module is changed to expect an optional
PyModuleDef*. The module state will be
null-initialized.

Each module method will be passed the module object
as the first parameter. To access the module data,
a function::

  void* PyModule_GetState(PyObject*);

will be provided. In addition, to look up a module
more efficiently than going through sys.modules,
a function::

  PyObject* PyState_FindModule(struct PyModuleDef*);

will be provided. This lookup function will use an
index located in the m_base field, to find the
module by index, not by name.

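The index-based lookup can be pictured with the following standalone
sketch. It is only a model of the idea and makes several assumptions
that the text above does not state: the names toy_moduledef,
toy_module, toy_interp, toy_state_add_module and toy_state_find_module
are hypothetical, the table has a fixed size, index 0 means "not yet
assigned", and indices are handed out by a simple process-wide counter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MODULES 8

/* Hypothetical definition record; index 0 means "not yet assigned". */
typedef struct {
    const char *name;
    int index;           /* plays the role of the index in m_base */
} toy_moduledef;

/* Hypothetical module object. */
typedef struct {
    toy_moduledef *def;
} toy_module;

/* Hypothetical per-interpreter table of modules, addressed by index. */
typedef struct {
    toy_module *by_index[TOY_MAX_MODULES + 1];
} toy_interp;

/* Process-wide counter for handing out indices; 0 is reserved. */
static int toy_next_index = 1;

/* Record a module in one interpreter.
   Returns 0 on success, -1 if no index is left. */
int toy_state_add_module(toy_interp *interp, toy_module *m)
{
    assert(interp != NULL && m != NULL && m->def != NULL);
    if (m->def->index == 0) {
        if (toy_next_index > TOY_MAX_MODULES)
            return -1;
        m->def->index = toy_next_index++;
    }
    interp->by_index[m->def->index] = m;
    return 0;
}

/* Constant-time lookup by definition; no name is hashed or compared. */
toy_module *toy_state_find_module(const toy_interp *interp,
                                  const toy_moduledef *def)
{
    if (def->index == 0)
        return NULL;     /* never registered in any interpreter */
    return interp->by_index[def->index];
}
```

Because the index lives in the shared definition while the table lives
in each interpreter, the same definition resolves to a different module
object in each interpreter.
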
As all Python objects should be controlled through
the Python memory management, usage of "static"
type objects is discouraged, unless the type object
itself has no memory-managed state. To simplify
definition of heap types, a new method::

  PyTypeObject* PyType_Copy(PyTypeObject*);

is added.

Example
=======

xxmodule.c would be changed to remove the initxx
function, and add the following code instead::

  struct xxstate{
    PyObject *ErrorObject;
    PyObject *Xxo_Type;
  };

  #define xxstate(o) ((struct xxstate*)PyModule_GetState(o))

  /* Py_VISIT requires the parameters to be named "visit" and "arg" */
  static int xx_traverse(PyObject *m, visitproc visit,
                         void *arg)
  {
    Py_VISIT(xxstate(m)->ErrorObject);
    Py_VISIT(xxstate(m)->Xxo_Type);
    return 0;
  }

  static int xx_clear(PyObject *m)
  {
    Py_CLEAR(xxstate(m)->ErrorObject);
    Py_CLEAR(xxstate(m)->Xxo_Type);
    return 0;
  }

  static struct PyModuleDef xxmodule = {
    {}, /* m_base */
    sizeof(struct xxstate),
    xx_methods,
    0,  /* m_reload */
    xx_traverse,
    xx_clear,
    0,  /* m_free - not needed, since all is done in m_clear */
  };

  PyObject*
  PyInit_xx()
  {
    PyObject *res = PyModule_New("xx", &xxmodule);
    if (!res) return NULL;
    xxstate(res)->ErrorObject = PyErr_NewException("xx.error", NULL, NULL);
    if (!xxstate(res)->ErrorObject) {
      Py_DECREF(res);
      return NULL;
    }
    /* the state field is a PyObject*, so the new type is cast */
    xxstate(res)->Xxo_Type = (PyObject*)PyType_Copy(&Xxo_Type);
    if (!xxstate(res)->Xxo_Type) {
      Py_DECREF(res);
      return NULL;
    }
    return res;
  }


Discussion
==========

Tim Peters reports in [1]_ that PythonLabs considered such a feature
at one point, and lists the following additional hooks which aren't
currently supported in this PEP:

* when the module object is deleted from sys.modules

* when Py_Finalize is called

* when Python exits

* when the Python DLL is unloaded (Windows only)


References
==========

.. [1] Tim Peters, reporting earlier conversation about such a feature
   http://mail.python.org/pipermail/python-3000/2006-April/000726.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: