PEP 445
This commit is contained in:
parent
4cd865a624
commit
69f972bb2b
168
pep-0445.txt
|
@ -34,12 +34,6 @@ Use cases:
|
|||
allocator APIs (builtin Python debug hooks)
|
||||
- force allocation to fail to test handling of ``MemoryError`` exception
|
||||
|
||||
API:
|
||||
|
||||
* Set up a custom memory allocator for all memory allocated by Python
|
||||
* Hook memory allocator functions to call extra code before and/or after
|
||||
the underlying allocator function
|
||||
|
||||
|
||||
Proposal
|
||||
========
|
||||
|
@ -47,15 +41,29 @@ Proposal
|
|||
API changes
|
||||
-----------
|
||||
|
||||
* Add a new ``PyMemAllocators`` structure
|
||||
|
||||
* Add new GIL-free memory allocator functions:
|
||||
|
||||
- ``void* PyMem_RawMalloc(size_t size)``
|
||||
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
|
||||
- ``void PyMem_RawFree(void *ptr)``
|
||||
|
||||
* Add new functions to get and set memory allocators:
|
||||
* Add a new ``PyMemAllocators`` structure::
|
||||
|
||||
typedef struct {
|
||||
/* user context passed as the first argument to the 3 functions */
|
||||
void *ctx;
|
||||
|
||||
/* allocate memory */
|
||||
void* (*malloc) (void *ctx, size_t size);
|
||||
|
||||
/* allocate memory or resize a memory buffer */
|
||||
void* (*realloc) (void *ctx, void *ptr, size_t new_size);
|
||||
|
||||
/* release memory */
|
||||
void (*free) (void *ctx, void *ptr);
|
||||
} PyMemAllocators;
|
||||
|
||||
* Add new functions to get and set memory block allocators:
|
||||
|
||||
- ``void PyMem_GetRawAllocators(PyMemAllocators *allocators)``
|
||||
- ``void PyMem_SetRawAllocators(PyMemAllocators *allocators)``
|
||||
|
@ -63,17 +71,32 @@ API changes
|
|||
- ``void PyMem_SetAllocators(PyMemAllocators *allocators)``
|
||||
- ``void PyObject_GetAllocators(PyMemAllocators *allocators)``
|
||||
- ``void PyObject_SetAllocators(PyMemAllocators *allocators)``
|
||||
|
||||
* Add new functions to get and set memory mapping allocators:
|
||||
|
||||
- ``void _PyObject_GetArenaAllocators(void **ctx_p, void* (**malloc_p) (void *ctx, size_t size), void (**free_p) (void *ctx, void *ptr, size_t size))``
|
||||
- ``void _PyObject_SetArenaAllocators(void *ctx, void* (*malloc) (void *ctx, size_t size), void (*free) (void *ctx, void *ptr, size_t size))``
|
||||
|
||||
* Add a new function to set up Python builtin debug hooks when memory
|
||||
* Add a new function to set up the builtin Python debug hooks when memory
|
||||
allocators are replaced:
|
||||
|
||||
- ``void PyMem_SetupDebugHooks(void)``
|
||||
|
||||
.. note::
|
||||
|
||||
Use these new APIs
|
||||
------------------
|
||||
The builtin Python debug hooks were introduced in Python 2.3 and implement the
|
||||
following checks:
|
||||
|
||||
* Newly allocated memory is filled with the byte 0xCB, freed memory is filled
|
||||
with the byte 0xDB.
|
||||
* Detect API violations, ex: ``PyObject_Free()`` called on a memory block
|
||||
allocated by ``PyMem_Malloc()``
|
||||
* Detect write before the start of the buffer (buffer underflow)
|
||||
* Detect write after the end of the buffer (buffer overflow)
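The fill-byte and buffer underflow/overflow checks listed above can be illustrated with a minimal C sketch. This is not the CPython implementation: the names ``dbg_malloc``, ``dbg_check`` and ``dbg_free``, the pad byte 0xFB and the pad width are assumptions made for this example; only the fill bytes 0xCB and 0xDB come from the text. The API-violation check is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CLEAN_BYTE 0xCB  /* fill for newly allocated memory (from the text) */
#define DEAD_BYTE  0xDB  /* fill for freed memory (from the text) */
#define GUARD_BYTE 0xFB  /* assumption: pad value chosen for this sketch */
#define GUARD_SIZE 8     /* assumption: pad width chosen for this sketch */

/* Block layout: [size_t size][guard pad][user data][guard pad] */
#define HEADER_SIZE (sizeof(size_t) + GUARD_SIZE)

/* Allocate size bytes surrounded by guard pads and filled with CLEAN_BYTE. */
void *dbg_malloc(size_t size)
{
    if (size > SIZE_MAX - HEADER_SIZE - GUARD_SIZE)
        return NULL;                        /* total size would overflow */
    unsigned char *base = malloc(HEADER_SIZE + size + GUARD_SIZE);
    if (base == NULL)
        return NULL;
    memcpy(base, &size, sizeof(size_t));    /* remember the requested size */
    unsigned char *user = base + HEADER_SIZE;
    memset(user - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
    memset(user, CLEAN_BYTE, size);
    memset(user + size, GUARD_BYTE, GUARD_SIZE);
    return user;
}

/* Return 0 if both pads are intact, -1 on underflow, 1 on overflow. */
int dbg_check(const void *ptr)
{
    const unsigned char *user = ptr;
    size_t size;
    assert(ptr != NULL);
    memcpy(&size, user - HEADER_SIZE, sizeof(size_t));
    const unsigned char *head = user - GUARD_SIZE;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (head[i] != GUARD_BYTE)
            return -1;                      /* write before the buffer */
    }
    const unsigned char *tail = user + size;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (tail[i] != GUARD_BYTE)
            return 1;                       /* write after the buffer */
    }
    return 0;
}

/* Check the pads, fill the data with DEAD_BYTE if they are intact, then
   release the block. Returns the result of dbg_check(). */
int dbg_free(void *ptr)
{
    if (ptr == NULL)
        return 0;
    int status = dbg_check(ptr);
    unsigned char *user = ptr;
    if (status == 0) {
        size_t size;
        memcpy(&size, user - HEADER_SIZE, sizeof(size_t));
        memset(user, DEAD_BYTE, size);
    }
    free(user - HEADER_SIZE);
    return status;
}
```

A caller would pair ``dbg_malloc()`` with ``dbg_free()`` and treat a non-zero return value as a detected underflow (-1) or overflow (1).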
|
||||
|
||||
|
||||
Make usage of these new APIs
|
||||
----------------------------
|
||||
|
||||
* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()`` and
|
||||
``realloc()``, instead of calling ``PyObject_Malloc()`` and
|
||||
|
@ -156,12 +179,12 @@ bytes per allocation, and 10 bytes per arena::
|
|||
are not thread-safe.
|
||||
|
||||
|
||||
Use case 2: Replace Memory Allocators, overriding pymalloc
|
||||
----------------------------------------------------------
|
||||
Use case 2: Replace Memory Allocators, override pymalloc
|
||||
--------------------------------------------------------
|
||||
|
||||
If your allocator is optimized for allocation of small objects (less than 512
|
||||
bytes) with a short lifetime, you can override pymalloc (replace
|
||||
``PyObject_Malloc()``).
|
||||
bytes) with a short lifetime, pymalloc can be overridden: replace
|
||||
``PyObject_Malloc()``.
|
||||
|
||||
Dummy Example wasting 2 bytes per allocation::
|
||||
|
||||
|
@ -210,7 +233,7 @@ Dummy Example wasting 2 bytes per allocation::
|
|||
Use case 3: Setup Allocator Hooks
|
||||
---------------------------------
|
||||
|
||||
Example to set up hooks on memory allocators::
|
||||
Example to set up hooks on all memory allocators::
|
||||
|
||||
struct {
|
||||
PyMemAllocators pymem;
|
||||
|
@ -249,11 +272,11 @@ Example to setup hooks on memory allocators::
|
|||
void setup_hooks(void)
|
||||
{
|
||||
PyMemAllocators alloc;
|
||||
static int registered = 0;
|
||||
static int installed = 0;
|
||||
|
||||
if (registered)
|
||||
if (installed)
|
||||
return;
|
||||
registered = 1;
|
||||
installed = 1;
|
||||
|
||||
alloc.malloc = hook_malloc;
|
||||
alloc.realloc = hook_realloc;
|
||||
|
@ -284,30 +307,27 @@ Example to setup hooks on memory allocators::
|
|||
Performances
|
||||
============
|
||||
|
||||
The `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b 2n3): some
|
||||
tests are 1.04x faster, some tests are 1.04x slower, significant is between 115
|
||||
and -191. I don't understand this output, but I guess that the overhead cannot
|
||||
be seen with such a test.
|
||||
Results of the `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b
|
||||
2n3): some tests are 1.04x faster, some tests are 1.04x slower, significant is
|
||||
between 115 and -191. I don't understand this output, but I guess that the
|
||||
overhead cannot be seen with such a test.
|
||||
|
||||
pybench: "+0.1%" (diff between -4.9% and +5.6%).
|
||||
Results of the pybench benchmark: "+0.1%" slower globally (diff between -4.9% and
|
||||
+5.6%).
|
||||
|
||||
The full output is attached to issue #3329.
|
||||
The full reports are attached to issue #3329.
|
||||
|
||||
|
||||
Alternatives
|
||||
============
|
||||
|
||||
Only one get and one set function
|
||||
---------------------------------
|
||||
Only have one generic get/set function
|
||||
--------------------------------------
|
||||
|
||||
Replace the 6 functions:
|
||||
|
||||
* ``PyMem_GetRawAllocators()``
|
||||
* ``PyMem_GetAllocators()``
|
||||
* ``PyObject_GetAllocators()``
|
||||
* ``PyMem_SetRawAllocators(allocators)``
|
||||
* ``PyMem_SetAllocators(allocators)``
|
||||
* ``PyObject_SetAllocators(allocators)``
|
||||
* ``PyMem_GetRawAllocators()``, ``PyMem_GetAllocators()``, ``PyObject_GetAllocators()``
|
||||
* ``PyMem_SetRawAllocators(allocators)``, ``PyMem_SetAllocators(allocators)``, ``PyObject_SetAllocators(allocators)``
|
||||
|
||||
with 2 functions with an additional *domain* argument:
|
||||
|
||||
|
@ -321,16 +341,21 @@ where domain is one of these values:
|
|||
* ``PYALLOC_PYOBJECT``
|
||||
|
||||
|
||||
``_PyObject_GetArenaAllocators()`` and ``_PyObject_SetArenaAllocators()`` are
|
||||
not merged and kept private because their prototypes are different and they are
|
||||
specific to pymalloc.
|
||||
|
||||
|
||||
Add a new PYDEBUGMALLOC environment variable
|
||||
--------------------------------------------
|
||||
|
||||
To be able to use Python builtin debug hooks even when a custom memory
|
||||
allocator is set, an environment variable ``PYDEBUGMALLOC`` can be added to
|
||||
set up these debug function hooks, instead of adding the new function
|
||||
``PyMem_SetupDebugHooks()``. If the environment variable is present,
|
||||
``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()`` and
|
||||
``PyObject_SetAllocators()`` will automatically reinstall the hook on top of
|
||||
the new allocator.
|
||||
To be able to use the Python builtin debug hooks even when a custom memory
|
||||
allocator replaces the default Python allocator, an environment variable
|
||||
``PYDEBUGMALLOC`` can be added to set up these debug function hooks, instead of
|
||||
adding the new function ``PyMem_SetupDebugHooks()``. If the environment
|
||||
variable is present, ``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()``
|
||||
and ``PyObject_SetAllocators()`` will automatically reinstall the hook on top
|
||||
of the new allocator.
|
||||
|
||||
A new environment variable would make the Python initialization even more
|
||||
complex. The `PEP 432 <http://www.python.org/dev/peps/pep-0432/>`_ tries to
|
||||
|
@ -343,8 +368,8 @@ Use macros to get customizable allocators
|
|||
To have no overhead in the default configuration, customizable allocators would
|
||||
be an optional feature enabled by a configuration option or by macros.
|
||||
|
||||
Not having to recompile Python makes debug hooks easy to use in practice.
|
||||
Extension modules don't have to be compiled with or without macros.
|
||||
Not having to recompile Python makes debug hooks easier to use in practice.
|
||||
Extension modules don't have to be recompiled with macros.
|
||||
|
||||
|
||||
Pass the C filename and line number
|
||||
|
@ -354,10 +379,11 @@ Use C macros using ``__FILE__`` and ``__LINE__`` to get the C filename
|
|||
and line number of a memory allocation.
|
||||
|
||||
Passing a filename and a line number to each allocator makes the API more
|
||||
complex: pass 3 new arguments instead of just a context argument, to each
|
||||
allocator function. GC allocator functions should also be patched,
|
||||
``_PyObject_GC_Malloc()`` is used in many C functions for example. Such changes
|
||||
add too much complexity for little gain.
|
||||
complex: pass 3 new arguments, instead of just a context argument, to each
|
||||
allocator function. The GC allocator functions should also be patched.
|
||||
``_PyObject_GC_Malloc()`` is used in many C functions for example and so
|
||||
objects of different types would have the same allocation location. Such
|
||||
changes add too much complexity for little gain.
|
||||
|
||||
|
||||
No context argument
|
||||
|
@ -369,41 +395,67 @@ Simplify the signature of allocator functions, remove the context argument:
|
|||
* ``void* realloc(void *ptr, size_t new_size)``
|
||||
* ``void free(void *ptr)``
|
||||
|
||||
The context is a convenient way to reuse the same allocator for different APIs
|
||||
(ex: PyMem and PyObject).
|
||||
It is likely for an allocator hook to be reused for ``PyMem_SetAllocators()``
|
||||
and ``PyObject_SetAllocators()``, but the hook must call a different function
|
||||
depending on the allocator. The context is a convenient way to reuse the same
|
||||
allocator or hook for different APIs.
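The role of the context argument can be sketched in C as follows. To compile without ``Python.h``, the sketch declares a local copy of the ``PyMemAllocators`` structure proposed above; ``hook_state``, ``install_hook()`` and the ``sys_*`` stand-in allocators are hypothetical names introduced for this example. The same three hook functions serve every domain, and ``ctx`` tells each call which underlying allocator to forward to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local copy of the structure proposed in the text (assumption: declared
   here only so that the sketch builds without Python.h). */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);
    void (*free) (void *ctx, void *ptr);
} PyMemAllocators;

/* Hypothetical per-domain state reached through ctx: the wrapped
   allocator and a call counter. */
typedef struct {
    PyMemAllocators wrapped;
    size_t calls;
} hook_state;

/* One set of hooks shared by all domains: ctx selects the wrapped allocator. */
void *hook_malloc(void *ctx, size_t size)
{
    hook_state *state = ctx;
    assert(state != NULL);
    state->calls++;
    return state->wrapped.malloc(state->wrapped.ctx, size);
}

void *hook_realloc(void *ctx, void *ptr, size_t new_size)
{
    hook_state *state = ctx;
    assert(state != NULL);
    state->calls++;
    return state->wrapped.realloc(state->wrapped.ctx, ptr, new_size);
}

void hook_free(void *ctx, void *ptr)
{
    hook_state *state = ctx;
    assert(state != NULL);
    state->calls++;
    state->wrapped.free(state->wrapped.ctx, ptr);
}

/* Stand-ins for a domain's original allocator (they ignore ctx). */
void *sys_malloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
void *sys_realloc(void *ctx, void *ptr, size_t n) { (void)ctx; return realloc(ptr, n); }
void sys_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

/* Save the domain's current allocator in state and install the hooks. */
void install_hook(PyMemAllocators *domain, hook_state *state)
{
    state->wrapped = *domain;
    state->calls = 0;
    domain->ctx = state;
    domain->malloc = hook_malloc;
    domain->realloc = hook_realloc;
    domain->free = hook_free;
}
```

With the proposed API, ``domain`` would be filled by a get function such as ``PyMem_GetAllocators()`` and passed back to the matching set function after ``install_hook()``; without a context argument, each domain would need its own copy of the three hook functions.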
|
||||
|
||||
|
||||
PyMem_Malloc() GIL-free
|
||||
-----------------------
|
||||
|
||||
There is no real reason to require the GIL when calling ``PyMem_Malloc()``.
|
||||
``PyMem_Malloc()`` must be called with the GIL held because in debug mode, it
|
||||
indirectly calls ``PyObject_Malloc()`` which requires the GIL to be held. This
|
||||
PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call ``malloc()``.
|
||||
So the "GIL must be held" restriction may be removed from ``PyMem_Malloc()``.
|
||||
|
||||
Allowing ``PyMem_Malloc()`` to be called without holding the GIL might break
|
||||
applications which set up their own allocator or their allocator hooks. Holding
|
||||
the GIL is very convenient to develop a custom allocator or a hook (no need to
|
||||
care about other threads, no need to handle mutexes, etc.).
|
||||
the GIL is very convenient to develop a custom allocator: no need to care about
|
||||
other threads or mutexes. It is also convenient for an allocator hook: Python
|
||||
internals can be safely inspected.
|
||||
|
||||
Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected
|
||||
behaviour, especially at Python startup and at creation of a new Python thread
|
||||
state.
|
||||
|
||||
|
||||
Don't add PyMem_RawMalloc()
|
||||
---------------------------
|
||||
|
||||
Replace ``malloc()`` with ``PyMem_Malloc()``, but if the GIL is not held: keep
|
||||
``malloc()`` unchanged.
|
||||
Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held.
|
||||
Otherwise, keep ``malloc()`` unchanged.
|
||||
|
||||
``PyMem_Malloc()`` is sometimes already misused. For example, the
|
||||
``main()`` and ``Py_Main()`` functions of Python call ``PyMem_Malloc()``
|
||||
whereas the GIL does not exist yet. In this case, ``PyMem_Malloc()`` should
|
||||
be replaced with ``malloc()``.
|
||||
be replaced with ``malloc()`` (or ``PyMem_RawMalloc()``).
|
||||
|
||||
If a hook is used to track memory usage, the ``malloc()`` memory will not
|
||||
be seen. Remaining ``malloc()`` calls may allocate a lot of memory and so would be
|
||||
missed in reports.
|
||||
|
||||
|
||||
CCP API
|
||||
-------
|
||||
|
||||
XXX To be done (Kristján Valur Jónsson) XXX
|
||||
Use existing debug tools to analyze the memory
|
||||
----------------------------------------------
|
||||
|
||||
There are many existing debug tools to analyze the memory. Some examples:
|
||||
`Valgrind <http://valgrind.org/>`_,
|
||||
`Purify <http://ibm.com/software/awdtools/purify/>`_,
|
||||
`Clang AddressSanitizer <http://code.google.com/p/address-sanitizer/>`_,
|
||||
`failmalloc <http://www.nongnu.org/failmalloc/>`_,
|
||||
etc.
|
||||
|
||||
The problem is to retrieve the Python object related to a memory pointer to read
|
||||
its type and/or content. Another issue is to retrieve the location of the
|
||||
memory allocation: the C backtrace is usually useless (same reasoning as for
|
||||
macros using ``__FILE__`` and ``__LINE__``), the Python filename and line
|
||||
number (or even the Python traceback) are more useful.
|
||||
|
||||
Classic tools are unable to introspect Python internals to collect such
|
||||
information. Being able to set up a hook on allocators called with the GIL held
|
||||
allows reading a lot of useful data from Python internals.
|
||||
|
||||
|
||||
External libraries
|
||||
|
|