Victor Stinner 2013-06-18 14:14:17 +02:00
parent 4cd865a624
commit 69f972bb2b
1 changed files with 110 additions and 58 deletions


@ -34,12 +34,6 @@ Use cases:
allocator APIs (builtin Python debug hooks)
- force allocation to fail to test handling of ``MemoryError`` exception
API:
* Setup a custom memory allocator for all memory allocated by Python
* Hook memory allocator functions to call extra code before and/or after
the underlying allocator function
Proposal
========
@ -47,15 +41,29 @@ Proposal
API changes
-----------
* Add a new ``PyMemAllocators`` structure
* Add new GIL-free memory allocator functions:
- ``void* PyMem_RawMalloc(size_t size)``
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
- ``void PyMem_RawFree(void *ptr)``
* Add new functions to get and set memory allocators:
* Add a new ``PyMemAllocators`` structure::
typedef struct {
/* user context passed as the first argument to the 3 functions */
void *ctx;
/* allocate memory */
void* (*malloc) (void *ctx, size_t size);
/* allocate memory or resize a memory buffer */
void* (*realloc) (void *ctx, void *ptr, size_t new_size);
/* release memory */
void (*free) (void *ctx, void *ptr);
} PyMemAllocators;
* Add new functions to get and set memory block allocators:
- ``void PyMem_GetRawAllocators(PyMemAllocators *allocators)``
- ``void PyMem_SetRawAllocators(PyMemAllocators *allocators)``
@ -63,17 +71,32 @@ API changes
- ``void PyMem_SetAllocators(PyMemAllocators *allocators)``
- ``void PyObject_GetAllocators(PyMemAllocators *allocators)``
- ``void PyObject_SetAllocators(PyMemAllocators *allocators)``
* Add new functions to get and set memory mapping allocators:
- ``void _PyObject_GetArenaAllocators(void **ctx_p, void* (**malloc_p) (void *ctx, size_t size), void (**free_p) (void *ctx, void *ptr, size_t size))``
- ``void _PyObject_SetArenaAllocators(void *ctx, void* (*malloc) (void *ctx, size_t size), void (*free) (void *ctx, void *ptr, size_t size))``
* Add a new function to setup Python builtin debug hooks when memory
* Add a new function to set up the builtin Python debug hooks when memory
allocators are replaced:
- ``void PyMem_SetupDebugHooks(void)``
.. note::
Use these new APIs
------------------
The builtin Python debug hooks were introduced in Python 2.3 and implement the
following checks:
* Newly allocated memory is filled with the byte 0xCB, freed memory is filled
with the byte 0xDB.
* Detect API violations, e.g. ``PyObject_Free()`` called on a memory block
allocated by ``PyMem_Malloc()``
* Detect writes before the start of the buffer (buffer underflow)
* Detect writes after the end of the buffer (buffer overflow)
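The fill-byte and guard checks listed above can be illustrated with a small standalone sketch. This is not the CPython implementation: the function names ``debug_malloc()``, ``debug_check()`` and ``debug_free()``, the guard value and the guard width are assumptions made for this example; only the 0xCB and 0xDB fill bytes come from the text. The API violation check is not covered here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CLEAN_BYTE 0xCB  /* fill for newly allocated memory (from the text) */
#define DEAD_BYTE  0xDB  /* fill for freed memory (from the text) */
#define GUARD_BYTE 0xFB  /* assumption: guard value chosen for this sketch */
#define GUARD_SIZE 8     /* assumption: guard width chosen for this sketch */
#define SIZE_FIELD 8     /* bytes reserved in front to store the block size */

/* Block layout: [size field][guard][user data][guard] */

/* Allocate a block, fill it with CLEAN_BYTE and surround it with guards. */
static void *debug_malloc(size_t size)
{
    unsigned char *base, *data;

    if (size > SIZE_MAX - (SIZE_FIELD + 2 * GUARD_SIZE))
        return NULL;
    base = malloc(SIZE_FIELD + GUARD_SIZE + size + GUARD_SIZE);
    if (base == NULL)
        return NULL;
    memcpy(base, &size, sizeof(size));
    memset(base + SIZE_FIELD, GUARD_BYTE, GUARD_SIZE);
    data = base + SIZE_FIELD + GUARD_SIZE;
    memset(data, CLEAN_BYTE, size);
    memset(data + size, GUARD_BYTE, GUARD_SIZE);
    return data;
}

/* Return 1 if both guards are intact, 0 on underflow or overflow. */
static int debug_check(const void *ptr)
{
    const unsigned char *data = ptr;
    const unsigned char *head = data - GUARD_SIZE;
    const unsigned char *tail;
    size_t size, i;

    memcpy(&size, head - SIZE_FIELD, sizeof(size));
    tail = data + size;
    for (i = 0; i < GUARD_SIZE; i++) {
        if (head[i] != GUARD_BYTE || tail[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Check the guards, overwrite the data with DEAD_BYTE, release the block.
   Returns the result of the guard check. */
static int debug_free(void *ptr)
{
    unsigned char *data = ptr;
    size_t size;
    int ok = debug_check(ptr);

    memcpy(&size, data - GUARD_SIZE - SIZE_FIELD, sizeof(size));
    memset(data, DEAD_BYTE, size);
    free(data - GUARD_SIZE - SIZE_FIELD);
    return ok;
}
```

Filling freed memory with a recognizable byte makes use-after-free bugs more visible in a debugger, while the guards turn a silent one-byte overrun into a detectable error at check or free time.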
Make use of these new APIs
--------------------------
* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()`` and
``realloc()``, instead of calling ``PyObject_Malloc()`` and
@ -156,12 +179,12 @@ bytes per allocation, and 10 bytes per arena::
are not thread-safe.
Use case 2: Replace Memory Allocators, overriding pymalloc
----------------------------------------------------------
Use case 2: Replace Memory Allocators, override pymalloc
--------------------------------------------------------
If your allocator is optimized for allocation of small objects (less than 512
bytes) with a short liftime, you can replace override pymalloc (replace
``PyObject_Malloc()``).
bytes) with a short lifetime, pymalloc can be overridden: replace
``PyObject_Malloc()``.
Dummy Example wasting 2 bytes per allocation::
@ -210,7 +233,7 @@ Dummy Example wasting 2 bytes per allocation::
Use case 3: Setup Allocator Hooks
---------------------------------
Example to setup hooks on memory allocators::
Example setting up hooks on all memory allocators::
struct {
PyMemAllocators pymem;
@ -249,11 +272,11 @@ Example to setup hooks on memory allocators::
void setup_hooks(void)
{
PyMemAllocators alloc;
static int registered = 0;
static int installed = 0;
if (registered)
if (installed)
return;
registered = 1;
installed = 1;
alloc.malloc = hook_malloc;
alloc.realloc = hook_realloc;
@ -284,30 +307,27 @@ Example to setup hooks on memory allocators::
Performance
===========
The `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b 2n3): some
tests are 1.04x faster, some tests are 1.04 slower, significant is between 115
and -191. I don't understand these output, but I guess that the overhead cannot
be seen with such test.
Results of the `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b
2n3): some tests are 1.04x faster, some tests are 1.04x slower, and the
"significant" value is between 115 and -191. I don't understand this output,
but I guess that the overhead cannot be seen with such tests.
pybench: "+0.1%" (diff between -4.9% and +5.6%).
Results of the pybench benchmark: "+0.1%" slower overall (diff between -4.9%
and +5.6%).
The full output is attached to the issue #3329.
The full reports are attached to issue #3329.
Alternatives
============
Only one get and one set function
---------------------------------
Only have one generic get/set function
--------------------------------------
Replace the 6 functions:
* ``PyMem_GetRawAllocators()``
* ``PyMem_GetAllocators()``
* ``PyObject_GetAllocators()``
* ``PyMem_SetRawAllocators(allocators)``
* ``PyMem_SetAllocators(allocators)``
* ``PyObject_SetAllocators(allocators)``
* ``PyMem_GetRawAllocators()``, ``PyMem_GetAllocators()``, ``PyObject_GetAllocators()``
* ``PyMem_SetRawAllocators(allocators)``, ``PyMem_SetAllocators(allocators)``, ``PyObject_SetAllocators(allocators)``
with 2 functions with an additional *domain* argument:
@ -321,16 +341,21 @@ where domain is one of these values:
* ``PYALLOC_PYOBJECT``
``_PyObject_GetArenaAllocators()`` and ``_PyObject_SetArenaAllocators()`` are
not merged and kept private because their prototypes are different and they are
specific to pymalloc.
Add a new PYDEBUGMALLOC environment variable
--------------------------------------------
To be able to use Python builtin debug hooks even when a custom memory
allocator is set, an environment variable ``PYDEBUGMALLOC`` can be added to
setup these debug function hooks, instead of adding the new function
``PyMem_SetupDebugHooks()``. If the environment variable is present,
``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()`` and
``PyObject_SetAllocators()`` will reinstall automatically the hook on top of
the new allocator.
To be able to use the Python builtin debug hooks even when a custom memory
allocator replaces the default Python allocator, an environment variable
``PYDEBUGMALLOC`` can be added to set up these debug function hooks, instead of
adding the new function ``PyMem_SetupDebugHooks()``. If the environment
variable is present, ``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()``
and ``PyObject_SetAllocators()`` will automatically reinstall the hook on top
of the new allocator.
A new environment variable would make the Python initialization even more
complex. The `PEP 432 <http://www.python.org/dev/peps/pep-0432/>`_ tries to
@ -343,8 +368,8 @@ Use macros to get customizable allocators
To have no overhead in the default configuration, customizable allocators would
be an optional feature enabled by a configuration option or by macros.
Not having to recompile Python makes debug hooks easy to use in practice.
Extensions modules don't have to be compiled with or without macros.
Not having to recompile Python makes debug hooks easier to use in practice.
Extension modules don't have to be recompiled with macros.
Pass the C filename and line number
@ -354,10 +379,11 @@ Use C macros using ``__FILE__`` and ``__LINE__`` to get the C filename
and line number of a memory allocation.
Passing a filename and a line number to each allocator makes the API more
complex: pass 3 new arguments instead of just a context argument, to each
allocator function. GC allocator functions should also be patched,
``_PyObject_GC_Malloc()`` is used in many C functions for example. Such changes
add too much complexity, for a little gain.
complex: pass 3 new arguments, instead of just a context argument, to each
allocator function. The GC allocator functions should also be patched.
``_PyObject_GC_Malloc()`` is used in many C functions, for example, and so
objects of different types would have the same allocation location. Such
changes add too much complexity for little gain.
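The limitation described above can be seen in a small standalone sketch. All names here (``traced_malloc()``, ``TRACED_MALLOC()``, ``alloc_object()``, ``new_list_like()``, ``new_dict_like()``) are hypothetical and exist only for this illustration; ``alloc_object()`` stands in for a shared helper such as ``_PyObject_GC_Malloc()``.

```c
#include <assert.h>
#include <stdlib.h>

/* Location recorded by the most recent traced allocation. */
static const char *last_file;
static int last_line;

/* Hypothetical allocator taking the C filename and line number. */
static void *traced_malloc(size_t size, const char *file, int line)
{
    last_file = file;
    last_line = line;
    return malloc(size);
}

/* The macro captures the location where it is expanded. */
#define TRACED_MALLOC(size) traced_malloc((size), __FILE__, __LINE__)

/* Shared helper: the macro is expanded here, so every caller of this
   helper is reported with this single location. */
static void *alloc_object(size_t size)
{
    return TRACED_MALLOC(size);
}

/* Two different "types" allocated through the same helper. */
static void *new_list_like(void)
{
    return alloc_object(32);
}

static void *new_dict_like(void)
{
    return alloc_object(64);
}
```

Both constructors record the line inside ``alloc_object()``, so the C location alone cannot tell the two kinds of objects apart.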
No context argument
@ -369,41 +395,67 @@ Simplify the signature of allocator functions, remove the context argument:
* ``void* realloc(void *ptr, size_t new_size)``
* ``void free(void *ptr)``
The context is a convenient way to reuse the same allocator for different APIs
(ex: PyMem and PyObject).
It is likely for an allocator hook to be reused for ``PyMem_SetAllocators()``
and ``PyObject_SetAllocators()``, but the hook must call a different function
depending on the allocator. The context is a convenient way to reuse the same
allocator or hook for different APIs.
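A minimal sketch of this reuse, assuming a local ``Allocators`` type that mirrors the ``PyMemAllocators`` layout from the proposal so that the example compiles without Python headers. ``HookState``, ``install_hook()`` and the ``sys_*`` wrappers are hypothetical names for this illustration, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local mirror of the PyMemAllocators layout quoted in the proposal. */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);
    void (*free) (void *ctx, void *ptr);
} Allocators;

/* Per-domain hook state, reached through the context argument. */
typedef struct {
    Allocators original;   /* allocator that was in place before the hook */
    size_t malloc_calls;   /* number of malloc calls seen by the hook */
} HookState;

/* Stand-ins for the underlying allocators of each domain. */
static void *sys_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size);
}

static void *sys_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    return realloc(ptr, new_size);
}

static void sys_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* One set of hook functions serves every domain: the context tells the
   hook which original allocator to forward to. */
static void *hook_malloc(void *ctx, size_t size)
{
    HookState *state = ctx;
    state->malloc_calls++;
    return state->original.malloc(state->original.ctx, size);
}

static void *hook_realloc(void *ctx, void *ptr, size_t new_size)
{
    HookState *state = ctx;
    return state->original.realloc(state->original.ctx, ptr, new_size);
}

static void hook_free(void *ctx, void *ptr)
{
    HookState *state = ctx;
    state->original.free(state->original.ctx, ptr);
}

/* Save the current allocator of a domain and install the hook on top. */
static void install_hook(Allocators *domain, HookState *state)
{
    state->original = *domain;
    state->malloc_calls = 0;
    domain->ctx = state;
    domain->malloc = hook_malloc;
    domain->realloc = hook_realloc;
    domain->free = hook_free;
}
```

Without the context argument, the same three hook functions could not know which domain they were called for, and a separate copy of each function would be needed per domain.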
PyMem_Malloc() GIL-free
-----------------------
There is no real reason to require the GIL when calling ``PyMem_Malloc()``.
``PyMem_Malloc()`` must be called with the GIL held because in debug mode, it
indirectly calls ``PyObject_Malloc()`` which requires the GIL to be held. This
PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call ``malloc()``.
So the "GIL must be held" restriction may be removed from ``PyMem_Malloc()``.
Allowing ``PyMem_Malloc()`` to be called without holding the GIL might break
applications which set up their own allocator or their allocator hooks. Holding
the GIL is very convinient to develop a custom allocator or a hook (no need to
care of other threads, no need to handle mutexes, etc.).
the GIL is very convenient when developing a custom allocator: no need to care
about other threads or mutexes. It is also convenient for an allocator hook:
Python internals can be safely inspected.
Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected
behaviour, especially at Python startup and at creation of a new Python thread
state.
Don't add PyMem_RawMalloc()
---------------------------
Replace ``malloc()`` with ``PyMem_Malloc()``, but if the GIL is not held: keep
``malloc()`` unchanged.
Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held.
Otherwise, keep ``malloc()`` unchanged.
``PyMem_Malloc()`` is sometimes already misused. For example, the
``main()`` and ``Py_Main()`` functions of Python call ``PyMem_Malloc()``
whereas the GIL does not exist yet. In this case, ``PyMem_Malloc()`` should
be replaced with ``malloc()``.
be replaced with ``malloc()`` (or ``PyMem_RawMalloc()``).
If a hook is used to track memory usage, memory allocated by ``malloc()`` will
not be seen. The remaining ``malloc()`` calls may allocate a lot of memory and
so would be missed in reports.
CCP API
-------
XXX To be done (Kristján Valur Jónsson) XXX
Use existing debug tools to analyze the memory
----------------------------------------------
There are many existing debug tools to analyze the memory. Some examples:
`Valgrind <http://valgrind.org/>`_,
`Purify <http://ibm.com/software/awdtools/purify/>`_,
`Clang AddressSanitizer <http://code.google.com/p/address-sanitizer/>`_,
`failmalloc <http://www.nongnu.org/failmalloc/>`_,
etc.
The problem is to retrieve the Python object related to a memory pointer in
order to read its type and/or content. Another issue is to retrieve the
location of the memory allocation: the C backtrace is usually useless (same
reasoning as for macros using ``__FILE__`` and ``__LINE__``), whereas the
Python filename and line number (or even the Python traceback) are more useful.
Classic tools are unable to introspect Python internals to collect such
information. Being able to set up a hook on allocators called with the GIL held
allows a lot of useful data to be read from Python internals.
External libraries