Victor Stinner 2013-06-18 21:59:48 +02:00
parent 5cfe3ed35f
commit 2d81cffb61
1 changed files with 132 additions and 86 deletions


@ -25,10 +25,10 @@ Use cases:
optimized for its Python usage optimized for its Python usage
* Python running on embedded devices with low memory and slow CPU. * Python running on embedded devices with low memory and slow CPU.
A custom memory allocator may be required to use efficiently the memory A custom memory allocator may be required to use efficiently the memory
and/or to be able to use all memory of the device. and/or to be able to use all the memory of the device.
* Debug tool to: * Debug tool to:
- track memory leaks - track memory usage (memory leaks)
- get the Python filename and line number where an object was allocated - get the Python filename and line number where an object was allocated
- detect buffer underflow, buffer overflow and detect misuse of Python - detect buffer underflow, buffer overflow and detect misuse of Python
allocator APIs (builtin Python debug hooks) allocator APIs (builtin Python debug hooks)
@ -41,7 +41,7 @@ Proposal
API changes API changes
----------- -----------
* Add new GIL-free memory allocator functions: * Add new GIL-free (no need to hold the GIL) memory allocator functions:
- ``void* PyMem_RawMalloc(size_t size)`` - ``void* PyMem_RawMalloc(size_t size)``
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)`` - ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
@ -65,27 +65,27 @@ API changes
void (*free) (void *ctx, void *ptr); void (*free) (void *ctx, void *ptr);
} PyMemBlockAllocator; } PyMemBlockAllocator;
* Add new functions to get and set memory block allocators: * Add new functions to get and set internal functions of ``PyMem_RawMalloc()``,
``PyMem_RawRealloc()`` and ``PyMem_RawFree()``:
- Get/Set internal functions of ``PyMem_RawMalloc()``, - ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
``PyMem_RawRealloc()`` and ``PyMem_RawFree()``: - ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)`` * Add new functions to get and set internal functions of ``PyMem_Malloc()``,
* ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)`` ``PyMem_Realloc()`` and ``PyMem_Free()``:
- Get/Set internal functions of ``PyMem_Malloc()``, - ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
``PyMem_Realloc()`` and ``PyMem_Free()``: - ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
- ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return *NULL*:
it would be treated as an error.
* ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)`` * Add new functions to get and set internal functions of
* ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)`` ``PyObject_Malloc()``, ``PyObject_Realloc()`` and ``PyObject_Free()``:
* ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return *NULL*:
it would be treated as an error.
- Get/Set internal functions of ``PyObject_Malloc()``, - ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
``PyObject_Realloc()`` and ``PyObject_Free()``: - ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
- ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return *NULL*:
* ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)`` it would be treated as an error.
* ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
* Add a new ``PyMemMappingAllocator`` structure:: * Add a new ``PyMemMappingAllocator`` structure::
@ -94,7 +94,7 @@ API changes
void *ctx; void *ctx;
/* allocate a memory mapping */ /* allocate a memory mapping */
void* (*malloc) (void *ctx, size_t size); void* (*alloc) (void *ctx, size_t size);
/* release a memory mapping */ /* release a memory mapping */
void (*free) (void *ctx, void *ptr, size_t size); void (*free) (void *ctx, void *ptr, size_t size);
@ -112,6 +112,11 @@ API changes
- ``void PyMem_SetupDebugHooks(void)`` - ``void PyMem_SetupDebugHooks(void)``
* The following memory allocators always return *NULL* if size is greater
than ``PY_SSIZE_T_MAX``: ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()``,
``PyMem_Malloc()``, ``PyMem_Realloc()``, ``PyObject_Malloc()``,
``PyObject_Realloc()``.
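The size rule above can be illustrated with a minimal sketch. It is not the PEP's implementation: the ``sketch_*`` names are hypothetical, and ``SKETCH_SSIZE_T_MAX`` is only a stand-in for ``PY_SSIZE_T_MAX``, assuming ``Py_ssize_t`` has the same range as ``ptrdiff_t``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for PY_SSIZE_T_MAX (assumption: same range as ptrdiff_t). */
#define SKETCH_SSIZE_T_MAX ((size_t)PTRDIFF_MAX)

/* Hypothetical wrapper: reject oversized requests before calling malloc(). */
static void *sketch_raw_malloc(size_t size)
{
    if (size > SKETCH_SSIZE_T_MAX)
        return NULL;
    return malloc(size);
}

/* Hypothetical wrapper: same check for realloc(). On rejection the
   original block is left untouched and stays owned by the caller. */
static void *sketch_raw_realloc(void *ptr, size_t new_size)
{
    if (new_size > SKETCH_SSIZE_T_MAX)
        return NULL;
    return realloc(ptr, new_size);
}
```

Checking the size before the call means the underlying allocator never sees a size that could not be represented as a signed size.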
The builtin Python debug hooks were introduced in Python 2.3 and implement the The builtin Python debug hooks were introduced in Python 2.3 and implement the
following checks: following checks:
@ -151,8 +156,7 @@ Examples
Use case 1: Replace Memory Allocator, keep pymalloc Use case 1: Replace Memory Allocator, keep pymalloc
---------------------------------------------------- ----------------------------------------------------
Setup your custom memory allocator, keeping pymalloc. Dummy example wasting 2 Dummy example wasting 2 bytes per allocation, and 10 bytes per arena::
bytes per allocation, and 10 bytes per arena::
#include <stdlib.h> #include <stdlib.h>
@ -189,18 +193,21 @@ bytes per allocation, and 10 bytes per arena::
void setup_custom_allocator(void) void setup_custom_allocator(void)
{ {
PyMemBlockAllocator alloc; PyMemBlockAllocator block;
PyMemMappingAllocator mapping;
alloc.ctx = &alloc_padding; block.ctx = &alloc_padding;
alloc.malloc = my_malloc; block.malloc = my_malloc;
alloc.realloc = my_realloc; block.realloc = my_realloc;
alloc.free = my_free; block.free = my_free;
PyMem_SetRawAllocator(&alloc); PyMem_SetRawAllocator(&block);
PyMem_SetAllocator(&alloc); PyMem_SetAllocator(&block);
_PyObject_SetArenaAllocator(&arena_padding, mapping.ctx = &arena_padding;
my_alloc_arena, my_free_arena); mapping.alloc = my_alloc_arena;
mapping.free = my_free_arena;
PyMem_SetMappingAllocator(mapping);
PyMem_SetupDebugHooks(); PyMem_SetupDebugHooks();
} }
@ -267,9 +274,9 @@ Use case 3: Setup Allocator Hooks
Example to setup hooks on all memory allocators:: Example to setup hooks on all memory allocators::
struct { struct {
PyMemBlockAllocator pymem; PyMemBlockAllocator raw;
PyMemBlockAllocator pymem_raw; PyMemBlockAllocator mem;
PyMemBlockAllocator pyobj; PyMemBlockAllocator obj;
/* ... */ /* ... */
} hook; } hook;
@ -313,16 +320,16 @@ Example to setup hooks on all memory allocators::
alloc.realloc = hook_realloc; alloc.realloc = hook_realloc;
alloc.free = hook_free; alloc.free = hook_free;
PyMem_GetRawAllocator(&hook.pymem_raw); PyMem_GetRawAllocator(&hook.raw);
alloc.ctx = &hook.pymem_raw; alloc.ctx = &hook.raw;
PyMem_SetRawAllocator(&alloc); PyMem_SetRawAllocator(&alloc);
PyMem_GetAllocator(&hook.pymem); PyMem_GetAllocator(&hook.mem);
alloc.ctx = &hook.pymem; alloc.ctx = &hook.mem;
PyMem_SetAllocator(&alloc); PyMem_SetAllocator(&alloc);
PyObject_GetAllocator(&hook.pyobj); PyObject_GetAllocator(&hook.obj);
alloc.ctx = &hook.pyobj; alloc.ctx = &hook.obj;
PyObject_SetAllocator(&alloc); PyObject_SetAllocator(&alloc);
} }
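The get/set/ctx pattern used in this example can be tried outside of Python with a minimal, self-contained sketch. ``SketchBlockAllocator`` is a local copy of the structure layout shown in the PEP, and ``current``, ``saved``, ``hook_calls`` and ``sketch_install_hook()`` are hypothetical names that model a single allocator domain. None of this is CPython code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local mirror of the PyMemBlockAllocator layout (assumption). */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void *(*realloc)(void *ctx, void *ptr, size_t new_size);
    void (*free)(void *ctx, void *ptr);
} SketchBlockAllocator;

/* Base allocator: forwards to the C library and ignores ctx. */
static void *base_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void *base_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    return realloc(ptr, new_size ? new_size : 1);
}

static void base_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

static size_t hook_calls;  /* number of calls seen by the hooks */

/* Hooks: ctx points to the allocator that was active before the hook. */
static void *hook_malloc(void *ctx, size_t size)
{
    SketchBlockAllocator *prev = ctx;
    hook_calls++;
    return prev->malloc(prev->ctx, size);
}

static void *hook_realloc(void *ctx, void *ptr, size_t new_size)
{
    SketchBlockAllocator *prev = ctx;
    hook_calls++;
    return prev->realloc(prev->ctx, ptr, new_size);
}

static void hook_free(void *ctx, void *ptr)
{
    SketchBlockAllocator *prev = ctx;
    hook_calls++;
    prev->free(prev->ctx, ptr);
}

/* One allocator domain and the copy saved when the hook is installed. */
static SketchBlockAllocator current = { NULL, base_malloc, base_realloc, base_free };
static SketchBlockAllocator saved;

static void sketch_install_hook(void)
{
    saved = current;       /* plays the role of PyMem_GetAllocator(&hook.mem) */
    current.ctx = &saved;  /* plays the role of alloc.ctx = &hook.mem */
    current.malloc = hook_malloc;
    current.realloc = hook_realloc;
    current.free = hook_free;
}
```

Because the previous allocator travels through ``ctx``, the same three hook functions can be installed on several domains, each with its own saved structure.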
@ -366,8 +373,8 @@ Replace the 6 functions:
with 2 functions with an additional *domain* argument: with 2 functions with an additional *domain* argument:
* ``int Py_GetAllocator(int domain, PyMemBlockAllocator *allocator)`` * ``int PyMem_GetBlockAllocator(int domain, PyMemBlockAllocator *allocator)``
* ``int Py_SetAllocator(int domain, PyMemBlockAllocator *allocator)`` * ``int PyMem_SetBlockAllocator(int domain, PyMemBlockAllocator *allocator)``
These functions return 0 on success, or -1 if the domain is unknown. These functions return 0 on success, or -1 if the domain is unknown.
@ -377,6 +384,8 @@ where domain is one of these values:
* ``PYALLOC_PYMEM_RAW`` * ``PYALLOC_PYMEM_RAW``
* ``PYALLOC_PYOBJECT`` * ``PYALLOC_PYOBJECT``
Drawback: the caller has to check if the result is 0, or handle the error.
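A minimal sketch of what such a domain-based pair could look like, to make the drawback concrete. The ``SKETCH_*`` constants, the ``sketch_*`` functions and the local ``SketchBlockAllocator`` type are hypothetical stand-ins, not names from the PEP or from CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the PyMemBlockAllocator layout (assumption). */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void *(*realloc)(void *ctx, void *ptr, size_t new_size);
    void (*free)(void *ctx, void *ptr);
} SketchBlockAllocator;

/* Hypothetical domain identifiers. */
enum {
    SKETCH_DOMAIN_RAW,
    SKETCH_DOMAIN_MEM,
    SKETCH_DOMAIN_OBJ,
    SKETCH_DOMAIN_COUNT
};

/* One allocator slot per domain. */
static SketchBlockAllocator sketch_allocators[SKETCH_DOMAIN_COUNT];

/* Return 0 on success, -1 if the domain is unknown. */
static int sketch_get_allocator(int domain, SketchBlockAllocator *allocator)
{
    if (domain < 0 || domain >= SKETCH_DOMAIN_COUNT)
        return -1;
    *allocator = sketch_allocators[domain];
    return 0;
}

/* Return 0 on success, -1 if the domain is unknown. */
static int sketch_set_allocator(int domain, const SketchBlockAllocator *allocator)
{
    if (domain < 0 || domain >= SKETCH_DOMAIN_COUNT)
        return -1;
    sketch_allocators[domain] = *allocator;
    return 0;
}
```

Every call site now has a failure path to handle, whereas the six dedicated functions return ``void`` and cannot fail.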
PyMem_Malloc() reuses PyMem_RawMalloc() by default PyMem_Malloc() reuses PyMem_RawMalloc() by default
-------------------------------------------------- --------------------------------------------------
@ -385,12 +394,11 @@ PyMem_Malloc() reuses PyMem_RawMalloc() by default
``PyMem_SetRawAllocator()`` would also patch ``PyMem_Malloc()`` ``PyMem_SetRawAllocator()`` would also patch ``PyMem_Malloc()``
indirectly. indirectly.
Such change is less optimal, it adds another level of indirection. .. note::
In the proposed implementation of this PEP (issue #3329), ``PyMem_RawMalloc()`` In the implementation of this PEP (issue #3329),
calls directly ``malloc()``, whereas ``PyMem_Malloc()`` returns ``NULL`` if ``PyMem_RawMalloc(0)`` calls ``malloc(0)``,
size is larger than ``PY_SSIZE_T_MAX``, and the default allocator of whereas ``PyMem_Malloc(0)`` calls ``malloc(1)``.
``PyMem_Malloc()`` calls ``malloc(1)`` if the size is zero.
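The zero-size behaviour matters because the C library may return *NULL* for ``malloc(0)``, which a caller cannot tell apart from an out-of-memory error. A minimal sketch of an allocator that requests one byte instead is shown below. The ``sketch_*`` names are hypothetical and the ``ctx`` argument is unused here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: never pass 0 to malloc(), so that NULL
   always means "allocation failed". */
static void *sketch_malloc_nonzero(void *ctx, size_t size)
{
    (void)ctx;
    if (size == 0)
        size = 1;
    return malloc(size);
}

/* Hypothetical helper: same rule for realloc(). A zero size shrinks
   the block to one byte instead of possibly freeing it. */
static void *sketch_realloc_nonzero(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    if (new_size == 0)
        new_size = 1;
    return realloc(ptr, new_size);
}
```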
Add a new PYDEBUGMALLOC environment variable Add a new PYDEBUGMALLOC environment variable
@ -422,30 +430,58 @@ Extensions modules don't have to be recompiled with macros.
Pass the C filename and line number Pass the C filename and line number
----------------------------------- -----------------------------------
Use C macros using ``__FILE__`` and ``__LINE__`` to get the C filename Define allocator functions using macros and use ``__FILE__`` and ``__LINE__``
and line number of a memory allocation. to get the C filename and line number of a memory allocation.
Example::
typedef struct {
/* user context passed as the first argument to the 3 functions */
void *ctx;
/* allocate a memory block */
void* (*malloc) (void *ctx, const char *filename, int lineno,
size_t size);
/* allocate or resize a memory block */
void* (*realloc) (void *ctx, const char *filename, int lineno,
void *ptr, size_t new_size);
/* release a memory block */
void (*free) (void *ctx, const char *filename, int lineno,
void *ptr);
} PyMemBlockAllocator;
void* _PyMem_MallocTrace(const char *filename, int lineno, size_t size);
/* need also a function for the Python stable ABI */
void* PyMem_Malloc(size_t size);
#define PyMem_Malloc(size) _PyMem_MallocTrace(__FILE__, __LINE__, size)
Passing a filename and a line number to each allocator makes the API more Passing a filename and a line number to each allocator makes the API more
complex: pass 3 new arguments, instead of just a context argument, to each complex: pass 3 new arguments, instead of just a context argument, to each
allocator function. The GC allocator functions should also be patched. allocator function. The GC allocator functions should also be patched.
``_PyObject_GC_Malloc()`` is used in many C functions for example and so For example, ``_PyObject_GC_Malloc()`` is used in many C functions and so
objects of different types would have the same allocation location. Such objects of different types would have the same allocation location. Such
changes add too much complexity for a little gain. changes add too much complexity for a little gain.
PyMem_Malloc() GIL-free GIL-free PyMem_Malloc()
----------------------- -----------------------
``PyMem_Malloc()`` must be called with the GIL held because in debug mode, it When Python is compiled in debug mode, ``PyMem_Malloc()`` calls indirectly ``PyObject_Malloc()`` which requires the GIL to be held.
calls indirectly ``PyObject_Malloc()`` which requires the GIL to be held. This That's why ``PyMem_Malloc()`` must be called with the GIL held.
PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call ``malloc()``.
So the "GIL must be held" restriction may be removed no ``PyMem_Malloc()``. This PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call
``malloc()``. So the "GIL must be held" restriction may be removed from
``PyMem_Malloc()``.
Allowing to call ``PyMem_Malloc()`` without holding the GIL might break Allowing to call ``PyMem_Malloc()`` without holding the GIL might break
applications which setup their own allocator or their allocator hooks. Holding applications which setup their own allocators or allocator hooks. Holding the
the GIL is very convenient to develop a custom allocator: no need to care of GIL is convenient to develop a custom allocator: no need to care of other
other threads nor mutexes. It is also convenient for an allocator hook: Python threads. It is also convenient for a debug allocator hook: Python internal
internals can be safely inspected. objects can be safely inspected.
Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected
behaviour, especially at Python startup and at creation of a new Python thread behaviour, especially at Python startup and at creation of a new Python thread
@ -458,10 +494,11 @@ Don't add PyMem_RawMalloc()
Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held. Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held.
Otherwise, keep ``malloc()`` unchanged. Otherwise, keep ``malloc()`` unchanged.
The ``PyMem_Malloc()`` is sometimes already misused. For example, the The ``PyMem_Malloc()`` is used without the GIL held in some Python functions.
``main()`` and ``Py_Main()`` functions of Python call ``PyMem_Malloc()`` For example, the ``main()`` and ``Py_Main()`` functions of Python call
whereas the GIL does not exist yet. In this case, ``PyMem_Malloc()`` should ``PyMem_Malloc()`` whereas the GIL does not exist yet. In this case,
be replaced with ``malloc()`` (or ``PyMem_RawMalloc()``). ``PyMem_Malloc()`` should be replaced with ``malloc()`` (or
``PyMem_RawMalloc()``).
If a hook is used to track the memory usage, the ``malloc()`` memory will not If a hook is used to track the memory usage, the ``malloc()`` memory will not
be seen. Remaining ``malloc()`` may allocate a lot of memory and so would be be seen. Remaining ``malloc()`` may allocate a lot of memory and so would be
@ -478,15 +515,15 @@ There are many existing debug tools to analyze the memory. Some examples:
`failmalloc <http://www.nongnu.org/failmalloc/>`_, `failmalloc <http://www.nongnu.org/failmalloc/>`_,
etc. etc.
The problem is retrieve the Python object related to a memory pointer to read The problem is to retrieve the Python object related to a memory pointer to read
its type and/or content. Another issue is to retrieve the location of the its type and/or content. Another issue is to retrieve the location of the
memory allocation: the C backtrace is usually useless (same reasoning as memory allocation: the C backtrace is usually useless (same reasoning as
macros using ``__FILE__`` and ``__LINE__``), the Python filename and line macros using ``__FILE__`` and ``__LINE__``), the Python filename and line
number (or even the Python traceback) is more useful. number (or even the Python traceback) is more useful.
Classic tools are unable to introspect the Python internal to collect such Classic tools are unable to introspect Python internals to collect such
information. Being able to set up a hook on allocators called with the GIL held information. Being able to set up a hook on allocators called with the GIL held
allows reading a lot of useful data from Python internals. allows collecting a lot of useful data from Python internals.
Add msize() Add msize()
@ -514,22 +551,31 @@ Simplify the signature of allocator functions, remove the context argument:
* ``void free(void *ptr)`` * ``void free(void *ptr)``
It is likely for an allocator hook to be reused for ``PyMem_SetAllocator()`` It is likely for an allocator hook to be reused for ``PyMem_SetAllocator()``
and ``PyObject_SetAllocator()``, but the hook must call a different function and ``PyObject_SetAllocator()``, or even ``PyMem_SetRawAllocator()``, but the
depending on the allocator. The context is a convenient way to reuse the same hook must call a different function depending on the allocator. The context is
allocator or hook for different APIs. a convenient way to reuse the same custom allocator or hook for different
Python allocators.
External libraries External libraries
================== ==================
* glib: `g_mem_set_vtable() Python should try to reuse the same prototypes for allocator functions as
<http://developer.gnome.org/glib/unstable/glib-Memory-Allocation.html#g-mem-set-vtable>`_ other libraries.
Libraries used by Python:
* OpenSSL: `CRYPTO_set_mem_functions() * OpenSSL: `CRYPTO_set_mem_functions()
<http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/mem.c;h=f7984fa958eb1edd6c61f6667f3f2b29753be662;hb=HEAD#l124>`_ <http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/mem.c;h=f7984fa958eb1edd6c61f6667f3f2b29753be662;hb=HEAD#l124>`_
to set memory management functions globally to set memory management functions globally
* expat: `parserCreate() * expat: `parserCreate()
<http://hg.python.org/cpython/file/cc27d50bd91a/Modules/expat/xmlparse.c#l724>`_ <http://hg.python.org/cpython/file/cc27d50bd91a/Modules/expat/xmlparse.c#l724>`_
has a per-instance memory handler has a per-instance memory handler
Other libraries:
* glib: `g_mem_set_vtable()
<http://developer.gnome.org/glib/unstable/glib-Memory-Allocation.html#g-mem-set-vtable>`_
* libxml2: `xmlGcMemSetup() <http://xmlsoft.org/html/libxml-xmlmemory.html>`_, * libxml2: `xmlGcMemSetup() <http://xmlsoft.org/html/libxml-xmlmemory.html>`_,
global global
@ -555,8 +601,8 @@ and it is contiguous. On Windows, the heap is handled by ``HeapAlloc()`` and
may be discontiguous. Memory mappings are handled by ``mmap()`` on UNIX and may be discontiguous. Memory mappings are handled by ``mmap()`` on UNIX and
``VirtualAlloc()`` on Windows, they may be discontiguous. ``VirtualAlloc()`` on Windows, they may be discontiguous.
Releasing a memory mapping immediately gives the memory back to the system. For Releasing a memory mapping immediately gives the memory back to the system. On
the heap, memory is only given back to the system if it is at the end of the UNIX, heap memory is only given back to the system if it is at the end of the
heap. Otherwise, the memory will only be given back to the system when all the heap. Otherwise, the memory will only be given back to the system when all the
memory located after the released memory is also released. memory located after the released memory is also released.
@ -564,24 +610,24 @@ To allocate memory in the heap, the allocator tries to reuse free space. If
there is no contiguous space big enough, the heap must be increased, even if we there is no contiguous space big enough, the heap must be increased, even if we
have more free space than required size. This issue is called the "memory have more free space than required size. This issue is called the "memory
fragmentation": the memory usage seen by the system may be much higher than fragmentation": the memory usage seen by the system may be much higher than
real usage. real usage. On Windows, ``HeapAlloc()`` creates a new memory mapping with
On Windows, ``HeapAlloc()`` creates a new memory mapping with
``VirtualAlloc()`` if there is not enough free contiguous memory. ``VirtualAlloc()`` if there is not enough free contiguous memory.
CPython has a pymalloc allocator using arenas of 256 KB for allocations smaller CPython has a *pymalloc* allocator for allocations smaller than 512 bytes. This
than 512 bytes. This allocator is optimized for small objects with a short allocator is optimized for small objects with a short lifetime. It uses memory
lifetime. mappings called "arenas" with a fixed size of 256 KB.
Windows provides a `Low-fragmentation Heap Other allocators:
<http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750%28v=vs.85%29.aspx>`_.
The Linux kernel uses `slab allocation * Windows provides a `Low-fragmentation Heap
<http://en.wikipedia.org/wiki/Slab_allocation>`_. <http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750%28v=vs.85%29.aspx>`_.
The glib library has a `Memory Slice API * The Linux kernel uses `slab allocation
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_: <http://en.wikipedia.org/wiki/Slab_allocation>`_.
efficient way to allocate groups of equal-sized chunks of memory
* The glib library has a `Memory Slice API
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
efficient way to allocate groups of equal-sized chunks of memory
Links Links