* add PyMemAllocatorDomain enum: PYALLOC_PYMEM_RAW, PYALLOC_PYMEM or
   PYALLOC_PYOBJECT
 * rename:

   - PyMemBlockAllocator structure => PyMemAllocator
   - PyMem_GetMappingAllocator() => PyObject_GetArenaAllocator()
   - PyMemMappingAllocator structure => PyObjectArenaAllocator
   - PyMem_SetMappingAllocator() => PyObject_SetArenaAllocator()

 * group get/set functions to only keep 2 functions:
   PyMem_GetAllocator() and PyMem_SetAllocator()
 * PyMem_RawMalloc(0) now calls malloc(1) to have a well defined behaviour
 * PYALLOC_PYMEM_RAW and PYALLOC_PYMEM are now using exactly the same allocator
 * Add more references for external libraries
Victor Stinner 2013-06-20 13:20:58 +02:00
parent f4a21c42e6
commit 682a7fe994
1 changed files with 157 additions and 144 deletions


@ -40,18 +40,20 @@ Use cases:
Proposal
========
API changes
-----------
New functions and new structure
-------------------------------
* Add new GIL-free (no need to hold the GIL) memory allocator functions:
* Add a new GIL-free (no need to hold the GIL) memory allocator:
- ``void* PyMem_RawMalloc(size_t size)``
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
- ``void PyMem_RawFree(void *ptr)``
- the behaviour of requesting zero bytes is not defined: return *NULL*
or a distinct non-*NULL* pointer depending on the platform.
- The newly allocated memory will not have been initialized in any
way.
- Requesting zero bytes returns a distinct non-*NULL* pointer if
possible, as if ``PyMem_Malloc(1)`` had been called instead.
* Add a new ``PyMemBlockAllocator`` structure::
* Add a new ``PyMemAllocator`` structure::
typedef struct {
/* user context passed as the first argument
@ -66,69 +68,70 @@ API changes
/* release a memory block */
void (*free) (void *ctx, void *ptr);
} PyMemBlockAllocator;
} PyMemAllocator;
* Add new functions to get and set internal functions of
``PyMem_RawMalloc()``, ``PyMem_RawRealloc()`` and ``PyMem_RawFree()``:
* Add a new ``PyMemAllocatorDomain`` enum to choose the Python
allocator domain. Domains:
- ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
- ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
- default allocator: ``malloc()``, ``realloc()``, ``free()``
- ``PYALLOC_PYMEM_RAW``: ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()``
and ``PyMem_RawFree()``
* Add new functions to get and set internal functions of
``PyMem_Malloc()``, ``PyMem_Realloc()`` and ``PyMem_Free()``:
- ``PYALLOC_PYMEM``: ``PyMem_Malloc()``, ``PyMem_Realloc()`` and
``PyMem_Free()``
- ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
- ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
- ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return
*NULL*: it would be treated as an error.
- default allocator: ``malloc()``, ``realloc()``, ``free()``;
``PyMem_Malloc(0)`` calls ``malloc(1)``
and ``PyMem_Realloc(NULL, 0)`` calls ``realloc(NULL, 1)``
- ``PYALLOC_PYOBJECT``: ``PyObject_Malloc()``, ``PyObject_Realloc()``
and ``PyObject_Free()``
* Add new functions to get and set internal functions of
``PyObject_Malloc()``, ``PyObject_Realloc()`` and
``PyObject_Free()``:
* Add new functions to get and set memory allocators:
- ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
- ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
- ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return
*NULL*: it would be treated as an error.
- default allocator: the *pymalloc* allocator
- ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
- ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
- The new allocator must return a distinct non-*NULL* pointer when
requesting zero bytes
* Add a new ``PyMemMappingAllocator`` structure::
* Add a new ``PyObjectArenaAllocator`` structure::
typedef struct {
/* user context passed as the first argument
to the 2 functions */
void *ctx;
/* allocate a memory mapping */
/* allocate an arena */
void* (*alloc) (void *ctx, size_t size);
/* release a memory mapping */
/* release an arena */
void (*free) (void *ctx, void *ptr, size_t size);
} PyMemMappingAllocator;
} PyObjectArenaAllocator;
* Add a new function to get and set the memory mapping allocator:
* Add new functions to get and set the arena allocator used by
*pymalloc*:
- ``void PyMem_GetMappingAllocator(PyMemMappingAllocator *allocator)``
- ``void PyMem_SetMappingAllocator(PyMemMappingAllocator *allocator)``
- Currently, this allocator is only used internally by *pymalloc* to
allocate arenas.
- ``void PyObject_GetArenaAllocator(PyObjectArenaAllocator *allocator)``
- ``void PyObject_SetArenaAllocator(PyObjectArenaAllocator *allocator)``
* Add a new function to setup the builtin Python debug hooks when memory
allocators are replaced:
* Add a new function to setup the builtin Python debug hooks when a
memory allocator is replaced:
- ``void PyMem_SetupDebugHooks(void)``
- the function does nothing if Python is not compiled in
debug mode
- the function does nothing if Python is not compiled in debug mode
* The following memory allocators always return *NULL* if size is
greater than ``PY_SSIZE_T_MAX`` (check before calling the internal
function): ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()``,
``PyMem_Malloc()``, ``PyMem_Realloc()``, ``PyObject_Malloc()``,
``PyObject_Realloc()``.
* Memory allocators always return *NULL* if size is greater than
``PY_SSIZE_T_MAX``. The check is done before calling the
inner function.
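The two rules in the list above (a zero-byte request yields a distinct non-*NULL* pointer, and sizes above ``PY_SSIZE_T_MAX`` are rejected before the inner function is called) can be sketched as follows. This is only an illustration under stated assumptions, not the CPython implementation: it does not include ``Python.h``, and ``demo_allocator``, ``DEMO_SSIZE_T_MAX``, ``demo_malloc()``, ``demo_realloc()`` and ``demo_free()`` are hypothetical stand-ins for ``PyMemAllocator``, ``PY_SSIZE_T_MAX`` and the Python wrapper functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PY_SSIZE_T_MAX (assumption of this sketch). */
#define DEMO_SSIZE_T_MAX ((size_t)PTRDIFF_MAX)

/* Hypothetical stand-in mirroring the layout of PyMemAllocator. */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);
    void (*free) (void *ctx, void *ptr);
} demo_allocator;

/* Inner functions: request 1 byte instead of 0 so that the result is
   a distinct non-NULL pointer on every platform. */
static void* inner_malloc(void *ctx, size_t size)
{
    (void)ctx;
    if (size == 0)
        size = 1;
    return malloc(size);
}

static void* inner_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    if (new_size == 0)
        new_size = 1;
    return realloc(ptr, new_size);
}

static void inner_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

static demo_allocator demo_default = {
    NULL, inner_malloc, inner_realloc, inner_free
};

/* Outer wrappers: the size check happens before the inner function
   is called, so a replacement allocator never sees oversized requests. */
void* demo_malloc(size_t size)
{
    assert(demo_default.malloc != NULL);
    if (size > DEMO_SSIZE_T_MAX)
        return NULL;
    return demo_default.malloc(demo_default.ctx, size);
}

void* demo_realloc(void *ptr, size_t new_size)
{
    if (new_size > DEMO_SSIZE_T_MAX)
        return NULL;
    return demo_default.realloc(demo_default.ctx, ptr, new_size);
}

void demo_free(void *ptr)
{
    demo_default.free(demo_default.ctx, ptr);
}
```

With this layout, two calls to ``demo_malloc(0)`` return two different non-*NULL* pointers, and ``demo_malloc(SIZE_MAX)`` returns *NULL* without reaching ``inner_malloc()``.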
The *pymalloc* allocator is optimized for objects smaller than 512 bytes
with a short lifetime. It uses memory mappings with a fixed size of 256
KB called "arenas".
Default allocators:
* ``PYALLOC_PYMEM_RAW``, ``PYALLOC_PYMEM``: ``malloc()``,
``realloc()``, ``free()`` (and *ctx* is NULL); call ``malloc(1)`` when
requesting zero bytes
* ``PYALLOC_PYOBJECT``: *pymalloc* allocator which falls back on
``PyMem_Malloc()`` for allocations larger than 512 bytes
* *pymalloc* arena allocator: ``mmap()``, ``munmap()`` (and *ctx* is
NULL), or ``malloc()`` and ``free()`` if ``mmap()`` is not available
The builtin Python debug hooks were introduced in Python 2.3 and
implement the following checks:
@ -141,23 +144,34 @@ implement the following checks:
* Detect write after the end of the buffer (buffer overflow)
Other changes
-------------
Don't call malloc() directly anymore
------------------------------------
* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()``
and ``realloc()``, instead of calling ``PyObject_Malloc()`` and
``PyObject_Realloc()`` in debug mode
``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()`` and
``realloc()``, instead of calling ``PyObject_Malloc()`` and
``PyObject_Realloc()`` in debug mode.
* ``PyObject_Malloc()`` falls back on ``PyMem_Malloc()`` instead of
``malloc()`` if size is greater than or equal to
``SMALL_REQUEST_THRESHOLD`` (512 bytes), and ``PyObject_Realloc()``
falls back on ``PyMem_Realloc()`` instead of ``realloc()``
``PyObject_Malloc()`` falls back on ``PyMem_Malloc()`` instead of
``malloc()`` if size is greater than or equal to 512 bytes, and
``PyObject_Realloc()`` falls back on ``PyMem_Realloc()`` instead of
``realloc()``
* Replace direct calls to ``malloc()`` with ``PyMem_Malloc()``, or
``PyMem_RawMalloc()`` if the GIL is not held
Replace direct calls to ``malloc()`` with ``PyMem_Malloc()``, or
``PyMem_RawMalloc()`` if the GIL is not held.
* Configure external libraries like zlib or OpenSSL to allocate memory
using ``PyMem_RawMalloc()``
Configure external libraries like zlib or OpenSSL to allocate memory
using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a
library can only be replaced globally, the allocator is not replaced if
Python is embedded in an application.
For the "track memory usage" use case, it is important to track memory
allocated in external libraries to have accurate reports, because these
allocations may be large.
If a hook is used to track memory usage, the memory allocated by
``malloc()`` will not be tracked. Remaining ``malloc()`` calls in external
libraries like OpenSSL or bz2 may allocate large memory blocks and so
would be missed in memory usage reports.
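As a rough sketch of how a library with per-instance allocator callbacks could be routed through the raw Python allocator: zlib's ``z_stream`` exposes ``zalloc``, ``zfree`` and ``opaque`` fields for this purpose. The code below only imitates that callback shape so that it compiles without zlib or ``Python.h``. ``demo_stream``, ``raw_malloc()``, ``raw_free()`` and the live-block counter are hypothetical names introduced for the illustration; in real code the two ``raw_*`` functions would be ``PyMem_RawMalloc()`` and ``PyMem_RawFree()``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PyMem_RawMalloc() / PyMem_RawFree().
   They count live blocks so the effect of the redirection is visible. */
static size_t live_blocks = 0;

static void* raw_malloc(size_t size)
{
    void *ptr = malloc(size ? size : 1);
    if (ptr != NULL)
        live_blocks++;
    return ptr;
}

static void raw_free(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;
    free(ptr);
}

/* Hypothetical library handle, modeled on zlib-style per-instance
   callbacks: an opaque pointer is passed back to both functions. */
typedef struct {
    void* (*alloc) (void *opaque, unsigned int items, unsigned int size);
    void (*release) (void *opaque, void *address);
    void *opaque;
} demo_stream;

static void* lib_alloc(void *opaque, unsigned int items, unsigned int size)
{
    (void)opaque;
    /* Reject requests whose byte count would overflow size_t. */
    if (size != 0 && items > SIZE_MAX / size)
        return NULL;
    return raw_malloc((size_t)items * size);
}

static void lib_free(void *opaque, void *address)
{
    (void)opaque;
    raw_free(address);
}

/* Configure one library instance; other instances and the rest of the
   process keep whatever allocator they already use. */
void demo_stream_init(demo_stream *strm)
{
    assert(strm != NULL);
    strm->alloc = lib_alloc;
    strm->release = lib_free;
    strm->opaque = NULL;
}

size_t demo_live_blocks(void)
{
    return live_blocks;
}
```

Because the callbacks are attached to one handle rather than installed globally, this pattern stays safe when Python is embedded in an application that also uses the same library.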
Examples
@ -171,8 +185,8 @@ and 10 bytes per memory mapping::
#include <stdlib.h>
int block_padding = 2;
int mapping_padding = 10;
int alloc_padding = 2;
int arena_padding = 10;
void* my_malloc(void *ctx, size_t size)
{
@ -191,49 +205,49 @@ and 10 bytes per memory mapping::
free(ptr);
}
void* my_alloc_mapping(void *ctx, size_t size)
void* my_alloc_arena(void *ctx, size_t size)
{
int padding = *(int *)ctx;
return malloc(size + padding);
}
void my_free_mapping(void *ctx, void *ptr, size_t size)
void my_free_arena(void *ctx, void *ptr, size_t size)
{
free(ptr);
}
void setup_custom_allocator(void)
{
PyMemBlockAllocator block;
PyMemMappingAllocator mapping;
PyMemAllocator alloc;
PyObjectArenaAllocator arena;
block.ctx = &block_padding;
block.malloc = my_malloc;
block.realloc = my_realloc;
block.free = my_free;
alloc.ctx = &alloc_padding;
alloc.malloc = my_malloc;
alloc.realloc = my_realloc;
alloc.free = my_free;
PyMem_SetRawAllocator(&block);
PyMem_SetAllocator(&block);
PyMem_SetAllocator(PYALLOC_PYMEM_RAW, &alloc);
PyMem_SetAllocator(PYALLOC_PYMEM, &alloc);
mapping.ctx = &mapping_padding;
mapping.alloc = my_alloc_mapping;
mapping.free = my_free_mapping;
PyMem_SetMappingAllocator(mapping);
arena.ctx = &arena_padding;
arena.alloc = my_alloc_arena;
arena.free = my_free_arena;
PyObject_SetArenaAllocator(&arena);
PyMem_SetupDebugHooks();
}
.. warning::
Remove the call ``PyMem_SetRawAllocator(&alloc)`` if the new
allocator is not thread-safe.
Remove the call ``PyMem_SetAllocator(PYALLOC_PYMEM_RAW, &alloc)`` if
the new allocator is not thread-safe.
Use case 2: Replace Memory Allocator, override pymalloc
--------------------------------------------------------
If your allocator is optimized for allocation of small objects (less
than 512 bytes) with a short lifetime, pymalloc can be overridden
(replace ``PyObject_Malloc()``).
If your allocator is optimized for allocations of objects smaller than
512 bytes with a short lifetime, pymalloc can be overridden (replace
``PyObject_Malloc()``).
Dummy example wasting 2 bytes per memory block::
@ -260,22 +274,22 @@ Dummy example wasting 2 bytes per memory block::
void setup_custom_allocator(void)
{
PyMemBlockAllocator alloc;
PyMemAllocator alloc;
alloc.ctx = &padding;
alloc.malloc = my_malloc;
alloc.realloc = my_realloc;
alloc.free = my_free;
PyMem_SetRawAllocator(&alloc);
PyMem_SetAllocator(&alloc);
PyObject_SetAllocator(&alloc);
PyMem_SetAllocator(PYALLOC_PYMEM_RAW, &alloc);
PyMem_SetAllocator(PYALLOC_PYMEM, &alloc);
PyMem_SetAllocator(PYALLOC_PYOBJECT, &alloc);
PyMem_SetupDebugHooks();
}
.. warning::
Remove the call ``PyMem_SetRawAllocator(&alloc)`` if the new
allocator is not thread-safe.
Remove the call ``PyMem_SetAllocator(PYALLOC_PYMEM_RAW, &alloc)`` if
the new allocator is not thread-safe.
@ -285,15 +299,15 @@ Use case 3: Setup Allocator Hooks
Example to setup hooks on all memory allocators::
struct {
PyMemBlockAllocator raw;
PyMemBlockAllocator mem;
PyMemBlockAllocator obj;
PyMemAllocator raw;
PyMemAllocator mem;
PyMemAllocator obj;
/* ... */
} hook;
static void* hook_malloc(void *ctx, size_t size)
{
PyMemBlockAllocator *alloc = (PyMemBlockAllocator *)ctx;
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
/* ... */
ptr = alloc->malloc(alloc->ctx, size);
/* ... */
@ -302,7 +316,7 @@ Example to setup hooks on all memory allocators::
static void* hook_realloc(void *ctx, void *ptr, size_t new_size)
{
PyMemBlockAllocator *alloc = (PyMemBlockAllocator *)ctx;
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
void *ptr2;
/* ... */
ptr2 = alloc->realloc(alloc->ctx, ptr, new_size);
@ -312,7 +326,7 @@ Example to setup hooks on all memory allocators::
static void hook_free(void *ctx, void *ptr)
{
PyMemBlockAllocator *alloc = (PyMemBlockAllocator *)ctx;
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
/* ... */
alloc->free(alloc->ctx, ptr);
/* ... */
@ -320,7 +334,7 @@ Example to setup hooks on all memory allocators::
void setup_hooks(void)
{
PyMemBlockAllocator alloc;
PyMemAllocator alloc;
static int installed = 0;
if (installed)
@ -330,27 +344,28 @@ Example to setup hooks on all memory allocators::
alloc.malloc = hook_malloc;
alloc.realloc = hook_realloc;
alloc.free = hook_free;
PyMem_GetAllocator(PYALLOC_PYMEM_RAW, &hook.raw);
PyMem_GetAllocator(PYALLOC_PYMEM, &hook.mem);
PyMem_GetAllocator(PYALLOC_PYOBJECT, &hook.obj);
PyMem_GetRawAllocator(&hook.raw);
alloc.ctx = &hook.raw;
PyMem_SetRawAllocator(&alloc);
PyMem_SetAllocator(PYALLOC_PYMEM_RAW, &alloc);
PyMem_GetAllocator(&hook.mem);
alloc.ctx = &hook.mem;
PyMem_SetAllocator(&alloc);
PyMem_SetAllocator(PYALLOC_PYMEM, &alloc);
PyObject_GetAllocator(&hook.obj);
alloc.ctx = &hook.obj;
PyObject_SetAllocator(&alloc);
PyMem_SetAllocator(PYALLOC_PYOBJECT, &alloc);
}
.. warning::
Remove the call ``PyMem_SetRawAllocator(&alloc)`` if hooks are not
thread-safe.
Remove the call ``PyMem_SetAllocator(PYALLOC_PYMEM_RAW, &alloc)`` if
hooks are not thread-safe.
.. note::
``PyMem_SetupDebugHooks()`` does not need to be called: Python debug
hooks are installed automatically at startup.
``PyMem_SetupDebugHooks()`` does not need to be called because the
allocator is not replaced: Python debug hooks are installed
automatically at startup.
Performances
@ -369,32 +384,22 @@ The full reports are attached to the issue #3329.
Alternatives
============
Only one get/set function for block allocators
----------------------------------------------
More specific functions to get/set memory allocators
----------------------------------------------------
Replace the 6 functions:
Replace the 2 functions:
* ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
* ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
with 2 functions with an additional *domain* argument:
with:
* ``int PyMem_GetBlockAllocator(int domain, PyMemBlockAllocator *allocator)``
* ``int PyMem_SetBlockAllocator(int domain, PyMemBlockAllocator *allocator)``
These functions return 0 on success, or -1 if the domain is unknown.
where domain is one of these values:
* ``PYALLOC_PYMEM``
* ``PYALLOC_PYMEM_RAW``
* ``PYALLOC_PYOBJECT``
Drawback: the caller has to check if the result is 0, or handle the error.
* ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)``
* ``void PyMem_GetAllocator(PyMemAllocator *allocator)``
* ``void PyObject_GetAllocator(PyMemAllocator *allocator)``
* ``void PyMem_SetRawAllocator(PyMemAllocator *allocator)``
* ``void PyMem_SetAllocator(PyMemAllocator *allocator)``
* ``void PyObject_SetAllocator(PyMemAllocator *allocator)``
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
@ -404,12 +409,6 @@ Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
calling ``PyMem_SetRawAllocator()`` would also patch
``PyMem_Malloc()`` indirectly.
.. note::
In the implementation of this PEP (issue #3329),
``PyMem_RawMalloc(0)`` calls ``malloc(0)``,
whereas ``PyMem_Malloc(0)`` calls ``malloc(1)``.
Add a new PYDEBUGMALLOC environment variable
--------------------------------------------
@ -445,7 +444,7 @@ Define allocator functions as macros using ``__FILE__`` and ``__LINE__``
to get the C filename and line number of a memory allocation.
Example of ``PyMem_Malloc`` macro with the modified
``PyMemBlockAllocator`` structure::
``PyMemAllocator`` structure::
typedef struct {
/* user context passed as the first argument
@ -463,7 +462,7 @@ Example of ``PyMem_Malloc`` macro with the modified
/* release a memory block */
void (*free) (void *ctx, const char *filename, int lineno,
void *ptr);
} PyMemBlockAllocator;
} PyMemAllocator;
void* _PyMem_MallocTrace(const char *filename, int lineno,
size_t size);
@ -485,12 +484,12 @@ changes add too much complexity for a little gain.
GIL-free PyMem_Malloc()
-----------------------
When Python is compiled in debug mode, ``PyMem_Malloc()`` calls
indirectly ``PyObject_Malloc()`` which requires the GIL to be held.
That's why ``PyMem_Malloc()`` must be called with the GIL held.
In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()``
indirectly calls ``PyObject_Malloc()``, which requires the GIL to be
held. That's why ``PyMem_Malloc()`` must be called with the GIL held.
This PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call
``malloc()``. So the "GIL must be held" restriction may be removed from
This PEP changes ``PyMem_Malloc()``: it now always calls
``malloc()``. The "GIL must be held" restriction can be removed from
``PyMem_Malloc()``.
Allowing to call ``PyMem_Malloc()`` without holding the GIL might break
@ -516,9 +515,10 @@ Python call ``PyMem_Malloc()`` whereas the GIL do not exist yet. In this
case, ``PyMem_Malloc()`` should be replaced with ``malloc()`` (or
``PyMem_RawMalloc()``).
If a hook is used to track memory usage, the ``malloc()`` memory
will not be seen. Remaining ``malloc()`` may allocate a lot of memory
and so would be missed in reports.
If a hook is used to track memory usage, the memory allocated by
direct calls to ``malloc()`` will not be tracked. External libraries
like OpenSSL or bz2 should not call ``malloc()`` directly, so that large
allocations are included in memory usage reports.
Use existing debug tools to analyze the memory
@ -545,8 +545,7 @@ GIL held allow to collect a lot of useful data from Python internals.
Add msize()
-----------
Add another field to ``PyMemBlockAllocator`` and
``PyMemMappingAllocator``::
Add another field to ``PyMemAllocator`` and ``PyObjectArenaAllocator``::
size_t msize(void *ptr);
@ -574,6 +573,8 @@ It is likely for an allocator hook to be reused for
depending on the allocator. The context is a convenient way to reuse the
same custom allocator or hook for different Python allocators.
In C++, the context can be used to pass *this*.
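To make the role of the context argument concrete, here is a small sketch in which one pair of hook functions serves several allocators, each with its own context holding the original allocator and a counter. It assumes a reduced, hypothetical ``demo_allocator`` structure (``realloc`` is omitted for brevity) instead of the real ``PyMemAllocator``, and ``demo_hook``, ``demo_install_hook()`` and ``demo_system_allocator()`` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for PyMemAllocator (no realloc). */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void (*free) (void *ctx, void *ptr);
} demo_allocator;

/* Per-allocator hook state: the wrapped allocator and a counter. */
typedef struct {
    demo_allocator inner;
    size_t nalloc;
} demo_hook;

static void* sys_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void sys_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* The same two functions are shared by every hooked allocator;
   only the context differs. */
static void* hook_malloc(void *ctx, size_t size)
{
    demo_hook *hook = (demo_hook *)ctx;
    void *ptr = hook->inner.malloc(hook->inner.ctx, size);
    if (ptr != NULL)
        hook->nalloc++;
    return ptr;
}

static void hook_free(void *ctx, void *ptr)
{
    demo_hook *hook = (demo_hook *)ctx;
    hook->inner.free(hook->inner.ctx, ptr);
}

/* Save the current allocator in *hook, then redirect *alloc to the
   shared hook functions with hook as the context. */
void demo_install_hook(demo_allocator *alloc, demo_hook *hook)
{
    assert(alloc != NULL && hook != NULL);
    hook->inner = *alloc;
    hook->nalloc = 0;
    alloc->ctx = hook;
    alloc->malloc = hook_malloc;
    alloc->free = hook_free;
}

demo_allocator demo_system_allocator(void)
{
    demo_allocator alloc = {NULL, sys_malloc, sys_free};
    return alloc;
}
```

Installing the hook on two separate ``demo_allocator`` values gives two independent counters, although ``hook_malloc()`` and ``hook_free()`` are written only once.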
External libraries
==================
@ -589,6 +590,15 @@ Libraries used by Python:
* expat: `parserCreate()
<http://hg.python.org/cpython/file/cc27d50bd91a/Modules/expat/xmlparse.c#l724>`_
has a per-instance memory handler
* zlib: `zlib 1.2.8 Manual <http://www.zlib.net/manual.html#Usage>`_,
pass an opaque pointer
* bz2: `bzip2 and libbzip2, version 1.0.5
<http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html>`_,
pass an opaque pointer
* lzma: `LZMA SDK - How to Use
<http://www.asawicki.info/news_1368_lzma_sdk_-_how_to_use.html>`_,
pass an opaque pointer
* libmpdec doesn't have this extra *ctx* parameter
Other libraries:
@ -596,6 +606,9 @@ Other libraries:
<http://developer.gnome.org/glib/unstable/glib-Memory-Allocation.html#g-mem-set-vtable>`_
* libxml2: `xmlGcMemSetup() <http://xmlsoft.org/html/libxml-xmlmemory.html>`_,
global
* Oracle's OCI: `Oracle Call Interface Programmer's Guide,
Release 2 (9.2)
<http://docs.oracle.com/cd/B10501_01/appdev.920/a96584/oci15re4.htm>`_
See also the `GNU libc: Memory Allocation Hooks
<http://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html>`_.