Victor Stinner 2013-06-18 22:33:41 +02:00
parent d197160016
commit fdb230409c
1 changed file with 34 additions and 34 deletions

@ -1,5 +1,5 @@
PEP: 445
Title: Add new APIs to customize memory allocators
Title: Add new APIs to customize Python memory allocators
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
@ -12,7 +12,7 @@ Python-Version: 3.4
Abstract
========
Add new APIs to customize memory allocators.
Add new APIs to customize Python memory allocators.
Rationale
@ -110,7 +110,7 @@ API changes
void (*free) (void *ctx, void *ptr, size_t size);
} PyMemMappingAllocator;
* Add a new function to get and set memory mapping allocator:
* Add a new function to get and set the memory mapping allocator:
- ``void PyMem_GetMappingAllocator(PyMemMappingAllocator *allocator)``
- ``void PyMem_SetMappingAllocator(PyMemMappingAllocator *allocator)``
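The get/set pair above can be used to wrap the current mapping allocator rather than replace it. The following is a minimal sketch, not the PEP's implementation: the structure is copied locally so that the code compiles without Python headers, the two ``PyMem_*MappingAllocator`` functions are stand-ins that only copy a static variable, and ``MappingStats``, ``install_counting_hook`` and ``demo_counting_hook`` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local copy of the structure shown above (assumption: same three fields). */
typedef struct {
    void *ctx;
    void* (*alloc) (void *ctx, size_t size);
    void (*free) (void *ctx, void *ptr, size_t size);
} PyMemMappingAllocator;

/* Stand-ins for the proposed functions: they only copy a static variable. */
static void* default_alloc(void *ctx, size_t size)
{ (void)ctx; return malloc(size); }
static void default_free(void *ctx, void *ptr, size_t size)
{ (void)ctx; (void)size; free(ptr); }
static PyMemMappingAllocator current = { NULL, default_alloc, default_free };

void PyMem_GetMappingAllocator(PyMemMappingAllocator *allocator)
{ *allocator = current; }
void PyMem_SetMappingAllocator(PyMemMappingAllocator *allocator)
{ current = *allocator; }

/* Hypothetical hook state: the previous allocator and a byte counter. */
typedef struct {
    PyMemMappingAllocator previous;
    size_t mapped_bytes;
} MappingStats;

static MappingStats mapping_stats;

static void* counting_alloc(void *ctx, size_t size)
{
    MappingStats *stats = (MappingStats *)ctx;
    void *ptr = stats->previous.alloc(stats->previous.ctx, size);
    if (ptr != NULL)
        stats->mapped_bytes += size;
    return ptr;
}

static void counting_free(void *ctx, void *ptr, size_t size)
{
    MappingStats *stats = (MappingStats *)ctx;
    stats->previous.free(stats->previous.ctx, ptr, size);
    stats->mapped_bytes -= size;
}

/* Save the current allocator, then install the counting wrapper. */
void install_counting_hook(void)
{
    PyMemMappingAllocator hook;
    PyMem_GetMappingAllocator(&mapping_stats.previous);
    mapping_stats.mapped_bytes = 0;
    hook.ctx = &mapping_stats;
    hook.alloc = counting_alloc;
    hook.free = counting_free;
    PyMem_SetMappingAllocator(&hook);
}

/* Allocate and free one mapping through the hook; return the number of
   bytes that were outstanding between the two calls. */
size_t demo_counting_hook(void)
{
    PyMemMappingAllocator allocator;
    void *ptr;
    size_t outstanding;

    install_counting_hook();
    PyMem_GetMappingAllocator(&allocator);
    ptr = allocator.alloc(allocator.ctx, 4096);
    assert(ptr != NULL);
    outstanding = mapping_stats.mapped_bytes;
    allocator.free(allocator.ctx, ptr, 4096);
    return outstanding;
}
```

Because the setter copies the structure, the ``hook`` variable may live on the stack, while the context it points to must outlive the hook.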
@ -141,8 +141,8 @@ implement the following checks:
* Detect write after the end of the buffer (buffer overflow)
Make usage of these new APIs
----------------------------
Other changes
-------------
* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()``
and ``realloc()``, instead of calling ``PyObject_Malloc()`` and
@ -166,12 +166,13 @@ Examples
Use case 1: Replace Memory Allocator, keep pymalloc
----------------------------------------------------
Dummy example wasting 2 bytes per allocation, and 10 bytes per arena::
Dummy example wasting 2 bytes per memory block,
and 10 bytes per memory mapping::
#include <stdlib.h>
int alloc_padding = 2;
int arena_padding = 10;
int block_padding = 2;
int mapping_padding = 10;
void* my_malloc(void *ctx, size_t size)
{
@ -190,13 +191,13 @@ Dummy example wasting 2 bytes per allocation, and 10 bytes per arena::
free(ptr);
}
void* my_alloc_arena(void *ctx, size_t size)
void* my_alloc_mapping(void *ctx, size_t size)
{
int padding = *(int *)ctx;
return malloc(size + padding);
}
void my_free_arena(void *ctx, void *ptr, size_t size)
void my_free_mapping(void *ctx, void *ptr, size_t size)
{
free(ptr);
}
@ -206,7 +207,7 @@ Dummy example wasting 2 bytes per allocation, and 10 bytes per arena::
PyMemBlockAllocator block;
PyMemMappingAllocator mapping;
block.ctx = &alloc_padding;
block.ctx = &block_padding;
block.malloc = my_malloc;
block.realloc = my_realloc;
block.free = my_free;
@ -214,9 +215,9 @@ Dummy example wasting 2 bytes per allocation, and 10 bytes per arena::
PyMem_SetRawAllocator(&block);
PyMem_SetAllocator(&block);
mapping.ctx = &arena_padding;
mapping.alloc = my_alloc_arena;
mapping.free = my_free_arena;
mapping.ctx = &mapping_padding;
mapping.alloc = my_alloc_mapping;
mapping.free = my_free_mapping;
PyMem_SetMappingAllocator(&mapping);
PyMem_SetupDebugHooks();
@ -231,10 +232,10 @@ Use case 2: Replace Memory Allocator, override pymalloc
--------------------------------------------------------
If your allocator is optimized for allocation of small objects (less
than 512 bytes) with a short lifetime, pymalloc can be overridden:
replace ``PyObject_Malloc()``.
than 512 bytes) with a short lifetime, pymalloc can be overridden
(replace ``PyObject_Malloc()``).
Dummy Example wasting 2 bytes per allocation::
Dummy example wasting 2 bytes per memory block::
#include <stdlib.h>
@ -358,8 +359,6 @@ Performances
Results of the `Python benchmarks suite
<http://hg.python.org/benchmarks>`_ (-b 2n3): some tests are 1.04x
faster, some tests are 1.04x slower, significant is between 115 and -191.
I don't understand this output, but I guess that the overhead cannot be
seen with such a test.
Results of pybench benchmark: "+0.1%" slower globally (diff between
-4.9% and +5.6%).
@ -370,17 +369,17 @@ The full reports are attached to the issue #3329.
Alternatives
============
Only have one generic get/set function
--------------------------------------
Only one get/set function for block allocators
----------------------------------------------
Replace the 6 functions:
* ``PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
* ``PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
* ``PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
* ``PyMem_SetRawAllocator(allocator)``
* ``PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
* ``PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``
* ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
* ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
with 2 functions with an additional *domain* argument:
@ -398,8 +397,8 @@ where domain is one of these values:
Drawback: the caller has to check if the result is 0, or handle the error.
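The drawback can be made concrete with a small sketch. Everything here is an assumption made for illustration: the ``demo_*`` functions and ``DEMO_DOMAIN_*`` values are hypothetical names, not the ones proposed by the PEP, the structure is a local stand-in with the fields assigned in the examples, and the convention of returning 0 on success and -1 for an unknown domain is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in with the fields assigned in the examples. */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*realloc) (void *ctx, void *ptr, size_t size);
    void (*free) (void *ctx, void *ptr);
} PyMemBlockAllocator;

/* Hypothetical domain identifiers, not the names used by the PEP. */
enum { DEMO_DOMAIN_RAW, DEMO_DOMAIN_MEM, DEMO_DOMAIN_OBJ, DEMO_DOMAIN_COUNT };

static PyMemBlockAllocator demo_allocators[DEMO_DOMAIN_COUNT];

/* Assumed convention: return 0 on success, -1 if the domain is unknown. */
int demo_get_allocator(int domain, PyMemBlockAllocator *allocator)
{
    assert(allocator != NULL);
    if (domain < 0 || domain >= DEMO_DOMAIN_COUNT)
        return -1;
    *allocator = demo_allocators[domain];
    return 0;
}

int demo_set_allocator(int domain, const PyMemBlockAllocator *allocator)
{
    assert(allocator != NULL);
    if (domain < 0 || domain >= DEMO_DOMAIN_COUNT)
        return -1;
    demo_allocators[domain] = *allocator;
    return 0;
}

static void* demo_malloc(void *ctx, size_t size)
{ (void)ctx; return malloc(size); }
static void* demo_realloc(void *ctx, void *ptr, size_t size)
{ (void)ctx; return realloc(ptr, size); }
static void demo_free(void *ctx, void *ptr)
{ (void)ctx; free(ptr); }

/* Caller side: every call now needs an error check, which the six
   dedicated functions avoid because they cannot fail. */
int demo_roundtrip(int domain)
{
    PyMemBlockAllocator in = { NULL, demo_malloc, demo_realloc, demo_free };
    PyMemBlockAllocator out;

    if (demo_set_allocator(domain, &in) != 0)
        return -1;
    if (demo_get_allocator(domain, &out) != 0)
        return -1;
    return (out.malloc == demo_malloc) ? 0 : -1;
}
```

With the six dedicated functions the domain is fixed by the function name, so there is no invalid value to report and the functions can return ``void``.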
PyMem_Malloc() reuses PyMem_RawMalloc() by default
--------------------------------------------------
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
------------------------------------------------------
``PyMem_Malloc()`` should call ``PyMem_RawMalloc()`` by default. So
calling ``PyMem_SetRawAllocator()`` would also patch
@ -442,11 +441,11 @@ practice. Extensions modules don't have to be recompiled with macros.
Pass the C filename and line number
-----------------------------------
Define allocator functions using macros and use ``__FILE__`` and
``__LINE__`` to get the C filename and line number of a memory
allocation.
Define allocator functions as macros using ``__FILE__`` and ``__LINE__``
to get the C filename and line number of a memory allocation.
Example::
Example of ``PyMem_Malloc`` macro with the modified
``PyMemBlockAllocator`` structure::
typedef struct {
/* user context passed as the first argument
@ -472,7 +471,8 @@ Example::
/* need also a function for the Python stable ABI */
void* PyMem_Malloc(size_t size);
#define PyMem_Malloc(size) _PyMem_MallocTrace(__FILE__, __LINE__, size)
#define PyMem_Malloc(size) \
_PyMem_MallocTrace(__FILE__, __LINE__, size)
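The mechanism can be tried outside of Python with a few lines of C. This is only a sketch of the macro technique: ``demo_malloc_trace``, ``DEMO_MALLOC`` and ``demo_trace`` are hypothetical names, and the traced function simply records the call site in two static variables before calling ``malloc()``.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Call site recorded by the last traced allocation. */
static const char *last_file = NULL;
static int last_line = 0;

/* Hypothetical traced allocator: remember where the call came from,
   then delegate to malloc(). */
void* demo_malloc_trace(const char *filename, int lineno, size_t size)
{
    last_file = filename;
    last_line = lineno;
    return malloc(size);
}

/* __FILE__ and __LINE__ are expanded where the macro is used, so they
   describe the caller and not this definition. */
#define DEMO_MALLOC(size) \
        demo_malloc_trace(__FILE__, __LINE__, size)

/* Return 1 if the recorded call site matches the line of the call. */
int demo_trace(void)
{
    int expected_line = __LINE__ + 1;
    void *ptr = DEMO_MALLOC(16);
    int ok = (ptr != NULL)
             && last_line == expected_line
             && strcmp(last_file, __FILE__) == 0;
    free(ptr);
    return ok;
}
```

The call site is only captured in code compiled against the macro. A caller that reaches the allocator through a plain function cannot supply its own file and line, which is why a function is still needed next to the macro for the stable ABI.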
Passing a filename and a line number to each allocator makes the API more
complex: pass 3 new arguments, instead of just a context argument, to each