From 69f972bb2bd91cd16f51a4a41630a0626efbdede Mon Sep 17 00:00:00 2001 From: Victor Stinner Date: Tue, 18 Jun 2013 14:14:17 +0200 Subject: [PATCH] PEP 445 --- pep-0445.txt | 168 +++++++++++++++++++++++++++++++++------------------ 1 file changed, 110 insertions(+), 58 deletions(-) diff --git a/pep-0445.txt b/pep-0445.txt index a40b4ad4a..8e4f11919 100644 --- a/pep-0445.txt +++ b/pep-0445.txt @@ -34,12 +34,6 @@ Use cases: allocator APIs (builtin Python debug hooks) - force allocation to fail to test handling of ``MemoryError`` exception -API: - -* Setup a custom memory allocator for all memory allocated by Python -* Hook memory allocator functions to call extra code before and/or after - the underlying allocator function - Proposal ======== @@ -47,15 +41,29 @@ Proposal API changes ----------- -* Add a new ``PyMemAllocators`` structure - * Add new GIL-free memory allocator functions: - ``void* PyMem_RawMalloc(size_t size)`` - ``void* PyMem_RawRealloc(void *ptr, size_t new_size)`` - ``void PyMem_RawFree(void *ptr)`` -* Add new functions to get and set memory allocators: +* Add a new ``PyMemAllocators`` structure:: + + typedef struct { + /* user context passed as the first argument to the 3 functions */ + void *ctx; + + /* allocate memory */ + void* (*malloc) (void *ctx, size_t size); + + /* allocate memory or resize a memory buffer */ + void* (*realloc) (void *ctx, void *ptr, size_t new_size); + + /* release memory */ + void (*free) (void *ctx, void *ptr); + } PyMemAllocators; + +* Add new functions to get and set memory block allocators: - ``void PyMem_GetRawAllocators(PyMemAllocators *allocators)`` - ``void PyMem_SetRawAllocators(PyMemAllocators *allocators)`` @@ -63,17 +71,32 @@ API changes - ``void PyMem_SetAllocators(PyMemAllocators *allocators)`` - ``void PyObject_GetAllocators(PyMemAllocators *allocators)`` - ``void PyObject_SetAllocators(PyMemAllocators *allocators)`` + +* Add new functions to get and set memory mapping allocators: + - ``void 
_PyObject_GetArenaAllocators(void **ctx_p, void* (**malloc_p) (void *ctx, size_t size), void (**free_p) (void *ctx, void *ptr, size_t size))`` - ``void _PyObject_SetArenaAllocators(void *ctx, void* (*malloc) (void *ctx, size_t size), void (*free) (void *ctx, void *ptr, size_t size))`` -* Add a new function to setup Python builtin debug hooks when memory +* Add a new function to set up the builtin Python debug hooks when memory allocators are replaced: - ``void PyMem_SetupDebugHooks(void)`` +.. note:: -Use these new APIs ------------------- + The builtin Python debug hooks were introduced in Python 2.3 and implement the + following checks: + + * Newly allocated memory is filled with the byte 0xCB, freed memory is filled + with the byte 0xDB. + * Detect API violations, e.g. ``PyObject_Free()`` called on a memory block + allocated by ``PyMem_Malloc()`` + * Detect writes before the start of the buffer (buffer underflow) + * Detect writes after the end of the buffer (buffer overflow) + + +Make use of these new APIs +-------------------------- * ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()`` and ``realloc()``, instead of calling ``PyObject_Malloc()`` and @@ -156,12 +179,12 @@ bytes per allocation, and 10 bytes per arena:: are not thread-safe. -Use case 2: Replace Memory Allocators, overriding pymalloc ----------------------------------------------------------- +Use case 2: Replace Memory Allocators, override pymalloc +-------------------------------------------------------- If your allocator is optimized for allocation of small objects (less than 512 -bytes) with a short liftime, you can replace override pymalloc (replace -``PyObject_Malloc()``). +bytes) with a short lifetime, pymalloc can be overridden: replace +``PyObject_Malloc()``.
Dummy Example wasting 2 bytes per allocation:: @@ -210,7 +233,7 @@ Dummy Example wasting 2 bytes per allocation:: Use case 3: Setup Allocator Hooks --------------------------------- -Example to setup hooks on memory allocators:: +Example to setup hooks on all memory allocators:: struct { PyMemAllocators pymem; @@ -249,11 +272,11 @@ Example to setup hooks on memory allocators:: void setup_hooks(void) { PyMemAllocators alloc; - static int registered = 0; + static int installed = 0; - if (registered) + if (installed) return; - registered = 1; + installed = 1; alloc.malloc = hook_malloc; alloc.realloc = hook_realloc; @@ -284,30 +307,27 @@ Example to setup hooks on memory allocators:: Performances ============ -The `Python benchmarks suite `_ (-b 2n3): some -tests are 1.04x faster, some tests are 1.04 slower, significant is between 115 -and -191. I don't understand these output, but I guess that the overhead cannot -be seen with such test. +Results of the `Python benchmarks suite `_ (-b +2n3): some tests are 1.04x faster, some tests are 1.04x slower, significant is +between 115 and -191. I don't understand this output, but I guess that the +overhead cannot be seen with such tests. -pybench: "+0.1%" (diff between -4.9% and +5.6%). +Results of the pybench benchmark: "+0.1%" slower globally (diff between -4.9% and ++5.6%). -The full output is attached to the issue #3329. +The full reports are attached to the issue #3329.
Alternatives ============ -Only one get and one set function ---------------------------------- +Only have one generic get/set function +-------------------------------------- Replace the 6 functions: -* ``PyMem_GetRawAllocators()`` -* ``PyMem_GetAllocators()`` -* ``PyObject_GetAllocators()`` -* ``PyMem_SetRawAllocators(allocators)`` -* ``PyMem_SetAllocators(allocators)`` -* ``PyObject_SetAllocators(allocators)`` +* ``PyMem_GetRawAllocators()``, ``PyMem_GetAllocators()``, ``PyObject_GetAllocators()`` +* ``PyMem_SetRawAllocators(allocators)``, ``PyMem_SetAllocators(allocators)``, ``PyObject_SetAllocators(allocators)`` with 2 functions with an additional *domain* argument: @@ -321,16 +341,21 @@ where domain is one of these values: * ``PYALLOC_PYOBJECT`` +``_PyObject_GetArenaAllocators()`` and ``_PyObject_SetArenaAllocators()`` are +not merged and are kept private because their prototypes are different and they are +specific to pymalloc. + + Add a new PYDEBUGMALLOC environment variable -------------------------------------------- -To be able to use Python builtin debug hooks even when a custom memory -allocator is set, an environment variable ``PYDEBUGMALLOC`` can be added to -setup these debug function hooks, instead of adding the new function -``PyMem_SetupDebugHooks()``. If the environment variable is present, -``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()`` and -``PyObject_SetAllocators()`` will reinstall automatically the hook on top of -the new allocator. +To be able to use the Python builtin debug hooks even when a custom memory +allocator replaces the default Python allocator, an environment variable +``PYDEBUGMALLOC`` can be added to set up these debug function hooks, instead of +adding the new function ``PyMem_SetupDebugHooks()``. If the environment +variable is present, ``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()`` +and ``PyObject_SetAllocators()`` will automatically reinstall the hook on top +of the new allocator.
An new environment variable would make the Python initialization even more complex. The `PEP 432 `_ tries to @@ -343,8 +368,8 @@ Use macros to get customizable allocators To have no overhead in the default configuration, customizable allocators would be an optional feature enabled by a configuration option or by macros. -Not having to recompile Python makes debug hooks easy to use in practice. -Extensions modules don't have to be compiled with or without macros. +Not having to recompile Python makes debug hooks easier to use in practice. +Extension modules don't have to be recompiled with macros. Pass the C filename and line number @@ -354,10 +379,11 @@ Use C macros using ``__FILE__`` and ``__LINE__`` to get the C filename and line number of a memory allocation. Passing a filename and a line number to each allocator makes the API more -complex: pass 3 new arguments instead of just a context argument, to each -allocator function. GC allocator functions should also be patched, -``_PyObject_GC_Malloc()`` is used in many C functions for example. Such changes -add too much complexity, for a little gain. +complex: pass 3 new arguments, instead of just a context argument, to each +allocator function. The GC allocator functions should also be patched. +``_PyObject_GC_Malloc()`` is used in many C functions, for example, so +objects of different types would have the same allocation location. Such +changes add too much complexity for little gain. No context argument @@ -369,41 +395,67 @@ Simplify the signature of allocator functions, remove the context argument: * ``void* realloc(void *ptr, size_t new_size)`` * ``void free(void *ptr)`` -The context is a convenient way to reuse the same allocator for different APIs -(ex: PyMem and PyObject). +An allocator hook is likely to be reused for ``PyMem_SetAllocators()`` +and ``PyObject_SetAllocators()``, but the hook must call a different function +depending on the allocator.
The context is a convenient way to reuse the same +allocator or hook for different APIs. PyMem_Malloc() GIL-free ----------------------- -There is no real reason to require the GIL when calling ``PyMem_Malloc()``. +``PyMem_Malloc()`` must be called with the GIL held because in debug mode, it +indirectly calls ``PyObject_Malloc()`` which requires the GIL to be held. This +PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call ``malloc()``. +So the "GIL must be held" restriction may be removed from ``PyMem_Malloc()``. Allowing to call ``PyMem_Malloc()`` without holding the GIL might break applications which setup their own allocator or their allocator hooks. Holding -the GIL is very convinient to develop a custom allocator or a hook (no need to -care of other threads, no need to handle mutexes, etc.). +the GIL is very convenient to develop a custom allocator: no need to care about +other threads or mutexes. It is also convenient for an allocator hook: Python +internals can be safely inspected. + +Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected +behaviour, especially at Python startup and at creation of a new Python thread +state. Don't add PyMem_RawMalloc() --------------------------- -Replace ``malloc()`` with ``PyMem_Malloc()``, but if the GIL is not held: keep -``malloc()`` unchanged. +Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held. +Otherwise, keep ``malloc()`` unchanged. The ``PyMem_Malloc()`` is sometimes already misused. For example, the ``main()`` and ``Py_Main()`` functions of Python call ``PyMem_Malloc()`` whereas the GIL do not exist yet. In this case, ``PyMem_Malloc()`` should -be replaced with ``malloc()``. +be replaced with ``malloc()`` (or ``PyMem_RawMalloc()``). If an hook is used to the track memory usage, the ``malloc()`` memory will not be seen. Remaining ``malloc()`` may allocate a lot of memory and so would be missed in reports.
-CCP API -------- -XXX To be done (Kristján Valur Jónsson) XXX +Use existing debug tools to analyze the memory +---------------------------------------------- + +There are many existing debug tools to analyze the memory. Some examples: +`Valgrind `_, +`Purify `_, +`Clang AddressSanitizer `_, +`failmalloc `_, +etc. + +The problem is to retrieve the Python object related to a memory pointer to read +its type and/or content. Another issue is to retrieve the location of the +memory allocation: the C backtrace is usually useless (same reasoning as for +macros using ``__FILE__`` and ``__LINE__``), the Python filename and line +number (or even the Python traceback) are more useful. + +Classic tools are unable to introspect Python internals to collect such +information. Being able to set up a hook on allocators called with the GIL held +allows reading a lot of useful data from Python internals. External libraries