From d1c9cb312b31f020d498aad97588f478571154c7 Mon Sep 17 00:00:00 2001
From: Victor Stinner
Date: Tue, 18 Jun 2013 22:05:17 +0200
Subject: [PATCH] PEP 445: textwidth=72 (for email)

---
 pep-0445.txt | 308 +++++++++++++++++++++++++++------------------------
 1 file changed, 163 insertions(+), 145 deletions(-)

diff --git a/pep-0445.txt b/pep-0445.txt
index a9ec2fa27..c27d2a863 100644
--- a/pep-0445.txt
+++ b/pep-0445.txt
@@ -20,19 +20,21 @@ Rationale

 Use cases:

-* Application embedding Python may want to isolate Python memory from the
-  memory of the application, or may want to different memory allocator
-  optimized for its Python usage
+* Application embedding Python may want to isolate Python memory from
+  the memory of the application, or may want to different memory
+  allocator optimized for its Python usage
 * Python running on embedded devices with low memory and slow CPU.
-  A custom memory allocator may be required to use efficiently the memory
-  and/or to be able to use all the memory of the device.
+  A custom memory allocator may be required to use efficiently the
+  memory and/or to be able to use all the memory of the device.
 * Debug tool to:

   - track memory usage (memory leaks)
-  - get the Python filename and line number where an object was allocated
+  - get the Python filename and line number where an object was
+    allocated
   - detect buffer underflow, buffer overflow and detect misuse of Python
     allocator APIs (builtin Python debug hooks)
-  - force allocation to fail to test handling of ``MemoryError`` exception
+  - force allocation to fail to test handling of ``MemoryError``
+    exception


 Proposal
@@ -46,13 +48,14 @@ API changes
   - ``void* PyMem_RawMalloc(size_t size)``
   - ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
   - ``void PyMem_RawFree(void *ptr)``
-  - the behaviour of requesting zero bytes is not defined: return *NULL* or a
-    distinct non-*NULL* pointer depending on the platform.
+  - the behaviour of requesting zero bytes is not defined: return *NULL*
+    or a distinct non-*NULL* pointer depending on the platform.

 * Add a new ``PyMemBlockAllocator`` structure::

     typedef struct {
-        /* user context passed as the first argument to the 3 functions */
+        /* user context passed as the first argument
+           to the 3 functions */
         void *ctx;

         /* allocate a memory block */
@@ -65,32 +68,34 @@ API changes
         void (*free) (void *ctx, void *ptr);
     } PyMemBlockAllocator;

-* Add new functions to get and set internal functions of ``PyMem_RawMalloc()``,
-  ``PyMem_RawRealloc()`` and ``PyMem_RawFree()``:
+* Add new functions to get and set internal functions of
+  ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()`` and ``PyMem_RawFree()``:

   - ``void PyMem_GetRawAllocator(PyMemBlockAllocator *allocator)``
   - ``void PyMem_SetRawAllocator(PyMemBlockAllocator *allocator)``

-* Add new functions to get and set internal functions of ``PyMem_Malloc()``,
-  ``PyMem_Realloc()`` and ``PyMem_Free()``:
+* Add new functions to get and set internal functions of
+  ``PyMem_Malloc()``, ``PyMem_Realloc()`` and ``PyMem_Free()``:

   - ``void PyMem_GetAllocator(PyMemBlockAllocator *allocator)``
   - ``void PyMem_SetAllocator(PyMemBlockAllocator *allocator)``
-  - ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return *NULL*:
-    it would be treated as an error.
+  - ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return
+    *NULL*: it would be treated as an error.

 * Add new functions to get and set internal functions of
-  ``PyObject_Malloc()``,, ``PyObject_Realloc()`` and ``PyObject_Free()``:
+  ``PyObject_Malloc()``,, ``PyObject_Realloc()`` and
+  ``PyObject_Free()``:

   - ``void PyObject_GetAllocator(PyMemBlockAllocator *allocator)``
   - ``void PyObject_SetAllocator(PyMemBlockAllocator *allocator)``
-  - ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return *NULL*:
-    it would be treated as an error.
+  - ``malloc(ctx, 0)`` and ``realloc(ctx, ptr, 0)`` must not return
+    *NULL*: it would be treated as an error.

 * Add a new ``PyMemMappingAllocator`` structure::

     typedef struct {
-        /* user context passed as the first argument to the 2 functions */
+        /* user context passed as the first argument
+           to the 2 functions */
         void *ctx;

         /* allocate a memory mapping */
@@ -104,26 +109,26 @@ API changes

   - ``void PyMem_GetMappingAllocator(PyMemMappingAllocator *allocator)``
   - ``void PyMem_SetMappingAllocator(PyMemMappingAllocator *allocator)``
-  - Currently, this allocator is only used internally by *pymalloc* to allocate
-    arenas.
+  - Currently, this allocator is only used internally by *pymalloc* to
+    allocate arenas.

 * Add a new function to setup the builtin Python debug hooks when memory
   allocators are replaced:

   - ``void PyMem_SetupDebugHooks(void)``

-* The following memory allocators always returns *NULL* if size is greater
-  than ``PY_SSIZE_T_MAX``: ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()``,
-  ``PyMem_Malloc()``, ``PyMem_Realloc()``, ``PyObject_Malloc()``,
-  ``PyObject_Realloc()``.
+* The following memory allocators always returns *NULL* if size is
+  greater than ``PY_SSIZE_T_MAX``: ``PyMem_RawMalloc()``,
+  ``PyMem_RawRealloc()``, ``PyMem_Malloc()``, ``PyMem_Realloc()``,
+  ``PyObject_Malloc()``, ``PyObject_Realloc()``.

-The builtin Python debug hooks were introduced in Python 2.3 and implement the
-following checks:
+The builtin Python debug hooks were introduced in Python 2.3 and
+implement the following checks:

-* Newly allocated memory is filled with the byte ``0xCB``, freed memory is
-  filled with the byte ``0xDB``.
-* Detect API violations, ex: ``PyObject_Free()`` called on a memory block
-  allocated by ``PyMem_Malloc()``
+* Newly allocated memory is filled with the byte ``0xCB``, freed memory
+  is filled with the byte ``0xDB``.
+* Detect API violations, ex: ``PyObject_Free()`` called on a memory
+  block allocated by ``PyMem_Malloc()``
 * Detect write before the start of the buffer (buffer underflow)
 * Detect write after the end of the buffer (buffer overflow)

@@ -134,20 +139,20 @@ The *pymalloc* allocator is used by default for:
 Make usage of these new APIs
 ----------------------------

-* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()`` and
-  ``realloc()``, instead of calling ``PyObject_Malloc()`` and
+* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()``
+  and ``realloc()``, instead of calling ``PyObject_Malloc()`` and
   ``PyObject_Realloc()`` in debug mode

 * ``PyObject_Malloc()`` falls back on ``PyMem_Malloc()`` instead of
-  ``malloc()`` if size is greater or equal than ``SMALL_REQUEST_THRESHOLD``
-  (512 bytes), and ``PyObject_Realloc()`` falls back on ``PyMem_Realloc()``
-  instead of ``realloc()``
+  ``malloc()`` if size is greater or equal than
+  ``SMALL_REQUEST_THRESHOLD`` (512 bytes), and ``PyObject_Realloc()``
+  falls back on ``PyMem_Realloc()`` instead of ``realloc()``

 * Replace direct calls to ``malloc()`` with ``PyMem_Malloc()``, or
   ``PyMem_RawMalloc()`` if the GIL is not held

-* Configure external libraries like zlib or OpenSSL to allocate memory using
-  ``PyMem_RawMalloc()``
+* Configure external libraries like zlib or OpenSSL to allocate memory
+  using ``PyMem_RawMalloc()``


 Examples
@@ -213,16 +218,16 @@ Dummy example wasting 2 bytes per allocation, and 10 bytes per arena::
     }

 .. warning::
-   Remove the call ``PyMem_SetRawAllocator(&alloc)`` if the new allocator
-   are not thread-safe.
+   Remove the call ``PyMem_SetRawAllocator(&alloc)`` if the new
+   allocator are not thread-safe.


 Use case 2: Replace Memory Allocator, override pymalloc
 --------------------------------------------------------

-If your allocator is optimized for allocation of small objects (less than 512
-bytes) with a short lifetime, pymalloc can be overriden: replace
-``PyObject_Malloc()``.
+If your allocator is optimized for allocation of small objects (less
+than 512 bytes) with a short lifetime, pymalloc can be overriden:
+replace ``PyObject_Malloc()``.

 Dummy Example wasting 2 bytes per allocation::

@@ -263,8 +268,8 @@ Dummy Example wasting 2 bytes per allocation::
     }

 .. warning::
-   Remove the call ``PyMem_SetRawAllocator(&alloc)`` if the new allocator
-   are not thread-safe.
+   Remove the call ``PyMem_SetRawAllocator(&alloc)`` if the new
+   allocator are not thread-safe.



@@ -338,20 +343,21 @@ Example to setup hooks on all memory allocators::
    thread-safe.

 .. note::
-   ``PyMem_SetupDebugHooks()`` does not need to be called: Python debug hooks
-   are installed automatically at startup.
+   ``PyMem_SetupDebugHooks()`` does not need to be called: Python debug
+   hooks are installed automatically at startup.


 Performances
 ============

-Results of the `Python benchmarks suite `_ (-b
-2n3): some tests are 1.04x faster, some tests are 1.04 slower, significant is
-between 115 and -191. I don't understand these output, but I guess that the
-overhead cannot be seen with such test.
+Results of the `Python benchmarks suite
+`_ (-b 2n3): some tests are 1.04x
+faster, some tests are 1.04 slower, significant is between 115 and -191.
+I don't understand these output, but I guess that the overhead cannot be
+seen with such test.

-Results of pybench benchmark: "+0.1%" slower globally (diff between -4.9% and
-+5.6%).
+Results of pybench benchmark: "+0.1%" slower globally (diff between
+-4.9% and +5.6%).

 The full reports are attached to the issue #3329.

@@ -390,9 +396,9 @@ Drawback: the caller has to check if the result is 0, or handle the error.
 PyMem_Malloc() reuses PyMem_RawMalloc() by default
 --------------------------------------------------

-``PyMem_Malloc()`` should call ``PyMem_RawMalloc()`` by default. So calling
-``PyMem_SetRawAllocator()`` would also also patch ``PyMem_Malloc()``
-indirectly.
+``PyMem_Malloc()`` should call ``PyMem_RawMalloc()`` by default. So
+calling ``PyMem_SetRawAllocator()`` would also also patch
+``PyMem_Malloc()`` indirectly.

 .. note::

@@ -404,39 +410,42 @@ indirectly.
 Add a new PYDEBUGMALLOC environment variable
 --------------------------------------------

-To be able to use the Python builtin debug hooks even when a custom memory
-allocator replaces the default Python allocator, an environment variable
-``PYDEBUGMALLOC`` can be added to setup these debug function hooks, instead of
-adding the new function ``PyMem_SetupDebugHooks()``. If the environment
-variable is present, ``PyMem_SetRawAllocator()``, ``PyMem_SetAllocator()``
-and ``PyObject_SetAllocator()`` will reinstall automatically the hook on top
-of the new allocator.
+To be able to use the Python builtin debug hooks even when a custom
+memory allocator replaces the default Python allocator, an environment
+variable ``PYDEBUGMALLOC`` can be added to setup these debug function
+hooks, instead of adding the new function ``PyMem_SetupDebugHooks()``.
+If the environment variable is present, ``PyMem_SetRawAllocator()``,
+``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()`` will reinstall
+automatically the hook on top of the new allocator.

-An new environment variable would make the Python initialization even more
-complex. The `PEP 432 `_ tries to
-simply the CPython startup sequence.
+An new environment variable would make the Python initialization even
+more complex. The `PEP 432 `_
+tries to simply the CPython startup sequence.


 Use macros to get customizable allocators
 -----------------------------------------

-To have no overhead in the default configuration, customizable allocators would
-be an optional feature enabled by a configuration option or by macros.
+To have no overhead in the default configuration, customizable
+allocators would be an optional feature enabled by a configuration
+option or by macros.

-Not having to recompile Python makes debug hooks easier to use in practice.
-Extensions modules don't have to be recompiled with macros.
+Not having to recompile Python makes debug hooks easier to use in
+practice. Extensions modules don't have to be recompiled with macros.


 Pass the C filename and line number
 -----------------------------------

-Define allocator functions using macros and use ``__FILE__`` and ``__LINE__``
-to get the C filename and line number of a memory allocation.
+Define allocator functions using macros and use ``__FILE__`` and
+``__LINE__`` to get the C filename and line number of a memory
+allocation.

 Example::

     typedef struct {
-        /* user context passed as the first argument to the 3 functions */
+        /* user context passed as the first argument
+           to the 3 functions */
         void *ctx;

         /* allocate a memory block */
@@ -452,7 +461,8 @@ Example::
                       void *ptr);
     } PyMemBlockAllocator;

-    void* _PyMem_MallocTrace(const char *filename, int lineno, size_t size);
+    void* _PyMem_MallocTrace(const char *filename, int lineno,
+                             size_t size);

     /* need also a function for the Python stable ABI */
     void* PyMem_Malloc(size_t size);
@@ -470,7 +480,8 @@ changes add too much complexity for a little gain.
 GIL-free PyMem_Malloc()
 -----------------------

-When Python is compiled in debug mode, ``PyMem_Malloc()`` calls indirectly ``PyObject_Malloc()`` which requires the GIL to be held.
+When Python is compiled in debug mode, ``PyMem_Malloc()`` calls
+indirectly ``PyObject_Malloc()`` which requires the GIL to be held.
 That's why ``PyMem_Malloc()`` must be called with the GIL held.

 This PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call
@@ -478,64 +489,65 @@ This PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call
 ``PyMem_Malloc()``.

 Allowing to call ``PyMem_Malloc()`` without holding the GIL might break
-applications which setup their own allocators or allocator hooks. Holding the
-GIL is convinient to develop a custom allocator: no need to care of other
-threads. It is also convinient for a debug allocator hook: Python internal
-objects can be safetly inspected.
+applications which setup their own allocators or allocator hooks.
+Holding the GIL is convinient to develop a custom allocator: no need to
+care of other threads. It is also convinient for a debug allocator hook:
+Python internal objects can be safetly inspected.

-Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected
-behaviour, especially at Python startup and at creation of a new Python thread
-state.
+Calling ``PyGILState_Ensure()`` in a memory allocator may have
+unexpected behaviour, especially at Python startup and at creation of a
+new Python thread state.


 Don't add PyMem_RawMalloc()
 ---------------------------

-Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held.
-Otherwise, keep ``malloc()`` unchanged.
+Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is
+held. Otherwise, keep ``malloc()`` unchanged.

-The ``PyMem_Malloc()`` is used without the GIL held in some Python functions.
-For example, the ``main()`` and ``Py_Main()`` functions of Python call
-``PyMem_Malloc()`` whereas the GIL do not exist yet. In this case,
-``PyMem_Malloc()`` should be replaced with ``malloc()`` (or
+The ``PyMem_Malloc()`` is used without the GIL held in some Python
+functions. For example, the ``main()`` and ``Py_Main()`` functions of
+Python call ``PyMem_Malloc()`` whereas the GIL do not exist yet. In this
+case, ``PyMem_Malloc()`` should be replaced with ``malloc()`` (or
 ``PyMem_RawMalloc()``).

-If an hook is used to the track memory usage, the ``malloc()`` memory will not
-be seen. Remaining ``malloc()`` may allocate a lot of memory and so would be
-missed in reports.
+If an hook is used to the track memory usage, the ``malloc()`` memory
+will not be seen. Remaining ``malloc()`` may allocate a lot of memory
+and so would be missed in reports.


 Use existing debug tools to analyze the memory
 ----------------------------------------------

-There are many existing debug tools to analyze the memory. Some examples:
-`Valgrind `_,
-`Purify `_,
-`Clang AddressSanitizer `_,
-`failmalloc `_,
-etc.
+There are many existing debug tools to analyze the memory. Some
+examples: `Valgrind `_, `Purify
+`_, `Clang AddressSanitizer
+`_, `failmalloc
+`_, etc.

-The problem is to retrieve the Python object related to a memory pointer to read
-its type and/or content. Another issue is to retrieve the location of the
-memory allocation: the C backtrace is usually useless (same reasoning than
-macros using ``__FILE__`` and ``__LINE__``), the Python filename and line
-number (or even the Python traceback) is more useful.
+The problem is to retrieve the Python object related to a memory pointer
+to read its type and/or content. Another issue is to retrieve the
+location of the memory allocation: the C backtrace is usually useless
+(same reasoning than macros using ``__FILE__`` and ``__LINE__``), the
+Python filename and line number (or even the Python traceback) is more
+useful.

 Classic tools are unable to introspect Python internals to collect such
-information. Being able to setup a hook on allocators called with the GIL held
-allow to collect a lot of useful data from Python internals.
+information. Being able to setup a hook on allocators called with the
+GIL held allow to collect a lot of useful data from Python internals.


 Add msize()
 -----------

-Add another field to ``PyMemBlockAllocator`` and ``PyMemMappingAllocator``::
+Add another field to ``PyMemBlockAllocator`` and
+``PyMemMappingAllocator``::

     size_t msize(void *ptr);

-This function returns the size of a memory block or a memory mapping. Return
-(size_t)-1 if the function is not implemented or if the pointer is unknown
-(ex: NULL pointer).
+This function returns the size of a memory block or a memory mapping.
+Return (size_t)-1 if the function is not implemented or if the pointer
+is unknown (ex: NULL pointer).

 On Windows, this function can be implemented using ``_msize()`` and
 ``VirtualQuery()``.
@@ -544,24 +556,25 @@ On Windows, this function can be implemented using ``_msize()`` and
 No context argument
 -------------------

-Simplify the signature of allocator functions, remove the context argument:
+Simplify the signature of allocator functions, remove the context
+argument:

 * ``void* malloc(size_t size)``
 * ``void* realloc(void *ptr, size_t new_size)``
 * ``void free(void *ptr)``

-It is likely for an allocator hook to be reused for ``PyMem_SetAllocator()``
-and ``PyObject_SetAllocator()``, or even ``PyMem_SetRawAllocator()``, but the
-hook must call a different function depending on the allocator. The context is
-a convenient way to reuse the same custom allocator or hook for different
-Python allocators.
+It is likely for an allocator hook to be reused for
+``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()``, or even
+``PyMem_SetRawAllocator()``, but the hook must call a different function
+depending on the allocator. The context is a convenient way to reuse the
+same custom allocator or hook for different Python allocators.


 External libraries
 ==================

-Python should try to reuse the same prototypes for allocator functions than
-other libraries.
+Python should try to reuse the same prototypes for allocator functions
+than other libraries.

 Libraries used by Python:

@@ -586,36 +599,41 @@ See also the `GNU libc: Memory Allocation Hooks
 Memory allocators
 =================

-The C standard library provides the well known ``malloc()`` function. Its
-implementation depends on the platform and of the C library. The GNU C library
-uses a modified ptmalloc2, based on "Doug Lea's Malloc" (dlmalloc). FreeBSD
-uses `jemalloc `_. Google provides
-tcmalloc which is part of `gperftools `_.
+The C standard library provides the well known ``malloc()`` function.
+Its implementation depends on the platform and of the C library. The GNU
+C library uses a modified ptmalloc2, based on "Doug Lea's Malloc"
+(dlmalloc). FreeBSD uses `jemalloc
+`_. Google provides tcmalloc which
+is part of `gperftools `_.

 ``malloc()`` uses two kinds of memory: heap and memory mappings. Memory
-mappings are usually used for large allocations (ex: larger than 256 KB),
-whereas the heap is used for small allocations.
+mappings are usually used for large allocations (ex: larger than 256
+KB), whereas the heap is used for small allocations.

-On UNIX, the heap is handled by ``brk()`` and ``sbrk()`` system calls on Linux,
-and it is contiguous. On Windows, the heap is handled by ``HeapAlloc()`` and
-may be discontiguous. Memory mappings are handled by ``mmap()`` on UNIX and
-``VirtualAlloc()`` on Windows, they may be discontiguous.
+On UNIX, the heap is handled by ``brk()`` and ``sbrk()`` system calls on
+Linux, and it is contiguous. On Windows, the heap is handled by
+``HeapAlloc()`` and may be discontiguous. Memory mappings are handled by
+``mmap()`` on UNIX and ``VirtualAlloc()`` on Windows, they may be
+discontiguous.

-Releasing a memory mapping gives back immediatly the memory to the system. On
-UNIX, heap memory is only given back to the system if it is at the end of the
-heap. Otherwise, the memory will only be given back to the system when all the
-memory located after the released memory are also released.
+Releasing a memory mapping gives back immediatly the memory to the
+system. On UNIX, heap memory is only given back to the system if it is
+at the end of the heap. Otherwise, the memory will only be given back to
+the system when all the memory located after the released memory are
+also released.

-To allocate memory in the heap, the allocator tries to reuse free space. If
-there is no contiguous space big enough, the heap must be increased, even if we
-have more free space than required size. This issue is called the "memory
-fragmentation": the memory usage seen by the system may be much higher than
-real usage. On Windows, ``HeapAlloc()`` creates a new memory mapping with
-``VirtualAlloc()`` if there is not enough free contiguous memory.
+To allocate memory in the heap, the allocator tries to reuse free space.
+If there is no contiguous space big enough, the heap must be increased,
+even if we have more free space than required size. This issue is
+called the "memory fragmentation": the memory usage seen by the system
+may be much higher than real usage. On Windows, ``HeapAlloc()`` creates
+a new memory mapping with ``VirtualAlloc()`` if there is not enough free
+contiguous memory.

-CPython has a *pymalloc* allocator for allocations smaller than 512 bytes. This
-allocator is optimized for small objects with a short lifetime. It uses memory
-mappings called "arenas" with a fixed size of 256 KB.
+CPython has a *pymalloc* allocator for allocations smaller than 512
+bytes. This allocator is optimized for small objects with a short
+lifetime. It uses memory mappings called "arenas" with a fixed size of
+256 KB.

 Other allocators:

@@ -641,10 +659,10 @@ CPython issues related to memory allocation:
   `_
 * `Issue #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline,
   which isn't thread safe `_
-* `Issue #18203: Replace calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc()
-  `_
-* `Issue #18227: Use Python memory allocators in external libraries like zlib
-  or OpenSSL `_
+* `Issue #18203: Replace calls to malloc() with PyMem_Malloc() or
+  PyMem_RawMalloc() `_
+* `Issue #18227: Use Python memory allocators in external libraries like
+  zlib or OpenSSL `_

 Projects analyzing the memory usage of Python applications: