PEP: 445
Title: Add new APIs to customize memory allocators
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Jun-2013
Python-Version: 3.4
Abstract
========
Add new APIs to customize memory allocators
Rationale
=========
Use cases:

* Application embedding Python wanting to use a custom memory allocator
  to allocate all Python memory somewhere else or with a different algorithm
* Python running on embedded devices with low memory and slow CPU.
  A custom memory allocator may be required to use the memory efficiently
  and/or to be able to use all the memory of the device.
* Debug tool to track memory leaks
* Debug tool to detect buffer underflow, buffer overflow and misuse
  of Python allocator APIs
* Debug tool to inject bugs, for example to simulate out of memory

API:

* Set up a custom memory allocator for all memory allocated by Python
* Hook memory allocator functions to call extra code before and/or after
  the underlying allocator function
Proposal
========
API changes
-----------
* Add a new ``PyMemAllocators`` structure
* Add new GIL-free memory allocator functions:
- ``void* PyMem_RawMalloc(size_t size)``
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
- ``void PyMem_RawFree(void *ptr)``
* Add new functions to get and set memory allocators:
- ``void PyMem_GetRawAllocators(PyMemAllocators *allocators)``
- ``void PyMem_SetRawAllocators(PyMemAllocators *allocators)``
- ``void PyMem_GetAllocators(PyMemAllocators *allocators)``
- ``void PyMem_SetAllocators(PyMemAllocators *allocators)``
- ``void PyObject_GetAllocators(PyMemAllocators *allocators)``
- ``void PyObject_SetAllocators(PyMemAllocators *allocators)``
- ``void _PyObject_GetArenaAllocators(void **ctx_p, void* (**malloc_p) (void *ctx, size_t size), void (**free_p) (void *ctx, void *ptr, size_t size))``
- ``void _PyObject_SetArenaAllocators(void *ctx, void* (*malloc) (void *ctx, size_t size), void (*free) (void *ctx, void *ptr, size_t size))``
* Add a new function to set up debug hooks after memory allocators have been
  replaced:
- ``void PyMem_SetupDebugHooks(void)``
* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now always call ``malloc()`` and
``realloc()``, instead of calling ``PyObject_Malloc()`` and
``PyObject_Realloc()`` in debug mode
* ``PyObject_Malloc()`` now falls back on ``PyMem_Malloc()`` instead of
``malloc()`` if size is bigger than ``SMALL_REQUEST_THRESHOLD``, and
``PyObject_Realloc()`` falls back on ``PyMem_Realloc()`` instead of
``realloc()``
Examples
========
Use case 1: replace memory allocators, keeping pymalloc
-------------------------------------------------------
Set up your custom memory allocators, keeping pymalloc::

    /* global variable, don't use a variable allocated on the stack! */
    int magic = 42;

    /* allocation functions return a pointer, free functions return nothing */
    void* my_malloc(void *ctx, size_t size);
    void* my_realloc(void *ctx, void *ptr, size_t new_size);
    void my_free(void *ctx, void *ptr);
    void* my_alloc_arena(void *ctx, size_t size);
    void my_free_arena(void *ctx, void *ptr, size_t size);

    void setup_custom_allocators(void)
    {
        PyMemAllocators alloc;

        alloc.ctx = &magic;
        alloc.malloc = my_malloc;
        alloc.realloc = my_realloc;
        alloc.free = my_free;

        PyMem_SetRawAllocators(&alloc);
        PyMem_SetAllocators(&alloc);
        _PyObject_SetArenaAllocators(&magic, my_alloc_arena, my_free_arena);
        PyMem_SetupDebugHooks();
    }

.. warning::
   Remove the call ``PyMem_SetRawAllocators(&alloc);`` if the new allocators
   are not thread-safe.

Full example:
`replace_allocs.c <http://hg.python.org/peps/file/tip/pep-0445/replace_allocs.c>`_.
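
The example above only declares ``my_malloc()``, ``my_realloc()`` and
``my_free()``. As a minimal sketch of what such functions could look like, the
following allocator simulates an out-of-memory condition after a fixed number
of successful allocations (one of the debug use cases listed in the
rationale). The ``FailAfter`` context type is hypothetical and is not part of
the proposed API; it only illustrates how the ``ctx`` argument can carry
state:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical context: number of allocations still allowed to succeed. */
typedef struct {
    size_t remaining;
} FailAfter;

/* Same signature as my_malloc() in the example above. */
void *my_malloc(void *ctx, size_t size)
{
    FailAfter *budget = (FailAfter *)ctx;

    if (budget->remaining == 0)
        return NULL;            /* simulated out-of-memory */
    budget->remaining--;
    return malloc(size);
}

/* Same signature as my_realloc() in the example above. */
void *my_realloc(void *ctx, void *ptr, size_t new_size)
{
    FailAfter *budget = (FailAfter *)ctx;

    if (budget->remaining == 0)
        return NULL;            /* the old block stays valid, as with realloc() */
    budget->remaining--;
    return realloc(ptr, new_size);
}

/* Same signature as my_free() in the example above. */
void my_free(void *ctx, void *ptr)
{
    (void)ctx;                  /* releasing memory does not consume the budget */
    free(ptr);
}
```

With such functions, ``alloc.ctx`` would point to a ``FailAfter`` variable
with static storage duration instead of ``magic``.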
Use case 2: replace memory allocators, overriding pymalloc
----------------------------------------------------------
If your allocator is optimized for allocation of small objects (less than 512
bytes) with a short lifetime, you can override pymalloc (replace
``PyObject_Malloc()``). Example::

    /* global variable, don't use a variable allocated on the stack! */
    int magic = 42;

    /* allocation functions return a pointer, my_free() returns nothing */
    void* my_malloc(void *ctx, size_t size);
    void* my_realloc(void *ctx, void *ptr, size_t new_size);
    void my_free(void *ctx, void *ptr);

    void setup_custom_allocators(void)
    {
        PyMemAllocators alloc;

        alloc.ctx = &magic;
        alloc.malloc = my_malloc;
        alloc.realloc = my_realloc;
        alloc.free = my_free;

        PyMem_SetRawAllocators(&alloc);
        PyMem_SetAllocators(&alloc);
        PyObject_SetAllocators(&alloc);
        PyMem_SetupDebugHooks();
    }

Full example:
`replace_pymalloc.c <http://hg.python.org/peps/file/tip/pep-0445/replace_pymalloc.c>`_.
Use case 3: hook allocators
---------------------------
Set up hooks on memory allocators::

    /* global variable, don't use a variable allocated on the stack! */
    struct {
        PyMemAllocators pymem;
        PyMemAllocators pymem_raw;
        PyMemAllocators pyobj;
        int magic;
    } hook;

    void* hook_malloc(void *ctx, size_t size);
    void* hook_realloc(void *ctx, void *ptr, size_t new_size);
    void hook_free(void *ctx, void *ptr);

    /* Must be called before the first allocation, or hook_realloc() and
       hook_free() will crash */
    void setup_custom_allocators(void)
    {
        PyMemAllocators alloc;

        alloc.malloc = hook_malloc;
        alloc.realloc = hook_realloc;
        alloc.free = hook_free;

        /* save the current allocators of each domain in the global hook
           structure and pass them to the hook functions as the context */
        PyMem_GetRawAllocators(&hook.pymem_raw);
        alloc.ctx = &hook.pymem_raw;
        PyMem_SetRawAllocators(&alloc);

        PyMem_GetAllocators(&hook.pymem);
        alloc.ctx = &hook.pymem;
        PyMem_SetAllocators(&alloc);

        PyObject_GetAllocators(&hook.pyobj);
        alloc.ctx = &hook.pyobj;
        PyObject_SetAllocators(&alloc);
    }

.. note::
   There is no need to call ``PyMem_SetupDebugHooks()``: the debug hooks are
   already installed by default.

Full example tracking memory usage:
`alloc_hooks.c <http://hg.python.org/peps/file/tip/pep-0445/alloc_hooks.c>`_.
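
The hook functions themselves are only declared in the example above. The
following is a minimal sketch of how they could forward each call to the saved
allocators received through ``ctx``. It assumes a local ``Allocators``
structure that mirrors the fields used in this PEP (``ctx``, ``malloc``,
``realloc``, ``free``); ``hook_calls``, the ``sys_*`` functions and
``install_hooks()`` are hypothetical stand-ins so that the sketch compiles
without Python:

```c
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in for the proposed PyMemAllocators structure; the field
   names are the ones used by the examples in this PEP. */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void *(*realloc)(void *ctx, void *ptr, size_t new_size);
    void (*free)(void *ctx, void *ptr);
} Allocators;

/* Hypothetical counter updated by every hook. */
size_t hook_calls = 0;

/* In each hook, ctx points to the allocators saved before hooking. */
void *hook_malloc(void *ctx, size_t size)
{
    Allocators *prev = (Allocators *)ctx;
    hook_calls++;                       /* extra code before the call */
    return prev->malloc(prev->ctx, size);
}

void *hook_realloc(void *ctx, void *ptr, size_t new_size)
{
    Allocators *prev = (Allocators *)ctx;
    hook_calls++;
    return prev->realloc(prev->ctx, ptr, new_size);
}

void hook_free(void *ctx, void *ptr)
{
    Allocators *prev = (Allocators *)ctx;
    hook_calls++;
    prev->free(prev->ctx, ptr);
}

/* Stand-ins for the allocators that were installed before the hooks. */
void *sys_malloc(void *ctx, size_t size) { (void)ctx; return malloc(size); }
void *sys_realloc(void *ctx, void *ptr, size_t n) { (void)ctx; return realloc(ptr, n); }
void sys_free(void *ctx, void *ptr) { (void)ctx; free(ptr); }

/* Save the current allocators of one domain, then install the hooks.
   "saved" must outlive the hooks, which is why the PEP example stores
   the saved allocators in a global variable. */
void install_hooks(Allocators *domain, Allocators *saved)
{
    *saved = *domain;                   /* plays the role of a "get" call */
    domain->ctx = saved;
    domain->malloc = hook_malloc;
    domain->realloc = hook_realloc;
    domain->free = hook_free;           /* plays the role of a "set" call */
}
```

Because the previous allocators travel through ``ctx``, the same three hook
functions can be installed on several domains at once.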
Performance
===========
The `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b 2n3): some
tests are 1.04x faster, some tests are 1.04x slower, significant is between 115
and -191. I don't understand this output, but I guess that the overhead cannot
be seen with such tests.
pybench: "+0.1%" (diff between -4.9% and +5.6%).
The full output is attached to issue #3329.
Alternatives
============
Only one get and one set function
---------------------------------
Replace the 6 functions:
* ``PyMem_GetRawAllocators()``
* ``PyMem_GetAllocators()``
* ``PyObject_GetAllocators()``
* ``PyMem_SetRawAllocators(allocators)``
* ``PyMem_SetAllocators(allocators)``
* ``PyObject_SetAllocators(allocators)``
with 2 functions with an additional *domain* argument:
* ``Py_GetAllocators(domain)``
* ``Py_SetAllocators(domain, allocators)``
where domain is one of these values:
* ``PYALLOC_PYMEM``
* ``PYALLOC_PYMEM_RAW``
* ``PYALLOC_PYOBJECT``
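
As a rough sketch of this alternative, the domain can be an enumeration used
to index a table of allocator structures. The ``Allocators`` structure,
``get_allocators()``, ``set_allocators()`` and ``plain_malloc()`` below are
hypothetical stand-ins; in particular, returning the allocators through an
output parameter is an assumption, since the exact signatures are not
specified above:

```c
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in for the proposed PyMemAllocators structure. */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void *(*realloc)(void *ctx, void *ptr, size_t new_size);
    void (*free)(void *ctx, void *ptr);
} Allocators;

/* Domain values named after the constants listed above. */
typedef enum {
    PYALLOC_PYMEM,
    PYALLOC_PYMEM_RAW,
    PYALLOC_PYOBJECT
} AllocDomain;

#define DOMAIN_COUNT 3

/* One set of allocators per domain, zero-initialized. */
static Allocators domains[DOMAIN_COUNT];

/* Hypothetical equivalent of Py_GetAllocators(domain). */
void get_allocators(AllocDomain domain, Allocators *allocators)
{
    *allocators = domains[domain];
}

/* Hypothetical equivalent of Py_SetAllocators(domain, allocators). */
void set_allocators(AllocDomain domain, const Allocators *allocators)
{
    domains[domain] = *allocators;
}

/* Trivial allocator function, only used to have something to install. */
void *plain_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size);
}
```

Two functions then cover all three domains, at the cost of one extra argument
per call.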
Setup Builtin Debug Hooks
-------------------------
To be able to use Python debug functions (like ``_PyMem_DebugMalloc()``) even
when a custom memory allocator is set, an environment variable
``PYDEBUGMALLOC`` can be added to set these debug function hooks, instead of
the new function ``PyMem_SetupDebugHooks()``.
Use macros to get customizable allocators
-----------------------------------------
To have no overhead in the default configuration, customizable allocators would
be an optional feature enabled by a configuration option or by macros.
Pass the C filename and line number
-----------------------------------
Use C macros using ``__FILE__`` and ``__LINE__`` to get the C filename
and line number of a memory allocation.
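
A minimal sketch of this idea, assuming a hypothetical ``traced_malloc()``
function and ``TRACED_MALLOC()`` macro (neither is part of the proposed API):
the macro expands at the call site, so ``__FILE__`` and ``__LINE__`` describe
the caller rather than the allocator:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record of the most recent allocation site. */
const char *last_file = NULL;
int last_line = 0;

/* Allocator variant that receives the call site explicitly. */
void *traced_malloc(size_t size, const char *file, int line)
{
    last_file = file;
    last_line = line;
    return malloc(size);
}

/* The macro fills in the file name and line number of the caller. */
#define TRACED_MALLOC(size) traced_malloc((size), __FILE__, __LINE__)
```

The drawback is that every allocator function needs two extra parameters, and
callers must go through the macro for the information to be meaningful.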
No context argument
-------------------
Simplify the signature of allocator functions, remove the context argument:
* ``void* malloc(size_t size)``
* ``void* realloc(void *ptr, size_t new_size)``
* ``void free(void *ptr)``
The context is a convenient way to reuse the same allocator for different APIs
(ex: PyMem and PyObject).
PyMem_Malloc() GIL-free
-----------------------
There is no real reason to require the GIL when calling PyMem_Malloc().
CCP API
-------
XXX To be done (Kristján Valur Jónsson) XXX
External libraries
==================
* glib: `g_mem_set_vtable()
<http://developer.gnome.org/glib/unstable/glib-Memory-Allocation.html#g-mem-set-vtable>`_
See also the `GNU libc: Memory Allocation Hooks
<http://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html>`_.
Memory allocators
=================
The C standard library provides the well known ``malloc()`` function. Its
implementation depends on the platform and on the C library. The GNU C library
uses a modified ptmalloc2, based on "Doug Lea's Malloc" (dlmalloc). FreeBSD
uses `jemalloc <http://www.canonware.com/jemalloc/>`_. Google provides
tcmalloc which is part of `gperftools <http://code.google.com/p/gperftools/>`_.
``malloc()`` uses two kinds of memory: heap and memory mappings. Memory
mappings are usually used for large allocations (ex: larger than 256 KB),
whereas the heap is used for small allocations.
The heap is handled by the ``brk()`` and ``sbrk()`` system calls on Linux, and
is contiguous. Memory mappings are handled by ``mmap()`` on UNIX and
``VirtualAlloc()`` on Windows; they are discontiguous. Releasing a memory
mapping gives the memory back to the system immediately. For the heap, memory
is only given back to the system if it is at the end of the heap. Otherwise,
the memory is only given back to the system once all the memory located after
the released memory has also been released. This limitation causes an issue
called "memory fragmentation": the memory usage seen by the system may be much
higher than the real usage.
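
The heap behaviour described above can be illustrated with a toy model. This
is only a sketch with hypothetical names (``ToyHeap``, ``toy_alloc()``,
``toy_free()``, ``toy_live()``): the heap is an array of equal-sized slots,
``top`` stands for the current end of the heap, and only trailing free slots
can be given back. It does not reflect how any real ``malloc()``
implementation is written:

```c
#include <stddef.h>

#define HEAP_SLOTS 8

typedef struct {
    int used[HEAP_SLOTS];   /* 1 if the slot holds a live block */
    size_t top;             /* number of slots obtained from the "system" */
} ToyHeap;

/* Return the index of the allocated slot, or -1 if the toy heap is full. */
int toy_alloc(ToyHeap *heap)
{
    size_t i;

    /* Reuse a hole below the top of the heap first. */
    for (i = 0; i < heap->top; i++) {
        if (!heap->used[i]) {
            heap->used[i] = 1;
            return (int)i;
        }
    }
    if (heap->top == HEAP_SLOTS)
        return -1;
    /* Otherwise grow the heap, as sbrk() would. */
    i = heap->top;
    heap->used[i] = 1;
    heap->top++;
    return (int)i;
}

void toy_free(ToyHeap *heap, int slot)
{
    heap->used[slot] = 0;
    /* Only free slots at the end of the heap go back to the system. */
    while (heap->top > 0 && !heap->used[heap->top - 1])
        heap->top--;
}

/* Number of slots that really hold a live block. */
size_t toy_live(const ToyHeap *heap)
{
    size_t i, live = 0;

    for (i = 0; i < heap->top; i++)
        live += (size_t)heap->used[i];
    return live;
}
```

After allocating four slots and freeing the first three, ``toy_live()``
reports one live block while ``top`` is still 4: the single block at the end
of the heap prevents the three free slots below it from being released.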
Windows provides a `Low-fragmentation Heap
<http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750%28v=vs.85%29.aspx>`_.
The Linux kernel uses `slab allocation
<http://en.wikipedia.org/wiki/Slab_allocation>`_.
The glib library has a `Memory Slice API
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
an efficient way to allocate groups of equal-sized chunks of memory.
Links
=====
CPython issues related to memory allocation:
* `Issue #3329: Add new APIs to customize memory allocators
<http://bugs.python.org/issue3329>`_
* `Issue #13483: Use VirtualAlloc to allocate memory arenas
<http://bugs.python.org/issue13483>`_
* `Issue #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline, which
isn't thread safe <http://bugs.python.org/issue16742>`_
* `Issue #18203: Replace calls to malloc() with PyMem_Malloc()
<http://bugs.python.org/issue18203>`_
* `Issue #18227: Use Python memory allocators in external libraries like zlib
or OpenSSL <http://bugs.python.org/issue18227>`_
Projects analyzing the memory usage of Python applications:
* `pytracemalloc
<https://pypi.python.org/pypi/pytracemalloc>`_
* `Meliae: Python Memory Usage Analyzer
<https://pypi.python.org/pypi/meliae>`_
* `Guppy-PE: umbrella package combining Heapy and GSL
<http://guppy-pe.sourceforge.net/>`_
* `PySizer (developed for Python 2.4)
<http://pysizer.8325.org/>`_