PEP 445: Add new APIs to customize memory allocators

Victor Stinner 2013-06-15 04:24:25 +02:00
parent a8b84be851
commit f22e1a0a70
1 changed files with 174 additions and 0 deletions

pep-0445.txt Normal file

@ -0,0 +1,174 @@
PEP: 445
Title: Add new APIs to customize memory allocators
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Jun-2013
Python-Version: 3.4
Abstract
========
Add new APIs to customize the memory allocators used by Python.
Rationale
=========
Use cases:
* An application embedding Python may want to use a custom memory allocator
to allocate all Python memory somewhere else or with a different algorithm
* Python running on embedded devices with low memory and a slow CPU.
A custom memory allocator may be required to use memory efficiently
and/or to be able to use all the memory of the device.
* Debug tool to track memory leaks
* Debug tool to detect buffer underflow, buffer overflow and misuse
of Python allocator APIs
* Debug tool to inject bugs, for example to simulate an out-of-memory condition
API:
* Set up a custom memory allocator for all memory allocated by Python
* Hook memory allocator functions to call extra code before and/or after
the underlying allocator function
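The hooking use case can be sketched in standalone C. This is only an illustration under stated assumptions: the PEP does not list the fields of ``PyMemAllocators``, so the ``sketch_allocators_t`` layout (a context pointer plus three functions), ``sketch_hook_t``, ``install_hook()`` and the ``libc_*`` helpers are all hypothetical names. The hook runs code before the underlying call (to simulate out of memory) and after it (to count live blocks).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed layout: a context pointer plus three functions that receive it.
   The PEP names a PyMemAllocators structure but does not list its fields. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void *(*realloc_fn)(void *ctx, void *ptr, size_t new_size);
    void (*free_fn)(void *ctx, void *ptr);
} sketch_allocators_t;

/* Plain libc-backed allocator, standing in for the default one. */
static void *libc_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void *libc_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    return realloc(ptr, new_size ? new_size : 1);
}

static void libc_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* State of the hook: the wrapped allocator plus bookkeeping. */
typedef struct {
    sketch_allocators_t next;  /* allocator that does the real work */
    size_t live_blocks;        /* blocks allocated and not yet freed */
    size_t malloc_calls;       /* malloc requests forwarded so far */
    long fail_after;           /* forward at most this many mallocs; -1 = no limit */
} sketch_hook_t;

static void *hook_malloc(void *ctx, size_t size)
{
    sketch_hook_t *hook = ctx;
    void *ptr;

    /* Before the underlying call: optionally simulate out of memory. */
    if (hook->fail_after >= 0 && hook->malloc_calls >= (size_t)hook->fail_after)
        return NULL;
    hook->malloc_calls++;
    ptr = hook->next.malloc_fn(hook->next.ctx, size);
    /* After the underlying call: track live blocks. */
    if (ptr != NULL)
        hook->live_blocks++;
    return ptr;
}

static void *hook_realloc(void *ctx, void *ptr, size_t new_size)
{
    sketch_hook_t *hook = ctx;
    void *new_ptr = hook->next.realloc_fn(hook->next.ctx, ptr, new_size);

    /* realloc(NULL, n) creates a block; resizing an existing one does not. */
    if (ptr == NULL && new_ptr != NULL)
        hook->live_blocks++;
    return new_ptr;
}

static void hook_free(void *ctx, void *ptr)
{
    sketch_hook_t *hook = ctx;

    if (ptr != NULL)
        hook->live_blocks--;
    hook->next.free_fn(hook->next.ctx, ptr);
}

/* Wrap *allocators in place; *hook must outlive the wrapped table. */
static void install_hook(sketch_allocators_t *allocators, sketch_hook_t *hook,
                         long fail_after)
{
    assert(allocators != NULL && hook != NULL);
    hook->next = *allocators;
    hook->live_blocks = 0;
    hook->malloc_calls = 0;
    hook->fail_after = fail_after;
    allocators->ctx = hook;
    allocators->malloc_fn = hook_malloc;
    allocators->realloc_fn = hook_realloc;
    allocators->free_fn = hook_free;
}
```

A leak detector would inspect ``live_blocks`` at exit, while a fault injector would set ``fail_after`` to the number of allocations allowed to succeed.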
Proposal
========
* Add a new ``PyMemAllocators`` structure
* Add new GIL-free memory allocator functions:
- ``PyMem_RawMalloc()``
- ``PyMem_RawRealloc()``
- ``PyMem_RawFree()``
* Add new functions to get and set memory allocators:
- ``PyMem_GetRawAllocators()``, ``PyMem_SetRawAllocators()``
- ``PyMem_GetAllocators()``, ``PyMem_SetAllocators()``
- ``PyObject_GetAllocators()``, ``PyObject_SetAllocators()``
- ``_PyObject_GetArenaAllocators()``, ``_PyObject_SetArenaAllocators()``
* Add a new function to set up debug hooks after memory allocators have been
replaced:
- ``PyMem_SetupDebugHooks()``
* ``PyObject_Malloc()`` now falls back on ``PyMem_Malloc()`` instead of
``malloc()`` if size is bigger than ``SMALL_REQUEST_THRESHOLD``, and
``PyObject_Realloc()`` falls back on ``PyMem_Realloc()`` instead of
``realloc()``
* ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now always call ``malloc()`` and
``realloc()``, instead of calling ``PyObject_Malloc()`` and
``PyObject_Realloc()`` in debug mode
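The fallback described in the last two items can be pictured with a small standalone sketch. It does not reproduce the code of ``Objects/obmalloc.c``: the threshold value, the bump allocator standing in for the small-block pools, and every ``sketch_*`` name are assumptions made for illustration. Only the allocation path is shown; freeing is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative value only; the real constant is defined in obmalloc.c. */
#define SKETCH_SMALL_REQUEST_THRESHOLD 512

static size_t small_requests;     /* served by the small-block allocator */
static size_t fallback_requests;  /* forwarded to the PyMem-level allocator */

/* Stand-in for PyMem_Malloc(): under the proposal it always calls malloc(). */
static void *sketch_mem_malloc(size_t size)
{
    return malloc(size ? size : 1);
}

/* Stand-in for the small-block pools: a bump allocator over a static arena. */
static max_align_t arena[1024];
static size_t arena_used;  /* in units of sizeof(max_align_t) */

static void *sketch_small_alloc(size_t size)
{
    size_t units = (size + sizeof(max_align_t) - 1) / sizeof(max_align_t);
    void *ptr;

    if (units == 0)
        units = 1;
    if (units > sizeof(arena) / sizeof(arena[0]) - arena_used)
        return NULL;  /* arena exhausted */
    ptr = &arena[arena_used];
    arena_used += units;
    return ptr;
}

/* Stand-in for PyObject_Malloc(): small requests use the pools, everything
   else (or a pool failure) falls back on the PyMem-level allocator. */
static void *sketch_object_malloc(size_t size)
{
    if (size <= SKETCH_SMALL_REQUEST_THRESHOLD) {
        void *ptr = sketch_small_alloc(size);
        if (ptr != NULL) {
            small_requests++;
            return ptr;
        }
    }
    fallback_requests++;
    return sketch_mem_malloc(size);
}
```

Routing the fallback through the PyMem-level function, rather than calling ``malloc()`` directly, is what lets a replaced PyMem allocator also see the large object requests.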
Performance
===========
The Python benchmark suite (``-b 2n3``): some tests are 1.04x faster, some tests
are 1.04x slower, and the reported "significant" value ranges from -191 to 115.
I do not understand this output, but I guess that the overhead cannot be seen
with such tests.
pybench: "+0.1%" (differences between -4.9% and +5.6%).
The full output is attached to issue #3329.
Alternatives
============
Only one get and one set function
---------------------------------
Replace the 6 functions:
* ``PyMem_GetRawAllocators()``
* ``PyMem_GetAllocators()``
* ``PyObject_GetAllocators()``
* ``PyMem_SetRawAllocators(allocators)``
* ``PyMem_SetAllocators(allocators)``
* ``PyObject_SetAllocators(allocators)``
with 2 functions with an additional *domain* argument:
* ``PyMem_GetAllocators(domain)``
* ``PyMem_SetAllocators(domain, allocators)``
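A minimal sketch of this alternative, in standalone C, could look as follows. The domain identifiers, the ``sketch_allocators_t`` layout and the function names are hypothetical; the PEP only states that a *domain* argument would select the allocator family.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed allocator table: a context pointer plus three functions. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void *(*realloc_fn)(void *ctx, void *ptr, size_t new_size);
    void (*free_fn)(void *ctx, void *ptr);
} sketch_allocators_t;

/* Hypothetical domain identifiers, one per API family. */
typedef enum {
    SKETCH_DOMAIN_RAW,  /* PyMem_RawMalloc() family */
    SKETCH_DOMAIN_MEM,  /* PyMem_Malloc() family */
    SKETCH_DOMAIN_OBJ,  /* PyObject_Malloc() family */
    SKETCH_DOMAIN_COUNT
} sketch_domain_t;

static void *libc_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void *libc_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    return realloc(ptr, new_size ? new_size : 1);
}

static void libc_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* One table entry per domain, all starting with the libc-backed default. */
static sketch_allocators_t domain_table[SKETCH_DOMAIN_COUNT] = {
    {NULL, libc_malloc, libc_realloc, libc_free},
    {NULL, libc_malloc, libc_realloc, libc_free},
    {NULL, libc_malloc, libc_realloc, libc_free},
};

static void sketch_get_allocators(sketch_domain_t domain,
                                  sketch_allocators_t *out)
{
    assert((int)domain >= 0 && (int)domain < SKETCH_DOMAIN_COUNT);
    assert(out != NULL);
    *out = domain_table[domain];
}

static void sketch_set_allocators(sketch_domain_t domain,
                                  const sketch_allocators_t *in)
{
    assert((int)domain >= 0 && (int)domain < SKETCH_DOMAIN_COUNT);
    assert(in != NULL);
    domain_table[domain] = *in;
}
```

With this shape, adding a new allocator family means adding an enum value instead of a new pair of functions.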
Setup Builtin Debug Hooks
-------------------------
To be able to use Python debug functions (like ``_PyMem_DebugMalloc()``) even
when a custom memory allocator is set, an environment variable
``PYDEBUGMALLOC`` can be added to set these debug function hooks, instead of
the new function ``PyMem_SetupDebugHooks()``.
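As a rough sketch of how such a variable could be consulted at startup: the rule used here ("set and non-empty enables the hooks") is an assumption, since the PEP does not specify the accepted values, and ``sketch_setup_debug_hooks()`` is a hypothetical stand-in for the real hook installation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int debug_hooks_installed;

/* Hypothetical stand-in for installing the debug hooks. */
static void sketch_setup_debug_hooks(void)
{
    debug_hooks_installed = 1;
}

/* Assumed rule: the flag is on when the variable is set and non-empty. */
static int env_flag_is_set(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* Would be called once during startup, after custom allocators are set. */
static void sketch_init_debug_hooks(void)
{
    if (env_flag_is_set(getenv("PYDEBUGMALLOC")))
        sketch_setup_debug_hooks();
}
```

Compared with an explicit function call, this lets a user enable the hooks without recompiling the embedding application.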
Use macros to get customizable allocators
-----------------------------------------
To have no overhead in the default configuration, customizable allocators would
be an optional feature enabled by a configuration option or by macros.
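One possible shape for this, as a sketch only: ``SKETCH_WITH_CUSTOM_ALLOCATORS`` is a hypothetical build option, not an existing configure flag, and the ``SKETCH_MALLOC`` / ``SKETCH_FREE`` macros are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef SKETCH_WITH_CUSTOM_ALLOCATORS

/* Opt-in build: calls go through a table that can be replaced at run time. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} sketch_allocators_t;

static void *default_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void default_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

static sketch_allocators_t sketch_allocators = {
    NULL, default_malloc, default_free
};

#define SKETCH_MALLOC(size) \
    (sketch_allocators.malloc_fn(sketch_allocators.ctx, (size)))
#define SKETCH_FREE(ptr) \
    (sketch_allocators.free_fn(sketch_allocators.ctx, (ptr)))

#else

/* Default build: the macros expand to direct libc calls, with no indirection. */
#define SKETCH_MALLOC(size) malloc(size)
#define SKETCH_FREE(ptr) free(ptr)

#endif
```

The cost of this approach is that allocators can only be customized in builds that enabled the option.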
Pass the C filename and line number
-----------------------------------
Use C macros using ``__FILE__`` and ``__LINE__`` to get the C filename
and line number of a memory allocation.
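A minimal sketch of the idea, with hypothetical names (``traced_malloc()``, ``TRACED_MALLOC``, ``alloc_site_t``): the macro captures the call site and passes it to a function that records it before allocating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Where the most recent allocation was requested from. */
typedef struct {
    const char *file;
    int line;
    size_t size;
} alloc_site_t;

static alloc_site_t last_site;

static void *traced_malloc(size_t size, const char *file, int line)
{
    assert(file != NULL);
    last_site.file = file;
    last_site.line = line;
    last_site.size = size;
    return malloc(size ? size : 1);
}

/* __FILE__ and __LINE__ expand at the call site, not inside traced_malloc(). */
#define TRACED_MALLOC(size) traced_malloc((size), __FILE__, __LINE__)
```

Only code compiled against the macro reports its location; allocations made through a plain function call would not carry this information.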
No context argument
-------------------
Simplify the signature of allocator functions by removing the context argument:
* ``void* malloc(size_t size)``
* ``void* realloc(void *ptr, size_t new_size)``
* ``void free(void *ptr)``
The context is a convenient way to reuse the same allocator for different APIs
(e.g. PyMem and PyObject).
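To illustrate what the context buys, here is a small sketch in which a single function counts allocations for two APIs. The ``counter_ctx_t`` type and ``counting_malloc()`` are hypothetical; without a context argument, the same result would need one function and one global counter per API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-API state handed back to the allocator on each call. */
typedef struct {
    const char *api_name;  /* label only, e.g. "PyMem" or "PyObject" */
    size_t calls;          /* allocations requested through this API */
} counter_ctx_t;

/* One function serves every API: the context says whose counter to bump. */
static void *counting_malloc(void *ctx, size_t size)
{
    counter_ctx_t *counter = ctx;

    assert(counter != NULL);
    counter->calls++;
    return malloc(size ? size : 1);
}
```

Registering ``counting_malloc`` twice, once with a "PyMem" context and once with a "PyObject" context, keeps the two counts separate.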
PyMem_Malloc() GIL-free
-----------------------
There is no real reason to require the GIL when calling ``PyMem_Malloc()``.
CCP API
-------
XXX To be done (Kristján Valur Jónsson) XXX
Links
=====
Memory allocators:
* `Issue #3329: Add new APIs to customize memory allocators
<http://bugs.python.org/issue3329>`_
* `pytracemalloc
<https://pypi.python.org/pypi/pytracemalloc>`_
* `Meliae: Python Memory Usage Analyzer
<https://pypi.python.org/pypi/meliae>`_
* `Guppy-PE: umbrella package combining Heapy and GSL
<http://guppy-pe.sourceforge.net/>`_
* `PySizer (developed for Python 2.4)
<http://pysizer.8325.org/>`_
Other:
* `Python benchmark suite
<http://hg.python.org/benchmarks>`_