PEP 445: take into account Antoine Pitrou's remarks

Victor Stinner 2013-06-28 22:39:29 +02:00
parent 92027e2fe7
commit 16b2b8f19e
1 changed files with 92 additions and 59 deletions


@ -88,6 +88,8 @@ New Functions and Structures
- ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
- The new allocator must return a distinct non-*NULL* pointer when
requesting zero bytes
- For the ``PYMEM_DOMAIN_RAW`` domain, the allocator must be
thread-safe: the GIL is not held when the allocator is called.
* Add a new ``PyObjectArenaAllocator`` structure::
@ -113,7 +115,9 @@ New Functions and Structures
a memory allocator is replaced:
- ``void PyMem_SetupDebugHooks(void)``
- Install the debug hook on all memory block allocators.
- Install the debug hook on all memory block allocators. The function
can be called more than once; hooks are not reinstalled if they
are already installed.
- The function does nothing if Python is not compiled in debug mode
* Memory allocators always return *NULL* if size is greater than
@ -127,12 +131,13 @@ KB called "arenas".
Default allocators:
* ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``,
``realloc()``, ``free()`` (and *ctx* is NULL); call ``malloc(1)`` when
requesting zero bytes
* ``PYMEM_DOMAIN_OBJ``: *pymalloc* allocator which fall backs on
``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero
bytes
* ``PYMEM_DOMAIN_OBJ``: *pymalloc* allocator which falls back on
``PyMem_Malloc()`` for allocations larger than 512 bytes
* *pymalloc* arena allocator: ``mmap()``, ``munmap()`` (and *ctx* is
NULL), or ``malloc()`` and ``free()`` if ``mmap()`` is not available
* *pymalloc* arena allocator: ``VirtualAlloc()`` and ``VirtualFree()`` on
Windows, ``mmap()`` and ``munmap()`` when available, or ``malloc()``
and ``free()``
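
The platform selection described for the *pymalloc* arena allocator can be sketched as follows. This is a minimal illustration, not CPython's implementation: the names ``sketch_arena_alloc`` and ``sketch_arena_free`` are hypothetical, and the ``(ctx, size)`` and ``(ctx, ptr, size)`` signatures are an assumption about the shape of the ``PyObjectArenaAllocator`` callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#if defined(_WIN32)
#  include <windows.h>
#elif defined(__unix__) || defined(__APPLE__)
#  include <sys/mman.h>
#endif

/* Some platforms spell the anonymous-mapping flag MAP_ANON. */
#if !defined(_WIN32) && !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#  define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical arena allocation callback: returns NULL on failure. */
static void *sketch_arena_alloc(void *ctx, size_t size)
{
    (void)ctx;  /* unused, like the NULL ctx of the default allocator */
#if defined(_WIN32)
    return VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE,
                        PAGE_READWRITE);
#elif defined(MAP_ANONYMOUS)
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (ptr == MAP_FAILED) ? NULL : ptr;
#else
    return malloc(size);  /* last resort when no mapping API exists */
#endif
}

/* Hypothetical arena release callback: size is needed by munmap(). */
static void sketch_arena_free(void *ctx, void *ptr, size_t size)
{
    (void)ctx;
#if defined(_WIN32)
    (void)size;
    VirtualFree(ptr, 0, MEM_RELEASE);
#elif defined(MAP_ANONYMOUS)
    int rc = munmap(ptr, size);
    assert(rc == 0);
    (void)rc;
#else
    (void)size;
    free(ptr);
#endif
}
```

The release callback receives the size because ``munmap()`` requires it, whereas ``VirtualFree()`` and ``free()`` do not.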
Redesign Debug Checks on Memory Allocators as Hooks
@ -148,23 +153,30 @@ allocators in debug mode:
* Detect write before the start of the buffer (buffer underflow)
* Detect write after the end of the buffer (buffer overflow)
In Python 3.3, the checks are installed by replacing
``PYMEM_DOMAIN_MEM`` and ``PYMEM_DOMAIN_OBJ`` allocators, the previous
allocator is no more called. The new allocator is the same for both
domains: ``PyMem_Malloc()`` and ``PyMem_Realloc()`` call indirectly
``PyObject_Malloc()`` and ``PyObject_Realloc()``.
In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``,
``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``,
``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new
allocator allocates a larger buffer and writes a pattern to detect buffer
underflow and overflow. It uses the original ``PyObject_Malloc()``
function to allocate memory. So ``PyMem_Malloc()`` and
``PyMem_Realloc()`` indirectly call ``PyObject_Malloc()`` and
``PyObject_Realloc()``.
This PEP redesigns the debug checks as hooks on the existing allocators
in debug mode. Examples of call traces without the hooks:
* ``PyMem_Malloc()`` => ``_PyMem_RawMalloc()`` => ``malloc()``
* ``PyMem_RawMalloc()`` => ``_PyMem_RawMalloc()`` => ``malloc()``
* ``PyMem_Realloc()`` => ``_PyMem_RawRealloc()`` => ``realloc()``
* ``PyObject_Free()`` => ``_PyObject_Free()``
Call traces when the hooks are installed (debug mode):
* ``PyMem_Malloc()`` => ``_PyMem_DebugMalloc()`` =>
``_PyMem_RawMalloc()`` => ``malloc()``
* ``PyObject_Free()`` => ``_PyMem_DebugFree()`` => ``_PyObject_Free()``
* ``PyMem_RawMalloc()`` => ``_PyMem_DebugMalloc()``
=> ``_PyMem_RawMalloc()`` => ``malloc()``
* ``PyMem_Realloc()`` => ``_PyMem_DebugRealloc()``
=> ``_PyMem_RawRealloc()`` => ``realloc()``
* ``PyObject_Free()`` => ``_PyMem_DebugFree()``
=> ``_PyObject_Free()``
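
The layering in these call traces (a debug function wrapping whatever allocator was installed before it) can be illustrated with a small standalone sketch. Everything here is hypothetical: ``sketch_allocator`` is a reduced stand-in for the PEP's allocator structure, the guard size and the ``0xFB`` fill byte are arbitrary choices, and the code does not reproduce CPython's actual debug hooks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the PEP's allocator structure
   (the realloc member is omitted to keep the sketch short). */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void (*free)(void *ctx, void *ptr);
} sketch_allocator;

#define SKETCH_GUARD   sizeof(size_t)        /* guard bytes on each side */
#define SKETCH_HEADER  (2 * sizeof(size_t))  /* stored size + leading guard */
#define SKETCH_PATTERN 0xFB                  /* arbitrary fill byte */

/* Innermost layer: plain malloc()/free(), never asked for 0 bytes. */
static void *raw_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void raw_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* Return 1 if both guard areas around ptr still hold the pattern. */
static int sketch_guards_intact(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *base = user - SKETCH_HEADER;
    size_t size, i;

    memcpy(&size, base, sizeof size);
    for (i = 0; i < SKETCH_GUARD; i++) {
        if (base[sizeof(size_t) + i] != SKETCH_PATTERN)
            return 0;  /* write before the start (underflow) */
        if (user[size + i] != SKETCH_PATTERN)
            return 0;  /* write after the end (overflow) */
    }
    return 1;
}

/* Hook layer: ctx points to the allocator being wrapped. */
static void *debug_malloc(void *ctx, size_t size)
{
    sketch_allocator *next = ctx;
    unsigned char *base;

    if (size > SIZE_MAX - SKETCH_HEADER - SKETCH_GUARD)
        return NULL;
    base = next->malloc(next->ctx, SKETCH_HEADER + size + SKETCH_GUARD);
    if (base == NULL)
        return NULL;
    memcpy(base, &size, sizeof size);
    memset(base + sizeof(size_t), SKETCH_PATTERN, SKETCH_GUARD);
    memset(base + SKETCH_HEADER + size, SKETCH_PATTERN, SKETCH_GUARD);
    return base + SKETCH_HEADER;
}

static void debug_free(void *ctx, void *ptr)
{
    sketch_allocator *next = ctx;

    if (ptr == NULL)
        return;
    assert(sketch_guards_intact(ptr));
    next->free(next->ctx, (unsigned char *)ptr - SKETCH_HEADER);
}
```

Because the hook only reaches the wrapped allocator through its *ctx* pointer, the same hook functions can sit on top of any allocator that exposes the same callbacks.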
As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now always call
``malloc()`` and ``realloc()``, instead of calling ``PyObject_Malloc()``
@ -212,8 +224,8 @@ and 10 bytes per memory mapping::
#include <stdlib.h>
int alloc_padding = 2;
int arena_padding = 10;
size_t alloc_padding = 2;
size_t arena_padding = 10;
void* my_malloc(void *ctx, size_t size)
{
@ -264,10 +276,6 @@ and 10 bytes per memory mapping::
PyMem_SetupDebugHooks();
}
.. warning::
Remove the call ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc)`` if
the new allocator is not thread-safe.
Use case 2: Replace Memory Allocator, override pymalloc
--------------------------------------------------------
@ -280,7 +288,7 @@ Dummy example wasting 2 bytes per memory block::
#include <stdlib.h>
int padding = 2;
size_t padding = 2;
void* my_malloc(void *ctx, size_t size)
{
@ -314,11 +322,6 @@ Dummy example wasting 2 bytes per memory block::
PyMem_SetupDebugHooks();
}
.. warning::
Remove the call ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc)`` if
the new allocator is not thread-safe.
Use case 3: Setup Allocator Hooks
---------------------------------
@ -386,10 +389,6 @@ Example to setup hooks on all memory allocators::
PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
}
.. warning::
Remove the call ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc)`` if
hooks are not thread-safe.
.. note::
``PyMem_SetupDebugHooks()`` does not need to be called because the
allocator is not replaced: Python debug hooks are installed
@ -409,8 +408,8 @@ Results of pybench benchmark: "+0.1%" slower globally (diff between
The full reports are attached to the issue #3329.
Alternatives
============
Rejected Alternatives
=====================
More specific functions to get/set memory allocators
----------------------------------------------------
@ -429,13 +428,20 @@ with:
* ``void PyMem_SetAllocator(PyMemAllocator *allocator)``
* ``void PyObject_SetAllocator(PyMemAllocator *allocator)``
With more specific functions, it becomes more difficult to write generic
code, like reusing the same code for different allocator domains.
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
------------------------------------------------------
``PyMem_Malloc()`` should call ``PyMem_RawMalloc()`` by default. So
calling ``PyMem_SetRawAllocator()`` would also patch
``PyMem_Malloc()`` indirectly.
If ``PyMem_Malloc()`` called ``PyMem_RawMalloc()`` by default,
calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also
patch ``PyMem_Malloc()`` indirectly.
This option was rejected because ``PyMem_SetAllocator()`` would have a
different behaviour depending on the domain. Always having the same
behaviour is less error-prone.
Add a new PYDEBUGMALLOC environment variable
@ -449,7 +455,7 @@ If the environment variable is present, ``PyMem_SetRawAllocator()``,
``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()`` will
automatically reinstall the hook on top of the new allocator.
An new environment variable would make the Python initialization even
A new environment variable would make the Python initialization even
more complex. The `PEP 432 <http://www.python.org/dev/peps/pep-0432/>`_
tries to simplify the CPython startup sequence.
@ -461,8 +467,10 @@ To have no overhead in the default configuration, customizable
allocators would be an optional feature enabled by a configuration
option or by macros.
Not having to recompile Python makes debug hooks easier to use in
practice. Extensions modules don't have to be recompiled with macros.
This alternative was rejected because the usage of macros implies having
to recompile extension modules to use the new allocator and allocator
hooks. Not having to recompile either Python or extension modules makes
debug hooks easier to use in practice.
Pass the C filename and line number
@ -501,12 +509,15 @@ Example of ``PyMem_Malloc`` macro with the modified
#define PyMem_Malloc(size) \
_PyMem_MallocTrace(__FILE__, __LINE__, size)
Passing a filename and a line number to each allocator makes the API more
complex: pass 3 new arguments, instead of just a context argument, to each
allocator function. The GC allocator functions should also be patched.
For example, ``_PyObject_GC_Malloc()`` is used in many C functions and so
objects of different types would have the same allocation location. Such
changes add too much complexity for little gain.
The GC allocator functions would also have to be patched. For example,
``_PyObject_GC_Malloc()`` is used in many C functions and so objects of
different types would have the same allocation location.
This alternative was rejected because passing a filename and a line
number to each allocator makes the API more complex: pass 3 new
arguments (ctx, filename, lineno) to each allocator function, instead of
just a context argument (ctx). Having to also modify the GC allocator
functions adds too much complexity for little gain.
GIL-free PyMem_Malloc()
@ -520,15 +531,16 @@ This PEP proposes changes ``PyMem_Malloc()``: it now always call
``malloc()``. The "GIL must be held" restriction can be removed from
``PyMem_Malloc()``.
Allowing ``PyMem_Malloc()`` to be called without holding the GIL might break
applications which set up their own allocators or allocator hooks.
Holding the GIL is convenient to develop a custom allocator: no need to
care about other threads. It is also convenient for a debug allocator hook:
Python internal objects can be safely inspected.
This alternative was rejected because allowing ``PyMem_Malloc()`` to be
called without holding the GIL might break applications which set up
their own allocators or allocator hooks. Holding the GIL is convenient
to develop a custom allocator: there is no need to care about other
threads. It is also convenient for a debug allocator hook: Python
internal objects can be safely inspected.
Calling ``PyGILState_Ensure()`` in a memory allocator may have
unexpected behaviour, especially at Python startup and at creation of a
new Python thread state.
Calling ``PyGILState_Ensure()`` in
a memory allocator may have unexpected behaviour, especially at Python
startup and at creation of a new Python thread state.
Don't add PyMem_RawMalloc()
@ -565,15 +577,17 @@ location of the memory allocation: the C backtrace is usually useless
Python filename and line number (or even the Python traceback) is more
useful.
Classic tools are unable to introspect Python internals to collect such
information. Being able to set up a hook on allocators called with the
GIL held allows collecting a lot of useful data from Python internals.
This alternative was rejected because classic tools are unable to
introspect Python internals to collect such information. Being able to
set up a hook on allocators called with the GIL held allows collecting a
lot of useful data from Python internals.
Add msize()
-----------
Add a msize() function
----------------------
Add another field to ``PyMemAllocator`` and ``PyObjectArenaAllocator``::
Add another field to ``PyMemAllocator`` and ``PyObjectArenaAllocator``
structures::
size_t msize(void *ptr);
@ -584,6 +598,19 @@ is unknown (ex: NULL pointer).
On Windows, this function can be implemented using ``_msize()`` and
``VirtualQuery()``.
The function can be used to implement a hook tracking the memory usage.
The ``free()`` method of an allocator only gets the address of a memory
block, whereas the size of the memory block is required to update the
memory usage.
The additional ``msize()`` function was rejected because only a few
platforms implement it. For example, Linux with the GNU libc does not
provide a function to get the size of a memory block. ``msize()`` is not
currently used in the Python source code. The function is only used to
track the memory usage, but makes the API more complex. A debug hook can
implement the function internally, so there is no need to add it to
``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures.
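
As a rough illustration of how a hook could keep track of block sizes internally, the sketch below stores the requested size in a small header placed in front of each block. All names (``track_state``, ``track_malloc()``, ``track_msize()`` and so on) are hypothetical, the hook calls ``malloc()`` directly instead of wrapping a previously installed allocator, and returning 0 for a NULL pointer is just a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical hook state, passed to the hook as its ctx argument. */
typedef struct {
    size_t in_use;  /* bytes currently allocated through the hook */
} track_state;

/* Header stored before each block; the union keeps the user pointer
   suitably aligned for common types. */
typedef union {
    size_t size;
    long double ld;
    long long ll;
    void *p;
} track_header;

static void *track_malloc(void *ctx, size_t size)
{
    track_state *st = ctx;
    track_header *h;

    if (size > SIZE_MAX - sizeof(track_header))
        return NULL;
    /* The header makes the request non-empty even when size is 0,
       so zero-byte requests get distinct non-NULL pointers. */
    h = malloc(sizeof(track_header) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    st->in_use += size;
    return h + 1;
}

/* Internal msize(): recover the size recorded by track_malloc(). */
static size_t track_msize(void *ptr)
{
    return ptr ? ((track_header *)ptr - 1)->size : 0;
}

static void *track_realloc(void *ctx, void *ptr, size_t new_size)
{
    track_state *st = ctx;
    track_header *h;
    size_t old_size;

    if (ptr == NULL)
        return track_malloc(ctx, new_size);
    if (new_size > SIZE_MAX - sizeof(track_header))
        return NULL;
    old_size = track_msize(ptr);
    h = realloc((track_header *)ptr - 1, sizeof(track_header) + new_size);
    if (h == NULL)
        return NULL;  /* the original block is left untouched */
    h->size = new_size;
    st->in_use = st->in_use - old_size + new_size;
    return h + 1;
}

static void track_free(void *ctx, void *ptr)
{
    track_state *st = ctx;

    if (ptr == NULL)
        return;
    assert(st->in_use >= track_msize(ptr));
    st->in_use -= track_msize(ptr);
    free((track_header *)ptr - 1);
}
```

The cost of this approach is one header per block, which is why it suits a debug or profiling hook rather than a default allocator.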
No context argument
-------------------
@ -721,3 +748,9 @@ Projects analyzing the memory usage of Python applications:
* `PySizer (developed for Python 2.4)
<http://pysizer.8325.org/>`_
Copyright
=========
This document has been placed into the public domain.