Some tweaks
parent 41d43d2d53
commit 722bcf3c16

pep-0445.txt | 128
@@ -13,7 +13,9 @@ Abstract
 ========

 This PEP proposes new Application Programming Interfaces (API) to customize
-Python memory allocators.
+Python memory allocators. The only implementation required to conform to
+this PEP is CPython, but other implementations may choose to be compatible,
+or to re-use a similar scheme.


 Rationale
@@ -123,11 +125,12 @@ New Functions and Structures
 ``PY_SSIZE_T_MAX``. The check is done before calling the inner
 function.

-The *pymalloc* allocator is optimized for objects smaller than 512 bytes
-with a short lifetime. It uses memory mappings with a fixed size of 256
-KB called "arenas".
+.. note::
+    The *pymalloc* allocator is optimized for objects smaller than 512 bytes
+    with a short lifetime. It uses memory mappings with a fixed size of 256
+    KB called "arenas".

-Default allocators:
+Here is how the allocators are set up by default:

 * ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``,
   ``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero
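The two behaviours visible in the hunk above (a size check done before calling the inner function, and ``malloc(1)`` for zero-byte requests) can be sketched in plain C. This is a minimal illustration and not the CPython source: the name ``raw_malloc_checked`` is hypothetical, and ``PTRDIFF_MAX`` is used as a stand-in for ``PY_SSIZE_T_MAX`` on the assumption that both describe the largest signed size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper, not part of the Python C API. */
void *raw_malloc_checked(size_t size)
{
    /* Assumption: PTRDIFF_MAX plays the role of PY_SSIZE_T_MAX here.
       The check happens before the inner allocator is called. */
    if (size > (size_t)PTRDIFF_MAX) {
        return NULL;
    }
    /* malloc(0) may legally return NULL; ask for one byte instead so that
       a NULL result always means "allocation failed". */
    if (size == 0) {
        size = 1;
    }
    return malloc(size);
}
```

With this wrapper, a request for ``SIZE_MAX`` bytes is refused without reaching ``malloc()``, while a zero-byte request normally yields a distinct pointer that can be passed to ``free()``.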
@@ -155,11 +158,11 @@ allocators in debug mode:
 In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``,
 ``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``,
 ``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new
-allocator allocates a larger buffer and write a pattern to detect buffer
-underflow, buffer overflow and use after free (fill the buffer with the
-pattern ``0xDB``). It uses the original ``PyObject_Malloc()``
+allocator allocates a larger buffer and writes a pattern to detect buffer
+underflow, buffer overflow and use after free (by filling the buffer with
+the byte ``0xDB``). It uses the original ``PyObject_Malloc()``
 function to allocate memory. So ``PyMem_Malloc()`` and
-``PyMem_Realloc()`` call indirectly ``PyObject_Malloc()`` and
+``PyMem_Realloc()`` indirectly call``PyObject_Malloc()`` and
 ``PyObject_Realloc()``.

 This PEP redesigns the debug checks as hooks on the existing allocators
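The debug technique described in the hunk above (allocate a larger buffer, write a pattern around it, fill freed memory with ``0xDB``) can be sketched as a standalone wrapper over ``malloc()``. This is an illustration under stated assumptions, not the CPython implementation: the function names, the 16-byte header, the 8-byte tail and the guard value ``0xFB`` are all hypothetical choices; only the ``0xDB`` fill byte comes from the text.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_SIZE 16    /* assumption: size field followed by guard bytes */
#define TAIL_SIZE   8     /* assumption: guard bytes after the user data */
#define GUARD_BYTE  0xFB  /* assumption: pattern for the guard zones */
#define DEAD_BYTE   0xDB  /* pattern written over freed memory (from the text) */

/* Layout: [size | guard ... ][ user data ][ guard ... ] */
void *debug_malloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - HEADER_SIZE - TAIL_SIZE) {
        return NULL;
    }
    raw = malloc(HEADER_SIZE + size + TAIL_SIZE);
    if (raw == NULL) {
        return NULL;
    }
    memcpy(raw, &size, sizeof(size));
    memset(raw + sizeof(size), GUARD_BYTE, HEADER_SIZE - sizeof(size));
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, TAIL_SIZE);
    return raw + HEADER_SIZE;
}

/* Return 1 if both guard zones are intact, 0 on underflow or overflow. */
int debug_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - HEADER_SIZE;
    size_t size, i;

    memcpy(&size, raw, sizeof(size));
    for (i = sizeof(size); i < HEADER_SIZE; i++) {
        if (raw[i] != GUARD_BYTE) {
            return 0;   /* buffer underflow */
        }
    }
    for (i = 0; i < TAIL_SIZE; i++) {
        if (user[size + i] != GUARD_BYTE) {
            return 0;   /* buffer overflow */
        }
    }
    return 1;
}

void debug_free(void *ptr)
{
    unsigned char *user = ptr;
    size_t size;

    if (ptr == NULL) {
        return;
    }
    assert(debug_check(ptr));
    memcpy(&size, user - HEADER_SIZE, sizeof(size));
    /* A later read of this block sees 0xDB bytes until it is reused. */
    memset(user, DEAD_BYTE, size);
    free(user - HEADER_SIZE);
}
```

Writing one byte past the end of a block returned by ``debug_malloc()`` lands in the tail guard, so ``debug_check()`` reports it; the same holds for a write just before the block.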
@@ -179,7 +182,7 @@ Call traces when the hooks are installed (debug mode):
 => ``_PyObject_Free()``

 As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now call
-``malloc()`` and ``realloc()`` in release mode and in debug mode,
+``malloc()`` and ``realloc()`` in both release mode and debug mode,
 instead of calling ``PyObject_Malloc()`` and ``PyObject_Realloc()`` in
 debug mode.

@@ -199,19 +202,15 @@ Don't call malloc() directly anymore
 Direct calls to ``malloc()`` are replaced with ``PyMem_Malloc()``, or
 ``PyMem_RawMalloc()`` if the GIL is not held.

-Configure external libraries like zlib or OpenSSL to allocate memory
+External libraries like zlib or OpenSSL can be configured to allocate memory
 using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a
-library can only be replaced globally, the allocator is not replaced if
-Python is embedded in an application.
+library can only be replaced globally (rather than on an object-by-object
+basis), it shouldn't be replaced when Python is embedded in an application.

 For the "track memory usage" use case, it is important to track memory
 allocated in external libraries to have accurate reports, because these
-allocations can be large (can raise a ``MemoryError`` exception).
-
-If an hook is used to the track memory usage, the memory allocated by
-direct calls to ``malloc()`` will not be tracked. Remaining ``malloc()``
-in external libraries like OpenSSL or bz2 can allocate large memory
-blocks and so would be missed in memory usage reports.
+allocations can be large (e.g. they can raise a ``MemoryError`` exception)
+and would otherwise be missed in memory usage reports.


 Examples
@@ -282,9 +281,9 @@ and 10 bytes per *pymalloc* arena::
 Use case 2: Replace Memory Allocators, override pymalloc
 --------------------------------------------------------

-If your allocator is optimized for allocations of objects smaller than
-512 bytes with a short lifetime, pymalloc can be overriden (replace
-``PyObject_Malloc()``).
+If you have a dedicated allocator optimized for allocations of objects
+smaller than 512 bytes with a short lifetime, pymalloc can be overriden
+(replace ``PyObject_Malloc()``).

 Dummy example wasting 2 bytes per memory block::

@@ -420,12 +419,8 @@ Rejected Alternatives
 More specific functions to get/set memory allocators
 ----------------------------------------------------

-Replace the 2 functions:
-
-* ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
-* ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
-
-with:
+It was originally proposed a larger set of C API functions, with one pair
+of functions for each allocator domain:

 * ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)``
 * ``void PyMem_GetAllocator(PyMemAllocator *allocator)``
@@ -442,8 +437,8 @@ each memory allocator domain.
 Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
 ------------------------------------------------------

-If ``PyMem_Malloc()`` would call ``PyMem_RawMalloc()`` by default,
-calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also also
+If ``PyMem_Malloc()`` called ``PyMem_RawMalloc()`` by default,
+calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also
 patch ``PyMem_Malloc()`` indirectly.

 This alternative was rejected because ``PyMem_SetAllocator()`` would
@@ -454,17 +449,17 @@ same behaviour is less error-prone.
 Add a new PYDEBUGMALLOC environment variable
 --------------------------------------------

-Add a new ``PYDEBUGMALLOC`` environment variable to enable debug checks
-on memory block allocators. The environment variable replaces the new
-function ``PyMem_SetupDebugHooks()`` which is not needed anymore.
-Another advantage is to allow to enable debug checks even in release
-mode: debug checks are always compiled, but only enabled when the
-environment variable is present and non-empty.
+It was proposed to add a new ``PYDEBUGMALLOC`` environment variable to
+enable debug checks on memory block allocators. It would have had the same
+effect as calling the ``PyMem_SetupDebugHooks()``, without the need
+to write any C code. Another advantage is to allow to enable debug checks
+even in release mode: debug checks would always be compiled in, but only
+enabled when the environment variable is present and non-empty.

 This alternative was rejected because a new environment variable would
-make the Python initialization even more complex. The `PEP 432
-<http://www.python.org/dev/peps/pep-0432/>`_ tries to simply the CPython
-startup sequence.
+make Python initialization even more complex. `PEP 432
+<http://www.python.org/dev/peps/pep-0432/>`_ tries to simplify the
+CPython startup sequence.


 Use macros to get customizable allocators
@@ -474,7 +469,7 @@ To have no overhead in the default configuration, customizable
 allocators would be an optional feature enabled by a configuration
 option or by macros.

-This alternative was rejected because the usage of macros implies having
+This alternative was rejected because the use of macros implies having
 to recompile extensions modules to use the new allocator and allocator
 hooks. Not having to recompile Python nor extension modules makes debug
 hooks easier to use in practice.
@@ -518,12 +513,12 @@ Example of ``PyMem_Malloc`` macro with the modified

 The GC allocator functions would also have to be patched. For example,
 ``_PyObject_GC_Malloc()`` is used in many C functions and so objects of
-differenet types would have the same allocation location.
+different types would have the same allocation location.

 This alternative was rejected because passing a filename and a line
 number to each allocator makes the API more complex: pass 3 new
 arguments (ctx, filename, lineno) to each allocator function, instead of
-just a context argument (ctx). Having to modify also GC allocator
+just a context argument (ctx). Having to also modify GC allocator
 functions adds too much complexity for a little gain.


@@ -531,23 +526,25 @@ GIL-free PyMem_Malloc()
 -----------------------

 In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()``
-calls indirectly ``PyObject_Malloc()`` which requires the GIL to be
-held. That's why ``PyMem_Malloc()`` must be called with the GIL held.
+indirectly calls ``PyObject_Malloc()`` which requires the GIL to be
+held (it isn't thread-safe). That's why ``PyMem_Malloc()`` must be called
+with the GIL held.

-This PEP changes ``PyMem_Malloc()``: it now always call ``malloc()``.
-The "GIL must be held" restriction could be removed from
-``PyMem_Malloc()``.
+This PEP changes ``PyMem_Malloc()``: it now always calls ``malloc()``
+rather than ``PyObject_Malloc()``. The "GIL must be held" restriction
+could therefore be removed from ``PyMem_Malloc()``.

 This alternative was rejected because allowing to call
 ``PyMem_Malloc()`` without holding the GIL can break applications
 which setup their own allocators or allocator hooks. Holding the GIL is
-convinient to develop a custom allocator: no need to care of other
-threads. It is also convinient for a debug allocator hook: Python
-internal objects can be safetly inspected.
+convenient to develop a custom allocator: no need to care about other
+threads. It is also convenient for a debug allocator hook: Python
+objects can be safely inspected, and the C API may be used for reporting.

-Calling ``PyGILState_Ensure()`` in
-a memory allocator has unexpected behaviour, especially at Python
-startup and at creation of a new Python thread state.
+Moreover, calling ``PyGILState_Ensure()`` in a memory allocator has
+unexpected behaviour, especially at Python startup and when creating of a
+new Python thread state. It is better to free custom allocators of
+the responsibility of acquiring the GIL.


 Don't add PyMem_RawMalloc()
@@ -566,13 +563,14 @@ This alternative was rejected because ``PyMem_RawMalloc()`` is required
 for accurate reports of the memory usage. When a debug hook is used to
 track the memory usage, the memory allocated by direct calls to
 ``malloc()`` cannot be tracked. ``PyMem_RawMalloc()`` can be hooked and
-so all the memory allocated by Python can be tracked.
+so all the memory allocated by Python can be tracked, including
+memory allocated without holding the GIL.


-Use existing debug tools to analyze the memory
+Use existing debug tools to analyze memory use
 ----------------------------------------------

-There are many existing debug tools to analyze the memory. Some
+There are many existing debug tools to analyze memory use. Some
 examples: `Valgrind <http://valgrind.org/>`_, `Purify
 <http://ibm.com/software/awdtools/purify/>`_, `Clang AddressSanitizer
 <http://code.google.com/p/address-sanitizer/>`_, `failmalloc
@@ -580,14 +578,14 @@ examples: `Valgrind <http://valgrind.org/>`_, `Purify

 The problem is to retrieve the Python object related to a memory pointer
 to read its type and/or its content. Another issue is to retrieve the
-location of the memory allocation: the C backtrace is usually useless
+source of the memory allocation: the C backtrace is usually useless
 (same reasoning than macros using ``__FILE__`` and ``__LINE__``, see
 `Pass the C filename and line number`_), the Python filename and line
 number (or even the Python traceback) is more useful.

 This alternative was rejected because classic tools are unable to
 introspect Python internals to collect such information. Being able to
-setup a hook on allocators called with the GIL held allow to collect a
+setup a hook on allocators called with the GIL held allows to collect a
 lot of useful data from Python internals.


@@ -606,7 +604,7 @@ is unknown (ex: NULL pointer).
 On Windows, this function can be implemented using ``_msize()`` and
 ``VirtualQuery()``.

-The function can be used to implement an hook tracking the memory usage.
+The function can be used to implement a hook tracking the memory usage.
 The ``free()`` method of an allocator only gets the address of a memory
 block, whereas the size of the memory block is required to update the
 memory usage.
@@ -614,9 +612,9 @@ memory usage.
 The additional ``msize()`` function was rejected because only few
 platforms implement it. For example, Linux with the GNU libc does not
 provide a function to get the size of a memory block. ``msize()`` is not
-currently used in the Python source code. The function is only used to
-track the memory usage, and makes the API more complex. A debug hook can
-implement the function internally, there is no need to add it to
+currently used in the Python source code. The function would only be
+used to track memory use, and make the API more complex. A debug hook
+can implement the function internally, there is no need to add it to
 ``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures.


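The remark in the hunk above that a debug hook can implement the size lookup internally can be illustrated with a small standalone tracker. This is a sketch under assumptions, not code from the PEP: the ``tracking_*`` names and the 16-byte header are hypothetical, and the global counter is not thread-safe. Each block stores its own size in a hidden header, so ``free()`` can update the usage counter without any ``msize()`` function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 16 bytes keeps the user pointer aligned on common platforms. */
#define TRACK_HEADER 16

/* Bytes currently allocated through the tracker (not thread-safe). */
static size_t tracked_bytes = 0;

void *tracking_malloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - TRACK_HEADER) {
        return NULL;
    }
    raw = malloc(TRACK_HEADER + size);
    if (raw == NULL) {
        return NULL;
    }
    /* Remember the requested size in front of the user data. */
    memcpy(raw, &size, sizeof(size));
    tracked_bytes += size;
    return raw + TRACK_HEADER;
}

void tracking_free(void *ptr)
{
    unsigned char *raw;
    size_t size;

    if (ptr == NULL) {
        return;
    }
    raw = (unsigned char *)ptr - TRACK_HEADER;
    /* free() only receives the address; the size comes from the header. */
    memcpy(&size, raw, sizeof(size));
    tracked_bytes -= size;
    free(raw);
}

size_t tracking_usage(void)
{
    return tracked_bytes;
}
```

The cost of this design is a fixed per-block overhead, which is the trade-off accepted in place of a platform-specific size query.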
@@ -733,9 +731,9 @@ Other allocators:
 <https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
 efficient way to allocate groups of equal-sized chunks of memory

-This PEP permits to choose exactly which memory allocator is used for your
-application depending on its usage of the memory (number of allocation, size of
-allocations, lifetime of objects, etc.).
+This PEP allows to choose exactly which memory allocator is used for your
+application depending on its usage of the memory (number of allocations,
+size of allocations, lifetime of objects, etc.).


 Links