Some tweaks
parent
41d43d2d53
commit
722bcf3c16
128
pep-0445.txt
|
@ -13,7 +13,9 @@ Abstract
|
||||||
========
|
========
|
||||||
|
|
||||||
This PEP proposes new Application Programming Interfaces (API) to customize
|
This PEP proposes new Application Programming Interfaces (API) to customize
|
||||||
Python memory allocators.
|
Python memory allocators. The only implementation required to conform to
|
||||||
|
this PEP is CPython, but other implementations may choose to be compatible,
|
||||||
|
or to re-use a similar scheme.
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
@ -123,11 +125,12 @@ New Functions and Structures
|
||||||
``PY_SSIZE_T_MAX``. The check is done before calling the inner
|
``PY_SSIZE_T_MAX``. The check is done before calling the inner
|
||||||
function.
|
function.
|
||||||
|
|
||||||
The *pymalloc* allocator is optimized for objects smaller than 512 bytes
|
.. note::
|
||||||
with a short lifetime. It uses memory mappings with a fixed size of 256
|
The *pymalloc* allocator is optimized for objects smaller than 512 bytes
|
||||||
KB called "arenas".
|
with a short lifetime. It uses memory mappings with a fixed size of 256
|
||||||
|
KB called "arenas".
|
||||||
|
|
||||||
Default allocators:
|
Here is how the allocators are set up by default:
|
||||||
|
|
||||||
* ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``,
|
* ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``,
|
||||||
``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero
|
``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero
|
||||||
|
@ -155,11 +158,11 @@ allocators in debug mode:
|
||||||
In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``,
|
In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``,
|
||||||
``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``,
|
``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``,
|
||||||
``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new
|
``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new
|
||||||
allocator allocates a larger buffer and write a pattern to detect buffer
|
allocator allocates a larger buffer and writes a pattern to detect buffer
|
||||||
underflow, buffer overflow and use after free (fill the buffer with the
|
underflow, buffer overflow and use after free (by filling the buffer with
|
||||||
pattern ``0xDB``). It uses the original ``PyObject_Malloc()``
|
the byte ``0xDB``). It uses the original ``PyObject_Malloc()``
|
||||||
function to allocate memory. So ``PyMem_Malloc()`` and
|
function to allocate memory. So ``PyMem_Malloc()`` and
|
||||||
``PyMem_Realloc()`` call indirectly ``PyObject_Malloc()`` and
|
``PyMem_Realloc()`` indirectly call ``PyObject_Malloc()`` and
|
||||||
``PyObject_Realloc()``.
|
``PyObject_Realloc()``.
|
||||||
|
|
||||||
This PEP redesigns the debug checks as hooks on the existing allocators
|
This PEP redesigns the debug checks as hooks on the existing allocators
|
||||||
|
@ -179,7 +182,7 @@ Call traces when the hooks are installed (debug mode):
|
||||||
=> ``_PyObject_Free()``
|
=> ``_PyObject_Free()``
|
||||||
|
|
||||||
As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now call
|
As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now call
|
||||||
``malloc()`` and ``realloc()`` in release mode and in debug mode,
|
``malloc()`` and ``realloc()`` in both release mode and debug mode,
|
||||||
instead of calling ``PyObject_Malloc()`` and ``PyObject_Realloc()`` in
|
instead of calling ``PyObject_Malloc()`` and ``PyObject_Realloc()`` in
|
||||||
debug mode.
|
debug mode.
|
||||||
|
|
||||||
|
@ -199,19 +202,15 @@ Don't call malloc() directly anymore
|
||||||
Direct calls to ``malloc()`` are replaced with ``PyMem_Malloc()``, or
|
Direct calls to ``malloc()`` are replaced with ``PyMem_Malloc()``, or
|
||||||
``PyMem_RawMalloc()`` if the GIL is not held.
|
``PyMem_RawMalloc()`` if the GIL is not held.
|
||||||
|
|
||||||
Configure external libraries like zlib or OpenSSL to allocate memory
|
External libraries like zlib or OpenSSL can be configured to allocate memory
|
||||||
using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a
|
using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a
|
||||||
library can only be replaced globally, the allocator is not replaced if
|
library can only be replaced globally (rather than on an object-by-object
|
||||||
Python is embedded in an application.
|
basis), it shouldn't be replaced when Python is embedded in an application.
|
||||||
|
|
||||||
For the "track memory usage" use case, it is important to track memory
|
For the "track memory usage" use case, it is important to track memory
|
||||||
allocated in external libraries to have accurate reports, because these
|
allocated in external libraries to have accurate reports, because these
|
||||||
allocations can be large (can raise a ``MemoryError`` exception).
|
allocations can be large (e.g. they can raise a ``MemoryError`` exception)
|
||||||
|
and would otherwise be missed in memory usage reports.
|
||||||
If an hook is used to the track memory usage, the memory allocated by
|
|
||||||
direct calls to ``malloc()`` will not be tracked. Remaining ``malloc()``
|
|
||||||
in external libraries like OpenSSL or bz2 can allocate large memory
|
|
||||||
blocks and so would be missed in memory usage reports.
|
|
||||||
|
|
||||||
|
|
||||||
Examples
|
Examples
|
||||||
|
@ -282,9 +281,9 @@ and 10 bytes per *pymalloc* arena::
|
||||||
Use case 2: Replace Memory Allocators, override pymalloc
|
Use case 2: Replace Memory Allocators, override pymalloc
|
||||||
--------------------------------------------------------
|
--------------------------------------------------------
|
||||||
|
|
||||||
If your allocator is optimized for allocations of objects smaller than
|
If you have a dedicated allocator optimized for allocations of objects
|
||||||
512 bytes with a short lifetime, pymalloc can be overriden (replace
|
smaller than 512 bytes with a short lifetime, pymalloc can be overridden
|
||||||
``PyObject_Malloc()``).
|
(replace ``PyObject_Malloc()``).
|
||||||
|
|
||||||
Dummy example wasting 2 bytes per memory block::
|
Dummy example wasting 2 bytes per memory block::
|
||||||
|
|
||||||
|
@ -420,12 +419,8 @@ Rejected Alternatives
|
||||||
More specific functions to get/set memory allocators
|
More specific functions to get/set memory allocators
|
||||||
----------------------------------------------------
|
----------------------------------------------------
|
||||||
|
|
||||||
Replace the 2 functions:
|
It was originally proposed a larger set of C API functions, with one pair
|
||||||
|
of functions for each allocator domain:
|
||||||
* ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
|
|
||||||
* ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
|
|
||||||
|
|
||||||
with:
|
|
||||||
|
|
||||||
* ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)``
|
* ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)``
|
||||||
* ``void PyMem_GetAllocator(PyMemAllocator *allocator)``
|
* ``void PyMem_GetAllocator(PyMemAllocator *allocator)``
|
||||||
|
@ -442,8 +437,8 @@ each memory allocator domain.
|
||||||
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
|
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
|
||||||
------------------------------------------------------
|
------------------------------------------------------
|
||||||
|
|
||||||
If ``PyMem_Malloc()`` would call ``PyMem_RawMalloc()`` by default,
|
If ``PyMem_Malloc()`` called ``PyMem_RawMalloc()`` by default,
|
||||||
calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also also
|
calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also
|
||||||
patch ``PyMem_Malloc()`` indirectly.
|
patch ``PyMem_Malloc()`` indirectly.
|
||||||
|
|
||||||
This alternative was rejected because ``PyMem_SetAllocator()`` would
|
This alternative was rejected because ``PyMem_SetAllocator()`` would
|
||||||
|
@ -454,17 +449,17 @@ same behaviour is less error-prone.
|
||||||
Add a new PYDEBUGMALLOC environment variable
|
Add a new PYDEBUGMALLOC environment variable
|
||||||
--------------------------------------------
|
--------------------------------------------
|
||||||
|
|
||||||
Add a new ``PYDEBUGMALLOC`` environment variable to enable debug checks
|
It was proposed to add a new ``PYDEBUGMALLOC`` environment variable to
|
||||||
on memory block allocators. The environment variable replaces the new
|
enable debug checks on memory block allocators. It would have had the same
|
||||||
function ``PyMem_SetupDebugHooks()`` which is not needed anymore.
|
effect as calling the ``PyMem_SetupDebugHooks()``, without the need
|
||||||
Another advantage is to allow to enable debug checks even in release
|
to write any C code. Another advantage is to allow to enable debug checks
|
||||||
mode: debug checks are always compiled, but only enabled when the
|
even in release mode: debug checks would always be compiled in, but only
|
||||||
environment variable is present and non-empty.
|
enabled when the environment variable is present and non-empty.
|
||||||
|
|
||||||
This alternative was rejected because a new environment variable would
|
This alternative was rejected because a new environment variable would
|
||||||
make the Python initialization even more complex. The `PEP 432
|
make Python initialization even more complex. `PEP 432
|
||||||
<http://www.python.org/dev/peps/pep-0432/>`_ tries to simply the CPython
|
<http://www.python.org/dev/peps/pep-0432/>`_ tries to simplify the
|
||||||
startup sequence.
|
CPython startup sequence.
|
||||||
|
|
||||||
|
|
||||||
Use macros to get customizable allocators
|
Use macros to get customizable allocators
|
||||||
|
@ -474,7 +469,7 @@ To have no overhead in the default configuration, customizable
|
||||||
allocators would be an optional feature enabled by a configuration
|
allocators would be an optional feature enabled by a configuration
|
||||||
option or by macros.
|
option or by macros.
|
||||||
|
|
||||||
This alternative was rejected because the usage of macros implies having
|
This alternative was rejected because the use of macros implies having
|
||||||
to recompile extensions modules to use the new allocator and allocator
|
to recompile extensions modules to use the new allocator and allocator
|
||||||
hooks. Not having to recompile Python nor extension modules makes debug
|
hooks. Not having to recompile Python nor extension modules makes debug
|
||||||
hooks easier to use in practice.
|
hooks easier to use in practice.
|
||||||
|
@ -518,12 +513,12 @@ Example of ``PyMem_Malloc`` macro with the modified
|
||||||
|
|
||||||
The GC allocator functions would also have to be patched. For example,
|
The GC allocator functions would also have to be patched. For example,
|
||||||
``_PyObject_GC_Malloc()`` is used in many C functions and so objects of
|
``_PyObject_GC_Malloc()`` is used in many C functions and so objects of
|
||||||
differenet types would have the same allocation location.
|
different types would have the same allocation location.
|
||||||
|
|
||||||
This alternative was rejected because passing a filename and a line
|
This alternative was rejected because passing a filename and a line
|
||||||
number to each allocator makes the API more complex: pass 3 new
|
number to each allocator makes the API more complex: pass 3 new
|
||||||
arguments (ctx, filename, lineno) to each allocator function, instead of
|
arguments (ctx, filename, lineno) to each allocator function, instead of
|
||||||
just a context argument (ctx). Having to modify also GC allocator
|
just a context argument (ctx). Having to also modify GC allocator
|
||||||
functions adds too much complexity for a little gain.
|
functions adds too much complexity for a little gain.
|
||||||
|
|
||||||
|
|
||||||
|
@ -531,23 +526,25 @@ GIL-free PyMem_Malloc()
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()``
|
In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()``
|
||||||
calls indirectly ``PyObject_Malloc()`` which requires the GIL to be
|
indirectly calls ``PyObject_Malloc()`` which requires the GIL to be
|
||||||
held. That's why ``PyMem_Malloc()`` must be called with the GIL held.
|
held (it isn't thread-safe). That's why ``PyMem_Malloc()`` must be called
|
||||||
|
with the GIL held.
|
||||||
|
|
||||||
This PEP changes ``PyMem_Malloc()``: it now always call ``malloc()``.
|
This PEP changes ``PyMem_Malloc()``: it now always calls ``malloc()``
|
||||||
The "GIL must be held" restriction could be removed from
|
rather than ``PyObject_Malloc()``. The "GIL must be held" restriction
|
||||||
``PyMem_Malloc()``.
|
could therefore be removed from ``PyMem_Malloc()``.
|
||||||
|
|
||||||
This alternative was rejected because allowing to call
|
This alternative was rejected because allowing to call
|
||||||
``PyMem_Malloc()`` without holding the GIL can break applications
|
``PyMem_Malloc()`` without holding the GIL can break applications
|
||||||
which setup their own allocators or allocator hooks. Holding the GIL is
|
which setup their own allocators or allocator hooks. Holding the GIL is
|
||||||
convinient to develop a custom allocator: no need to care of other
|
convenient to develop a custom allocator: no need to care about other
|
||||||
threads. It is also convinient for a debug allocator hook: Python
|
threads. It is also convenient for a debug allocator hook: Python
|
||||||
internal objects can be safetly inspected.
|
objects can be safely inspected, and the C API may be used for reporting.
|
||||||
|
|
||||||
Calling ``PyGILState_Ensure()`` in
|
Moreover, calling ``PyGILState_Ensure()`` in a memory allocator has
|
||||||
a memory allocator has unexpected behaviour, especially at Python
|
unexpected behaviour, especially at Python startup and when creating a
|
||||||
startup and at creation of a new Python thread state.
|
new Python thread state. It is better to free custom allocators of
|
||||||
|
the responsibility of acquiring the GIL.
|
||||||
|
|
||||||
|
|
||||||
Don't add PyMem_RawMalloc()
|
Don't add PyMem_RawMalloc()
|
||||||
|
@ -566,13 +563,14 @@ This alternative was rejected because ``PyMem_RawMalloc()`` is required
|
||||||
for accurate reports of the memory usage. When a debug hook is used to
|
for accurate reports of the memory usage. When a debug hook is used to
|
||||||
track the memory usage, the memory allocated by direct calls to
|
track the memory usage, the memory allocated by direct calls to
|
||||||
``malloc()`` cannot be tracked. ``PyMem_RawMalloc()`` can be hooked and
|
``malloc()`` cannot be tracked. ``PyMem_RawMalloc()`` can be hooked and
|
||||||
so all the memory allocated by Python can be tracked.
|
so all the memory allocated by Python can be tracked, including
|
||||||
|
memory allocated without holding the GIL.
|
||||||
|
|
||||||
|
|
||||||
Use existing debug tools to analyze the memory
|
Use existing debug tools to analyze memory use
|
||||||
----------------------------------------------
|
----------------------------------------------
|
||||||
|
|
||||||
There are many existing debug tools to analyze the memory. Some
|
There are many existing debug tools to analyze memory use. Some
|
||||||
examples: `Valgrind <http://valgrind.org/>`_, `Purify
|
examples: `Valgrind <http://valgrind.org/>`_, `Purify
|
||||||
<http://ibm.com/software/awdtools/purify/>`_, `Clang AddressSanitizer
|
<http://ibm.com/software/awdtools/purify/>`_, `Clang AddressSanitizer
|
||||||
<http://code.google.com/p/address-sanitizer/>`_, `failmalloc
|
<http://code.google.com/p/address-sanitizer/>`_, `failmalloc
|
||||||
|
@ -580,14 +578,14 @@ examples: `Valgrind <http://valgrind.org/>`_, `Purify
|
||||||
|
|
||||||
The problem is to retrieve the Python object related to a memory pointer
|
The problem is to retrieve the Python object related to a memory pointer
|
||||||
to read its type and/or its content. Another issue is to retrieve the
|
to read its type and/or its content. Another issue is to retrieve the
|
||||||
location of the memory allocation: the C backtrace is usually useless
|
source of the memory allocation: the C backtrace is usually useless
|
||||||
(same reasoning than macros using ``__FILE__`` and ``__LINE__``, see
|
(same reasoning than macros using ``__FILE__`` and ``__LINE__``, see
|
||||||
`Pass the C filename and line number`_), the Python filename and line
|
`Pass the C filename and line number`_), the Python filename and line
|
||||||
number (or even the Python traceback) is more useful.
|
number (or even the Python traceback) is more useful.
|
||||||
|
|
||||||
This alternative was rejected because classic tools are unable to
|
This alternative was rejected because classic tools are unable to
|
||||||
introspect Python internals to collect such information. Being able to
|
introspect Python internals to collect such information. Being able to
|
||||||
setup a hook on allocators called with the GIL held allow to collect a
|
setup a hook on allocators called with the GIL held allows to collect a
|
||||||
lot of useful data from Python internals.
|
lot of useful data from Python internals.
|
||||||
|
|
||||||
|
|
||||||
|
@ -606,7 +604,7 @@ is unknown (ex: NULL pointer).
|
||||||
On Windows, this function can be implemented using ``_msize()`` and
|
On Windows, this function can be implemented using ``_msize()`` and
|
||||||
``VirtualQuery()``.
|
``VirtualQuery()``.
|
||||||
|
|
||||||
The function can be used to implement an hook tracking the memory usage.
|
The function can be used to implement a hook tracking the memory usage.
|
||||||
The ``free()`` method of an allocator only gets the address of a memory
|
The ``free()`` method of an allocator only gets the address of a memory
|
||||||
block, whereas the size of the memory block is required to update the
|
block, whereas the size of the memory block is required to update the
|
||||||
memory usage.
|
memory usage.
|
||||||
|
@ -614,9 +612,9 @@ memory usage.
|
||||||
The additional ``msize()`` function was rejected because only few
|
The additional ``msize()`` function was rejected because only few
|
||||||
platforms implement it. For example, Linux with the GNU libc does not
|
platforms implement it. For example, Linux with the GNU libc does not
|
||||||
provide a function to get the size of a memory block. ``msize()`` is not
|
provide a function to get the size of a memory block. ``msize()`` is not
|
||||||
currently used in the Python source code. The function is only used to
|
currently used in the Python source code. The function would only be
|
||||||
track the memory usage, and makes the API more complex. A debug hook can
|
used to track memory use, and make the API more complex. A debug hook
|
||||||
implement the function internally, there is no need to add it to
|
can implement the function internally, there is no need to add it to
|
||||||
``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures.
|
``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures.
|
||||||
|
|
||||||
|
|
||||||
|
@ -733,9 +731,9 @@ Other allocators:
|
||||||
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
|
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
|
||||||
efficient way to allocate groups of equal-sized chunks of memory
|
efficient way to allocate groups of equal-sized chunks of memory
|
||||||
|
|
||||||
This PEP permits to choose exactly which memory allocator is used for your
|
This PEP allows to choose exactly which memory allocator is used for your
|
||||||
application depending on its usage of the memory (number of allocation, size of
|
application depending on its usage of the memory (number of allocations,
|
||||||
allocations, lifetime of objects, etc.).
|
size of allocations, lifetime of objects, etc.).
|
||||||
|
|
||||||
|
|
||||||
Links
|
Links
|
||||||
|
|