PEP 454: rationale

Victor Stinner 2013-10-22 13:57:53 +02:00
parent 4c8b5a26ff
commit 834123996b
1 changed files with 24 additions and 10 deletions


@@ -20,18 +20,23 @@ blocks allocated by Python.
Rationale
=========
Common debug tools tracing memory allocations record the C filename
and line number where the allocation occurs. Using such tools to
analyze Python memory allocations does not help because most memory
blocks are allocated in the same C function, in ``PyMem_Malloc()`` for
example.
Classic generic tools like Valgrind can get the C traceback where a
memory block was allocated. Using such tools to analyze Python memory
allocations does not help because most memory blocks are allocated in
the same C function, in ``PyMem_Malloc()`` for example. Moreover, Python
has an allocator for small objects called "pymalloc" which keeps free
blocks for efficiency. These tools do not handle it well.
There are debug tools dedicated to the Python language like ``Heapy``
and ``PySizer``. These tools analyze objects type and/or content.
They are useful when most memory leaks are instances of the same type
and this type is only instantiated in a few functions. Problems arise
when the object type is very common like ``str`` or ``tuple``, and it
is hard to identify where these objects are instantiated.
``Pympler`` and ``Meliae`` which list all live objects using the ``gc``
module (functions like ``gc.get_objects()``,
``gc.get_referrers()`` and ``gc.get_referents()``), compute their size
(e.g. using ``sys.getsizeof()``) and group objects by type. These tools
provide a better estimation of the memory usage of an application. They
are useful when most memory leaks are instances of the same type and
this type is only instantiated in a few functions. Problems arise when
the object type is very common like ``str`` or ``tuple``, and it is hard
to identify where these objects are instantiated.
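The approach described above (list live objects, compute their size, group by type) can be sketched in a few lines. This is only an illustration, not how ``Pympler`` or ``Meliae`` are implemented; the helper name ``summarize_live_objects`` and its ``limit`` parameter are hypothetical.

```python
import gc
import sys
from collections import defaultdict


def summarize_live_objects(limit=5):
    """Group gc-tracked objects by type name (hypothetical helper).

    Returns up to `limit` tuples (type name, count, total bytes),
    largest total size first.
    """
    totals = defaultdict(lambda: [0, 0])  # type name -> [count, total bytes]
    for obj in gc.get_objects():
        entry = totals[type(obj).__name__]
        entry[0] += 1
        # The default of 0 avoids a TypeError for objects without __sizeof__.
        entry[1] += sys.getsizeof(obj, 0)
    ranked = sorted(totals.items(), key=lambda item: item[1][1], reverse=True)
    return [(name, count, size) for name, (count, size) in ranked[:limit]]


if __name__ == "__main__":
    for name, count, size in summarize_live_objects():
        print(f"{name:<20} {count:>8} objects {size:>12} bytes")
```

The output says which types dominate, but not where the objects were created, which is the limitation the paragraph points out. Note also that ``gc.get_objects()`` only returns objects tracked by the garbage collector, so non-container objects such as ``str`` instances do not appear in this summary at all.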
Finding reference cycles is also a difficult problem. There are
different tools to draw a diagram of all references. These tools
@@ -63,6 +68,15 @@ faulthandler`` and ``-X tracemalloc``). See the
`documentation of the faulthandler module
<http://docs.python.org/3/library/faulthandler.html>`_.
The idea of tracing memory allocations is not new. It was first
implemented in the PySizer project in 2005. PySizer was implemented
differently: the traceback was stored in frame objects and some Python
types were linked to the trace by the name of the object type. The
PySizer patch on CPython adds overhead to performance and memory
footprint, even when PySizer is not used. tracemalloc attaches a
traceback to the underlying layer, the memory blocks, and has no
overhead when the module is disabled.
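As a rough illustration of attaching tracebacks to memory blocks, the sketch below uses the ``tracemalloc`` API as it ships in the standard library since Python 3.4. That API may differ from the draft described at this commit. The helpers ``trace_workload`` and ``build_strings`` are hypothetical names used only for the example.

```python
import tracemalloc


def build_strings():
    """Hypothetical workload: allocate many small str objects."""
    return [str(i) * 10 for i in range(10000)]


def trace_workload(workload, limit=3):
    """Run `workload` under tracing and return its top allocation sites."""
    tracemalloc.start()
    try:
        # Hold the result so its memory blocks are still allocated
        # when the snapshot is taken.
        keep_alive = workload()
        snapshot = tracemalloc.take_snapshot()
    finally:
        # Tracing is only active between start() and stop().
        tracemalloc.stop()
    # Group traced blocks by the Python source line that allocated them,
    # largest total size first.
    return snapshot.statistics("lineno")[:limit]


if __name__ == "__main__":
    for stat in trace_workload(build_strings):
        print(stat)
```

Unlike a per-type summary, each statistic here points at the Python file and line where the blocks were allocated, even for very common types such as ``str``.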
The tracemalloc module has been written for CPython. Other
implementations of Python may not be able to provide it.