diff --git a/pep-0454.txt b/pep-0454.txt
index 88884622a..beabf5e91 100644
--- a/pep-0454.txt
+++ b/pep-0454.txt
@@ -20,18 +20,23 @@ blocks allocated by Python.
 Rationale
 =========
 
-Common debug tools tracing memory allocations record the C filename
-and line number where the allocation occurs. Using such tools to
-analyze Python memory allocations does not help because most memory
-blocks are allocated in the same C function, in ``PyMem_Malloc()`` for
-example.
+Classic generic tools like Valgrind can get the C traceback where a
+memory block was allocated. Using such tools to analyze Python memory
+allocations does not help because most memory blocks are allocated in
+the same C function, in ``PyMem_Malloc()`` for example. Moreover, Python
+has an allocator for small objects called "pymalloc" which keeps free
+blocks for efficiency. This is not well handled by these tools.
 
 There are debug tools dedicated to the Python language like ``Heapy``
-and ``PySizer``. These tools analyze objects type and/or content.
-They are useful when most memory leaks are instances of the same type
-and this type is only instantiated in a few functions. Problems arise
-when the object type is very common like ``str`` or ``tuple``, and it
-is hard to identify where these objects are instantiated.
+``Pympler`` and ``Meliae`` which list all live objects using the
+``gc`` module (functions like ``gc.get_objects()``,
+``gc.get_referrers()`` and ``gc.get_referents()``), compute their size
+(e.g. using ``sys.getsizeof()``) and group objects by type. These tools
+provide a better estimate of the memory usage of an application. They
+are useful when most memory leaks are instances of the same type and
+this type is only instantiated in a few functions. Problems arise when
+the object type is very common like ``str`` or ``tuple``, and it is hard
+to identify where these objects are instantiated.
 
 Finding reference cycles is also a difficult problem.
 There are different tools to draw a diagram of all references. These tools
@@ -63,6 +68,15 @@
 faulthandler`` and ``-X tracemalloc``). See the `documentation of the
 faulthandler module `_.
 
+The idea of tracing memory allocations is not new. It was first
+implemented in the PySizer project in 2005. PySizer was implemented
+differently: the traceback was stored in frame objects and some Python
+types were linked to the trace with the name of the object type. The
+PySizer patch on CPython adds performance and memory overhead, even if
+PySizer is not used. tracemalloc attaches a traceback to the
+underlying layer, to memory blocks, and has no overhead when the module
+is disabled.
+
 The tracemalloc module has been written for CPython. Other
 implementations of Python may not be able to provide it.
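
The contrast drawn in the rationale text (grouping live objects by type versus attaching a traceback to each memory block) can be sketched roughly as follows. This is only an illustration, not the PEP's implementation. It assumes the ``tracemalloc`` API as shipped in the Python 3.4+ standard library, which may differ from the draft this patch edits. The helper names ``size_by_type``, ``size_by_line`` and ``make_strings`` are hypothetical.

```python
import gc
import sys
import tracemalloc
from collections import Counter


def size_by_type(limit=3):
    """Sum shallow sizes of gc-tracked objects, grouped by type name."""
    totals = Counter()
    # gc.get_objects() only returns objects tracked by the collector
    # (mostly containers); untracked objects such as str are not listed.
    for obj in gc.get_objects():
        # The default of 0 avoids a TypeError for objects without a size.
        totals[type(obj).__name__] += sys.getsizeof(obj, 0)
    return totals.most_common(limit)


def size_by_line(workload, limit=3):
    """Run workload() under tracemalloc and group allocations by source line."""
    tracemalloc.start()
    try:
        keep_alive = workload()  # hold the result so its blocks stay allocated
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    del keep_alive
    return snapshot.statistics("lineno")[:limit]


def make_strings():
    # Hypothetical workload: many instances of a very common type.
    return [str(i) * 10 for i in range(10_000)]


if __name__ == "__main__":
    for name, total in size_by_type():
        print(f"{name:<12} {total} bytes")
    for stat in size_by_line(make_strings):
        print(stat)
```

The first helper can only report which types hold the most memory, while the second reports the file and line where the blocks were allocated, which is the gap the rationale describes for common types like ``str``.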