Victor Stinner 2013-09-04 01:19:30 +02:00
parent d0f4b45bdc
commit bb7b188f66
1 changed file with 189 additions and 50 deletions


@@ -20,9 +20,34 @@ Add a new ``tracemalloc`` module to trace Python memory allocations.
Rationale
=========
- This PEP proposes to a new ``tracemalloc`` module. It is a debug tool to
- trace memory allocations made by Python based on API added by the PEP
- 445. The module provides the following information:
+ Common debug tools tracing memory allocations read the C filename and
+ line number. Using such a tool to analyze Python memory allocations does
+ not help because most memory allocations are done in the same C function,
+ ``PyMem_Malloc()`` for example.
+
+ There are debug tools dedicated to the Python language like ``Heapy``
+ and ``PySizer``. These projects analyze object types and/or content.
+ These tools are useful when most memory leaks are instances of the
+ same type and this type is only allocated in a few functions. The
+ problem is when the object type is very common like ``str`` or
+ ``tuple``, and it is hard to identify where these objects are allocated.
+
+ Finding reference cycles is also a difficult task. There are different
+ tools to draw a diagram of all references. These tools cannot be used
+ on large applications with thousands of objects because the diagram
+ is too huge to be analyzed manually.
+
+ Proposal
+ ========
+
+ Using PEP 445, it becomes easy to set up a hook on Python memory
+ allocators. The hook can inspect the current Python frame to get the
+ Python filename and line number.
+
+ This PEP proposes to add a new ``tracemalloc`` module. It is a debug
+ tool to trace memory allocations made by Python. The module provides the
+ following information:
* Statistics on Python memory allocations per Python filename and line
number: size, number, and average size of allocations
@@ -30,9 +55,64 @@ trace memory allocations made by Python based on API added by the PEP
* Location of a Python memory allocation: size in bytes, Python filename
and line number
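As a rough illustration of the per-line statistics listed above (size, number, and average size of allocations), the sketch below uses the ``tracemalloc`` module as it later shipped in the standard library (Python 3.4+). That API differs from the draft in this commit, so the calls shown (``start()``, ``take_snapshot()``, ``statistics()``) should not be read as the functions this diff defines. The ``allocate_some_memory()`` helper is hypothetical.

```python
import tracemalloc


def allocate_some_memory():
    # Hypothetical workload: build many distinct strings on this line.
    return [str(i) * 10 for i in range(10_000)]


tracemalloc.start()                      # begin tracing Python allocations
data = allocate_some_memory()
snapshot = tracemalloc.take_snapshot()   # capture the currently traced blocks
tracemalloc.stop()

# Group traced blocks per (filename, line number), biggest first.
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:3]:
    frame = stat.traceback[0]
    average = stat.size / stat.count
    print(f"{frame.filename}:{frame.lineno}: size={stat.size} B, "
          f"count={stat.count}, average={average:.1f} B")
```

The exact sizes and ordering depend on the interpreter version and platform, so no expected output is shown.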
- The ``tracemalloc`` module is different than other third-party modules
- like ``Heapy`` or ``PySizer``, because it is focused on the location of
- a memory allocation rather that the object type or object content.
+ Command line options
+ ====================
+
+ The ``python -m tracemalloc`` command can be used to analyze and compare
+ snapshots. The command takes a list of snapshot filenames and has the
+ following options.
+
+ ``-g``, ``--group-per-file``
+ Group allocations per filename, instead of grouping per line number.
+ ``-n NTRACES``, ``--number NTRACES``
+ Number of traces displayed per top (default: 10).
+ ``--first``
+ Compare with the first snapshot, instead of comparing with the
+ previous snapshot.
+ ``--include PATTERN``
+ Only include filenames matching pattern *PATTERN*. The option can be
+ specified multiple times.
+ See ``fnmatch.fnmatch()`` for the syntax of patterns.
+ ``--exclude PATTERN``
+ Exclude filenames matching pattern *PATTERN*. The option can be
+ specified multiple times.
+ See ``fnmatch.fnmatch()`` for the syntax of patterns.
+ ``-S``, ``--hide-size``
+ Hide the size of allocations.
+ ``-C``, ``--hide-count``
+ Hide the number of allocations.
+ ``-A``, ``--hide-average``
+ Hide the average size of allocations.
+ ``-P PARTS``, ``--filename-parts=PARTS``
+ Number of displayed filename parts (default: 3).
+ ``--color``
+ Force usage of colors even if ``sys.stdout`` is not a TTY device.
+ ``--no-color``
+ Disable colors if ``sys.stdout`` is a TTY device.
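The ``--include`` and ``--exclude`` options rely on ``fnmatch.fnmatch()`` patterns. The sketch below shows how such patterns could select filenames. The ``filter_filenames()`` helper and the sample paths are hypothetical, and the way several patterns combine (a filename is kept if it matches any include pattern and no exclude pattern) is an assumption, not something the text specifies.

```python
from fnmatch import fnmatch


def filter_filenames(filenames, include=(), exclude=()):
    """Hypothetical helper mimicking --include / --exclude filtering."""
    selected = []
    for name in filenames:
        # Assumption: with include patterns, at least one must match.
        if include and not any(fnmatch(name, pat) for pat in include):
            continue
        # Assumption: any matching exclude pattern drops the filename.
        if any(fnmatch(name, pat) for pat in exclude):
            continue
        selected.append(name)
    return selected


filenames = [
    "/usr/lib/python3/os.py",
    "/home/user/app/main.py",
    "/home/user/app/test_main.py",
]
kept = filter_filenames(
    filenames,
    include=["/home/user/*"],
    exclude=["*/test_*"],
)
print(kept)
```

Note that in ``fnmatch`` patterns ``*`` also matches path separators, so ``/home/user/*`` covers files in nested directories.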
API
@@ -62,19 +142,17 @@ Functions
Get the status of the module: ``True`` if it is enabled, ``False``
otherwise.
- ``get_object_address(obj)`` function:
- Get the address of the memory block of the specified Python object.
``get_object_trace(obj)`` function:
- Get the trace of the memory allocation of the Python object *obj*.
- Return a namedtuple with 3 attributes if the memory allocation was
- traced:
- - ``size``: size in bytes of the object
- - ``filename``: name of the Python script where the object was allocated
- - ``lineno``: line number where the object was allocated
- Return ``None`` if ``tracemalloc`` did not trace the memory
- allocation, for example if ``tracemalloc`` was disabled when the
- object was created.
+ Get the trace of a Python object *obj* as a ``trace`` instance.
+ Return ``None`` if the tracemalloc module did not save the location
+ when the object was allocated, for example if the module was
+ disabled.
``get_process_memory()`` function:
@@ -88,6 +166,28 @@ Functions
Use the ``psutil`` module if available.
+ ``get_stats()`` function:
+ Get statistics on Python memory allocations per Python filename and
+ per Python line number.
+ Return a dictionary
+ ``{filename: str -> {line_number: int -> stats: line_stat}}``
+ where *stats* is a ``line_stat`` instance. *filename* and
+ *line_number* can be ``None``.
+ Return an empty dictionary if the tracemalloc module is disabled.
+ ``get_traces(obj)`` function:
+ Get all traces of Python memory allocations.
+ Return a dictionary ``{pointer: int -> trace}`` where *trace*
+ is a ``trace`` instance.
+ Return an empty dictionary if the ``tracemalloc`` module is disabled.
``start_timer(delay: int, func: callable, args: tuple=(), kwargs: dict={})`` function:
Start a timer calling ``func(*args, **kwargs)`` every *delay*
@@ -109,10 +209,48 @@ Functions
Stop the timer started by ``start_timer()``.
+ trace class
+ -----------
+
+ ``trace`` class:
+ This class represents debug information of an allocated memory block.
+ ``size`` attribute:
+ Size in bytes of the memory block.
+ ``filename`` attribute:
+ Name of the Python script where the memory block was allocated,
+ ``None`` if unknown.
+ ``lineno`` attribute:
+ Line number where the memory block was allocated, ``None`` if
+ unknown.
+
+ line_stat class
+ ----------------
+
+ ``line_stat`` class:
+ Statistics on Python memory allocations of a specific line number.
+ ``size`` attribute:
+ Total size in bytes of all memory blocks allocated on the line.
+ ``count`` attribute:
+ Number of memory blocks allocated on the line.
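To make the documented shape of the ``get_stats()`` result concrete, the sketch below walks a dictionary of the form ``{filename: {line_number: line_stat}}`` and derives the average allocation size per line. It does not call the draft module: the ``line_stat`` namedtuple is a stand-in with only the two documented attributes, and the sample data and the ``summarize()`` helper are hypothetical.

```python
from collections import namedtuple

# Stand-in for the draft ``line_stat`` class (documented attributes only).
line_stat = namedtuple("line_stat", ["size", "count"])

# Hypothetical data shaped like the documented get_stats() result.
# Both the filename and the line number may be None.
stats = {
    "app/main.py": {
        10: line_stat(size=4096, count=4),
        25: line_stat(size=300, count=3),
    },
    None: {None: line_stat(size=64, count=1)},
}


def summarize(stats):
    """Return (filename, lineno, size, count, average) rows, biggest first."""
    rows = []
    for filename, lines in stats.items():
        for lineno, stat in lines.items():
            average = stat.size / stat.count if stat.count else 0.0
            rows.append((filename, lineno, stat.size, stat.count, average))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


rows = summarize(stats)
for filename, lineno, size, count, average in rows:
    print(f"{filename}:{lineno}: size={size}, count={count}, average={average}")
```

The average is derived here rather than stored, since ``line_stat`` only documents ``size`` and ``count``.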
DisplayTop class
----------------
- ``DisplayTop(count: int, file=sys.stdout)`` class:
+ ``DisplayTop(count: int=10, file=sys.stdout)`` class:
Display the list of the *count* biggest memory allocations into
*file*.
@@ -132,38 +270,38 @@ DisplayTop class
``color`` attribute:
- ``display()`` uses colors if ``True`` (bool,
- default: ``stream.isatty()``).
+ If ``True``, ``display()`` uses color. The default value is
+ ``file.isatty()``.
``compare_with_previous`` attribute:
- If ``True``, ``display()`` compares with the previous top if
- ``True``. If ``False``, compare with the first top (bool, default:
- ``True``).
+ If ``True`` (default value), ``display()`` compares with the
+ previous snapshot. If ``False``, compare with the first snapshot.
``filename_parts`` attribute:
- Number of displayed filename parts (int, default: ``3``).
+ Number of displayed filename parts (int, default: ``3``). Extra
+ parts are replaced with ``"..."``.
+ ``group_per_file`` attribute:
+ If ``True``, group memory allocations per Python filename. If
+ ``False`` (default value), group allocations per Python line number.
``show_average`` attribute:
- If ``True``, ``display()`` shows the average size of allocations
- (bool, default: ``True``).
+ If ``True`` (default value), ``display()`` shows the average size
+ of allocations.
``show_count`` attribute:
- If ``True``, ``display()`` shows the number of allocations (bool,
- default: ``True``).
+ If ``True`` (default value), ``display()`` shows the number of
+ allocations.
- ``show_lineno`` attribute:
- If ``True``, use also the line number, not only the filename (bool,
- default: ``True``). If ``False``, only show the filename.
``show_size`` attribute:
- If ``True``, ``display()`` shows the size of allocations (bool,
- default: ``True``).
+ If ``True`` (default value), ``display()`` shows the size of
+ allocations.
``user_data_callback`` attribute:
@@ -183,8 +321,9 @@ Snapshot class
``create(user_data_callback=None)`` method:
Take a snapshot. If *user_data_callback* is specified, it must be a
- callable object returning a list of (title: str, format: str, value:
- int). format must be ``'size'``. The list must always have the same
+ callable object returning a list of
+ ``(title: str, format: str, value: int)``.
+ *format* must be ``'size'``. The list must always have the same
length and the same order to be able to compute differences between
values.
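The requirement that the callback always returns a list of the same length and order can be illustrated with a small sketch. It does not use ``Snapshot.create()`` itself. The ``CACHE`` and ``BUFFER`` objects, the titles, and the ``diff_user_data()`` helper are all hypothetical. Only the ``(title, format, value)`` tuple shape and the ``'size'`` format come from the text.

```python
import sys

CACHE = {}            # hypothetical application cache
BUFFER = bytearray()  # hypothetical application buffer


def user_data_callback():
    """Return (title, format, value) tuples, always in the same order."""
    return [
        ("Cache", "size", sys.getsizeof(CACHE)),
        ("Buffer", "size", sys.getsizeof(BUFFER)),
    ]


def diff_user_data(old, new):
    """Hypothetical helper: pair entries by position and subtract values."""
    old_titles = [title for title, _, _ in old]
    new_titles = [title for title, _, _ in new]
    if old_titles != new_titles:
        # Positional comparison is meaningless if the layout changed.
        raise ValueError("user data lists differ in length or order")
    return [
        (title, new_value - old_value)
        for (title, _, old_value), (_, _, new_value) in zip(old, new)
    ]


before = user_data_callback()
BUFFER.extend(b"x" * 1024)   # grow the buffer between the two calls
after = user_data_callback()
changes = diff_user_data(before, after)
print(changes)
```

Pairing entries by position is what makes a fixed length and order necessary: a callback that reorders or drops entries would make the differences meaningless.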
@@ -198,35 +337,35 @@ Snapshot class
See ``fnmatch.fnmatch()`` for the syntax of a pattern.
- ``write(filename)`` method:
- Write the snapshot into a file.
``load(filename)`` classmethod:
Load a snapshot from a file.
- ``process_memory`` attribute:
- Memory usage of the process, result of ``get_process_memory()``.
- It can be ``None``.
- ``user_data`` attribute:
- Optional list of user data, result of *user_data_callback* in
- ``Snapshot.create()`` (default: None).
+ ``write(filename)`` method:
+ Write the snapshot into a file.
``pid`` attribute:
Identifier of the process which created the snapshot (int).
+ ``process_memory`` attribute:
+ Result of the ``get_process_memory()`` function, can be ``None``.
``stats`` attribute:
- Raw memory allocation statistics (dict).
+ Result of the ``get_stats()`` function (dict).
``timestamp`` attribute:
- Date and time of the creation of the snapshot (str).
+ Creation date and time of the snapshot, ``datetime.datetime``
+ instance.
+ ``user_data`` attribute:
+ Optional list of user data, result of *user_data_callback* in
+ ``Snapshot.create()`` (default: None).
TakeSnapshot class