Victor Stinner 2013-09-04 01:19:30 +02:00
parent d0f4b45bdc
commit bb7b188f66
1 changed files with 189 additions and 50 deletions


@ -20,9 +20,34 @@ Add a new ``tracemalloc`` module to trace Python memory allocations.
Rationale
=========
This PEP proposes to add a new ``tracemalloc`` module. It is a debug tool to
trace memory allocations made by Python based on API added by the PEP
445. The module provides the following information:
Common debug tools tracing memory allocations read the C filename and
line number. Using such tools to analyze Python memory allocations does
not help because most memory allocations are done in the same C function,
``PyMem_Malloc()`` for example.
There are debug tools dedicated to the Python language like ``Heapy``
and ``PySizer``. These projects analyze object types and/or content.
These tools are useful when most leaked memory consists of instances of
the same type and this type is allocated only in a few functions. The
problem arises when the object type is very common, like ``str`` or
``tuple``: it is then hard to identify where these objects are allocated.
Finding reference cycles is also a difficult task. There are different
tools to draw a diagram of all references. These tools cannot be used
on large applications with thousands of objects because the diagram
is too huge to be analyzed manually.
Proposal
========
With PEP 445, it becomes easy to set up a hook on Python memory
allocators. The hook can inspect the current Python frame to get the
Python filename and line number.
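
The frame inspection described above can be illustrated in pure Python.
This is only a sketch: the real hook would be written in C on top of the
PEP 445 API, and the helper names ``caller_location()`` and
``allocate()`` are hypothetical, not part of the proposed module.

```python
import sys


def caller_location(depth=1):
    """Return (filename, lineno) of the Python frame `depth` levels up.

    Hypothetical helper: it mimics, at the Python level, what an
    allocator hook could read from the current frame.
    """
    frame = sys._getframe(depth)  # CPython-specific frame access
    return frame.f_code.co_filename, frame.f_lineno


def allocate():
    # Stand-in for an allocation site: report where we are running.
    return caller_location()


filename, lineno = allocate()
print(filename, lineno)
```

The pair printed here is the kind of location that the module would
attach to each traced memory block.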
This PEP proposes to add a new ``tracemalloc`` module. It is a debug
tool to trace memory allocations made by Python. The module provides the
following information:
* Statistics on Python memory allocations per Python filename and line
number: size, number, and average size of allocations
@ -30,9 +55,64 @@ trace memory allocations made by Python based on API added by the PEP
* Location of a Python memory allocation: size in bytes, Python filename
and line number
The ``tracemalloc`` module is different from other third-party modules
like ``Heapy`` or ``PySizer``, because it is focused on the location of
a memory allocation rather than the object type or object content.
Command line options
====================
The ``python -m tracemalloc`` command can be used to analyze and compare
snapshots. The command takes a list of snapshot filenames and has the
following options.
``-g``, ``--group-per-file``
Group allocations per filename, instead of grouping per line number.
``-n NTRACES``, ``--number NTRACES``
Number of traces displayed per top (default: 10).
``--first``
Compare with the first snapshot, instead of comparing with the
previous snapshot.
``--include PATTERN``
Only include filenames matching pattern *PATTERN*. The option can be
specified multiple times.
See ``fnmatch.fnmatch()`` for the syntax of patterns.
``--exclude PATTERN``
Exclude filenames matching pattern *PATTERN*. The option can be
specified multiple times.
See ``fnmatch.fnmatch()`` for the syntax of patterns.
``-S``, ``--hide-size``
Hide the size of allocations.
``-C``, ``--hide-count``
Hide the number of allocations.
``-A``, ``--hide-average``
Hide the average size of allocations.
``-P PARTS``, ``--filename-parts=PARTS``
Number of displayed filename parts (default: 3).
``--color``
Force usage of colors even if ``sys.stdout`` is not a TTY device.
``--no-color``
Disable colors if ``sys.stdout`` is a TTY device.
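
The ``--include`` and ``--exclude`` options rely on
``fnmatch.fnmatch()`` patterns. A minimal sketch of such a filter is
shown below. The function name ``keep_filename()`` is hypothetical, and
the way several patterns are combined (a filename is kept if it matches
any include pattern and no exclude pattern) is an assumption, not
something stated by the PEP.

```python
from fnmatch import fnmatch


def keep_filename(filename, include=(), exclude=()):
    """Return True if `filename` passes the include/exclude patterns.

    Assumption: with several include patterns, matching any one of
    them is enough; any matching exclude pattern rejects the file.
    """
    if include and not any(fnmatch(filename, pat) for pat in include):
        return False
    return not any(fnmatch(filename, pat) for pat in exclude)


path = "/usr/lib/python3/json/decoder.py"
print(keep_filename(path, include=["*/json/*"]))
print(keep_filename(path, exclude=["*/json/*"]))
```

Note that ``*`` in ``fnmatch`` patterns also matches path separators,
so ``*/json/*`` matches files at any depth below a ``json`` directory.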
API
@ -62,19 +142,17 @@ Functions
Get the status of the module: ``True`` if it is enabled, ``False``
otherwise.
``get_object_address(obj)`` function:
Get the address of the memory block of the specified Python object.
``get_object_trace(obj)`` function:
Get the trace of the memory allocation of the Python object *obj*.
Return a namedtuple with 3 attributes if the memory allocation was
traced:
Get the trace of a Python object *obj* as a ``trace`` instance.
- ``size``: size in bytes of the object
- ``filename``: name of the Python script where the object was allocated
- ``lineno``: line number where the object was allocated
Return ``None`` if ``tracemalloc`` did not trace the memory
allocation, for example if ``tracemalloc`` was disabled when the
object was created.
Return ``None`` if the tracemalloc module did not save the location
when the object was allocated, for example if the module was
disabled.
``get_process_memory()`` function:
@ -88,6 +166,28 @@ Functions
Use the ``psutil`` module if available.
``get_stats()`` function:
Get statistics on Python memory allocations per Python filename and
per Python line number.
Return a dictionary
``{filename: str -> {line_number: int -> stats: line_stat}}``
where *stats* is a ``line_stat`` instance. *filename* and
*line_number* can be ``None``.
Return an empty dictionary if the tracemalloc module is disabled.
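
To make the shape of the returned dictionary concrete, here is a sketch
that walks a hand-written dictionary with the same layout and ranks the
lines by total size. The ``line_stat`` namedtuple is a hypothetical
stand-in for the class described in this PEP, the sample values are
invented for illustration, and ``top_lines()`` is not part of the
proposed API.

```python
from collections import namedtuple

# Hypothetical stand-in for the ``line_stat`` class of the PEP.
line_stat = namedtuple("line_stat", "size count")

# Same layout as described: {filename: {line_number: line_stat}}.
# Both keys can be None when the location is unknown.
stats = {
    "app.py": {
        10: line_stat(size=4096, count=4),
        25: line_stat(size=300, count=3),
    },
    None: {None: line_stat(size=64, count=1)},
}


def top_lines(stats, limit=10):
    """Flatten the nested dictionary and sort by total size."""
    rows = []
    for filename, lines in stats.items():
        for lineno, stat in lines.items():
            average = stat.size / stat.count if stat.count else 0.0
            rows.append((filename, lineno, stat.size, stat.count, average))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


top = top_lines(stats)
for row in top:
    print(row)
```

The average size is derived from ``size`` and ``count``, which is why
``line_stat`` only needs to store those two values.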
``get_traces(obj)`` function:
Get all traces of Python memory allocations.
Return a dictionary ``{pointer: int -> trace}`` where *trace*
is a ``trace`` instance.
Return an empty dictionary if the ``tracemalloc`` module is disabled.
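
The relation between per-block traces and per-line statistics can be
sketched as follows. The ``trace`` namedtuple is a hypothetical stand-in
for the class described in this PEP, the addresses and sizes are
invented, and ``stats_from_traces()`` is an illustration rather than a
function of the module.

```python
from collections import namedtuple

# Hypothetical stand-in for the ``trace`` class of the PEP.
trace = namedtuple("trace", "size filename lineno")

# Same layout as described: {pointer: trace}.
traces = {
    0x1000: trace(size=100, filename="app.py", lineno=10),
    0x2000: trace(size=50, filename="app.py", lineno=10),
    0x3000: trace(size=32, filename=None, lineno=None),
}


def stats_from_traces(traces):
    """Aggregate traces into {filename: {lineno: (size, count)}}."""
    stats = {}
    for tr in traces.values():
        per_file = stats.setdefault(tr.filename, {})
        size, count = per_file.get(tr.lineno, (0, 0))
        per_file[tr.lineno] = (size + tr.size, count + 1)
    return stats


summary = stats_from_traces(traces)
print(summary)
```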
``start_timer(delay: int, func: callable, args: tuple=(), kwargs: dict={})`` function:
Start a timer calling ``func(*args, **kwargs)`` every *delay*
@ -109,10 +209,48 @@ Functions
Stop the timer started by ``start_timer()``.
trace class
-----------
``trace`` class:
This class represents debug information of an allocated memory block.
``size`` attribute:
Size in bytes of the memory block.
``filename`` attribute:
Name of the Python script where the memory block was allocated,
``None`` if unknown.
``lineno`` attribute:
Line number where the memory block was allocated, ``None`` if
unknown.
line_stat class
----------------
``line_stat`` class:
Statistics on Python memory allocations of a specific line number.
``size`` attribute:
Total size in bytes of all memory blocks allocated on the line.
``count`` attribute:
Number of memory blocks allocated on the line.
DisplayTop class
----------------
``DisplayTop(count: int, file=sys.stdout)`` class:
``DisplayTop(count: int=10, file=sys.stdout)`` class:
Display the list of the *count* biggest memory allocations into
*file*.
@ -132,38 +270,38 @@ DisplayTop class
``color`` attribute:
``display()`` uses colors if ``True`` (bool,
default: ``stream.isatty()``).
If ``True``, ``display()`` uses color. The default value is
``file.isatty()``.
``compare_with_previous`` attribute:
If ``True``, ``display()`` compares with the previous top. If
``False``, compare with the first top (bool, default: ``True``).
If ``True`` (default value), ``display()`` compares with the
previous snapshot. If ``False``, compare with the first snapshot.
``filename_parts`` attribute:
Number of displayed filename parts (int, default: ``3``).
Number of displayed filename parts (int, default: ``3``). Extra
parts are replaced with ``"..."``.
``group_per_file`` attribute:
If ``True``, group memory allocations per Python filename. If
``False`` (default value), group allocations per Python line number.
``show_average`` attribute:
If ``True``, ``display()`` shows the average size of allocations
(bool, default: ``True``).
If ``True`` (default value), ``display()`` shows the average size
of allocations.
``show_count`` attribute:
If ``True``, ``display()`` shows the number of allocations (bool,
default: ``True``).
``show_lineno`` attribute:
If ``True``, use also the line number, not only the filename (bool,
default: ``True``). If ``False``, only show the filename.
If ``True`` (default value), ``display()`` shows the number of
allocations.
``show_size`` attribute:
If ``True``, ``display()`` shows the size of allocations (bool,
default: ``True``).
If ``True`` (default value), ``display()`` shows the size of
allocations.
``user_data_callback`` attribute:
@ -183,8 +321,9 @@ Snapshot class
``create(user_data_callback=None)`` method:
Take a snapshot. If *user_data_callback* is specified, it must be a
callable object returning a list of (title: str, format: str, value:
int). format must be ``'size'``. The list must always have the same
callable object returning a list of
``(title: str, format: str, value: int)``.
*format* must be ``'size'``. The list must always have the same
length and the same order to be able to compute differences between
values.
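
The requirement on length and order can be illustrated with a small
sketch. The callback below and the ``diff_user_data()`` helper are
hypothetical, as are the titles and values; the sketch only shows why
two lists must line up entry by entry before differences are computed.

```python
def user_data_callback():
    """Hypothetical callback: list of (title, format, value) tuples."""
    return [("Cache size", "size", 2048), ("Buffer size", "size", 512)]


def diff_user_data(previous, current):
    """Return [(title, value difference)] for two user data lists.

    Raises ValueError if the lists do not have the same length and
    the same order, since entries are paired by position.
    """
    if len(previous) != len(current):
        raise ValueError("user data lists have different lengths")
    diffs = []
    for (title_a, fmt_a, value_a), (title_b, fmt_b, value_b) in zip(
        previous, current
    ):
        if (title_a, fmt_a) != (title_b, fmt_b):
            raise ValueError("user data lists are not in the same order")
        diffs.append((title_b, value_b - value_a))
    return diffs


previous = [("Cache size", "size", 1024), ("Buffer size", "size", 512)]
current = user_data_callback()
changes = diff_user_data(previous, current)
print(changes)
```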
@ -198,35 +337,35 @@ Snapshot class
See ``fnmatch.fnmatch()`` for the syntax of a pattern.
``write(filename)`` method:
Write the snapshot into a file.
``load(filename)`` classmethod:
Load a snapshot from a file.
``process_memory`` attribute:
``write(filename)`` method:
Memory usage of the process, result of ``get_process_memory()``.
It can be ``None``.
``user_data`` attribute:
Optional list of user data, result of *user_data_callback* in
``Snapshot.create()`` (default: None).
Write the snapshot into a file.
``pid`` attribute:
Identifier of the process which created the snapshot (int).
``process_memory`` attribute:
Result of the ``get_process_memory()`` function, can be ``None``.
``stats`` attribute:
Raw memory allocation statistics (dict).
Result of the ``get_stats()`` function (dict).
``timestamp`` attribute:
Date and time of the creation of the snapshot (str).
Creation date and time of the snapshot, ``datetime.datetime``
instance.
``user_data`` attribute:
Optional list of user data, result of *user_data_callback* in
``Snapshot.create()`` (default: None).
TakeSnapshot class