PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4

Abstract
========

Add a new ``tracemalloc`` module to trace Python memory allocations.

Rationale
=========

Common debugging tools that trace memory allocations record the C
filename and line number. Using such tools to analyze Python memory
allocations does not help, because most memory blocks are allocated in
the same C function, for example in ``PyMem_Malloc()``.

There are debugging tools dedicated to the Python language, such as
``Heapy`` and ``PySizer``. These projects analyze the type and/or
content of objects. They are useful when most memory leaks are
instances of the same type and this type is only instantiated in a few
functions. Problems arise when the object type is very common, like
``str`` or ``tuple``, because it is hard to identify where these
objects are instantiated.

Finding reference cycles is also a difficult problem. There are
different tools to draw a diagram of all references. These tools cannot
be used on large applications with thousands of objects because the
diagram is too large to be analyzed manually.

Proposal
========

With PEP 445, it becomes easy to set up a hook on Python memory
allocators. The hook can inspect the current Python frame to get the
Python filename and line number.

This PEP proposes adding a new ``tracemalloc`` module. It is a debugging
tool that traces memory allocations made by Python. The module provides
the following information:

* Statistics on Python memory allocations per Python filename and line
  number: size, number, and average size of allocations
* Differences between two snapshots of Python memory allocations
* Location of a Python memory allocation: size in bytes, Python filename
  and line number

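For orientation, the per-line statistics and snapshot differences listed
above can be tried with the ``tracemalloc`` module found in the standard
library since Python 3.4. Note that the released module uses different
names from this draft (for instance ``start()`` and ``take_snapshot()``
rather than ``enable()`` and ``Snapshot.create()``), so the sketch below
illustrates the idea, not the API specified in this document.

```python
import tracemalloc

# Released stdlib API (Python 3.4+), not the draft API described in this PEP.
tracemalloc.start()

before = tracemalloc.take_snapshot()
data = [str(i) * 10 for i in range(10000)]  # allocate some memory to observe
after = tracemalloc.take_snapshot()

# Statistics grouped per filename and line number, biggest first.
per_line = after.statistics("lineno")
for stat in per_line[:3]:
    frame = stat.traceback[0]
    average = stat.size / stat.count if stat.count else 0
    print(f"{frame.filename}:{frame.lineno}: size={stat.size} B, "
          f"count={stat.count}, average={average:.1f} B")

# Differences between the two snapshots.
diff = after.compare_to(before, "lineno")
for entry in diff[:3]:
    frame = entry.traceback[0]
    print(f"{frame.filename}:{frame.lineno}: "
          f"{entry.size_diff:+d} B, {entry.count_diff:+d} blocks")

tracemalloc.stop()
```

The exact numbers depend on the interpreter and platform, so no expected
output is shown.
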
The API of the tracemalloc module is similar to the API of the
faulthandler module: ``enable()``, ``disable()`` and ``is_enabled()``
functions, an environment variable (``PYTHONFAULTHANDLER`` and
``PYTHONTRACEMALLOC``), and a ``-X`` command line option (``-X
faulthandler`` and ``-X tracemalloc``). See the
`documentation of the faulthandler module
<http://docs.python.org/dev/library/faulthandler.html>`_.

The tracemalloc module has been written for CPython. Other
implementations of Python may not provide it.

API
===

To trace as many Python memory allocations as possible, the module
should be enabled as early as possible in your application by calling
the ``tracemalloc.enable()`` function, by setting the
``PYTHONTRACEMALLOC`` environment variable to ``1``, or by using the
``-X tracemalloc`` command line option.

By default, tracemalloc only stores one ``frame`` instance per memory
allocation. Use ``tracemalloc.set_number_frame()`` to store more frames.

Functions
---------

``add_filter(include: bool, filename: str, lineno: int=None)`` function:

Add a filter. If *include* is ``True``, only trace memory blocks
allocated in a file with a name matching *filename*. If
*include* is ``False``, don't trace memory blocks allocated in a
file with a name matching *filename*.

The match is done using *filename* as a prefix. For example,
``'/usr/bin/'`` only matches files under the ``/usr/bin`` directory.
The ``.pyc`` and ``.pyo`` suffixes are automatically replaced with
``.py`` when matching the filename.

*lineno* is a line number. If *lineno* is ``None`` or less than
``1``, it matches any line number.

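The matching rule described for ``add_filter()`` can be sketched in plain
Python as follows. The helper names ``normalize_filename`` and
``filter_matches`` are hypothetical and not part of the proposed API. The
sketch also assumes that the suffix replacement applies to the traced
filename only and that an unknown filename (``None``) never matches,
neither of which the text states.

```python
def normalize_filename(filename):
    """Replace a .pyc or .pyo suffix with .py."""
    if filename.endswith((".pyc", ".pyo")):
        return filename[:-1]
    return filename


def filter_matches(filter_filename, filter_lineno, filename, lineno):
    """Return True if (filename, lineno) matches a single filter entry.

    Hypothetical helper: only the matching rule comes from the text.
    """
    if filename is None:
        # Assumption: an unknown filename never matches a filter.
        return False
    # The filter filename is used as a prefix.
    if not normalize_filename(filename).startswith(filter_filename):
        return False
    # None or a value below 1 matches any line number.
    if filter_lineno is None or filter_lineno < 1:
        return True
    return lineno == filter_lineno


print(filter_matches("/usr/bin/", None, "/usr/bin/tool.py", 12))
print(filter_matches("/app/mod.py", 3, "/app/mod.pyc", 3))
```

How several include and exclude filters combine is not covered here,
because the text above does not specify it.
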
``clear_filters()`` function:

Reset the filter list.

``clear_traces()`` function:

Clear all traces and statistics of memory allocations.

``disable()`` function:

Stop tracing Python memory allocations and stop the timer started by
``start_timer()``.

``enable()`` function:

Start tracing Python memory allocations.

``get_filters()`` function:

Get the filters as a list of
``(include: bool, filename: str, lineno: int)`` tuples.

If *lineno* is ``None``, a filter matches any line number.

By default, the filename of the Python tracemalloc module
(``tracemalloc.py``) is excluded.

``get_number_frame()`` function:

Get the maximum number of frames stored in a trace of a memory
allocation.

``get_object_address(obj)`` function:

Get the address of the memory block of the specified Python object.

``get_object_trace(obj)`` function:

Get the trace of a Python object *obj* as a ``trace`` instance.

Return ``None`` if the tracemalloc module did not save the location
when the object was allocated, for example if the module was
disabled.

``get_process_memory()`` function:

Get the memory usage of the current process as a ``meminfo`` namedtuple
with two attributes:

* ``rss``: Resident Set Size in bytes
* ``vms``: size of the virtual memory in bytes

Return ``None`` if the platform is not supported.

Use the ``psutil`` module if available.

``get_stats()`` function:

Get statistics on Python memory allocations per Python filename and
per Python line number.

Return a dictionary
``{filename: str -> {line_number: int -> stats: line_stat}}``
where *stats* is a ``line_stat`` instance. *filename* and
*line_number* can be ``None``.

Return an empty dictionary if the tracemalloc module is disabled.

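To make the shape of the ``get_stats()`` result concrete, the sketch below
builds such a nested dictionary by hand and derives per-file totals and
average block sizes from it. ``LineStat`` is a hypothetical stand-in for
the ``line_stat`` class, and the data is invented for illustration.

```python
from collections import namedtuple

# Stand-in for the draft's line_stat class (hypothetical definition).
LineStat = namedtuple("LineStat", "size count")

# Hand-written data in the shape described for get_stats().
stats = {
    "/app/main.py": {10: LineStat(size=4096, count=4),
                     27: LineStat(size=300, count=3)},
    "/app/util.py": {5: LineStat(size=1024, count=16)},
    None: {None: LineStat(size=64, count=1)},  # unknown location
}


def totals_per_file(stats):
    """Sum size and count over all lines of each file."""
    result = {}
    for filename, lines in stats.items():
        size = sum(stat.size for stat in lines.values())
        count = sum(stat.count for stat in lines.values())
        result[filename] = LineStat(size, count)
    return result


def average_size(stat):
    """Average block size in bytes, 0 if nothing was allocated."""
    return stat.size / stat.count if stat.count else 0


per_file = totals_per_file(stats)
# Print the files with the largest total size first.
for filename, stat in sorted(per_file.items(), key=lambda item: -item[1].size):
    print(filename, stat.size, stat.count, average_size(stat))
```
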
``get_tracemalloc_size()`` function:

Get the memory usage in bytes of the ``tracemalloc`` module.

``get_traces(obj)`` function:

Get all traces of Python memory allocations.
Return a dictionary ``{pointer: int -> trace}`` where *trace*
is a ``trace`` instance.

Return an empty dictionary if the ``tracemalloc`` module is disabled.

``is_enabled()`` function:

Get the status of the module: ``True`` if it is enabled, ``False``
otherwise.

``set_number_frame(nframe: int)`` function:

Set the maximum number of frames stored in a trace of a memory
allocation.

All traces and statistics of memory allocations are cleared.

``start_timer(delay: int, func: callable, args: tuple=(), kwargs: dict={})`` function:

Start a timer calling ``func(*args, **kwargs)`` every *delay*
seconds.

The timer is driven by the Python memory allocator; it is not a
real-time timer. *func* is called after at least *delay* seconds, and
possibly later: if no Python memory allocation occurs, the call is
postponed until the next allocation.

If ``start_timer()`` is called twice, previous parameters are
replaced. The timer has a resolution of 1 second.

``start_timer()`` is used by ``DisplayTop`` and ``TakeSnapshot`` to
run a task regularly.

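The behaviour of an allocator-driven timer can be modelled with a small
toy class. Everything in it (``AllocationTimer``, ``on_allocation``, the
injected clock) is hypothetical and only illustrates why the callback may
run later than *delay* seconds: the deadline is only checked when an
allocation happens.

```python
import time


class AllocationTimer:
    """Toy model of a timer that is only checked when an allocation happens."""

    def __init__(self, delay, func, args=(), kwargs=None, clock=time.monotonic):
        self.delay = delay
        self.func = func
        self.args = args
        self.kwargs = kwargs or {}
        self.clock = clock
        self.next_call = clock() + delay

    def on_allocation(self):
        """Stand-in for the allocator hook; called once per allocation."""
        now = self.clock()
        if now >= self.next_call:
            self.next_call = now + self.delay
            self.func(*self.args, **self.kwargs)


# Simulated clock so that the behaviour is reproducible.
fake_now = [0.0]
calls = []
timer = AllocationTimer(2, calls.append, args=("tick",),
                        clock=lambda: fake_now[0])

fake_now[0] = 1.0
timer.on_allocation()   # too early: nothing happens
fake_now[0] = 7.5       # no allocation happened between t=1 and t=7.5
timer.on_allocation()   # fires once, well after the 2 second delay
print(calls)
```
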
``stop_timer()`` function:

Stop the timer started by ``start_timer()``.


frame class
-----------

``frame`` class:

Trace of a Python frame.

``filename`` attribute (``str``):

Python filename, ``None`` if unknown.

``lineno`` attribute (``int``):

Python line number, ``None`` if unknown.

trace class
-----------

``trace`` class:

This class represents debug information of an allocated memory block.

``size`` attribute (``int``):

Size in bytes of the memory block.

``frames`` attribute (``list``):

Traceback where the memory block was allocated as a list of
``frame`` instances (most recent first).

The list can be empty or incomplete if the tracemalloc module was
unable to retrieve the full traceback.

For efficiency, the traceback is truncated to 10 frames.


line_stat class
----------------

``line_stat`` class:

Statistics on Python memory allocations of a specific line number.

``size`` attribute (``int``):

Total size in bytes of all memory blocks allocated on the line.

``count`` attribute (``int``):

Number of memory blocks allocated on the line.

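The text does not say how per-line statistics relate to individual
traces. One plausible reading, sketched below, is that each traced block
is attributed to its most recent frame and that sizes and counts are
summed per line. ``Frame``, ``Trace`` and ``LineStat`` are hypothetical
stand-ins for the classes described above, and the traces are invented.

```python
from collections import namedtuple

# Stand-ins for the draft's frame, trace and line_stat classes (hypothetical).
Frame = namedtuple("Frame", "filename lineno")
Trace = namedtuple("Trace", "size frames")
LineStat = namedtuple("LineStat", "size count")

# Hand-written traces keyed by block address, in the shape of get_traces().
traces = {
    0x1000: Trace(32, [Frame("/app/main.py", 10), Frame("/app/run.py", 3)]),
    0x2000: Trace(64, [Frame("/app/main.py", 10)]),
    0x3000: Trace(128, []),  # traceback could not be retrieved
}


def stats_from_traces(traces):
    """Group blocks by their most recent frame: {filename: {lineno: LineStat}}."""
    stats = {}
    for trace in traces.values():
        if trace.frames:
            filename, lineno = trace.frames[0]  # most recent frame comes first
        else:
            filename, lineno = None, None
        lines = stats.setdefault(filename, {})
        old = lines.get(lineno, LineStat(0, 0))
        lines[lineno] = LineStat(old.size + trace.size, old.count + 1)
    return stats


stats = stats_from_traces(traces)
print(stats)
```
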
DisplayTop class
----------------

``DisplayTop(count: int=10, file=sys.stdout)`` class:

Display the list of the *count* biggest memory allocations in
*file*.

``display()`` method:

Display the top once.

``start(delay: int)`` method:

Start a task using the ``tracemalloc`` timer to display the top every
*delay* seconds.

``stop()`` method:

Stop the task started by the ``DisplayTop.start()`` method.

``color`` attribute (``bool``, default: ``file.isatty()``):

If ``True``, ``display()`` uses color.

``compare_with_previous`` attribute (``bool``, default: ``True``):

If ``True``, ``display()`` compares with the
previous snapshot. If ``False``, it compares with the first snapshot.

``filename_parts`` attribute (``int``, default: ``3``):

Number of displayed filename parts. Extra parts are replaced
with ``"..."``.

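As an illustration of ``filename_parts``, the hypothetical helper below
shortens a path to a fixed number of components. It assumes that the
trailing components are the ones kept and that paths use ``/`` as the
separator; the text above does not say which parts are displayed.

```python
def shorten_filename(filename, parts=3):
    """Keep the last *parts* path components, replacing the rest with '...'."""
    components = [part for part in filename.split("/") if part]
    if parts < 1 or len(components) <= parts:
        # Nothing to hide: return the filename unchanged.
        return filename
    return "/".join(["..."] + components[-parts:])


print(shorten_filename("/usr/lib/python3.4/json/decoder.py"))
print(shorten_filename("/app/main.py"))
```
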
``group_per_file`` attribute (``bool``, default: ``False``):

If ``True``, group memory allocations per Python filename. If
``False``, group allocations per Python line number.

``show_average`` attribute (``bool``, default: ``True``):

If ``True``, ``display()`` shows the average size
of allocations.

``show_count`` attribute (``bool``, default: ``True``):

If ``True``, ``display()`` shows the number of
allocations.

``show_size`` attribute (``bool``, default: ``True``):

If ``True``, ``display()`` shows the size of
allocations.

``user_data_callback`` attribute (``callable``, default: ``None``):

Optional callback collecting user data. See ``Snapshot.create()``.

Snapshot class
--------------

``Snapshot()`` class:

Snapshot of Python memory allocations.

Use ``TakeSnapshot`` to take snapshots regularly.

``create(user_data_callback=None)`` method:

Take a snapshot. If *user_data_callback* is specified, it must be a
callable object returning a list of
``(title: str, format: str, value: int)`` tuples.
*format* must be ``'size'``. The list must always have the same
length and the same order to be able to compute differences between
values.

Example: ``[('Video memory', 'size', 234902)]``.

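The requirement that the list keeps the same length and order can be
illustrated with a hypothetical callback and a helper that computes
differences by position. ``collect_user_data`` and ``diff_user_data`` are
invented names, and the values are fixed for illustration.

```python
def collect_user_data():
    """Hypothetical callback: report application-specific memory figures."""
    # Fixed values for illustration; a real callback would measure something.
    return [
        ("Video memory", "size", 234902),
        ("Cache", "size", 4096),
    ]


def diff_user_data(old, new):
    """Pair up entries by position and return (title, format, delta) tuples."""
    if [item[:2] for item in old] != [item[:2] for item in new]:
        raise ValueError("user data lists must have the same titles and order")
    return [(title, fmt, new_value - old_value)
            for (title, fmt, old_value), (_, _, new_value) in zip(old, new)]


first = collect_user_data()
second = [("Video memory", "size", 250000), ("Cache", "size", 4096)]
changes = diff_user_data(first, second)
print(changes)
```

If the two lists differed in length or order, positional pairing would
compare unrelated values, which is why the sketch rejects such input.
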
``filter_filenames(patterns: list, include: bool)`` method:

Remove filenames not matching any pattern of *patterns* if *include*
is ``True``, or remove filenames matching a pattern of *patterns* if
*include* is ``False`` (exclude).

See ``fnmatch.fnmatch()`` for the syntax of a pattern.

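The include and exclude semantics can be sketched on a plain dictionary
keyed by filename. Unlike the method described above, this hypothetical
function returns a filtered copy instead of modifying a snapshot, and it
assumes that an unknown filename (``None``) never matches a pattern.

```python
import fnmatch


def filter_filenames(stats, patterns, include):
    """Return a copy of a {filename: ...} mapping filtered by fnmatch patterns."""
    result = {}
    for filename, value in stats.items():
        matched = filename is not None and any(
            fnmatch.fnmatch(filename, pattern) for pattern in patterns)
        # include=True keeps matches; include=False keeps non-matches.
        if matched == include:
            result[filename] = value
    return result


stats = {"/app/main.py": 1, "/app/tests/test_main.py": 2, "/usr/lib/os.py": 3}
only_app = filter_filenames(stats, ["/app/*"], include=True)
no_tests = filter_filenames(stats, ["*/tests/*"], include=False)
print(sorted(only_app))
print(sorted(no_tests))
```

Note that ``*`` in an ``fnmatch`` pattern also matches path separators,
so ``/app/*`` covers subdirectories as well.
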
``load(filename)`` classmethod:

Load a snapshot from a file.

``write(filename)`` method:

Write the snapshot into a file.

``pid`` attribute (``int``):

Identifier of the process which created the snapshot.

``process_memory`` attribute:

Result of the ``get_process_memory()`` function, can be ``None``.

``stats`` attribute (``dict``):

Result of the ``get_stats()`` function.

``tracemalloc_size`` attribute (``int``):

The memory usage in bytes of the ``tracemalloc`` module,
result of the ``get_tracemalloc_size()`` function.

``timestamp`` attribute (``datetime.datetime``):

Creation date and time of the snapshot.

``user_data`` attribute (``list``, default: ``None``):

Optional list of user data, result of *user_data_callback* in
``Snapshot.create()``.

TakeSnapshot class
------------------

``TakeSnapshot`` class:

Task taking snapshots of Python memory allocations and writing them
into files. By default, snapshots are written in the current directory.

``start(delay: int)`` method:

Start a task taking a snapshot every *delay* seconds.

``stop()`` method:

Stop the task started by the ``TakeSnapshot.start()`` method.

``take_snapshot()`` method:

Take a snapshot.

``filename_template`` attribute (``str``,
default: ``'tracemalloc-$counter.pickle'``):

Template used to create a filename. The following variables can be
used in the template:

* ``$pid``: identifier of the current process
* ``$timestamp``: current date and time
* ``$counter``: counter starting at 1 and incremented at each snapshot

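The ``$name`` placeholders have the same syntax as ``string.Template``,
so the expansion can be sketched as below. ``snapshot_filename`` is a
hypothetical helper, and the timestamp format is an assumption because
the text does not specify one.

```python
import datetime
import os
from string import Template


def snapshot_filename(template, counter, now=None):
    """Expand $pid, $timestamp and $counter in a filename template."""
    now = now or datetime.datetime.now()
    return Template(template).substitute(
        pid=os.getpid(),
        # The timestamp format is an assumption; the text does not specify it.
        timestamp=now.strftime("%Y%m%d-%H%M%S"),
        counter=counter,
    )


print(snapshot_filename("tracemalloc-$counter.pickle", 1))
print(snapshot_filename("trace-$pid-$timestamp.pickle", 2))
```
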
``user_data_callback`` attribute (``callable``, default: ``None``):

Optional callback collecting user data. See ``Snapshot.create()``.

Command line options
====================

The ``python -m tracemalloc`` command can be used to analyze and compare
snapshots. The command takes a list of snapshot filenames and has the
following options.

``-g``, ``--group-per-file``

Group allocations per filename, instead of grouping per line number.

``-n NTRACES``, ``--number NTRACES``

Number of traces displayed per top (default: 10).

``--first``

Compare with the first snapshot, instead of comparing with the
previous snapshot.

``--include PATTERN``

Only include filenames matching pattern *PATTERN*. The option can be
specified multiple times.

See ``fnmatch.fnmatch()`` for the syntax of patterns.

``--exclude PATTERN``

Exclude filenames matching pattern *PATTERN*. The option can be
specified multiple times.

See ``fnmatch.fnmatch()`` for the syntax of patterns.

``-S``, ``--hide-size``

Hide the size of allocations.

``-C``, ``--hide-count``

Hide the number of allocations.

``-A``, ``--hide-average``

Hide the average size of allocations.

``-P PARTS``, ``--filename-parts=PARTS``

Number of displayed filename parts (default: 3).

``--color``

Force usage of colors even if ``sys.stdout`` is not a TTY device.

``--no-color``

Disable colors even if ``sys.stdout`` is a TTY device.

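For readers who want to see the options side by side, the sketch below
declares them with ``argparse``. It is a hypothetical parser written for
illustration, not the implementation of ``python -m tracemalloc``; the
destination names and the ``None`` default for color are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed above."""
    parser = argparse.ArgumentParser(prog="python -m tracemalloc")
    parser.add_argument("snapshots", nargs="+", metavar="SNAPSHOT")
    parser.add_argument("-g", "--group-per-file", action="store_true")
    parser.add_argument("-n", "--number", type=int, default=10,
                        metavar="NTRACES")
    parser.add_argument("--first", action="store_true")
    parser.add_argument("--include", action="append", default=[],
                        metavar="PATTERN")
    parser.add_argument("--exclude", action="append", default=[],
                        metavar="PATTERN")
    parser.add_argument("-S", "--hide-size", action="store_true")
    parser.add_argument("-C", "--hide-count", action="store_true")
    parser.add_argument("-A", "--hide-average", action="store_true")
    parser.add_argument("-P", "--filename-parts", type=int, default=3,
                        metavar="PARTS")
    # color=None means "decide from sys.stdout.isatty()" (assumption).
    parser.add_argument("--color", dest="color", action="store_true",
                        default=None)
    parser.add_argument("--no-color", dest="color", action="store_false",
                        default=None)
    return parser


args = build_parser().parse_args(
    ["-g", "-n", "5", "--include", "/app/*", "--include", "*/lib/*",
     "--no-color", "a.pickle", "b.pickle"])
print(args)
```
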
Links
=====

tracemalloc:

* `#18874: Add a new tracemalloc module to trace Python
  memory allocations <http://bugs.python.org/issue18874>`_
* `pytracemalloc on PyPI
  <https://pypi.python.org/pypi/pytracemalloc>`_

Similar projects:

* `Meliae: Python Memory Usage Analyzer
  <https://pypi.python.org/pypi/meliae>`_
* `Guppy-PE: umbrella package combining Heapy and GSL
  <http://guppy-pe.sourceforge.net/>`_
* `PySizer <http://pysizer.8325.org/>`_: developed for Python 2.4
* `memory_profiler <https://pypi.python.org/pypi/memory_profiler>`_
* `pympler <http://code.google.com/p/pympler/>`_
* `Dozer <https://pypi.python.org/pypi/Dozer>`_: WSGI Middleware version of
  the CherryPy memory leak debugger
* `objgraph <http://mg.pov.lt/objgraph/>`_
* `caulk <https://github.com/smartfile/caulk/>`_

Copyright
=========

This document has been placed into the public domain.