* add GroupedStats.traceback_limit attribute
* add a Statistic class
* add GroupedStats.statistics() method
* GroupedStats.compare_to(): the first parameter is now mandatory and the
  list is now sorted by default; add a new optional sort parameter
* "-X tracemalloc=NFRAME" command line option and PYTHONTRACEMALLOC=NFRAME
  environment variable now specify the number of frames. "-X tracemalloc" is
  still accepted (NFRAME=1)
* rename clear_traces() to reset()
* rename Snapshot.top_by(group_by) to Snapshot.group_by(key_type);
  rename GroupedStats.group_by to GroupedStats.key_type
* get_tracemalloc_memory() only returns an int (total size),
  instead of a (size: int, free: int) tuple
* mention id() builtin function
* rename get_object_trace() to get_object_traceback(), the function now
  only returns the traceback (no longer the size)
* rename add_include_filter() to add_inclusive_filter(),
  and add_exclude_filter() to add_exclusive_filter()
* remove the StatsDiff class
* Remove match*() methods from the Filter class
* remove metrics

More details on filename pattern, traces, etc.
Victor Stinner 2013-10-23 20:03:33 +02:00
parent 9857747bc9
commit f218dd187d
1 changed files with 195 additions and 194 deletions


@ -59,9 +59,9 @@ This PEP proposes to add a new ``tracemalloc`` module, as a debug tool
to trace memory blocks allocated by Python. The module provides the
following information:
* Computed differences between two snapshots to detect memory leaks
* Statistics on allocated memory blocks per filename and per line
number: total size, number and average size of allocated memory blocks
* Computed differences between two snapshots to detect memory leaks
* Traceback where a memory block was allocated
The API of the tracemalloc module is similar to the API of the
@ -91,15 +91,17 @@ API
Main Functions
--------------
``clear_traces()`` function:
``reset()`` function:
Clear traces and statistics on Python memory allocations, and reset
the ``get_traced_memory()`` counter.
Clear traces and statistics on Python memory allocations.
See also ``disable()``.
``disable()`` function:
Stop tracing Python memory allocations.
Stop tracing Python memory allocations and clear traces and
statistics.
See also ``enable()`` and ``is_enabled()`` functions.
@ -108,8 +110,6 @@ Main Functions
Start tracing Python memory allocations.
At fork, the module is automatically disabled in the child process.
See also ``disable()`` and ``is_enabled()`` functions.
@ -120,6 +120,10 @@ Main Functions
``(size: int, count: int)`` tuple, *filename* and *line_number* can
be ``None``.
*size* is the total size in bytes of all memory blocks allocated on
the line, and *count* is the number of memory blocks allocated on the
line.
Return an empty dictionary if the ``tracemalloc`` module is
disabled.
@ -134,12 +138,8 @@ Main Functions
``get_tracemalloc_memory()`` function:
Get the memory usage in bytes of the ``tracemalloc`` module as a
tuple: ``(size: int, free: int)``.
* *size*: total size of bytes allocated by the module,
including *free* bytes
* *free*: number of free bytes available to store data
Get the memory usage in bytes of the ``tracemalloc`` module used
internally to trace memory allocations. Return an ``int``.
``is_enabled()`` function:
@ -147,78 +147,102 @@ Main Functions
``True`` if the ``tracemalloc`` module is tracing Python memory
allocations, ``False`` otherwise.
See also ``enable()`` and ``disable()`` functions.
See also ``disable()`` and ``enable()`` functions.
Trace Functions
---------------
``get_traceback_limit()`` function:
When Python allocates a memory block, ``tracemalloc`` attaches a "trace" to
it to store information on it: its size in bytes and the traceback where the
allocation occurred.
Get the maximum number of frames stored in the traceback of a trace
of a memory block.
The following functions give access to these traces. A trace is a ``(size: int,
traceback)`` tuple. *size* is the size of the memory block in bytes.
*traceback* is a tuple of frames sorted from the most recent to the oldest
frame, limited to ``get_traceback_limit()`` frames. A frame is
a ``(filename: str, lineno: int)`` tuple where *filename* and *lineno* can be
``None``.
Use the ``set_traceback_limit()`` function to change the limit.
Example of trace: ``(32, (('x.py', 7), ('x.py', 11)))``. The memory block has
a size of 32 bytes and was allocated at ``x.py:7``, a line called from
``x.py:11``.
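A minimal sketch of how such a trace tuple could be unpacked, in plain Python with no ``tracemalloc`` call involved. The ``describe_trace`` helper name and the placeholder text used for ``None`` values are hypothetical and only serve this illustration:

```python
def describe_trace(trace):
    """Format a (size, traceback) trace tuple as text (hypothetical helper)."""
    size, traceback = trace
    lines = ["%d bytes" % size]
    # Frames are sorted from the most recent to the oldest.
    for index, (filename, lineno) in enumerate(traceback):
        # filename and lineno can be None: use placeholders (an assumption).
        where = "%s:%s" % (
            filename if filename is not None else "<unknown>",
            lineno if lineno is not None else "?",
        )
        prefix = "allocated at" if index == 0 else "called from"
        lines.append("  %s %s" % (prefix, where))
    return "\n".join(lines)


# The example trace given above.
example = (32, (('x.py', 7), ('x.py', 11)))
report = describe_trace(example)
print(report)
```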
``get_object_address(obj)`` function:
Get the address of the main memory block of the specified Python object.
Get the address of the main memory block of the specified Python
object.
A Python object can be composed of multiple memory blocks, the
function only returns the address of the main memory block.
function only returns the address of the main memory block. For
example, items of ``dict`` and ``set`` containers are stored in a
second memory block.
See also ``get_object_trace()`` and ``gc.get_referrers()`` functions.
See also ``get_object_traceback()`` and ``gc.get_referrers()``
functions.
.. note::
The builtin function ``id()`` returns a different address for
objects tracked by the garbage collector, because ``id()``
returns the address after the garbage collector header.
``get_object_trace(obj)`` function:
``get_object_traceback(obj)`` function:
Get the trace of a Python object *obj* as a ``(size: int,
traceback)`` tuple where *traceback* is a tuple of ``(filename: str,
lineno: int)`` tuples, *filename* and *lineno* can be ``None``.
The function only returns the trace of the main memory block of the
object. The *size* of the trace is smaller than the total size of
the object if the object is composed of more than one memory block.
Get the traceback where the Python object *obj* was allocated.
Return a tuple of ``(filename: str, lineno: int)`` tuples,
*filename* and *lineno* can be ``None``.
Return ``None`` if the ``tracemalloc`` module did not trace the
allocation of the object.
See also ``get_object_address()``, ``get_trace()``,
``get_traces()``, ``gc.get_referrers()`` and ``sys.getsizeof()``
functions.
See also ``get_object_address()``, ``gc.get_referrers()`` and
``sys.getsizeof()`` functions.
``get_trace(address)`` function:
Get the trace of a memory block as a ``(size: int, traceback)``
tuple where *traceback* is a tuple of ``(filename: str, lineno:
int)`` tuples, *filename* and *lineno* can be ``None``.
Get the trace of a memory block allocated by Python. Return a tuple:
``(size: int, traceback)``, *traceback* is a tuple of ``(filename:
str, lineno: int)`` tuples, *filename* and *lineno* can be ``None``.
Return ``None`` if the ``tracemalloc`` module did not trace the
allocation of the memory block.
See also ``get_object_trace()``, ``get_stats()`` and
See also ``get_object_traceback()``, ``get_stats()`` and
``get_traces()`` functions.
``get_traceback_limit()`` function:
Get the maximum number of frames stored in the traceback of a trace.
By default, a trace of an allocated memory block only stores the
most recent frame: the limit is ``1``. This limit is enough to get
statistics using ``get_stats()``.
Use the ``set_traceback_limit()`` function to change the limit.
``get_traces()`` function:
Get traces of Python memory allocations as a dictionary ``{address
(int): trace}`` where *trace* is a ``(size: int, traceback)`` and
*traceback* is a list of ``(filename: str, lineno: int)``.
*traceback* can be empty, *filename* and *lineno* can be None.
Get traces of all memory blocks allocated by Python. Return a
dictionary: ``{address (int): trace}``, *trace* is a ``(size: int,
traceback)`` tuple, *traceback* is a tuple of ``(filename: str,
lineno: int)`` tuples, *filename* and *lineno* can be None.
Return an empty dictionary if the ``tracemalloc`` module is disabled.
Return an empty dictionary if the ``tracemalloc`` module is
disabled.
See also ``get_object_trace()``, ``get_stats()`` and ``get_trace()``
functions.
See also ``get_object_traceback()``, ``get_stats()`` and
``get_trace()`` functions.
``set_traceback_limit(nframe: int)`` function:
Set the maximum number of frames stored in the traceback of a trace
of a memory block.
Set the maximum number of frames stored in the traceback of a trace.
Storing the traceback of each memory allocation has a significant
overhead on the memory usage. Use the ``get_tracemalloc_memory()``
@ -227,6 +251,10 @@ Trace Functions
Use the ``get_traceback_limit()`` function to get the current limit.
The ``PYTHONTRACEMALLOC`` environment variable and the ``-X``
``tracemalloc=NFRAME`` command line option can be used to set a
limit at startup.
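A hedged sketch of both startup settings, assuming an interpreter in which the module is available under the name ``tracemalloc`` and accepts these options (CPython 3.4 and later). The helper names ``limit_with_option`` and ``limit_with_env`` are hypothetical. Each helper starts a child interpreter and reads back the limit with ``get_traceback_limit()``:

```python
import os
import subprocess
import sys

# Code run in the child interpreter: print the current traceback limit.
CODE = "import tracemalloc; print(tracemalloc.get_traceback_limit())"


def limit_with_option(nframe):
    """Start a child with -X tracemalloc=NFRAME and return its limit."""
    args = [sys.executable, "-X", "tracemalloc=%d" % nframe, "-c", CODE]
    return int(subprocess.check_output(args).decode().strip())


def limit_with_env(nframe):
    """Start a child with PYTHONTRACEMALLOC=NFRAME and return its limit."""
    env = dict(os.environ, PYTHONTRACEMALLOC=str(nframe))
    args = [sys.executable, "-c", CODE]
    return int(subprocess.check_output(args, env=env).decode().strip())


option_limit = limit_with_option(5)
env_limit = limit_with_env(3)
print(option_limit, env_limit)
```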
Filter Functions
----------------
@ -242,40 +270,48 @@ Filter Functions
trace.
The new filter is not applied on already collected traces. Use the
``clear_traces()`` function to ensure that all traces match the new
filter.
``reset()`` function to ensure that all traces match the new filter.
``add_include_filter(filename: str, lineno: int=None, traceback: bool=False)`` function:
``add_inclusive_filter(filename_pattern: str, lineno: int=None, traceback: bool=False)`` function:
Add an inclusive filter: helper for the ``add_filter()`` method
Add an inclusive filter: helper for the ``add_filter()`` function
creating a ``Filter`` instance with the ``Filter.include`` attribute
set to ``True``.
Example: ``tracemalloc.add_include_filter(tracemalloc.__file__)``
only includes memory blocks allocated by the ``tracemalloc`` module.
The ``*`` joker character can be used in *filename_pattern* to match
any substring, including an empty string.
Example: ``tracemalloc.add_inclusive_filter(subprocess.__file__)``
only includes memory blocks allocated by the ``subprocess`` module.
``add_exclude_filter(filename: str, lineno: int=None, traceback: bool=False)`` function:
``add_exclusive_filter(filename_pattern: str, lineno: int=None, traceback: bool=False)`` function:
Add an exclusive filter: helper for the ``add_filter()`` method
Add an exclusive filter: helper for the ``add_filter()`` function
creating a ``Filter`` instance with the ``Filter.include`` attribute
set to ``False``.
Example: ``tracemalloc.add_exclude_filter(tracemalloc.__file__)``
The ``*`` joker character can be used in *filename_pattern* to match
any substring, including an empty string.
Example: ``tracemalloc.add_exclusive_filter(tracemalloc.__file__)``
ignores memory blocks allocated by the ``tracemalloc`` module.
``clear_filters()`` function:
Reset the filter list.
Clear the filter list.
See also the ``get_filters()`` function.
``get_filters()`` function:
Get the filters on Python memory allocations as list of ``Filter``
instances.
Get the filters on Python memory allocations. Return a list of
``Filter`` instances.
By default, there is one exclusive filter to ignore Python memory
blocks allocated by the ``tracemalloc`` module.
See also the ``clear_filters()`` function.
@ -283,54 +319,35 @@ Filter Functions
Filter
------
``Filter(include: bool, pattern: str, lineno: int=None, traceback: bool=False)`` class:
``Filter(include: bool, filename_pattern: str, lineno: int=None, traceback: bool=False)`` class:
Filter to select which memory allocations are traced. Filters can be
used to reduce the memory usage of the ``tracemalloc`` module, which
can be read using the ``get_tracemalloc_memory()`` function.
``match(filename: str, lineno: int)`` method:
Return ``True`` if the filter matches the filename and line number,
``False`` otherwise.
``match_filename(filename: str)`` method:
Return ``True`` if the filter matches the filename, ``False`` otherwise.
``match_lineno(lineno: int)`` method:
Return ``True`` if the filter matches the line number, ``False``
otherwise.
``match_traceback(traceback)`` method:
Return ``True`` if the filter matches the *traceback*, ``False``
otherwise.
*traceback* is a tuple of ``(filename: str, lineno: int)`` tuples.
The ``*`` joker character can be used in *filename_pattern* to match
any substring, including an empty string. The ``.pyc`` and ``.pyo``
file extensions are replaced with ``.py``. On Windows, the
comparison is case insensitive and the alternative separator ``/``
is replaced with the standard separator ``\``.
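The matching rules above can be sketched in pure Python. This is an illustration under stated assumptions, not the module's implementation. The ``match_filename_pattern`` and ``normalize_filename`` names are hypothetical, the explicit *windows* flag stands in for a platform check, and applying the extension replacement to both the pattern and the filename is an assumption:

```python
import re


def normalize_filename(filename):
    """Replace a .pyc or .pyo extension with .py."""
    if filename.endswith((".pyc", ".pyo")):
        return filename[:-1]
    return filename


def match_filename_pattern(pattern, filename, windows=False):
    """Return True if filename matches pattern (hypothetical helper)."""
    if windows:
        # Case insensitive comparison, '/' replaced with '\'.
        pattern = pattern.replace("/", "\\").lower()
        filename = filename.replace("/", "\\").lower()
    pattern = normalize_filename(pattern)
    filename = normalize_filename(filename)
    # Only '*' is special: it matches any substring, including an empty one.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, filename, re.DOTALL) is not None


print(match_filename_pattern("*/subprocess.py", "/usr/lib/subprocess.pyc"))
print(match_filename_pattern("a*b", "abc"))
```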
``include`` attribute:
If *include* is ``True``, only trace memory blocks allocated in a
file with a name matching filename ``pattern`` at line number
file with a name matching ``filename_pattern`` at line number
``lineno``.
If *include* is ``False``, ignore memory blocks allocated in a file
with a name matching filename ``pattern`` at line number ``lineno``.
with a name matching ``filename_pattern`` at line number ``lineno``.
``lineno`` attribute:
Line number (``int``). If it is ``None`` or less than ``1``, it
matches any line number.
Line number (``int``) of the filter. If *lineno* is ``None`` or
less than ``1``, the filter matches any line number.
``pattern`` attribute:
``filename_pattern`` attribute:
The filename *pattern* can contain one or many ``*`` joker
characters which match any substring, including an empty string. The
``.pyc`` and ``.pyo`` file extensions are replaced with ``.py``. On
Windows, the comparison is case insensitive and the alternative
separator ``/`` is replaced with the standard separator ``\``.
Filename pattern (``str``) of the filter.
``traceback`` attribute:
@ -344,48 +361,68 @@ Filter
GroupedStats
------------
``GroupedStats(timestamp: datetime.datetime, stats: dict, group_by: str, cumulative=False, metrics: dict=None)`` class:
``GroupedStats(timestamp: datetime.datetime, traceback_limit: int, stats: dict, key_type: str, cumulative: bool)`` class:
Top of allocated memory blocks grouped by *group_by* as a
Top of allocated memory blocks grouped by *key_type* as a
dictionary.
The ``Snapshot.top_by()`` method creates a ``GroupedStats``
The ``Snapshot.group_by()`` method creates a ``GroupedStats``
instance.
``compare_to(old_stats: GroupedStats=None)`` method:
``compare_to(old_stats: GroupedStats, sort=True)`` method:
Compare to an older ``GroupedStats`` instance. Return a
``StatsDiff`` instance.
Compare statistics to an older ``GroupedStats`` instance. Return a
list of ``Statistic`` instances.
The ``StatsDiff.differences`` list is not sorted: call the
``StatsDiff.sort()`` method to sort the list.
The result is sorted from the biggest to the smallest by
``abs(size_diff)``, *size*, ``abs(count_diff)``, *count* and then by
*key*. Set the *sort* parameter to ``False`` to get the list
unsorted.
``None`` values are replaced with an empty string for filenames or
zero for line numbers, because ``str`` and ``int`` cannot be
compared to ``None``.
``None`` values in keys are replaced with an empty string for
filenames or zero for line numbers, because ``str`` and ``int``
cannot be compared to ``None``.
See also the ``statistics()`` method.
``statistics(sort=True)`` method:
Get statistics as a list of ``Statistic`` instances.
``Statistic.size_diff`` and ``Statistic.count_diff`` attributes are
set to zero.
The result is sorted from the biggest to the smallest by
``abs(size_diff)``, *size*, ``abs(count_diff)``, *count* and then by
*key*. Set the *sort* parameter to ``False`` to get the list
unsorted.
``None`` values in keys are replaced with an empty string for
filenames or zero for line numbers, because ``str`` and ``int``
cannot be compared to ``None``.
See also the ``compare_to()`` method.
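A minimal sketch of the comparison and of the sort order described for ``compare_to()`` and ``statistics()``, written against plain ``{key: (size, count)}`` dictionaries keyed by filename. The ``Statistic`` namedtuple is only a stand-in for the class described in this PEP, ``compare_stats`` and ``sort_key`` are hypothetical names, and reporting keys that exist only in the older statistics with a size and count of zero is an assumption:

```python
from collections import namedtuple

# Stand-in for the Statistic class, with the same field names.
Statistic = namedtuple("Statistic", "key size size_diff count count_diff")


def sort_key(stat):
    # None filenames are replaced with an empty string so that keys
    # can be compared with each other.
    key = stat.key if stat.key is not None else ""
    return (abs(stat.size_diff), stat.size,
            abs(stat.count_diff), stat.count, key)


def compare_stats(new_stats, old_stats, sort=True):
    """Compare two {filename: (size, count)} dicts (hypothetical helper)."""
    result = []
    for key, (size, count) in new_stats.items():
        old_size, old_count = old_stats.get(key, (0, 0))
        result.append(Statistic(key, size, size - old_size,
                                count, count - old_count))
    # Assumption: keys missing from new_stats are reported as zero.
    for key, (old_size, old_count) in old_stats.items():
        if key not in new_stats:
            result.append(Statistic(key, 0, -old_size, 0, -old_count))
    if sort:
        # From the biggest to the smallest.
        result.sort(key=sort_key, reverse=True)
    return result


old = {"a.py": (100, 2), "b.py": (50, 1)}
new = {"a.py": (300, 5), "c.py": (40, 4), None: (10, 1)}
diff = compare_stats(new, old)
for stat in diff:
    print(stat)
```

With ``old_stats`` set to an empty dictionary, every difference equals the current value; ``statistics()`` as described above differs in that it sets ``size_diff`` and ``count_diff`` to zero.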
``cumulative`` attribute:
If ``True``, cumulate size and count of memory blocks of all frames
of the traceback of a trace, not only the most recent frame.
If ``True``, size and count of memory blocks of all frames of the
traceback of a trace were cumulated, not only the most recent frame.
``metrics`` attribute:
Dictionary storing metrics read when the snapshot was created:
``{name (str): metric}`` where *metric* type is ``Metric``.
``group_by`` attribute:
``key_type`` attribute:
Determine how memory allocations were grouped: see
``Snapshot.top_by()`` for the available values.
``Snapshot.group_by()`` for the available values.
``stats`` attribute:
Dictionary ``{key: stats}`` where the *key* type depends on the
``group_by`` attribute and *stats* is a ``(size: int, count: int)``
tuple.
Dictionary ``{key: (size: int, count: int)}`` where the type of
*key* depends on the ``key_type`` attribute.
See the ``Snapshot.top_by()`` method.
See the ``Snapshot.group_by()`` method.
``traceback_limit`` attribute:
Maximum number of frames stored in the traceback of ``traces``,
result of the ``get_traceback_limit()`` function.
``timestamp`` attribute:
@ -393,41 +430,13 @@ GroupedStats
instance.
Metric
------
``Metric(name: str, value: int, format: str)`` class:
Value of a metric when a snapshot is created.
``name`` attribute:
Name of the metric.
``value`` attribute:
Value of the metric.
``format`` attribute:
Format of the metric (``str``).
Snapshot
--------
``Snapshot(timestamp: datetime.datetime, traces: dict=None, stats: dict=None)`` class:
Snapshot of traces and statistics on memory blocks allocated by Python.
``add_metric(name: str, value: int, format: str)`` method:
Helper to add a ``Metric`` instance to ``Snapshot.metrics``. Return
the newly created ``Metric`` instance.
Raise an exception if the name is already present in
``Snapshot.metrics``.
``Snapshot(timestamp: datetime.datetime, traceback_limit: int, stats: dict=None, traces: dict=None)`` class:
Snapshot of statistics and traces of memory blocks allocated by
Python.
``apply_filters(filters)`` method:
@ -437,7 +446,8 @@ Snapshot
``create(traces=False)`` classmethod:
Take a snapshot of traces and/or statistics of allocated memory blocks.
Take a snapshot of statistics and traces of memory blocks allocated
by Python.
If *traces* is ``True``, ``get_traces()`` is called and its result
is stored in the ``Snapshot.traces`` attribute. This attribute
@ -449,55 +459,46 @@ Snapshot
``set_traceback_limit()`` before calling ``Snapshot.create()`` to
store more frames.
The ``tracemalloc`` module must be enabled to take a snapshot. See
The ``tracemalloc`` module must be enabled to take a snapshot, see
the ``enable()`` function.
``get_metric(name, default=None)`` method:
``dump(filename)`` method:
Get the value of the metric called *name*. Return *default* if the
metric does not exist.
Write the snapshot into a file.
Use ``load()`` to reload the snapshot.
``load(filename, traces=True)`` classmethod:
``load(filename)`` classmethod:
Load a snapshot from a file.
If *traces* is ``False``, don't load traces.
See also ``dump()``.
``top_by(group_by: str, cumulative: bool=False)`` method:
``group_by(key_type: str, cumulative: bool=False)`` method:
Compute top statistics grouped by *group_by* as a ``GroupedStats``
instance:
Group statistics by *key_type* as a ``GroupedStats`` instance:
===================== ======================== ================================
group_by              description              key type
===================== ======================== ================================
``'filename'``        filename                 ``str``
``'line'``            filename and line number ``(filename: str, lineno: int)``
``'address'``         memory block address     ``int``
``'traceback'``       traceback                ``(address: int, traceback)``
===================== ======================== ================================
===================== =================================== ================================
key_type              description                         type
===================== =================================== ================================
``'filename'``        filename                            ``str``
``'line'``            filename and line number            ``(filename: str, lineno: int)``
``'address'``         memory block address                ``int``
``'traceback'``       memory block address with traceback ``(address: int, traceback)``
===================== =================================== ================================
The ``traceback`` type is a tuple of ``(filename: str, lineno:
int)`` tuples, *filename* and *lineno* can be ``None``.
If *cumulative* is ``True``, cumulate size and count of memory
blocks of all frames of the traceback of a trace, not only the most
recent frame. The *cumulative* parameter is ignored if *group_by*
is ``'address'`` or if the traceback limit is less than ``2``.
recent frame. The *cumulative* parameter is set to ``False`` if
*key_type* is ``'address'``, or if the traceback limit is less than
``2``.
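The grouping can be sketched in pure Python on a ``{address: (size, traceback)}`` dictionary such as the one returned by ``get_traces()``. This is an illustration, not the module's implementation. The ``group_traces`` name is hypothetical, only the ``'filename'`` and ``'line'`` key types are handled, and counting a memory block once per key when the same key appears in several frames is an assumption:

```python
def group_traces(traces, key_type, cumulative=False):
    """Group traces into {key: (size, count)} (hypothetical helper)."""
    if key_type not in ("filename", "line"):
        raise ValueError("this sketch only handles 'filename' and 'line'")
    stats = {}
    for size, traceback in traces.values():
        # Without cumulative, only the most recent frame is used.
        frames = traceback if cumulative else traceback[:1]
        if not frames:
            frames = ((None, None),)
        keys = set()
        for filename, lineno in frames:
            if key_type == "filename":
                keys.add(filename)
            else:
                keys.add((filename, lineno))
        # Assumption: a block is counted once per key, even if several
        # frames of its traceback map to the same key.
        for key in keys:
            old_size, old_count = stats.get(key, (0, 0))
            stats[key] = (old_size + size, old_count + 1)
    return stats


traces = {
    0x1000: (32, (('x.py', 7), ('x.py', 11))),
    0x2000: (100, (('y.py', 3), ('x.py', 11))),
}
by_line = group_traces(traces, 'line')
by_line_cumulative = group_traces(traces, 'line', cumulative=True)
by_file_cumulative = group_traces(traces, 'filename', cumulative=True)
print(by_line)
print(by_line_cumulative)
print(by_file_cumulative)
```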
``write(filename)`` method:
Write the snapshot into a file.
``metrics`` attribute:
Dictionary storing metrics read when the snapshot was created:
``{name (str): metric}`` where *metric* type is ``Metric``.
``stats`` attribute:
Statistics on traced Python memory, result of the ``get_stats()``
@ -505,8 +506,8 @@ Snapshot
``traceback_limit`` attribute:
Maximum number of frames stored in a trace of a memory block
allocated by Python.
Maximum number of frames stored in the traceback of ``traces``,
result of the ``get_traceback_limit()`` function.
``traces`` attribute:
@ -519,37 +520,37 @@ Snapshot
instance.
StatsDiff
Statistic
---------
``StatsDiff(differences, old_stats, new_stats)`` class:
``Statistic(key, size, size_diff, count, count_diff)`` class:
Differences between two ``GroupedStats`` instances.
Statistic on memory allocations.
The ``GroupedStats.compare_to()`` method creates a ``StatsDiff``
instance.
``GroupedStats.compare_to()`` and ``GroupedStats.statistics()``
return a list of ``Statistic`` instances.
``sort()`` method:
``key`` attribute:
Sort the ``differences`` list from the biggest difference to the
smallest difference. Sort by ``abs(size_diff)``, *size*,
``abs(count_diff)``, *count* and then by *key*.
Key identifying the statistic. The key type depends on
``GroupedStats.key_type``, see the ``Snapshot.group_by()`` method.
``differences`` attribute:
Differences between ``old_stats`` and ``new_stats`` as a list of
``(size_diff, size, count_diff, count, key)`` tuples. *size_diff*,
*size*, *count_diff* and *count* are ``int``. The key type depends
on the ``GroupedStats.group_by`` attribute of ``new_stats``: see the
``Snapshot.top_by()`` method.
``count`` attribute:
``old_stats`` attribute:
Number of memory blocks (``int``).
Old ``GroupedStats`` instance, can be ``None``.
``count_diff`` attribute:
``new_stats`` attribute:
Difference of number of memory blocks (``int``).
New ``GroupedStats`` instance.
``size`` attribute:
Total size of memory blocks in bytes (``int``).
``size_diff`` attribute:
Difference of total size of memory blocks in bytes (``int``).
Prior Work