PEP 454: add missing footer, formatting and typo fixes.

gbrandl 2013-10-08 15:54:15 +02:00
parent 1e063f7102
commit 3f90114628
1 changed file with 111 additions and 89 deletions

@ -13,43 +13,44 @@ Python-Version: 3.4
Abstract Abstract
======== ========
Add a new ``tracemalloc`` module to trace memory blocks allocated by Python. This PEP proposes to add a new ``tracemalloc`` module to trace memory
blocks allocated by Python.
Rationale Rationale
========= =========
Common debug tools tracing memory allocations read the C filename and Common debug tools tracing memory allocations record the C filename
line number. Using such tool to analyze Python memory allocations does and line number where the allocation occurs. Using such a tool to
not help because most memory block are allocated in the same C function, analyze Python memory allocations does not help because most memory
in ``PyMem_Malloc()`` for example. blocks are allocated in the same C function, in ``PyMem_Malloc()`` for
example.
There are debug tools dedicated to the Python language like ``Heapy`` There are debug tools dedicated to the Python language like ``Heapy``
and ``PySizer``. These tools analyze objects type and/or content. They and ``PySizer``. These tools analyze objects type and/or content.
are useful when most memory leaks are instances of the same type and They are useful when most memory leaks are instances of the same type
this type is only instantiated in a few functions. The problem is when and this type is only instantiated in a few functions. Problems arise
the object type is very common like ``str`` or ``tuple``, and it is hard when the object type is very common like ``str`` or ``tuple``, and it
to identify where these objects are instantiated. is hard to identify where these objects are instantiated.
Finding reference cycles is also a difficult problem. There are Finding reference cycles is also a difficult problem. There are
different tools to draw a diagram of all references. These tools cannot different tools to draw a diagram of all references. These tools
be used on large applications with thousands of objects because the cannot be used on large applications with thousands of objects because
diagram is too huge to be analyzed manually. the diagram is too huge to be analyzed manually.
Proposal Proposal
======== ========
Using the PEP 445, it becomes easy to setup an hook on Python memory Using the customized allocation API from PEP 445, it becomes easy to
allocators. A hook can inspect Python internals to retrieve the Python set up a hook on Python memory allocators. A hook can inspect Python
tracebacks. internals to retrieve Python tracebacks.
This PEP proposes to add a new ``tracemalloc`` module. It is a debug This PEP proposes to add a new ``tracemalloc`` module, as a debug
tool to trace memory blocks allocated by Python. The module provides the tool to trace memory blocks allocated by Python. The module provides the
following information: following information:
* Compute the differences between two snapshots to detect memory leaks * Computed differences between two snapshots to detect memory leaks
* Statistics on allocated memory blocks per filename and per line * Statistics on allocated memory blocks per filename and per line
number: total size, number and average size of allocated memory blocks number: total size, number and average size of allocated memory blocks
* Traceback where a memory block was allocated * Traceback where a memory block was allocated
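
For readers following the list above, here is a minimal sketch of the snapshot comparison workflow. It uses the ``tracemalloc`` module as it later shipped in the standard library (Python 3.4+), where some names differ from this draft (for instance, ``start()`` instead of ``enable()``); the ``leak`` list is a hypothetical stand-in for a real leak.

```python
import tracemalloc

# Assumption: the shipped API (Python 3.4+), not the draft names in this PEP.
tracemalloc.start()

before = tracemalloc.take_snapshot()
# Hypothetical "leak": about 1 MB of bytes objects kept alive by a list.
leak = [bytes(1000) for _ in range(1000)]
after = tracemalloc.take_snapshot()

# Differences between the two snapshots, grouped per filename and line number.
# The result is sorted so that the largest change comes first.
diff = after.compare_to(before, "lineno")
biggest = diff[0]
print(biggest.traceback, biggest.size_diff, biggest.count_diff)

tracemalloc.stop()
```

Each entry of ``diff`` carries the size and block-count deltas for one source line, which is usually enough to point at the allocation site that grew.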
@ -57,13 +58,13 @@ following information:
The API of the tracemalloc module is similar to the API of the The API of the tracemalloc module is similar to the API of the
faulthandler module: ``enable()``, ``disable()`` and ``is_enabled()`` faulthandler module: ``enable()``, ``disable()`` and ``is_enabled()``
functions, an environment variable (``PYTHONFAULTHANDLER`` and functions, an environment variable (``PYTHONFAULTHANDLER`` and
``PYTHONTRACEMALLOC``), a ``-X`` command line option (``-X ``PYTHONTRACEMALLOC``), and a ``-X`` command line option (``-X
faulthandler`` and ``-X tracemalloc``). See the faulthandler`` and ``-X tracemalloc``). See the
`documentation of the faulthandler module `documentation of the faulthandler module
<http://docs.python.org/dev/library/faulthandler.html>`_. <http://docs.python.org/3/library/faulthandler.html>`_.
The tracemalloc module has been written for CPython. Other The tracemalloc module has been written for CPython. Other
implementations of Python may not provide it. implementations of Python may not be able to provide it.
API API
@ -72,15 +73,16 @@ API
To trace most memory blocks allocated by Python, the module should be To trace most memory blocks allocated by Python, the module should be
enabled as early as possible by setting the ``PYTHONTRACEMALLOC`` enabled as early as possible by setting the ``PYTHONTRACEMALLOC``
environment variable to ``1``, or by using ``-X tracemalloc`` command environment variable to ``1``, or by using ``-X tracemalloc`` command
line option. The ``tracemalloc.enable()`` function can also be called to line option. The ``tracemalloc.enable()`` function can also be called
start tracing Python memory allocations. to start tracing Python memory allocations.
By default, a trace of an allocated memory block only stores one frame. By default, a trace of an allocated memory block only stores one
Use the ``set_traceback_limit()`` function to store more frames. frame. Use the ``set_traceback_limit()`` function to store additional
frames.
Python memory blocks allocated in the ``tracemalloc`` module are also Python memory blocks allocated in the ``tracemalloc`` module itself are
traced by default. Use ``add_exclude_filter(tracemalloc.__file__)`` to also traced by default. Use ``add_exclude_filter(tracemalloc.__file__)``
ignore these memory allocations. to ignore these memory allocations.
At fork, the module is automatically disabled in the child process. At fork, the module is automatically disabled in the child process.
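
As an illustration of storing more than one frame per trace, the sketch below again relies on the shipped module rather than the draft API: there, the frame limit is passed to ``start()`` instead of a separate ``set_traceback_limit()`` call, and ``get_object_traceback()`` plays the role of ``get_object_trace()``. The ``make_buffer()`` helper is hypothetical.

```python
import tracemalloc


def make_buffer():
    # Hypothetical allocation site that should show up in the stored traceback.
    return bytearray(100_000)


# Assumption: shipped API (Python 3.4+). Store up to 5 frames per trace.
tracemalloc.start(5)
limit = tracemalloc.get_traceback_limit()

buf = make_buffer()
tb = tracemalloc.get_object_traceback(buf)  # None if the object was not traced
frames = [(frame.filename, frame.lineno) for frame in tb] if tb else []

tracemalloc.stop()

for filename, lineno in frames:
    print(filename, lineno)
```

A higher limit gives more context per allocation at the cost of extra memory used by the tracer itself.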
@ -98,7 +100,8 @@ Main Functions
``clear_traces()`` function: ``clear_traces()`` function:
Clear all traces and statistics on Python memory allocations, and Clear all traces and statistics on Python memory allocations, and
reset the ``get_arena_size()`` and ``get_traced_memory()`` counters. reset the ``get_arena_size()`` and ``get_traced_memory()``
counters.
``disable()`` function: ``disable()`` function:
@ -148,31 +151,35 @@ Trace Functions
``get_traceback_limit()`` function: ``get_traceback_limit()`` function:
Get the maximum number of frames stored in the traceback of a trace Get the maximum number of frames stored in the traceback of a
of a memory block. trace of a memory block.
Use the ``set_traceback_limit()`` function to change the limit. Use the ``set_traceback_limit()`` function to change the limit.
``get_object_address(obj)`` function: ``get_object_address(obj)`` function:
Get the address of the memory block of the specified Python object. Get the address of the memory block of the specified Python
object.
A Python object can be composed by multiple memory blocks, the A Python object can be composed by multiple memory blocks, the
function only returns the address of the main memory block. function only returns the address of the main memory block.
See also ``get_object_trace()`` and ``gc.get_referrers()`` functions. See also ``get_object_trace()`` and ``gc.get_referrers()``
functions.
``get_object_trace(obj)`` function: ``get_object_trace(obj)`` function:
Get the trace of a Python object *obj* as a ``(size: int, Get the trace of a Python object *obj* as a ``(size: int,
traceback)`` tuple where *traceback* is a tuple of ``(filename: str, traceback)`` tuple where *traceback* is a tuple of ``(filename:
lineno: int)`` tuples, *filename* and *lineno* can be ``None``. str, lineno: int)`` tuples, *filename* and *lineno* can be
``None``.
The function only returns the trace of the main memory block of the The function only returns the trace of the main memory block of
object. The *size* of the trace is smaller than the total size of the object. The *size* of the trace is smaller than the total
the object if the object is composed by more than one memory block. size of the object if the object is composed by more than one
memory block.
Return ``None`` if the ``tracemalloc`` module did not trace the Return ``None`` if the ``tracemalloc`` module did not trace the
allocation of the object. allocation of the object.
@ -199,21 +206,21 @@ Trace Functions
Get all traces of Python memory allocations as a dictionary Get all traces of Python memory allocations as a dictionary
``{address (int): trace}`` where *trace* is a ``(size: int, ``{address (int): trace}`` where *trace* is a ``(size: int,
traceback)`` and *traceback* is a list of ``(filename: str, lineno: traceback)`` and *traceback* is a list of ``(filename: str,
int)``. *traceback* can be empty, *filename* and *lineno* can be lineno: int)``. *traceback* can be empty, *filename* and *lineno*
None. can be ``None``.
Return an empty dictionary if the ``tracemalloc`` module is Return an empty dictionary if the ``tracemalloc`` module is
disabled. disabled.
See also ``get_object_trace()``, ``get_stats()`` and ``get_trace()`` See also ``get_object_trace()``, ``get_stats()`` and
functions. ``get_trace()`` functions.
``set_traceback_limit(nframe: int)`` function: ``set_traceback_limit(nframe: int)`` function:
Set the maximum number of frames stored in the traceback of a trace Set the maximum number of frames stored in the traceback of a
of a memory block. trace of a memory block.
Storing the traceback of each memory allocation has an important Storing the traceback of each memory allocation has an important
overhead on the memory usage. Use the ``get_tracemalloc_memory()`` overhead on the memory usage. Use the ``get_tracemalloc_memory()``
@ -237,24 +244,25 @@ Filter Functions
trace. trace.
The new filter is not applied on already collected traces. Use the The new filter is not applied on already collected traces. Use the
``clear_traces()`` function to ensure that all traces match the new ``clear_traces()`` function to ensure that all traces match the
filter. new filter.
``add_include_filter(filename: str, lineno: int=None, traceback: bool=False)`` function: ``add_include_filter(filename: str, lineno: int=None, traceback: bool=False)`` function:
Add an inclusive filter: helper for the ``add_filter()`` method Add an inclusive filter: helper for the ``add_filter()`` method
creating a ``Filter`` instance with the ``Filter.include`` attribute creating a ``Filter`` instance with the ``Filter.include``
set to ``True``. attribute set to ``True``.
Example: ``tracemalloc.add_include_filter(tracemalloc.__file__)`` Example: ``tracemalloc.add_include_filter(tracemalloc.__file__)``
only includes memory blocks allocated by the ``tracemalloc`` module. only includes memory blocks allocated by the ``tracemalloc``
module.
``add_exclude_filter(filename: str, lineno: int=None, traceback: bool=False)`` function: ``add_exclude_filter(filename: str, lineno: int=None, traceback: bool=False)`` function:
Add an exclusive filter: helper for the ``add_filter()`` method Add an exclusive filter: helper for the ``add_filter()`` method
creating a ``Filter`` instance with the ``Filter.include`` attribute creating a ``Filter`` instance with the ``Filter.include``
set to ``False``. attribute set to ``False``.
Example: ``tracemalloc.add_exclude_filter(tracemalloc.__file__)`` Example: ``tracemalloc.add_exclude_filter(tracemalloc.__file__)``
ignores memory blocks allocated by the ``tracemalloc`` module. ignores memory blocks allocated by the ``tracemalloc`` module.
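
A rough equivalent of the exclusive filter in the shipped module (Python 3.4+) is sketched below. Note the assumption: filters there are applied to a snapshot with ``Snapshot.filter_traces()`` rather than registered globally with ``add_exclude_filter()``; the ``payload`` list is only there to generate some allocations.

```python
import tracemalloc

tracemalloc.start()
payload = [str(i) * 10 for i in range(1000)]  # some allocations to trace
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Exclusive filter (inclusive=False): drop traces whose most recent frame
# is located in the tracemalloc module itself.
exclude_self = tracemalloc.Filter(False, tracemalloc.__file__)
filtered = snapshot.filter_traces([exclude_self])

# Filenames that still appear after filtering.
remaining = {
    frame.filename
    for stat in filtered.statistics("filename")
    for frame in stat.traceback
}
```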
@ -343,15 +351,18 @@ the ``Snapshot.add_metric()`` method.
| ``arena_alignment`` | Number of bytes for arena alignment padding | | ``arena_alignment`` | Number of bytes for arena alignment padding |
+---------------------+-------------------------------------------------------+ +---------------------+-------------------------------------------------------+
The function is not available if Python is compiled without ``pymalloc``. The function is not available if Python is compiled without
``pymalloc``.
See also ``get_arena_size()`` and ``sys._debugmallocstats()`` functions. See also the ``get_arena_size()`` and ``sys._debugmallocstats()``
functions.
``get_traced_memory()`` function: ``get_traced_memory()`` function:
Get the current size and maximum size of memory blocks traced by the Get the current size and maximum size of memory blocks traced by
``tracemalloc`` module as a tuple: ``(size: int, max_size: int)``. the ``tracemalloc`` module as a tuple: ``(size: int, max_size:
int)``.
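
A short sketch of reading the current and peak traced sizes, using ``get_traced_memory()`` as it exists in the shipped module (Python 3.4+), where it returns a ``(current, peak)`` tuple; the 2 MB ``bytearray`` is an arbitrary example allocation.

```python
import tracemalloc

tracemalloc.start()

data = bytearray(2_000_000)  # arbitrary 2 MB allocation
current, peak = tracemalloc.get_traced_memory()

del data  # release the buffer; the current size drops, the peak does not
after_free, peak_after = tracemalloc.get_traced_memory()

tracemalloc.stop()

print(current, peak, after_free, peak_after)
```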
``get_tracemalloc_memory()`` function: ``get_tracemalloc_memory()`` function:
@ -378,12 +389,12 @@ DisplayTop
``DisplayTop()`` class: ``DisplayTop()`` class:
Display the top of allocated memory blocks. Display the "top-n" biggest allocated memory blocks.
``display(count=10, group_by="line", cumulative=False, file=None, callback=None)`` method: ``display(count=10, group_by="line", cumulative=False, file=None, callback=None)`` method:
Take a snapshot and display the top *count* biggest allocated memory Take a snapshot and display the top *count* biggest allocated
blocks grouped by *group_by*. memory blocks grouped by *group_by*.
*callback* is an optional callable object which can be used to add *callback* is an optional callable object which can be used to add
metrics to a snapshot. It is called with only one parameter: the metrics to a snapshot. It is called with only one parameter: the
@ -405,8 +416,8 @@ DisplayTop
``display_top_stats(top_stats, count=10, file=None)`` method: ``display_top_stats(top_stats, count=10, file=None)`` method:
Display the top of allocated memory blocks grouped by the Display the top of allocated memory blocks grouped by the
``GroupedStats.group_by`` attribute of *top_stats*, *top_stats* is a ``GroupedStats.group_by`` attribute of *top_stats*, *top_stats* is
``GroupedStats`` instance. a ``GroupedStats`` instance.
``average`` attribute: ``average`` attribute:
@ -851,8 +862,8 @@ StatsDiff
Differences between ``old_stats`` and ``new_stats`` as a list of Differences between ``old_stats`` and ``new_stats`` as a list of
``(size_diff, size, count_diff, count, key)`` tuples. *size_diff*, ``(size_diff, size, count_diff, count, key)`` tuples. *size_diff*,
*size*, *count_diff* and *count* are ``int``. The key type depends *size*, *count_diff* and *count* are ``int``. The key type depends
on the ``GroupedStats.group_by`` attribute of ``new_stats``: see the on the ``GroupedStats.group_by`` attribute of ``new_stats``: see
``Snapshot.top_by()`` method. the ``Snapshot.top_by()`` method.
``old_stats`` attribute: ``old_stats`` attribute:
@ -869,8 +880,8 @@ Task
``Task(func, *args, **kw)`` class: ``Task(func, *args, **kw)`` class:
Task calling ``func(*args, **kw)``. When scheduled, the task is Task calling ``func(*args, **kw)``. When scheduled, the task is
called when the traced memory is increased or decreased by more than called when the traced memory is increased or decreased by more
*threshold* bytes, or after *delay* seconds. than *threshold* bytes, or after *delay* seconds.
``call()`` method: ``call()`` method:
@ -892,10 +903,10 @@ Task
``get_memory_threshold()`` method: ``get_memory_threshold()`` method:
Get the threshold of the traced memory. When scheduled, the task is Get the threshold of the traced memory. When scheduled, the task
called when the traced memory is increased or decreased by more than is called when the traced memory is increased or decreased by more
*threshold* bytes. The memory threshold is disabled if *threshold* than *threshold* bytes. The memory threshold is disabled if
is ``None``. *threshold* is ``None``.
See also the ``set_memory_threshold()`` method and the See also the ``set_memory_threshold()`` method and the
``get_traced_memory()`` function. ``get_traced_memory()`` function.
@ -903,20 +914,20 @@ Task
``schedule(repeat: int=None)`` method: ``schedule(repeat: int=None)`` method:
Schedule the task *repeat* times. If *repeat* is ``None``, the task Schedule the task *repeat* times. If *repeat* is ``None``, the
is rescheduled after each call until it is cancelled. task is rescheduled after each call until it is cancelled.
If the method is called twice, the task is rescheduled with the new If the method is called twice, the task is rescheduled with the
*repeat* parameter. new *repeat* parameter.
The task must have a memory threshold or a delay: see The task must have a memory threshold or a delay: see
``set_delay()`` and ``set_memory_threshold()`` methods. The ``set_delay()`` and ``set_memory_threshold()`` methods. The
``tracemalloc`` must be enabled to schedule a task: see the ``tracemalloc`` must be enabled to schedule a task: see the
``enable`` function. ``enable`` function.
The task is cancelled if the ``call()`` method raises an exception. The task is cancelled if the ``call()`` method raises an
The task can be cancelled using the ``cancel()`` method or the exception. The task can be cancelled using the ``cancel()``
``cancel_tasks()`` function. method or the ``cancel_tasks()`` function.
``set_delay(seconds: int)`` method: ``set_delay(seconds: int)`` method:
@ -925,18 +936,19 @@ Task
delay to ``None`` to disable the timer. delay to ``None`` to disable the timer.
The timer is based on the Python memory allocator, it is not real The timer is based on the Python memory allocator, it is not real
time. The task is called after at least *delay* seconds, it is not time. The task is called after at least *delay* seconds, it is
called exactly after *delay* seconds if no Python memory allocation not called exactly after *delay* seconds if no Python memory
occurred. The timer has a resolution of 1 second. allocation occurred. The timer has a resolution of 1 second.
The task is rescheduled if it was scheduled. The task is rescheduled if it was scheduled.
``set_memory_threshold(size: int)`` method: ``set_memory_threshold(size: int)`` method:
Set the threshold of the traced memory. When scheduled, the task is Set the threshold of the traced memory. When scheduled, the task
called when the traced memory is increased or decreased by more than is called when the traced memory is increased or decreased by more
*threshold* bytes. Set the threshold to ``None`` to disable it. than *threshold* bytes. Set the threshold to ``None`` to disable
it.
The task is rescheduled if it was scheduled. The task is rescheduled if it was scheduled.
@ -976,8 +988,8 @@ TakeSnapshotTask
``take_snapshot()`` method: ``take_snapshot()`` method:
Take a snapshot and write it into a file. Return ``(snapshot, Take a snapshot and write it into a file. Return ``(snapshot,
filename)`` where *snapshot* is a ``Snapshot`` instance and filename filename)`` where *snapshot* is a ``Snapshot`` instance and
type is ``str``. filename type is ``str``.
``callback`` attribute: ``callback`` attribute:
@ -993,8 +1005,8 @@ TakeSnapshotTask
* ``$pid``: identifier of the current process * ``$pid``: identifier of the current process
* ``$timestamp``: current date and time * ``$timestamp``: current date and time
* ``$counter``: counter starting at 1 and incremented at each snapshot, * ``$counter``: counter starting at 1 and incremented at each
formatted as 4 decimal digits snapshot, formatted as 4 decimal digits
The default template is ``'tracemalloc-$counter.pickle'``. The default template is ``'tracemalloc-$counter.pickle'``.
@ -1034,5 +1046,15 @@ Similar projects:
Copyright Copyright
========= =========
This document has been placed into the public domain. This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: