422 lines
14 KiB
ReStructuredText
422 lines
14 KiB
ReStructuredText
PEP: 669
|
|
Title: Low Impact Monitoring for CPython
|
|
Author: Mark Shannon <mark@hotpy.org>
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 18-Aug-2021
|
|
Post-History: 7-Dec-2021
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
Using a profiler or debugger in CPython can have a severe impact on
|
|
performance. Slowdowns by an order of magnitude are common.
|
|
|
|
This PEP proposes an API for monitoring of Python programs running
|
|
on CPython that will enable monitoring at low cost.
|
|
|
|
Although this PEP does not specify an implementation, it is expected that
|
|
it will be implemented using the quickening step of PEP 659 [1]_.
|
|
|
|
A ``sys.monitoring`` namespace will be added, which will contain
|
|
the relevant functions and enum.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
Developers should not have to pay an unreasonable cost to use debuggers,
|
|
profilers and other similar tools.
|
|
|
|
C++ and Java developers expect to be able to run a program at full speed
|
|
(or very close to it) under a debugger.
|
|
Python developers should expect that too.
|
|
|
|
Rationale
|
|
=========
|
|
|
|
The quickening mechanism provided by PEP 659 provides a way to dynamically
|
|
modify executing Python bytecode. These modifications have little cost beyond
|
|
the parts of the code that are modified and a relatively low cost to those
|
|
parts that are modified. We can leverage this to provide an efficient
|
|
mechanism for monitoring that was not possible in 3.10 or earlier.
|
|
|
|
By using quickening, we expect that code run under a debugger on 3.11
|
|
should easily outperform code run without a debugger on 3.10.
|
|
Profiling will still slow down execution, but by much less than in 3.10.
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
Monitoring of Python programs is done by registering callback functions
|
|
for events and by activating a set of events.
|
|
|
|
Activating events and registering callback functions are independent of each other.
|
|
|
|
Events
|
|
------
|
|
|
|
As a code object executes various events occur that might be of interest
|
|
to tools. By activating events and by registering callback functions
|
|
tools can respond to these events in any way that suits them.
|
|
Events can be set globally, or for individual code objects.
|
|
|
|
For 3.11, CPython will support the following events:
|
|
|
|
* PY_CALL: Call of a Python function (occurs immediately after the call, the callee's frame will be on the stack)
|
|
* PY_RESUME: Resumption of a Python function (for generator and coroutine functions), except for throw() calls.
|
|
* PY_THROW: A Python function is resumed by a throw() call.
|
|
* PY_RETURN: Return from a Python function (occurs immediately before the return, the callee's frame will be on the stack).
|
|
* PY_YIELD: Yield from a Python function (occurs immediately before the yield, the callee's frame will be on the stack).
|
|
* PY_UNWIND: Exit from a Python function during exception unwinding.
|
|
* C_CALL: Call of a builtin function (before the call in this case).
|
|
* C_RETURN: Return from a builtin function (after the return in this case).
|
|
* RAISE: An exception is raised.
|
|
* EXCEPTION_HANDLED: An exception is handled.
|
|
* LINE: An instruction is about to be executed that has a different line number from the preceding instruction.
|
|
* INSTRUCTION -- A VM instruction is about to be executed.
|
|
* JUMP -- An unconditional jump in the control flow graph is reached.
|
|
* BRANCH -- A conditional branch is about to be taken (or not).
|
|
* MARKER -- A marker is hit
|
|
|
|
More events may be added in the future.
|
|
|
|
All events will be attributes of the ``Event`` enum in ``sys.monitoring``::
|
|
|
|
class Event(enum.IntFlag):
|
|
PY_CALL = ...
|
|
|
|
Note that ``Event`` is an ``IntFlag`` which means that the events can be or-ed
|
|
together to form a set of events.
|
|
|
|
Setting events globally
|
|
-----------------------
|
|
|
|
Events can be controlled globally by modifying the set of events being monitored:
|
|
|
|
* ``sys.monitoring.get_events()->Event``
|
|
Returns the ``Event`` set for all the active events.
|
|
|
|
* ``sys.monitoring.set_events(event_set: Event)``
|
|
Activates all events which are set in ``event_set``.
|
|
|
|
No events are active by default.
|
|
|
|
Per code object events
|
|
----------------------
|
|
|
|
Events can also be controlled on a per code object basis:
|
|
|
|
* ``sys.monitoring.get_local_events(code: CodeType)->Event``
|
|
Returns the ``Event`` set for all the local events for ``code``
|
|
|
|
* ``sys.monitoring.set_local_events(code: CodeType, event_set: Event)``
|
|
Activates all the local events for ``code`` which are set in ``event_set``.
|
|
|
|
Local events add to global events, but do not mask them.
|
|
In other words, all global events will trigger for a code object, regardless of the local events.
|
|
|
|
|
|
Register callback functions
|
|
---------------------------
|
|
|
|
To register a callable for events call::
|
|
|
|
sys.monitoring.register_callback(event, func)
|
|
|
|
Functions can be unregistered by calling
|
|
``sys.monitoring.register_callback(event, None)``.
|
|
|
|
Callback functions can be registered and unregistered at any time.
|
|
|
|
Registering a callback function will generate a ``sys.audit`` event.
|
|
|
|
Callback function arguments
|
|
'''''''''''''''''''''''''''
|
|
|
|
When an active event occurs, the registered callback function is called.
|
|
Different events will provide the callback function with different arguments, as follows:
|
|
|
|
* All events starting with ``PY_``:
|
|
|
|
``func(code: CodeType, instruction_offset: int)``
|
|
|
|
* ``C_CALL`` and ``C_RETURN``:
|
|
|
|
``func(code: CodeType, instruction_offset: int, callable: object)``
|
|
|
|
* ``RAISE`` and ``EXCEPTION_HANDLED``:
|
|
|
|
``func(code: CodeType, instruction_offset: int, exception: BaseException)``
|
|
|
|
* ``LINE``:
|
|
|
|
``func(code: CodeType, line_number: int)``
|
|
|
|
* ``JUMP`` and ``BRANCH``:
|
|
|
|
``func(code: CodeType, instruction_offset: int, destination_offset: int)``
|
|
|
|
Note that the ``destination_offset`` is where the code will next execute.
|
|
For an untaken branch this will be the offset of the instruction following
|
|
the branch.
|
|
|
|
* ``INSTRUCTION``:
|
|
|
|
``func(code: CodeType, instruction_offset: int)``
|
|
|
|
* ``MARKER``:
|
|
|
|
``func(code: CodeType, instruction_offset: int, marker_id: int)``
|
|
|
|
Inserting and removing markers
|
|
''''''''''''''''''''''''''''''''''
|
|
|
|
Two new functions are added to the ``sys`` module to support markers.
|
|
|
|
* ``sys.monitoring.insert_marker(code: CodeType, offset: int, marker_id=0: range(256))``
|
|
* ``sys.monitoring.remove_marker(code: CodeType, offset: int)``
|
|
|
|
The ``marker_id`` has no meaning to the VM,
|
|
and is used only as an argument to the callback function.
|
|
The ``marker_id`` must in the range 0 to 255 (inclusive).
|
|
|
|
Attributes of the ``sys.monitoring`` namespace
|
|
''''''''''''''''''''''''''''''''''''''''''''''
|
|
|
|
* ``class Event(enum.IntFlag)``
|
|
* ``def get_events()->Event``
|
|
* ``def set_events(event_set: Event)->None``
|
|
* ``def get_local_events(code: CodeType)->Event``
|
|
* ``def set_local_events(code: CodeType, event_set: Event)->None``
|
|
* ``def register_callback(event: Event, func: Callable)->None``
|
|
* ``def insert_marker(code: CodeType, offset: Event, marker_id=0: range(256))->None``
|
|
* ``def remove_marker(code: CodeType, offset: Event)->None``
|
|
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
This PEP is fully backwards compatible, in the sense that old code
|
|
will work if the features of this PEP are unused.
|
|
|
|
However, if it is used it will effectively disable ``sys.settrace``,
|
|
``sys.setprofile`` and PEP 523 frame evaluation.
|
|
|
|
If PEP 523 is in use, or ``sys.settrace`` or ``sys.setprofile`` has been
|
|
set, then calling ``sys.monitoring.set_events()`` or
|
|
``sys.monitoring.set_local_events()`` will raise an exception.
|
|
|
|
Likewise, if ``sys.monitoring.set_events()`` or
|
|
``sys.monitoring.set_local_events()`` has been called, then using PEP 523
|
|
or calling ``sys.settrace`` or ``sys.setprofile`` will raise an exception.
|
|
|
|
This PEP is incompatible with ``sys.settrace`` and ``sys.setprofile``
|
|
because the implementation of ``sys.settrace`` and ``sys.setprofile``
|
|
will use the same underlying mechanism as this PEP. It would be too slow
|
|
to support both the new and old monitoring mechanisms at the same time,
|
|
and they would interfere in awkward ways if both were active at the same time.
|
|
|
|
This PEP is incompatible with PEP 523, because PEP 523 prevents the VM being
|
|
able to modify the code objects of executing code, which is a necessary feature.
|
|
|
|
We may seek to remove ``sys.settrace`` and PEP 523 in the future once the APIs
|
|
provided by this PEP have been widely adopted, but that is for another PEP.
|
|
|
|
Performance
|
|
-----------
|
|
|
|
If no events are active, this PEP should have a negligible impact on
|
|
performance.
|
|
|
|
If a small set of events are active, e.g. for a debugger, then the overhead
|
|
of callbacks will be orders of magnitudes less than for ``sys.settrace`` and
|
|
much cheaper than using PEP 523.
|
|
|
|
For heavily instrumented code, e.g. using ``LINE``, performance should be
|
|
better than ``sys.settrace``, but not by that much as performance will be
|
|
dominated by the time spent in callbacks.
|
|
|
|
For optimizing virtual machines, such as future versions of CPython
|
|
(and ``PyPy`` should they choose to support this API), changing the set of
|
|
globally active events in the midst of a long running program could be quite
|
|
expensive, possibly taking hundreds of milliseconds as it triggers
|
|
de-optimizations. Once such de-optimization has occurred, performance should
|
|
recover as the VM can re-optimize the instrumented code.
|
|
|
|
Security Implications
|
|
=====================
|
|
|
|
Allowing modification of running code has some security implications,
|
|
but no more than the ability to generate and call new code.
|
|
|
|
All the new functions listed above will trigger audit hooks.
|
|
|
|
Implementation
|
|
==============
|
|
|
|
This outlines the proposed implementation for CPython 3.11. The actual
|
|
implementation for later versions of CPython and other Python implementations
|
|
may differ considerably.
|
|
|
|
The proposed implementation of this PEP will be built on top of the quickening
|
|
step of PEP 659 [1]_. Activating some events will cause all code objects to
|
|
be quickened before they are executed.
|
|
|
|
For example, if the ``LINE`` event is turned on, then all instructions that
|
|
are at the start of a line will be replaced with a ``LINE_EVENT`` instruction.
|
|
|
|
Note that this will interfere with specialization, which will result in some
|
|
performance degradation in addition to the overhead of calling the
|
|
registered callable.
|
|
|
|
When the set of active events changes, the VM will immediately update
|
|
all code objects present on the call stack of any thread. It will also set in
|
|
place traps to ensure that all code objects are correctly instrumented when
|
|
called. Consequently changing the set of active events should be done as
|
|
infrequently as possible, as it could be quite an expensive operation.
|
|
|
|
Other events, such as ``RAISE`` can be turned on or off cheaply,
|
|
as they do not rely on code instrumentation, but runtime checks when the
|
|
underlying event occurs.
|
|
|
|
The exact set of events that require instrumentation is an implementation detail,
|
|
but for the current design, the following events will require instrumentation:
|
|
|
|
* PY_CALL
|
|
* PY_RESUME
|
|
* PY_RETURN
|
|
* PY_YIELD
|
|
* C_CALL
|
|
* C_RETURN
|
|
* LINE
|
|
* INSTRUCTION
|
|
* JUMP
|
|
* BRANCH
|
|
|
|
Implementing tools
|
|
==================
|
|
|
|
It is the philosophy of this PEP that it should be possible for third-party monitoring
|
|
tools to achieve high-performance, not that it should be easy for them to do so.
|
|
|
|
Converting events into data that is meaningful to the users is
|
|
the responsibility of the tool.
|
|
|
|
All events have a cost, and tools should attempt to the use set of events
|
|
that trigger the least often and still provide the necessary information.
|
|
|
|
Debuggers
|
|
---------
|
|
|
|
Inserting breakpoints
|
|
'''''''''''''''''''''
|
|
|
|
Breakpoints can be inserted by using markers. For example::
|
|
|
|
sys.insert_marker(code, offset)
|
|
|
|
Which will insert a marker at ``offset`` in ``code``,
|
|
which can be used as a breakpoint.
|
|
|
|
To insert a breakpoint at a given line, the matching instruction offsets
|
|
should be found from ``code.co_lines()``.
|
|
|
|
Breakpoints can be removed by removing the marker::
|
|
|
|
sys.remove_marker(code, offset)
|
|
|
|
Stepping
|
|
''''''''
|
|
|
|
Debuggers usually offer the ability to step execution by a
|
|
single instruction or line.
|
|
|
|
This can be implemented by inserting a new marker at the required
|
|
offset(s) of the code to be stepped to,
|
|
and by removing the current marker.
|
|
|
|
It is the job of the debugger to compute the relevant offset(s).
|
|
|
|
Attaching
|
|
'''''''''
|
|
|
|
Debuggers can use the ``PY_CALL``, etc. events to be informed when
|
|
a code object is first encountered, so that any necessary breakpoints
|
|
can be inserted.
|
|
|
|
|
|
Coverage Tools
|
|
--------------
|
|
|
|
Coverage tools need to track which parts of the control graph have been
|
|
executed. To do this, they need to register for the ``PY_`` events,
|
|
plus ``JUMP`` and ``BRANCH``.
|
|
|
|
This information can be then be converted back into a line based report
|
|
after execution has completed.
|
|
|
|
Profilers
|
|
---------
|
|
|
|
Simple profilers need to gather information about calls.
|
|
To do this profilers should register for the following events:
|
|
|
|
* PY_CALL
|
|
* PY_RESUME
|
|
* PY_THROW
|
|
* PY_RETURN
|
|
* PY_YIELD
|
|
* PY_UNWIND
|
|
* C_CALL
|
|
* C_RETURN
|
|
|
|
|
|
Line based profilers
|
|
''''''''''''''''''''
|
|
|
|
Line based profilers can use the ``LINE`` and ``JUMP`` events.
|
|
Implementers of profilers should be aware that instrumenting ``LINE``
|
|
and ``JUMP`` events will have a large impact on performance.
|
|
|
|
.. note::
|
|
|
|
Instrumenting profilers have significant overhead and will distort
|
|
the results of profiling. Unless you need exact call counts,
|
|
consider using a statistical profiler.
|
|
|
|
|
|
Rejected ideas
|
|
==============
|
|
|
|
A draft version of this PEP proposed making the user responsible
|
|
for inserting the monitoring instructions, rather than have VM do it.
|
|
However, that puts too much of a burden on the tools, and would make
|
|
attaching a debugger nearly impossible.
|
|
|
|
References
|
|
==========
|
|
|
|
.. [1] Quickening in PEP 659
|
|
https://www.python.org/dev/peps/pep-0659/#quickening
|
|
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
..
|
|
Local Variables:
|
|
mode: indented-text
|
|
indent-tabs-mode: nil
|
|
sentence-end-double-space: t
|
|
fill-column: 70
|
|
coding: utf-8
|
|
End:
|