PEP 669: Low Impact Monitoring for CPython (#2183)
parent
268b0c1d36
commit
23307c1239
|
@@ -524,6 +524,7 @@ pep-0665.rst @brettcannon
|
|||
# pep-0666.txt
|
||||
pep-0667.rst @markshannon
|
||||
pep-0668.rst @dstufft
|
||||
pep-0669.rst @markshannon
|
||||
pep-0670.rst @vstinner @erlend-aasland
|
||||
pep-0671.rst @rosuav
|
||||
pep-0672.rst @encukou
|
||||
|
|
|
@@ -0,0 +1,411 @@
|
|||
PEP: 669
|
||||
Title: Low Impact Monitoring for CPython
|
||||
Author: Mark Shannon <mark@hotpy.org>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 18-Aug-2021
|
||||
Post-History: 7-Dec-2021
|
||||
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
Using a profiler or debugger in CPython can have a severe impact on
|
||||
performance. Slowdowns by an order of magnitude are common.
|
||||
|
||||
This PEP proposes an API for monitoring of Python programs running
|
||||
on CPython that will enable monitoring at low cost.
|
||||
|
||||
Although this PEP does not specify an implementation, it is expected that
|
||||
it will be implemented using the quickening step of PEP 659 [1]_.
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
Developers should not have to pay an unreasonable cost to use debuggers,
|
||||
profilers and other similar tools.
|
||||
|
||||
C++ and Java developers expect to be able to run a program at full speed
|
||||
(or very close to it) under a debugger.
|
||||
Python developers should expect that too.
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
The quickening mechanism of PEP 659 provides a way to dynamically
|
||||
modify executing Python bytecode. These modifications have little cost beyond
|
||||
the parts of the code that are modified and a relatively low cost to those
|
||||
parts that are modified. We can leverage this to provide an efficient
|
||||
mechanism for monitoring that was not possible in 3.10 or earlier.
|
||||
|
||||
By using quickening, we expect that code run under a debugger on 3.11
|
||||
should easily outperform code run without a debugger on 3.10.
|
||||
Profiling will still slow down execution, but by much less than in 3.10.
|
||||
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
Monitoring of Python programs is done by registering callback functions
|
||||
for events and by activating a set of events.
|
||||
|
||||
Activating events and registering callback functions are independent of each other.
|
||||
|
||||
Events
|
||||
------
|
||||
|
||||
As a code object executes, various events occur that might be of interest
|
||||
to tools. By activating events and by registering callback functions,
|
||||
tools can respond to these events in any way that suits them.
|
||||
Events can be set globally, or for individual code objects.
|
||||
|
||||
For 3.11, CPython will support the following events:
|
||||
|
||||
* PY_CALL: Call of a Python function (occurs immediately after the call, the callee's frame will be on the stack).
|
||||
* PY_RESUME: Resumption of a Python function (for generator and coroutine functions), except for throw() calls.
|
||||
* PY_THROW: A Python function is resumed by a throw() call.
|
||||
* PY_RETURN: Return from a Python function (occurs immediately before the return, the callee's frame will be on the stack).
|
||||
* PY_YIELD: Yield from a Python function (occurs immediately before the yield, the callee's frame will be on the stack).
|
||||
* PY_UNWIND: Exit from a Python function during exception unwinding.
|
||||
* C_CALL: Call of a builtin function (before the call in this case).
|
||||
* C_RETURN: Return from a builtin function (after the return in this case).
|
||||
* RAISE: An exception is raised.
|
||||
* EXCEPTION_HANDLED: An exception is handled.
|
||||
* LINE: An instruction is about to be executed that has a different line number from the preceding instruction.
|
||||
* INSTRUCTION: A VM instruction is about to be executed.
|
||||
* JUMP: An unconditional jump in the control flow graph is reached.
|
||||
* BRANCH: A conditional branch is about to be taken (or not).
|
||||
* MARKER: A marker is hit.
|
||||
|
||||
More events may be added in the future.
|
||||
|
||||
All event codes are integer powers of two and can be bitwise or-ed together to
|
||||
activate multiple events.
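
As a minimal sketch of how such event sets compose, the codes can be modelled
with ``enum.IntFlag``. The class name ``Event``, the helper ``is_active`` and
the specific bit values below are assumptions made for illustration; this PEP
does not assign numeric values to events.

```python
import enum


class Event(enum.IntFlag):
    # Hypothetical values: the PEP only requires distinct powers of two.
    PY_CALL = 1 << 0
    PY_RETURN = 1 << 3
    LINE = 1 << 10


# Combine several events into one event set with bitwise or.
event_set = Event.PY_CALL | Event.PY_RETURN


def is_active(event_set, event):
    """Return True if ``event`` is part of ``event_set``."""
    return bool(event_set & event)
```

An integer built this way is the kind of value that the ``event_set``
parameters described below are expected to take.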
|
||||
|
||||
Setting events globally
|
||||
-----------------------
|
||||
|
||||
Events can be controlled globally by modifying the set of events being monitored:
|
||||
|
||||
* ``sys.get_monitoring_events()->int``
|
||||
Returns the ``int`` resulting from bitwise-oring all the active events.
|
||||
|
||||
* ``sys.set_monitoring_events(event_set: int)``
|
||||
Activates all events which are set in ``event_set``.
|
||||
|
||||
No events are active by default.
|
||||
|
||||
Per code object events
|
||||
----------------------
|
||||
|
||||
Events can also be controlled on a per code object basis:
|
||||
|
||||
* ``sys.get_local_monitoring_events(code: CodeType)->int``
|
||||
Returns the ``int`` resulting from bitwise-oring all the local events for ``code``.
|
||||
|
||||
* ``sys.set_local_monitoring_events(code: CodeType, event_set: int)``
|
||||
Activates all the local events for ``code`` which are set in ``event_set``.
|
||||
|
||||
Local events add to global events, but do not mask them.
|
||||
In other words, all global events will trigger for a code object, regardless of the local events.
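
The rule above amounts to a bitwise or of the two sets. The following is a
minimal pure-Python sketch of that rule, not the VM's implementation; the
helper name ``effective_events`` and the event values are hypothetical.

```python
# Hypothetical event codes; the PEP does not fix the numeric values.
PY_CALL = 1 << 0
LINE = 1 << 1
RAISE = 1 << 2


def effective_events(global_events: int, local_events: int) -> int:
    """Events that will trigger for one code object.

    Local events add to the global ones and never mask them.
    """
    return global_events | local_events


# LINE is active globally; one code object additionally asks for PY_CALL.
combined = effective_events(LINE, PY_CALL)
```

Here ``combined`` contains both ``LINE`` and ``PY_CALL``, and passing ``0`` as
the local set leaves the global events unchanged.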
|
||||
|
||||
|
||||
Register callback functions
|
||||
---------------------------
|
||||
|
||||
To register a callable for events call::
|
||||
|
||||
sys.register_monitoring_callback(event, func)
|
||||
|
||||
Functions can be unregistered by calling
|
||||
``sys.register_monitoring_callback(event, None)``.
|
||||
|
||||
Callback functions can be registered and unregistered at any time.
|
||||
|
||||
Registering a callback function will generate a ``sys.audit`` event.
|
||||
|
||||
Callback function arguments
|
||||
'''''''''''''''''''''''''''
|
||||
|
||||
When an active event occurs, the registered callback function is called.
|
||||
Different events will provide the callback function with different arguments, as follows:
|
||||
|
||||
* All events starting with ``PY_``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int)``
|
||||
|
||||
* ``C_CALL`` and ``C_RETURN``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int, callable: object)``
|
||||
|
||||
* ``RAISE`` and ``EXCEPTION_HANDLED``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int, exception: BaseException)``
|
||||
|
||||
* ``LINE``:
|
||||
|
||||
``func(code: CodeType, line_number: int)``
|
||||
|
||||
* ``JUMP`` and ``BRANCH``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int, destination_offset: int)``
|
||||
|
||||
Note that the ``destination_offset`` is where the code will next execute.
|
||||
For an untaken branch this will be the offset of the instruction following
|
||||
the branch.
|
||||
|
||||
* ``INSTRUCTION``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int)``
|
||||
|
||||
* ``MARKER``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int, marker_id: int)``
|
||||
|
||||
Inserting and removing markers
|
||||
''''''''''''''''''''''''''''''''''
|
||||
|
||||
Two new functions are added to the ``sys`` module to support markers.
|
||||
|
||||
* ``sys.insert_marker(code: CodeType, offset: int, marker_id: range(256) = 0)``
|
||||
* ``sys.remove_marker(code: CodeType, offset: int)``
|
||||
|
||||
The ``marker_id`` has no meaning to the VM,
|
||||
and is used only as an argument to the callback function.
|
||||
The ``marker_id`` must be in the range 0 to 255 (inclusive).
|
||||
|
||||
List of new functions
|
||||
'''''''''''''''''''''
|
||||
|
||||
* ``sys.get_monitoring_events()->int``
|
||||
* ``sys.set_monitoring_events(event_set: int)``
|
||||
* ``sys.get_local_monitoring_events(code: CodeType)->int``
|
||||
* ``sys.set_local_monitoring_events(code: CodeType, event_set: int)``
|
||||
* ``sys.register_monitoring_callback(event: int, func: Callable)``
|
||||
* ``sys.insert_marker(code: CodeType, offset: int, marker_id: range(256) = 0)``
|
||||
* ``sys.remove_marker(code: CodeType, offset: int)``
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
This PEP is fully backwards compatible, in the sense that old code
|
||||
will work if the features of this PEP are unused.
|
||||
|
||||
However, if it is used it will effectively disable ``sys.settrace``,
|
||||
``sys.setprofile`` and PEP 523 frame evaluation.
|
||||
|
||||
If PEP 523 is in use, or ``sys.settrace`` or ``sys.setprofile`` has been
|
||||
set, then calling ``sys.set_monitoring_events()`` or
|
||||
``sys.set_local_monitoring_events()`` will raise an exception.
|
||||
|
||||
Likewise, if ``sys.set_monitoring_events()`` or
|
||||
``sys.set_local_monitoring_events()`` has been called, then using PEP 523
|
||||
or calling ``sys.settrace`` or ``sys.setprofile`` will raise an exception.
|
||||
|
||||
This PEP is incompatible with ``sys.settrace`` and ``sys.setprofile``
|
||||
because the implementation of ``sys.settrace`` and ``sys.setprofile``
|
||||
will use the same underlying mechanism as this PEP. It would be too slow
|
||||
to support both the new and old monitoring mechanisms at the same time,
|
||||
and they would interfere in awkward ways if both were active at the same time.
|
||||
|
||||
This PEP is incompatible with PEP 523, because PEP 523 prevents the VM from being
|
||||
able to modify the code objects of executing code, which is a necessary feature.
|
||||
|
||||
We may seek to remove ``sys.settrace`` and PEP 523 in the future once the APIs
|
||||
provided by this PEP have been widely adopted, but that is for another PEP.
|
||||
|
||||
Performance
|
||||
-----------
|
||||
|
||||
If no events are active, this PEP should have a negligible impact on
|
||||
performance.
|
||||
|
||||
If a small set of events is active, e.g. for a debugger, then the overhead
|
||||
of callbacks will be orders of magnitude less than for ``sys.settrace`` and
|
||||
much cheaper than using PEP 523.
|
||||
|
||||
For heavily instrumented code, e.g. using ``LINE``, performance should be
|
||||
better than ``sys.settrace``, but not by that much as performance will be
|
||||
dominated by the time spent in callbacks.
|
||||
|
||||
For optimizing virtual machines, such as future versions of CPython
|
||||
(and ``PyPy`` should they choose to support this API), changing the set of
|
||||
globally active events in the midst of a long running program could be quite
|
||||
expensive, possibly taking hundreds of milliseconds as it triggers
|
||||
de-optimizations. Once such de-optimization has occurred, performance should
|
||||
recover as the VM can re-optimize the instrumented code.
|
||||
|
||||
Security Implications
|
||||
=====================
|
||||
|
||||
Allowing modification of running code has some security implications,
|
||||
but no more than the ability to generate and call new code.
|
||||
|
||||
All the new functions listed above will trigger audit hooks.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
This outlines the proposed implementation for CPython 3.11. The actual
|
||||
implementation for later versions of CPython and other Python implementations
|
||||
may differ considerably.
|
||||
|
||||
The proposed implementation of this PEP will be built on top of the quickening
|
||||
step of PEP 659 [1]_. Activating some events will cause all code objects to
|
||||
be quickened before they are executed.
|
||||
|
||||
For example, if the ``LINE`` event is turned on, then all instructions that
|
||||
are at the start of a line will be replaced with a ``LINE_EVENT`` instruction.
|
||||
|
||||
Note that this will interfere with specialization, which will result in some
|
||||
performance degradation in addition to the overhead of calling the
|
||||
registered callable.
|
||||
|
||||
When the set of active events changes, the VM will immediately update
|
||||
all code objects present on the call stack of any thread. It will also set in
|
||||
place traps to ensure that all code objects are correctly instrumented when
|
||||
called. Consequently, changing the set of active events should be done as
|
||||
infrequently as possible, as it could be quite an expensive operation.
|
||||
|
||||
Other events, such as ``RAISE``, can be turned on or off cheaply,
|
||||
as they do not rely on code instrumentation, but on runtime checks when the
|
||||
underlying event occurs.
|
||||
|
||||
The exact set of events that require instrumentation is an implementation detail,
|
||||
but for the current design, the following events will require instrumentation:
|
||||
|
||||
* PY_CALL
|
||||
* PY_RESUME
|
||||
* PY_RETURN
|
||||
* PY_YIELD
|
||||
* C_CALL
|
||||
* C_RETURN
|
||||
* LINE
|
||||
* INSTRUCTION
|
||||
* JUMP
|
||||
* BRANCH
|
||||
|
||||
Implementing tools
|
||||
==================
|
||||
|
||||
It is the philosophy of this PEP that it should be possible for third-party monitoring
|
||||
tools to achieve high performance, not that it should be easy for them to do so.
|
||||
|
||||
Converting events into data that is meaningful to the users is
|
||||
the responsibility of the tool.
|
||||
|
||||
All events have a cost, and tools should attempt to use the set of events
|
||||
that trigger the least often and still provide the necessary information.
|
||||
|
||||
Debuggers
|
||||
---------
|
||||
|
||||
Inserting breakpoints
|
||||
'''''''''''''''''''''
|
||||
|
||||
Breakpoints can be inserted by using markers. For example::
|
||||
|
||||
sys.insert_marker(code, offset)
|
||||
|
||||
This inserts a marker at ``offset`` in ``code``,
|
||||
which can be used as a breakpoint.
|
||||
|
||||
To insert a breakpoint at a given line, the matching instruction offsets
|
||||
should be found from ``code.co_lines()``.
|
||||
|
||||
Breakpoints can be removed by removing the marker::
|
||||
|
||||
sys.remove_marker(code, offset)
|
||||
|
||||
Stepping
|
||||
''''''''
|
||||
|
||||
Debuggers usually offer the ability to step execution by a
|
||||
single instruction or line.
|
||||
|
||||
This can be implemented by inserting a new marker at the required
|
||||
offset(s) of the code to be stepped to,
|
||||
and by removing the current marker.
|
||||
|
||||
It is the job of the debugger to compute the relevant offset(s).
|
||||
|
||||
Attaching
|
||||
'''''''''
|
||||
|
||||
Debuggers can use the ``PY_CALL``, etc. events to be informed when
|
||||
a code object is first encountered, so that any necessary breakpoints
|
||||
can be inserted.
|
||||
|
||||
|
||||
Coverage Tools
|
||||
--------------
|
||||
|
||||
Coverage tools need to track which parts of the control flow graph have been
|
||||
executed. To do this, they need to register for the ``PY_`` events,
|
||||
plus ``JUMP`` and ``BRANCH``.
|
||||
|
||||
This information can then be converted back into a line-based report
|
||||
after execution has completed.
|
||||
|
||||
Profilers
|
||||
---------
|
||||
|
||||
Simple profilers need to gather information about calls.
|
||||
To do this profilers should register for the following events:
|
||||
|
||||
* PY_CALL
|
||||
* PY_RESUME
|
||||
* PY_THROW
|
||||
* PY_RETURN
|
||||
* PY_YIELD
|
||||
* PY_UNWIND
|
||||
* C_CALL
|
||||
* C_RETURN
|
||||
|
||||
|
||||
Line based profilers
|
||||
''''''''''''''''''''
|
||||
|
||||
Line based profilers can use the ``LINE`` and ``JUMP`` events.
|
||||
Implementers of profilers should be aware that instrumenting ``LINE``
|
||||
and ``JUMP`` events will have a large impact on performance.
|
||||
|
||||
.. note::
|
||||
|
||||
Instrumenting profilers have significant overhead and will distort
|
||||
the results of profiling. Unless you need exact call counts,
|
||||
consider using a statistical profiler.
|
||||
|
||||
|
||||
Rejected ideas
|
||||
==============
|
||||
|
||||
A draft version of this PEP proposed making the user responsible
|
||||
for inserting the monitoring instructions, rather than having the VM do it.
|
||||
However, that puts too much of a burden on the tools, and would make
|
||||
attaching a debugger nearly impossible.
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [1] Quickening in PEP 659
|
||||
https://www.python.org/dev/peps/pep-0659/#quickening
|
||||
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document is placed in the public domain or under the
|
||||
CC0-1.0-Universal license, whichever is more permissive.
|
||||
|
||||
|
||||
..
|
||||
Local Variables:
|
||||
mode: indented-text
|
||||
indent-tabs-mode: nil
|
||||
sentence-end-double-space: t
|
||||
fill-column: 70
|
||||
coding: utf-8
|
||||
End:
|