PEP 669: Update to reflect latest design and implementation (#3021)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
parent 706b7aa2e1
commit bf937e64f5
230
pep-0669.rst
|
@ -22,7 +22,7 @@ it will be implemented using the quickening step of
|
|||
:pep:`659`.
|
||||
|
||||
A ``sys.monitoring`` namespace will be added, which will contain
|
||||
the relevant functions and enum.
|
||||
the relevant functions and constants.
|
||||
|
||||
|
||||
Motivation
|
||||
|
@ -78,25 +78,84 @@ For 3.12, CPython will support the following events:
|
|||
* PY_RETURN: Return from a Python function (occurs immediately before the return, the callee's frame will be on the stack).
|
||||
* PY_YIELD: Yield from a Python function (occurs immediately before the yield, the callee's frame will be on the stack).
|
||||
* PY_UNWIND: Exit from a Python function during exception unwinding.
|
||||
* C_CALL: Call to any callable, except Python functions (before the call in this case).
|
||||
* C_RETURN: Return from any callable, except Python functions (after the return in this case).
|
||||
* RAISE: An exception is raised.
|
||||
* CALL: A call in Python code (event occurs before the call).
|
||||
* C_RETURN: Return from any callable, except Python functions (event occurs after the return).
|
||||
* C_RAISE: Exception raised from any callable, except Python functions (event occurs after the exit).
|
||||
* RAISE: An exception is raised, except those that cause a ``STOP_ITERATION`` event.
|
||||
* EXCEPTION_HANDLED: An exception is handled.
|
||||
* LINE: An instruction is about to be executed that has a different line number from the preceding instruction.
|
||||
* INSTRUCTION -- A VM instruction is about to be executed.
|
||||
* JUMP -- An unconditional jump in the control flow graph is reached.
|
||||
* BRANCH -- A conditional branch is about to be taken (or not).
|
||||
* MARKER -- A marker is hit
|
||||
* JUMP -- An unconditional jump in the control flow graph is made.
|
||||
* BRANCH -- A conditional branch is taken (or not).
|
||||
* STOP_ITERATION -- An artificial ``StopIteration`` is raised; see :ref:`669-stopiteration`.
|
||||
|
||||
More events may be added in the future.
|
||||
|
||||
All events will be attributes of the ``Event`` enum in ``sys.monitoring``::
|
||||
All events will be attributes of the ``events`` namespace in ``sys.monitoring``.
|
||||
All events will be represented by a power-of-two integer, so that they can be combined
|
||||
with the ``|`` operator.
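As a minimal sketch of what "power of two" buys a tool, the snippet below combines two events with ``|`` and tests membership with ``&``. It reads the constants from ``sys.monitoring.events`` when that namespace exists (CPython 3.12+); the fallback numbers are stand-ins chosen for this illustration only and are not the real values. ``is_power_of_two`` is a hypothetical helper, not part of the proposed API.

```python
import sys

monitoring = getattr(sys, "monitoring", None)  # only present on CPython 3.12+
if monitoring is not None:
    PY_START = monitoring.events.PY_START
    LINE = monitoring.events.LINE
else:
    # Stand-in values (assumption) so the sketch also runs on older interpreters.
    PY_START, LINE = 1 << 0, 1 << 5


def is_power_of_two(n: int) -> bool:
    """Return True if exactly one bit of n is set."""
    return n > 0 and n & (n - 1) == 0


# Combine two events into one event set with the | operator.
event_set = PY_START | LINE

# Membership is a bitwise test against the combined integer.
has_line = bool(event_set & LINE)
has_py_start = bool(event_set & PY_START)
```

Because every event occupies its own bit, an event set is just an ``int`` and needs no supporting module to be imported first.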
|
||||
|
||||
class Event(enum.IntFlag):
|
||||
PY_CALL = ...
|
||||
Events are divided into three groups:
|
||||
|
||||
Note that ``Event`` is an ``IntFlag`` which means that the events can be or-ed
|
||||
together to form a set of events.
|
||||
Local events
|
||||
''''''''''''
|
||||
|
||||
Local events are associated with normal execution of the program and happen
|
||||
at clearly defined locations. All local events can be disabled.
|
||||
The local events are:
|
||||
|
||||
* PY_START
|
||||
* PY_RESUME
|
||||
* PY_RETURN
|
||||
* PY_YIELD
|
||||
* CALL
|
||||
* LINE
|
||||
* INSTRUCTION
|
||||
* JUMP
|
||||
* BRANCH
|
||||
* STOP_ITERATION
|
||||
|
||||
Ancillary events
|
||||
'''''''''''''''''
|
||||
|
||||
Ancillary events can be monitored like other events, but are controlled
|
||||
by another event:
|
||||
|
||||
* C_RAISE
|
||||
* C_RETURN
|
||||
|
||||
The ``C_RETURN`` and ``C_RAISE`` events are controlled by the ``CALL``
|
||||
event. ``C_RETURN`` and ``C_RAISE`` events will only be seen if the
|
||||
corresponding ``CALL`` event is being monitored.
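The relationship described above can be sketched as follows, assuming the behaviour of the CPython 3.12 implementation: only ``CALL`` is requested for one code object, yet the ``C_RETURN`` callback still fires for the builtin call inside it. The names ``target``, ``on_call``, ``on_c_return`` and ``seen`` are invented for the example, and the whole sketch is skipped on interpreters without ``sys.monitoring``.

```python
import sys

seen = []  # (event name, callable) pairs, in the order they were reported


def on_call(code, offset, callable_, arg0):
    seen.append(("CALL", callable_))


def on_c_return(code, offset, callable_, arg0):
    seen.append(("C_RETURN", callable_))


def target():
    return len("abc")  # a call to a non-Python callable


mon = getattr(sys, "monitoring", None)  # CPython 3.12+ only
if mon is not None:
    events = mon.events
    # Pick any unused tool ID (0 to 5) so the sketch does not clash with real tools.
    tool_id = next(i for i in range(6) if mon.get_tool(i) is None)
    mon.use_tool_id(tool_id, "ancillary-demo")
    try:
        mon.register_callback(tool_id, events.CALL, on_call)
        mon.register_callback(tool_id, events.C_RETURN, on_c_return)
        # Only CALL is requested; C_RETURN is controlled by it.
        mon.set_local_events(tool_id, target.__code__, events.CALL)
        target()
    finally:
        # Undo everything so later code is not monitored by accident.
        mon.set_local_events(tool_id, target.__code__, 0)
        mon.register_callback(tool_id, events.CALL, None)
        mon.register_callback(tool_id, events.C_RETURN, None)
        mon.free_tool_id(tool_id)
```

Using per code object events keeps the callbacks from seeing the calls made by the surrounding set-up code.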
|
||||
|
||||
Other events
|
||||
''''''''''''
|
||||
|
||||
Other events are not necessarily tied to a specific location in the
|
||||
program and cannot be individually disabled.
|
||||
|
||||
The other events that can be monitored are:
|
||||
|
||||
* PY_THROW
|
||||
* PY_UNWIND
|
||||
* RAISE
|
||||
* EXCEPTION_HANDLED
|
||||
|
||||
|
||||
.. _669-stopiteration:
|
||||
|
||||
The STOP_ITERATION event
|
||||
''''''''''''''''''''''''
|
||||
|
||||
:pep:`PEP 380 <380#use-of-stopiteration-to-return-values>`
|
||||
specifies that a ``StopIteration`` exception is raised when returning a value
|
||||
from a generator or coroutine. However, this is a very inefficient way to
|
||||
return a value, so some Python implementations, notably CPython 3.12+, do not
|
||||
raise an exception unless it would be visible to other code.
|
||||
|
||||
To allow tools to monitor for real exceptions without slowing down generators
|
||||
and coroutines, the ``STOP_ITERATION`` event is provided.
|
||||
``STOP_ITERATION`` can be locally disabled, unlike ``RAISE``.
|
||||
|
||||
Tool identifiers
|
||||
----------------
|
||||
|
@ -115,20 +174,20 @@ Identifiers are integers in the range 0 to 5.
|
|||
``sys.monitoring.get_tool`` returns the name of the tool if ``id`` is in use,
|
||||
otherwise it returns ``None``.
|
||||
|
||||
All IDs are treated the same by the VM with regard to events, but the following
|
||||
IDs are pre-defined to make co-operation of tools easier::
|
||||
All IDs are treated the same by the VM with regard to events, but the
|
||||
following IDs are pre-defined to make co-operation of tools easier::
|
||||
|
||||
sys.monitoring.DEBUGGER_ID = 0
|
||||
sys.monitoring.COVERAGE_ID = 1
|
||||
sys.monitoring.PROFILER_ID = 2
|
||||
sys.monitoring.OPTIMIZER_ID = 3
|
||||
sys.monitoring.OPTIMIZER_ID = 5
|
||||
|
||||
There is no obligation to set an ID, nor is there anything preventing a tool from
|
||||
using an ID even it is already in use.
|
||||
However, tool are encouraged to use a unique ID and respect other tools.
|
||||
There is no obligation to set an ID, nor is there anything preventing a tool
|
||||
from using an ID even if it is already in use.
|
||||
However, tools are encouraged to use a unique ID and respect other tools.
|
||||
|
||||
For example, if a debugger were attached and ``DEBUGGER_ID`` were in use, it should
|
||||
report an error, rather than carrying on regardless.
|
||||
For example, if a debugger were attached and ``DEBUGGER_ID`` were in use, it
|
||||
should report an error, rather than carrying on regardless.
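One way a cooperative tool might follow this advice is sketched below. ``claim_tool_id`` is a hypothetical helper, not part of the proposed API; it only relies on ``get_tool``, ``use_tool_id`` and ``free_tool_id``. To avoid disturbing a real debugger, the demonstration claims whichever ID happens to be free rather than ``DEBUGGER_ID``.

```python
import sys


def claim_tool_id(mon, tool_id, name):
    """Claim tool_id for name, refusing if another tool already holds it."""
    owner = mon.get_tool(tool_id)
    if owner is not None:
        raise RuntimeError(f"tool ID {tool_id} is already in use by {owner!r}")
    mon.use_tool_id(tool_id, name)


mon = getattr(sys, "monitoring", None)  # CPython 3.12+ only
outcome = []
if mon is not None:
    # Any unused ID in the range 0 to 5 will do for the demonstration.
    tool_id = next(i for i in range(6) if mon.get_tool(i) is None)
    claim_tool_id(mon, tool_id, "first-tool")
    try:
        outcome.append(mon.get_tool(tool_id))  # name of the current owner
        try:
            claim_tool_id(mon, tool_id, "second-tool")
        except RuntimeError:
            outcome.append("refused")  # the second tool reports an error
    finally:
        mon.free_tool_id(tool_id)
    outcome.append(mon.get_tool(tool_id))  # None once the ID is free again
```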
|
||||
|
||||
The ``OPTIMIZER_ID`` is provided for tools like Cinder or PyTorch
|
||||
that want to optimize Python code, but need to decide what to
|
||||
|
@ -139,10 +198,10 @@ Setting events globally
|
|||
|
||||
Events can be controlled globally by modifying the set of events being monitored:
|
||||
|
||||
* ``sys.monitoring.get_events(tool_id:int)->Event``
|
||||
Returns the ``Event`` set for all the active events.
|
||||
* ``sys.monitoring.get_events(tool_id:int)->int``
|
||||
Returns the ``int`` representing all the active events.
|
||||
|
||||
* ``sys.monitoring.set_events(tool_id:int, event_set: Event)``
|
||||
* ``sys.monitoring.set_events(tool_id:int, event_set: int)``
|
||||
Activates all events which are set in ``event_set``.
|
||||
|
||||
No events are active by default.
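A short sketch of the global functions, assuming an interpreter that provides ``sys.monitoring`` (CPython 3.12+) and a tool ID that has never had events set: the event set starts out empty, and ``get_events`` reports back what ``set_events`` stored. The list ``observed`` exists only to record the results.

```python
import sys

mon = getattr(sys, "monitoring", None)  # CPython 3.12+ only
observed = []
if mon is not None:
    events = mon.events
    tool_id = next(i for i in range(6) if mon.get_tool(i) is None)
    mon.use_tool_id(tool_id, "global-events-demo")
    try:
        # Nothing is active until the tool asks for it.
        observed.append(mon.get_events(tool_id))

        wanted = events.PY_START | events.PY_RETURN
        mon.set_events(tool_id, wanted)
        observed.append(mon.get_events(tool_id) == wanted)
    finally:
        # Passing 0 (no events) switches global monitoring off again.
        mon.set_events(tool_id, 0)
        mon.free_tool_id(tool_id)
```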
|
||||
|
@ -152,22 +211,22 @@ Per code object events
|
|||
|
||||
Events can also be controlled on a per code object basis:
|
||||
|
||||
* ``sys.monitoring.get_local_events(tool_id:int, code: CodeType)->Event``
|
||||
Returns the ``Event`` set for all the local events for ``code``
|
||||
* ``sys.monitoring.get_local_events(tool_id:int, code: CodeType)->int``
|
||||
Returns all the local events for ``code``
|
||||
|
||||
* ``sys.monitoring.set_local_events(tool_id:int, code: CodeType, event_set: Event)``
|
||||
* ``sys.monitoring.set_local_events(tool_id:int, code: CodeType, event_set: int)``
|
||||
Activates all the local events for ``code`` which are set in ``event_set``.
|
||||
|
||||
Local events add to global events, but do not mask them.
|
||||
In other words, all global events will trigger for a code object, regardless of the local events.
|
||||
|
||||
In other words, all global events will trigger for a code object,
|
||||
regardless of the local events.
|
||||
|
||||
Register callback functions
|
||||
---------------------------
|
||||
|
||||
To register a callable for events call::
|
||||
|
||||
sys.monitoring.register_callback(tool_id:int, event: Event, func: Callable | None) -> Callable | None
|
||||
sys.monitoring.register_callback(tool_id:int, event: int, func: Callable | None) -> Callable | None
|
||||
|
||||
If another callback was registered for the given ``tool_id`` and ``event``,
|
||||
it is unregistered and returned.
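The return value makes it possible to swap callbacks and later restore the previous one. The sketch below, which assumes CPython 3.12+ and a tool ID with no callback registered yet, registers two placeholder callbacks (``first`` and ``second``, invented for the example) and records what each call returns.

```python
import sys


def first(code, instruction_offset):
    return None


def second(code, instruction_offset):
    return None


mon = getattr(sys, "monitoring", None)  # CPython 3.12+ only
returned = []
if mon is not None:
    tool_id = next(i for i in range(6) if mon.get_tool(i) is None)
    mon.use_tool_id(tool_id, "callback-demo")
    try:
        event = mon.events.PY_START
        # No callback yet, so nothing is returned.
        returned.append(mon.register_callback(tool_id, event, first))
        # Replacing a callback hands back the one that was unregistered.
        returned.append(mon.register_callback(tool_id, event, second))
        # Passing None unregisters without installing a replacement.
        returned.append(mon.register_callback(tool_id, event, None))
    finally:
        mon.free_tool_id(tool_id)
```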
|
||||
|
@ -186,13 +245,19 @@ Callback function arguments
|
|||
When an active event occurs, the registered callback function is called.
|
||||
Different events will provide the callback function with different arguments, as follows:
|
||||
|
||||
* All events starting with ``PY_``:
|
||||
* ``PY_START`` and ``PY_RESUME``::
|
||||
|
||||
``func(code: CodeType, instruction_offset: int) -> DISABLE | Any``
|
||||
func(code: CodeType, instruction_offset: int) -> DISABLE | Any
|
||||
|
||||
* ``C_CALL`` and ``C_RETURN``:
|
||||
* ``PY_RETURN`` and ``PY_YIELD``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int, callable: object) -> DISABLE | Any``
|
||||
``func(code: CodeType, instruction_offset: int, retval: object) -> DISABLE | Any``
|
||||
|
||||
* ``CALL``, ``C_RAISE`` and ``C_RETURN``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int, callable: object, arg0: object | MISSING) -> DISABLE | Any``
|
||||
|
||||
If there are no arguments, ``arg0`` is set to ``MISSING``.
|
||||
|
||||
* ``RAISE`` and ``EXCEPTION_HANDLED``:
|
||||
|
||||
|
@ -214,9 +279,6 @@ Different events will provide the callback function with different arguments, as
|
|||
|
||||
``func(code: CodeType, instruction_offset: int) -> DISABLE | Any``
|
||||
|
||||
* ``MARKER``:
|
||||
|
||||
``func(code: CodeType, instruction_offset: int) -> DISABLE | Any``
|
||||
|
||||
If a callback function returns ``DISABLE``, then that function will no longer
|
||||
be called for that ``(code, instruction_offset)`` until
|
||||
|
@ -237,45 +299,40 @@ That means that other tools will see events in the callback functions for other
|
|||
tools. This could be useful for debugging a profiling tool, but would produce
|
||||
misleading profiles, as the debugger tool would show up in the profile.
|
||||
|
||||
Inserting and removing markers
|
||||
------------------------------
|
||||
|
||||
Two new functions are added to the ``sys`` module to support markers.
|
||||
|
||||
* ``sys.monitoring.insert_marker(tool_id: int, code: CodeType, offset: int)``
|
||||
* ``sys.monitoring.remove_marker(tool_id: int, code: CodeType, offset: int)``
|
||||
|
||||
A single code object may not have more than 255 markers at once.
|
||||
``sys.monitoring.insert_marker`` raises a ``ValueError`` if this limit
|
||||
is exceeded.
|
||||
|
||||
Order of events
|
||||
---------------
|
||||
|
||||
If an instruction triggers several events they occur in the following order:
|
||||
|
||||
* MARKER
|
||||
* INSTRUCTION
|
||||
* LINE
|
||||
* All other events (only one of these events can occur per instruction)
|
||||
|
||||
Each event is delivered to tools in ascending order of ID.
|
||||
|
||||
The "call" event group
|
||||
----------------------
|
||||
|
||||
Most events are independent; setting or disabling one event has no effect on the others.
|
||||
However, the ``CALL``, ``C_RAISE`` and ``C_RETURN`` events form a group.
|
||||
If any of those events are set or disabled, then all events in the group are.
|
||||
Disabling a ``CALL`` event will not disable the matching ``C_RAISE`` or ``C_RETURN``,
|
||||
but will disable all subsequent events.
|
||||
|
||||
|
||||
Attributes of the ``sys.monitoring`` namespace
|
||||
----------------------------------------------
|
||||
|
||||
* ``class Event(enum.IntFlag)``
|
||||
* ``def use_tool_id(id)->None``
|
||||
* ``def free_tool_id(id)->None``
|
||||
* ``def get_events(tool_id: int)->Event``
|
||||
* ``def set_events(tool_id: int, event_set: Event)->None``
|
||||
* ``def get_local_events(tool_id: int, code: CodeType)->Event``
|
||||
* ``def set_local_events(tool_id: int, code: CodeType, event_set: Event)->None``
|
||||
* ``def register_callback(tool_id: int, event: Event, func: Callable)->Optional[Callable]``
|
||||
* ``def insert_marker(tool_id: int, code: CodeType, offset: Event)->None``
|
||||
* ``def remove_marker(tool_id: int, code: CodeType, offset: Event)->None``
|
||||
* ``def get_events(tool_id: int)->int``
|
||||
* ``def set_events(tool_id: int, event_set: int)->None``
|
||||
* ``def get_local_events(tool_id: int, code: CodeType)->int``
|
||||
* ``def set_local_events(tool_id: int, code: CodeType, event_set: int)->None``
|
||||
* ``def register_callback(tool_id: int, event: int, func: Callable)->Optional[Callable]``
|
||||
* ``def restart_events()->None``
|
||||
* ``DISABLE: object``
|
||||
* ``MISSING: object``
|
||||
|
||||
Access to "debug only" features
|
||||
-------------------------------
|
||||
|
@ -311,7 +368,7 @@ If no events are active, this PEP should have a small positive impact on
|
|||
performance. Experiments show between 1 and 2% speedup from not supporting
|
||||
:func:`sys.settrace` directly.
|
||||
|
||||
The performance of :func:`sys.settrace` will be worse.
|
||||
The performance of :func:`sys.settrace` will be about the same.
|
||||
The performance of :func:`sys.setprofile` should be better.
|
||||
However, tools relying on :func:`sys.settrace` and
|
||||
:func:`sys.setprofile` can be made a lot faster by using the
|
||||
|
@ -337,25 +394,23 @@ recover as the VM can re-optimize the instrumented code.
|
|||
|
||||
In general these operations can be considered to be fast:
|
||||
|
||||
* ``def get_events(tool_id: int)->Event``
|
||||
* ``def get_local_events(tool_id: int, code: CodeType)->Event``
|
||||
* ``def register_callback(tool_id: int, event: Event, func: Callable)->Optional[Callable]``
|
||||
* ``def get_events(tool_id: int)->int``
|
||||
* ``def get_local_events(tool_id: int, code: CodeType)->int``
|
||||
* ``def register_callback(tool_id: int, event: int, func: Callable)->Optional[Callable]``
|
||||
* ``def get_tool(tool_id) -> str | None``
|
||||
|
||||
These operations are slower, but not especially so:
|
||||
|
||||
* ``def set_local_events(tool_id: int, code: CodeType, event_set: Event)->None``
|
||||
* ``def insert_marker(tool_id: int, code: CodeType, offset: Event)->None``
|
||||
* ``def remove_marker(tool_id: int, code: CodeType, offset: Event)->None``
|
||||
* ``def set_local_events(tool_id: int, code: CodeType, event_set: int)->None``
|
||||
|
||||
And these operations should be regarded as slow:
|
||||
|
||||
* ``def use_tool_id(id, name:str)->None``
|
||||
* ``def free_tool_id(id)->None``
|
||||
* ``def set_events(tool_id: int, event_set: Event)->None``
|
||||
* ``def set_events(tool_id: int, event_set: int)->None``
|
||||
* ``def restart_events()->None``
|
||||
|
||||
How slow the slow operations are depends on when then happen.
|
||||
How slow the slow operations are depends on when they happen.
|
||||
If done early in the program, before modules are loaded,
|
||||
they should be fairly inexpensive.
|
||||
|
||||
|
@ -399,7 +454,7 @@ step of CPython 3.11, as described in :pep:`PEP 659 <659#quickening>`.
|
|||
Instrumentation works in much the same way as quickening, bytecodes are
|
||||
replaced with instrumented ones as needed.
|
||||
|
||||
For example, if the ``C_CALL`` event is turned on,
|
||||
For example, if the ``CALL`` event is turned on,
|
||||
then all call instructions will be
|
||||
replaced with an ``INSTRUMENTED_CALL`` instruction.
|
||||
|
||||
|
@ -424,8 +479,7 @@ but for the current design, the following events will require instrumentation:
|
|||
* PY_RESUME
|
||||
* PY_RETURN
|
||||
* PY_YIELD
|
||||
* C_CALL
|
||||
* C_RETURN
|
||||
* CALL
|
||||
* LINE
|
||||
* INSTRUCTION
|
||||
* JUMP
|
||||
|
@ -456,19 +510,8 @@ Debuggers
|
|||
Inserting breakpoints
|
||||
'''''''''''''''''''''
|
||||
|
||||
Breakpoints can be inserted by using markers. For example::
|
||||
|
||||
sys.monitoring.insert_marker(code, offset)
|
||||
|
||||
Which will insert a marker at ``offset`` in ``code``,
|
||||
which can be used as a breakpoint.
|
||||
|
||||
To insert a breakpoint at a given line, the matching instruction offsets
|
||||
should be found from ``code.co_lines()``.
|
||||
|
||||
Breakpoints can be removed by removing the marker::
|
||||
|
||||
sys.monitoring.remove_marker(code, offset)
|
||||
Breakpoints can be inserted by setting per code object events, either ``LINE`` or ``INSTRUCTION``,
|
||||
and returning ``DISABLE`` for any events not matching a breakpoint.
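A minimal sketch of this approach, assuming the ``LINE`` callback receives ``(code, line_number)`` as it does in CPython 3.12 (that signature is not shown in this diff). The function ``target``, the callback ``on_line`` and the list ``hits`` are invented for the illustration; a real debugger would stop and hand control to the user instead of appending to a list.

```python
import sys


def target():
    a = 1
    b = a + 1  # the line treated as a breakpoint
    return b


# The breakpoint sits two lines below the "def" line.
breakpoint_line = target.__code__.co_firstlineno + 2
hits = []

mon = getattr(sys, "monitoring", None)  # CPython 3.12+ only


def on_line(code, line_number):
    if line_number == breakpoint_line:
        hits.append(line_number)  # a real debugger would stop here
        return None               # keep receiving events for this line
    return mon.DISABLE            # no breakpoint: stop reporting this location


if mon is not None:
    tool_id = next(i for i in range(6) if mon.get_tool(i) is None)
    mon.use_tool_id(tool_id, "breakpoint-demo")
    try:
        mon.register_callback(tool_id, mon.events.LINE, on_line)
        mon.set_local_events(tool_id, target.__code__, mon.events.LINE)
        target()
        target()  # the breakpoint line still reports; the others stay disabled
    finally:
        mon.set_local_events(tool_id, target.__code__, 0)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.free_tool_id(tool_id)
```

Locations disabled this way stay quiet until ``sys.monitoring.restart_events()`` is called, so a debugger following this pattern would presumably call it whenever the set of breakpoints changes.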
|
||||
|
||||
Stepping
|
||||
''''''''
|
||||
|
@ -476,17 +519,14 @@ Stepping
|
|||
Debuggers usually offer the ability to step execution by a
|
||||
single instruction or line.
|
||||
|
||||
This can be implemented by inserting a new marker at the required
|
||||
offset(s) of the code to be stepped to,
|
||||
and by removing the current marker.
|
||||
|
||||
It is the job of the debugger to compute the relevant offset(s).
|
||||
Like breakpoints, stepping can be implemented by setting per code object events.
|
||||
As soon as normal execution is to be resumed, the local events can be unset.
|
||||
|
||||
Attaching
|
||||
'''''''''
|
||||
|
||||
Debuggers can use the ``PY_CALL``, etc. events to be informed when
|
||||
a code object is first encountered, so that any necessary breakpoints
|
||||
Debuggers can use the ``PY_START`` and ``PY_RESUME`` events to be informed
|
||||
when a code object is first encountered, so that any necessary breakpoints
|
||||
can be inserted.
|
||||
|
||||
Coverage Tools
|
||||
|
@ -505,13 +545,14 @@ Profilers
|
|||
Simple profilers need to gather information about calls.
|
||||
To do this profilers should register for the following events:
|
||||
|
||||
* PY_CALL
|
||||
* PY_START
|
||||
* PY_RESUME
|
||||
* PY_THROW
|
||||
* PY_RETURN
|
||||
* PY_YIELD
|
||||
* PY_UNWIND
|
||||
* C_CALL
|
||||
* CALL
|
||||
* C_RAISE
|
||||
* C_RETURN
|
||||
|
||||
|
||||
|
@ -537,6 +578,13 @@ for inserting the monitoring instructions, rather than have VM do it.
|
|||
However, that puts too much of a burden on the tools, and would make
|
||||
attaching a debugger nearly impossible.
|
||||
|
||||
An earlier version of this PEP proposed storing events as ``enums``::
|
||||
|
||||
class Event(enum.IntFlag):
|
||||
PY_START = ...
|
||||
|
||||
However, that would prevent monitoring of code before the ``enum`` module was
|
||||
loaded and could cause unnecessary overhead.
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
|