From bf937e64f525db25823cba9271f352b6d86df8f6 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Thu, 23 Feb 2023 13:37:40 +0000 Subject: [PATCH] PEP 669: Update to reflect latest design and implementation (#3021) Co-authored-by: Petr Viktorin Co-authored-by: C.A.M. Gerlach --- pep-0669.rst | 232 +++++++++++++++++++++++++++++++-------------------- 1 file changed, 140 insertions(+), 92 deletions(-) diff --git a/pep-0669.rst b/pep-0669.rst index 485a5a924..106a68b52 100644 --- a/pep-0669.rst +++ b/pep-0669.rst @@ -22,7 +22,7 @@ it will be implemented using the quickening step of :pep:`659`. A ``sys.monitoring`` namespace will be added, which will contain -the relevant functions and enum. +the relevant functions and constants. Motivation @@ -78,25 +78,84 @@ For 3.12, CPython will support the following events: * PY_RETURN: Return from a Python function (occurs immediately before the return, the callee's frame will be on the stack). * PY_YIELD: Yield from a Python function (occurs immediately before the yield, the callee's frame will be on the stack). * PY_UNWIND: Exit from a Python function during exception unwinding. -* C_CALL: Call to any callable, except Python functions (before the call in this case). -* C_RETURN: Return from any callable, except Python functions (after the return in this case). -* RAISE: An exception is raised. +* CALL: A call in Python code (event occurs before the call). +* C_RETURN: Return from any callable, except Python functions (event occurs after the return). +* C_RAISE: Exception raised from any callable, except Python functions (event occurs after the exit). +* RAISE: An exception is raised, except those that cause a ``STOP_ITERATION`` event. * EXCEPTION_HANDLED: An exception is handled. * LINE: An instruction is about to be executed that has a different line number from the preceding instruction. * INSTRUCTION -- A VM instruction is about to be executed. -* JUMP -- An unconditional jump in the control flow graph is reached. 
-* BRANCH -- A conditional branch is about to be taken (or not). -* MARKER -- A marker is hit +* JUMP -- An unconditional jump in the control flow graph is made. +* BRANCH -- A conditional branch is taken (or not). +* STOP_ITERATION -- An artificial ``StopIteration`` is raised; see :ref:`669-stopiteration`. More events may be added in the future. -All events will be attributes of the ``Event`` enum in ``sys.monitoring``:: +All events will be attributes of the ``events`` namespace in ``sys.monitoring``. +All events will be represented by a power of two integer, so that they can be combined +with the ``|`` operator. - class Event(enum.IntFlag): - PY_CALL = ... +Events are divided into three groups: -Note that ``Event`` is an ``IntFlag`` which means that the events can be or-ed -together to form a set of events. +Local events +'''''''''''' + +Local events are associated with normal execution of the program and happen +at clearly defined locations. All local events can be disabled. +The local events are: + +* PY_START +* PY_RESUME +* PY_RETURN +* PY_YIELD +* CALL +* LINE +* INSTRUCTION +* JUMP +* BRANCH +* STOP_ITERATION + +Ancillary events +'''''''''''''''' + +Ancillary events can be monitored like other events, but are controlled +by another event: + +* C_RAISE +* C_RETURN + +The ``C_RETURN`` and ``C_RAISE`` events are controlled by the ``CALL`` +event. ``C_RETURN`` and ``C_RAISE`` events will only be seen if the +corresponding ``CALL`` event is being monitored. + +Other events +'''''''''''' + +Other events are not necessarily tied to a specific location in the +program and cannot be individually disabled. + +The other events that can be monitored are: + +* PY_THROW +* PY_UNWIND +* RAISE +* EXCEPTION_HANDLED + + +.. _669-stopiteration: + +The STOP_ITERATION event +'''''''''''''''''''''''' + +:pep:`PEP 380 <380#use-of-stopiteration-to-return-values>` +specifies that a ``StopIteration`` exception is raised when returning a value +from a generator or coroutine.
However, this is a very inefficient way to +return a value, so some Python implementations, notably CPython 3.12+, do not +raise an exception unless it would be visible to other code. + +To allow tools to monitor for real exceptions without slowing down generators +and coroutines, the ``STOP_ITERATION`` event is provided. +``STOP_ITERATION`` can be locally disabled, unlike ``RAISE``. Tool identifiers ---------------- @@ -115,20 +174,20 @@ Identifiers are integers in the range 0 to 5. ``sys.monitoring.get_tool`` returns the name of the tool if ``id`` is in use, otherwise it returns ``None``. -All IDs are treated the same by the VM with regard to events, but the following -IDs are pre-defined to make co-operation of tools easier:: +All IDs are treated the same by the VM with regard to events, but the +following IDs are pre-defined to make co-operation of tools easier:: sys.monitoring.DEBUGGER_ID = 0 sys.monitoring.COVERAGE_ID = 1 sys.monitoring.PROFILER_ID = 2 - sys.monitoring.OPTIMIZER_ID = 3 + sys.monitoring.OPTIMIZER_ID = 5 -There is no obligation to set an ID, nor is there anything preventing a tool from -using an ID even it is already in use. -However, tool are encouraged to use a unique ID and respect other tools. +There is no obligation to set an ID, nor is there anything preventing a tool +from using an ID even if it is already in use. +However, tools are encouraged to use a unique ID and respect other tools. -For example, if a debugger were attached and ``DEBUGGER_ID`` were in use, it should -report an error, rather than carrying on regardless. +For example, if a debugger were attached and ``DEBUGGER_ID`` were in use, it +should report an error, rather than carrying on regardless.
The ``OPTIMIZER_ID`` is provided for tools like Cinder or PyTorch that want to optimize Python code, but need to decide what to @@ -139,10 +198,10 @@ Setting events globally Events can be controlled globally by modifying the set of events being monitored: -* ``sys.monitoring.get_events(tool_id:int)->Event`` - Returns the ``Event`` set for all the active events. +* ``sys.monitoring.get_events(tool_id:int)->int`` + Returns the ``int`` representing all the active events. -* ``sys.monitoring.set_events(tool_id:int, event_set: Event)`` +* ``sys.monitoring.set_events(tool_id:int, event_set: int)`` Activates all events which are set in ``event_set``. No events are active by default. @@ -152,22 +211,22 @@ Per code object events Events can also be controlled on a per code object basis: -* ``sys.monitoring.get_local_events(tool_id:int, code: CodeType)->Event`` - Returns the ``Event`` set for all the local events for ``code`` +* ``sys.monitoring.get_local_events(tool_id:int, code: CodeType)->int`` + Returns all the local events for ``code`` -* ``sys.monitoring.set_local_events(tool_id:int, code: CodeType, event_set: Event)`` +* ``sys.monitoring.set_local_events(tool_id:int, code: CodeType, event_set: int)`` Activates all the local events for ``code`` which are set in ``event_set``. Local events add to global events, but do not mask them. -In other words, all global events will trigger for a code object, regardless of the local events. - +In other words, all global events will trigger for a code object, +regardless of the local events. Register callback functions --------------------------- To register a callable for events call:: - sys.monitoring.register_callback(tool_id:int, event: Event, func: Callable | None) -> Callable | None + sys.monitoring.register_callback(tool_id:int, event: int, func: Callable | None) -> Callable | None If another callback was registered for the given ``tool_id`` and ``event``, it is unregistered and returned. 
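As an illustration of the registration flow described in this hunk, the following is a minimal sketch rather than part of the patch. It assumes the ``sys.monitoring`` namespace as shipped in CPython 3.12+ matches the API described here, and it returns ``None`` on interpreters that lack it. The names ``count_starts`` and ``add`` are hypothetical helpers invented for this example.

```python
import sys


def add(a, b):
    """Hypothetical function whose execution we want to observe."""
    return a + b


def count_starts(func, *args):
    """Record PY_START events for func's code object only.

    Returns a list of code names, or None when sys.monitoring is unavailable
    (assumption: it exists only on CPython 3.12 and later).
    """
    mon = getattr(sys, "monitoring", None)
    if mon is None:
        return None

    tool_id = mon.PROFILER_ID            # one of the pre-defined IDs
    mon.use_tool_id(tool_id, "sketch")   # raises ValueError if already in use
    seen = []

    def on_start(code, instruction_offset):
        # PY_START callbacks receive the code object and an offset.
        seen.append(code.co_name)

    mon.register_callback(tool_id, mon.events.PY_START, on_start)
    # Per code object events: only func's code is monitored.
    mon.set_local_events(tool_id, func.__code__, mon.events.PY_START)
    try:
        func(*args)
    finally:
        # Undo everything so the tool ID can be reused.
        mon.set_local_events(tool_id, func.__code__, 0)
        mon.register_callback(tool_id, mon.events.PY_START, None)
        mon.free_tool_id(tool_id)
    return seen


result = count_starts(add, 1, 2)
```

Passing ``None`` as the callback unregisters it, which mirrors the "unregistered and returned" behaviour described above; the sketch ignores the returned previous callback for brevity.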
@@ -186,13 +245,19 @@ Callback function arguments When an active event occurs, the registered callback function is called. Different events will provide the callback function with different arguments, as follows: -* All events starting with ``PY_``: +* ``PY_START`` and ``PY_RESUME``:: - ``func(code: CodeType, instruction_offset: int) -> DISABLE | Any`` + func(code: CodeType, instruction_offset: int) -> DISABLE | Any -* ``C_CALL`` and ``C_RETURN``: +* ``PY_RETURN`` and ``PY_YIELD``: - ``func(code: CodeType, instruction_offset: int, callable: object) -> DISABLE | Any`` + ``func(code: CodeType, instruction_offset: int, retval: object) -> DISABLE | Any`` + +* ``CALL``, ``C_RAISE`` and ``C_RETURN``: + + ``func(code: CodeType, instruction_offset: int, callable: object, arg0: object | MISSING) -> DISABLE | Any`` + + If there are no arguments, ``arg0`` is set to ``MISSING``. * ``RAISE`` and ``EXCEPTION_HANDLED``: @@ -214,9 +279,6 @@ Different events will provide the callback function with different arguments, as ``func(code: CodeType, instruction_offset: int) -> DISABLE | Any`` -* ``MARKER``: - - ``func(code: CodeType, instruction_offset: int) -> DISABLE | Any`` If a callback function returns ``DISABLE``, then that function will no longer be called for that ``(code, instruction_offset)`` until @@ -237,45 +299,40 @@ That means that other tools will see events in the callback functions for other tools. This could be useful for debugging a profiling tool, but would produce misleading profiles, as the debugger tool would show up in the profile. -Inserting and removing markers ------------------------------- - -Two new functions are added to the ``sys`` module to support markers. - -* ``sys.monitoring.insert_marker(tool_id: int, code: CodeType, offset: int)`` -* ``sys.monitoring.remove_marker(tool_id: int, code: CodeType, offset: int)`` - -A single code object may not have more than 255 markers at once. 
-``sys.monitoring.insert_marker`` raises a ``ValueError`` if this limit -is exceeded. - Order of events --------------- If an instructions triggers several events they occur in the following order: -* MARKER * INSTRUCTION * LINE * All other events (only one of these events can occur per instruction) Each event is delivered to tools in ascending order of ID. +The "call" event group +---------------------- + +Most events are independent; setting or disabling one event has no effect on the others. +However, the ``CALL``, ``C_RAISE`` and ``C_RETURN`` events form a group. +If any of those events are set or disabled, then all events in the group are. +Disabling a ``CALL`` event will not disable the matching ``C_RAISE`` or ``C_RETURN``, +but will disable all subsequent events. + + Attributes of the ``sys.monitoring`` namespace ---------------------------------------------- -* ``class Event(enum.IntFlag)`` * ``def use_tool_id(id)->None`` * ``def free_tool_id(id)->None`` -* ``def get_events(tool_id: int)->Event`` -* ``def set_events(tool_id: int, event_set: Event)->None`` -* ``def get_local_events(tool_id: int, code: CodeType)->Event`` -* ``def set_local_events(tool_id: int, code: CodeType, event_set: Event)->None`` -* ``def register_callback(tool_id: int, event: Event, func: Callable)->Optional[Callable]`` -* ``def insert_marker(tool_id: int, code: CodeType, offset: Event)->None`` -* ``def remove_marker(tool_id: int, code: CodeType, offset: Event)->None`` +* ``def get_events(tool_id: int)->int`` +* ``def set_events(tool_id: int, event_set: int)->None`` +* ``def get_local_events(tool_id: int, code: CodeType)->int`` +* ``def set_local_events(tool_id: int, code: CodeType, event_set: int)->None`` +* ``def register_callback(tool_id: int, event: int, func: Callable)->Optional[Callable]`` * ``def restart_events()->None`` * ``DISABLE: object`` +* ``MISSING: object`` Access to "debug only" features ------------------------------- @@ -298,7 +355,7 @@ of this PEP. 
Simple plugins that do not change the state of the VM, and defer execution to ``_PyEval_EvalFrameDefault()`` should continue to work. :func:`sys.settrace` and :func:`sys.setprofile` will act as if they were tools -6 and 7 respectively, so can be used along side this PEP. +6 and 7 respectively, so can be used alongside this PEP. This means that :func:`sys.settrace` and :func:`sys.setprofile` may not work correctly with all :pep:`523` plugins. Although, simple :pep:`523` @@ -311,7 +368,7 @@ If no events are active, this PEP should have a small positive impact on performance. Experiments show between 1 and 2% speedup from not supporting :func:`sys.settrace` directly. -The performance of :func:`sys.settrace` will be worse. +The performance of :func:`sys.settrace` will be about the same. The performance of :func:`sys.setprofile` should be better. However, tools relying on :func:`sys.settrace` and :func:`sys.setprofile` can be made a lot faster by using the @@ -337,25 +394,23 @@ recover as the VM can re-optimize the instrumented code. 
In general these operations can be considered to be fast: -* ``def get_events(tool_id: int)->Event`` -* ``def get_local_events(tool_id: int, code: CodeType)->Event`` -* ``def register_callback(tool_id: int, event: Event, func: Callable)->Optional[Callable]`` +* ``def get_events(tool_id: int)->int`` +* ``def get_local_events(tool_id: int, code: CodeType)->int`` +* ``def register_callback(tool_id: int, event: int, func: Callable)->Optional[Callable]`` * ``def get_tool(tool_id) -> str | None`` These operations are slower, but not especially so: -* ``def set_local_events(tool_id: int, code: CodeType, event_set: Event)->None`` -* ``def insert_marker(tool_id: int, code: CodeType, offset: Event)->None`` -* ``def remove_marker(tool_id: int, code: CodeType, offset: Event)->None`` +* ``def set_local_events(tool_id: int, code: CodeType, event_set: int)->None`` And these operations should be regarded as slow: * ``def use_tool_id(id, name:str)->None`` * ``def free_tool_id(id)->None`` -* ``def set_events(tool_id: int, event_set: Event)->None`` +* ``def set_events(tool_id: int, event_set: int)->None`` * ``def restart_events()->None`` -How slow the slow operations are depends on when then happen. +How slow the slow operations are depends on when they happen. If done early in the program, before modules are loaded, they should be fairly inexpensive. @@ -399,7 +454,7 @@ step of CPython 3.11, as described in :pep:`PEP 659 <659#quickening>`. Instrumentation works in much the same way as quickening, bytecodes are replaced with instrumented ones as needed. -For example, if the ``C_CALL`` event is turned on, +For example, if the ``CALL`` event is turned on, then all call instructions will be replaced with a ``INSTRUMENTED_CALL`` instruction. 
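The instrumented instructions themselves are an implementation detail, but their observable effect can be sketched from Python. The example below is an illustration only, not part of the patch: it assumes the ``sys.monitoring`` API of CPython 3.12+ and the ``CALL`` callback signature given earlier in this patch. The names ``sample`` and ``record_calls`` and the choice of tool ID 3 are hypothetical.

```python
import sys


def sample():
    """Hypothetical function containing a single call instruction."""
    return len("abc")


def record_calls(func):
    """Collect (callable, arg0) pairs for CALL events in func's code object.

    Returns None when sys.monitoring is unavailable (assumed to be any
    interpreter older than CPython 3.12).
    """
    mon = getattr(sys, "monitoring", None)
    if mon is None:
        return None

    tool_id = 3  # assumption: any free ID in the range 0 to 5 would do
    mon.use_tool_id(tool_id, "call-sketch")
    calls = []

    def on_call(code, instruction_offset, callable_, arg0):
        # arg0 is sys.monitoring.MISSING when the call has no arguments.
        calls.append((callable_, arg0))

    mon.register_callback(tool_id, mon.events.CALL, on_call)
    # Turning on CALL for this code object instruments its call instructions.
    mon.set_local_events(tool_id, func.__code__, mon.events.CALL)
    try:
        func()
    finally:
        mon.set_local_events(tool_id, func.__code__, 0)
        mon.register_callback(tool_id, mon.events.CALL, None)
        mon.free_tool_id(tool_id)
    return calls


calls = record_calls(sample)
```

Because only local events are set, calls made inside the callback itself are not reported, which keeps the sketch free of recursion.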
@@ -424,8 +479,7 @@ but for the current design, the following events will require instrumentation: * PY_RESUME * PY_RETURN * PY_YIELD -* C_CALL -* C_RETURN +* CALL * LINE * INSTRUCTION * JUMP @@ -456,19 +510,8 @@ Debuggers Inserting breakpoints ''''''''''''''''''''' -Breakpoints can be inserted by using markers. For example:: - - sys.monitoring.insert_marker(code, offset) - -Which will insert a marker at ``offset`` in ``code``, -which can be used as a breakpoint. - -To insert a breakpoint at a given line, the matching instruction offsets -should be found from ``code.co_lines()``. - -Breakpoints can be removed by removing the marker:: - - sys.monitoring.remove_marker(code, offset) +Breakpoints can be inserted by setting per code object events, either ``LINE`` or ``INSTRUCTION``, +and returning ``DISABLE`` for any events not matching a breakpoint. Stepping '''''''' @@ -476,17 +519,14 @@ Stepping Debuggers usually offer the ability to step execution by a single instruction or line. -This can be implemented by inserting a new marker at the required -offset(s) of the code to be stepped to, -and by removing the current marker. - -It is the job of the debugger to compute the relevant offset(s). +Like breakpoints, stepping can be implemented by setting per code object events. +As soon as normal execution is to be resumed, the local events can be unset. Attaching ''''''''' -Debuggers can use the ``PY_CALL``, etc. events to be informed when -a code object is first encountered, so that any necessary breakpoints +Debuggers can use the ``PY_START`` and ``PY_RESUME`` events to be informed +when a code object is first encountered, so that any necessary breakpoints can be inserted. Coverage Tools @@ -505,13 +545,14 @@ Profilers Simple profilers need to gather information about calls.
To do this profilers should register for the following events: -* PY_CALL +* PY_START * PY_RESUME * PY_THROW * PY_RETURN * PY_YIELD * PY_UNWIND -* C_CALL +* CALL +* C_RAISE * C_RETURN @@ -537,6 +578,13 @@ for inserting the monitoring instructions, rather than have VM do it. However, that puts too much of a burden on the tools, and would make attaching a debugger nearly impossible. +An earlier version of this PEP proposed storing events as ``enums``:: + + class Event(enum.IntFlag): + PY_START = ... + +However, that would prevent monitoring of code before the ``enum`` module was +loaded and could cause unnecessary overhead. Copyright =========