PEP: 657
Title: Include Fine Grained Error Locations in Tracebacks
Version: $Revision$
Last-Modified: $Date$
Author: Pablo Galindo <pablogsal@python.org>,
        Batuhan Taskaya <batuhan@python.org>,
        Ammar Askar <ammar@ammaraskar.com>
Discussions-To: https://discuss.python.org/t/pep-657-include-fine-grained-error-locations-in-tracebacks/8629
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 08-May-2021
Python-Version: 3.11
Post-History:

Abstract
========

This PEP proposes adding a mapping from each bytecode instruction to the start
and end column offsets of the line that generated them as well as the end line
number. This data will be used to improve tracebacks displayed by the CPython
interpreter in order to improve the debugging experience. The PEP also proposes
adding APIs that allow other tools (such as coverage analysis tools, profilers,
tracers, debuggers) to consume this information from code objects.

Motivation
==========

The primary motivation for this PEP is to improve the feedback presented about
the location of errors to aid with debugging.

Python currently keeps a mapping of bytecode to line numbers from compilation.
The interpreter uses this mapping to point to the source line associated with
an error. While this line-level granularity for instructions is useful, a
single line of Python code can compile into dozens of bytecode operations
making it hard to track which part of the line caused the error.

Consider the following line of Python code::

    x['a']['b']['c']['d'] = 1

If any of the values in the dictionaries are ``None``, the error shown is::

    Traceback (most recent call last):
      File "test.py", line 2, in <module>
        x['a']['b']['c']['d'] = 1
    TypeError: 'NoneType' object is not subscriptable

From the traceback, it is impossible to determine which one of the dictionaries
had the ``None`` element that caused the error. Users often have to attach a
debugger or split up their expression to track down the problem.

However, if the interpreter had a mapping of bytecode to column offsets as well
as line numbers, it could helpfully display::

    Traceback (most recent call last):
      File "test.py", line 2, in <module>
        x['a']['b']['c']['d'] = 1
        ^^^^^^^^^^^^^^^^
    TypeError: 'NoneType' object is not subscriptable

indicating to the user that the object ``x['a']['b']`` must have been ``None``.
This highlighting will occur for every frame in the traceback. For instance, if
a similar error is part of a complex function call chain, the traceback would
display the code associated with the current instruction in every frame::

    Traceback (most recent call last):
      File "test.py", line 14, in <module>
        lel3(x)
        ^^^^^^^
      File "test.py", line 12, in lel3
        return lel2(x) / 23
               ^^^^^^^
      File "test.py", line 9, in lel2
        return 25 + lel(x) + lel(x)
                    ^^^^^^
      File "test.py", line 6, in lel
        return 1 + foo(a,b,c=x['z']['x']['y']['z']['y'], d=e)
                             ^^^^^^^^^^^^^^^^^^^^^
    TypeError: 'NoneType' object is not subscriptable

This problem presents itself in the following situations.

* When passing down multiple objects to function calls while
  accessing the same attribute in them.
  For instance, this error::

      Traceback (most recent call last):
        File "test.py", line 19, in <module>
          foo(a.name, b.name, c.name)
      AttributeError: 'NoneType' object has no attribute 'name'

  With the improvements in this PEP this would show::

      Traceback (most recent call last):
        File "test.py", line 19, in <module>
          foo(a.name, b.name, c.name)
                      ^^^^^^
      AttributeError: 'NoneType' object has no attribute 'name'

* When dealing with lines with complex mathematical expressions,
  especially with libraries such as numpy where arithmetic
  operations can fail based on the arguments. For example: ::

      Traceback (most recent call last):
        File "test.py", line 1, in <module>
          x = (a + b) @ (c + d)
      ValueError: operands could not be broadcast together with shapes (1,2) (2,3)

  There is no clear indication as to which operation failed: was it the addition
  on the left, the addition on the right or the matrix multiplication in the
  middle? With this PEP the new error message would look like::

      Traceback (most recent call last):
        File "test.py", line 1, in <module>
          x = (a + b) @ (c + d)
                         ^^^^^
      ValueError: operands could not be broadcast together with shapes (1,2) (2,3)

  This gives a much clearer and easier to debug error message.


Debugging aside, this extra information would also be useful for code
coverage tools, enabling them to measure expression-level coverage instead of
just line-level coverage. For instance, given the following line: ::

    x = foo() if bar() else baz()

coverage, profile or state analysis tools will highlight the full line in both
branches, making it impossible to differentiate what branch was taken. This is
a known problem in pycoverage_.

Similar efforts to this PEP have taken place in other languages such as Java in
the form of JEP358_. ``NullPointerExceptions`` in Java were similarly nebulous when
it came to lines with complicated expressions. A ``NullPointerException`` would
provide very little aid in finding the root cause of an error. The
implementation for JEP358 is fairly complex, requiring walking back through the
bytecode by using a control flow graph analyzer and decompilation techniques to
recover the source code that led to the null pointer. Although the complexity
of this solution is high and requires maintenance for the decompiler every time
Java bytecode is changed, this improvement was deemed to be worth it for the
extra information provided for *just one exception type*.


Rationale
=========

In order to identify the range of source code being executed when exceptions
are raised, this proposal requires adding new data for every bytecode
instruction. This will have an impact on the size of ``pyc`` files on disk and
the size of code objects in memory. The authors of this proposal have chosen
the data types in a way that tries to minimize this impact. The proposed
overhead is storing two ``uint8_t`` (one for the start offset and one for the
end offset) and the end line information for every bytecode instruction (in
the same encoded fashion as the start line is stored currently).

As an illustrative example to gauge the impact of this change, we have
calculated that including the start and end offsets will increase the size of
the standard library’s pyc files by 22% (6MB) from 28.4MB to 34.7MB. The
overhead in memory usage will be the same (assuming the *full standard library*
is loaded into the same program). We believe that this is a very acceptable
number since the order of magnitude of the overhead is very small, especially
considering the storage size and memory capabilities of modern computers.
Additionally, in general the memory size of a Python program is not dominated
by code objects. To check this assumption we have executed the test suite of
several popular PyPI projects (including NumPy, pytest, Django and Cython) as
well as several applications (Black, pylint, mypy executed over either mypy or
the standard library) and we found that code objects normally represent 3-6% of
the average memory size of the program.

We understand that the extra cost of this information may not be acceptable for
some users, so we propose an opt-out mechanism which will cause generated code
objects to not have the extra information while also allowing pyc files to not
include the extra information.


Specification
=============

In order to have enough information to correctly resolve the location
within a given line where an error was raised, a map linking bytecode
instructions to column offsets (start and end offset) and end line numbers
is needed. This is similar in fashion to how line numbers are currently linked
to bytecode instructions.

The following changes will be performed as part of the implementation of
this PEP:

* The offset information will be exposed to Python via a new attribute in the
  code object class called ``co_positions`` that will return a sequence of
  four-element tuples containing the full location of every instruction
  (including start line, end line, start column offset and end column offset)
  or ``None`` if the code object was created without the offset information.
* One new C-API function: ::

      int PyCode_Addr2Location(
          PyCodeObject *co, int addrq,
          int *start_line, int *start_column,
          int *end_line, int *end_column)

  will be added so the end line, the start column offsets and the end column
  offset can be obtained given the index of a bytecode instruction. This
  function will set the values to 0 if the information is not available.

The internal storage, compression and encoding of the information is left as an
implementation detail and can be changed at any point as long as the public API
remains unchanged.

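As a rough illustration of how a tool might consume this data from Python, the
sketch below reads the positions of a compiled snippet. It assumes that the
running interpreter is a released CPython 3.11 or later, where the data is
exposed as a ``co_positions()`` method rather than a plain attribute. The helper
name ``instruction_positions`` is hypothetical, and on interpreters without the
feature it simply returns an empty list.

```python
def instruction_positions(code):
    """Return a list of (start_line, end_line, start_col, end_col) tuples.

    Accepts both shapes: the attribute described in the text above and the
    ``co_positions()`` method found in released interpreters (an assumption
    about the running Python, not something the text above specifies).
    """
    positions = getattr(code, "co_positions", None)
    if callable(positions):
        positions = positions()
    if positions is None:
        # No position data: older interpreter or the feature is disabled.
        return []
    return list(positions)


code = compile("x = a['k'] + b", "<example>", "exec")
for start_line, end_line, start_col, end_col in instruction_positions(code):
    print(start_line, end_line, start_col, end_col)
```

Each printed row corresponds to one bytecode instruction, in order. Individual
fields may be ``None`` when a particular value is not available.
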
Offset semantics
^^^^^^^^^^^^^^^^

These offsets are propagated by the compiler from the ones stored currently in
all AST nodes. The output of the public APIs (``co_positions`` and ``PyCode_Addr2Location``)
that deal with these attributes uses 0-indexed offsets (just like the AST nodes), but the underlying
implementation is free to represent the actual data in whatever form it finds most efficient.
The sentinel value for unavailable information is ``None`` for the ``co_positions()`` API,
and ``-1`` for the ``PyCode_Addr2Location`` API. The availability of the information depends
on whether the offsets fall within the supported range, as well as on the runtime flags for the
interpreter configuration.

The AST nodes use ``int`` types to store these values. The current implementation, however,
utilizes ``uint8_t`` types as an implementation detail to minimize storage impact. This decision
allows offsets to go from 0 to 255, while offsets bigger than these values will be treated as
missing (returning ``-1`` from the ``PyCode_Addr2Location`` API and ``None`` from the ``co_positions()`` API).

As specified previously, the underlying storage of the offsets should be
considered an implementation detail, as the public APIs to obtain these values
will return either C ``int`` types or Python ``int`` objects, which allows
implementing better compression/encoding in the future if bigger ranges need
to be supported. This PEP proposes to start with this simpler version and
defer improvements to future work.

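The range rule above can be modelled in a few lines of Python. This is only a
sketch of the described behaviour, not the CPython implementation; the names
``MAX_STORED_OFFSET``, ``store_offset`` and ``load_offset_for_c_api`` are
hypothetical.

```python
MAX_STORED_OFFSET = 255  # largest value a uint8_t can hold


def store_offset(col_offset):
    """Return the offset as it would be stored, or None if it does not fit."""
    if col_offset is None or not 0 <= col_offset <= MAX_STORED_OFFSET:
        return None  # reported as "missing" by the Python-level API
    return col_offset


def load_offset_for_c_api(stored):
    """Model the C-level sentinel: -1 when the offset is unavailable."""
    return -1 if stored is None else stored


print(store_offset(12), store_offset(300))
print(load_offset_for_c_api(store_offset(300)))
```

Because callers only ever see ``int`` values or the sentinels, the storage
width could later be widened without changing this interface.
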
Displaying tracebacks
^^^^^^^^^^^^^^^^^^^^^

When displaying tracebacks, the default exception hook will be modified to
query this information from the code objects and use it to display a sequence
of carets for every displayed line in the traceback if the information is
available. For instance::

      File "test.py", line 6, in lel
        return 1 + foo(a,b,c=x['z']['x']['y']['z']['y'], d=e)
                             ^^^^^^^^^^^^^^^^^^^^^
    TypeError: 'NoneType' object is not subscriptable

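A minimal sketch of how such a caret line could be derived from a pair of
column offsets is shown below. The function name ``caret_line`` is
hypothetical, the end offset is assumed to be exclusive (as in the AST), and
the source line is assumed to be plain ASCII so that offsets and character
positions coincide.

```python
def caret_line(source_line, start_col, end_col):
    """Build the marker line printed under ``source_line``.

    ``start_col`` and ``end_col`` are 0-indexed offsets into the original
    (still indented) line, with ``end_col`` exclusive.
    """
    stripped = source_line.lstrip()
    indent = len(source_line) - len(stripped)
    # Tracebacks print the line without its leading whitespace, so both
    # offsets are shifted by the amount that was stripped.
    start = max(start_col - indent, 0)
    end = max(end_col - indent, start)
    return " " * start + "^" * (end - start)


line = "    return 25 + lel(x) + lel(x)"
print(line.strip())
print(caret_line(line, 16, 22))
```

The shift by the stripped indentation is the part that is easy to get wrong:
the stored offsets refer to the line as written in the file, not as displayed.
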
When displaying tracebacks, instruction offsets will be taken from the
traceback objects. This makes highlighting exceptions that are re-raised work
naturally without the need to store the new information in the stack. For
example, for this code::

    def foo(x):
        1 + 1/0 + 2

    def bar(x):
        try:
            1 + foo(x) + foo(x)
        except Exception as e:
            raise ValueError("oh no!") from e

    bar(bar(bar(2)))

The printed traceback would look like this::

    Traceback (most recent call last):
      File "test.py", line 6, in bar
        1 + foo(x) + foo(x)
            ^^^^^^
      File "test.py", line 2, in foo
        1 + 1/0 + 2
            ^^^
    ZeroDivisionError: division by zero

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "test.py", line 10, in <module>
        bar(bar(bar(2)))
                ^^^^^^
      File "test.py", line 8, in bar
        raise ValueError("oh no!") from e
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ValueError: oh no!

While this code::

    def foo(x):
        1 + 1/0 + 2

    def bar(x):
        try:
            1 + foo(x) + foo(x)
        except Exception:
            raise

    bar(bar(bar(2)))

will be displayed as::

    Traceback (most recent call last):
      File "test.py", line 10, in <module>
        bar(bar(bar(2)))
                ^^^^^^
      File "test.py", line 6, in bar
        1 + foo(x) + foo(x)
            ^^^^^^
      File "test.py", line 2, in foo
        1 + 1/0 + 2
            ^^^
    ZeroDivisionError: division by zero

Maintaining the current behavior, only a single line will be displayed
in tracebacks. For instructions that span multiple lines (the end offset
and the start offset belong to different lines), the end line number must
be inspected to know if the end offset applies to the same line as the
starting offset.

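One way to apply this check is sketched below. Highlighting up to the end of
the first line when the instruction continues on a later line is an assumption
of this sketch; the text above only requires that the end line be inspected.
The function name ``first_line_span`` is hypothetical.

```python
def first_line_span(line_text, start_line, end_line, start_col, end_col):
    """Return the (start, end) columns to highlight on the first line only."""
    if end_line != start_line:
        # The instruction continues on a later line, so ``end_col`` refers
        # to that later line. Highlight to the end of the displayed line.
        end_col = len(line_text.rstrip())
    return start_col, end_col


# A call split over two lines: only the first line is displayed.
print(first_line_span("result = foo(a,", 3, 4, 9, 6))
# A call that fits on one line: the stored end offset is used as is.
print(first_line_span("result = foo(a, b)", 3, 3, 9, 18))
```
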
Opt-out mechanism
^^^^^^^^^^^^^^^^^

To offer an opt-out mechanism for those users that care about the
storage and memory overhead and to allow third party tools and other
programs that are currently parsing tracebacks to catch up, the following
methods will be provided to deactivate this feature:

* A new environment variable: ``PYTHONNODEBUGRANGES``.
* A new command line option for the dev mode: ``python -Xnodebugranges``.

If any of these methods are used, the Python compiler will **not** populate
code objects with the new information (``None`` will be used instead) and any
unmarshalled code objects that contain the extra information will have it stripped
away and replaced with ``None``. Additionally, the traceback machinery will not
show the extended location information even if the information was present.
These methods allow users to:

* Create smaller ``pyc`` files by using one of the two methods when said files
  are created.
* Avoid loading the extra information from ``pyc`` files if those were created with
  the extra information in the first place.
* Deactivate the extra information when displaying tracebacks (the caret characters
  indicating the location of the error).

Doing this has a **very small** performance hit as the interpreter state needs
to be fetched when code objects are created to look up the configuration.
Creating code objects is not a performance sensitive operation so this should
not be a concern.

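The decision described above can be modelled as a small pure-Python sketch.
This is not interpreter code: ``debug_ranges_enabled`` and ``maybe_strip`` are
hypothetical names, and the option spelling ``nodebugranges`` is copied from
the text above, so it may differ in a released interpreter.

```python
def debug_ranges_enabled(environ, xoptions):
    """Return False if either opt-out described above is in effect.

    ``environ`` is a mapping such as ``os.environ`` and ``xoptions`` a
    mapping such as ``sys._xoptions``. Both are passed in explicitly so
    the function can be exercised without touching the real process state.
    """
    if environ.get("PYTHONNODEBUGRANGES"):
        return False
    if "nodebugranges" in xoptions:
        return False
    return True


def maybe_strip(positions, enabled):
    """Replace position data with None when the feature is disabled."""
    return positions if enabled else None


enabled = debug_ranges_enabled({"PYTHONNODEBUGRANGES": "1"}, {})
print(enabled, maybe_strip([(1, 1, 0, 5)], enabled))
```

The same check covers both compilation and unmarshalling, which is why code
objects loaded from an existing ``pyc`` file can also end up without positions.
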
Backwards Compatibility
=======================

The change is fully backwards compatible.


Reference Implementation
========================

A reference implementation can be found in the implementation_ fork.

Rejected Ideas
==============

Use a single caret instead of a range
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It has been proposed to use a single caret instead of highlighting the full
range when reporting errors as a way to simplify the feature. We have decided
to not go this route for the following reasons:

* Deriving the location of the caret is not straightforward using the current
  layout of the AST. This is because the AST nodes only record the start and end
  line numbers as well as the start and end column offsets. As the AST nodes do
  not preserve the original tokens (by design), deriving the exact location of some
  tokens is not possible without extra re-parsing. For instance, currently binary
  operators have nodes for the operands but the type of the operator is stored
  in an enumeration so its location cannot be derived from the node (this is just
  an example of how this problem manifests, and not the only one).
* Deriving the ranges from AST nodes greatly simplifies the implementation and greatly
  reduces the maintenance cost and the possibility of errors. This is because using
  the ranges is always possible to do generically for any AST node, while any other
  custom information would need to be extracted differently from different types of
  nodes. Given how error-prone getting the locations was when this used to
  be a manual process when generating the AST, we believe that a generic solution is
  a very important property to pursue.
* Storing the information to highlight a single caret will be very limiting for tools
  such as coverage tools and profilers as well as for tools like IPython and IDEs that
  want to make use of this new feature. As `this message <https://discuss.python.org/t/pep-657-include-fine-grained-error-locations-in-tracebacks/8629/2?u=pablogsal>`_ from the author of "friendly-traceback"
  mentions, the reason is that without the full range (including end lines) these tools
  will find it very difficult to correctly highlight the relevant source code. For instance,
  for this code::

      something = foo(a,b,c) if bar(a,b,c) else other(b,c,d)

  tools (such as coverage reporters) want to be able to highlight the totality of the call
  that is covered by the executed bytecode (let's say ``foo(a,b,c)``) and not just a single
  character. Even if it is technically possible to re-parse and re-tokenize the source code
  to re-construct the information, it is not possible to do this reliably and would
  result in a much worse user experience.
* Many users have reported that a single caret is much harder to read than a full range,
  and this motivated using ranges to highlight syntax errors, which was very well received.
  Additionally, it has been noted that users with vision problems can identify the ranges
  much more easily than a single caret character, which we believe is a great advantage of
  using ranges.

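The point made in the first bullet about binary operators can be checked
directly with the standard ``ast`` module. The snippet below is only an
illustration: the source string and variable names are made up for the
example, and it relies on ``end_col_offset``, which is available on AST nodes
in Python 3.8 and later.

```python
import ast

tree = ast.parse("total = price * quantity")
binop = tree.body[0].value  # the ast.BinOp node for ``price * quantity``

# The operands carry full column ranges...
left_range = (binop.left.col_offset, binop.left.end_col_offset)
right_range = (binop.right.col_offset, binop.right.end_col_offset)

# ...but the operator is only an ``ast.Mult`` marker without any location.
operator_has_location = hasattr(binop.op, "col_offset")

print(left_range, right_range, operator_has_location)
```

Locating the ``*`` token itself would therefore require re-tokenizing the
source between the two operand ranges, which is the extra work the bullet
refers to.
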
Have a configure flag to opt out
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Having a configure flag to opt out of the overhead even when executing Python
in non-optimized mode may sound desirable, but it may cause problems when
reading pyc files that were created with a version of the interpreter that was
not compiled with the flag activated. This can lead to crashes that would be
very difficult to debug for regular users and will make different pyc files
incompatible with each other. As these pyc files could be shipped as part of
libraries or applications without the original source, it is also not always
possible to force recompilation of said pyc files. For these reasons we have
decided to use runtime opt-out mechanisms (the ``PYTHONNODEBUGRANGES``
environment variable and the ``-Xnodebugranges`` option) instead.

Lazy loading of column information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
One potential solution to reduce the memory usage of this feature is to not
load the column information from the pyc file when code is imported. Only if an
uncaught exception bubbles up or if a call to the C-API functions is made will
the column information be loaded from the pyc file. This is similar to how we
only read source lines to display them in the traceback when an exception
bubbles up. While this would indeed lower memory usage, it also results in a
far more complex implementation requiring changes to the importing machinery to
selectively ignore a part of the code object. We consider this an interesting
avenue to explore but ultimately we think it is out of scope for this particular
PEP. It also means that column information will not be available if the user is
not using pyc files or for code objects created dynamically at runtime.

Implement compression
^^^^^^^^^^^^^^^^^^^^^
Although it would be possible to implement some form of compression over the
pyc files and the new data in code objects, we believe that this is out of the
scope of this proposal due to its larger impact (in the case of pyc files) and
the fact that we expect column offsets to not compress well due to the lack of
patterns in them (in case of the new data in code objects).

Acknowledgments
===============
Thanks to Carl Friedrich Bolz-Tereick for showing an initial prototype of this
idea for the PyPy interpreter and for the helpful discussion.


References
==========

.. _JEP358: https://openjdk.java.net/jeps/358
.. _implementation: https://github.com/colnotab/cpython/tree/bpo-43950
.. _pycoverage: https://github.com/nedbat/coveragepy/issues/509

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: