PEP: 473
Title: Adding structured data to built-in exceptions
Version: $Revision$
Last-Modified: $Date$
Author: Sebastian Kreft
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Mar-2014
Post-History:


Abstract
========

Exceptions like ``AttributeError``, ``IndexError``, ``KeyError``,
``LookupError``, ``NameError``, ``TypeError``, and ``ValueError`` do
not provide all the information programmers need to debug and
understand what caused them. Furthermore, in some cases the messages
have slightly different formats, which makes it difficult for tools to
automatically provide additional information to diagnose the problem.
To tackle the former and to lay the groundwork for the latter, it is
proposed to expand these exceptions so that they hold both the
offending and affected entities.


Rationale
=========

The main issue this PEP aims to solve is that current error messages
are not very expressive and lack key information needed to resolve the
exceptions. Additionally, the information present in the error message
is not always in the same format, which makes it very difficult for
third-party libraries to provide automated diagnosis of the error.

These automated tools could, for example, detect typos or display or
log extra debug information. They could be particularly useful when
running tests or in a long-running application.

Although it is in theory possible to write such libraries, they need
to resort to hacks to achieve the goal. One such example is
python-improved-exceptions [1]_, which modifies the byte-code to keep
references to the possibly interesting objects and also parses the
error messages to extract information like types or names.
Unfortunately, such an approach is extremely fragile and not portable.

A similar proposal [2]_ has been implemented for ``ImportError`` and
in the same fashion this idea has received support [3]_.
Additionally, almost 10 years ago Guido asked in [11]_ for a clean API
to access the affected objects in exceptions like ``KeyError``,
``AttributeError``, ``NameError``, and ``IndexError``. Similar issues
and proposals have been written in the last year. Some other issues
have been created, but despite receiving support they were eventually
abandoned. References to the created issues are listed below:

* ``AttributeError``: [11]_, [10]_, [5]_, [4]_, [3]_

* ``IndexError``: [11]_, [6]_, [3]_

* ``KeyError``: [11]_, [7]_, [3]_

* ``LookupError``: [11]_

* ``NameError``: [11]_, [10]_, [3]_

* ``TypeError``: [8]_

* ``ValueError``: [9]_

To move forward with the development and to centralize the information
and discussion, this PEP aims to be a meta-issue summarizing all the
above discussions and ideas.


Examples
========

IndexError
----------

The error message references neither the list's length nor the index
used.

::

  a = [1, 2, 3, 4, 5]
  a[5]
  IndexError: list index out of range

KeyError
--------

By convention the key is the first element of the error's argument,
but there is no other information regarding the affected dictionary
(key types, size, etc.).

::

  b = {'foo': 1}
  b['fo']
  KeyError: 'fo'

AttributeError
--------------

The object's type and the offending attribute are part of the error
message. However, there are several different formats and the
information is not always available. Furthermore, although the object
type is useful in some cases, given the dynamic nature of Python, it
would be much more useful to have a reference to the object itself.
Additionally, the reference to the type is not fully qualified and in
some cases the type is too generic to provide useful information, for
example when accessing a module's attribute.

::

  c = object()
  c.foo
  AttributeError: 'object' object has no attribute 'foo'

  import string
  string.foo
  AttributeError: 'module' object has no attribute 'foo'

  a = string.Formatter()
  a.foo
  AttributeError: 'Formatter' object has no attribute 'foo'

NameError
---------

The error message typically provides the name.

::

  foo = 1
  fo
  NameError: global name 'fo' is not defined

Other Cases
-----------

Issues are even harder to debug when the target object is the result
of another expression, for example:

::

  a[b[c[0]]]

This issue is also related to the fact that opcodes only have line
number information and not the offset. This proposal would help in
this case, but not as much as having offsets.


Proposal
========

Extend the exceptions ``AttributeError``, ``IndexError``,
``KeyError``, ``LookupError``, ``NameError``, ``TypeError``, and
``ValueError`` with the following:

* ``AttributeError``: target :sup:`w`, attribute

* ``IndexError``: target :sup:`w`, key :sup:`w`, index (just an alias
  to key)

* ``KeyError``: target :sup:`w`, key :sup:`w`

* ``LookupError``: target :sup:`w`, key :sup:`w`

* ``NameError``: name, scope?

* ``TypeError``: unexpected_type

* ``ValueError``: unexpected_value :sup:`w`

Attributes with the superscript :sup:`w` may need to be weak
references [12]_ to prevent memory cycles. However, this may add
unnecessary complexity, as noted by R. David Murray [13]_. This is
especially true given that many built-in types do not support weak
references.

TODO(skreft): expand this with examples of corner cases.

To remain backwards compatible, these new attributes will be optional
and keyword-only.

It is proposed to add this information, rather than just improve the
error message, because it would enable new debugging frameworks and
tools and would also allow a future switch to lazily generated
messages. Generated messages are discussed in [2]_, although they are
not implemented at the moment. They would not only save some
resources, but also make the messages uniform.

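The following is a minimal sketch of how optional, keyword-only
attributes and a lazily generated message might fit together.
``StructuredKeyError`` and ``lookup`` are hypothetical names used only
for illustration; they are not part of this proposal or of the
standard library, and the message format is an assumption. Strong
references are used here for simplicity, so the weak-reference
question raised above is left open.

```python
class StructuredKeyError(KeyError):
    """Hypothetical stand-in for a KeyError carrying structured data."""

    def __init__(self, *args, target=None, key=None):
        super().__init__(*args)
        # Optional, keyword-only attributes: existing call sites that
        # pass only positional arguments keep working unchanged.
        self.target = target
        self.key = key

    def __str__(self):
        # An explicit message (positional args) wins; otherwise the
        # message is built lazily, only when it is actually requested.
        if self.args or self.target is None:
            return super().__str__()
        return "{!r} not found in {} of size {}".format(
            self.key, type(self.target).__name__, len(self.target))


def lookup(mapping, key):
    """Return mapping[key], raising the structured error on a miss."""
    try:
        return mapping[key]
    except KeyError:
        raise StructuredKeyError(target=mapping, key=key) from None
```

A handler can then inspect ``err.target`` and ``err.key`` directly
instead of parsing ``str(err)``, and code that never formats the
exception does not pay for building the message.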
The stdlib will then be gradually changed to start using these new
attributes.


Potential Uses
==============

An automated tool could, for example, search for similar keys within
the object, allowing it to display the following::

  a = {'foo': 1}
  a['fo']
  KeyError: 'fo'. Did you mean 'foo'?

  foo = 1
  fo
  NameError: global name 'fo' is not defined. Did you mean 'foo'?

See [3]_ for the output a TestRunner could display.


Performance
===========

Filling these new attributes would only require two extra parameters
with data that is already available, so the impact should be marginal.
However, ``KeyError`` may need special care, as the following pattern
is already widespread.

::

  try:
    a[foo] = a[foo] + 1
  except KeyError:
    a[foo] = 0

Note as well that storing these objects in the error itself would
allow the lazy generation of the error message, as discussed in [2]_.


References
==========

.. [1] Python Exceptions Improved
   (https://www.github.com/sk-/python-exceptions-improved)

.. [2] ImportError needs attributes for module and file name
   (http://bugs.python.org/issue1559549)

.. [3] Enhance exceptions by attaching some more information to them
   (https://mail.python.org/pipermail/python-ideas/2014-February/025601.html)

.. [4] Specifity in AttributeError
   (https://mail.python.org/pipermail/python-ideas/2013-April/020308.html)

.. [5] Add an 'attr' attribute to AttributeError
   (http://bugs.python.org/issue18156)

.. [6] Add index attribute to IndexError
   (http://bugs.python.org/issue18162)

.. [7] Add a 'key' attribute to KeyError
   (http://bugs.python.org/issue18163)

.. [8] Add 'unexpected_type' to TypeError
   (http://bugs.python.org/issue18165)

.. [9] 'value' attribute for ValueError
   (http://bugs.python.org/issue18166)

.. [10] making builtin exceptions more informative
   (http://bugs.python.org/issue1182143)

.. [11] LookupError etc. need API to get the key
   (http://bugs.python.org/issue614557)

.. [12] weakref - Weak References
   (https://docs.python.org/3/library/weakref.html)

.. [13] Message by R. David Murray: Weak refs on exceptions?
   (http://bugs.python.org/issue18163#msg190791)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: