394 lines
17 KiB
ReStructuredText
394 lines
17 KiB
ReStructuredText
PEP: 707
|
|
Title: A simplified signature for __exit__ and __aexit__
|
|
Author: Irit Katriel <iritkatriel@gmail.com>
|
|
Discussions-To: https://discuss.python.org/t/24402
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 18-Feb-2023
|
|
Python-Version: 3.12
|
|
Post-History: `02-Mar-2023 <https://discuss.python.org/t/24402/>`__,
|
|
Resolution:
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes to make the interpreter accept context managers whose
|
|
:meth:`~py3.11:object.__exit__` / :meth:`~py3.11:object.__aexit__` method
|
|
takes only a single exception instance,
|
|
while continuing to also support the current ``(typ, exc, tb)`` signature
|
|
for backwards compatibility.
|
|
|
|
This proposal is part of an ongoing effort to remove the redundancy of
|
|
the 3-item exception representation from the language, a relic of earlier
|
|
Python versions which now confuses language users while adding complexity
|
|
and overhead to the interpreter.
|
|
|
|
The proposed implementation uses introspection, which is tailored to the
|
|
requirements of this use case. The solution ensures the safety of the new
|
|
feature by supporting it only in non-ambiguous cases. In particular, any
|
|
signature that *could* accept three arguments is assumed to expect them.
|
|
|
|
Because reliable introspection of callables is not currently possible in
|
|
Python, the solution proposed here is limited in that only the common types
|
|
of single-arg callables will be identified as such, while some of the more
|
|
esoteric ones will continue to be called with three arguments. This
|
|
imperfect solution was chosen among several imperfect alternatives in the
|
|
spirit of practicality. It is my hope that the discussion about this PEP
|
|
will explore the other options and lead us to the best way forward, which
|
|
may well be to remain with our imperfect status quo.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
In the past, an exception was represented in many parts of Python by a
|
|
tuple of three elements: the type of the exception, its value, and its
|
|
traceback. While there were good reasons for this design at the time,
|
|
they no longer hold because the type and traceback can now be reliably
|
|
deduced from the exception instance. Over the last few years we saw
|
|
several efforts to simplify the representation of exceptions.
|
|
|
|
Since 3.10 in `CPython PR #70577 <https://github.com/python/cpython/issues/70577>`_,
|
|
the :mod:`py3.11:traceback` module's functions accept either a 3-tuple
|
|
as described above, or just an exception instance as a single argument.
|
|
|
|
Internally, the interpreter no longer represents exceptions as a triplet.
|
|
This was `removed for the handled exception in 3.11
|
|
<https://github.com/python/cpython/pull/30122>`_ and
|
|
`for the raised exception in 3.12
|
|
<https://github.com/python/cpython/pull/101607>`_. As a consequence,
|
|
several APIs that expose the triplet can now be replaced by
|
|
simpler alternatives:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:widths: auto
|
|
|
|
* -
|
|
- Legacy API
|
|
- Alternative
|
|
* - Get handled exception (Python)
|
|
- :func:`py3.12:sys.exc_info`
|
|
- :func:`py3.12:sys.exception`
|
|
* - Get handled exception (C)
|
|
- :external+py3.12:c:func:`PyErr_GetExcInfo`
|
|
- :external+py3.12:c:func:`PyErr_GetHandledException`
|
|
* - Set handled exception (C)
|
|
- :external+py3.12:c:func:`PyErr_SetExcInfo`
|
|
- :external+py3.12:c:func:`PyErr_SetHandledException`
|
|
* - Get raised exception (C)
|
|
- :external+py3.12:c:func:`PyErr_Fetch`
|
|
- :external+py3.12:c:func:`PyErr_GetRaisedException`
|
|
* - Set raised exception (C)
|
|
- :external+py3.12:c:func:`PyErr_Restore`
|
|
- :external+py3.12:c:func:`PyErr_SetRaisedException`
|
|
* - Construct an exception instance from the 3-tuple (C)
|
|
- :external+py3.12:c:func:`PyErr_NormalizeException`
|
|
- N/A
|
|
|
|
|
|
The current proposal is a step in this process, and considers the way
|
|
forward for one more case in which the 3-tuple representation has
|
|
leaked to the language. The motivation for all this work is twofold.
|
|
|
|
Simplify the implementation of the language
|
|
-------------------------------------------
|
|
|
|
The simplification gained by reducing the interpreter's internal
|
|
representation of the handled exception to a single object was significant.
|
|
Previously, the interpreter needed to push onto/pop
|
|
from the stack three items whenever it did anything with exceptions.
|
|
This increased stack depth (adding pressure on caches and registers) and
|
|
complicated some of the bytecodes. Reducing this to one item
|
|
`removed about 100 lines of code <https://github.com/python/cpython/pull/30122>`_
|
|
from ``ceval.c`` (the interpreter's eval loop implementation), and it was later
|
|
followed by the removal of the ``POP_EXCEPT_AND_RERAISE`` opcode which has
|
|
become simple enough to be `replaced by generic stack manipulation instructions
|
|
<https://github.com/python/cpython/issues/90360>`_. Micro-benchmarks showed
|
|
`a speedup of about 10% for catching and raising an exception, as well as
|
|
for creating generators
|
|
<https://github.com/faster-cpython/ideas/issues/106#issuecomment-990172363>`_.
|
|
To summarize, removing this redundancy in Python's internals simplified the
|
|
interpreter and made it faster.
|
|
|
|
The performance of invoking ``__exit__``/``__aexit__`` when leaving
|
|
a context manager can be also improved by replacing a multi-arg function
|
|
call with a single-arg one. Micro-benchmarks showed that entering and exiting
|
|
a context manager with single-arg ``__exit__`` is about 13% faster.
|
|
|
|
Simplify the language itself
|
|
----------------------------
|
|
|
|
One of the reasons for the popularity of Python is its simplicity. The
|
|
:func:`py3.11:sys.exc_info` triplet is cryptic for new learners,
|
|
and the redundancy in it is confusing for those who do understand it.
|
|
|
|
It will take multiple releases to get to a point where we can think of
|
|
deprecating ``sys.exc_info()``. However, we can relatively quickly reach a
|
|
stage where new learners do not need to know about it, or about the 3-tuple
|
|
representation, at least until they are maintaining legacy code.
|
|
|
|
Rationale
|
|
=========
|
|
|
|
The only reason to object today to the removal of the last remaining
|
|
appearances of the 3-tuple from the language is the concerns about
|
|
disruption that such changes can bring. The goal of this PEP is to propose
|
|
a safe, gradual and minimally disruptive way to make this change in the
|
|
case of ``__exit__``, and with this to initiate a discussion of our options
|
|
for evolving its method signature.
|
|
|
|
In the case of the :mod:`py3.11:traceback` module's API, evolving the
|
|
functions to have a hybrid signature is relatively straightforward and
|
|
safe. The functions take one positional and two optional arguments, and
|
|
interpret them according to their types. This is safe when sentinels
|
|
are used for default values. The signatures of callbacks, which are
|
|
defined by the user's program, are harder to evolve.
|
|
|
|
The safest option is to make the user explicitly indicate which signature
|
|
the callback is expecting, by marking it with an additional attribute or
|
|
giving it a different name. For example, we could make the interpreter
|
|
look for a ``__leave__`` method on the context manager, and call it with
|
|
a single arg if it exists (otherwise, it looks for ``__exit__`` and
|
|
continues as it does now). The introspection-based alternative proposed
|
|
here intends to make it more convenient for users to write new code,
|
|
because they can just use the single-arg version and remain unaware of
|
|
the legacy API. However, if the limitations of introspection are found
|
|
to be too severe, we should consider an explicit option. Having both
|
|
``__exit__`` and ``__leave__`` around for 5-10 years with similar
|
|
functionality is not ideal, but it is an option.
|
|
|
|
Let us now examine the limitations of the current proposal. It identifies
|
|
2-arg python functions and ``METH_O`` C functions as having a single-arg
|
|
signature, and assumes that anything else is expecting 3 args. Obviously
|
|
it is possible to create false negatives for this heuristic (single-arg
|
|
callables that it will not identify). Context managers written in this
|
|
way won't work, they will continue to fail as they do now when their
|
|
``__exit__`` function will be called with three arguments.
|
|
|
|
I believe that it will not be a problem in practice. First, all working
|
|
code will continue to work, so this is a limitation on new code rather
|
|
than a problem impacting existing code. Second, exotic callable types are
|
|
rarely used for ``__exit__`` and if one is needed, it can always be wrapped
|
|
by a plain vanilla method that delegates to the callable. For example, we
|
|
can write this::
|
|
|
|
class C:
|
|
__enter__ = lambda self: self
|
|
__exit__ = ExoticCallable()
|
|
|
|
as follows::
|
|
|
|
class CM:
|
|
__enter__ = lambda self: self
|
|
_exit = ExoticCallable()
|
|
__exit__ = lambda self, exc: CM._exit(exc)
|
|
|
|
While discussing the real-world impact of the problem in this PEP, it is
|
|
worth noting that most ``__exit__`` functions don't do anything with their
|
|
arguments. Typically, a context manager is implemented to ensure that some
|
|
cleanup actions take place upon exit. It is rarely appropriate for the
|
|
``__exit__`` function to handle exceptions raised within the context, and
|
|
they are typically allowed to propagate out of ``__exit__`` to the calling
|
|
function. This means that most ``__exit__`` functions do not access their
|
|
arguments at all, and we should take this into account when trying to
|
|
assess the impact of different solutions on Python's userbase.
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
A context manager's ``__exit__``/``__aexit__`` method can have a single-arg
|
|
signature, in which case it is invoked by the interpreter with the argument
|
|
equal to an exception instance or ``None``:
|
|
|
|
.. code-block::
|
|
|
|
>>> class C:
|
|
... def __enter__(self):
|
|
... return self
|
|
... def __exit__(self, exc):
|
|
... print(f'__exit__ called with: {exc!r}')
|
|
...
|
|
>>> with C():
|
|
... pass
|
|
...
|
|
__exit__ called with: None
|
|
>>> with C():
|
|
... 1/0
|
|
...
|
|
__exit__ called with: ZeroDivisionError('division by zero')
|
|
Traceback (most recent call last):
|
|
File "<stdin>", line 2, in <module>
|
|
ZeroDivisionError: division by zero
|
|
|
|
If ``__exit__``/``__aexit__`` has any other signature, it is invoked with
|
|
the 3-tuple ``(typ, exc, tb)`` as happens now:
|
|
|
|
.. code-block::
|
|
|
|
>>> class C:
|
|
... def __enter__(self):
|
|
... return self
|
|
... def __exit__(self, *exc):
|
|
... print(f'__exit__ called with: {exc!r}')
|
|
...
|
|
>>> with C():
|
|
... pass
|
|
...
|
|
__exit__ called with: (None, None, None)
|
|
>>> with C():
|
|
... 1/0
|
|
...
|
|
__exit__ called with: (<class 'ZeroDivisionError'>, ZeroDivisionError('division by zero'), <traceback object at 0x1039cb570>)
|
|
Traceback (most recent call last):
|
|
File "<stdin>", line 2, in <module>
|
|
ZeroDivisionError: division by zero
|
|
|
|
|
|
These ``__exit__`` methods will also be called with a 3-tuple:
|
|
|
|
.. code-block::
|
|
|
|
def __exit__(self, typ, *exc):
|
|
pass
|
|
|
|
def __exit__(self, typ, exc, tb):
|
|
pass
|
|
|
|
A reference implementation is provided in
|
|
`CPython PR #101995 <https://github.com/python/cpython/pull/101995>`_.
|
|
|
|
When the interpreter reaches the end of the scope of a context manager,
|
|
and it is about to call the relevant ``__exit__`` or ``__aexit__`` function,
|
|
it instrospects this function to determine whether it is the single-arg
|
|
or the legacy 3-arg version. In the draft PR, this introspection is performed
|
|
by the ``is_legacy___exit__`` function:
|
|
|
|
.. code-block:: c
|
|
|
|
static int is_legacy___exit__(PyObject *exit_func) {
|
|
if (PyMethod_Check(exit_func)) {
|
|
PyObject *func = PyMethod_GET_FUNCTION(exit_func);
|
|
if (PyFunction_Check(func)) {
|
|
PyCodeObject *code = (PyCodeObject*)PyFunction_GetCode(func);
|
|
if (code->co_argcount == 2 && !(code->co_flags & CO_VARARGS)) {
|
|
/* Python method that expects self + one more arg */
|
|
return false;
|
|
}
|
|
}
|
|
}
|
|
else if (PyCFunction_Check(exit_func)) {
|
|
if (PyCFunction_GET_FLAGS(exit_func) == METH_O) {
|
|
/* C function declared as single-arg */
|
|
return false;
|
|
}
|
|
}
|
|
return true;
|
|
}
|
|
|
|
It is important to note that this is not a generic introspection function, but
|
|
rather one which is specifically designed for our use case. We know that
|
|
``exit_func`` is an attribute of the context manager class (taken from the
|
|
type of the object that provided ``__enter__``), and it is typically a function.
|
|
Furthermore, for this to be useful we need to identify enough single-arg forms,
|
|
but not necessarily all of them. What is critical for backwards compatibility is
|
|
that we will never misidentify a legacy ``exit_func`` as a single-arg one. So,
|
|
for example, ``__exit__(self, *args)`` and ``__exit__(self, exc_type, *args)``
|
|
both have the legacy form, even though they *could* be invoked with one arg.
|
|
|
|
In summary, an ``exit_func`` will be invoke with a single arg if:
|
|
|
|
* It is a ``PyMethod`` with ``argcount`` ``2`` (to count ``self``) and no vararg, or
|
|
* it is a ``PyCFunction`` with the ``METH_O`` flag.
|
|
|
|
Note that any performance cost of the introspection can be mitigated via
|
|
:pep:`specialization <659>`, so it won't be a problem if we need to make it more
|
|
sophisticated than this for some reason.
|
|
|
|
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
All context managers that previously worked will continue to work in the
|
|
same way because the interpreter will call them with three args whenever
|
|
they can accept three args. There may be context managers that previously
|
|
did not work because their ``exit_func`` expected one argument, so the call
|
|
to ``__exit__`` would have caused a ``TypeError`` exception to be raised,
|
|
and now the call would succeed. This could theoretically change the
|
|
behaviour of existing code, but it is unlikely to be a problem in practice.
|
|
|
|
The backwards compatibility concerns will show up in some cases when libraries
|
|
try to migrate their context managers from the multi-arg to the single-arg
|
|
signature. If ``__exit__`` or ``__aexit__`` is called by any code other than
|
|
the interpreter's eval loop, the introspection does not automatically happen.
|
|
For example, this will occur where a context manager is subclassed and its
|
|
``__exit__`` method is called directly from the derived ``__exit__``. Such
|
|
context managers will need to migrate to the single-arg version with their
|
|
users, and may choose to offer a parallel API rather than breaking the
|
|
existing one. Alternatively, a superclass can stay with the signature
|
|
``__exit__(self, *args)``, and support both one and three args. Since
|
|
most context managers do not use the value of the arguments to ``__exit__``,
|
|
and simply allow the exception to propagate onward, this is likely to be the
|
|
common approach.
|
|
|
|
|
|
Security Implications
|
|
=====================
|
|
|
|
I am not aware of any.
|
|
|
|
How to Teach This
|
|
=================
|
|
|
|
The language tutorial will present the single-arg version, and the documentation
|
|
for context managers will include a section on the legacy signatures of
|
|
``__exit__`` and ``__aexit__``.
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
`CPython PR #101995 <https://github.com/python/cpython/pull/101995>`_
|
|
implements the proposal of this PEP.
|
|
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
Support ``__leave__(self, exc)``
|
|
----------------------------------
|
|
|
|
It was considered to support a method by a new name, such as ``__leave__``,
|
|
with the new signature. This basically makes the programmer explicitly declare
|
|
which signature they are intending to use, and avoid the need for introspection.
|
|
|
|
Different variations of this idea include different amounts of magic that can
|
|
help automate the equivalence between ``__leave__`` and ``__exit__``. For example,
|
|
`Mark Shannon suggested <https://github.com/faster-cpython/ideas/issues/550#issuecomment-1410120100>`_
|
|
that the type constructor would add a default implementation for each of ``__exit__``
|
|
and ``__leave__`` whenever one of them is defined on a class. This default
|
|
implementation acts as a trampoline that calls the user's function. This would
|
|
make inheritance work seamlessly, as well as the migration from ``__exit__`` to
|
|
``__leave__`` for particular classes. The interpreter would just need to call
|
|
``__leave__``, and that would call ``__exit__`` whenever necessary.
|
|
|
|
While this suggestion has several advantages over the current proposal, it has
|
|
two drawbacks. The first is that it adds a new dunder name to the data model,
|
|
and we would end up with two dunders that mean the same thing, and only slightly
|
|
differ in their signatures. The second is that it would require the migration of
|
|
every ``__exit__`` to ``__leave__``, while with introspection it would not be
|
|
necessary to change the many ``__exit__(*arg)`` methods that do not access their
|
|
args. While it is not as simple as a grep for ``__exit__``, it is possible to write
|
|
an AST visitor that detects ``__exit__`` methods that can accept multiple arguments,
|
|
and which do access them.
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|