403 lines
17 KiB
ReStructuredText
403 lines
17 KiB
ReStructuredText
PEP: 707
|
||
Title: A simplified signature for __exit__ and __aexit__
|
||
Author: Irit Katriel <iritkatriel@gmail.com>
|
||
Discussions-To: https://discuss.python.org/t/24402
|
||
Status: Rejected
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 18-Feb-2023
|
||
Python-Version: 3.12
|
||
Post-History: `02-Mar-2023 <https://discuss.python.org/t/24402/>`__,
|
||
Resolution: https://discuss.python.org/t/pep-707-a-simplified-signature-for-exit-and-aexit/24402/46
|
||
|
||
Rejection Notice
|
||
================
|
||
|
||
`Per the SC <https://discuss.python.org/t/24402/46>`__:
|
||
|
||
We discussed the PEP and have decided to reject it. Our thinking was the
|
||
magic and risk of potential breakage didn’t warrant the benefits. We are
|
||
totally supportive, though, of exploring a potential context manager v2
|
||
API or ``__leave__``.
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes to make the interpreter accept context managers whose
|
||
:meth:`~py3.11:object.__exit__` / :meth:`~py3.11:object.__aexit__` method
|
||
takes only a single exception instance,
|
||
while continuing to also support the current ``(typ, exc, tb)`` signature
|
||
for backwards compatibility.
|
||
|
||
This proposal is part of an ongoing effort to remove the redundancy of
|
||
the 3-item exception representation from the language, a relic of earlier
|
||
Python versions which now confuses language users while adding complexity
|
||
and overhead to the interpreter.
|
||
|
||
The proposed implementation uses introspection, which is tailored to the
|
||
requirements of this use case. The solution ensures the safety of the new
|
||
feature by supporting it only in non-ambiguous cases. In particular, any
|
||
signature that *could* accept three arguments is assumed to expect them.
|
||
|
||
Because reliable introspection of callables is not currently possible in
|
||
Python, the solution proposed here is limited in that only the common types
|
||
of single-arg callables will be identified as such, while some of the more
|
||
esoteric ones will continue to be called with three arguments. This
|
||
imperfect solution was chosen among several imperfect alternatives in the
|
||
spirit of practicality. It is my hope that the discussion about this PEP
|
||
will explore the other options and lead us to the best way forward, which
|
||
may well be to remain with our imperfect status quo.
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
In the past, an exception was represented in many parts of Python by a
|
||
tuple of three elements: the type of the exception, its value, and its
|
||
traceback. While there were good reasons for this design at the time,
|
||
they no longer hold because the type and traceback can now be reliably
|
||
deduced from the exception instance. Over the last few years we saw
|
||
several efforts to simplify the representation of exceptions.
|
||
|
||
Since 3.10 in `CPython PR #70577 <https://github.com/python/cpython/issues/70577>`_,
|
||
the :mod:`py3.11:traceback` module's functions accept either a 3-tuple
|
||
as described above, or just an exception instance as a single argument.
|
||
|
||
Internally, the interpreter no longer represents exceptions as a triplet.
|
||
This was `removed for the handled exception in 3.11
|
||
<https://github.com/python/cpython/pull/30122>`_ and
|
||
`for the raised exception in 3.12
|
||
<https://github.com/python/cpython/pull/101607>`_. As a consequence,
|
||
several APIs that expose the triplet can now be replaced by
|
||
simpler alternatives:
|
||
|
||
.. list-table::
|
||
:header-rows: 1
|
||
:widths: auto
|
||
|
||
* -
|
||
- Legacy API
|
||
- Alternative
|
||
* - Get handled exception (Python)
|
||
- :func:`py3.12:sys.exc_info`
|
||
- :func:`py3.12:sys.exception`
|
||
* - Get handled exception (C)
|
||
- :external+py3.12:c:func:`PyErr_GetExcInfo`
|
||
- :external+py3.12:c:func:`PyErr_GetHandledException`
|
||
* - Set handled exception (C)
|
||
- :external+py3.12:c:func:`PyErr_SetExcInfo`
|
||
- :external+py3.12:c:func:`PyErr_SetHandledException`
|
||
* - Get raised exception (C)
|
||
- :external+py3.12:c:func:`PyErr_Fetch`
|
||
- :external+py3.12:c:func:`PyErr_GetRaisedException`
|
||
* - Set raised exception (C)
|
||
- :external+py3.12:c:func:`PyErr_Restore`
|
||
- :external+py3.12:c:func:`PyErr_SetRaisedException`
|
||
* - Construct an exception instance from the 3-tuple (C)
|
||
- :external+py3.12:c:func:`PyErr_NormalizeException`
|
||
- N/A
|
||
|
||
|
||
The current proposal is a step in this process, and considers the way
|
||
forward for one more case in which the 3-tuple representation has
|
||
leaked to the language. The motivation for all this work is twofold.
|
||
|
||
Simplify the implementation of the language
|
||
-------------------------------------------
|
||
|
||
The simplification gained by reducing the interpreter's internal
|
||
representation of the handled exception to a single object was significant.
|
||
Previously, the interpreter needed to push onto/pop
|
||
from the stack three items whenever it did anything with exceptions.
|
||
This increased stack depth (adding pressure on caches and registers) and
|
||
complicated some of the bytecodes. Reducing this to one item
|
||
`removed about 100 lines of code <https://github.com/python/cpython/pull/30122>`_
|
||
from ``ceval.c`` (the interpreter's eval loop implementation), and it was later
|
||
followed by the removal of the ``POP_EXCEPT_AND_RERAISE`` opcode which has
|
||
become simple enough to be `replaced by generic stack manipulation instructions
|
||
<https://github.com/python/cpython/issues/90360>`_. Micro-benchmarks showed
|
||
`a speedup of about 10% for catching and raising an exception, as well as
|
||
for creating generators
|
||
<https://github.com/faster-cpython/ideas/issues/106#issuecomment-990172363>`_.
|
||
To summarize, removing this redundancy in Python's internals simplified the
|
||
interpreter and made it faster.
|
||
|
||
The performance of invoking ``__exit__``/``__aexit__`` when leaving
|
||
a context manager can be also improved by replacing a multi-arg function
|
||
call with a single-arg one. Micro-benchmarks showed that entering and exiting
|
||
a context manager with single-arg ``__exit__`` is about 13% faster.
|
||
|
||
Simplify the language itself
|
||
----------------------------
|
||
|
||
One of the reasons for the popularity of Python is its simplicity. The
|
||
:func:`py3.11:sys.exc_info` triplet is cryptic for new learners,
|
||
and the redundancy in it is confusing for those who do understand it.
|
||
|
||
It will take multiple releases to get to a point where we can think of
|
||
deprecating ``sys.exc_info()``. However, we can relatively quickly reach a
|
||
stage where new learners do not need to know about it, or about the 3-tuple
|
||
representation, at least until they are maintaining legacy code.
|
||
|
||
Rationale
|
||
=========
|
||
|
||
The only reason to object today to the removal of the last remaining
|
||
appearances of the 3-tuple from the language is the concerns about
|
||
disruption that such changes can bring. The goal of this PEP is to propose
|
||
a safe, gradual and minimally disruptive way to make this change in the
|
||
case of ``__exit__``, and with this to initiate a discussion of our options
|
||
for evolving its method signature.
|
||
|
||
In the case of the :mod:`py3.11:traceback` module's API, evolving the
|
||
functions to have a hybrid signature is relatively straightforward and
|
||
safe. The functions take one positional and two optional arguments, and
|
||
interpret them according to their types. This is safe when sentinels
|
||
are used for default values. The signatures of callbacks, which are
|
||
defined by the user's program, are harder to evolve.
|
||
|
||
The safest option is to make the user explicitly indicate which signature
|
||
the callback is expecting, by marking it with an additional attribute or
|
||
giving it a different name. For example, we could make the interpreter
|
||
look for a ``__leave__`` method on the context manager, and call it with
|
||
a single arg if it exists (otherwise, it looks for ``__exit__`` and
|
||
continues as it does now). The introspection-based alternative proposed
|
||
here intends to make it more convenient for users to write new code,
|
||
because they can just use the single-arg version and remain unaware of
|
||
the legacy API. However, if the limitations of introspection are found
|
||
to be too severe, we should consider an explicit option. Having both
|
||
``__exit__`` and ``__leave__`` around for 5-10 years with similar
|
||
functionality is not ideal, but it is an option.
|
||
|
||
Let us now examine the limitations of the current proposal. It identifies
|
||
2-arg python functions and ``METH_O`` C functions as having a single-arg
|
||
signature, and assumes that anything else is expecting 3 args. Obviously
|
||
it is possible to create false negatives for this heuristic (single-arg
|
||
callables that it will not identify). Context managers written in this
|
||
way won't work, they will continue to fail as they do now when their
|
||
``__exit__`` function will be called with three arguments.
|
||
|
||
I believe that it will not be a problem in practice. First, all working
|
||
code will continue to work, so this is a limitation on new code rather
|
||
than a problem impacting existing code. Second, exotic callable types are
|
||
rarely used for ``__exit__`` and if one is needed, it can always be wrapped
|
||
by a plain vanilla method that delegates to the callable. For example, we
|
||
can write this::
|
||
|
||
class C:
|
||
__enter__ = lambda self: self
|
||
__exit__ = ExoticCallable()
|
||
|
||
as follows::
|
||
|
||
class CM:
|
||
__enter__ = lambda self: self
|
||
_exit = ExoticCallable()
|
||
__exit__ = lambda self, exc: CM._exit(exc)
|
||
|
||
While discussing the real-world impact of the problem in this PEP, it is
|
||
worth noting that most ``__exit__`` functions don't do anything with their
|
||
arguments. Typically, a context manager is implemented to ensure that some
|
||
cleanup actions take place upon exit. It is rarely appropriate for the
|
||
``__exit__`` function to handle exceptions raised within the context, and
|
||
they are typically allowed to propagate out of ``__exit__`` to the calling
|
||
function. This means that most ``__exit__`` functions do not access their
|
||
arguments at all, and we should take this into account when trying to
|
||
assess the impact of different solutions on Python's userbase.
|
||
|
||
|
||
Specification
|
||
=============
|
||
|
||
A context manager's ``__exit__``/``__aexit__`` method can have a single-arg
|
||
signature, in which case it is invoked by the interpreter with the argument
|
||
equal to an exception instance or ``None``:
|
||
|
||
.. code-block::
|
||
|
||
>>> class C:
|
||
... def __enter__(self):
|
||
... return self
|
||
... def __exit__(self, exc):
|
||
... print(f'__exit__ called with: {exc!r}')
|
||
...
|
||
>>> with C():
|
||
... pass
|
||
...
|
||
__exit__ called with: None
|
||
>>> with C():
|
||
... 1/0
|
||
...
|
||
__exit__ called with: ZeroDivisionError('division by zero')
|
||
Traceback (most recent call last):
|
||
File "<stdin>", line 2, in <module>
|
||
ZeroDivisionError: division by zero
|
||
|
||
If ``__exit__``/``__aexit__`` has any other signature, it is invoked with
|
||
the 3-tuple ``(typ, exc, tb)`` as happens now:
|
||
|
||
.. code-block::
|
||
|
||
>>> class C:
|
||
... def __enter__(self):
|
||
... return self
|
||
... def __exit__(self, *exc):
|
||
... print(f'__exit__ called with: {exc!r}')
|
||
...
|
||
>>> with C():
|
||
... pass
|
||
...
|
||
__exit__ called with: (None, None, None)
|
||
>>> with C():
|
||
... 1/0
|
||
...
|
||
__exit__ called with: (<class 'ZeroDivisionError'>, ZeroDivisionError('division by zero'), <traceback object at 0x1039cb570>)
|
||
Traceback (most recent call last):
|
||
File "<stdin>", line 2, in <module>
|
||
ZeroDivisionError: division by zero
|
||
|
||
|
||
These ``__exit__`` methods will also be called with a 3-tuple:
|
||
|
||
.. code-block::
|
||
|
||
def __exit__(self, typ, *exc):
|
||
pass
|
||
|
||
def __exit__(self, typ, exc, tb):
|
||
pass
|
||
|
||
A reference implementation is provided in
|
||
`CPython PR #101995 <https://github.com/python/cpython/pull/101995>`_.
|
||
|
||
When the interpreter reaches the end of the scope of a context manager,
|
||
and it is about to call the relevant ``__exit__`` or ``__aexit__`` function,
|
||
it instrospects this function to determine whether it is the single-arg
|
||
or the legacy 3-arg version. In the draft PR, this introspection is performed
|
||
by the ``is_legacy___exit__`` function:
|
||
|
||
.. code-block:: c
|
||
|
||
static int is_legacy___exit__(PyObject *exit_func) {
|
||
if (PyMethod_Check(exit_func)) {
|
||
PyObject *func = PyMethod_GET_FUNCTION(exit_func);
|
||
if (PyFunction_Check(func)) {
|
||
PyCodeObject *code = (PyCodeObject*)PyFunction_GetCode(func);
|
||
if (code->co_argcount == 2 && !(code->co_flags & CO_VARARGS)) {
|
||
/* Python method that expects self + one more arg */
|
||
return false;
|
||
}
|
||
}
|
||
}
|
||
else if (PyCFunction_Check(exit_func)) {
|
||
if (PyCFunction_GET_FLAGS(exit_func) == METH_O) {
|
||
/* C function declared as single-arg */
|
||
return false;
|
||
}
|
||
}
|
||
return true;
|
||
}
|
||
|
||
It is important to note that this is not a generic introspection function, but
|
||
rather one which is specifically designed for our use case. We know that
|
||
``exit_func`` is an attribute of the context manager class (taken from the
|
||
type of the object that provided ``__enter__``), and it is typically a function.
|
||
Furthermore, for this to be useful we need to identify enough single-arg forms,
|
||
but not necessarily all of them. What is critical for backwards compatibility is
|
||
that we will never misidentify a legacy ``exit_func`` as a single-arg one. So,
|
||
for example, ``__exit__(self, *args)`` and ``__exit__(self, exc_type, *args)``
|
||
both have the legacy form, even though they *could* be invoked with one arg.
|
||
|
||
In summary, an ``exit_func`` will be invoke with a single arg if:
|
||
|
||
* It is a ``PyMethod`` with ``argcount`` ``2`` (to count ``self``) and no vararg, or
|
||
* it is a ``PyCFunction`` with the ``METH_O`` flag.
|
||
|
||
Note that any performance cost of the introspection can be mitigated via
|
||
:pep:`specialization <659>`, so it won't be a problem if we need to make it more
|
||
sophisticated than this for some reason.
|
||
|
||
|
||
Backwards Compatibility
|
||
=======================
|
||
|
||
All context managers that previously worked will continue to work in the
|
||
same way because the interpreter will call them with three args whenever
|
||
they can accept three args. There may be context managers that previously
|
||
did not work because their ``exit_func`` expected one argument, so the call
|
||
to ``__exit__`` would have caused a ``TypeError`` exception to be raised,
|
||
and now the call would succeed. This could theoretically change the
|
||
behaviour of existing code, but it is unlikely to be a problem in practice.
|
||
|
||
The backwards compatibility concerns will show up in some cases when libraries
|
||
try to migrate their context managers from the multi-arg to the single-arg
|
||
signature. If ``__exit__`` or ``__aexit__`` is called by any code other than
|
||
the interpreter's eval loop, the introspection does not automatically happen.
|
||
For example, this will occur where a context manager is subclassed and its
|
||
``__exit__`` method is called directly from the derived ``__exit__``. Such
|
||
context managers will need to migrate to the single-arg version with their
|
||
users, and may choose to offer a parallel API rather than breaking the
|
||
existing one. Alternatively, a superclass can stay with the signature
|
||
``__exit__(self, *args)``, and support both one and three args. Since
|
||
most context managers do not use the value of the arguments to ``__exit__``,
|
||
and simply allow the exception to propagate onward, this is likely to be the
|
||
common approach.
|
||
|
||
|
||
Security Implications
|
||
=====================
|
||
|
||
I am not aware of any.
|
||
|
||
How to Teach This
|
||
=================
|
||
|
||
The language tutorial will present the single-arg version, and the documentation
|
||
for context managers will include a section on the legacy signatures of
|
||
``__exit__`` and ``__aexit__``.
|
||
|
||
|
||
Reference Implementation
|
||
========================
|
||
|
||
`CPython PR #101995 <https://github.com/python/cpython/pull/101995>`_
|
||
implements the proposal of this PEP.
|
||
|
||
|
||
Rejected Ideas
|
||
==============
|
||
|
||
Support ``__leave__(self, exc)``
|
||
----------------------------------
|
||
|
||
It was considered to support a method by a new name, such as ``__leave__``,
|
||
with the new signature. This basically makes the programmer explicitly declare
|
||
which signature they are intending to use, and avoid the need for introspection.
|
||
|
||
Different variations of this idea include different amounts of magic that can
|
||
help automate the equivalence between ``__leave__`` and ``__exit__``. For example,
|
||
`Mark Shannon suggested <https://github.com/faster-cpython/ideas/issues/550#issuecomment-1410120100>`_
|
||
that the type constructor would add a default implementation for each of ``__exit__``
|
||
and ``__leave__`` whenever one of them is defined on a class. This default
|
||
implementation acts as a trampoline that calls the user's function. This would
|
||
make inheritance work seamlessly, as well as the migration from ``__exit__`` to
|
||
``__leave__`` for particular classes. The interpreter would just need to call
|
||
``__leave__``, and that would call ``__exit__`` whenever necessary.
|
||
|
||
While this suggestion has several advantages over the current proposal, it has
|
||
two drawbacks. The first is that it adds a new dunder name to the data model,
|
||
and we would end up with two dunders that mean the same thing, and only slightly
|
||
differ in their signatures. The second is that it would require the migration of
|
||
every ``__exit__`` to ``__leave__``, while with introspection it would not be
|
||
necessary to change the many ``__exit__(*arg)`` methods that do not access their
|
||
args. While it is not as simple as a grep for ``__exit__``, it is possible to write
|
||
an AST visitor that detects ``__exit__`` methods that can accept multiple arguments,
|
||
and which do access them.
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document is placed in the public domain or under the
|
||
CC0-1.0-Universal license, whichever is more permissive.
|