PEP 707: A simplified signature for __exit__ and __aexit__ (#3018)
Commit 23e1d474ad (parent d54d4e66c7)
|
@ -587,6 +587,7 @@ pep-0703.rst @ambv
|
|||
pep-0704.rst @brettcannon @pradyunsg
|
||||
# pep-0705.rst
|
||||
pep-0706.rst @encukou
|
||||
pep-0707.rst @iritkatriel
|
||||
pep-0708.rst @dstufft
|
||||
pep-0709.rst @carljm
|
||||
# ...
|
||||
|
|
conf.py
|
@ -50,6 +50,7 @@ intersphinx_mapping = {
|
|||
'python': ('https://docs.python.org/3/', None),
|
||||
'packaging': ('https://packaging.python.org/en/latest/', None),
|
||||
'py3.11': ('https://docs.python.org/3.11/', None),
|
||||
'py3.12': ('https://docs.python.org/3.12/', None),
|
||||
}
|
||||
intersphinx_disabled_reftypes = []
|
||||
|
||||
|
|
|
@ -0,0 +1,393 @@
|
|||
PEP: 707
|
||||
Title: A simplified signature for __exit__ and __aexit__
|
||||
Author: Irit Katriel <iritkatriel@gmail.com>
|
||||
Discussions-To:
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 18-Feb-2023
|
||||
Python-Version: 3.12
|
||||
Post-History:
|
||||
Resolution:
|
||||
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP proposes to make the interpreter accept context managers whose
|
||||
:meth:`~py3.11:object.__exit__` / :meth:`~py3.11:object.__aexit__` method
|
||||
takes only a single exception instance,
|
||||
while continuing to also support the current ``(typ, exc, tb)`` signature
|
||||
for backwards compatibility.
|
||||
|
||||
This proposal is part of an ongoing effort to remove the redundancy of
|
||||
the 3-item exception representation from the language, a relic of earlier
|
||||
Python versions which now confuses language users while adding complexity
|
||||
and overhead to the interpreter.
|
||||
|
||||
The proposed implementation uses introspection, which is tailored to the
|
||||
requirements of this use case. The solution ensures the safety of the new
|
||||
feature by supporting it only in non-ambiguous cases. In particular, any
|
||||
signature that *could* accept three arguments is assumed to expect them.
|
||||
|
||||
Because reliable introspection of callables is not currently possible in
|
||||
Python, the solution proposed here is limited in that only the common types
|
||||
of single-arg callables will be identified as such, while some of the more
|
||||
esoteric ones will continue to be called with three arguments. This
|
||||
imperfect solution was chosen among several imperfect alternatives in the
|
||||
spirit of practicality. It is my hope that the discussion about this PEP
|
||||
will explore the other options and lead us to the best way forward, which
|
||||
may well be to remain with our imperfect status quo.
|
||||
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
In the past, an exception was represented in many parts of Python by a
|
||||
tuple of three elements: the type of the exception, its value, and its
|
||||
traceback. While there were good reasons for this design at the time,
|
||||
they no longer hold because the type and traceback can now be reliably
|
||||
deduced from the exception instance. Over the last few years we saw
|
||||
several efforts to simplify the representation of exceptions.
|
||||
|
||||
Since 3.10 (`CPython issue #70577 <https://github.com/python/cpython/issues/70577>`_),
|
||||
the :mod:`py3.11:traceback` module's functions accept either a 3-tuple
|
||||
as described above, or just an exception instance as a single argument.
|
||||
|
||||
Internally, the interpreter no longer represents exceptions as a triplet.
|
||||
This was `removed for the handled exception in 3.11
|
||||
<https://github.com/python/cpython/pull/30122>`_ and
|
||||
`for the raised exception in 3.12
|
||||
<https://github.com/python/cpython/pull/101607>`_. As a consequence,
|
||||
several APIs that expose the triplet were deprecated in favour of
|
||||
simpler alternatives:
|
||||
|
||||
.. list-table::
   :header-rows: 1
   :widths: auto

   * -
     - Deprecated
     - Alternative
   * - Get handled exception (Python)
     - :func:`py3.12:sys.exc_info`
     - :func:`py3.12:sys.exception`
   * - Get handled exception (C)
     - :external+py3.12:c:func:`PyErr_GetExcInfo`
     - :external+py3.12:c:func:`PyErr_GetHandledException`
   * - Set handled exception (C)
     - :external+py3.12:c:func:`PyErr_SetExcInfo`
     - :external+py3.12:c:func:`PyErr_SetHandledException`
   * - Get raised exception (C)
     - :external+py3.12:c:func:`PyErr_Fetch`
     - :external+py3.12:c:func:`PyErr_GetRaisedException`
   * - Set raised exception (C)
     - :external+py3.12:c:func:`PyErr_Restore`
     - :external+py3.12:c:func:`PyErr_SetRaisedException`
   * - Construct an exception instance from the 3-tuple (C)
     - :external+py3.12:c:func:`PyErr_NormalizeException`
     - N/A
|
||||
|
||||
|
||||
The current proposal is a step in this process, and considers the way
|
||||
forward for one more case in which the 3-tuple representation has
|
||||
leaked to the language. The motivation for all this work is twofold.
|
||||
|
||||
Simplify the implementation of the language
|
||||
-------------------------------------------
|
||||
|
||||
The simplification gained by reducing the interpreter's internal
|
||||
representation of the handled exception to a single object was significant.
|
||||
Previously, the interpreter needed to push onto/pop
|
||||
from the stack three items whenever it did anything with exceptions.
|
||||
This increased stack depth (adding pressure on caches and registers) and
|
||||
complicated some of the bytecodes. Reducing this to one item
|
||||
`removed about 100 lines of code <https://github.com/python/cpython/pull/30122>`_
|
||||
from ``ceval.c`` (the interpreter's eval loop implementation), and it was later
|
||||
followed by the removal of the ``POP_EXCEPT_AND_RERAISE`` opcode, which had
|
||||
become simple enough to be `replaced by generic stack manipulation instructions
|
||||
<https://github.com/python/cpython/issues/90360>`_. Micro-benchmarks showed
|
||||
`a speedup of about 10% for catching and raising an exception, as well as
|
||||
for creating generators
|
||||
<https://github.com/faster-cpython/ideas/issues/106#issuecomment-990172363>`_.
|
||||
To summarize, removing this redundancy in Python's internals simplified the
|
||||
interpreter and made it faster.
|
||||
|
||||
The performance of invoking ``__exit__``/``__aexit__`` when leaving
|
||||
a context manager can also be improved by replacing a multi-arg function
|
||||
call with a single-arg one. Micro-benchmarks showed that entering and exiting
|
||||
a context manager with single-arg ``__exit__`` is about 13% faster.
|
||||
|
||||
Simplify the language itself
|
||||
----------------------------
|
||||
|
||||
One of the reasons for the popularity of Python is its simplicity. The
|
||||
:func:`py3.11:sys.exc_info` triplet is cryptic for new learners,
|
||||
and the redundancy in it is confusing for those who do understand it.
|
||||
|
||||
It will take multiple releases to get to a point where we can think of
|
||||
deprecating ``sys.exc_info()``. However, we can relatively quickly reach a
|
||||
stage where new learners do not need to know about it, or about the 3-tuple
|
||||
representation, at least until they are maintaining legacy code.
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
The only reason to object today to the removal of the last remaining
|
||||
appearances of the 3-tuple from the language is concern about the
|
||||
disruption that such changes can bring. The goal of this PEP is to propose
|
||||
a safe, gradual and minimally disruptive way to make this change in the
|
||||
case of ``__exit__``, and with this to initiate a discussion of our options
|
||||
for evolving its method signature.
|
||||
|
||||
In the case of the :mod:`py3.11:traceback` module's API, evolving the
|
||||
functions to have a hybrid signature is relatively straightforward and
|
||||
safe. The functions take one positional and two optional arguments, and
|
||||
interpret them according to their types. This is safe when sentinels
|
||||
are used for default values. The signatures of callbacks, which are
|
||||
defined by the user's program, are harder to evolve.
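
The hybrid-signature approach described above can be sketched as follows.
This is a minimal illustration rather than the ``traceback`` module's actual
code: the function name ``exception_summary`` and the ``_SENTINEL`` object
are hypothetical. A sentinel is needed because ``None`` is a legitimate value
in the legacy ``(None, None, None)`` call.

.. code-block:: python

   _SENTINEL = object()  # distinguishes "not passed" from an explicit None


   def exception_summary(exc, /, value=_SENTINEL, tb=_SENTINEL):
       """Accept ``(exc)`` or the legacy ``(typ, value, tb)`` and summarize."""
       if (value is _SENTINEL) != (tb is _SENTINEL):
           raise ValueError("pass both value and tb, or neither")
       if value is _SENTINEL:
           # New form: the first argument is the exception instance (or None).
           if exc is not None and not isinstance(exc, BaseException):
               raise TypeError("expected an exception instance or None")
           value = exc
       # Legacy form: the first argument (the type) is redundant and ignored.
       if value is None:
           return None
       return f"{type(value).__name__}: {value}"

Both ``exception_summary(err)`` and ``exception_summary(type(err), err, None)``
produce the same summary. The function owns its own signature, which is what
makes this safe; a user-defined callback offers no such control.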
|
||||
|
||||
The safest option is to make the user explicitly indicate which signature
|
||||
the callback is expecting, by marking it with an additional attribute or
|
||||
giving it a different name. For example, we could make the interpreter
|
||||
look for a ``__leave__`` method on the context manager, and call it with
|
||||
a single arg if it exists (otherwise, it looks for ``__exit__`` and
|
||||
continues as it does now). The introspection-based alternative proposed
|
||||
here intends to make it more convenient for users to write new code,
|
||||
because they can just use the single-arg version and remain unaware of
|
||||
the legacy API. However, if the limitations of introspection are found
|
||||
to be too severe, we should consider an explicit option. Having both
|
||||
``__exit__`` and ``__leave__`` around for 5-10 years with similar
|
||||
functionality is not ideal, but it is an option.
|
||||
|
||||
Let us now examine the limitations of the current proposal. It identifies
|
||||
2-arg Python functions and ``METH_O`` C functions as having a single-arg
|
||||
signature, and assumes that anything else is expecting 3 args. Obviously
|
||||
it is possible to create false negatives for this heuristic (single-arg
|
||||
callables that it will not identify). Context managers written in this
|
||||
way won't work; they will continue to fail as they do now when their
|
||||
``__exit__`` function is called with three arguments.
|
||||
|
||||
I believe that it will not be a problem in practice. First, all working
|
||||
code will continue to work, so this is a limitation on new code rather
|
||||
than a problem impacting existing code. Second, exotic callable types are
|
||||
rarely used for ``__exit__`` and if one is needed, it can always be wrapped
|
||||
by a plain vanilla method that delegates to the callable. For example, we
|
||||
can write this::
|
||||
|
||||
   class C:
       __enter__ = lambda self: self
       __exit__ = ExoticCallable()
|
||||
|
||||
as follows::
|
||||
|
||||
   class CM:
       __enter__ = lambda self: self
       _exit = ExoticCallable()
       __exit__ = lambda self, exc: CM._exit(exc)
|
||||
|
||||
While discussing the real-world impact of the problem in this PEP, it is
|
||||
worth noting that most ``__exit__`` functions don't do anything with their
|
||||
arguments. Typically, a context manager is implemented to ensure that some
|
||||
cleanup actions take place upon exit. It is rarely appropriate for the
|
||||
``__exit__`` function to handle exceptions raised within the context, and
|
||||
they are typically allowed to propagate out of ``__exit__`` to the calling
|
||||
function. This means that most ``__exit__`` functions do not access their
|
||||
arguments at all, and we should take this into account when trying to
|
||||
assess the impact of different solutions on Python's userbase.
|
||||
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
A context manager's ``__exit__``/``__aexit__`` method can have a single-arg
|
||||
signature, in which case it is invoked by the interpreter with the argument
|
||||
equal to an exception instance or ``None``:
|
||||
|
||||
.. code-block::

   >>> class C:
   ...     def __enter__(self):
   ...         return self
   ...     def __exit__(self, exc):
   ...         print(f'__exit__ called with: {exc!r}')
   ...
   >>> with C():
   ...     pass
   ...
   __exit__ called with: None
   >>> with C():
   ...     1/0
   ...
   __exit__ called with: ZeroDivisionError('division by zero')
   Traceback (most recent call last):
     File "<stdin>", line 2, in <module>
   ZeroDivisionError: division by zero
|
||||
|
||||
If ``__exit__``/``__aexit__`` has any other signature, it is invoked with
|
||||
the 3-tuple ``(typ, exc, tb)`` as happens now:
|
||||
|
||||
.. code-block::

   >>> class C:
   ...     def __enter__(self):
   ...         return self
   ...     def __exit__(self, *exc):
   ...         print(f'__exit__ called with: {exc!r}')
   ...
   >>> with C():
   ...     pass
   ...
   __exit__ called with: (None, None, None)
   >>> with C():
   ...     1/0
   ...
   __exit__ called with: (<class 'ZeroDivisionError'>, ZeroDivisionError('division by zero'), <traceback object at 0x1039cb570>)
   Traceback (most recent call last):
     File "<stdin>", line 2, in <module>
   ZeroDivisionError: division by zero
|
||||
|
||||
|
||||
These ``__exit__`` methods will also be called with a 3-tuple:
|
||||
|
||||
.. code-block::

   def __exit__(self, typ, *exc):
       pass

   def __exit__(self, typ, exc, tb):
       pass
|
||||
|
||||
A reference implementation is provided in
|
||||
`CPython PR #101995 <https://github.com/python/cpython/pull/101995>`_.
|
||||
|
||||
When the interpreter reaches the end of the scope of a context manager,
|
||||
and it is about to call the relevant ``__exit__`` or ``__aexit__`` function,
|
||||
it introspects this function to determine whether it is the single-arg
|
||||
or the legacy 3-arg version. In the draft PR, this introspection is performed
|
||||
by the ``is_legacy___exit__`` function:
|
||||
|
||||
.. code-block:: c

   static int is_legacy___exit__(PyObject *exit_func) {
       if (PyMethod_Check(exit_func)) {
           PyObject *func = PyMethod_GET_FUNCTION(exit_func);
           if (PyFunction_Check(func)) {
               PyCodeObject *code = (PyCodeObject*)PyFunction_GetCode(func);
               if (code->co_argcount == 2 && !(code->co_flags & CO_VARARGS)) {
                   /* Python method that expects self + one more arg */
                   return false;
               }
           }
       }
       else if (PyCFunction_Check(exit_func)) {
           if (PyCFunction_GET_FLAGS(exit_func) == METH_O) {
               /* C function declared as single-arg */
               return false;
           }
       }
       return true;
   }
|
||||
|
||||
It is important to note that this is not a generic introspection function, but
|
||||
rather one which is specifically designed for our use case. We know that
|
||||
``exit_func`` is an attribute of the context manager class (taken from the
|
||||
type of the object that provided ``__enter__``), and it is typically a function.
|
||||
Furthermore, for this to be useful we need to identify enough single-arg forms,
|
||||
but not necessarily all of them. What is critical for backwards compatibility is
|
||||
that we will never misidentify a legacy ``exit_func`` as a single-arg one. So,
|
||||
for example, ``__exit__(self, *args)`` and ``__exit__(self, exc_type, *args)``
|
||||
both have the legacy form, even though they *could* be invoked with one arg.
|
||||
|
||||
In summary, an ``exit_func`` will be invoked with a single arg if:
|
||||
|
||||
* It is a ``PyMethod`` wrapping a Python function whose ``co_argcount`` is ``2`` (counting ``self``) and which has no ``*args`` parameter, or
|
||||
* it is a ``PyCFunction`` with the ``METH_O`` flag.
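
For readers who want to experiment with the rule, the Python-function half of
the check can be approximated in pure Python. This is only a sketch: the name
``is_legacy_exit`` is hypothetical, and because the ``METH_O`` flag is not
exposed to Python code, this version conservatively treats every C callable
as legacy.

.. code-block:: python

   import inspect
   import types


   def is_legacy_exit(exit_func):
       """Return False only for bound Python methods taking self + one arg."""
       if isinstance(exit_func, types.MethodType):
           func = exit_func.__func__
           if isinstance(func, types.FunctionType):
               code = func.__code__
               has_varargs = bool(code.co_flags & inspect.CO_VARARGS)
               if code.co_argcount == 2 and not has_varargs:
                   # Python method that expects self + one more arg
                   return False
       # Anything else, including every C function, is assumed to be legacy.
       return True

With this helper, a method defined as ``def __exit__(self, exc)`` is reported
as single-arg, while ``def __exit__(self, *args)`` and
``def __exit__(self, typ, *rest)`` are reported as legacy.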
|
||||
|
||||
Note that any performance cost of the introspection can be mitigated via
|
||||
:pep:`specialization <659>`, so it won't be a problem if we need to make it more
|
||||
sophisticated than this for some reason.
|
||||
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
All context managers that previously worked will continue to work in the
|
||||
same way because the interpreter will call them with three args whenever
|
||||
they can accept three args. There may be context managers that previously
|
||||
did not work because their ``exit_func`` expected one argument, so the call
|
||||
to ``__exit__`` would have caused a ``TypeError`` exception to be raised,
|
||||
and now the call would succeed. This could theoretically change the
|
||||
behaviour of existing code, but it is unlikely to be a problem in practice.
|
||||
|
||||
The backwards compatibility concerns will show up in some cases when libraries
|
||||
try to migrate their context managers from the multi-arg to the single-arg
|
||||
signature. If ``__exit__`` or ``__aexit__`` is called by any code other than
|
||||
the interpreter's eval loop, the introspection does not automatically happen.
|
||||
For example, this will occur where a context manager is subclassed and its
|
||||
``__exit__`` method is called directly from the derived ``__exit__``. Such
|
||||
context managers will need to migrate to the single-arg version with their
|
||||
users, and may choose to offer a parallel API rather than breaking the
|
||||
existing one. Alternatively, a superclass can stay with the signature
|
||||
``__exit__(self, *args)``, and support both one and three args. Since
|
||||
most context managers do not use the value of the arguments to ``__exit__``,
|
||||
and simply allow the exception to propagate onward, this is likely to be the
|
||||
common approach.
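
A superclass that stays with ``__exit__(self, *args)`` might look like the
sketch below. The class names and the helper ``exception_from_exit_args`` are
hypothetical. Because of ``*args``, the interpreter keeps passing three
arguments, while subclasses that call the method directly may pass one or
three.

.. code-block:: python

   def exception_from_exit_args(args):
       """Extract the exception (or None) from one-arg or three-arg calls."""
       if len(args) == 1:
           return args[0]
       if len(args) == 3:
           return args[1]  # (typ, exc, tb): the instance is the middle item
       raise TypeError(f"__exit__ expected 1 or 3 arguments, got {len(args)}")


   class Resource:
       def __enter__(self):
           return self

       def __exit__(self, *args):
           self.last_exc = exception_from_exit_args(args)
           return False  # never suppress the exception


   class LegacyResource(Resource):
       def __exit__(self, typ, exc, tb):
           # An existing subclass that still delegates with three arguments.
           return super().__exit__(typ, exc, tb)

A subclass that has already migrated can call ``super().__exit__(exc)``
instead, and the same base class handles it.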
|
||||
|
||||
|
||||
Security Implications
|
||||
=====================
|
||||
|
||||
I am not aware of any.
|
||||
|
||||
How to Teach This
|
||||
=================
|
||||
|
||||
The language tutorial will present the single-arg version, and the documentation
|
||||
for context managers will include a section on the legacy signatures of
|
||||
``__exit__`` and ``__aexit__``.
|
||||
|
||||
|
||||
Reference Implementation
|
||||
========================
|
||||
|
||||
`CPython PR #101995 <https://github.com/python/cpython/pull/101995>`_
|
||||
implements the proposal of this PEP.
|
||||
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
||||
Support ``__leave__(self, exc)``
|
||||
----------------------------------
|
||||
|
||||
One option considered was a method with a new name, such as ``__leave__``,
|
||||
with the new signature. This basically makes the programmer explicitly declare
|
||||
which signature they intend to use, and avoids the need for introspection.
|
||||
|
||||
Different variations of this idea include different amounts of magic that can
|
||||
help automate the equivalence between ``__leave__`` and ``__exit__``. For example,
|
||||
`Mark Shannon suggested <https://github.com/faster-cpython/ideas/issues/550#issuecomment-1410120100>`_
|
||||
that the type constructor would add a default implementation for each of ``__exit__``
|
||||
and ``__leave__`` whenever one of them is defined on a class. This default
|
||||
implementation acts as a trampoline that calls the user's function. This would
|
||||
make inheritance work seamlessly, as well as the migration from ``__exit__`` to
|
||||
``__leave__`` for particular classes. The interpreter would just need to call
|
||||
``__leave__``, and that would call ``__exit__`` whenever necessary.
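
The trampoline idea can be imitated today with ``__init_subclass__``. The
sketch below is only an approximation of the suggestion: the base class name
``LeaveExitTrampoline`` is hypothetical, and the real proposal would place
this logic in the type constructor rather than in an opt-in base class.

.. code-block:: python

   class LeaveExitTrampoline:
       """Add whichever of __exit__/__leave__ a subclass did not define."""

       def __init_subclass__(cls, **kwargs):
           super().__init_subclass__(**kwargs)
           has_exit = "__exit__" in cls.__dict__
           has_leave = "__leave__" in cls.__dict__
           if has_leave and not has_exit:
               def __exit__(self, typ, exc, tb):
                   # Legacy callers reach the single-arg implementation.
                   return self.__leave__(exc)
               cls.__exit__ = __exit__
           elif has_exit and not has_leave:
               def __leave__(self, exc):
                   # Rebuild the triplet from the exception instance.
                   typ = None if exc is None else type(exc)
                   tb = None if exc is None else exc.__traceback__
                   return self.__exit__(typ, exc, tb)
               cls.__leave__ = __leave__


   class Suppressor(LeaveExitTrampoline):
       def __enter__(self):
           return self

       def __leave__(self, exc):
           self.seen = exc
           return True  # suppress whatever was raised

Here ``Suppressor`` defines only ``__leave__``, yet it works in a ``with``
statement on current Python because the generated ``__exit__`` forwards to it.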
|
||||
|
||||
While this suggestion has several advantages over the current proposal, it has
|
||||
two drawbacks. The first is that it adds a new dunder name to the data model,
|
||||
and we would end up with two dunders that mean the same thing, and only slightly
|
||||
differ in their signatures. The second is that it would require the migration of
|
||||
every ``__exit__`` to ``__leave__``, while with introspection it would not be
|
||||
necessary to change the many ``__exit__(*arg)`` methods that do not access their
|
||||
args. While it is not as simple as a grep for ``__exit__``, it is possible to write
|
||||
an AST visitor that detects ``__exit__`` methods that can accept multiple arguments,
|
||||
and which do access them.
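
A rough sketch of such an AST visitor is shown below. The names
``ExitArgUseFinder`` and ``find_exit_arg_uses`` are hypothetical, and the
check is deliberately simple: it reports any ``__exit__``/``__aexit__`` that
can accept more than one argument after ``self`` and whose body mentions one
of those parameters by name. It does not account for shadowing in nested
scopes.

.. code-block:: python

   import ast


   class ExitArgUseFinder(ast.NodeVisitor):
       """Collect (lineno, name) for multi-arg exit methods that use their args."""

       def __init__(self):
           self.hits = []

       def visit_FunctionDef(self, node):
           if node.name in ("__exit__", "__aexit__"):
               args = node.args
               params = args.posonlyargs + args.args
               multi_arg = len(params) > 2 or args.vararg is not None
               names = {a.arg for a in params[1:]}  # skip self
               if args.vararg is not None:
                   names.add(args.vararg.arg)
               used = any(
                   isinstance(n, ast.Name) and n.id in names
                   for stmt in node.body
                   for n in ast.walk(stmt)
               )
               if multi_arg and used:
                   self.hits.append((node.lineno, node.name))
           self.generic_visit(node)

       visit_AsyncFunctionDef = visit_FunctionDef


   def find_exit_arg_uses(source):
       finder = ExitArgUseFinder()
       finder.visit(ast.parse(source))
       return finder.hits

Methods such as ``def __exit__(self, *args): self.close()`` are not reported,
since they never read their arguments and would need no change.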
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document is placed in the public domain or under the
|
||||
CC0-1.0-Universal license, whichever is more permissive.
|