python-peps/peps/pep-0765.rst

374 lines
15 KiB
ReStructuredText
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 765
Title: Disallow return/break/continue that exit a finally block
Author: Irit Katriel <irit@python.org>, Alyssa Coghlan <ncoghlan@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-765-disallow-return-break-continue-that-exit-a-finally-block/71348
Status: Draft
Type: Standards Track
Created: 15-Nov-2024
Python-Version: 3.14
Post-History: `09-Nov-2024 <https://discuss.python.org/t/an-analysis-of-return-in-finally-in-the-wild/70633>`__,
`16-Nov-2024 <https://discuss.python.org/t/pep-765-disallow-return-break-continue-that-exit-a-finally-block/71348>`__,
Replaces: 601
Abstract
========
This PEP proposes to withdraw support for ``return``, ``break`` and
``continue`` statements that break out of a ``finally`` block.
This was proposed in the past by :pep:`601`. The current PEP
is based on empirical evidence regarding the cost/benefit of
this change, which did not exist at the time that :pep:`601`
was rejected. It also proposes a slightly different solution
than that which was proposed by :pep:`601`.
Motivation
==========
The semantics of ``return``, ``break`` and ``continue`` in a
finally block are surprising for many developers.
The :ref:`documentation <python:tut-cleanup>` mentions that:
- If the ``finally`` clause executes a ``break``, ``continue``
or ``return`` statement, exceptions are not re-raised.
- If a ``finally`` clause includes a ``return`` statement, the
returned value will be the one from the ``finally`` clauses
``return`` statement, not the value from the ``try`` clauses
``return`` statement.
Both of these behaviours cause confusion, but the first is
particularly dangerous because a swallowed exception is more
likely to slip through testing, than an incorrect return value.
In 2019, :pep:`601` proposed to change Python to emit a
``SyntaxWarning`` for a few releases and then turn it into a
``SyntaxError``. It was rejected in favour of viewing this
as a programming style issue, to be handled by linters and :pep:`8`.
Indeed, :pep:`8` now recommends not to use control flow statements
in a ``finally`` block, and linters such as
`Pylint <https://pylint.readthedocs.io/en/stable/>`__,
`Ruff <https://docs.astral.sh/ruff/>`__ and
`flake8-bugbear <https://github.com/PyCQA/flake8-bugbear>`__
flag them as a problem.
Rationale
=========
A recent
`analysis of real world code <https://github.com/iritkatriel/finally/blob/main/README.md>`__ shows that:
- These features are rare (2 per million LOC in the top 8,000 PyPI
packages, 4 per million LOC in a random selection of packages).
This could be thanks to the linters that flag this pattern.
- Most of the usages are incorrect, and introduce unintended
exception-swallowing bugs.
- Code owners are typically receptive to fixing the bugs, and
find that easy to do.
See `the appendix <#appendix>`__ for more details.
This new data indicates that it would benefit Python's users if
Python itself moved them away from this harmful feature.
One of the arguments brought up in
`the PEP 601 discussion <https://discuss.python.org/t/pep-601-forbid-return-break-continue-breaking-out-of-finally/2239/24>`__
was that language features should be orthogonal, and combine without
context-based restrictions. However, in the meantime :pep:`654` has
been implemented, and it forbids ``return``, ``break`` and ``continue``
in an ``except*`` clause because the semantics of that would violate
the property that ``except*`` clauses operate *in parallel*, so the
code of one clause should not suppress the invocation of another.
In that case we accepted that a combination of features can be
harmful enough that it makes sense to disallow it.
Specification
=============
The change is to specify as part of the language spec that
Python's compiler may emit a ``SyntaxWarning`` or ``SyntaxError``
when a ``return``, ``break`` or ``continue`` would transfer
control flow from within a ``finally`` block to a location outside
of it.
This includes the following examples:
.. code-block::
:class: bad
def f():
try:
...
finally:
return 42
for x in o:
try:
...
finally:
break # (or continue)
But excludes these:
.. code-block::
:class: good
try:
...
finally:
def f():
return 42
try:
...
finally:
for x in o:
break # (or continue)
CPython will emit a ``SyntaxWarning`` in version 3.14, and we leave
it open whether, and when, this will become a ``SyntaxError``.
However, we specify here that a ``SyntaxError`` is permitted by
the language spec, so that other Python implementations can choose
to implement that.
The CPython implementation will emit the ``SyntaxWarning`` during
``AST`` construction, to ensure that the warning will show up during
static anlaysis and compilation, but not during execution of
pre-compiled code. We expect that the warning will be seen by a
project maintainer (when they run static analysis, or CI which
does not have precompiled files). However, end users of a project
will only see a warning if they skip precompilation at installation
time, check installation time warnings, or run static analysis over
their dependencies.
Backwards Compatibility
=======================
For backwards compatibility reasons, we are proposing that CPython
emit only a ``SyntaxWarning``, with no concrete plan to upgrade that
to an error. Code running with ``-We`` may stop working once this
is introduced.
Security Implications
=====================
The warning/error will help programmers avoid some hard to find bugs,
so will have a security benefit. We are not aware of security issues
related to raising a new ``SyntaxWarning`` or ``SyntaxError``.
How to Teach This
=================
The change will be documented in the language spec and in the
What's New documentation. The ``SyntaxWarning`` will alert users
that their code needs to change. The `empirical evidence <#appendix>`__
shows that the changes necessary are typically quite
straightforward.
Rejected Ideas
==============
Emit ``SyntaxError`` in CPython
-------------------------------
:pep:`601` proposed that CPython would emit ``SyntaxWarning`` for a couple of
releases and ``SyntaxError`` afterwards. We are leaving it open whether, and
when, this will become a ``SyntaxError`` in CPython, because we believe that a
``SyntaxWarning`` would provide most of the benefit with less risk.
Change Semantics
----------------
It `was suggested <https://discuss.python.org/t/pep-765-disallow-return-break-continue-that-exit-a-finally-block/71348/32>`__
to change the semantics of control flow instructions in ``finally`` such that an
in-flight exception takes precedence over them. In other words, a ``return``,
``break`` or ``continue`` would be permitted, and would exit the ``finally``
block, but the exception would still be raised.
This was rejected for two reasons. First, it would change the semantics of
working code in a way that can be hard to debug: a ``finally`` that was written
with the intention of swallowing all exceptions (correctly using the documented
semantics) would now allow the exception to propagate on. This may happen only
in rare edge cases at runtime, and is not guaranteed to be detected in testing.
Even if the code is wrong, and has an exception swallowing bug, it could be
hard for users to understand why a program started raising exceptions in 3.14,
while it did not in 3.13.
In contrast, a ``SyntaxWarning`` is likely to be seen during testing, it would
point to the precise location of the problem in the code, and it would not
prevent the program from running.
The second objection was about the proposed semantics. The motivation for
allowing control flow statements is not that this would be useful, but rather
the desire for orthogonality of features (which, as we mentioned in the
introduction, is already violated in the case of ``except*`` clauses). However,
the proposed semantics are complicated because they suggest that ``return``,
``break`` and ``continue`` behave as they normally do when ``finally`` executes
without an in-flight exception, but turn into something like a bare ``raise``
when there is one. It is hard to claim that the features are orthogonal if
the presence of one changes the semantics of the other.
Appendix
========
``return`` in ``finally`` considered harmful
--------------------------------------------
Below is an abridged version of a
`research report <https://github.com/iritkatriel/finally/commits/main/README.md>`__
by Irit Katriel, which was posted on 9 Nov 2024.
It describes an investigation into usage of ``return``, ``break`` and ``continue``
in a ``finally`` clause in real world code, addressing the
questions: Are people using it? How often are they using it incorrectly?
How much churn would the proposed change create?
Method
^^^^^^
The analysis is based on the 8,000 most popular PyPI packages, in terms of number
of downloads in the last 30 days. They were downloaded on the 17th-18th of
October, using
`a script <https://github.com/faster-cpython/tools/blob/main/scripts/download_packages.py>`__
written by Guido van Rossum, which in turn relies on Hugo van Kemenade's
`tool <https://hugovk.github.io/top-pypi-packages/>`__ that creates a list of the
most popular packages.
Once downloaded, a
`second script <https://github.com/iritkatriel/finally/blob/main/scripts/ast_analysis.py>`__
was used to construct an AST for each file, and traverse it to identify ``break``,
``continue`` and ``return`` statements which are directly inside a ``finally`` block.
I then found the current source code for each occurrence, and categorized it. For
cases where the code seems incorrect, I created an issue in the project's bug
tracker. The responses to these issues are also part of the data collected in
this investigation.
Results
^^^^^^^
I decided not to include a list of the incorrect usages, out of concern that
it would make this report look like a shaming exercise. Instead I will describe
the results in general terms, but will mention that some of the problems I found
appear in very popular libraries, including a cloud security application.
For those so inclined, it should not be hard to replicate my analysis, as I
provided links to the scripts I used in the Method section.
The projects examined contained a total of 120,964,221 lines of Python code,
and among them the script found 203 instances of control flow instructions in a
``finally`` block. Most were ``return``, a handful were ``break``, and none were
``continue``. Of these:
- 46 are correct, and appear in tests that target this pattern as a feature (e.g.,
tests for linters that detect it).
- 8 seem like they could be correct - either intentionally swallowing exceptions
or appearing where an active exception cannot occur. Despite being correct, it is
not hard to rewrite them to avoid the bad pattern, and it would make the code
clearer: deliberately swallowing exceptions can be more explicitly done with
``except BaseException:``, and ``return`` which doesn't swallow exceptions can be
moved after the ``finally`` block.
- 149 were clearly incorrect, and can lead to unintended swallowing of exceptions.
These are analyzed in the next section.
**The Error Cases**
Many of the error cases followed this pattern:
.. code-block::
:class: bad
try:
...
except SomeSpecificError:
...
except Exception:
logger.log(...)
finally:
return some_value
Code like this is obviously incorrect because it deliberately logs and swallows
``Exception`` subclasses, while silently swallowing ``BaseExceptions``. The intention
here is either to allow ``BaseExceptions`` to propagate on, or (if the author is
unaware of the ``BaseException`` issue), to log and swallow all exceptions. However,
even if the ``except Exception`` was changed to ``except BaseException``, this code
would still have the problem that the ``finally`` block swallows all exceptions
raised from within the ``except`` block, and this is probably not the intention
(if it is, that can be made explicit with another ``try``-``except BaseException``).
Another variation on the issue found in real code looks like this:
.. code-block::
:class: bad
try:
...
except:
return NotImplemented
finally:
return some_value
Here the intention seems to be to return ``NotImplemented`` when an exception is
raised, but the ``return`` in the ``finally`` block would override the one in the
``except`` block.
.. note:: Following the
`discussion <https://discuss.python.org/t/an-analysis-of-return-in-finally-in-the-wild/70633/15>`__,
I repeated the analysis on a random selection of PyPI packages (to
analyze code written by *average* programmers). The sample contained
in total 77,398,892 lines of code with 316 instances of ``return``/``break``/``continue``
in ``finally``. So about 4 instances per million lines of code.
**Author reactions**
Of the 149 incorrect instances of ``return`` or ``break`` in a ``finally`` clause,
27 were out of date, in the sense that they do not appear in the main/master branch
of the library, as the code has been deleted or fixed by now. The remaining 122
are in 73 different packages, and I created an issue in each one to alert the
authors to the problems. Within two weeks, 40 of the 73 issues received a reaction
from the code maintainers:
- 15 issues had a PR opened to fix the problem.
- 20 received reactions acknowledging the problem as one worth looking into.
- 3 replied that the code is no longer maintained so this won't be fixed.
- 2 closed the issue as "works as intended", one said that they intend to
swallow all exceptions, but the other seemed unaware of the distinction
between ``Exception`` and ``BaseException``.
One issue was linked to a pre-existing open issue about non-responsiveness to Ctrl-C,
conjecturing a connection.
Two of the issue were labelled as "good first issue".
**The correct usages**
The 8 cases where the feature appears to be used correctly (in non-test code) also
deserve attention. These represent the "churn" that would be caused by blocking
the feature, because this is where working code will need to change. I did not
contact the authors in these cases, so we need to assess the difficulty of
making these changes ourselves. It is shown in
`the full report <https://github.com/iritkatriel/finally/commits/main/README.md>`__,
that the change required in each case is small.
Discussion
^^^^^^^^^^
The first thing to note is that ``return``/``break``/``continue`` in a ``finally``
block is not something we see often: 203 instance in over 120 million lines
of code. This is, possibly, thanks to the linters that warn about this.
The second observation is that most of the usages were incorrect: 73% in our
sample (149 of 203).
Finally, the author responses were overwhelmingly positive. Of the 40 responses
received within two weeks, 35 acknowledged the issue, 15 of which also created
a PR to fix it. Only two thought that the code is fine as it is, and three
stated that the code is no longer maintained so they will not look into it.
The 8 instances where the code seems to work as intended, are not hard to
rewrite.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.