PEP: 419
Title: Protecting cleanup statements from interruptions
Version: $Revision$
Last-Modified: $Date$
Author: Paul Colomiets <paul@colomiets.name>
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Apr-2012
Python-Version: 3.3


Abstract
========

This PEP proposes a way to protect Python code from being interrupted
inside a finally clause or during context manager cleanup.


PEP Deferral
============

Further exploration of the concepts covered in this PEP has been deferred
for lack of a current champion interested in promoting the goals of the PEP
and collecting and incorporating feedback, and with sufficient available
time to do so effectively.


Rationale
=========

Python has two nice ways to do cleanup. One is a ``finally``
statement and the other is a context manager (usually called using a
``with`` statement). However, neither is protected from interruption
by ``KeyboardInterrupt`` or ``GeneratorExit`` caused by
``generator.throw()``. For example::

    lock.acquire()
    try:
        print('starting')
        do_something()
    finally:
        print('finished')
        lock.release()

If ``KeyboardInterrupt`` occurs just after the second ``print()``
call, the lock will not be released. Similarly, the following code
using the ``with`` statement is affected::

    from threading import Lock

    class MyLock:

        def __init__(self):
            self._lock_impl = Lock()

        def __enter__(self):
            self._lock_impl.acquire()
            print("LOCKED")

        def __exit__(self, exc_type, exc_value, traceback):
            print("UNLOCKING")
            self._lock_impl.release()

    lock = MyLock()
    with lock:
        do_something()

If ``KeyboardInterrupt`` occurs near any of the ``print()`` calls, the
lock will never be released.


Coroutine Use Case
------------------
A similar case occurs with coroutines. Usually coroutine libraries
want to interrupt the coroutine with a timeout. The
``generator.throw()`` method works for this use case, but there is no
way of knowing if the coroutine is currently suspended from inside a
``finally`` clause.

An example that uses yield-based coroutines follows. The code looks
similar using any of the popular coroutine libraries Monocle [1]_,
Bluelet [2]_, or Twisted [3]_. ::

    def run_locked():
        yield connection.sendall('LOCK')
        try:
            yield do_something()
            yield do_something_else()
        finally:
            yield connection.sendall('UNLOCK')

    with timeout(5):
        yield run_locked()

In the example above, ``yield something`` means to pause executing the
current coroutine and to execute coroutine ``something`` until it
finishes execution. Therefore, the coroutine library itself needs to
maintain a stack of generators. The ``connection.sendall()`` call waits
until the socket is writable and does a similar thing to what
``socket.sendall()`` does.

The ``with`` statement ensures that all code is executed within a
5-second timeout. It does so by registering a callback in the main
loop, which calls ``generator.throw()`` on the top-most frame in the
coroutine stack when a timeout happens.
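
The hazard can be reproduced with a plain generator, without any coroutine library. The following is only a sketch: ``Timeout`` and the ``log`` list are hypothetical stand-ins for a library's timeout exception and for the traffic on the connection, and the driver code at the bottom plays the role of the main loop.

```python
class Timeout(Exception):
    """Hypothetical stand-in for a coroutine library's timeout exception."""


def run_locked(log):
    log.append('LOCK')
    try:
        yield 'working'
    finally:
        # The cleanup itself has to wait, so it yields as well.
        yield 'unlocking'
        log.append('UNLOCK')


log = []
task = run_locked(log)
next(task)                     # suspended inside the try block
step = task.throw(Timeout())   # first timeout: the finally clause starts
try:
    task.throw(Timeout())      # second timeout lands inside the finally clause
except Timeout:
    pass

print(step)  # unlocking
print(log)   # ['LOCK']
```

The second ``throw()`` aborts the ``finally`` clause at its ``yield``, so ``'UNLOCK'`` is never recorded. The caller has no way to tell that the generator was suspended inside cleanup code.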

The ``greenlets`` extension works in a similar way, except that it
doesn't need ``yield`` to enter a new stack frame. Otherwise, the
considerations are similar.


Specification
=============

Frame Flag 'f_in_cleanup'
-------------------------

A new flag on the frame object is proposed. It is set to ``True`` if
this frame is currently executing a ``finally`` clause. Internally,
the flag must be implemented as a counter of nested ``finally``
statements currently being executed.

The internal counter also needs to be incremented during execution of
the ``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and decremented
when execution for these bytecodes is finished. This also makes it
possible to protect ``__enter__()`` and ``__exit__()`` methods.

Function 'sys.setcleanuphook'
-----------------------------

A new function for the ``sys`` module is proposed. This function sets
a callback which is executed every time ``f_in_cleanup`` becomes
false. Callbacks get a frame object as their sole argument, so that
they can figure out where they are called from.

The setting is thread local and must be stored in the
``PyThreadState`` structure.
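
The intended interplay between the nesting counter and the hook can be modelled in pure Python. This is a rough simulation, not the proposed C implementation: ``FakeFrame``, ``enter_cleanup()``, ``leave_cleanup()`` and ``set_cleanup_hook()`` are hypothetical names invented for illustration, and ``threading.local`` stands in for ``PyThreadState``.

```python
import threading

_state = threading.local()      # per-thread storage, like PyThreadState


def set_cleanup_hook(hook):
    """Hypothetical stand-in for the proposed sys.setcleanuphook()."""
    _state.hook = hook


class FakeFrame:
    """Hypothetical frame object carrying the proposed nesting counter."""

    def __init__(self):
        self._cleanup_depth = 0

    @property
    def f_in_cleanup(self):
        return self._cleanup_depth > 0

    def enter_cleanup(self):
        self._cleanup_depth += 1

    def leave_cleanup(self):
        self._cleanup_depth -= 1
        hook = getattr(_state, 'hook', None)
        if self._cleanup_depth == 0 and hook is not None:
            hook(self)          # the flag has just become false


calls = []
set_cleanup_hook(calls.append)
frame = FakeFrame()
frame.enter_cleanup()           # outer finally clause
frame.enter_cleanup()           # nested finally clause
frame.leave_cleanup()           # still inside the outer clause: no hook call
inner_done = (frame.f_in_cleanup, len(calls))
frame.leave_cleanup()           # flag becomes false: the hook runs once
```

In this model the hook fires only when the outermost cleanup finishes, which matches the "every time ``f_in_cleanup`` becomes false" wording above.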

Inspect Module Enhancements
---------------------------

Two new functions are proposed for the ``inspect`` module:
``isframeincleanup()`` and ``getcleanupframe()``.

``isframeincleanup()``, given a frame or generator object as its sole
argument, returns the value of the ``f_in_cleanup`` attribute of a
frame itself or of the ``gi_frame`` attribute of a generator.

``getcleanupframe()``, given a frame object as its sole argument,
returns the innermost frame which has a true value of
``f_in_cleanup``, or ``None`` if no frames in the stack have a nonzero
value for that attribute. It starts to inspect from the specified
frame and walks to outer frames using ``f_back`` pointers, just like
``getouterframes()`` does.
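
A pure-Python approximation of the two helpers might look like the sketch below. Real frame objects have no ``f_in_cleanup`` attribute today, so the example runs against fake frames built from ``types.SimpleNamespace``; the snake_case function names are used only to avoid suggesting that these functions exist in ``inspect``.

```python
from types import SimpleNamespace


def get_cleanup_frame(frame):
    """Sketch of the proposed inspect.getcleanupframe()."""
    while frame is not None:
        # Real frames lack f_in_cleanup today, hence the default.
        if getattr(frame, 'f_in_cleanup', False):
            return frame
        frame = frame.f_back
    return None


def is_frame_in_cleanup(obj):
    """Sketch of the proposed inspect.isframeincleanup()."""
    frame = getattr(obj, 'gi_frame', obj)   # accept generators as well
    return bool(getattr(frame, 'f_in_cleanup', False))


# Fake call stack: outer -> middle (executing cleanup) -> inner
outer = SimpleNamespace(f_in_cleanup=False, f_back=None)
middle = SimpleNamespace(f_in_cleanup=True, f_back=outer)
inner = SimpleNamespace(f_in_cleanup=False, f_back=middle)
fake_generator = SimpleNamespace(gi_frame=middle)

found = get_cleanup_frame(inner)            # the middle frame
```

Starting from ``inner``, the walk stops at ``middle``, the first frame on the way out whose flag is set. Starting from ``outer`` it finds nothing and returns ``None``.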


Example
=======

An example implementation of a SIGINT handler that interrupts safely
might look like::

    import functools
    import inspect
    import sys

    def sigint_handler(sig, frame):
        # No frame on the stack is in cleanup: interrupt right away.
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        # Otherwise retry once the cleanup code has finished.
        sys.setcleanuphook(functools.partial(sigint_handler, 0))

A coroutine example is out of scope of this document, because its
implementation depends very much on a trampoline (or main loop) used
by the coroutine library.


Unresolved Issues
=================

Interruption Inside With Statement Expression
---------------------------------------------

Given the statement ::

    with open(filename):
        do_something()

Python can be interrupted after ``open()`` is called, but before the
``SETUP_WITH`` bytecode is executed. There are two possible
decisions:

* Protect ``with`` expressions. This would require another bytecode,
  since currently there is no way of recognizing the start of the
  ``with`` expression.

* Let the user write a wrapper if they consider it important for the
  use case. A safe wrapper might look like this::

    class FileWrapper(object):

        def __init__(self, filename, mode):
            self.filename = filename
            self.mode = mode

        def __enter__(self):
            self.file = open(self.filename, self.mode)
            return self.file

        def __exit__(self, exc_type, exc_value, traceback):
            self.file.close()

  Alternatively it can be written using the ``contextmanager()``
  decorator::

    from contextlib import contextmanager

    @contextmanager
    def open_wrapper(filename, mode):
        file = open(filename, mode)
        try:
            yield file
        finally:
            file.close()

  This code is safe, as the first part of the generator (before yield)
  is executed inside the ``SETUP_WITH`` bytecode of the caller.
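
The claim that the code before ``yield`` runs as part of entering the ``with`` block, and not when the manager is created, can be checked on current Python. The sketch below calls ``__enter__()`` and ``__exit__()`` by hand to make the ordering visible; ``resource`` and ``events`` are illustrative names only.

```python
from contextlib import contextmanager

events = []


@contextmanager
def resource():
    events.append('acquire')        # runs inside __enter__()
    try:
        yield 'handle'
    finally:
        events.append('release')    # runs inside __exit__()


manager = resource()                # manager created, body has not run yet
events.append('created')
handle = manager.__enter__()        # what the with statement does on entry
events.append('entered')
manager.__exit__(None, None, None)  # what the with statement does on exit

print(events)  # ['created', 'acquire', 'entered', 'release']
```

Because the acquisition happens inside ``__enter__()``, it would fall within the region that the proposed counter covers.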

Exception Propagation
---------------------

Sometimes a ``finally`` clause or an ``__enter__()``/``__exit__()``
method can raise an exception. Usually this is not a problem, since
more important exceptions like ``KeyboardInterrupt`` or ``SystemExit``
should be raised instead. But it may be nice to be able to keep the
original exception inside a ``__context__`` attribute. So the cleanup
hook signature may grow an exception argument::

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(retry_sigint)

    def retry_sigint(frame, exception=None):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt() from exception

.. note::

    There is no need to have three arguments like in the ``__exit__``
    method since there is a ``__traceback__`` attribute on exceptions
    in Python 3.

However, this will set the ``__cause__`` for the exception, which is
not exactly what's intended. So some hidden interpreter logic may be
used to put a ``__context__`` attribute on every exception raised in a
cleanup hook.
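
The difference between the two attributes can be seen with ordinary exception chaining. This sketch uses a ``ValueError`` as a stand-in for an exception raised by cleanup code; the helper names are illustrative.

```python
def explicit_chain():
    try:
        raise ValueError('cleanup failed')
    except ValueError as exc:
        raise KeyboardInterrupt() from exc      # sets __cause__


def implicit_chain():
    try:
        raise ValueError('cleanup failed')
    except ValueError:
        raise KeyboardInterrupt()               # sets only __context__


def capture(func):
    """Run func and return the KeyboardInterrupt it raises."""
    try:
        func()
    except KeyboardInterrupt as exc:
        return exc


explicit = capture(explicit_chain)
implicit = capture(implicit_chain)

print(type(explicit.__cause__).__name__)    # ValueError
print(implicit.__cause__)                   # None
print(type(implicit.__context__).__name__)  # ValueError
```

``raise ... from`` marks the earlier exception as the direct cause, while an exception raised during handling merely records it as context. The text above asks for the latter behavior.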

Interruption Between Acquiring Resource and Try Block
-----------------------------------------------------

The example from the first section is not totally safe. Let's take a
closer look::

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

The problem might occur if the code is interrupted just after
``lock.acquire()`` is executed but before the ``try`` block is
entered.

The code cannot be made safe without modification. The actual fix
depends very much on the use case. Usually code can be fixed using a
``with`` statement::

    with lock:
        do_something()

However, for coroutines one usually can't use the ``with`` statement
because you need to ``yield`` for both the acquire and release
operations. So the code might be rewritten like this::

    try:
        yield lock.acquire()
        do_something()
    finally:
        yield lock.release()

The actual locking code might need more code to support this use case,
but the implementation is usually trivial, like this: check if the
lock has been acquired and unlock it if so.
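
One possible shape for such a lock is sketched below. ``OwnedLock`` and ``critical_section()`` are hypothetical names, the lock is not thread-safe, and a ``RuntimeError`` stands in for an interruption that arrives before the lock is taken.

```python
class OwnedLock:
    """Hypothetical lock whose release() is a no-op unless it is held."""

    def __init__(self):
        self._held = False

    def acquire(self):
        self._held = True

    def release(self):
        if not self._held:
            return False        # acquire() never completed: nothing to undo
        self._held = False
        return True


def critical_section(lock, fail_early, released):
    try:
        if fail_early:
            # Simulated interruption before the lock is taken.
            raise RuntimeError('interrupted before acquire')
        lock.acquire()
    finally:
        released.append(lock.release())


released = []
critical_section(OwnedLock(), False, released)
try:
    critical_section(OwnedLock(), True, released)
except RuntimeError:
    pass

print(released)  # [True, False]
```

With ``acquire()`` moved inside the ``try`` block, the ``finally`` clause may run without the lock being held, which is why ``release()`` has to tolerate that case.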

Handling EINTR Inside a Finally
-------------------------------

Even if a signal handler is prepared to check the ``f_in_cleanup``
flag, ``InterruptedError`` might be raised in the cleanup handler,
because the respective system call returned an ``EINTR`` error. The
primary use cases are prepared to handle this:

* POSIX mutexes never return ``EINTR``

* Networking libraries are always prepared to handle ``EINTR``

* Coroutine libraries are usually interrupted with the ``throw()``
  method, not with a signal

The platform-specific function ``siginterrupt()`` might be used to
remove the need to handle ``EINTR``. However, its consequences can be
hard to predict: for example, a ``SIGINT`` handler is never called if
the main thread is stuck inside an IO routine.

A better approach would be to make code that is commonly used in
cleanup handlers handle ``InterruptedError`` explicitly. An example
of such code might be a file-based lock implementation.

``signal.pthread_sigmask`` can be used to block signals inside
cleanup handlers which can be interrupted with ``EINTR``.
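
A minimal sketch of that last idea follows, assuming a Unix platform where ``signal.pthread_sigmask()`` is available. ``signals_blocked()`` is a hypothetical helper name; the mask applies only to the calling thread.

```python
import signal
from contextlib import contextmanager


@contextmanager
def signals_blocked(signums=(signal.SIGINT,)):
    """Block the given signals in the calling thread for the duration."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signums)
    try:
        yield
    finally:
        # Restore the previous mask; blocked signals that arrived in the
        # meantime stay pending and are delivered once unblocked.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


with signals_blocked():
    # Blocking an empty set returns the current mask without changing it.
    mask_inside = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

A cleanup handler could wrap its system calls in such a block so that ``SIGINT`` cannot interrupt them with ``EINTR``.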

Setting Interruption Context Inside Finally Itself
--------------------------------------------------

Some coroutine libraries may need to set a timeout for the finally
clause itself. For example::

    try:
        do_something()
    finally:
        with timeout(0.5):
            try:
                yield do_slow_cleanup()
            finally:
                yield do_fast_cleanup()

With current semantics, the timeout will either protect the whole
``with`` block or nothing at all, depending on the implementation of
each library. What the author intended is to treat
``do_slow_cleanup`` as ordinary code, and ``do_fast_cleanup`` as a
cleanup (a non-interruptible one).

A similar case might occur when using greenlets or tasklets.

This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
by calling a cleanup hook on each decrement. A coroutine library may
then remember the value at timeout start, and compare it on each hook
execution.

But in practice, the example is considered to be too obscure to take
into account.

Modifying KeyboardInterrupt
---------------------------

It should be decided if the default ``SIGINT`` handler should be
modified to use the described mechanism. The initial proposition is
to keep the old behavior, for two reasons:

* Most applications do not care about cleanup on exit (either they do
  not have external state, or they modify it in a crash-safe way).

* Cleanup may take too much time, not giving the user a chance to
  interrupt an application.

The latter case can be fixed by allowing an unsafe break if a
``SIGINT`` handler is called twice, but it seems not worth the
complexity.


Alternative Python Implementations Support
==========================================

We consider ``f_in_cleanup`` an implementation detail. The actual
implementation may have some fake frame-like object passed to the
signal handler and the cleanup hook, and returned from
``getcleanupframe()``. The only requirement is that the ``inspect``
module functions work as expected on these objects. For this reason,
we also allow passing a generator object to the ``isframeincleanup()``
function, which removes the need to use the ``gi_frame`` attribute.

It might be necessary to specify that ``getcleanupframe()`` must
return the same object that will be passed to the cleanup hook at the
next invocation.


Alternative Names
=================

The original proposal had an ``f_in_finally`` frame attribute, as the
original intention was to protect ``finally`` clauses. But as it grew
to protect ``__enter__`` and ``__exit__`` methods too, the
``f_in_cleanup`` name seems better. Although the ``__enter__`` method
is not a cleanup routine, it at least relates to cleanup done by
context managers.

``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` could
be spelled more readably as ``set_cleanup_hook``,
``is_frame_in_cleanup`` and ``get_cleanup_frame``, although the
current names follow the naming convention of their respective
modules.


Alternative Proposals
=====================

Propagating 'f_in_cleanup' Flag Automatically
---------------------------------------------

This can make ``getcleanupframe()`` unnecessary. But for yield-based
coroutines you need to propagate it yourself. Making it writable
leads to somewhat unpredictable behavior of ``setcleanuphook()``.

Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
--------------------------------------------

These bytecodes can be used to protect the expression inside the
``with`` statement, and to make counter increments more explicit and
easy to debug (visible inside a disassembly). Some middle ground
might be chosen, like ``END_FINALLY`` and ``SETUP_WITH`` implicitly
decrementing the counter (``END_FINALLY`` is present at the end of
every ``with`` suite).

However, adding new bytecodes must be considered very carefully.

Expose 'f_in_cleanup' as a Counter
----------------------------------

The original intention was to expose a minimum of needed
functionality. However, as we consider the frame flag
``f_in_cleanup`` an implementation detail, we may expose it as a
counter.

Similarly, if we have a counter we may need to have the cleanup hook
called on every counter decrement. It's unlikely to have much
performance impact as nested finally clauses are an uncommon case.

Add Code Object Flag 'CO_CLEANUP'
---------------------------------

As an alternative to setting the flag inside the ``SETUP_WITH`` and
``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
When the interpreter starts to execute code with ``CO_CLEANUP`` set,
it sets ``f_in_cleanup`` for the whole function body. This flag is
set for code objects of ``__enter__`` and ``__exit__`` special
methods. Technically it might be set on functions called
``__enter__`` and ``__exit__``.

This seems to be a less clear solution. It also covers the case where
``__enter__`` and ``__exit__`` are called manually. This may be
accepted either as a feature or as an unnecessary side-effect (or,
though unlikely, as a bug).

It may also pose a problem when ``__enter__`` or ``__exit__``
functions are implemented in C, as there is no code object to check
for the ``f_in_cleanup`` flag.

Have Cleanup Callback on Frame Object Itself
--------------------------------------------

The frame object may be extended to have a ``f_cleanup_callback``
member which is called when ``f_in_cleanup`` is reset to 0. This
would help to register different callbacks to different coroutines.

Despite its apparent beauty, this solution doesn't add anything, as
the two primary use cases are:

* Setting the callback in a signal handler. The callback is
  inherently a single one for this case.

* Using a single callback per loop for the coroutine use case. Here,
  in almost all cases, there is only one loop per thread.

No Cleanup Hook
---------------

The original proposal included no cleanup hook specification, as there
are a few ways to achieve the same using current tools:

* Using ``sys.settrace()`` and the ``f_trace`` callback. This may
  pose some problems for debugging, and has a big performance impact
  (although interrupting doesn't happen very often).

* Sleeping a bit more and trying again. For a coroutine library this
  is easy. For signals it may be achieved using ``signal.alarm``.

Both methods are considered too impractical, so a way to catch exit
from ``finally`` clauses is proposed instead.


References
==========

.. [1] Monocle
   https://github.com/saucelabs/monocle

.. [2] Bluelet
   https://github.com/sampsyo/bluelet

.. [3] Twisted: inlineCallbacks
   https://twisted.org/documents/8.1.0/api/twisted.internet.defer.html

.. [4] Original discussion
   https://mail.python.org/pipermail/python-ideas/2012-April/014705.html

.. [5] Implementation of PEP 419
   https://github.com/python/cpython/issues/58935


Copyright
=========

This document has been placed in the public domain.