PEP: 419
Title: Protecting cleanup statements from interruptions
Version: $Revision$
Last-Modified: $Date$
Author: Paul Colomiets <paul@colomiets.name>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Apr-2012
Python-Version: 3.3


Abstract
========

This PEP proposes a way to protect Python code from being interrupted
inside a ``finally`` clause or a context manager.


Rationale
=========

Python has two convenient ways to do cleanup. One is the ``finally``
clause and the other is a context manager (used through the ``with``
statement). However, neither of them is protected from interruption
by ``KeyboardInterrupt`` or ``generator.throw()``. For example::

    lock.acquire()
    try:
        print('starting')
        do_something()
    finally:
        print('finished')
        lock.release()

If ``KeyboardInterrupt`` occurs just after the second ``print()`` call
is executed, the lock will not be released. Similarly, the following
code using the ``with`` statement is affected::

    from threading import Lock

    class MyLock:

        def __init__(self):
            self._lock_impl = Lock()

        def __enter__(self):
            self._lock_impl.acquire()
            print("LOCKED")

        def __exit__(self, exc_type, exc_value, traceback):
            print("UNLOCKING")
            self._lock_impl.release()

    lock = MyLock()
    with lock:
        do_something()

If ``KeyboardInterrupt`` occurs near any of the ``print()`` calls, the
lock will never be released.


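The failure described above can be reproduced deterministically. The
following is a minimal sketch, not part of the proposal: the helper
``noisy_print`` is a hypothetical stand-in for ``print()`` that raises
``KeyboardInterrupt`` itself, simulating a Ctrl-C that arrives between
the second ``print()`` call and ``lock.release()``.

```python
import threading

lock = threading.Lock()


def noisy_print(message):
    """Stand-in for print() that is 'interrupted' right after it runs."""
    print(message)
    if message == "finished":
        # Simulate Ctrl-C arriving between print() and lock.release().
        raise KeyboardInterrupt


def critical_section():
    lock.acquire()
    try:
        noisy_print("starting")
    finally:
        noisy_print("finished")
        lock.release()  # never reached in this simulation


try:
    critical_section()
except KeyboardInterrupt:
    pass

lock_still_held = lock.locked()
print("lock still held:", lock_still_held)
```

The lock stays held after the interruption, so any later attempt to
acquire it would block forever.

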
Coroutine Use Case
------------------

A similar case occurs with coroutines. Coroutine libraries usually
want to interrupt a coroutine with a timeout. The
``generator.throw()`` method serves this use case, but there is no
way to know whether the coroutine is currently suspended inside a
``finally`` clause.

An example that uses yield-based coroutines follows. The code looks
similar with any of the popular coroutine libraries: Monocle [1]_,
Bluelet [2]_, or Twisted [3]_. ::

    def run_locked():
        yield connection.sendall('LOCK')
        try:
            yield do_something()
            yield do_something_else()
        finally:
            yield connection.sendall('UNLOCK')

    with timeout(5):
        yield run_locked()

In the example above, ``yield something`` means: pause the current
coroutine and run the coroutine ``something`` until it finishes
execution. The coroutine library therefore keeps the stack of
generators itself. The ``connection.sendall()`` call waits until the
socket is writable and then does something similar to what
``socket.sendall()`` does.

The ``with`` statement ensures that all of the code is executed
within a 5-second timeout. It does so by registering a callback in
the main loop, which calls ``generator.throw()`` on the top-most
frame in the coroutine stack when the timeout happens.

The ``greenlets`` extension works in a similar way, except that it
doesn't need ``yield`` to enter a new stack frame. Otherwise, the
considerations are similar.


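The hazard described in this section can be shown with a plain
generator and no coroutine library at all. This is a minimal sketch
under stated assumptions: the ``Timeout`` exception and the ``events``
list are hypothetical, and the generator is driven by hand instead of
by a trampoline.

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw on timeout."""


events = []


def run_locked():
    events.append("LOCK")
    try:
        yield "working"
    finally:
        yield "sending UNLOCK"     # the generator is suspended inside finally
        events.append("UNLOCK")    # runs only if the generator is resumed


gen = run_locked()
assert next(gen) == "working"          # paused inside the try block
assert next(gen) == "sending UNLOCK"   # paused inside the finally suite

try:
    gen.throw(Timeout())               # timeout fires during cleanup
except Timeout:
    pass

print(events)  # ['LOCK']
```

Because the exception is thrown while the generator is paused inside
the ``finally`` suite, the rest of the cleanup never runs, and the
caller has no way to tell in advance that the generator was in that
state.

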
Specification
=============

Frame Flag 'f_in_cleanup'
-------------------------

A new flag on the frame object is proposed. It is set to ``True`` if
this frame is currently executing a ``finally`` suite. Internally, it
must be implemented as a counter of the nested ``finally`` suites
currently being executed.

The internal counter is also incremented on entering the
``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and decremented on
leaving them. This makes it possible to protect the ``__enter__`` and
``__exit__`` methods too.


Function 'sys.setcleanuphook'
-----------------------------

A new function for the ``sys`` module is proposed. This function sets
a callback which is executed every time ``f_in_cleanup`` becomes
``False``. The callback gets the ``frame`` as its sole argument, so
it can get some evidence of where it is called from.

The setting is thread local and is stored inside the
``PyThreadState`` structure.


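The intended interaction between the counter and the hook can be
modelled in pure Python. The sketch below is only an emulation: the
names ``FakeFrame``, ``cleanup_suite`` and the module-level
``setcleanuphook`` are hypothetical stand-ins, since the real flag and
``sys.setcleanuphook`` are proposed here and would live inside the
interpreter.

```python
import threading
from contextlib import contextmanager


class FakeFrame:
    """Frame-like object carrying the proposed cleanup counter."""

    def __init__(self, f_back=None):
        self.f_back = f_back
        self._cleanup_depth = 0   # nested cleanup suites being executed

    @property
    def f_in_cleanup(self):
        return self._cleanup_depth > 0


_state = threading.local()        # the hook is per thread, as in the text


def setcleanuphook(hook):
    """Emulates the proposed sys.setcleanuphook() for FakeFrame objects."""
    _state.hook = hook


@contextmanager
def cleanup_suite(frame):
    """Marks a region that the interpreter would treat as cleanup code."""
    frame._cleanup_depth += 1
    try:
        yield
    finally:
        frame._cleanup_depth -= 1
        hook = getattr(_state, "hook", None)
        if not frame.f_in_cleanup and hook is not None:
            hook(frame)           # fires only when the flag becomes False


calls = []
setcleanuphook(lambda frame: calls.append(frame.f_in_cleanup))

frame = FakeFrame()
with cleanup_suite(frame):            # outer finally suite
    with cleanup_suite(frame):        # nested finally suite
        assert frame.f_in_cleanup
    assert calls == []                # still inside the outer suite
assert calls == [False]               # hook ran once, after the outermost exit
```

In this model the hook is not called when the nested suite ends,
because the counter is still positive; it is called exactly once, when
the outermost suite is left.

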
Inspect Module Enhancements
---------------------------

Two new functions are proposed for the ``inspect`` module:
``isframeincleanup`` and ``getcleanupframe``.

``isframeincleanup``, given a ``frame`` object or a ``generator``
object as its sole argument, returns the value of the
``f_in_cleanup`` attribute of the frame itself or of the ``gi_frame``
attribute of the generator.

``getcleanupframe``, given a ``frame`` object as its sole argument,
returns the innermost frame which has a true value of
``f_in_cleanup``, or ``None`` if no frame in the stack has the
attribute set. It starts inspecting from the specified frame and
walks to outer frames using ``f_back`` pointers, just like
``getouterframes`` does.


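The behaviour of the two functions can be sketched in a few lines of
pure Python. This is an illustration only, assuming that frames expose
the proposed ``f_in_cleanup`` attribute; the stack below is faked with
``types.SimpleNamespace`` objects, because real frames have no such
attribute.

```python
import inspect
from types import SimpleNamespace


def isframeincleanup(obj):
    """Sketch: accept a frame-like object or a generator."""
    if inspect.isgenerator(obj):
        obj = obj.gi_frame
    return bool(getattr(obj, "f_in_cleanup", False))


def getcleanupframe(frame):
    """Sketch: innermost frame with f_in_cleanup set, or None."""
    while frame is not None:
        if isframeincleanup(frame):
            return frame
        frame = frame.f_back
    return None


# Fake three-frame stack: outer <- middle (in cleanup) <- inner.
outer = SimpleNamespace(name="outer", f_in_cleanup=False, f_back=None)
middle = SimpleNamespace(name="middle", f_in_cleanup=True, f_back=outer)
inner = SimpleNamespace(name="inner", f_in_cleanup=False, f_back=middle)

found = getcleanupframe(inner)
print(found.name)                  # middle
print(getcleanupframe(outer))      # None
```

Starting from ``inner``, the walk stops at ``middle``, the first frame
on the way out that is executing cleanup code.

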
Example
=======

An example implementation of a ``SIGINT`` handler that interrupts
safely might look like::

    import inspect, sys, functools

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            # No frame in the stack is inside cleanup code.
            raise KeyboardInterrupt()
        # Otherwise retry as soon as the cleanup code is left.
        sys.setcleanuphook(functools.partial(sigint_handler, 0))

A coroutine example is out of the scope of this document, because its
implementation depends very much on the trampoline (or main loop)
used by the coroutine library.


Unresolved Issues
=================

Interruption Inside With Statement Expression
---------------------------------------------

Given the statement::

    with open(filename):
        do_something()

Python can be interrupted after ``open()`` is called, but before the
``SETUP_WITH`` bytecode is executed. There are two possible
decisions:

* Protect the expression inside the ``with`` statement. This would
  need another bytecode, since currently there is no delimiter at the
  start of the ``with`` expression.

* Let users write a wrapper if they consider it important for their
  use case. A safe wrapper might look like the following::

    class FileWrapper(object):

        def __init__(self, filename, mode):
            self.filename = filename
            self.mode = mode

        def __enter__(self):
            self.file = open(self.filename, self.mode)
            return self.file

        def __exit__(self, exc_type, exc_value, traceback):
            self.file.close()

  Alternatively, it can be written using the ``contextmanager``
  decorator::

    from contextlib import contextmanager

    @contextmanager
    def open_wrapper(filename, mode):
        file = open(filename, mode)
        try:
            yield file
        finally:
            file.close()

  This code is safe, because the first part of the generator (before
  ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the
  caller.


Exception Propagation
---------------------

Sometimes a ``finally`` suite or an ``__enter__``/``__exit__`` method
can be exited with an exception. Usually this is not a problem, since
a more important exception like ``KeyboardInterrupt`` or
``SystemExit`` should be raised instead. But it may be nice to be
able to keep the original exception in the ``__context__`` attribute.
So the cleanup hook signature may grow an exception argument::

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(retry_sigint)

    def retry_sigint(frame, exception=None):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt() from exception

.. note::

    There is no need to have three arguments like in the ``__exit__``
    method, since the exception carries a ``__traceback__`` attribute
    in Python 3.

However, this sets ``__cause__`` on the exception, which is not
exactly what is intended. So some hidden interpreter logic may be
used to put the ``__context__`` attribute on every exception raised
in the cleanup hook.


Interruption Between Acquiring Resource and Try Block
-----------------------------------------------------

The example from the first section is not totally safe. Let's look
closer::

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

An interruption between ``lock.acquire()`` and the ``try`` statement
leaves the lock held, and there is no way to fix this without
modifying the code. The actual fix depends very much on the use case.

Usually the code can be fixed by using a ``with`` statement::

    with lock:
        do_something()

For coroutines, however, you usually can't use the ``with`` statement,
because you need to ``yield`` for both the acquire and the release
operations. So the code might be rewritten as follows::

    try:
        yield lock.acquire()
        do_something()
    finally:
        yield lock.release()

The actual lock code might need more code to support this use case,
but the implementation is usually trivial: check whether the lock has
been acquired and release it only if it has.


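The "release only if acquired" check mentioned above might look like
the following. This is a simplified sketch: ``CoroutineLock`` and
``worker`` are hypothetical names, the lock does no real waiting, and
``generator.close()`` stands in for an interruption delivered by a
coroutine library.

```python
class CoroutineLock:
    """Hypothetical lock whose release() is a no-op when not held."""

    def __init__(self):
        self._held = False

    def acquire(self):
        self._held = True

    def release(self):
        # Tolerate release() when the acquire step never completed.
        if self._held:
            self._held = False
            return True
        return False


def worker(lock, log):
    try:
        yield "acquiring"      # a real library would wait for the lock here
        lock.acquire()
        yield "working"
    finally:
        log.append(lock.release())


lock, log = CoroutineLock(), []

gen = worker(lock, log)
next(gen)
gen.close()                    # interrupted before the lock was taken

gen = worker(lock, log)
next(gen)
next(gen)
gen.close()                    # interrupted while holding the lock

print(log)  # [False, True]
```

In the first run the ``finally`` suite executes without the lock
having been taken, and ``release()`` simply reports that there was
nothing to do.

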
Setting Interruption Context Inside Finally Itself
--------------------------------------------------

Some coroutine libraries may need to set a timeout for the
``finally`` clause itself. For example::

    try:
        do_something()
    finally:
        with timeout(0.5):
            try:
                yield do_slow_cleanup()
            finally:
                yield do_fast_cleanup()

With the current semantics, the timeout will either protect the whole
``with`` block or nothing at all, depending on the implementation of
the library. What the author intended is to treat
``do_slow_cleanup`` as ordinary code, and ``do_fast_cleanup`` as
cleanup (a non-interruptible one).

A similar case might occur when using greenlets or tasklets.

This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
by calling the cleanup hook on each decrement. A coroutine library may
then remember the value at the start of the timeout, and compare it on
each hook execution.

In practice, however, this example is considered too obscure to take
into account.


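The counter comparison suggested above can be illustrated with plain
integers. This is a sketch under assumptions: ``TimeoutScope`` is a
hypothetical helper, and the depth values are written out by hand
instead of being read from a frame.

```python
class TimeoutScope:
    """Hypothetical timeout that remembers the cleanup depth at its start."""

    def __init__(self, depth_at_start):
        self.depth_at_start = depth_at_start

    def may_interrupt(self, current_depth):
        # Code at the same depth as the timeout itself is ordinary code from
        # the timeout's point of view; anything deeper is nested cleanup.
        return current_depth <= self.depth_at_start


# The timeout block starts inside the outer finally suite: depth 1.
scope = TimeoutScope(depth_at_start=1)

slow_ok = scope.may_interrupt(current_depth=1)   # do_slow_cleanup()
fast_ok = scope.may_interrupt(current_depth=2)   # do_fast_cleanup()
print(slow_ok, fast_ok)  # True False
```

With this rule the slow cleanup can be timed out, while the fast
cleanup in the nested ``finally`` suite stays protected.

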
Alternative Python Implementations Support
==========================================

We consider ``f_in_cleanup`` an implementation detail. The actual
implementation may have some fake frame-like object passed to the
signal handler and the cleanup hook, and returned from
``getcleanupframe``. The only requirement is that the ``inspect``
module functions work as expected on these objects. For this reason,
we also allow passing a ``generator`` object to the
``isframeincleanup`` function, which removes the need to use the
``gi_frame`` attribute.

It may need to be specified that ``getcleanupframe`` must return the
same object that will be passed to the cleanup hook at its next
invocation.


Alternative Names
=================

The original proposal had an ``f_in_finally`` flag. The original
intention was to protect ``finally`` clauses. But as the proposal grew
to protect the ``__enter__`` and ``__exit__`` methods too, the
``f_in_cleanup`` name seems better. Although the ``__enter__`` method
is not a cleanup routine, it at least relates to the cleanup done by
context managers.

``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` can
be made more readable as ``set_cleanup_hook``,
``is_frame_in_cleanup`` and ``get_cleanup_frame``, although the
current names follow the conventions of their respective modules.


Alternative Proposals
=====================

Propagating 'f_in_cleanup' Flag Automatically
---------------------------------------------

This can make ``getcleanupframe`` unnecessary. But for yield-based
coroutines you need to propagate it yourself. Making it writable
leads to somewhat unpredictable behavior of ``setcleanuphook``.


Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
--------------------------------------------

These bytecodes can be used to protect the expression inside the
``with`` statement, as well as to make counter increments more
explicit and easier to debug (visible in a disassembly). Some middle
ground might be chosen, such as ``END_FINALLY`` and ``SETUP_WITH``
implicitly decrementing the counter (``END_FINALLY`` is present at
the end of every ``with`` suite).

However, adding new bytecodes must be considered very carefully.


Expose 'f_in_cleanup' as a Counter
----------------------------------

The original intention was to expose the minimum needed
functionality. However, as we consider the frame flag
``f_in_cleanup`` an implementation detail, we may expose it as a
counter.

Similarly, if we have a counter, we may need to have the cleanup hook
called on every counter decrement. This is unlikely to have much
performance impact, as nested ``finally`` clauses are an uncommon
case.


Add code object flag 'CO_CLEANUP'
---------------------------------

As an alternative to setting the flag inside the ``SETUP_WITH`` and
``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
When the interpreter starts to execute code with ``CO_CLEANUP`` set,
it sets ``f_in_cleanup`` for the whole function body. This flag is
set for the code objects of the ``__enter__`` and ``__exit__`` special
methods. Technically, it might be set on any function called
``__enter__`` or ``__exit__``.

This seems to be a less clear solution. It also covers the case where
``__enter__`` and ``__exit__`` are called manually. This may be
accepted either as a feature or as an unnecessary side effect
(unlikely as a bug).

It may also pose a problem when the ``__enter__`` or ``__exit__``
functions are implemented in C, as there is usually no frame to check
for the ``f_in_cleanup`` flag.


Have Cleanup Callback on Frame Object Itself
--------------------------------------------

The frame may be extended to have an ``f_cleanup_callback`` member
which is called when ``f_in_cleanup`` is reset to 0. This would help
to register different callbacks for different coroutines.

Despite its apparent beauty, this solution doesn't add anything, as
there are two primary use cases:

* Setting the callback in a signal handler. The callback is
  inherently a single one for this case.

* Using a single callback per loop for the coroutine use case. In
  almost all cases there is only one loop per thread.


No Cleanup Hook
---------------

The original proposal included no cleanup hook specification, as
there are a few ways to achieve the same using current tools:

* Use ``sys.settrace()`` and the ``f_trace`` callback. This may
  interfere with debugging, and has a big performance impact
  (although interrupting doesn't happen very often).

* Sleep a bit more and try again. For a coroutine library this is
  easy. For signals it may be achieved using ``signal.alarm()``.

Both methods are considered too impractical, so a way to catch the
exit from a ``finally`` suite is proposed.


References
==========

.. [1] Monocle
   https://github.com/saucelabs/monocle

.. [2] Bluelet
   https://github.com/sampsyo/bluelet

.. [3] Twisted: inlineCallbacks
   http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html

.. [4] Original discussion
   http://mail.python.org/pipermail/python-ideas/2012-April/014705.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: