2021-01-19 07:46:22 -05:00
|
|
|
|
PEP: 651
|
2021-01-20 12:12:42 -05:00
|
|
|
|
Title: Robust Stack Overflow Handling
|
2021-01-19 07:46:22 -05:00
|
|
|
|
Author: Mark Shannon <mark@hotpy.org>
|
|
|
|
|
Status: Draft
|
|
|
|
|
Type: Standards Track
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 18-Jan-2021
|
|
|
|
|
Post-History: 19-Jan-2021
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
2021-01-25 05:32:38 -05:00
|
|
|
|
This PEP proposes that Python should treat machine stack overflow differently from runaway recursion.
|
|
|
|
|
|
2021-01-19 07:46:22 -05:00
|
|
|
|
This would allow programs to set the maximum recursion depth to fit their needs
|
|
|
|
|
and provide additional safety guarantees.
|
|
|
|
|
|
2021-01-20 10:20:35 -05:00
|
|
|
|
If this PEP is accepted, then the following program will run safely to completion::
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
sys.setrecursionlimit(1_000_000)
|
|
|
|
|
|
|
|
|
|
def f(n):
|
|
|
|
|
if n:
|
|
|
|
|
f(n-1)
|
|
|
|
|
|
|
|
|
|
f(500_000)
|
|
|
|
|
|
2021-01-20 10:20:35 -05:00
|
|
|
|
and the following program will raise a ``StackOverflow``, without causing a VM crash::
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
sys.setrecursionlimit(1_000_000)
|
|
|
|
|
|
|
|
|
|
class X:
|
|
|
|
|
def __add__(self, other):
|
|
|
|
|
return self + other
|
|
|
|
|
|
|
|
|
|
X() + 1
|
|
|
|
|
|
|
|
|
|
Motivation
|
|
|
|
|
==========
|
|
|
|
|
|
|
|
|
|
CPython uses a single recursion depth counter to prevent both runaway recursion and C stack overflow.
|
|
|
|
|
However, runaway recursion and machine stack overflow are two different things.
|
|
|
|
|
Allowing machine stack overflow is a potential security vulnerability, but limiting recursion depth can prevent the
|
|
|
|
|
use of some algorithms in Python.
|
|
|
|
|
|
|
|
|
|
Currently, if a program needs to deeply recurse it must manage the maximum recursion depth allowed,
|
|
|
|
|
hopefully managing to set it in the region between the minimum needed to run correctly and the maximum that is safe
|
|
|
|
|
to avoid a memory protection error.
|
|
|
|
|
|
|
|
|
|
By separating the checks for C stack overflow from checks for recursion depth,
|
|
|
|
|
pure Python programs can run safely, using whatever level of recursion they require.
|
|
|
|
|
|
|
|
|
|
Rationale
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
CPython currently relies on a single limit to guard against potentially dangerous stack overflow
|
|
|
|
|
in the virtual machine and to guard against run away recursion in the Python program.
|
|
|
|
|
|
|
|
|
|
This is a consequence of the implementation which couples the C and Python call stacks.
|
|
|
|
|
By breaking this coupling, we can improve both the usability of CPython and its safety.
|
|
|
|
|
|
|
|
|
|
The recursion limit exists to protect against runaway recursion, the integrity of the virtual machine should not depend on it.
|
|
|
|
|
Similarly, recursion should not be limited by implementation details.
|
|
|
|
|
|
|
|
|
|
Specification
|
|
|
|
|
=============
|
|
|
|
|
|
|
|
|
|
Two new exception classes will be added, ``StackOverflow`` and ``RecursionOverflow``, both of which will be
|
|
|
|
|
sub-classes of ``RecursionError``
|
|
|
|
|
|
|
|
|
|
StackOverflow exception
|
|
|
|
|
-----------------------
|
|
|
|
|
|
|
|
|
|
A ``StackOverflow`` exception will be raised whenever the interpreter or builtin module code
|
|
|
|
|
determines that the C stack is at or nearing a limit of safety.
|
|
|
|
|
``StackOverflow`` is a sub-class of ``RecursionError``,
|
|
|
|
|
so any code that handles ``RecursionError`` will handle ``StackOverflow``
|
|
|
|
|
|
|
|
|
|
RecursionOverflow exception
|
|
|
|
|
---------------------------
|
|
|
|
|
|
|
|
|
|
A ``RecursionOverflow`` exception will be raised when a call to a Python function
|
|
|
|
|
causes the recursion limit to be exceeded.
|
|
|
|
|
This is a slight change from current behavior which raises a ``RecursionError``.
|
|
|
|
|
``RecursionOverflow`` is a sub-class of ``RecursionError``,
|
|
|
|
|
so any code that handles ``RecursionError`` will continue to work as before.
|
|
|
|
|
|
|
|
|
|
Decoupling the Python stack from the C stack
|
|
|
|
|
--------------------------------------------
|
|
|
|
|
|
|
|
|
|
In order to provide the above guarantees and ensure that any program that worked previously
|
|
|
|
|
continues to do so, the Python and C stack will need to be separated.
|
|
|
|
|
That is, calls to Python functions from Python functions, should not consume space on the C stack.
|
|
|
|
|
Calls to and from builtin functions will continue to consume space on the C stack.
|
|
|
|
|
|
|
|
|
|
The size of the C stack will be implementation defined, and may vary from machine to machine.
|
|
|
|
|
It may even differ between threads. However, there is an expectation that any code that could run
|
|
|
|
|
with the recursion limit set to the previous default value, will continue to run.
|
|
|
|
|
|
|
|
|
|
Many operations in Python perform some sort of call at the C level.
|
|
|
|
|
Most of these will continue to consume C stack, and will result in a
|
|
|
|
|
``StackOverflow`` exception if uncontrolled recursion occurs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Other Implementations
|
|
|
|
|
---------------------
|
|
|
|
|
|
|
|
|
|
Other implementations are required to fail safely regardless of what value the recursion limit is set to.
|
|
|
|
|
|
|
|
|
|
If the implementation couples the Python stack to the underlying VM or hardware stack,
|
|
|
|
|
then it should raise a ``RecursionOverflow`` exception when the recursion limit is exceeded,
|
|
|
|
|
but the underlying stack does not overflow.
|
|
|
|
|
If the underlying stack overflows, or is near to overflow,
|
|
|
|
|
then a ``StackOverflow`` exception should be raised.
|
|
|
|
|
|
|
|
|
|
C-API
|
|
|
|
|
-----
|
|
|
|
|
|
2021-01-20 11:23:53 -05:00
|
|
|
|
A new function, ``Py_CheckStackDepth()`` will be added, and the behavior of ``Py_EnterRecursiveCall()`` will be modified slightly.
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
Py_CheckStackDepth()
|
|
|
|
|
''''''''''''''''''''
|
|
|
|
|
|
|
|
|
|
``int Py_CheckStackDepth(const char *where)``
|
|
|
|
|
will return 0 if there is no immediate danger of C stack overflow.
|
|
|
|
|
It will return -1 and set an exception, if the C stack is near to overflowing.
|
2021-01-22 11:48:00 -05:00
|
|
|
|
The ``where`` parameter is used in the exception message, in the same fashion
|
|
|
|
|
as the ``where`` parameter of ``Py_EnterRecursiveCall()``.
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
Py_EnterRecursiveCall()
|
|
|
|
|
'''''''''''''''''''''''
|
|
|
|
|
|
2021-01-20 11:23:53 -05:00
|
|
|
|
``Py_EnterRecursiveCall()`` will be modified to call ``Py_CheckStackDepth()`` before performing its current function.
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
PyLeaveRecursiveCall()
|
|
|
|
|
''''''''''''''''''''''
|
|
|
|
|
|
2021-01-20 11:23:53 -05:00
|
|
|
|
``Py_LeaveRecursiveCall()`` will remain unchanged.
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
Backwards Compatibility
|
|
|
|
|
=======================
|
|
|
|
|
|
|
|
|
|
This feature is fully backwards compatibile at the Python level.
|
|
|
|
|
Some low-level tools, such as machine-code debuggers, will need to be modified.
|
|
|
|
|
For example, the gdb scripts for Python will need to be aware that there may be more than one Python frame
|
|
|
|
|
per C frame.
|
|
|
|
|
|
|
|
|
|
C code that uses the ``Py_EnterRecursiveCall()``, ``PyLeaveRecursiveCall()`` pair of
|
2021-01-20 11:23:53 -05:00
|
|
|
|
functions will continue to work correctly. In addition, ``Py_EnterRecursiveCall()``
|
|
|
|
|
may raise a ``StackOverflow`` exception.
|
|
|
|
|
|
|
|
|
|
New code should use the ``Py_CheckStackDepth()`` function, unless the code wants to
|
|
|
|
|
count as a Python function call with regard to the recursion limit.
|
|
|
|
|
|
|
|
|
|
We recommend that "python-like" code, such as Cython-generated functions,
|
|
|
|
|
use ``Py_EnterRecursiveCall()``, but other code use ``Py_CheckStackDepth()``.
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Security Implications
|
|
|
|
|
=====================
|
|
|
|
|
|
|
|
|
|
It will no longer be possible to crash the CPython virtual machine through recursion.
|
|
|
|
|
|
|
|
|
|
Performance Impact
|
|
|
|
|
==================
|
|
|
|
|
|
2021-02-03 09:06:23 -05:00
|
|
|
|
It is unlikely that the performance impact will be significant.
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
The additional logic required will probably have a very small negative impact on performance.
|
|
|
|
|
The improved locality of reference from reduced C stack use should have some small positive impact.
|
|
|
|
|
|
|
|
|
|
It is hard to predict whether the overall effect will be positive or negative,
|
|
|
|
|
but it is quite likely that the net effect will be too small to be measured.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Implementation
|
|
|
|
|
==============
|
|
|
|
|
|
2021-01-22 11:48:00 -05:00
|
|
|
|
Monitoring C stack consumption
|
|
|
|
|
------------------------------
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
Gauging whether a C stack overflow is imminent is difficult. So we need to be conservative.
|
|
|
|
|
We need to determine a safe bounds for the stack, which is not something possible in portable C code.
|
|
|
|
|
|
|
|
|
|
For major platforms, the platform specific API will be used to provide an accurate stack bounds.
|
|
|
|
|
However, for minor platforms some amount of guessing may be required.
|
|
|
|
|
While this might sound bad, it is no worse than the current situation, where we guess that the
|
|
|
|
|
size of the C stack is at least 1000 times the stack space required for the chain of calls from
|
|
|
|
|
``_PyEval_EvalFrameDefault`` to ``_PyEval_EvalFrameDefault``.
|
|
|
|
|
|
|
|
|
|
This means that in some cases the amount of recursion possible may be reduced.
|
|
|
|
|
In general, however, the amount of recursion possible should be increased, as many calls will use no C stack.
|
|
|
|
|
|
|
|
|
|
Our general approach to determining a limit for the C stack is to get an address within the current C frame,
|
|
|
|
|
as early as possible in the call chain. The limit can then be guessed by adding some constant to that.
|
|
|
|
|
|
2021-01-22 11:48:00 -05:00
|
|
|
|
Making Python-to-Python calls without consuming the C stack
|
|
|
|
|
-----------------------------------------------------------
|
|
|
|
|
|
|
|
|
|
Calls in the interpreter are handled by the ``CALL_FUNCTION``,
|
|
|
|
|
``CALL_FUNCTION_KW``, ``CALL_FUNCTION_EX`` and ``CALL_METHOD`` instructions.
|
|
|
|
|
The code for those instructions will be modified so that when
|
|
|
|
|
a Python function or method is called, instead of making a call in C,
|
|
|
|
|
the interpreter will setup the callee's frame and continue interpretation as normal.
|
|
|
|
|
|
|
|
|
|
The ``RETURN_VALUE`` instruction will perform the reverse operation,
|
|
|
|
|
except when the current frame is the entry frame of the interpreter
|
|
|
|
|
when it will return as normal.
|
|
|
|
|
|
2021-01-19 07:46:22 -05:00
|
|
|
|
|
|
|
|
|
Rejected Ideas
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
None, as yet.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Open Issues
|
|
|
|
|
===========
|
|
|
|
|
|
|
|
|
|
None, as yet.
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document is placed in the public domain or under the
|
|
|
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
coding: utf-8
|
|
|
|
|
End:
|
|
|
|
|
|