PEP: 311 Title: Simplified Global Interpreter Lock Acquisition for Extensions Version: $Revision$ Last-Modified: $Date$ Author: Mark Hammond Status: Draft Type: Standards Track Content-Type: text/plain Created: 05-Feb-2003 Post-History: 05-Feb-2003 14-Feb-2003 Open Issues This is where I note comments from people that are yet to be resolved. - JustvR prefers a PyGIL prefix over PyAutoThreadState. - JackJ notes that the "Auto" prefix will look a little silly in a few years, assuming this becomes the standard way of managing the lock. He doesn't really like Just's "GIL", and suggested "PyIntLock" - JackJ prefers "Acquire" over "Ensure", even though the semantics are different than for other "Acquire" functions in the API. Mark still prefers Ensure for exactly this reason. - Mark notes Dutch people must love names, and still remembers "pulling dead cows from the ditch" (but has forgotten the Dutch!) He also hopes Jack remembers the reference . - Should we provide Py_AUTO_THREAD_STATE macros? - Is my "Limitation" regarding PyEval_InitThreads() OK? Abstract This PEP proposes a simplified API for access to the Global Interpreter Lock (GIL) for Python extension modules. Specifically, it provides a solution for authors of complex multi-threaded extensions, where the current state of Python (i.e., the state of the GIL is unknown. This PEP proposes a new API, for platforms built with threading support, to manage the Python thread state. An implementation strategy is proposed, along with an initial, platform independent implementation. Rationale The current Python interpreter state API is suitable for simple, single-threaded extensions, but quickly becomes incredibly complex for non-trivial, multi-threaded extensions. Currently Python provides two mechanisms for dealing with the GIL: - Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros. These macros are provided primarily to allow a simple Python extension that already owns the GIL to temporarily release it while making an "external" (ie, non-Python), generally expensive, call. Any existing Python threads that are blocked waiting for the GIL are then free to run. While this is fine for extensions making calls from Python into the outside world, it is no help for extensions that need to make calls into Python when the thread state is unknown. - PyThreadState and PyInterpreterState APIs. These API functions allow an extension/embedded application to acquire the GIL, but suffer from a serious boot-strapping problem - they require you to know the state of the Python interpreter and of the GIL before they can be used. One particular problem is for extension authors that need to deal with threads never before seen by Python, but need to call Python from this thread. It is very difficult, delicate and error prone to author an extension where these "new" threads always know the exact state of the GIL, and therefore can reliably interact with this API. For these reasons, the question of how such extensions should interact with Python is quickly becoming a FAQ. The main impetus for this PEP, a thread on python-dev [1], immediately identified the following projects with this exact issue: - The win32all extensions - Boost - ctypes - Python-GTK bindings - Uno - PyObjC - Mac toolbox - PyXPCOM Currently, there is no reasonable, portable solution to this problem, forcing each extension author to implement their own hand-rolled version. Further, the problem is complex, meaning many implementations are likely to be incorrect, leading to a variety of problems that will often manifest simply as "Python has hung". While the biggest problem in the existing thread-state API is the lack of the ability to query the current state of the lock, it is felt that a more complete, simplified solution should be offered to extension authors. Such a solution should encourage authors to provide error-free, complex extension modules that take full advantage of Python's threading mechanisms. Limitations and Exclusions This proposal identifies a solution for extension authors with complex multi-threaded requirements, but that only require a single "PyInterpreterState". There is no attempt to cater for extensions that require multiple interpreter states. At the time of writing, no extension has been identified that requires multiple PyInterpreterStates, and indeed it is not clear if that facility works correctly in Python itself. This API will not perform automatic initialization of Python, or initialize Python for multi-threaded operation. Extension authors must continue to call Py_Initialize(), and for multi-threaded applications, PyEval_InitThreads(). The reason for this is that the first thread to call PyEval_InitThreads() is nominated as the "main thread" by Python, and so forcing the extension author to specify the main thread (by forcing her to make this first call) removes ambiguity. As Py_Initialize() must be called before PyEval_InitThreads(), and as both of these functions currently support being called multiple times, the burden this places on extension authors is considered reasonable. It is intended that this API be all that is necessary to acquire the Python GIL. Apart from the existing, standard Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros, it is assumed that no additional thread state API functions will be used by the extension. Extensions with such complicated requirements are free to continue to use the existing thread state API. Proposal This proposal recommends a new API be added to Python to simplify the management of the GIL. This API will be available on all platforms built with WITH_THREAD defined. The intent is that an extension author be able to use a small, well-defined "prologue dance", at any time and on any thread, and assuming Python has correctly been initialized, this dance will ensure Python is ready to be used on that thread. After the extension has finished with Python, it must also perform an "epilogue dance" to release any resources previously acquired. Ideally, these dances can be expressed in a single line. Specifically, the following new APIs are proposed: /* Ensure that the current thread is ready to call the Python C API, regardless of the current state of Python, or of its thread lock. This may be called as many times as desired by a thread, so long as each call is matched with a call to PyAutoThreadState_Release() The return value is an opaque "handle" to the thread state when PyAutoThreadState_Ensure() was called, and must be passed to PyAutoThreadState_Release() to ensure Python is left in the same state. When the function returns, the current thread will hold the GIL. Thus, the GIL is held by the thread until PyAutoThreadState_Release() is called. (Note that as happens now in Python, calling a Python API function may indeed cause a thread-switch and therefore a GIL ownership change. However, Python guarantees that when the API function returns, the GIL will again be owned by the thread making the call) Failure is a fatal error. */ PyAutoThreadState_State PyAutoThreadState_Ensure(void); /* Release any resources previously acquired. After this call, Python's state will be the same as it was prior to the corresponding PyAutoThreadState_Ensure call (but generally this state will be unknown to the caller, hence the use of the AutoThreadState API.) Every call to PyAutoThreadState_Ensure must be matched by a call to PyAutoThreadState_Release on the same thread. */ void PyAutoThreadState_Release(PyAutoThreadState_State state); Common usage will be: void SomeCFunction(void) { /* ensure we hold the lock */ PyAutoThreadState_State state = PyAutoThreadState_Ensure(); /* Use the Python API */ ... /* Restore the state of Python */ PyAutoThreadState_Release(state); } Design and Implementation The general operation of PyAutoThreadState_Ensure() will be: - assert Python is initialized. - Get a PyThreadState for the current thread, creating and saving if necessary. - remember the current state of the lock (owned/not owned) - If the current state does not own the GIL, acquire it. - Increment a counter for how many calls to PyAutoThreadState_Ensure have been made on the current thread. - return The general operation of PyAutoThreadState_Release() will be: - assert our thread currently holds the lock. - If old state indicates lock as previously unlocked, release GIL. - Decrement the PyAutoThreadState_Ensure counter for the thread. - If counter == 0: - release the PyThreadState. - forget the ThreadState as being owned by the thread. - return It is assumed that it is an error if two discrete PyThreadStates are used for a single thread. Comments in pystate.h ("State unique per thread") support this view, although it is never directly stated. Thus, this will require some implementation of Thread Local Storage. Fortunately, a platform independent implementation of Thread Local Storage already exists in the Python source tree, in the SGI threading port. This code will be integrated into the platform independent Python core, but in such a way that platforms can provide a more optimal implementation if desired. Implementation An implementation of this proposal can be found at http://www.python.org/sf/684256 References [1] http://mail.python.org/pipermail/python-dev/2002-December/031424.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: