PEP: 539
Title: A New C-API for Thread-Local Storage in CPython
Version: $Revision$
Last-Modified: $Date$
Author: Erik M. Bray, Masayuki Yamamoto
BDFL-Delegate: Nick Coghlan
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 20-Dec-2016
Post-History: 16-Dec-2016


Abstract
========

The proposal is to add a new Thread Local Storage (TLS) API to CPython which
would supersede use of the existing TLS API within the CPython interpreter,
while deprecating the existing API. The new API is named the "Thread Specific
Storage (TSS) API" (see `Rationale for Proposed Solution`_ for the origin of
the name).

Because the existing TLS API is only used internally (it is not mentioned in
the documentation, and the header that defines it, ``pythread.h``, is not
included in ``Python.h`` either directly or indirectly), this proposal probably
only affects CPython, but might also affect other interpreter implementations
(PyPy?) that implement parts of the CPython API.

This is motivated primarily by the fact that the old API uses ``int`` to
represent TLS keys across all platforms, which is neither POSIX-compliant,
nor portable in any practical sense [1]_.

.. note::

    Throughout this document the acronym "TLS" refers to Thread Local
    Storage and should not be confused with "Transport Layer Security"
    protocols.


Specification
=============

The current API for TLS used inside the CPython interpreter consists of 6
functions::

    PyAPI_FUNC(int) PyThread_create_key(void)
    PyAPI_FUNC(void) PyThread_delete_key(int key)
    PyAPI_FUNC(int) PyThread_set_key_value(int key, void *value)
    PyAPI_FUNC(void *) PyThread_get_key_value(int key)
    PyAPI_FUNC(void) PyThread_delete_key_value(int key)
    PyAPI_FUNC(void) PyThread_ReInitTLS(void)

These would be superseded by a new set of analogous functions::

    PyAPI_FUNC(int) PyThread_tss_create(Py_tss_t *key)
    PyAPI_FUNC(void) PyThread_tss_delete(Py_tss_t *key)
    PyAPI_FUNC(int) PyThread_tss_set(Py_tss_t *key, void *value)
    PyAPI_FUNC(void *) PyThread_tss_get(Py_tss_t *key)

The specification also adds a few new features:

* A new type ``Py_tss_t``--an opaque type the definition of which may
  depend on the underlying TLS implementation. It is defined::

      typedef struct {
          bool _is_initialized;
          NATIVE_TSS_KEY_T _key;
      } Py_tss_t;

  where ``NATIVE_TSS_KEY_T`` is a macro whose value depends on the
  underlying native TLS implementation (e.g. ``pthread_key_t``).

* A constant default value for ``Py_tss_t`` variables,
  ``Py_tss_NEEDS_INIT``.

* Three new functions::

      PyAPI_FUNC(Py_tss_t *) PyThread_tss_alloc(void)
      PyAPI_FUNC(void) PyThread_tss_free(Py_tss_t *key)
      PyAPI_FUNC(bool) PyThread_tss_is_created(Py_tss_t *key)

  The first two are needed for dynamic (de-)allocation of a ``Py_tss_t``,
  particularly in extension modules built with ``Py_LIMITED_API``, where
  static allocation of this type is not possible due to its implementation
  being opaque at build time. A value returned by ``PyThread_tss_alloc``
  is in the same state as a value initialized with ``Py_tss_NEEDS_INIT``, or
  is ``NULL`` if dynamic allocation fails. ``PyThread_tss_free`` calls
  ``PyThread_tss_delete`` preventively before releasing the memory, and is a
  no-op if the ``key`` argument is ``NULL``. ``PyThread_tss_is_created``
  returns ``true`` if the given ``Py_tss_t`` has been initialized (i.e. by
  ``PyThread_tss_create``).

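The allocation helpers described above can be pictured with a minimal sketch
on a pthreads platform. Everything prefixed ``demo_`` or ``DEMO_`` is a
hypothetical stand-in introduced for illustration only; it mirrors the struct
layout quoted above but is not the reference implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for Py_tss_t, assuming NATIVE_TSS_KEY_T is
 * pthread_key_t. */
typedef struct {
    bool _is_initialized;
    pthread_key_t _key;
} demo_tss_t;

/* Hypothetical stand-in for Py_tss_NEEDS_INIT: flag cleared, key unset. */
#define DEMO_TSS_NEEDS_INIT {0}

/* Release the native key if one exists and mark the wrapper uninitialized. */
static void demo_tss_delete(demo_tss_t *key)
{
    assert(key != NULL);
    if (!key->_is_initialized) {
        return;
    }
    pthread_key_delete(key->_key);
    key->_is_initialized = false;
}

/* Heap-allocate a key in the same state as DEMO_TSS_NEEDS_INIT.
 * Returns NULL if the allocation fails. */
static demo_tss_t *demo_tss_alloc(void)
{
    demo_tss_t *key = malloc(sizeof(*key));
    if (key != NULL) {
        key->_is_initialized = false;
    }
    return key;
}

/* Delete the native key preventively, then free the memory.
 * A NULL argument is accepted and ignored. */
static void demo_tss_free(demo_tss_t *key)
{
    if (key == NULL) {
        return;
    }
    demo_tss_delete(key);
    free(key);
}

/* Report whether the native key has been created. */
static bool demo_tss_is_created(demo_tss_t *key)
{
    assert(key != NULL);
    return key->_is_initialized;
}
```

Because callers only ever hold a pointer returned by the allocator, code built
against an opaque declaration never needs to know the size of the struct.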
The new TSS API does not provide functions which correspond to
``PyThread_delete_key_value`` and ``PyThread_ReInitTLS``, because these
functions were needed only by CPython's own TLS implementation, which has been
removed. With that implementation gone, the existing API now behaves as
follows: ``PyThread_delete_key_value(key)`` is equivalent to
``PyThread_set_key_value(key, NULL)``, and ``PyThread_ReInitTLS()`` is a
no-op [8]_.

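On a native implementation, "deleting" a thread's value therefore amounts to
storing ``NULL``. The following is a minimal sketch that assumes pthreads; the
``demo_`` helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: clear the calling thread's value by storing NULL.
 * Returns 0 on success and non-zero on failure, like pthread_setspecific. */
static int demo_clear_value(pthread_key_t key)
{
    return pthread_setspecific(key, NULL);
}

/* Hypothetical self-check: returns 1 if a set/clear cycle behaves as
 * described, 0 if not, and -1 if no key could be created. */
static int demo_clear_value_roundtrip(void)
{
    static int payload = 42;
    pthread_key_t key;

    if (pthread_key_create(&key, NULL) != 0) {
        return -1;
    }
    /* A freshly created key holds NULL in every thread. */
    int ok = pthread_getspecific(key) == NULL;
    ok = ok && pthread_setspecific(key, &payload) == 0;
    ok = ok && pthread_getspecific(key) == &payload;
    /* After clearing, the key reads back as NULL again. */
    ok = ok && demo_clear_value(key) == 0;
    ok = ok && pthread_getspecific(key) == NULL;
    pthread_key_delete(key);
    return ok;
}
```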
The new ``PyThread_tss_`` functions are almost exactly analogous to their
original counterparts, with a few minor differences. Whereas
``PyThread_create_key`` takes no arguments and returns a TLS key as an
``int``, ``PyThread_tss_create`` takes a ``Py_tss_t*`` as an argument and
returns an ``int`` status code. The behavior of ``PyThread_tss_create`` is
undefined if the value pointed to by the ``key`` argument is not initialized
by ``Py_tss_NEEDS_INIT``. The returned status code is zero on success
and non-zero on failure. The meanings of non-zero status codes are not
otherwise defined by this specification.

Similarly, the other ``PyThread_tss_`` functions are passed a ``Py_tss_t*``,
whereas previously the key was passed by value. This change is necessary
because ``Py_tss_t``, being an opaque type, could hypothetically be almost
any size. This is especially necessary for extension modules built with
``Py_LIMITED_API``, where the size of the type is not known. Except for
``PyThread_tss_free``, the behavior of the ``PyThread_tss_`` functions is
undefined if the ``key`` argument is ``NULL``.

Moreover, because ``Py_tss_t`` is used instead of ``int``, the new API gains
some additional behaviors: TSS key creation and deletion follow a
"do-if-needed" flow and are silently skipped if already done. Calling
``PyThread_tss_create`` with an initialized key does nothing and returns
success immediately. The same applies to calling ``PyThread_tss_delete`` with
an uninitialized key.

``PyThread_tss_delete`` is defined to reset the key's initialization state to
"uninitialized", so that the CPython interpreter can be restarted without
terminating the process (e.g. when Python is embedded in an application) [12]_.

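The "do-if-needed" flow and the reset on deletion can be sketched as follows,
assuming a pthreads platform. The ``demo_`` names are hypothetical stand-ins
for the proposed functions, and the sketch makes no attempt at thread safety,
which this specification does not address.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Py_tss_t on a pthreads platform. */
typedef struct {
    bool _is_initialized;
    pthread_key_t _key;
} demo_tss_t;

/* Hypothetical stand-in for Py_tss_NEEDS_INIT. */
#define DEMO_TSS_NEEDS_INIT {0}

/* Create the native key only if it does not exist yet.
 * Returns 0 on success and -1 on failure. */
static int demo_tss_create(demo_tss_t *key)
{
    assert(key != NULL);
    if (key->_is_initialized) {
        return 0;   /* already created: skip silently and report success */
    }
    if (pthread_key_create(&key->_key, NULL) != 0) {
        return -1;
    }
    key->_is_initialized = true;
    return 0;
}

/* Delete the native key only if it exists, then reset the state so that
 * the same variable can be passed to demo_tss_create again. */
static void demo_tss_delete(demo_tss_t *key)
{
    assert(key != NULL);
    if (!key->_is_initialized) {
        return;     /* nothing to delete: skip silently */
    }
    pthread_key_delete(key->_key);
    key->_is_initialized = false;
}
```

Resetting the flag in ``demo_tss_delete`` is what lets a statically allocated
key be created again after an embedded interpreter is finalized and restarted.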
The old ``PyThread_*_key*`` functions will be marked as deprecated in the
documentation, but will not generate runtime deprecation warnings.

Additionally, on platforms where ``sizeof(pthread_key_t) != sizeof(int)``,
``PyThread_create_key`` will return immediately with a failure status, and
the other TLS functions will all be no-ops on such platforms.

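A minimal sketch of such a guard is shown below. It assumes a pthreads
platform on which ``pthread_key_t`` is an integer type, so that the casts
compile; ``demo_create_key`` is a hypothetical stand-in for the old-style
function, not CPython's actual code. The ``INT_MAX`` check is an added
assumption that rejects keys which would not survive the cast.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an int-keyed create function.
 * Returns the native key as a non-negative int, or -1 on failure. */
static int demo_create_key(void)
{
    pthread_key_t key;

    /* Refuse up front when the native key cannot be stored in an int. */
    if (sizeof(pthread_key_t) != sizeof(int)) {
        return -1;
    }
    if (pthread_key_create(&key, NULL) != 0) {
        return -1;
    }
    /* Same size is not enough: an unsigned key above INT_MAX would turn
     * negative when cast, so treat it as a failure too. */
    if ((unsigned long long)key > (unsigned long long)INT_MAX) {
        pthread_key_delete(key);
        return -1;
    }
    return (int)key;
}
```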
Comparison of API Specification
-------------------------------

=================  =============================  ==============================
API                Thread Local Storage (TLS)     Thread Specific Storage (TSS)
=================  =============================  ==============================
Version            Existing                       New
Key Type           ``int``                        ``Py_tss_t`` (opaque type)
Handle Native Key  cast to ``int``                concealed in an internal field
Function Argument  ``int``                        ``Py_tss_t *``
Features           - create key                   - create key
                   - delete key                   - delete key
                   - set value                    - set value
                   - get value                    - get value
                   - delete value                 - (set ``NULL`` instead) [8]_
                   - reinitialize keys (for       - (unnecessary) [8]_
                     after fork)
                                                  - dynamically (de-)allocate
                                                    key
                                                  - check key's initialization
                                                    state
Default Value      (``-1`` as key creation        ``Py_tss_NEEDS_INIT``
                   failure)
Requirement        native threads                 native threads
                   (since CPython 3.7 [9]_)
Restriction        Does not support platforms     Unable to statically allocate
                   where the native TLS key is    key when ``Py_LIMITED_API``
                   defined in a way that cannot   is defined.
                   be safely cast to ``int``.
=================  =============================  ==============================

Example
-------

With the proposed changes, a TSS key is initialized like::

    static Py_tss_t tss_key = Py_tss_NEEDS_INIT;
    if (PyThread_tss_create(&tss_key)) {
        /* ... handle key creation failure ... */
    }

The initialization state of the key can then be checked like::

    assert(PyThread_tss_is_created(&tss_key));

The rest of the API is used analogously to the old API::

    int the_value = 1;
    if (PyThread_tss_get(&tss_key) == NULL) {
        PyThread_tss_set(&tss_key, (void *)&the_value);
        assert(PyThread_tss_get(&tss_key) != NULL);
    }
    /* ... once done with the key ... */
    PyThread_tss_delete(&tss_key);
    assert(!PyThread_tss_is_created(&tss_key));

When ``Py_LIMITED_API`` is defined, a TSS key must be dynamically allocated::

    static Py_tss_t *ptr_key = NULL;
    /* A static initializer must be a constant expression in C,
       so the key is allocated at run time. */
    ptr_key = PyThread_tss_alloc();
    if (ptr_key == NULL) {
        /* ... handle key allocation failure ... */
    }
    assert(!PyThread_tss_is_created(ptr_key));
    /* ... once done with the key ... */
    PyThread_tss_free(ptr_key);
    ptr_key = NULL;


Platform Support Changes
========================

A new "Native Thread Implementation" section will be added to PEP 11 that
states:

* As of CPython 3.7, when thread support is enabled, all platforms are
  required to provide a native thread implementation (such as pthreads or
  Windows threads) on which the TSS API can be implemented. Any TSS API
  problems that occur on platforms without a native thread implementation
  will be closed as "won't fix".


Motivation
==========

The primary problem at issue here is the type of the keys (``int``) used for
TLS values, as defined by the original PyThread TLS API.

The original TLS API was added to Python by GvR back in 1997, and at the
time the key used to represent a TLS value was an ``int``, and so it has
been to the time of writing. This used CPython's own TLS implementation,
which had long remained unused, largely unchanged, in ``Python/thread.c``.
Support for implementing the API on top of native thread implementations
(pthreads and Windows) was added much later, and CPython's own
implementation, no longer necessary, has since been removed [9]_.

The problem with the choice of ``int`` to represent a TLS key is that while
it was fine for CPython's own TLS implementation, and happens to be
compatible with Windows (which uses ``DWORD`` for the analogous data), it is
not compatible with the POSIX standard for the pthreads API, which defines
``pthread_key_t`` as an opaque type not further defined by the standard (as
with ``Py_tss_t`` described above) [14]_. This leaves it up to the underlying
implementation how a ``pthread_key_t`` value is used to look up
thread-specific data.

This has not generally been a problem for Python's API, as it just happens
that on Linux ``pthread_key_t`` is defined as an ``unsigned int``, and so is
fully compatible with Python's TLS API--``pthread_key_t`` values created by
``pthread_key_create`` can be freely cast to ``int`` and back (well, not
exactly, even this has some limitations as pointed out by issue #22206).

However, as issue #25658 points out, there are at least some platforms
(namely Cygwin, CloudABI, but likely others as well) which have otherwise
modern and POSIX-compliant pthreads implementations, but are not compatible
with Python's API because their ``pthread_key_t`` is defined in a way that
cannot be safely cast to ``int``. In fact, the possibility of running into
this problem was raised by MvL at the time pthreads TLS was added [2]_.

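To make the casting hazard concrete, the sketch below models a platform whose
native key is a pointer to a private struct. The ``demo_`` types are
hypothetical and do not reproduce any specific platform's definition; the
result of narrowing an out-of-range value to ``int`` is
implementation-defined, which is part of the problem.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical native key type: a pointer to an incomplete struct. */
struct demo_native_key;
typedef struct demo_native_key *demo_native_key_t;

/* Returns 1 if the key survives a round trip through int, 0 otherwise. */
static int demo_survives_int_cast(demo_native_key_t key)
{
    /* What an int-keyed API would have to store. */
    int as_int = (int)(intptr_t)key;
    /* What it would hand back to the native implementation later. */
    demo_native_key_t back = (demo_native_key_t)(intptr_t)as_int;
    return back == key;
}
```

On a platform where pointers are wider than ``int``, any key with bits set
above the width of ``int`` fails this round trip, so the native lookup would
receive a different key than the one that was created.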
It could be argued that PEP 11 makes specific requirements for supporting a
new, not otherwise officially supported platform (such as CloudABI), and that
the status of Cygwin support is currently dubious. However, this creates a
very high barrier to supporting platforms that are otherwise Linux- and/or
POSIX-compatible and where CPython might otherwise "just work" except for
this one hurdle. CPython itself imposes this implementation barrier by way
of an API that is not compatible with POSIX (and in fact makes invalid
assumptions about pthreads).


Rationale for Proposed Solution
===============================

The use of an opaque type (``Py_tss_t``) to key TLS values allows the API to
be compatible with all present (POSIX and Windows) and future (C11?)
native TLS implementations supported by CPython, as it allows the definition
of ``Py_tss_t`` to depend on the underlying implementation.

Since the existing TLS API has been available in *the limited API* [13]_ for
some platforms (e.g. Linux), CPython makes an effort to provide the new TSS API
at that level likewise. Note, however, that the ``Py_tss_t`` definition becomes
an opaque struct when ``Py_LIMITED_API`` is defined, because exposing
``NATIVE_TSS_KEY_T`` as part of the limited API would prevent us from switching
native thread implementations without rebuilding extension modules.

A new API must be introduced, rather than changing the function signatures of
the current API, in order to maintain backwards compatibility. The new API
also more clearly groups together these related functions under a single name
prefix, ``PyThread_tss_``. The "tss" in the name stands for "thread-specific
storage", and was influenced by the naming and design of the "tss" API that is
part of the C11 threads API [15]_. However, this is in no way meant to imply
compatibility with or support for the C11 threads API, or signal any future
intention of supporting C11--it's just the influence for the naming and design.

The inclusion of the special default value ``Py_tss_NEEDS_INIT`` is required
by the fact that not all native TLS implementations define a sentinel value
for uninitialized TLS keys. For example, on Windows a TLS key is
represented by a ``DWORD`` (a 32-bit unsigned integer) and its value must be
treated as opaque [3]_. So there is no unsigned integer value that can be
safely used to represent an uninitialized TLS key on Windows. Likewise, POSIX
does not specify a sentinel for an uninitialized ``pthread_key_t``, instead
relying on the ``pthread_once`` interface to ensure that a given TLS key is
initialized only once per process. Therefore, the ``Py_tss_t`` type
contains an explicit ``._is_initialized`` field that can indicate the key's
initialization state independent of the underlying implementation.

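For comparison, the POSIX idiom mentioned above keeps the "already
initialized" state in a separate ``pthread_once_t`` control object rather than
in the key itself. A minimal sketch, in which every ``demo_`` name is a
hypothetical illustration:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t demo_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;
static int demo_key_create_calls = 0;   /* counts runs of the init routine */
static int demo_key_error = 0;          /* result of pthread_key_create */

/* Init routine: pthread_once runs it at most once per process. */
static void demo_make_key(void)
{
    demo_key_create_calls++;
    demo_key_error = pthread_key_create(&demo_key, NULL);
}

/* Ensure the key exists. Returns 0 on success and non-zero on failure. */
static int demo_ensure_key(void)
{
    if (pthread_once(&demo_key_once, demo_make_key) != 0) {
        return -1;
    }
    return demo_key_error;
}
```

The ``._is_initialized`` field plays a role similar to the control object
here, but travels with the key and does not depend on which native
implementation is in use.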
Changing ``PyThread_create_key`` to immediately return a failure status on
systems using pthreads where ``sizeof(int) != sizeof(pthread_key_t)`` is
intended as a sanity check: Currently, ``PyThread_create_key`` may report
initial success on such systems, but attempts to use the returned key are
likely to fail. Although in practice this failure occurs earlier in the
interpreter initialization, it's better to fail immediately at the source of
the problem (``PyThread_create_key``) rather than sometime later when use of
an invalid key is attempted. In other words, this indicates clearly that the
old API is not supported on platforms where it cannot be used reliably, and
that no effort will be made to add such support.


Rejected Ideas
==============

* Do nothing: The status quo is fine because it works on Linux, and platforms
  wishing to be supported by CPython should follow the requirements of
  PEP 11. As explained above, while this would be a fair argument if
  CPython were being asked to make changes to support particular quirks
  or features of a specific platform, in this case it is a quirk of CPython
  that prevents it from being used to its full potential on otherwise
  POSIX-compliant platforms. The fact that the current implementation
  happens to work on Linux is a happy accident, and there's no guarantee
  that this will never change.

* Affected platforms should just configure Python ``--without-threads``:
  This is a possible temporary workaround to the issue, but only that.
  Python should not be hobbled on affected platforms despite them being
  otherwise perfectly capable of running multi-threaded Python.

* Affected platforms should use CPython's own TLS implementation instead of
  the native TLS implementation: This is a more acceptable alternative to the
  previous idea, and in fact there had been a patch to do just that [4]_.
  However, CPython's own implementation, being "slower and clunkier" in
  general than native implementations, still needlessly hobbles performance
  on affected platforms. At least one other module (``tracemalloc``) is also
  broken if Python is built without a native implementation. In any case, this
  idea can no longer be adopted because CPython's own implementation has been
  removed.

* Keep the existing API, but work around the issue by providing a mapping from
  ``pthread_key_t`` values to ``int`` values. A couple of attempts were made at
  this ([5]_, [6]_), but this only injects needless complexity and overhead
  into performance-critical code on platforms that are not currently affected
  by this issue (such as Linux). Even if use of this workaround were made
  conditional on platform compatibility, it introduces platform-specific code
  to maintain, and still has the problem of the previous rejected ideas of
  needlessly hobbling performance on affected platforms.


Implementation
==============

An initial version of a patch [7]_ is available on the bug tracker for this
issue. Since the migration to GitHub, it has been developed in the
``pep539-tss-api`` feature branch [10]_ in Masayuki Yamamoto's fork of the
CPython repository on GitHub. A work-in-progress PR is available at [11]_.

This reference implementation covers not only the new API features, but also
the changes to client code needed to replace the existing TLS API with the
new TSS API.


Copyright
=========

This document has been placed in the public domain.


References and Footnotes
========================

.. [1] http://bugs.python.org/issue25658
.. [2] https://bugs.python.org/msg116292
.. [3] https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).aspx
.. [4] http://bugs.python.org/file45548/configure-pthread_key_t.patch
.. [5] http://bugs.python.org/file44269/issue25658-1.patch
.. [6] http://bugs.python.org/file44303/key-constant-time.diff
.. [7] http://bugs.python.org/file46379/pythread-tss-3.patch
.. [8] https://bugs.python.org/msg298342
.. [9] http://bugs.python.org/issue30832
.. [10] https://github.com/python/cpython/compare/master...ma8ma:pep539-tss-api
.. [11] https://github.com/python/cpython/pull/1362
.. [12] https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx
.. [13] Also known as the "stable ABI"
   (https://www.python.org/dev/peps/pep-0384/)
.. [14] http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
.. [15] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#page=404