584 lines
17 KiB
ReStructuredText
584 lines
17 KiB
ReStructuredText
PEP: 567
|
|
Title: Context Variables
|
|
Version: $Revision$
|
|
Last-Modified: $Date$
|
|
Author: Yury Selivanov <yury@magic.io>
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 12-Dec-2017
|
|
Python-Version: 3.7
|
|
Post-History: 12-Dec-2017, 28-Dec-2017
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP proposes a new ``contextvars`` module and a set of new
|
|
CPython C APIs to support context variables. This concept is
|
|
similar to thread-local storage (TLS), but, unlike TLS, it also allows
|
|
correctly keeping track of values per asynchronous task, e.g.
|
|
``asyncio.Task``.
|
|
|
|
This proposal is a simplified version of :pep:`550`. The key
|
|
difference is that this PEP is concerned only with solving the case
|
|
for asynchronous tasks, not for generators. There are no proposed
|
|
modifications to any built-in types or to the interpreter.
|
|
|
|
This proposal is not strictly related to Python Context Managers.
|
|
Although it does provide a mechanism that can be used by Context
|
|
Managers to store their state.
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
Thread-local variables are insufficient for asynchronous tasks that
|
|
execute concurrently in the same OS thread. Any context manager that
|
|
saves and restores a context value using ``threading.local()`` will
|
|
have its context values bleed to other code unexpectedly when used
|
|
in async/await code.
|
|
|
|
A few examples where having a working context local storage for
|
|
asynchronous code is desirable:
|
|
|
|
* Context managers like ``decimal`` contexts and ``numpy.errstate``.
|
|
|
|
* Request-related data, such as security tokens and request
|
|
data in web applications, language context for ``gettext``, etc.
|
|
|
|
* Profiling, tracing, and logging in large code bases.
|
|
|
|
|
|
Introduction
|
|
============
|
|
|
|
The PEP proposes a new mechanism for managing context variables.
|
|
The key classes involved in this mechanism are ``contextvars.Context``
|
|
and ``contextvars.ContextVar``. The PEP also proposes some policies
|
|
for using the mechanism around asynchronous tasks.
|
|
|
|
The proposed mechanism for accessing context variables uses the
|
|
``ContextVar`` class. A module (such as ``decimal``) that wishes to
|
|
store a context variable should:
|
|
|
|
* declare a module-global variable holding a ``ContextVar`` to
|
|
serve as a key;
|
|
|
|
* access the current value via the ``get()`` method on the
|
|
key variable;
|
|
|
|
* modify the current value via the ``set()`` method on the
|
|
key variable.
|
|
|
|
The notion of "current value" deserves special consideration:
|
|
different asynchronous tasks that exist and execute concurrently
|
|
may have different values for the same key. This idea is well-known
|
|
from thread-local storage but in this case the locality of the value is
|
|
not necessarily bound to a thread. Instead, there is the notion of the
|
|
"current ``Context``" which is stored in thread-local storage, and
|
|
is accessed via ``contextvars.copy_context()`` function.
|
|
Manipulation of the current ``Context`` is the responsibility of the
|
|
task framework, e.g. asyncio.
|
|
|
|
A ``Context`` is conceptually a read-only mapping, implemented using
|
|
an immutable dictionary. The ``ContextVar.get()`` method does a
|
|
lookup in the current ``Context`` with ``self`` as a key, raising a
|
|
``LookupError`` or returning a default value specified in
|
|
the constructor.
|
|
|
|
The ``ContextVar.set(value)`` method clones the current ``Context``,
|
|
assigns the ``value`` to it with ``self`` as a key, and sets the
|
|
new ``Context`` as the new current ``Context``.
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
A new standard library module ``contextvars`` is added with the
|
|
following APIs:
|
|
|
|
1. ``copy_context() -> Context`` function is used to get a copy of
|
|
the current ``Context`` object for the current OS thread.
|
|
|
|
2. ``ContextVar`` class to declare and access context variables.
|
|
|
|
3. ``Context`` class encapsulates context state. Every OS thread
|
|
stores a reference to its current ``Context`` instance.
|
|
It is not possible to control that reference manually.
|
|
Instead, the ``Context.run(callable, *args, **kwargs)`` method is
|
|
used to run Python code in another context.
|
|
|
|
|
|
contextvars.ContextVar
|
|
----------------------
|
|
|
|
The ``ContextVar`` class has the following constructor signature:
|
|
``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter
|
|
is used only for introspection and debug purposes, and is exposed
|
|
as a read-only ``ContextVar.name`` attribute. The ``default``
|
|
parameter is optional. Example::
|
|
|
|
# Declare a context variable 'var' with the default value 42.
|
|
var = ContextVar('var', default=42)
|
|
|
|
(The ``_NO_DEFAULT`` is an internal sentinel object used to
|
|
detect if the default value was provided.)
|
|
|
|
``ContextVar.get()`` returns a value for context variable from the
|
|
current ``Context``::
|
|
|
|
# Get the value of `var`.
|
|
var.get()
|
|
|
|
``ContextVar.set(value) -> Token`` is used to set a new value for
|
|
the context variable in the current ``Context``::
|
|
|
|
# Set the variable 'var' to 1 in the current context.
|
|
var.set(1)
|
|
|
|
``ContextVar.reset(token)`` is used to reset the variable in the
|
|
current context to the value it had before the ``set()`` operation
|
|
that created the ``token``::
|
|
|
|
assert var.get(None) is None
|
|
|
|
token = var.set(1)
|
|
try:
|
|
...
|
|
finally:
|
|
var.reset(token)
|
|
|
|
assert var.get(None) is None
|
|
|
|
``ContextVar.reset()`` method is idempotent and can be called
|
|
multiple times on the same Token object: second and later calls
|
|
will be no-ops.
|
|
|
|
|
|
contextvars.Token
|
|
-----------------
|
|
|
|
``contextvars.Token`` is an opaque object that should be used to
|
|
restore the ``ContextVar`` to its previous value, or remove it from
|
|
the context if the variable was not set before. It can be created
|
|
only by calling ``ContextVar.set()``.
|
|
|
|
For debug and introspection purposes it has:
|
|
|
|
* a read-only attribute ``Token.var`` pointing to the variable
|
|
that created the token;
|
|
|
|
* a read-only attribute ``Token.old_value`` set to the value the
|
|
variable had before the ``set()`` call, or to ``Token.MISSING``
|
|
if the variable wasn't set before.
|
|
|
|
Having the ``ContextVar.set()`` method returning a ``Token`` object
|
|
and the ``ContextVar.reset(token)`` method, allows context variables
|
|
to be removed from the context if they were not in it before the
|
|
``set()`` call.
|
|
|
|
|
|
contextvars.Context
|
|
-------------------
|
|
|
|
``Context`` object is a mapping of context variables to values.
|
|
|
|
``Context()`` creates an empty context. To get a copy of the current
|
|
``Context`` for the current OS thread, use the
|
|
``contextvars.copy_context()`` method::
|
|
|
|
ctx = contextvars.copy_context()
|
|
|
|
To run Python code in some ``Context``, use ``Context.run()``
|
|
method::
|
|
|
|
ctx.run(function)
|
|
|
|
Any changes to any context variables that ``function`` causes will
|
|
be contained in the ``ctx`` context::
|
|
|
|
var = ContextVar('var')
|
|
var.set('spam')
|
|
|
|
def function():
|
|
assert var.get() == 'spam'
|
|
|
|
var.set('ham')
|
|
assert var.get() == 'ham'
|
|
|
|
ctx = copy_context()
|
|
|
|
# Any changes that 'function' makes to 'var' will stay
|
|
# isolated in the 'ctx'.
|
|
ctx.run(function)
|
|
|
|
assert var.get() == 'spam'
|
|
|
|
Any changes to the context will be contained in the ``Context``
|
|
object on which ``run()`` is called on.
|
|
|
|
``Context.run()`` is used to control in which context asyncio
|
|
callbacks and Tasks are executed. It can also be used to run some
|
|
code in a different thread in the context of the current thread::
|
|
|
|
executor = ThreadPoolExecutor()
|
|
current_context = contextvars.copy_context()
|
|
|
|
executor.submit(
|
|
lambda: current_context.run(some_function))
|
|
|
|
``Context`` objects implement the ``collections.abc.Mapping`` ABC.
|
|
This can be used to introspect context objects::
|
|
|
|
ctx = contextvars.copy_context()
|
|
|
|
# Print all context variables and their values in 'ctx':
|
|
print(ctx.items())
|
|
|
|
# Print the value of 'some_variable' in context 'ctx':
|
|
print(ctx[some_variable])
|
|
|
|
|
|
asyncio
|
|
-------
|
|
|
|
``asyncio`` uses ``Loop.call_soon()``, ``Loop.call_later()``,
|
|
and ``Loop.call_at()`` to schedule the asynchronous execution of a
|
|
function. ``asyncio.Task`` uses ``call_soon()`` to run the
|
|
wrapped coroutine.
|
|
|
|
We modify ``Loop.call_{at,later,soon}`` and
|
|
``Future.add_done_callback()`` to accept the new optional *context*
|
|
keyword-only argument, which defaults to the current context::
|
|
|
|
def call_soon(self, callback, *args, context=None):
|
|
if context is None:
|
|
context = contextvars.copy_context()
|
|
|
|
# ... some time later
|
|
context.run(callback, *args)
|
|
|
|
Tasks in asyncio need to maintain their own context that they inherit
|
|
from the point they were created at. ``asyncio.Task`` is modified
|
|
as follows::
|
|
|
|
class Task:
|
|
def __init__(self, coro):
|
|
...
|
|
# Get the current context snapshot.
|
|
self._context = contextvars.copy_context()
|
|
self._loop.call_soon(self._step, context=self._context)
|
|
|
|
def _step(self, exc=None):
|
|
...
|
|
# Every advance of the wrapped coroutine is done in
|
|
# the task's context.
|
|
self._loop.call_soon(self._step, context=self._context)
|
|
...
|
|
|
|
|
|
C API
|
|
-----
|
|
|
|
1. ``PyContextVar * PyContextVar_New(char *name, PyObject *default)``:
|
|
create a ``ContextVar`` object.
|
|
|
|
2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``:
|
|
return ``-1`` if an error occurs during the lookup, ``0`` otherwise.
|
|
If a value for the context variable is found, it will be set to the
|
|
``value`` pointer. Otherwise, ``value`` will be set to
|
|
``default_value`` when it is not ``NULL``. If ``default_value`` is
|
|
``NULL``, ``value`` will be set to the default value of the
|
|
variable, which can be ``NULL`` too. ``value`` is always a new
|
|
reference.
|
|
|
|
3. ``PyContextToken * PyContextVar_Set(PyContextVar *, PyObject *)``:
|
|
set the value of the variable in the current context.
|
|
|
|
4. ``PyContextVar_Reset(PyContextVar *, PyContextToken *)``:
|
|
reset the value of the context variable.
|
|
|
|
5. ``PyContext * PyContext_New()``: create a new empty context.
|
|
|
|
6. ``PyContext * PyContext_Copy()``: get a copy of the current context.
|
|
|
|
7. ``int PyContext_Enter(PyContext *)`` and
|
|
``int PyContext_Exit(PyContext *)`` allow to set and restore
|
|
the context for the current OS thread. It is required to always
|
|
restore the previous context::
|
|
|
|
PyContext *old_ctx = PyContext_Copy();
|
|
if (old_ctx == NULL) goto error;
|
|
|
|
if (PyContext_Enter(new_ctx)) goto error;
|
|
|
|
// run some code
|
|
|
|
if (PyContext_Exit(old_ctx)) goto error;
|
|
|
|
|
|
Implementation
|
|
==============
|
|
|
|
This section explains high-level implementation details in
|
|
pseudo-code. Some optimizations are omitted to keep this section
|
|
short and clear.
|
|
|
|
For the purposes of this section, we implement an immutable dictionary
|
|
using ``dict.copy()``::
|
|
|
|
class _ContextData:
|
|
|
|
def __init__(self):
|
|
self._mapping = dict()
|
|
|
|
def get(self, key):
|
|
return self._mapping[key]
|
|
|
|
def set(self, key, value):
|
|
copy = _ContextData()
|
|
copy._mapping = self._mapping.copy()
|
|
copy._mapping[key] = value
|
|
return copy
|
|
|
|
def delete(self, key):
|
|
copy = _ContextData()
|
|
copy._mapping = self._mapping.copy()
|
|
del copy._mapping[key]
|
|
return copy
|
|
|
|
Every OS thread has a reference to the current ``_ContextData``.
|
|
``PyThreadState`` is updated with a new ``context_data`` field that
|
|
points to a ``_ContextData`` object::
|
|
|
|
class PyThreadState:
|
|
context_data: _ContextData
|
|
|
|
``contextvars.copy_context()`` is implemented as follows::
|
|
|
|
def copy_context():
|
|
ts : PyThreadState = PyThreadState_Get()
|
|
|
|
if ts.context_data is None:
|
|
ts.context_data = _ContextData()
|
|
|
|
ctx = Context()
|
|
ctx._data = ts.context_data
|
|
return ctx
|
|
|
|
``contextvars.Context`` is a wrapper around ``_ContextData``::
|
|
|
|
class Context(collections.abc.Mapping):
|
|
|
|
def __init__(self):
|
|
self._data = _ContextData()
|
|
|
|
def run(self, callable, *args, **kwargs):
|
|
ts : PyThreadState = PyThreadState_Get()
|
|
saved_data : _ContextData = ts.context_data
|
|
|
|
try:
|
|
ts.context_data = self._data
|
|
return callable(*args, **kwargs)
|
|
finally:
|
|
self._data = ts.context_data
|
|
ts.context_data = saved_data
|
|
|
|
# Mapping API methods are implemented by delegating
|
|
# `get()` and other Mapping calls to `self._data`.
|
|
|
|
``contextvars.ContextVar`` interacts with
|
|
``PyThreadState.context_data`` directly::
|
|
|
|
class ContextVar:
|
|
|
|
def __init__(self, name, *, default=_NO_DEFAULT):
|
|
self._name = name
|
|
self._default = default
|
|
|
|
@property
|
|
def name(self):
|
|
return self._name
|
|
|
|
def get(self, default=_NO_DEFAULT):
|
|
ts : PyThreadState = PyThreadState_Get()
|
|
data : _ContextData = ts.context_data
|
|
|
|
try:
|
|
return data.get(self)
|
|
except KeyError:
|
|
pass
|
|
|
|
if default is not _NO_DEFAULT:
|
|
return default
|
|
|
|
if self._default is not _NO_DEFAULT:
|
|
return self._default
|
|
|
|
raise LookupError
|
|
|
|
def set(self, value):
|
|
ts : PyThreadState = PyThreadState_Get()
|
|
data : _ContextData = ts.context_data
|
|
|
|
try:
|
|
old_value = data.get(self)
|
|
except KeyError:
|
|
old_value = Token.MISSING
|
|
|
|
ts.context_data = data.set(self, value)
|
|
return Token(self, old_value)
|
|
|
|
def reset(self, token):
|
|
if token._used:
|
|
return
|
|
|
|
if token._old_value is Token.MISSING:
|
|
ts.context_data = data.delete(token._var)
|
|
else:
|
|
ts.context_data = data.set(token._var,
|
|
token._old_value)
|
|
|
|
token._used = True
|
|
|
|
|
|
class Token:
|
|
|
|
MISSING = object()
|
|
|
|
def __init__(self, var, old_value):
|
|
self._var = var
|
|
self._old_value = old_value
|
|
self._used = False
|
|
|
|
@property
|
|
def var(self):
|
|
return self._var
|
|
|
|
@property
|
|
def old_value(self):
|
|
return self._old_value
|
|
|
|
|
|
Implementation Notes
|
|
====================
|
|
|
|
* The internal immutable dictionary for ``Context`` is implemented
|
|
using Hash Array Mapped Tries (HAMT). They allow for O(log N)
|
|
``set`` operation, and for O(1) ``copy_context()`` function, where
|
|
*N* is the number of items in the dictionary. For a detailed
|
|
analysis of HAMT performance please refer to :pep:`550` [1]_.
|
|
|
|
* ``ContextVar.get()`` has an internal cache for the most recent
|
|
value, which allows to bypass a hash lookup. This is similar
|
|
to the optimization the ``decimal`` module implements to
|
|
retrieve its context from ``PyThreadState_GetDict()``.
|
|
See :pep:`550` which explains the implementation of the cache
|
|
in a great detail.
|
|
|
|
|
|
Summary of the New APIs
|
|
=======================
|
|
|
|
* A new ``contextvars`` module with ``ContextVar``, ``Context``,
|
|
and ``Token`` classes, and a ``copy_context()`` function.
|
|
|
|
* ``asyncio.Loop.call_at()``, ``asyncio.Loop.call_later()``,
|
|
``asyncio.Loop.call_soon()``, and
|
|
``asyncio.Future.add_done_callback()`` run callback functions in
|
|
the context they were called in. A new *context* keyword-only
|
|
parameter can be used to specify a custom context.
|
|
|
|
* ``asyncio.Task`` is modified internally to maintain its own
|
|
context.
|
|
|
|
|
|
Design Considerations
|
|
=====================
|
|
|
|
Why contextvars.Token and not ContextVar.unset()?
|
|
-------------------------------------------------
|
|
|
|
The Token API allows to get around having a ``ContextVar.unset()``
|
|
method, which is incompatible with chained contexts design of
|
|
:pep:`550`. Future compatibility with :pep:`550` is desired
|
|
(at least for Python 3.7) in case there is demand to support
|
|
context variables in generators and asynchronous generators.
|
|
|
|
The Token API also offers better usability: the user does not have
|
|
to special-case absence of a value. Compare::
|
|
|
|
token = cv.get()
|
|
try:
|
|
cv.set(blah)
|
|
# code
|
|
finally:
|
|
cv.reset(token)
|
|
|
|
with::
|
|
|
|
_deleted = object()
|
|
old = cv.get(default=_deleted)
|
|
try:
|
|
cv.set(blah)
|
|
# code
|
|
finally:
|
|
if old is _deleted:
|
|
cv.unset()
|
|
else:
|
|
cv.set(old)
|
|
|
|
|
|
Rejected Ideas
|
|
==============
|
|
|
|
Replication of threading.local() interface
|
|
------------------------------------------
|
|
|
|
Please refer to :pep:`550` where this topic is covered in detail: [2]_.
|
|
|
|
|
|
Backwards Compatibility
|
|
=======================
|
|
|
|
This proposal preserves 100% backwards compatibility.
|
|
|
|
Libraries that use ``threading.local()`` to store context-related
|
|
values, currently work correctly only for synchronous code. Switching
|
|
them to use the proposed API will keep their behavior for synchronous
|
|
code unmodified, but will automatically enable support for
|
|
asynchronous code.
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
The reference implementation can be found here: [3]_.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
.. [1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis
|
|
|
|
.. [2] https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-interface
|
|
|
|
.. [3] https://github.com/python/cpython/pull/5027
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
..
|
|
Local Variables:
|
|
mode: indented-text
|
|
indent-tabs-mode: nil
|
|
sentence-end-double-space: t
|
|
fill-column: 70
|
|
coding: utf-8
|
|
End:
|