pep-567: V3. (#536)

Yury Selivanov 2018-01-17 08:54:57 -05:00 committed by GitHub
parent 6f48b8dd53
commit 583a14f814
1 changed files with 346 additions and 223 deletions


@ -8,7 +8,7 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 12-Dec-2017 Created: 12-Dec-2017
Python-Version: 3.7 Python-Version: 3.7
Post-History: 12-Dec-2017, 28-Dec-2017 Post-History: 12-Dec-2017, 28-Dec-2017, 16-Jan-2018
Abstract Abstract
@ -60,7 +60,7 @@ for using the mechanism around asynchronous tasks.
The proposed mechanism for accessing context variables uses the The proposed mechanism for accessing context variables uses the
``ContextVar`` class. A module (such as ``decimal``) that wishes to ``ContextVar`` class. A module (such as ``decimal``) that wishes to
store a context variable should: use the new mechanism should:
* declare a module-global variable holding a ``ContextVar`` to * declare a module-global variable holding a ``ContextVar`` to
serve as a key; serve as a key;
@ -76,20 +76,30 @@ different asynchronous tasks that exist and execute concurrently
may have different values for the same key. This idea is well-known may have different values for the same key. This idea is well-known
from thread-local storage but in this case the locality of the value is from thread-local storage but in this case the locality of the value is
not necessarily bound to a thread. Instead, there is the notion of the not necessarily bound to a thread. Instead, there is the notion of the
"current ``Context``" which is stored in thread-local storage, and "current ``Context``" which is stored in thread-local storage.
is accessed via ``contextvars.copy_context()`` function. Manipulation of the current context is the responsibility of the
Manipulation of the current ``Context`` is the responsibility of the
task framework, e.g. asyncio. task framework, e.g. asyncio.
A ``Context`` is conceptually a read-only mapping, implemented using A ``Context`` is a mapping of ``ContextVar`` objects to their values.
an immutable dictionary. The ``ContextVar.get()`` method does a The ``Context`` itself exposes the ``abc.Mapping`` interface
lookup in the current ``Context`` with ``self`` as a key, raising a (not ``abc.MutableMapping``!), so it cannot be modified directly.
``LookupError`` or returning a default value specified in To set a new value for a context variable in a ``Context`` object,
the constructor. the user needs to:
The ``ContextVar.set(value)`` method clones the current ``Context``, * make the ``Context`` object "current" using the ``Context.run()``
assigns the ``value`` to it with ``self`` as a key, and sets the method;
new ``Context`` as the new current ``Context``.
* use ``ContextVar.set()`` to set a new value for the context
variable.
The ``ContextVar.get()`` method looks for the variable in the current
``Context`` object using ``self`` as a key.
It is not possible to get a direct reference to the current ``Context``
object, but it is possible to obtain a shallow copy of it using the
``contextvars.copy_context()`` function. This ensures that the
*caller* of ``Context.run()`` is the sole owner of its ``Context``
object.
Specification Specification
@ -105,7 +115,7 @@ following APIs:
3. ``Context`` class encapsulates context state. Every OS thread 3. ``Context`` class encapsulates context state. Every OS thread
stores a reference to its current ``Context`` instance. stores a reference to its current ``Context`` instance.
It is not possible to control that reference manually. It is not possible to control that reference directly.
Instead, the ``Context.run(callable, *args, **kwargs)`` method is Instead, the ``Context.run(callable, *args, **kwargs)`` method is
used to run Python code in another context. used to run Python code in another context.
@ -115,7 +125,7 @@ contextvars.ContextVar
The ``ContextVar`` class has the following constructor signature: The ``ContextVar`` class has the following constructor signature:
``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter ``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter
is used only for introspection and debug purposes, and is exposed is used for introspection and debug purposes, and is exposed
as a read-only ``ContextVar.name`` attribute. The ``default`` as a read-only ``ContextVar.name`` attribute. The ``default``
parameter is optional. Example:: parameter is optional. Example::
@ -125,12 +135,22 @@ parameter is optional. Example::
(The ``_NO_DEFAULT`` is an internal sentinel object used to (The ``_NO_DEFAULT`` is an internal sentinel object used to
detect if the default value was provided.) detect if the default value was provided.)
``ContextVar.get()`` returns a value for context variable from the ``ContextVar.get(default=_NO_DEFAULT)`` returns a value for
current ``Context``:: the context variable for the current ``Context``::
# Get the value of `var`. # Get the value of `var`.
var.get() var.get()
If there is no value for the variable in the current context,
``ContextVar.get()`` will:
* return the value of the *default* argument of the ``get()`` method,
if provided; or
* return the default value for the context variable, if provided; or
* raise a ``LookupError``.
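The lookup order above can be sketched as follows. This is a minimal illustration, not part of the PEP text: the variable names and the ``lookups()`` helper are hypothetical, and the function is run in a freshly created, empty ``Context`` so that no earlier ``set()`` call interferes.

```python
import contextvars

# Hypothetical variables used only for this illustration.
no_default = contextvars.ContextVar('no_default')
with_default = contextvars.ContextVar('with_default', default='var-default')

def lookups():
    results = {}
    # 1. A default passed to get() takes precedence over the variable's default.
    results['call_default'] = with_default.get('call-default')
    # 2. Otherwise the default given to the ContextVar constructor is used.
    results['var_default'] = with_default.get()
    # 3. With no default available at all, LookupError is raised.
    try:
        no_default.get()
        results['lookup_error'] = False
    except LookupError:
        results['lookup_error'] = True
    # Once a value is set, both kinds of default are ignored.
    no_default.set('value')
    results['after_set'] = no_default.get('ignored')
    return results

# Run in an empty Context so that neither variable has a value yet.
results = contextvars.Context().run(lookups)
```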
``ContextVar.set(value) -> Token`` is used to set a new value for ``ContextVar.set(value) -> Token`` is used to set a new value for
the context variable in the current ``Context``:: the context variable in the current ``Context``::
@ -139,28 +159,36 @@ the context variable in the current ``Context``::
``ContextVar.reset(token)`` is used to reset the variable in the ``ContextVar.reset(token)`` is used to reset the variable in the
current context to the value it had before the ``set()`` operation current context to the value it had before the ``set()`` operation
that created the ``token``:: that created the ``token`` (or to remove the variable if it was
not set)::
assert var.get(None) is None # Assume: var.get(None) is None
# Set 'var' to 1:
token = var.set(1) token = var.set(1)
try: try:
... # var.get() == 1
finally: finally:
var.reset(token) var.reset(token)
assert var.get(None) is None # After reset: var.get(None) is None,
# i.e. 'var' was removed from the current context.
``ContextVar.reset()`` method is idempotent and can be called ``ContextVar.reset()`` method is idempotent and can be called
multiple times on the same Token object: second and later calls multiple times on the same Token object: second and later calls
will be no-ops. will be no-ops. The method raises a ``ValueError`` if:
* it is called with a token object created by another variable; or
* the current ``Context`` object does not match the one where
the token object was created.
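The two ``ValueError`` cases can be demonstrated with a short sketch. The variable and function names here are hypothetical; each check is run in its own fresh ``Context`` so that the checks do not affect each other.

```python
import contextvars

# Hypothetical variables used only for this illustration.
var_a = contextvars.ContextVar('var_a')
var_b = contextvars.ContextVar('var_b')

def wrong_variable():
    # The token belongs to var_a, so var_b must refuse it.
    token = var_a.set(1)
    try:
        var_b.reset(token)
    except ValueError:
        return True
    return False

def wrong_context():
    # The token is created while a *different* Context is current.
    token = contextvars.Context().run(var_a.set, 1)
    try:
        var_a.reset(token)
    except ValueError:
        return True
    return False

rejected_other_variable = contextvars.Context().run(wrong_variable)
rejected_other_context = contextvars.Context().run(wrong_context)
```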
contextvars.Token contextvars.Token
----------------- -----------------
``contextvars.Token`` is an opaque object that should be used to ``contextvars.Token`` is an opaque object that should be used to
restore the ``ContextVar`` to its previous value, or remove it from restore the ``ContextVar`` to its previous value, or to remove it from
the context if the variable was not set before. It can be created the context if the variable was not set before. It can be created
only by calling ``ContextVar.set()``. only by calling ``ContextVar.set()``.
@ -173,11 +201,6 @@ For debug and introspection purposes it has:
variable had before the ``set()`` call, or to ``Token.MISSING`` variable had before the ``set()`` call, or to ``Token.MISSING``
if the variable wasn't set before. if the variable wasn't set before.
Having the ``ContextVar.set()`` method returning a ``Token`` object
and the ``ContextVar.reset(token)`` method, allows context variables
to be removed from the context if they were not in it before the
``set()`` call.
contextvars.Context contextvars.Context
------------------- -------------------
@ -201,35 +224,37 @@ be contained in the ``ctx`` context::
var = ContextVar('var') var = ContextVar('var')
var.set('spam') var.set('spam')
def function(): def main():
assert var.get() == 'spam' # 'var' was set to 'spam' before
# calling 'copy_context()' and 'ctx.run(main)', so:
# var.get() == ctx[var] == 'spam'
var.set('ham') var.set('ham')
assert var.get() == 'ham'
# Now, after setting 'var' to 'ham':
# var.get() == ctx[var] == 'ham'
ctx = copy_context() ctx = copy_context()
# Any changes that 'function' makes to 'var' will stay # Any changes that the 'main' function makes to 'var'
# isolated in the 'ctx'. # will be contained in 'ctx'.
ctx.run(function) ctx.run(main)
assert var.get() == 'spam' # The 'main()' function was run in the 'ctx' context,
# so changes to 'var' are contained in it:
# ctx[var] == 'ham'
Any changes to the context will be contained in the ``Context`` # However, outside of 'ctx', 'var' is still set to 'spam':
object on which ``run()`` is called on. # var.get() == 'spam'
``Context.run()`` is used to control in which context asyncio ``Context.run()`` raises a ``RuntimeError`` when called on the same
callbacks and Tasks are executed. It can also be used to run some context object from more than one OS thread, or when called
code in a different thread in the context of the current thread:: recursively.
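The recursive case of the ``RuntimeError`` rule can be sketched as follows. The ``reenter()`` helper is hypothetical; it tries to enter ``ctx`` while ``ctx`` is already the current context.

```python
import contextvars

ctx = contextvars.copy_context()

def reenter():
    # 'ctx' is already entered at this point, so a nested run() must fail.
    try:
        ctx.run(lambda: None)
    except RuntimeError:
        return 'already entered'
    return 'entered twice'

outcome = ctx.run(reenter)

# Once the outer run() has returned, the context can be entered again.
second_run = ctx.run(lambda: 'ok')
```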
executor = ThreadPoolExecutor() ``Context.copy()`` returns a shallow copy of the context object.
current_context = contextvars.copy_context()
executor.submit(
lambda: current_context.run(some_function))
``Context`` objects implement the ``collections.abc.Mapping`` ABC. ``Context`` objects implement the ``collections.abc.Mapping`` ABC.
This can be used to introspect context objects:: This can be used to introspect contexts::
ctx = contextvars.copy_context() ctx = contextvars.copy_context()
@ -239,6 +264,18 @@ This can be used to introspect context objects::
# Print the value of 'some_variable' in context 'ctx': # Print the value of 'some_variable' in context 'ctx':
print(ctx[some_variable]) print(ctx[some_variable])
Note that all Mapping methods, including ``Context.__getitem__`` and
``Context.get``, ignore default values for context variables
(i.e. ``ContextVar.default``). This means that for a variable *var*
that was created with a default value and was not set in the
*context*:
* ``context[var]`` raises a ``KeyError``,
* ``var in context`` returns ``False``,
* the variable isn't included in ``context.items()``, etc.
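A short sketch of the difference between ``ContextVar.get()`` and the Mapping interface. The ``timeout`` variable is hypothetical, and an empty ``Context`` is constructed directly so that the variable is guaranteed to be unset in it.

```python
import contextvars

# Hypothetical variable, created with a default but never set.
timeout = contextvars.ContextVar('timeout', default=30)

# A freshly constructed Context holds no variables.
context = contextvars.Context()

# ContextVar.get() falls back to the variable's default...
via_variable = context.run(timeout.get)

# ...but the Mapping interface of the Context does not.
in_mapping = timeout in context
via_mapping_get = context.get(timeout)
items = list(context.items())
try:
    context[timeout]
    raised_key_error = False
except KeyError:
    raised_key_error = True
```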
asyncio asyncio
------- -------
@ -278,11 +315,227 @@ as follows::
... ...
Implementation
==============
This section explains high-level implementation details in
pseudo-code. Some optimizations are omitted to keep this section
short and clear.
The ``Context`` mapping is implemented using an immutable dictionary.
This allows for an O(1) implementation of the ``copy_context()``
function. The reference implementation implements the immutable
dictionary using Hash Array Mapped Tries (HAMT); see :pep:`550`
for analysis of HAMT performance [1]_.
For the purposes of this section, we implement an immutable dictionary
using a copy-on-write approach and built-in dict type::
    class _ContextData:

        def __init__(self):
            self._mapping = dict()

        def __getitem__(self, key):
            return self._mapping[key]

        def __contains__(self, key):
            return key in self._mapping

        def __len__(self):
            return len(self._mapping)

        def __iter__(self):
            return iter(self._mapping)

        def set(self, key, value):
            # Copy-on-write: the original mapping is never mutated.
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            copy._mapping[key] = value
            return copy

        def delete(self, key):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            del copy._mapping[key]
            return copy
Every OS thread has a reference to the current ``Context`` object::
    class PyThreadState:
        context: Context
``contextvars.Context`` is a wrapper around ``_ContextData``::
    class Context(collections.abc.Mapping):

        _data: _ContextData
        _prev_context: Optional[Context]

        def __init__(self):
            self._data = _ContextData()
            self._prev_context = None

        def run(self, callable, *args, **kwargs):
            if self._prev_context is not None:
                raise RuntimeError(
                    f'cannot enter context: {self} is already entered')

            ts: PyThreadState = PyThreadState_Get()
            self._prev_context = ts.context
            try:
                ts.context = self
                return callable(*args, **kwargs)
            finally:
                ts.context = self._prev_context
                self._prev_context = None

        def copy(self):
            new = Context()
            new._data = self._data
            return new

        # Implement abstract Mapping.__getitem__
        def __getitem__(self, var):
            return self._data[var]

        # Implement abstract Mapping.__contains__
        def __contains__(self, var):
            return var in self._data

        # Implement abstract Mapping.__len__
        def __len__(self):
            return len(self._data)

        # Implement abstract Mapping.__iter__
        def __iter__(self):
            return iter(self._data)

        # The rest of the Mapping methods are implemented
        # by collections.abc.Mapping.
``contextvars.copy_context()`` is implemented as follows::
    def copy_context():
        ts: PyThreadState = PyThreadState_Get()
        return ts.context.copy()
``contextvars.ContextVar`` interacts with ``PyThreadState.context``
directly::
    class ContextVar:

        def __init__(self, name, *, default=_NO_DEFAULT):
            self._name = name
            self._default = default

        @property
        def name(self):
            return self._name

        def get(self, default=_NO_DEFAULT):
            ts: PyThreadState = PyThreadState_Get()
            try:
                return ts.context[self]
            except KeyError:
                pass

            if default is not _NO_DEFAULT:
                return default

            if self._default is not _NO_DEFAULT:
                return self._default

            raise LookupError

        def set(self, value):
            ts: PyThreadState = PyThreadState_Get()

            data: _ContextData = ts.context._data
            try:
                old_value = data[self]
            except KeyError:
                old_value = Token.MISSING

            updated_data = data.set(self, value)
            ts.context._data = updated_data
            return Token(ts.context, self, old_value)

        def reset(self, token):
            if token._var is not self:
                raise ValueError(
                    "Token was created by a different ContextVar")

            ts: PyThreadState = PyThreadState_Get()
            if token._context is not ts.context:
                raise ValueError(
                    "Token was created in a different Context")

            if token._used:
                return

            # The immutable mapping currently held by the context.
            data: _ContextData = ts.context._data
            if token._old_value is Token.MISSING:
                ts.context._data = data.delete(token._var)
            else:
                ts.context._data = data.set(token._var,
                                            token._old_value)

            token._used = True
Note that in the reference implementation, ``ContextVar.get()``
has an internal cache for the most recent value, which allows it to
bypass a hash lookup. This is similar to the optimization the
``decimal`` module implements to retrieve its context from
``PyThreadState_GetDict()``. See :pep:`550`, which explains the
implementation of the cache in great detail.
The ``Token`` class is implemented as follows::
    class Token:

        MISSING = object()

        def __init__(self, context, var, old_value):
            self._context = context
            self._var = var
            self._old_value = old_value
            self._used = False

        @property
        def var(self):
            return self._var

        @property
        def old_value(self):
            return self._old_value
Summary of the New APIs
=======================
Python API
----------
1. A new ``contextvars`` module with ``ContextVar``, ``Context``,
and ``Token`` classes, and a ``copy_context()`` function.
2. ``asyncio.Loop.call_at()``, ``asyncio.Loop.call_later()``,
``asyncio.Loop.call_soon()``, and
``asyncio.Future.add_done_callback()`` run callback functions in
the context they were called in. A new *context* keyword-only
parameter can be used to specify a custom context.
3. ``asyncio.Task`` is modified internally to maintain its own
context.
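As a rough illustration of the *context* keyword-only parameter, the sketch below schedules the same callback twice: once with an explicit context and once without. The ``request_id`` variable and the ``record()`` callback are hypothetical; ``asyncio.run()`` and ``asyncio.get_running_loop()`` are assumed to be available (Python 3.7+).

```python
import asyncio
import contextvars

# Hypothetical variable used only for this illustration.
request_id = contextvars.ContextVar('request_id', default='unset')

def record(seen):
    # Runs in whichever context the callback was scheduled with.
    seen.append(request_id.get())

async def main():
    loop = asyncio.get_running_loop()
    seen = []

    # Prepare a custom context in which 'request_id' has a value.
    custom = contextvars.copy_context()
    custom.run(request_id.set, 'custom')

    # Explicit context: the callback sees 'custom'.
    loop.call_soon(record, seen, context=custom)
    # No context given: the callback uses a copy of the current context,
    # in which 'request_id' was never set.
    loop.call_soon(record, seen)

    # Yield to the event loop so that both callbacks get to run.
    await asyncio.sleep(0.01)
    return seen

seen = asyncio.run(main())
```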
C API C API
----- -----
1. ``PyContextVar * PyContextVar_New(char *name, PyObject *default)``: 1. ``PyContextVar * PyContextVar_New(char *name, PyObject *default)``:
create a ``ContextVar`` object. create a ``ContextVar`` object. The *default* argument can be
``NULL``, which means that the variable has no default value.
2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``: 2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``:
return ``-1`` if an error occurs during the lookup, ``0`` otherwise. return ``-1`` if an error occurs during the lookup, ``0`` otherwise.
@ -318,182 +571,6 @@ C API
if (PyContext_Exit(old_ctx)) goto error; if (PyContext_Exit(old_ctx)) goto error;
Implementation
==============
This section explains high-level implementation details in
pseudo-code. Some optimizations are omitted to keep this section
short and clear.
For the purposes of this section, we implement an immutable dictionary
using ``dict.copy()``::
class _ContextData:
def __init__(self):
self._mapping = dict()
def get(self, key):
return self._mapping[key]
def set(self, key, value):
copy = _ContextData()
copy._mapping = self._mapping.copy()
copy._mapping[key] = value
return copy
def delete(self, key):
copy = _ContextData()
copy._mapping = self._mapping.copy()
del copy._mapping[key]
return copy
Every OS thread has a reference to the current ``_ContextData``.
``PyThreadState`` is updated with a new ``context_data`` field that
points to a ``_ContextData`` object::
class PyThreadState:
context_data: _ContextData
``contextvars.copy_context()`` is implemented as follows::
def copy_context():
ts : PyThreadState = PyThreadState_Get()
if ts.context_data is None:
ts.context_data = _ContextData()
ctx = Context()
ctx._data = ts.context_data
return ctx
``contextvars.Context`` is a wrapper around ``_ContextData``::
class Context(collections.abc.Mapping):
def __init__(self):
self._data = _ContextData()
def run(self, callable, *args, **kwargs):
ts : PyThreadState = PyThreadState_Get()
saved_data : _ContextData = ts.context_data
try:
ts.context_data = self._data
return callable(*args, **kwargs)
finally:
self._data = ts.context_data
ts.context_data = saved_data
# Mapping API methods are implemented by delegating
# `get()` and other Mapping calls to `self._data`.
``contextvars.ContextVar`` interacts with
``PyThreadState.context_data`` directly::
class ContextVar:
def __init__(self, name, *, default=_NO_DEFAULT):
self._name = name
self._default = default
@property
def name(self):
return self._name
def get(self, default=_NO_DEFAULT):
ts : PyThreadState = PyThreadState_Get()
data : _ContextData = ts.context_data
try:
return data.get(self)
except KeyError:
pass
if default is not _NO_DEFAULT:
return default
if self._default is not _NO_DEFAULT:
return self._default
raise LookupError
def set(self, value):
ts : PyThreadState = PyThreadState_Get()
data : _ContextData = ts.context_data
try:
old_value = data.get(self)
except KeyError:
old_value = Token.MISSING
ts.context_data = data.set(self, value)
return Token(self, old_value)
def reset(self, token):
if token._used:
return
if token._old_value is Token.MISSING:
ts.context_data = data.delete(token._var)
else:
ts.context_data = data.set(token._var,
token._old_value)
token._used = True
class Token:
MISSING = object()
def __init__(self, var, old_value):
self._var = var
self._old_value = old_value
self._used = False
@property
def var(self):
return self._var
@property
def old_value(self):
return self._old_value
Implementation Notes
====================
* The internal immutable dictionary for ``Context`` is implemented
using Hash Array Mapped Tries (HAMT). They allow for O(log N)
``set`` operation, and for O(1) ``copy_context()`` function, where
*N* is the number of items in the dictionary. For a detailed
analysis of HAMT performance please refer to :pep:`550` [1]_.
* ``ContextVar.get()`` has an internal cache for the most recent
value, which allows to bypass a hash lookup. This is similar
to the optimization the ``decimal`` module implements to
retrieve its context from ``PyThreadState_GetDict()``.
See :pep:`550` which explains the implementation of the cache
in a great detail.
Summary of the New APIs
=======================
* A new ``contextvars`` module with ``ContextVar``, ``Context``,
and ``Token`` classes, and a ``copy_context()`` function.
* ``asyncio.Loop.call_at()``, ``asyncio.Loop.call_later()``,
``asyncio.Loop.call_soon()``, and
``asyncio.Future.add_done_callback()`` run callback functions in
the context they were called in. A new *context* keyword-only
parameter can be used to specify a custom context.
* ``asyncio.Task`` is modified internally to maintain its own
context.
Design Considerations Design Considerations
===================== =====================
@ -551,6 +628,52 @@ code unmodified, but will automatically enable support for
asynchronous code. asynchronous code.
Examples
========
Converting code that uses threading.local()
-------------------------------------------
A typical code fragment that uses ``threading.local()`` usually
looks like the following::
class PrecisionStorage(threading.local):
# Subclass threading.local to specify a default value.
value = 0.0
precision = PrecisionStorage()
# To set a new precision:
precision.value = 0.5
# To read the current precision:
print(precision.value)
Such code can be converted to use the ``contextvars`` module::
precision = contextvars.ContextVar('precision', default=0.0)
# To set a new precision:
precision.set(0.5)
# To read the current precision:
precision.get()
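Code that temporarily overrides a thread-local value and restores it afterwards can use the token returned by ``ContextVar.set()``. The following is a minimal sketch; the ``local_precision()`` helper is hypothetical and not part of the proposed API.

```python
import contextlib
import contextvars

precision = contextvars.ContextVar('precision', default=0.0)

@contextlib.contextmanager
def local_precision(value):
    # Hypothetical helper: set the precision for the duration of a block.
    token = precision.set(value)
    try:
        yield
    finally:
        # Restore the previous state, or remove the variable from the
        # context if it was not set before.
        precision.reset(token)

before = precision.get()
with local_precision(0.5):
    inside = precision.get()
after = precision.get()
```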
Offloading execution to other threads
-------------------------------------
It is possible to run code in a separate OS thread using a copy
of the current thread context::
executor = ThreadPoolExecutor()
current_context = contextvars.copy_context()
executor.submit(current_context.run, some_function)
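A self-contained version of the fragment above might look like the following sketch. The ``request_id`` variable and the body of ``some_function()`` are assumptions made for illustration.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical variable used only for this illustration.
request_id = contextvars.ContextVar('request_id', default=None)

def some_function():
    # Executed in a worker thread, but inside the copied context.
    return request_id.get()

request_id.set('abc-123')

with ThreadPoolExecutor(max_workers=1) as executor:
    # Snapshot the context of the submitting thread.
    current_context = contextvars.copy_context()
    future = executor.submit(current_context.run, some_function)
    result = future.result()
```

Because ``current_context`` is a copy, any ``set()`` calls made by ``some_function()`` stay in that copy and do not affect the submitting thread.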
Reference Implementation Reference Implementation
======================== ========================