PEP: 661
Title: Sentinel Values
Author: Tal Einat <tal@python.org>
Discussions-To: https://discuss.python.org/t/pep-661-sentinel-values/9126
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Jun-2021
Post-History: `20-May-2021 <https://discuss.python.org/t/sentinel-values-in-the-stdlib/8810>`__, `06-Jun-2021 <https://discuss.python.org/t/pep-661-sentinel-values/9126>`__


TL;DR: See the `Specification`_ and `Reference Implementation`_.


Abstract
========

Unique placeholder values, commonly known as "sentinel values", are common in
programming. They have many uses, such as for:

* Default values for function arguments, for when a value was not given::

    def foo(value=None):
        ...

* Return values from functions when something is not found or unavailable::

    >>> "abc".find("d")
    -1

* Missing data, such as NULL in relational databases or "N/A" ("not
  available") in spreadsheets

Python has the special value ``None``, which is intended to be used as such
a sentinel value in most cases. However, sometimes an alternative sentinel
value is needed, usually when it needs to be distinct from ``None`` since
``None`` is a valid value in that context. Such cases are common enough that
several idioms for implementing such sentinels have arisen over the years, but
uncommon enough that there hasn't been a clear need for standardization.
However, the common implementations, including some in the stdlib, suffer from
several significant drawbacks.

This PEP proposes adding a utility for defining sentinel values, to be used
in the stdlib and made publicly available as part of the stdlib.

Note: Changing all existing sentinels in the stdlib to be implemented this
way is not deemed necessary, and whether to do so is left to the discretion
of the maintainers.


Motivation
==========

In May 2021, a question was brought up on the python-dev mailing list
[1]_ about how to better implement a sentinel value for
``traceback.print_exception``. The existing implementation used the
following common idiom::

    _sentinel = object()

However, this object has an uninformative and overly verbose repr, causing the
function's signature to be overly long and hard to read::

    >>> help(traceback.print_exception)
    Help on function print_exception in module traceback:

    print_exception(exc, /, value=<object object at
    0x000002825DF09650>, tb=<object object at 0x000002825DF09650>,
    limit=None, file=None, chain=True)

Additionally, two other drawbacks of many existing sentinels were brought up
in the discussion:

1. Some do not have a distinct type, hence it is impossible to define clear
   type signatures for functions with such sentinels as default values.
2. They behave unexpectedly after being copied or unpickled, due to a separate
   instance being created and thus comparisons using ``is`` failing.

In the ensuing discussion, Victor Stinner supplied a list of currently used
sentinel values in the Python standard library [2]_. This showed that the
need for sentinels is fairly common, that there are various implementation
methods used even within the stdlib, and that many of these suffer from at
least one of the three drawbacks above.

The discussion did not lead to any clear consensus on whether a standard
implementation method is needed or desirable, whether the drawbacks mentioned
are significant, or which kind of implementation would be good. The author
of this PEP created an issue on bugs.python.org (now a GitHub issue [3]_)
suggesting options for improvement, but that focused on only a single
problematic aspect of a few cases, and failed to gather any support.

A poll [4]_ was created on discuss.python.org to get a clearer sense of
the community's opinions. After nearly two weeks, significant further
discussion, and 39 votes, the poll's results were not conclusive. 40% had
voted for "The status-quo is fine / there’s no need for consistency in
this", but most voters had voted for one or more standardized solutions.
Specifically, 37% of the voters chose "Consistent use of a new, dedicated
sentinel factory / class / meta-class, also made publicly available in the
stdlib".

With such mixed opinions, this PEP was created to facilitate making a decision
on the subject.

While working on this PEP, iterating on various options and implementations
and continuing discussions, the author has come to the opinion that a simple,
good implementation available in the standard library would be worth having,
both for use in the standard library itself and elsewhere.


Rationale
=========

The criteria guiding the chosen implementation were:

1. Sentinel objects should behave as expected of a sentinel: when compared
   using the ``is`` operator, each should always be considered identical to
   itself but never to any other object.
2. Creating a sentinel object should be a simple, straightforward one-liner.
3. It should be simple to define as many distinct sentinel values as needed.
4. The sentinel objects should have a clear and short repr.
5. It should be possible to use clear type signatures for sentinels.
6. The sentinel objects should behave correctly after copying and/or
   unpickling.
7. Such sentinels should work when using CPython 3.x and PyPy3, and ideally
   also with other implementations of Python.
8. The utility should be as simple and straightforward as possible, in
   implementation and especially in use, so that it does not become one more
   special thing to learn when learning Python. It should be easy to find and
   use when needed, and obvious enough when reading code that one would
   normally not feel a need to look up its documentation.

With so many uses in the Python standard library [2]_, it would be useful to
have an implementation in the standard library, since the stdlib cannot use
implementations of sentinel objects available elsewhere (such as the
``sentinels`` [5]_ or ``sentinel`` [6]_ PyPI packages).

After researching existing idioms and implementations, and going through many
different possible implementations, an implementation was written which meets
all of these criteria (see `Reference Implementation`_).


Specification
=============

A new ``Sentinel`` class will be added to a new ``sentinels`` module.
Its initializer will accept a single required argument, the name of the
sentinel object, and three optional arguments: the repr of the object, its
boolean value, and the name of its module::

    >>> from sentinels import Sentinel
    >>> NotGiven = Sentinel('NotGiven')
    >>> NotGiven
    <NotGiven>
    >>> MISSING = Sentinel('MISSING', repr='mymodule.MISSING')
    >>> MISSING
    mymodule.MISSING
    >>> MEGA = Sentinel('MEGA',
    ...                 repr='<MEGA>',
    ...                 bool_value=False,
    ...                 module_name='mymodule')
    >>> MEGA
    <MEGA>

Checking if a value is such a sentinel *should* be done using the ``is``
operator, as is recommended for ``None``. Equality checks using ``==`` will
also work as expected, returning ``True`` only when the object is compared
with itself. Identity checks such as ``if value is MISSING:`` should usually
be used rather than boolean checks such as ``if value:`` or ``if not value:``.

Sentinel instances are truthy by default, unlike ``None``. This parallels the
default for arbitrary classes, as well as the boolean value of ``Ellipsis``.
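
The difference between an identity check and a boolean check can be sketched
as follows. The ``Sentinel`` class here is a tiny stand-in written only for
this example (no registry, no copy support), not the proposed implementation,
and ``describe`` and ``describe_by_truthiness`` are hypothetical helpers:

```python
class Sentinel:
    """Tiny stand-in for the proposed class, for illustration only."""

    def __init__(self, name, repr=None, bool_value=True):
        self._repr = repr or f"<{name}>"
        self._bool_value = bool(bool_value)

    def __repr__(self):
        return self._repr

    def __bool__(self):
        return self._bool_value


# A falsy sentinel, to show how boolean checks become ambiguous.
MISSING = Sentinel("MISSING", bool_value=False)


def describe(value=MISSING):
    # Identity check: only the sentinel itself means "not given".
    if value is MISSING:
        return "not given"
    return f"given: {value!r}"


def describe_by_truthiness(value=MISSING):
    # Boolean check: conflates the sentinel with None, 0, "" and so on.
    if not value:
        return "not given"
    return f"given: {value!r}"


print(describe())                 # not given
print(describe(0))                # given: 0
print(describe_by_truthiness(0))  # not given
```

With the identity check, ``0`` and ``None`` are still reported as given
values; with the boolean check they are indistinguishable from the sentinel.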

The names of sentinels are unique within each module. When calling
``Sentinel()`` in a module where a sentinel with that name was already
defined, the existing sentinel with that name will be returned. Sentinels
with the same name in different modules will be distinct from each other.

Creating a copy of a sentinel object, such as by using ``copy.copy()`` or by
pickling and unpickling, will return the same object.
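
The per-module uniqueness and copy behavior described above can be exercised
with a compact sketch. The class below is an assumption-laden stand-in
modelled on the simplified reference implementation (it passes ``bool_value``
through ``__reduce__`` so that a copy is rebuilt with the same arguments), and
``othermodule`` is a made-up module name:

```python
import copy
import sys

_registry = {}


class Sentinel:
    """Compact stand-in for the proposed class, for illustration only."""

    def __new__(cls, name, repr=None, bool_value=True, module_name=None):
        name = str(name)
        if module_name is None:
            # Assumption: the calling frame's module identifies the sentinel.
            module_name = sys._getframe(1).f_globals.get("__name__", "__main__")
        key = f"{module_name}-{name}"
        existing = _registry.get(key)
        if existing is not None:
            return existing
        self = super().__new__(cls)
        self._name = name
        self._repr = str(repr) if repr else f"<{name.split('.')[-1]}>"
        self._bool_value = bool(bool_value)
        self._module_name = module_name
        return _registry.setdefault(key, self)

    def __repr__(self):
        return self._repr

    def __bool__(self):
        return self._bool_value

    def __reduce__(self):
        # Rebuild through the constructor so the registry returns this object.
        return (self.__class__,
                (self._name, self._repr, self._bool_value, self._module_name))


first = Sentinel("MISSING")
second = Sentinel("MISSING")                            # same module and name
other = Sentinel("MISSING", module_name="othermodule")  # hypothetical module

print(first is second)                   # True
print(first is other)                    # False
print(copy.copy(first) is first)         # True
print(copy.deepcopy(first) is first)     # True
```

Only the ``copy`` module is exercised here; ``pickle`` relies on the same
``__reduce__`` hook but additionally needs the class to be importable by name.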

Type annotations for sentinel values should use ``Literal[<sentinel_object>]``.
For example::

    def foo(value: int | Literal[MISSING] = MISSING) -> int:
        ...
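
As a rough runtime illustration of such an annotation, the sketch below uses a
tiny stand-in ``Sentinel`` class and spells the union with ``typing.Union`` so
that it also evaluates on Python versions without the ``|`` syntax. It only
shows that the annotation can be built and inspected at runtime; it assumes
nothing about how static type checkers treat it:

```python
from typing import Literal, Union, get_args


class Sentinel:
    """Tiny stand-in for the proposed class, for illustration only."""

    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return f"<{self._name}>"


MISSING = Sentinel("MISSING")


def foo(value: Union[int, Literal[MISSING]] = MISSING) -> int:
    # Fall back to zero when no value was given.
    return 0 if value is MISSING else value


# The Literal form keeps a reference to the sentinel object itself.
literal_args = get_args(Literal[MISSING])

print(foo())       # 0
print(foo(5))      # 5
```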

The ``module_name`` optional argument should normally not need to be supplied,
as ``Sentinel()`` will usually be able to recognize the module in which it was
called. ``module_name`` should be supplied only in unusual cases when this
automatic recognition does not work as intended, such as perhaps when using
Jython or IronPython. This parallels the designs of ``Enum`` and
``namedtuple``. For more details, see :pep:`435`.

The ``Sentinel`` class may not be sub-classed, to avoid overly-clever uses
based on it, such as attempts to use it as a base for implementing singletons.
It is considered important that the addition of ``Sentinel`` to the stdlib
should add minimal complexity.

Ordering comparisons are undefined for sentinel objects.


Backwards Compatibility
=======================

While not breaking existing code, adding a new "sentinels" stdlib module could
cause some confusion with regard to existing modules named "sentinels", and
specifically with the "sentinels" package on PyPI.

The existing "sentinels" package on PyPI [10]_ appears to be abandoned, with
the latest release having been made in August 2016. Therefore, using this name
for a new stdlib module seems reasonable.

If and when this PEP is accepted, it may be worth verifying whether the
package has indeed been abandoned, and if so asking to transfer ownership to
the CPython maintainers to reduce the potential for confusion with the new
stdlib module.


How to Teach This
=================

The normal types of documentation of new stdlib modules and features, namely
doc-strings, module docs and a section in "What's New", should suffice.


Security Implications
=====================

This proposal should have no security implications.


Reference Implementation
========================

The reference implementation is found in a dedicated GitHub repo [7]_. A
simplified version follows::

    import sys

    _registry = {}

    class Sentinel:
        """Unique sentinel values."""

        def __new__(cls, name, repr=None, bool_value=True, module_name=None):
            name = str(name)
            repr = str(repr) if repr else f'<{name.split(".")[-1]}>'
            bool_value = bool(bool_value)
            if module_name is None:
                # Infer the defining module from the caller's frame.
                try:
                    module_name = \
                        sys._getframe(1).f_globals.get('__name__', '__main__')
                except (AttributeError, ValueError):
                    module_name = __name__

            registry_key = f'{module_name}-{name}'

            # Reuse the sentinel already registered under this module and name.
            sentinel = _registry.get(registry_key, None)
            if sentinel is not None:
                return sentinel

            sentinel = super().__new__(cls)
            sentinel._name = name
            sentinel._repr = repr
            sentinel._bool_value = bool_value
            sentinel._module_name = module_name

            return _registry.setdefault(registry_key, sentinel)

        def __repr__(self):
            return self._repr

        def __bool__(self):
            return self._bool_value

        def __reduce__(self):
            # Pass every constructor argument, in order, so that copying and
            # unpickling look up the same registry entry.
            return (
                self.__class__,
                (
                    self._name,
                    self._repr,
                    self._bool_value,
                    self._module_name,
                ),
            )


Rejected Ideas
==============


Use ``NotGiven = object()``
---------------------------

This suffers from all of the drawbacks described in the `Motivation`_ section.


Add a single new sentinel value, such as ``MISSING`` or ``Sentinel``
--------------------------------------------------------------------

Since such a value could be used for various things in various places, one
could not always be confident that it would never be a valid value in some use
cases. On the other hand, a dedicated and distinct sentinel value can be used
with confidence without needing to consider potential edge-cases.

Additionally, it is useful to be able to provide a meaningful name and repr
for a sentinel value, specific to the context where it is used.

Finally, this was a very unpopular option in the poll [4]_, receiving only 12%
of the votes.


Use the existing ``Ellipsis`` sentinel value
--------------------------------------------

This is not the original intended use of ``Ellipsis``, though it has become
increasingly common to use it to define empty class or function blocks instead
of using ``pass``.

Also, similar to a potential new single sentinel value, ``Ellipsis`` can't be
as confidently used in all cases, unlike a dedicated, distinct value.


Use a single-valued enum
------------------------

The suggested idiom is::

    class NotGivenType(Enum):
        NotGiven = 'NotGiven'
    NotGiven = NotGivenType.NotGiven

Besides the excessive repetition, the repr is overly long:
``<NotGivenType.NotGiven: 'NotGiven'>``. A shorter repr can be defined, at
the expense of a bit more code and yet more repetition.

Finally, this option was the least popular among the nine options in the
poll [4]_, being the only option to receive no votes.
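
The trade-off between the default enum repr and a shorter custom one can be
seen in a small sketch. ``CompactNotGivenType`` is a hypothetical name
introduced here to show the extra code needed; it is not part of the suggested
idiom:

```python
from enum import Enum


class NotGivenType(Enum):
    NotGiven = 'NotGiven'


class CompactNotGivenType(Enum):
    NotGiven = 'NotGiven'

    def __repr__(self):
        # Extra method needed only to shorten the repr.
        return f'<{self.name}>'


NotGiven = NotGivenType.NotGiven
CompactNotGiven = CompactNotGivenType.NotGiven

print(repr(NotGiven))         # <NotGivenType.NotGiven: 'NotGiven'>
print(repr(CompactNotGiven))  # <NotGiven>
```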


A sentinel class decorator
--------------------------

The suggested idiom is::

    @sentinel(repr='<NotGiven>')
    class NotGivenType: pass
    NotGiven = NotGivenType()

While this allows for a very simple and clear implementation of the decorator,
the idiom is too verbose, repetitive, and difficult to remember.
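
To give a sense of how small such a decorator could be, here is one possible
sketch. The ``sentinel`` function below is hypothetical and written for this
example; the source does not specify how the decorator would be implemented:

```python
def sentinel(repr):
    """Hypothetical decorator: give instances of the class a fixed repr."""
    fixed_repr = str(repr)

    def decorate(cls):
        # Attach a __repr__ that always returns the configured text.
        cls.__repr__ = lambda self: fixed_repr
        return cls

    return decorate


@sentinel(repr='<NotGiven>')
class NotGivenType:
    pass


NotGiven = NotGivenType()
shown = repr(NotGiven)

print(shown)  # <NotGiven>
```

The decorator itself is short, but each sentinel still takes three statements
to define, which is the verbosity objected to above.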


Using class objects
-------------------

Since classes are inherently singletons, using a class as a sentinel value
makes sense and allows for a simple implementation.

The simplest version of this is::

    class NotGiven: pass

To have a clear repr, one would need to use a meta-class::

    class NotGiven(metaclass=SentinelMeta): pass

... or a class decorator::

    @Sentinel
    class NotGiven: pass

Using classes this way is unusual and could be confusing. The intention of
code would be hard to understand without comments. It would also cause
such sentinels to have some unexpected and undesirable behavior, such as
being callable.
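
The meta-class variant and the callability issue can be illustrated with a
short sketch. ``SentinelMeta`` is given a hypothetical minimal body here,
since the source names it without defining it:

```python
class SentinelMeta(type):
    """Hypothetical meta-class giving sentinel classes a short repr."""

    def __repr__(cls):
        return f'<{cls.__name__}>'


class NotGiven(metaclass=SentinelMeta):
    pass


print(repr(NotGiven))      # <NotGiven>
print(callable(NotGiven))  # True

# Nothing prevents accidental instantiation, and the resulting instance
# is a different object from the sentinel class itself.
instance = NotGiven()
print(instance is NotGiven)  # False
```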


Define a recommended "standard" idiom, without supplying an implementation
--------------------------------------------------------------------------

Most common existing idioms have significant drawbacks. So far, no idiom
has been found that is clear and concise while avoiding these drawbacks.

Also, in the poll [4]_ on this subject, the options for recommending an
idiom were unpopular, with the highest-voted option being voted for by only
25% of the voters.


Additional Notes
================

* This PEP and the initial implementation are drafted in a dedicated GitHub
  repo [7]_.

* For sentinels defined in a class scope, to avoid potential name clashes,
  one should use the fully-qualified name of the variable in the module. Only
  the part of the name after the last period will be used for the default
  repr. For example::

      >>> class MyClass:
      ...     NotGiven = Sentinel('MyClass.NotGiven')
      ...
      >>> MyClass.NotGiven
      <NotGiven>

* One should be careful when creating sentinels in a function or method, since
  sentinels with the same name created by code in the same module will be
  identical. If distinct sentinel objects are needed, make sure to use
  distinct names.

* There is no single desirable value for the "truthiness" of sentinels, i.e.
  their boolean value. It is sometimes useful for the boolean value to be
  ``True``, and sometimes ``False``. Of the built-in sentinels in Python,
  ``None`` evaluates to ``False``, while ``Ellipsis`` (a.k.a. ``...``)
  evaluates to ``True``. The desire for this to be set as needed came up in
  discussions as well.

* The boolean value of ``NotImplemented`` is ``True``, but using this is
  deprecated since Python 3.9 (doing so generates a deprecation warning).
  This deprecation is due to issues specific to ``NotImplemented``, as
  described in bpo-35712 [8]_.

* To define multiple, related sentinel values, possibly with a defined
  ordering among them, one should instead use ``Enum`` or something similar.

* There was a discussion on the typing-sig mailing list [9]_ about the typing
  for these sentinels, where different options were discussed.
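
The notes on class-scope and function-scope sentinels can be checked against
a compact sketch. As before, the ``Sentinel`` class is a stand-in modelled on
the simplified reference implementation, and ``make_marker`` is a hypothetical
helper:

```python
import sys

_registry = {}


class Sentinel:
    """Compact stand-in for the proposed class, for illustration only."""

    def __new__(cls, name, repr=None, bool_value=True, module_name=None):
        name = str(name)
        if module_name is None:
            # Assumption: the calling frame's module identifies the sentinel.
            module_name = sys._getframe(1).f_globals.get("__name__", "__main__")
        key = f"{module_name}-{name}"
        existing = _registry.get(key)
        if existing is not None:
            return existing
        self = super().__new__(cls)
        # Default repr uses only the part after the last period.
        self._repr = str(repr) if repr else f"<{name.split('.')[-1]}>"
        self._bool_value = bool(bool_value)
        return _registry.setdefault(key, self)

    def __repr__(self):
        return self._repr

    def __bool__(self):
        return self._bool_value


class MyClass:
    # The fully-qualified name avoids clashing with a module-level NotGiven.
    NotGiven = Sentinel("MyClass.NotGiven")


NotGiven = Sentinel("NotGiven")


def make_marker():
    # Same module and same name on every call: one shared object.
    return Sentinel("marker")


print(MyClass.NotGiven)               # <NotGiven>
print(MyClass.NotGiven is NotGiven)   # False
print(make_marker() is make_marker()) # True
```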


Open Issues
===========

* **Is adding a new stdlib module the right way to go?** I could not find any
  existing module which seems like a logical place for this. However, adding
  new stdlib modules should be done judiciously, so perhaps choosing an
  existing module would be preferable even if it is not a perfect fit?


Footnotes
=========

.. [1] Python-Dev mailing list: `The repr of a sentinel <https://mail.python.org/archives/list/python-dev@python.org/thread/ZLVPD2OISI7M4POMTR2FCQTE6TPMPTO3/>`_
.. [2] Python-Dev mailing list: `"The stdlib contains tons of sentinels" <https://mail.python.org/archives/list/python-dev@python.org/message/JBYXQH3NV3YBF7P2HLHB5CD6V3GVTY55/>`_
.. [3] `bpo-44123: Make function parameter sentinel values true singletons <https://github.com/python/cpython/issues/88289>`_
.. [4] discuss.python.org Poll: `Sentinel Values in the Stdlib <https://discuss.python.org/t/sentinel-values-in-the-stdlib/8810/>`_
.. [5] `The "sentinels" package on PyPI <https://pypi.org/project/sentinels/>`_
.. [6] `The "sentinel" package on PyPI <https://pypi.org/project/sentinel/>`_
.. [7] `Reference implementation at the taleinat/python-stdlib-sentinels GitHub repo <https://github.com/taleinat/python-stdlib-sentinels>`_
.. [8] `bpo-35712: Make NotImplemented unusable in boolean context <https://github.com/python/cpython/issues/79893>`_
.. [9] `Discussion thread about type signatures for these sentinels on the typing-sig mailing list <https://mail.python.org/archives/list/typing-sig@python.org/thread/NDEJ7UCDPINP634GXWDARVMTGDVSNBKV/#LVCPTY26JQJW7NKGKGAZXHQKWVW7GOGL>`_
.. [10] `sentinels package on PyPI <https://pypi.org/project/sentinels/>`_


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.