PEP 661: Update draft in anticipation of proposal (#2785)
This commit is contained in:
parent
6c72c58229
commit
7bad8d9f10
316
pep-0661.rst
|
@ -15,27 +15,43 @@ TL;DR: See the `Specification`_ and `Reference Implementation`_.
|
|||
Abstract
|
||||
========
|
||||
|
||||
Unique placeholder values, commonly known as "sentinel values", are useful in
|
||||
Python programs for several things, such as default values for function
|
||||
arguments where ``None`` is a valid input value. These cases are common
|
||||
enough for several idioms for implementing such "sentinels" to have arisen
|
||||
over the years, but uncommon enough that there hasn't been a clear need for
|
||||
standardization. However, the common implementations, including some in the
|
||||
stdlib, suffer from several significant drawbacks.
|
||||
Unique placeholder values, commonly known as "sentinel values", are common in
|
||||
programming. They have many uses, such as for:
|
||||
|
||||
This PEP suggests adding a utility for defining sentinel values, to be used
|
||||
* Default values for function arguments, for when a value was not given::
|
||||
|
||||
def foo(value=None):
|
||||
...
|
||||
|
||||
* Return values from functions when something is not found or unavailable::
|
||||
|
||||
>>> "abc".find("d")
|
||||
-1
|
||||
|
||||
* Missing data, such as NULL in relational databases or "N/A" ("not
|
||||
available") in spreadsheets
|
||||
|
||||
Python has the special value ``None``, which is intended to be used as such
|
||||
a sentinel value in most cases. However, sometimes an alternative sentinel
|
||||
value is needed, usually when it needs to be distinct from ``None``. These
|
||||
cases are common enough that several idioms for implementing such sentinels
|
||||
have arisen over the years, but uncommon enough that there hasn't been a
|
||||
clear need for standardization. However, the common implementations,
|
||||
including some in the stdlib, suffer from several significant drawbacks.
|
||||
|
||||
This PEP proposes adding a utility for defining sentinel values, to be used
|
||||
in the stdlib itself and made publicly available as part of it.
|
||||
|
||||
Note: Changing all existing sentinels in the stdlib to be implemented this
|
||||
way is not deemed necessary, and whether to do so is left to the discretion
|
||||
of each maintainer.
|
||||
of the maintainers.
|
||||
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
In May 2021, a question was brought up on the `python-dev mailing list
|
||||
<python-dev-thread_>`__ about how to better implement a sentinel value for
|
||||
In May 2021, a question was brought up on the python-dev mailing list
|
||||
[1]_ about how to better implement a sentinel value for
|
||||
``traceback.print_exception``. The existing implementation used the
|
||||
following common idiom::
|
||||
|
||||
|
@ -54,22 +70,25 @@ function's signature to be overly long and hard to read::
|
|||
Additionally, two other drawbacks of many existing sentinels were brought up
|
||||
in the discussion:
|
||||
|
||||
1. Not having a distinct type, hence it being impossible to define strict
|
||||
type signatures functions with sentinels as default values
|
||||
1. Not having a distinct type, hence it being impossible to define clear
|
||||
type signatures for functions with sentinels as default values
|
||||
2. Incorrect behavior after being copied or unpickled, due to a separate
|
||||
instance being created and thus comparisons using ``is`` failing
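The second drawback can be reproduced with a bare ``object()`` sentinel. This is a minimal sketch; the name ``_sentinel`` is chosen only for illustration and does not refer to any specific stdlib sentinel:

```python
import copy
import pickle

# A bare object() sentinel, used here only for illustration.
_sentinel = object()

# Copying or pickling+unpickling creates a separate object instance...
copied = copy.deepcopy(_sentinel)
unpickled = pickle.loads(pickle.dumps(_sentinel))

# ...so identity checks against the original sentinel no longer succeed.
copy_is_same = copied is _sentinel
unpickled_is_same = unpickled is _sentinel
```

Both ``copy_is_same`` and ``unpickled_is_same`` end up ``False``, which is why code relying on ``value is _sentinel`` can misbehave once such a value has crossed a copy or pickle boundary.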
|
||||
|
||||
In the ensuing discussion, Victor Stinner supplied a list of currently used
|
||||
`sentinel values in the Python standard library <list-of-sentinels-in-stdlib_>`__.
|
||||
This showed that the need for sentinels is fairly common, that there are
|
||||
various implementation methods used even within the stdlib, and that many of
|
||||
these suffer from at least one of the aforementioned drawbacks.
|
||||
sentinel values in the Python standard library [2]_. This showed that the
|
||||
need for sentinels is fairly common, that there are various implementation
|
||||
methods used even within the stdlib, and that many of these suffer from at
|
||||
least one of the three aforementioned drawbacks.
|
||||
|
||||
The discussion did not lead to any clear consensus on whether a standard
|
||||
implementation method is needed or desirable, whether the drawbacks mentioned
|
||||
are significant, nor which kind of implementation would be good.
|
||||
are significant, nor which kind of implementation would be good. The author
|
||||
of this PEP created an issue on bugs.python.org [3]_ suggesting options for
|
||||
improvement, but that focused on only a single problematic aspect of a few
|
||||
cases, and failed to gather any support.
|
||||
|
||||
A `poll was created on discuss.python.org <poll_>`__ to get a clearer sense of
|
||||
A poll [4]_ was created on discuss.python.org to get a clearer sense of
|
||||
the community's opinions. The poll's results were not conclusive, with 40%
|
||||
voting for "The status-quo is fine / there’s no need for consistency in
|
||||
this", but most voters voting for one or more standardized solutions.
|
||||
|
@ -80,6 +99,11 @@ stdlib".
|
|||
With such mixed opinions, this PEP was created to facilitate making a decision
|
||||
on the subject.
|
||||
|
||||
While working on this PEP, iterating on various options and implementations
|
||||
and continuing discussions, the author has come to the opinion that a simple,
|
||||
good implementation available in the standard library would be worth having,
|
||||
both for use in the standard library itself and elsewhere.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
@ -87,17 +111,26 @@ Rationale
|
|||
The criteria guiding the chosen implementation were:
|
||||
|
||||
1. The sentinel objects should behave as expected by a sentinel object: When
|
||||
compared using the ``is`` operator, it should always be considered identical
|
||||
to itself but never to any other object.
|
||||
2. It should be simple to define as many distinct sentinel values as needed.
|
||||
3. The sentinel objects should have a clear and short repr.
|
||||
4. The sentinel objects should each have a *distinct* type, usable in type
|
||||
annotations to define *strict* type signatures.
|
||||
5. The sentinel objects should behave correctly after copying and/or
|
||||
compared using the ``is`` operator, it should always be considered
|
||||
identical to itself but never to any other object.
|
||||
2. Creating a sentinel object should be a simple, straightforward one-liner.
|
||||
3. It should be simple to define as many distinct sentinel values as needed.
|
||||
4. The sentinel objects should have a clear and short repr.
|
||||
5. It should be possible to use clear type signatures for sentinels.
|
||||
6. The sentinel objects should behave correctly after copying and/or
|
||||
unpickling.
|
||||
6. Creating a sentinel object should be a simple, straightforward one-liner.
|
||||
7. Works using CPython and PyPy3. Will hopefully also work with other
|
||||
implementations.
|
||||
7. Such sentinels should work when using CPython 3.x and PyPy3, and ideally
|
||||
also with other implementations of Python.
|
||||
8. As simple and straightforward as possible, in implementation and especially
|
||||
in use. Avoid this becoming one more special thing to learn when learning
|
||||
Python. It should be easy to find and use when needed, and obvious enough
|
||||
when reading code that one would normally not feel a need to look up its
|
||||
documentation.
|
||||
|
||||
With so many uses in the Python standard library [2]_, it would be useful to
|
||||
have an implementation in the standard library, since the stdlib cannot use
|
||||
implementations of sentinel objects available elsewhere (such as the
|
||||
``sentinels`` [5]_ or ``sentinel`` [6]_ PyPI packages).
|
||||
|
||||
After researching existing idioms and implementations, and going through many
|
||||
different possible implementations, an implementation was written which meets
|
||||
|
@ -107,79 +140,99 @@ all of these criteria (see `Reference Implementation`_).
|
|||
Specification
|
||||
=============
|
||||
|
||||
A new ``sentinel`` function will be added to a new ``sentinels`` module.
|
||||
It will accept a single required argument, the name of the sentinel object,
|
||||
and a single optional argument, the repr of the object.
|
||||
A new ``Sentinel`` class will be added to a new ``sentinels`` module.
|
||||
Its initializer will accept a single required argument, the name of the
|
||||
sentinel object, and two optional arguments: the repr of the object, and the
|
||||
name of its module::
|
||||
|
||||
::
|
||||
|
||||
>>> NotGiven = sentinel('NotGiven')
|
||||
>>> from sentinels import Sentinel
|
||||
>>> NotGiven = Sentinel('NotGiven')
|
||||
>>> NotGiven
|
||||
<NotGiven>
|
||||
>>> MISSING = sentinel('MISSING', repr='mymodule.MISSING')
|
||||
>>> MISSING = Sentinel('MISSING', repr='mymodule.MISSING')
|
||||
>>> MISSING
|
||||
mymodule.MISSING
|
||||
>>> MEGA = Sentinel('MEGA', repr='<MEGA>', module_name='mymodule')
>>> MEGA
|
||||
<MEGA>
|
||||
|
||||
Checking if a value is such a sentinel *should* be done using the ``is``
|
||||
operator, as is recommended for ``None``. Equality checks using ``==`` will
|
||||
also work as expected, returning ``True`` only when the object is compared
|
||||
with itself.
|
||||
with itself. Identity checks such as ``if value is MISSING:`` should usually
|
||||
be used rather than boolean checks such as ``if value:`` or ``if not value:``.
|
||||
Sentinel instances are truthy by default.
|
||||
|
||||
The name should be set to the name of the variable used to reference the
|
||||
object, as in the examples above. Otherwise, the sentinel object won't be
|
||||
able to survive copying or pickling+unpickling while retaining the above
|
||||
described behavior. Note, that when defined in a class scope, the name must
|
||||
be the fully-qualified name of the variable in the module, for example::
|
||||
The names of sentinels are unique within each module. When calling
|
||||
``Sentinel()`` in a module where a sentinel with that name was already
|
||||
defined, the existing sentinel with that name will be returned. Sentinels
|
||||
with the same name in different modules will be distinct from each other.
|
||||
|
||||
class MyClass:
|
||||
NotGiven = sentinel('MyClass.NotGiven')
|
||||
Creating a copy of a sentinel object, such as by using ``copy.copy()`` or by
|
||||
pickling and unpickling, will return the same object.
|
||||
|
||||
Type annotations for sentinel values will use `typing.Literal`_.
|
||||
For example::
|
||||
Type annotations for sentinel values should use ``Sentinel``. For example::
|
||||
|
||||
def foo(value: int | Literal[NotGiven]) -> None:
|
||||
def foo(value: int | Sentinel = MISSING) -> int:
|
||||
...
|
||||
|
||||
.. _typing.Literal: https://docs.python.org/3/library/typing.html#typing.Literal
|
||||
The ``module_name`` optional argument should normally not need to be supplied,
|
||||
as ``Sentinel()`` will usually be able to recognize the module in which it was
|
||||
called. ``module_name`` should be supplied only in unusual cases when this
|
||||
automatic recognition does not work as intended, such as perhaps when using
|
||||
Jython or IronPython. This parallels the designs of ``Enum`` and
|
||||
``namedtuple``. For more details, see :pep:`435`.
|
||||
|
||||
The ``Sentinel`` class may be sub-classed. Instances of each sub-class will
|
||||
be unique, even if using the same name and module. This allows for
|
||||
customizing the behavior of sentinels, such as controlling their truthiness.
|
||||
|
||||
|
||||
Reference Implementation
|
||||
========================
|
||||
|
||||
The reference implementation is found in a `dedicated GitHub repo
|
||||
<reference-github-repo_>`__. A simplified version follows::
|
||||
The reference implementation is found in a dedicated GitHub repo [7]_. A
|
||||
simplified version follows::
|
||||
|
||||
def sentinel(name, repr=None):
|
||||
"""Create a unique sentinel object."""
|
||||
repr = repr or f'<{name}>'
|
||||
_registry = {}
|
||||
|
||||
module = _get_parent_frame().f_globals.get('__name__', '__main__')
|
||||
class_name = _get_class_name(name, module)
|
||||
class_namespace = {
|
||||
'__repr__': lambda self: repr,
|
||||
}
|
||||
cls = type(class_name, (), class_namespace)
|
||||
cls.__module__ = module
|
||||
_get_parent_frame().f_globals[class_name] = cls
|
||||
class Sentinel:
|
||||
"""Unique sentinel values."""
|
||||
|
||||
sentinel = cls()
|
||||
cls.__new__ = lambda cls_: sentinel
|
||||
def __new__(cls, name, repr=None, module_name=None):
|
||||
name = str(name)
|
||||
repr = str(repr) if repr else f'<{name.split(".")[-1]}>'
|
||||
if module_name is None:
|
||||
try:
|
||||
module_name = \
|
||||
sys._getframe(1).f_globals.get('__name__', '__main__')
|
||||
except (AttributeError, ValueError):
|
||||
module_name = __name__
|
||||
|
||||
registry_key = f'{module_name}-{name}'
|
||||
|
||||
sentinel = _registry.get(registry_key, None)
|
||||
if sentinel is not None:
|
||||
return sentinel
|
||||
|
||||
def _get_class_name(sentinel_qualname, module_name):
|
||||
return '__'.join(['_sentinel_type',
|
||||
module_name.replace('.', '_'),
|
||||
sentinel_qualname.replace('.', '_')])
|
||||
sentinel = super().__new__(cls)
|
||||
sentinel._name = name
|
||||
sentinel._repr = repr
|
||||
sentinel._module_name = module_name
|
||||
|
||||
return _registry.setdefault(registry_key, sentinel)
|
||||
|
||||
Note that a dedicated class is created automatically for each sentinel object.
|
||||
This class is assigned to the namespace of the module from which the
|
||||
``sentinel()`` call was made, or to that of the ``sentinels`` module itself as
|
||||
a fallback. These classes have long names comprised of several parts to
|
||||
ensure their uniqueness. However, these names usually wouldn't be used, since
|
||||
type annotations should use ``Literal[]`` as described above, and identity
|
||||
checks should be preferred over type checks.
|
||||
def __repr__(self):
|
||||
return self._repr
|
||||
|
||||
def __reduce__(self):
|
||||
return (
|
||||
self.__class__,
|
||||
(
|
||||
self._name,
|
||||
self._repr,
|
||||
self._module_name,
|
||||
),
|
||||
)
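For readers who want to try the behavior described in the Specification, the new-side lines of the simplified listing can be assembled into one runnable module, roughly as below. This is a sketch, not the reference implementation itself: the ``import`` lines and the usage at the bottom are additions, and ``'some.other.module'`` is a made-up module name used only to show that equal names in different modules give distinct objects.

```python
import copy
import sys

_registry = {}


class Sentinel:
    """Unique sentinel values (sketch assembled from the simplified listing)."""

    def __new__(cls, name, repr=None, module_name=None):
        name = str(name)
        # The default repr uses only the part of the name after the last period.
        repr = str(repr) if repr else f'<{name.split(".")[-1]}>'
        if module_name is None:
            try:
                # Look up the module of the caller, one frame up.
                module_name = sys._getframe(1).f_globals.get('__name__', '__main__')
            except (AttributeError, ValueError):
                module_name = __name__

        registry_key = f'{module_name}-{name}'
        sentinel = _registry.get(registry_key, None)
        if sentinel is not None:
            return sentinel

        sentinel = super().__new__(cls)
        sentinel._name = name
        sentinel._repr = repr
        sentinel._module_name = module_name
        return _registry.setdefault(registry_key, sentinel)

    def __repr__(self):
        return self._repr

    def __reduce__(self):
        # Re-calling the class with the same arguments finds the registry entry,
        # so copying yields the original object.
        return (self.__class__, (self._name, self._repr, self._module_name))


MISSING = Sentinel('MISSING')
same_again = Sentinel('MISSING')  # same module, same name
other = Sentinel('MISSING', module_name='some.other.module')  # hypothetical module
nested = Sentinel('MyClass.NotGiven')  # default repr drops the 'MyClass.' prefix
copied = copy.deepcopy(MISSING)
```

With this sketch, ``same_again`` and ``copied`` are the very same object as ``MISSING``, while ``other`` is a distinct sentinel. Note that the registry key in this simplified version does not include the class, so it does not by itself demonstrate the per-subclass uniqueness mentioned in the Specification.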
|
||||
|
||||
|
||||
Rejected Ideas
|
||||
|
@ -192,8 +245,8 @@ Use ``NotGiven = object()``
|
|||
This suffers from all of the drawbacks mentioned in the `Rationale`_ section.
|
||||
|
||||
|
||||
Add a single new sentinel value, e.g. ``MISSING`` or ``Sentinel``
|
||||
-----------------------------------------------------------------
|
||||
Add a single new sentinel value, such as ``MISSING`` or ``Sentinel``
|
||||
--------------------------------------------------------------------
|
||||
|
||||
Since such a value could be used for various things in various places, one
|
||||
could not always be confident that it would never be a valid value in some use
|
||||
|
@ -203,7 +256,7 @@ with confidence without needing to consider potential edge-cases.
|
|||
Additionally, it is useful to be able to provide a meaningful name and repr
|
||||
for a sentinel value, specific to the context where it is used.
|
||||
|
||||
Finally, this was a very unpopular option in the `poll <poll_>`__, with only 12%
|
||||
Finally, this was a very unpopular option in the poll [4]_, with only 12%
|
||||
of voters choosing it.
|
||||
|
||||
|
||||
|
@ -221,9 +274,7 @@ as confidently used in all cases, unlike a dedicated, distinct value.
|
|||
Use a single-valued enum
|
||||
------------------------
|
||||
|
||||
The suggested idiom is:
|
||||
|
||||
::
|
||||
The suggested idiom is::
|
||||
|
||||
class NotGivenType(Enum):
|
||||
NotGiven = 'NotGiven'
|
||||
|
@ -233,23 +284,21 @@ Besides the excessive repetition, the repr is overly long:
|
|||
``<NotGivenType.NotGiven: 'NotGiven'>``. A shorter repr can be defined, at
|
||||
the expense of a bit more code and yet more repetition.
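To make that trade-off concrete, the following sketch shows one way the shorter repr could be defined on the enum idiom. The ``__repr__`` override and the module-level alias are illustrative additions, not part of the idiom as quoted above:

```python
import copy
from enum import Enum


class NotGivenType(Enum):
    NotGiven = 'NotGiven'

    def __repr__(self):
        # Replaces the default "<NotGivenType.NotGiven: 'NotGiven'>".
        return '<NotGiven>'


# Module-level alias, so callers can write `value is NotGiven`.
NotGiven = NotGivenType.NotGiven

# Enum members are singletons, so a copy is the same object.
copied = copy.copy(NotGiven)
```

The name ``NotGiven`` now appears four times in the definition, which illustrates the repetition this section objects to.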
|
||||
|
||||
Finally, this option was the least popular among the nine options in the `poll
|
||||
<poll_>`__, being the only option to receive no votes.
|
||||
Finally, this option was the least popular among the nine options in the
|
||||
poll [4]_, being the only option to receive no votes.
|
||||
|
||||
|
||||
A sentinel class decorator
|
||||
--------------------------
|
||||
|
||||
The suggested interface:
|
||||
|
||||
::
|
||||
The suggested idiom is::
|
||||
|
||||
@sentinel(repr='<NotGiven>')
|
||||
class NotGivenType: pass
|
||||
NotGiven = NotGivenType()
|
||||
|
||||
While this allowed for a very simple and clear implementation, the interface
|
||||
is too verbose, repetitive, and difficult to remember.
|
||||
While this allows for a very simple and clear implementation of the decorator,
|
||||
the idiom is too verbose, repetitive, and difficult to remember.
|
||||
|
||||
|
||||
Using class objects
|
||||
|
@ -258,33 +307,23 @@ Using class objects
|
|||
Since classes are inherently singletons, using a class as a sentinel value
|
||||
makes sense and allows for a simple implementation.
|
||||
|
||||
The simplest version of this idiom is:
|
||||
|
||||
::
|
||||
The simplest version of this is::
|
||||
|
||||
class NotGiven: pass
|
||||
|
||||
To have a clear repr, one could define ``__repr__``:
|
||||
|
||||
::
|
||||
|
||||
class NotGiven:
|
||||
def __repr__(self):
|
||||
return '<NotGiven>'
|
||||
|
||||
... or use a meta-class:
|
||||
|
||||
::
|
||||
To have a clear repr, one would need to use a meta-class::
|
||||
|
||||
class NotGiven(metaclass=SentinelMeta): pass
|
||||
|
||||
However, all such implementations don't have a dedicated type for the
|
||||
sentinel, which is considered desirable for strict typing. A dedicated type
|
||||
could be created by a meta-class or class decorator, but at that point the
|
||||
implementation would become much more complex and loses its advantages over
|
||||
the chosen implementation.
|
||||
... or a class decorator::
|
||||
|
||||
Additionally, using classes this way is unusual and could be confusing.
|
||||
@Sentinel
|
||||
class NotGiven: pass
|
||||
|
||||
Using classes this way is unusual and could be confusing. The intention of
|
||||
code would be hard to understand without comments. It would also cause
|
||||
such sentinels to have some unexpected and undesirable behavior, such as
|
||||
being callable.
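A small sketch of the meta-class variant illustrates both the short repr and the callable behavior. ``SentinelMeta`` is not defined anywhere in this PEP, so the definition below is a hypothetical one written only for this example:

```python
class SentinelMeta(type):
    """Hypothetical metaclass that gives sentinel classes a short repr."""

    def __repr__(cls):
        return f'<{cls.__name__}>'


class NotGiven(metaclass=SentinelMeta):
    pass


def foo(value=NotGiven):
    # The class object itself is the sentinel value.
    if value is NotGiven:
        return 'no value given'
    return value


# Because the sentinel is a class, it can be called and instantiated,
# which is rarely what a reader of the code would expect.
instance = NotGiven()
```

Here ``repr(NotGiven)`` is short and clear, but ``NotGiven()`` silently produces an unrelated instance that is not the sentinel.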
|
||||
|
||||
|
||||
Define a recommended "standard" idiom, without supplying an implementation
|
||||
|
@ -293,38 +332,65 @@ Define a recommended "standard" idiom, without supplying an implementation
|
|||
Most common existing idioms have significant drawbacks. So far, no idiom
|
||||
has been found that is clear and concise while avoiding these drawbacks.
|
||||
|
||||
Also, in the `poll on this subject <poll_>`__, the options for recommending an
|
||||
Also, in the poll [4]_ on this subject, the options for recommending an
|
||||
idiom were unpopular, with the highest-voted option being voted for by only
|
||||
25% of the voters.
|
||||
|
||||
|
||||
Specific type signatures for each sentinel value
|
||||
------------------------------------------------
|
||||
|
||||
For a long time, the author of this PEP strove to have type signatures for
|
||||
such sentinels that were specific to each value. A leading proposal
|
||||
(supported by Guido and others) was to expand the use of ``Literal``, e.g.
|
||||
``Literal[MISSING]``. After much thought and discussion, especially on the
|
||||
typing-sig mailing list [8]_, it seems that all such solutions would require
|
||||
special-casing and/or added complexity in the implementations of static type
|
||||
checkers, while also constraining the implementation of sentinels.
|
||||
|
||||
Therefore, this PEP no longer proposes such signatures. Instead, this PEP
|
||||
suggests using ``Sentinel`` as the type signature for sentinel values.
|
||||
|
||||
It is somewhat unfortunate that static type checkers will sometimes not be
|
||||
able to deduce more specific types due to this, such as inside a conditional
|
||||
block like ``if value is not MISSING: ...``. However, this is a minor issue
|
||||
in practice, as type checkers can be easily made to understand these cases
|
||||
using ``typing.cast()``.
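As a rough illustration of the ``typing.cast()`` approach, consider the sketch below. Since the ``sentinels`` module is only proposed, the ``Sentinel`` class here is a minimal stand-in written for this example, and ``double`` is a made-up function:

```python
from typing import Union, cast


class Sentinel:
    """Minimal stand-in for the proposed class (assumption for this example)."""

    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return f'<{self._name}>'


MISSING = Sentinel('MISSING')


def double(value: Union[int, Sentinel] = MISSING) -> int:
    if value is MISSING:
        return 0
    # A static checker may still see `int | Sentinel` at this point;
    # cast() tells it the value is an int. It has no effect at runtime.
    number = cast(int, value)
    return number * 2
```

The cast costs one extra line in each branch that needs the narrowed type, which is the minor inconvenience accepted in exchange for a simpler sentinel implementation.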
|
||||
|
||||
|
||||
Additional Notes
|
||||
================
|
||||
|
||||
* This PEP and the initial implementation are drafted in a `dedicated GitHub
|
||||
repo <reference-github-repo_>`__.
|
||||
* This PEP and the initial implementation are drafted in a dedicated GitHub
|
||||
repo [7]_.
|
||||
|
||||
* The support for copying/unpickling works when defined in a module's scope or
|
||||
a (possibly nested) class's scope. Note that in the latter case, the name
|
||||
provided as the first parameter must be the fully-qualified name of the
|
||||
variable in the module::
|
||||
* For sentinels defined in a class scope, to avoid potential name clashes,
|
||||
one should use the fully-qualified name of the variable in the module. Only
|
||||
the part of the name after the last period will be used for the default
|
||||
repr. For example::
|
||||
|
||||
class MyClass:
|
||||
NotGiven = sentinel('MyClass.NotGiven', repr='<NotGiven>')
|
||||
>>> class MyClass:
|
||||
...     NotGiven = Sentinel('MyClass.NotGiven')
|
||||
>>> MyClass.NotGiven
|
||||
<NotGiven>
|
||||
|
||||
* One should be careful when creating sentinels in a function or method, since
|
||||
sentinels with the same name created by code in the same module will be
|
||||
identical. If distinct sentinel objects are needed, make sure to use
|
||||
distinct names.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. _python-dev-thread: https://mail.python.org/archives/list/python-dev@python.org/thread/ZLVPD2OISI7M4POMTR2FCQTE6TPMPTO3/
|
||||
.. _list-of-sentinels-in-stdlib: https://mail.python.org/archives/list/python-dev@python.org/message/JBYXQH3NV3YBF7P2HLHB5CD6V3GVTY55/
|
||||
.. _poll: https://discuss.python.org/t/sentinel-values-in-the-stdlib/8810/
|
||||
.. _reference-github-repo: https://github.com/taleinat/python-stdlib-sentinels
|
||||
|
||||
* `bpo-44123: Make function parameter sentinel values true singletons <https://bugs.python.org/issue44123>`_
|
||||
* `The "sentinels" package on PyPI <https://pypi.org/project/sentinels/>`_
|
||||
* `The "sentinel" package on PyPI <https://pypi.org/project/sentinel/>`_
|
||||
* `Discussion thread about type signatures for these sentinels on the typing-sig mailing list <https://mail.python.org/archives/list/typing-sig@python.org/thread/NDEJ7UCDPINP634GXWDARVMTGDVSNBKV/#LVCPTY26JQJW7NKGKGAZXHQKWVW7GOGL>`_
|
||||
.. [1] Python-Dev mailing list: `The repr of a sentinel <https://mail.python.org/archives/list/python-dev@python.org/thread/ZLVPD2OISI7M4POMTR2FCQTE6TPMPTO3/>`_
|
||||
.. [2] Python-Dev mailing list: `"The stdlib contains tons of sentinels" <https://mail.python.org/archives/list/python-dev@python.org/message/JBYXQH3NV3YBF7P2HLHB5CD6V3GVTY55/>`_
|
||||
.. [3] `bpo-44123: Make function parameter sentinel values true singletons <https://bugs.python.org/issue44123>`_
|
||||
.. [4] discuss.python.org Poll: `Sentinel Values in the Stdlib <https://discuss.python.org/t/sentinel-values-in-the-stdlib/8810/>`_
|
||||
.. [5] `The "sentinels" package on PyPI <https://pypi.org/project/sentinels/>`_
|
||||
.. [6] `The "sentinel" package on PyPI <https://pypi.org/project/sentinel/>`_
|
||||
.. [7] `Reference implementation at the taleinat/python-stdlib-sentinels GitHub repo <https://github.com/taleinat/python-stdlib-sentinels>`_
|
||||
.. [8] `Discussion thread about type signatures for these sentinels on the typing-sig mailing list <https://mail.python.org/archives/list/typing-sig@python.org/thread/NDEJ7UCDPINP634GXWDARVMTGDVSNBKV/#LVCPTY26JQJW7NKGKGAZXHQKWVW7GOGL>`_
|
||||
|
||||
|
||||
Copyright
|
||||
|
|