diff --git a/pep-0661.rst b/pep-0661.rst index 3cc1d2476..5f428f43c 100644 --- a/pep-0661.rst +++ b/pep-0661.rst @@ -15,27 +15,43 @@ TL;DR: See the `Specification`_ and `Reference Implementation`_. Abstract ======== -Unique placeholder values, commonly known as "sentinel values", are useful in -Python programs for several things, such as default values for function -arguments where ``None`` is a valid input value. These cases are common -enough for several idioms for implementing such "sentinels" to have arisen -over the years, but uncommon enough that there hasn't been a clear need for -standardization. However, the common implementations, including some in the -stdlib, suffer from several significant drawbacks. +Unique placeholder values, commonly known as "sentinel values", are common in +programming. They have many uses, such as for: -This PEP suggests adding a utility for defining sentinel values, to be used +* Default values for function arguments, for when a value was not given:: + + def foo(value=None): + ... + +* Return values from functions when something is not found or unavailable:: + + >>> "abc".find("d") + -1 + +* Missing data, such as NULL in relational databases or "N/A" ("not + available") in spreadsheets + +Python has the special value ``None``, which is intended to be used as such +a sentinel value in most cases. However, sometimes an alternative sentinel +value is needed, usually when it needs to be distinct from ``None``. These +cases are common enough that several idioms for implementing such sentinels +have arisen over the years, but uncommon enough that there hasn't been a +clear need for standardization. However, the common implementations, +including some in the stdlib, suffer from several significant drawbacks. + +This PEP proposes adding a utility for defining sentinel values, to be used in the stdlib and made publicly available as part of the stdlib. Note: Changing all existing sentinels in the stdlib to be implemented this way is not deemed necessary, and whether to do so is left to the discretion -of each maintainer. +of the maintainers. Motivation ========== -In May 2021, a question was brought up on the `python-dev mailing list -`__ about how to better implement a sentinel value for +In May 2021, a question was brought up on the python-dev mailing list +[1]_ about how to better implement a sentinel value for ``traceback.print_exception``. The existing implementation used the following common idiom:: @@ -54,22 +70,25 @@ function's signature to be overly long and hard to read:: Additionally, two other drawbacks of many existing sentinels were brought up in the discussion: -1. Not having a distinct type, hence it being impossible to define strict - type signatures functions with sentinels as default values +1. Not having a distinct type, hence it being impossible to define clear + type signatures for functions with sentinels as default values 2. Incorrect behavior after being copied or unpickled, due to a separate instance being created and thus comparisons using ``is`` failing In the ensuing discussion, Victor Stinner supplied a list of currently used -`sentinel values in the Python standard library `__. -This showed that the need for sentinels is fairly common, that there are -various implementation methods used even within the stdlib, and that many of -these suffer from at least one of the aforementioned drawbacks. +sentinel values in the Python standard library [2]_. This showed that the +need for sentinels is fairly common, that there are various implementation +methods used even within the stdlib, and that many of these suffer from at +least one of the three aforementioned drawbacks. The discussion did not lead to any clear consensus on whether a standard implementation method is needed or desirable, whether the drawbacks mentioned -are significant, nor which kind of implementation would be good. +are significant, nor which kind of implementation would be good. The author +of this PEP created an issue on bugs.python.org [3]_ suggesting options for +improvement, but that focused on only a single problematic aspect of a few +cases, and failed to gather any support. -A `poll was created on discuss.python.org `__ to get a clearer sense of +A poll [4]_ was created on discuss.python.org to get a clearer sense of the community's opinions. The poll's results were not conclusive, with 40% voting for "The status-quo is fine / there’s no need for consistency in this", but most voters voting for one or more standardized solutions. @@ -80,6 +99,11 @@ stdlib". With such mixed opinions, this PEP was created to facilitate making a decision on the subject. +While working on this PEP, iterating on various options and implementations +and continuing discussions, the author has come to the opinion that a simple, +good implementation available in the standard library would be worth having, +both for use in the standard library itself and elsewhere. + Rationale ========= @@ -87,17 +111,26 @@ Rationale The criteria guiding the chosen implementation were: 1. The sentinel objects should behave as expected by a sentinel object: When - compared using the ``is`` operator, it should always be considered identical - to itself but never to any other object. -2. It should be simple to define as many distinct sentinel values as needed. -3. The sentinel objects should have a clear and short repr. -4. The sentinel objects should each have a *distinct* type, usable in type - annotations to define *strict* type signatures. -5. The sentinel objects should behave correctly after copying and/or + compared using the ``is`` operator, it should always be considered + identical to itself but never to any other object. +2. Creating a sentinel object should be a simple, straightforward one-liner. +3. It should be simple to define as many distinct sentinel values as needed. +4. The sentinel objects should have a clear and short repr. +5. It should be possible to use clear type signatures for sentinels. +6. The sentinel objects should behave correctly after copying and/or unpickling. -6. Creating a sentinel object should be a simple, straightforward one-liner. -7. Works using CPython and PyPy3. Will hopefully also work with other - implementations. +7. Such sentinels should work when using CPython 3.x and PyPy3, and ideally + also with other implementations of Python. +8. As simple and straightforward as possible, in implementation and especially + in use. Avoid this becoming one more special thing to learn when learning + Python. It should be easy to find and use when needed, and obvious enough + when reading code that one would normally not feel a need to look up its + documentation. + +With so many uses in the Python standard library [2]_, it would be useful to +have an implementation in the standard library, since the stdlib cannot use +implementations of sentinel objects available elsewhere (such as the +``sentinels`` [5]_ or ``sentinel`` [6]_ PyPI packages). After researching existing idioms and implementations, and going through many different possible implementations, an implementation was written which meets @@ -107,79 +140,99 @@ all of these criteria (see `Reference Implementation`_). Specification ============= -A new ``sentinel`` function will be added to a new ``sentinels`` module. -It will accept a single required argument, the name of the sentinel object, -and a single optional argument, the repr of the object. +A new ``Sentinel`` class will be added to a new ``sentinels`` module. +Its initializer will accept a single required argument, the name of the +sentinel object, and two optional arguments: the repr of the object, and the +name of its module:: -:: - - >>> NotGiven = sentinel('NotGiven') + >>> from sentinel import Sentinel + >>> NotGiven = Sentinel('NotGiven') >>> NotGiven - >>> MISSING = sentinel('MISSING', repr='mymodule.MISSING') + >>> MISSING = Sentinel('MISSING', repr='mymodule.MISSING') >>> MISSING mymodule.MISSING + >>> MEGA = Sentinel('MEGA', repr='', module_name='mymodule') + Checking if a value is such a sentinel *should* be done using the ``is`` operator, as is recommended for ``None``. Equality checks using ``==`` will also work as expected, returning ``True`` only when the object is compared -with itself. +with itself. Identity checks such as ``if value is MISSING:`` should usually +be used rather than boolean checks such as ``if value:`` or ``if not value:``. +Sentinel instances are truthy by default. -The name should be set to the name of the variable used to reference the -object, as in the examples above. Otherwise, the sentinel object won't be -able to survive copying or pickling+unpickling while retaining the above -described behavior. Note, that when defined in a class scope, the name must -be the fully-qualified name of the variable in the module, for example:: +The names of sentinels are unique within each module. When calling +``Sentinel()`` in a module where a sentinel with that name was already +defined, the existing sentinel with that name will be returned. Sentinels +with the same name in different modules will be distinct from each other. - class MyClass: - NotGiven = sentinel('MyClass.NotGiven') +Creating a copy of a sentinel object, such as by using ``copy.copy()`` or by +pickling and unpickling, will return the same object. -Type annotations for sentinel values will use `typing.Literal`_. -For example:: +Type annotations for sentinel values should use ``Sentinel``. For example:: - def foo(value: int | Literal[NotGiven]) -> None: + def foo(value: int | Sentinel = MISSING) -> int: ... -.. _typing.Literal: https://docs.python.org/3/library/typing.html#typing.Literal +The ``module_name`` optional argument should normally not need to be supplied, +as ``Sentinel()`` will usually be able to recognize the module in which it was +called. ``module_name`` should be supplied only in unusual cases when this +automatic recognition does not work as intended, such as perhaps when using +Jython or IronPython. This parallels the designs of ``Enum`` and +``namedtuple``. For more details, see :pep:`435`. + +The ``Sentinel`` class may be sub-classed. Instances of each sub-class will +be unique, even if using the same name and module. This allows for +customizing the behavior of sentinels, such as controlling their truthiness. Reference Implementation ======================== -The reference implementation is found in a `dedicated GitHub repo -`__. A simplified version follows:: +The reference implementation is found in a dedicated GitHub repo [7]_. A +simplified version follows:: - def sentinel(name, repr=None): - """Create a unique sentinel object.""" - repr = repr or f'<{name}>' + _registry = {} - module = _get_parent_frame().f_globals.get('__name__', '__main__') - class_name = _get_class_name(name, module) - class_namespace = { - '__repr__': lambda self: repr, - } - cls = type(class_name, (), class_namespace) - cls.__module__ = module - _get_parent_frame().f_globals[class_name] = cls + class Sentinel: + """Unique sentinel values.""" - sentinel = cls() - cls.__new__ = lambda cls_: sentinel + def __new__(cls, name, repr=None, module_name=None): + name = str(name) + repr = str(repr) if repr else f'<{name.split(".")[-1]}>' + if module_name is None: + try: + module_name = \ + sys._getframe(1).f_globals.get('__name__', '__main__') + except (AttributeError, ValueError): + module_name = __name__ - return sentinel + registry_key = f'{module_name}-{name}' - def _get_class_name(sentinel_qualname, module_name): - return '__'.join(['_sentinel_type', - module_name.replace('.', '_'), - sentinel_qualname.replace('.', '_')]) + sentinel = _registry.get(registry_key, None) + if sentinel is not None: + return sentinel + sentinel = super().__new__(cls) + sentinel._name = name + sentinel._repr = repr + sentinel._module_name = module_name -Note that a dedicated class is created automatically for each sentinel object. -This class is assigned to the namespace of the module from which the -``sentinel()`` call was made, or to that of the ``sentinels`` module itself as -a fallback. These classes have long names comprised of several parts to -ensure their uniqueness. However, these names usually wouldn't be used, since -type annotations should use ``Literal[]`` as described above, and identity -checks should be preferred over type checks. + return _registry.setdefault(registry_key, sentinel) + + def __repr__(self): + return self._repr + + def __reduce__(self): + return ( + self.__class__, + ( + self._name, + self._repr, + self._module_name, + ), + ) Rejected Ideas @@ -192,8 +245,8 @@ Use ``NotGiven = object()`` This suffers from all of the drawbacks mentioned in the `Rationale`_ section. -Add a single new sentinel value, e.g. ``MISSING`` or ``Sentinel`` ------------------------------------------------------------------ +Add a single new sentinel value, such as ``MISSING`` or ``Sentinel`` +-------------------------------------------------------------------- Since such a value could be used for various things in various places, one could not always be confident that it would never be a valid value in some use @@ -203,7 +256,7 @@ with confidence without needing to consider potential edge-cases. Additionally, it is useful to be able to provide a meaningful name and repr for a sentinel value, specific to the context where it is used. -Finally, this was a very unpopular option in the `poll `__, with only 12% +Finally, this was a very unpopular option in the poll [4]_, with only 12% of the votes voting for it. @@ -221,9 +274,7 @@ as confidently used in all cases, unlike a dedicated, distinct value. Use a single-valued enum ------------------------ -The suggested idiom is: - -:: +The suggested idiom is:: class NotGivenType(Enum): NotGiven = 'NotGiven' @@ -233,23 +284,21 @@ Besides the excessive repetition, the repr is overly long: ````. A shorter repr can be defined, at the expense of a bit more code and yet more repetition. -Finally, this option was the least popular among the nine options in the `poll -`__, being the only option to receive no votes. +Finally, this option was the least popular among the nine options in the +poll [4]_, being the only option to receive no votes. A sentinel class decorator -------------------------- -The suggested interface: - -:: +The suggested idiom is:: @sentinel(repr='') class NotGivenType: pass NotGiven = NotGivenType() -While this allowed for a very simple and clear implementation, the interface -is too verbose, repetitive, and difficult to remember. +While this allows for a very simple and clear implementation of the decorator, +the idiom is too verbose, repetitive, and difficult to remember. Using class objects @@ -258,33 +307,23 @@ Using class objects Since classes are inherently singletons, using a class as a sentinel value makes sense and allows for a simple implementation. -The simplest version of this idiom is: - -:: +The simplest version of this is:: class NotGiven: pass -To have a clear repr, one could define ``__repr__``: - -:: - - class NotGiven: - def __repr__(self): - return '' - -... or use a meta-class: - -:: +To have a clear repr, one would need to use a meta-class:: class NotGiven(metaclass=SentinelMeta): pass -However, all such implementations don't have a dedicated type for the -sentinel, which is considered desirable for strict typing. A dedicated type -could be created by a meta-class or class decorator, but at that point the -implementation would become much more complex and loses its advantages over -the chosen implementation. +... or a class decorator:: -Additionally, using classes this way is unusual and could be confusing. + @Sentinel + class NotGiven: pass + +Using classes this way is unusual and could be confusing. The intention of +code would be hard to understand without comments. It would also cause +such sentinels to have some unexpected and undesirable behavior, such as +being callable. Define a recommended "standard" idiom, without supplying an implementation @@ -293,38 +332,65 @@ Define a recommended "standard" idiom, without supplying an implementation Most common exiting idioms have significant drawbacks. So far, no idiom has been found that is clear and concise while avoiding these drawbacks. -Also, in the `poll on this subject `__, the options for recommending an +Also, in the poll [4]_ on this subject, the options for recommending an idiom were unpopular, with the highest-voted option being voted for by only 25% of the voters. +Specific type signatures for each sentinel value +------------------------------------------------ + +For a long time, the author of this PEP strove to have type signatures for +such sentinels that were specific to each value. A leading proposal +(supported by Guido and others) was to expand the use of ``Literal``, e.g. +``Literal[MISSING]``. After much thought and discussion, especially on the +typing-sig mailing list [8]_, it seems that all such solutions would require +special-casing and/or added complexity in the implementations of static type +checkers, while also constraining the implementation of sentinels. + +Therefore, this PEP no longer proposes such signatures. Instead, this PEP +suggests using ``Sentinel`` as the type signature for sentinel values. + +It is somewhat unfortunate that static type checkers will sometimes not be +able to deduce more specific types due to this, such as inside a conditional +block like ``if value is not MISSING: ...``. However, this is a minor issue +in practice, as type checkers can be easily made to understand these cases +using ``typing.cast()``. + + Additional Notes ================ -* This PEP and the initial implementation are drafted in a `dedicated GitHub - repo `__. +* This PEP and the initial implementation are drafted in a dedicated GitHub + repo [7]_. -* The support for copying/unpickling works when defined in a module's scope or - a (possibly nested) class's scope. Note that in the latter case, the name - provided as the first parameter must be the fully-qualified name of the - variable in the module:: +* For sentinels defined in a class scope, to avoid potential name clashes, + one should use the fully-qualified name of the variable in the module. Only + the part of the name after the last period will be used for the default + repr. For example:: - class MyClass: - NotGiven = sentinel('MyClass.NotGiven', repr='') + >>> class MyClass: + ... NotGiven = sentinel('MyClass.NotGiven') + >>> MyClass.NotGiven + + +* One should be careful when creating sentinels in a function or method, since + sentinels with the same name created by code in the same module will be + identical. If distinct sentinel objects are needed, make sure to use + distinct names. References ========== -.. _python-dev-thread: https://mail.python.org/archives/list/python-dev@python.org/thread/ZLVPD2OISI7M4POMTR2FCQTE6TPMPTO3/ -.. _list-of-sentinels-in-stdlib: https://mail.python.org/archives/list/python-dev@python.org/message/JBYXQH3NV3YBF7P2HLHB5CD6V3GVTY55/ -.. _poll: https://discuss.python.org/t/sentinel-values-in-the-stdlib/8810/ -.. _reference-github-repo: https://github.com/taleinat/python-stdlib-sentinels - -* `bpo-44123: Make function parameter sentinel values true singletons `_ -* `The "sentinels" package on PyPI `_ -* `The "sentinel" package on PyPI `_ -* `Discussion thread about type signatures for these sentinels on the typing-sig mailing list `_ +.. [1] Python-Dev mailing list: `The repr of a sentinel `_ +.. [2] Python-Dev mailing list: `"The stdlib contains tons of sentinels" `_ +.. [3] `bpo-44123: Make function parameter sentinel values true singletons `_ +.. [4] discuss.python.org Poll: `Sentinel Values in the Stdlib `_ +.. [5] `The "sentinels" package on PyPI `_ +.. [6] `The "sentinel" package on PyPI `_ +.. [7] `Reference implementation at the taleinat/python-stdlib-sentinels GitHub repo `_ +.. [8] `Discussion thread about type signatures for these sentinels on the typing-sig mailing list `_ Copyright