611 lines
22 KiB
ReStructuredText
611 lines
22 KiB
ReStructuredText
PEP: 505
|
||
Title: None-aware operators
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Mark E. Haase <mehaase@gmail.com>, Steve Dower <steve.dower@python.org>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 18-Sep-2015
|
||
Python-Version: 3.8
|
||
|
||
Abstract
|
||
========
|
||
|
||
Several modern programming languages have so-called "``null``-coalescing" or
|
||
"``null``- aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
|
||
(starting in version 7). These operators provide syntactic sugar for common
|
||
patterns involving null references.
|
||
|
||
* The "``null``-coalescing" operator is a binary operator that returns its left
|
||
operand if it is not ``null``. Otherwise it returns its right operand.
|
||
* The "``null``-aware member access" operator accesses an instance member only
|
||
if that instance is non-``null``. Otherwise it returns ``null``. (This is also
|
||
called a "safe navigation" operator.)
|
||
* The "``null``-aware index access" operator accesses an element of a collection
|
||
only if that collection is non-``null``. Otherwise it returns ``null``. (This
|
||
is another type of "safe navigation" operator.)
|
||
|
||
This PEP proposes three ``None``-aware operators for Python, based on the
|
||
definitions and other language's implementations of those above. Specifically:
|
||
|
||
* The "``None`` coalescing`` binary operator ``??`` returns the left hand side
|
||
if it evaluates to a value that is not ``None``, or else it evaluates and
|
||
returns the right hand side.
|
||
* The "``None``-aware attribute access" operator ``?.`` evaluates the complete
|
||
expression if the left hand side evaluates to a value that is not ``None``
|
||
* The "``None``-aware indexing" operator ``?[]`` evaluates the complete
|
||
expression if the left hand site evaluates to a value that is not ``None``
|
||
|
||
Syntax and Semantics
|
||
====================
|
||
|
||
Specialness of ``None``
|
||
-----------------------
|
||
|
||
The ``None`` object denotes the lack of a value. For the purposes of these
|
||
operators, the lack of a value indicates that the remainder of the expression
|
||
also lacks a value and should not be evaluated.
|
||
|
||
A rejected proposal was to treat any value that evaluates to false in a
|
||
Boolean context as not having a value. However, the purpose of these operators
|
||
is to propagate the "lack of value" state, rather that the "false" state.
|
||
|
||
Some argue that this makes ``None`` special. We contend that ``None`` is
|
||
already special, and that using it as both the test and the result of these
|
||
operators does not change the existing semantics in any way.
|
||
|
||
Grammar changes
|
||
---------------
|
||
|
||
The following rules of the Python grammar are updated to read::
|
||
|
||
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
|
||
'<<=' | '>>=' | '**=' | '//=' | '??=')
|
||
term: coalesce (('*'|'@'|'/'|'%'|'//') coalesce)*
|
||
coalesce: factor ('??' factor)*
|
||
|
||
# factor eventually resolved to atom_expr
|
||
atom_expr: ['await'] atom trailer*
|
||
trailer: ('(' [arglist] ')' |
|
||
'[' subscriptlist ']' |
|
||
'?[' subscriptlist ']' |
|
||
'.' NAME |
|
||
'?.' NAME)
|
||
|
||
Inserting the ``coalesce`` rule in this location ensures that expressions
|
||
resulting in ``None`` are natuarlly coalesced before they are used in
|
||
operations that would typically raise ``TypeError``. Like ``and`` and ``or``
|
||
the right-hand expression is not evaluated until the left-hand side is
|
||
determined to be ``None``. For example::
|
||
|
||
a, b = None, None
|
||
def c(): return None
|
||
def ex(): raise Exception()
|
||
|
||
(a ?? 2 ** b ?? 3) == a ?? (2 ** b) ?? 3
|
||
(a * b ?? c // d) == (a * b) ?? (c // d)
|
||
(a ?? True and b ?? False) == (a ?? True) and (b ?? False)
|
||
(c() ?? c() ?? True) == True
|
||
(True ?? ex()) == True
|
||
(c ?? ex)() == c()
|
||
|
||
Augmented coalescing assignment only rebinds the name if its current value is
|
||
``None``. If the target name already has a value, the right-hand side is not
|
||
evaluated. For example::
|
||
|
||
a = None
|
||
b = ''
|
||
|
||
a ??= 'value'
|
||
b ??= undefined_name
|
||
|
||
assert a == 'value'
|
||
assert b == ''
|
||
|
||
Adding new trailers for the other ``None``-aware operators ensures that they
|
||
may be used in all valid locations for the existing equivalent operators,
|
||
including as part of an assignment target (more details below). As the existing
|
||
evaluation rules are not directly embedded in the grammar, we specify the
|
||
required changes here.
|
||
|
||
Assume that the ``atom`` is always successfully evaluated. Each ``trailer`` is
|
||
then evaluated from left to right, applying its own parameter (either its
|
||
arguments, subscripts or attribute name) to produce the value for the next
|
||
``trailer``. Finally, if present, ``await`` is applied.
|
||
|
||
For example, ``await a.b(c).d[e]`` is currently parsed as
|
||
``['await', 'a', '.b', '(c)', '.d', '[e]']`` and evaluated::
|
||
|
||
_v = a
|
||
_v = _v.b
|
||
_v = _v(c)
|
||
_v = _v.d
|
||
_v = _v[e]
|
||
await _v
|
||
|
||
When a ``None``-aware operator is present, the left-to-right evaluation may be
|
||
short-circuited. For example, ``await a?.b(c).d?[e]`` is evaluated::
|
||
|
||
_v = a
|
||
if _v is not None:
|
||
_v = _v.b
|
||
_v = _v(c)
|
||
_v = _v.d
|
||
if _v is not None:
|
||
_v = _v[e]
|
||
await _v
|
||
|
||
.. note::
|
||
``await`` will almost certainly fail in this context, as it would in
|
||
the case where code attempts ``await None``. We are not proposing to add a
|
||
``None``-aware ``await`` keyword here, and merely include it in this
|
||
example for completeness of the specification, since the ``atom_expr``
|
||
grammar rule includes the keyword. If it were in its own rule, we would have
|
||
never mentioned it.
|
||
|
||
Parenthesised expressions are handled by the ``atom`` rule (not shown above),
|
||
which will implicitly terminate the short-circuiting behaviour of the above
|
||
transformation. For example, ``(a?.b ?? c).d?.e`` is evaluated as::
|
||
|
||
# a?.b
|
||
_v = a
|
||
if _v is not None:
|
||
_v = _v.b
|
||
|
||
# ... ?? c
|
||
if _v is None:
|
||
_v = c
|
||
|
||
# (...).d?.e
|
||
_v = _v.d
|
||
if _v is not None:
|
||
_v = _v.e
|
||
|
||
When used as an assignment target, the ``None``-aware operations may only be
|
||
used in a "load" context. That is, ``a?.b = 1`` and ``a?[b] = 1`` will raise
|
||
``SyntaxError``. Use earlier in the expression (``a?.b.c = 1``) is permitted,
|
||
though unlikely to be useful unless combined with a coalescing operation::
|
||
|
||
(a?.b ?? d).c = 1
|
||
|
||
|
||
Examples
|
||
========
|
||
|
||
This section presents some examples of common ``None`` patterns and explains
|
||
the drawbacks.
|
||
|
||
jsonify
|
||
-------
|
||
|
||
This first example is from a Python web crawler that uses the popular Flask
|
||
framework as a front-end. This function retrieves information about a web site
|
||
from a SQL database and formats it as JSON to send to an HTTP client::
|
||
|
||
class SiteView(FlaskView):
|
||
@route('/site/<id_>', methods=['GET'])
|
||
def get_site(self, id_):
|
||
site = db.query('site_table').find(id_)
|
||
|
||
return jsonify(
|
||
first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
|
||
id=site.id,
|
||
is_active=site.is_active,
|
||
last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
|
||
url=site.url.rstrip('/')
|
||
)
|
||
|
||
Both ``first_seen`` and ``last_seen`` are allowed to be ``null`` in the
|
||
database, and they are also allowed to be ``null`` in the JSON response. JSON
|
||
does not have a native way to represent a ``datetime``, so the server's contract
|
||
states that any non-``null`` date is represented as an ISO-8601 string.
|
||
|
||
However, without knowing the exact semantics of the ``first_seen`` and
|
||
``last_seen`` attributes, it is impossible to know whether the attribute can
|
||
be safely or performantly accessed multiple times.
|
||
|
||
One way to fix this code is to replace each ternary with explicit value
|
||
assignment and a full ``if/else`` block::
|
||
|
||
class SiteView(FlaskView):
|
||
@route('/site/<id_>', methods=['GET'])
|
||
def get_site(self, id_):
|
||
site = db.query('site_table').find(id_)
|
||
|
||
first_seen_dt = site.first_seen
|
||
if first_seen_dt is None:
|
||
first_seen = None
|
||
else:
|
||
first_seen = first_seen_dt.isoformat()
|
||
|
||
last_seen_dt = site.last_seen
|
||
if last_seen_dt is None:
|
||
last_seen = None
|
||
else:
|
||
last_seen = last_seen_dt.isoformat()
|
||
|
||
return jsonify(
|
||
first_seen=first_seen,
|
||
id=site.id,
|
||
is_active=site.is_active,
|
||
last_seen=last_seen,
|
||
url=site.url.rstrip('/')
|
||
)
|
||
|
||
This adds ten lines of code and four new code paths to the function,
|
||
dramatically increasing the apparently complexity. Rewriting using
|
||
``None``-aware attribute operator on the other hand, results in shorter
|
||
code::
|
||
|
||
class SiteView(FlaskView):
|
||
@route('/site/<id_>', methods=['GET'])
|
||
def get_site(self, id_):
|
||
site = db.query('site_table').find(id_)
|
||
|
||
return jsonify(
|
||
first_seen=site.first_seen?.isoformat(),
|
||
id=site.id,
|
||
is_active=site.is_active,
|
||
last_seen=site.last_seen?.isoformat(),
|
||
url=site.url.rstrip('/')
|
||
)
|
||
|
||
Grab
|
||
----
|
||
|
||
The next example is from a Python scraping library called `Grab
|
||
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/gr
|
||
ab/upload.py>`_::
|
||
|
||
class BaseUploadObject(object):
|
||
def find_content_type(self, filename):
|
||
ctype, encoding = mimetypes.guess_type(filename)
|
||
if ctype is None:
|
||
return 'application/octet-stream'
|
||
else:
|
||
return ctype
|
||
|
||
class UploadContent(BaseUploadObject):
|
||
def __init__(self, content, filename=None, content_type=None):
|
||
self.content = content
|
||
if filename is None:
|
||
self.filename = self.get_random_filename()
|
||
else:
|
||
self.filename = filename
|
||
if content_type is None:
|
||
self.content_type = self.find_content_type(self.filename)
|
||
else:
|
||
self.content_type = content_type
|
||
|
||
class UploadFile(BaseUploadObject):
|
||
def __init__(self, path, filename=None, content_type=None):
|
||
self.path = path
|
||
if filename is None:
|
||
self.filename = os.path.split(path)[1]
|
||
else:
|
||
self.filename = filename
|
||
if content_type is None:
|
||
self.content_type = self.find_content_type(self.filename)
|
||
else:
|
||
self.content_type = content_type
|
||
|
||
This example contains several good examples of needing to provide default
|
||
values. Rewriting to use a conditional expression reduces the overall lines of
|
||
code, but does not necessarily improve readability::
|
||
|
||
class BaseUploadObject(object):
|
||
def find_content_type(self, filename):
|
||
ctype, encoding = mimetypes.guess_type(filename)
|
||
return 'application/octet-stream' if ctype is None else ctype
|
||
|
||
class UploadContent(BaseUploadObject):
|
||
def __init__(self, content, filename=None, content_type=None):
|
||
self.content = content
|
||
self.filename = (self.get_random_filename() if filename
|
||
is None else filename)
|
||
self.content_type = (self.find_content_type(self.filename)
|
||
if content_type is None else content_type)
|
||
|
||
class UploadFile(BaseUploadObject):
|
||
def __init__(self, path, filename=None, content_type=None):
|
||
self.path = path
|
||
self.filename = (os.path.split(path)[1] if filename is
|
||
None else filename)
|
||
self.content_type = (self.find_content_type(self.filename)
|
||
if content_type is None else content_type)
|
||
|
||
The first ternary expression is tidy, but it reverses the intuitive order of
|
||
the operands: it should return ``ctype`` if it has a value and use the string
|
||
literal as fallback. The other ternary expressions are unintuitive and so
|
||
long that they must be wrapped. The overall readability is worsened, not
|
||
improved.
|
||
|
||
Rewriting using the ``None`` coalescing operator::
|
||
|
||
class BaseUploadObject(object):
|
||
def find_content_type(self, filename):
|
||
ctype, encoding = mimetypes.guess_type(filename)
|
||
return ctype ?? 'application/octet-stream'
|
||
|
||
class UploadContent(BaseUploadObject):
|
||
def __init__(self, content, filename=None, content_type=None):
|
||
self.content = content
|
||
self.filename = filename ?? self.get_random_filename()
|
||
self.content_type = content_type ?? self.find_content_type(self.filename)
|
||
|
||
class UploadFile(BaseUploadObject):
|
||
def __init__(self, path, filename=None, content_type=None):
|
||
self.path = path
|
||
self.filename = filename ?? os.path.split(path)[1]
|
||
self.content_type = content_type ?? self.find_content_type(self.filename)
|
||
|
||
This syntax has an intuitive ordering of the operands. In
|
||
``find_content_type``, for example, the preferred value ``ctype`` appears before
|
||
the fallback value. The terseness of the syntax also makes for fewer lines of
|
||
code and less code to visually parse.
|
||
|
||
Alternatives
|
||
============
|
||
|
||
Python does not have any existing ``None``-aware operators, but it does have
|
||
operators that can be used for a similar purpose. This section describes why
|
||
these alternatives are undesirable for some common ``None`` patterns.
|
||
|
||
``or`` Operator
|
||
---------------
|
||
|
||
Similar behavior can be achieved with the ``or`` operator, but ``or`` checks
|
||
whether its left operand is false-y, not specifically ``None``. This can lead
|
||
to unexpected behavior when the value is zero, an empty string, or an empty
|
||
collection::
|
||
|
||
>>> def f(s=None):
|
||
... s = s or []
|
||
... s.append(123)
|
||
...
|
||
>>> my_list = []
|
||
>>> f(my_list)
|
||
>>> my_list
|
||
[]
|
||
# Expected: [123]
|
||
|
||
Rewritten using the ``None`` coalescing operator, the function could read::
|
||
|
||
def f(s=None):
|
||
s = s ?? []
|
||
s.append(123)
|
||
|
||
Or using the ``None``-aware attribute operator::
|
||
|
||
def f(s=None):
|
||
s?.append(123)
|
||
|
||
(Rewriting using a conditional expression is covered in a later section.)
|
||
|
||
``getattr`` Builtin
|
||
-------------------
|
||
|
||
Using the ``getattr`` builtin with a default value is often a suitable approach
|
||
for getting an attribute from a target that may be ``None``::
|
||
|
||
client = maybe_get_client()
|
||
name = getattr(client, 'name', None)
|
||
|
||
However, the semantics of this call are different from the proposed operators.
|
||
Using ``getattr`` will suppress ``AttributeError`` for misspelled or missing
|
||
attributes, even when ``client`` has a value.
|
||
|
||
Spelled correctly for these semantics, this example would read::
|
||
|
||
client = maybe_get_client()
|
||
if client is not None:
|
||
name = client.name
|
||
else:
|
||
name = None
|
||
|
||
Written using the ``None``-aware attribute operator::
|
||
|
||
client = maybe_get_client()
|
||
name = client?.name
|
||
|
||
Ternary Operator
|
||
----------------
|
||
|
||
Another common way to initialize default values is to use the ternary operator.
|
||
Here is an excerpt from the popular `Requests package
|
||
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81
|
||
a8149f5/requests/models.py#L212>`_::
|
||
|
||
data = [] if data is None else data
|
||
files = [] if files is None else files
|
||
headers = {} if headers is None else headers
|
||
params = {} if params is None else params
|
||
hooks = {} if hooks is None else hooks
|
||
|
||
This particular formulation has the undesirable effect of putting the operands
|
||
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
|
||
``[]`` as a fallback," but the code puts the fallback *before* the preferred
|
||
value.
|
||
|
||
The author of this package could have written it like this instead::
|
||
|
||
data = data if data is not None else []
|
||
files = files if files is not None else []
|
||
headers = headers if headers is not None else {}
|
||
params = params if params is not None else {}
|
||
hooks = hooks if hooks is not None else {}
|
||
|
||
This ordering of the operands is more intuitive, but it requires 4 extra
|
||
characters (for "not "). It also highlights the repetition of identifiers:
|
||
``data if data``, ``files if files``, etc.
|
||
|
||
When written using the ``None`` coalescing operator, the sample reads::
|
||
|
||
data = data ?? []
|
||
files = files ?? []
|
||
headers = headers ?? {}
|
||
params = params ?? {}
|
||
hooks = hooks ?? {}
|
||
|
||
Rejected Ideas
|
||
==============
|
||
|
||
``None``-aware Function Call
|
||
----------------------------
|
||
|
||
The ``None``-aware syntax applies to attribute and index access, so it seems
|
||
natural to ask if it should also apply to function invocation syntax. It might
|
||
be written as ``foo?()``, where ``foo`` is only called if it is not None.
|
||
|
||
This has been rejected on the basis of the proposed operators being intended
|
||
to aid traversal of partially populated hierarchical data structures, *not*
|
||
for traversal of arbitrary class hierarchies. This is reflected in the fact
|
||
that none of the other mainstream languages that already offer this syntax
|
||
have found it worthwhile to support a similar syntax for optional function
|
||
invocations.
|
||
|
||
A workaround similar to that used by C# would be to write
|
||
``maybe_none?.__call__(arguments)``. If the callable is ``None``, the
|
||
expression will not be evaluated. (The C# equivalent uses ``?.Invoke()`` on its
|
||
callable type.)
|
||
|
||
``?`` Unary Postfix Operator
|
||
----------------------------
|
||
|
||
To generalize the ``None``-aware behavior and limit the number of new operators
|
||
introduced, a unary, postfix operator spelled ``?`` was suggested. The idea is
|
||
that ``?`` might return a special object that could would override dunder
|
||
methods that return ``self``. For example, ``foo?`` would evaluate to ``foo`` if
|
||
it is not ``None``, otherwise it would evaluate to an instance of
|
||
``NoneQuestion``::
|
||
|
||
class NoneQuestion():
|
||
def __call__(self, *args, **kwargs):
|
||
return self
|
||
|
||
def __getattr__(self, name):
|
||
return self
|
||
|
||
def __getitem__(self, key):
|
||
return self
|
||
|
||
|
||
With this new operator and new type, an expression like ``foo?.bar[baz]``
|
||
evaluates to ``NoneQuestion`` if ``foo`` is None. This is a nifty
|
||
generalization, but it's difficult to use in practice since most existing code
|
||
won't know what ``NoneQuestion`` is.
|
||
|
||
Going back to one of the motivating examples above, consider the following::
|
||
|
||
>>> import json
|
||
>>> created = None
|
||
>>> json.dumps({'created': created?.isoformat()})``
|
||
|
||
The JSON serializer does not know how to serialize ``NoneQuestion``, nor will
|
||
any other API. This proposal actually requires *lots of specialized logic*
|
||
throughout the standard library and any third party library.
|
||
|
||
At the same time, the ``?`` operator may also be **too general**, in the sense
|
||
that it can be combined with any other operator. What should the following
|
||
expressions mean?::
|
||
|
||
>>> x? + 1
|
||
>>> x? -= 1
|
||
>>> x? == 1
|
||
>>> ~x?
|
||
|
||
This degree of generalization is not useful. The operators actually proposed
|
||
herein are intentionally limited to a few operators that are expected to make it
|
||
easier to write common code patterns.
|
||
|
||
Haskell-style ``Maybe``
|
||
-----------------------
|
||
|
||
Haskell has a concept called `Maybe <https://wiki.haskell.org/Maybe>`_ that
|
||
encapsulates the idea of an optional value without relying on any special
|
||
keyword (e.g. ``null``) or any special instance (e.g. ``None``). In Haskell, the
|
||
purpose of ``Maybe`` is to avoid separate handling of "something" and nothing".
|
||
|
||
A Python package called `pymaybe <https://pypi.org/p/pymaybe/>`_ provides a
|
||
rough approximation. The documentation shows the following example::
|
||
|
||
>>> maybe('VALUE').lower()
|
||
'value'
|
||
|
||
>>> maybe(None).invalid().method().or_else('unknown')
|
||
'unknown'
|
||
|
||
The function ``maybe()`` returns either a ``Something`` instance or a
|
||
``Nothing`` instance. Similar to the unary postfix operator described in the
|
||
previous section, ``Nothing`` overrides dunder methods in order to allow
|
||
chaining on a missing value.
|
||
|
||
Note that ``or_else()`` is eventually required to retrieve the underlying value
|
||
from ``pymaybe``'s wrappers. Furthermore, ``pymaybe`` does not short circuit any
|
||
evaluation. Although ``pymaybe`` has some strengths and may be useful in its own
|
||
right, it also demonstrates why a pure Python implementation of coalescing is
|
||
not nearly as powerful as support built into the language.
|
||
|
||
The idea of adding a builtin ``maybe`` type to enable this scenario is rejected.
|
||
|
||
No-Value Protocol
|
||
-----------------
|
||
|
||
These operators could be generalised to user-defined types by using a protocol
|
||
to indicate when a value represents no value. Such a protocol may be a dunder
|
||
method ``__has_value__(self)` that returns ``True`` if the value should be
|
||
treated as having a value, and ``False`` if the ``None``-aware operators should
|
||
|
||
With this generalization, ``object`` would implement a dunder method equivalent
|
||
to this::
|
||
|
||
def __has_value__(self):
|
||
return True
|
||
|
||
``NoneType`` would implement a dunder method equivalent to this::
|
||
|
||
def __has_value__(self):
|
||
return False
|
||
|
||
In the specification section, all uses of ``x is None`` would be replaces with
|
||
``not x.__has_value__()``.
|
||
|
||
This generalization would allow for domain-specific ``null`` objects to be
|
||
coalesced just like ``None``. For example the ``pyasn1`` package has a type
|
||
called ``Null`` that represents an ASN.1 ``null``::
|
||
|
||
>>> from pyasn1.type import univ
|
||
>>> univ.Null() ?? univ.Integer(123)
|
||
Integer(123)
|
||
|
||
However, as ``None`` is already defined as being the value that represents
|
||
"no value", and the current specification would not preclude switching to a
|
||
protocol in the future, this idea is rejected for now.
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] C# Reference: Operators
|
||
(https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)
|
||
|
||
.. [2] A Tour of the Dart Language: Operators
|
||
(https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|