python-peps/pep-0505.rst

611 lines
22 KiB
ReStructuredText
Raw Normal View History

PEP: 505
Title: None-aware operators
Version: $Revision$
Last-Modified: $Date$
Author: Mark E. Haase <mehaase@gmail.com>, Steve Dower <steve.dower@python.org>
2018-07-07 14:30:13 -04:00
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.8
Abstract
========
Several modern programming languages have so-called "``null``-coalescing" or
"``null``- aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
(starting in version 7). These operators provide syntactic sugar for common
patterns involving null references.
* The "``null``-coalescing" operator is a binary operator that returns its left
operand if it is not ``null``. Otherwise it returns its right operand.
* The "``null``-aware member access" operator accesses an instance member only
if that instance is non-``null``. Otherwise it returns ``null``. (This is also
called a "safe navigation" operator.)
2015-10-20 21:44:00 -04:00
* The "``null``-aware index access" operator accesses an element of a collection
only if that collection is non-``null``. Otherwise it returns ``null``. (This
is another type of "safe navigation" operator.)
This PEP proposes three ``None``-aware operators for Python, based on the
definitions and other language's implementations of those above. Specifically:
* The "``None`` coalescing`` binary operator ``??`` returns the left hand side
if it evaluates to a value that is not ``None``, or else it evaluates and
returns the right hand side.
* The "``None``-aware attribute access" operator ``?.`` evaluates the complete
expression if the left hand side evaluates to a value that is not ``None``
* The "``None``-aware indexing" operator ``?[]`` evaluates the complete
expression if the left hand site evaluates to a value that is not ``None``
Syntax and Semantics
====================
Specialness of ``None``
-----------------------
The ``None`` object denotes the lack of a value. For the purposes of these
operators, the lack of a value indicates that the remainder of the expression
also lacks a value and should not be evaluated.
A rejected proposal was to treat any value that evaluates to false in a
Boolean context as not having a value. However, the purpose of these operators
is to propagate the "lack of value" state, rather that the "false" state.
Some argue that this makes ``None`` special. We contend that ``None`` is
already special, and that using it as both the test and the result of these
operators does not change the existing semantics in any way.
Grammar changes
---------------
The following rules of the Python grammar are updated to read::
2018-07-07 14:30:13 -04:00
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=' | '??=')
term: coalesce (('*'|'@'|'/'|'%'|'//') coalesce)*
coalesce: factor ('??' factor)*
# factor eventually resolved to atom_expr
atom_expr: ['await'] atom trailer*
2018-07-07 14:30:13 -04:00
trailer: ('(' [arglist] ')' |
'[' subscriptlist ']' |
'?[' subscriptlist ']' |
'.' NAME |
'?.' NAME)
Inserting the ``coalesce`` rule in this location ensures that expressions
resulting in ``None`` are natuarlly coalesced before they are used in
operations that would typically raise ``TypeError``. Like ``and`` and ``or``
the right-hand expression is not evaluated until the left-hand side is
determined to be ``None``. For example::
a, b = None, None
def c(): return None
def ex(): raise Exception()
2018-07-07 14:30:13 -04:00
(a ?? 2 ** b ?? 3) == a ?? (2 ** b) ?? 3
(a * b ?? c // d) == (a * b) ?? (c // d)
(a ?? True and b ?? False) == (a ?? True) and (b ?? False)
(c() ?? c() ?? True) == True
(True ?? ex()) == True
(c ?? ex)() == c()
2018-07-07 14:30:13 -04:00
Augmented coalescing assignment only rebinds the name if its current value is
``None``. If the target name already has a value, the right-hand side is not
evaluated. For example::
a = None
b = ''
a ??= 'value'
b ??= undefined_name
assert a == 'value'
assert b == ''
Adding new trailers for the other ``None``-aware operators ensures that they
2018-07-07 14:30:13 -04:00
may be used in all valid locations for the existing equivalent operators,
including as part of an assignment target (more details below). As the existing
evaluation rules are not directly embedded in the grammar, we specify the
required changes here.
Assume that the ``atom`` is always successfully evaluated. Each ``trailer`` is
then evaluated from left to right, applying its own parameter (either its
arguments, subscripts or attribute name) to produce the value for the next
``trailer``. Finally, if present, ``await`` is applied.
For example, ``await a.b(c).d[e]`` is currently parsed as
``['await', 'a', '.b', '(c)', '.d', '[e]']`` and evaluated::
_v = a
_v = _v.b
_v = _v(c)
_v = _v.d
_v = _v[e]
await _v
When a ``None``-aware operator is present, the left-to-right evaluation may be
short-circuited. For example, ``await a?.b(c).d?[e]`` is evaluated::
_v = a
if _v is not None:
_v = _v.b
_v = _v(c)
_v = _v.d
if _v is not None:
_v = _v[e]
await _v
.. note::
``await`` will almost certainly fail in this context, as it would in
the case where code attempts ``await None``. We are not proposing to add a
``None``-aware ``await`` keyword here, and merely include it in this
example for completeness of the specification, since the ``atom_expr``
grammar rule includes the keyword. If it were in its own rule, we would have
never mentioned it.
Parenthesised expressions are handled by the ``atom`` rule (not shown above),
which will implicitly terminate the short-circuiting behaviour of the above
transformation. For example, ``(a?.b ?? c).d?.e`` is evaluated as::
# a?.b
_v = a
if _v is not None:
_v = _v.b
# ... ?? c
if _v is None:
_v = c
# (...).d?.e
_v = _v.d
if _v is not None:
_v = _v.e
2018-07-07 14:30:13 -04:00
When used as an assignment target, the ``None``-aware operations may only be
used in a "load" context. That is, ``a?.b = 1`` and ``a?[b] = 1`` will raise
``SyntaxError``. Use earlier in the expression (``a?.b.c = 1``) is permitted,
though unlikely to be useful unless combined with a coalescing operation::
(a?.b ?? d).c = 1
Examples
========
This section presents some examples of common ``None`` patterns and explains
the drawbacks.
jsonify
-------
This first example is from a Python web crawler that uses the popular Flask
framework as a front-end. This function retrieves information about a web site
from a SQL database and formats it as JSON to send to an HTTP client::
class SiteView(FlaskView):
@route('/site/<id_>', methods=['GET'])
def get_site(self, id_):
site = db.query('site_table').find(id_)
return jsonify(
first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
id=site.id,
is_active=site.is_active,
last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
url=site.url.rstrip('/')
)
Both ``first_seen`` and ``last_seen`` are allowed to be ``null`` in the
database, and they are also allowed to be ``null`` in the JSON response. JSON
does not have a native way to represent a ``datetime``, so the server's contract
states that any non-``null`` date is represented as an ISO-8601 string.
However, without knowing the exact semantics of the ``first_seen`` and
``last_seen`` attributes, it is impossible to know whether the attribute can
be safely or performantly accessed multiple times.
One way to fix this code is to replace each ternary with explicit value
assignment and a full ``if/else`` block::
class SiteView(FlaskView):
@route('/site/<id_>', methods=['GET'])
def get_site(self, id_):
site = db.query('site_table').find(id_)
first_seen_dt = site.first_seen
if first_seen_dt is None:
first_seen = None
else:
first_seen = first_seen_dt.isoformat()
last_seen_dt = site.last_seen
if last_seen_dt is None:
last_seen = None
else:
last_seen = last_seen_dt.isoformat()
return jsonify(
first_seen=first_seen,
id=site.id,
is_active=site.is_active,
last_seen=last_seen,
url=site.url.rstrip('/')
)
This adds ten lines of code and four new code paths to the function,
dramatically increasing the apparently complexity. Rewriting using
``None``-aware attribute operator on the other hand, results in shorter
code::
class SiteView(FlaskView):
@route('/site/<id_>', methods=['GET'])
def get_site(self, id_):
site = db.query('site_table').find(id_)
return jsonify(
first_seen=site.first_seen?.isoformat(),
id=site.id,
is_active=site.is_active,
last_seen=site.last_seen?.isoformat(),
url=site.url.rstrip('/')
)
Grab
----
The next example is from a Python scraping library called `Grab
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/gr
ab/upload.py>`_::
class BaseUploadObject(object):
def find_content_type(self, filename):
ctype, encoding = mimetypes.guess_type(filename)
if ctype is None:
return 'application/octet-stream'
else:
return ctype
class UploadContent(BaseUploadObject):
def __init__(self, content, filename=None, content_type=None):
self.content = content
if filename is None:
self.filename = self.get_random_filename()
else:
self.filename = filename
if content_type is None:
self.content_type = self.find_content_type(self.filename)
else:
self.content_type = content_type
class UploadFile(BaseUploadObject):
def __init__(self, path, filename=None, content_type=None):
self.path = path
if filename is None:
self.filename = os.path.split(path)[1]
else:
self.filename = filename
if content_type is None:
self.content_type = self.find_content_type(self.filename)
else:
self.content_type = content_type
This example contains several good examples of needing to provide default
values. Rewriting to use a conditional expression reduces the overall lines of
code, but does not necessarily improve readability::
class BaseUploadObject(object):
def find_content_type(self, filename):
ctype, encoding = mimetypes.guess_type(filename)
return 'application/octet-stream' if ctype is None else ctype
class UploadContent(BaseUploadObject):
def __init__(self, content, filename=None, content_type=None):
self.content = content
self.filename = (self.get_random_filename() if filename
is None else filename)
self.content_type = (self.find_content_type(self.filename)
if content_type is None else content_type)
class UploadFile(BaseUploadObject):
def __init__(self, path, filename=None, content_type=None):
self.path = path
self.filename = (os.path.split(path)[1] if filename is
None else filename)
self.content_type = (self.find_content_type(self.filename)
if content_type is None else content_type)
The first ternary expression is tidy, but it reverses the intuitive order of
the operands: it should return ``ctype`` if it has a value and use the string
literal as fallback. The other ternary expressions are unintuitive and so
long that they must be wrapped. The overall readability is worsened, not
improved.
Rewriting using the ``None`` coalescing operator::
class BaseUploadObject(object):
def find_content_type(self, filename):
ctype, encoding = mimetypes.guess_type(filename)
return ctype ?? 'application/octet-stream'
class UploadContent(BaseUploadObject):
def __init__(self, content, filename=None, content_type=None):
self.content = content
self.filename = filename ?? self.get_random_filename()
self.content_type = content_type ?? self.find_content_type(self.filename)
class UploadFile(BaseUploadObject):
def __init__(self, path, filename=None, content_type=None):
self.path = path
self.filename = filename ?? os.path.split(path)[1]
self.content_type = content_type ?? self.find_content_type(self.filename)
This syntax has an intuitive ordering of the operands. In
``find_content_type``, for example, the preferred value ``ctype`` appears before
the fallback value. The terseness of the syntax also makes for fewer lines of
code and less code to visually parse.
Alternatives
============
Python does not have any existing ``None``-aware operators, but it does have
operators that can be used for a similar purpose. This section describes why
these alternatives are undesirable for some common ``None`` patterns.
``or`` Operator
---------------
Similar behavior can be achieved with the ``or`` operator, but ``or`` checks
whether its left operand is false-y, not specifically ``None``. This can lead
to unexpected behavior when the value is zero, an empty string, or an empty
collection::
>>> def f(s=None):
... s = s or []
... s.append(123)
...
>>> my_list = []
>>> f(my_list)
>>> my_list
[]
# Expected: [123]
Rewritten using the ``None`` coalescing operator, the function could read::
def f(s=None):
s = s ?? []
s.append(123)
Or using the ``None``-aware attribute operator::
def f(s=None):
s?.append(123)
(Rewriting using a conditional expression is covered in a later section.)
``getattr`` Builtin
-------------------
Using the ``getattr`` builtin with a default value is often a suitable approach
for getting an attribute from a target that may be ``None``::
client = maybe_get_client()
name = getattr(client, 'name', None)
However, the semantics of this call are different from the proposed operators.
Using ``getattr`` will suppress ``AttributeError`` for misspelled or missing
attributes, even when ``client`` has a value.
Spelled correctly for these semantics, this example would read::
client = maybe_get_client()
if client is not None:
name = client.name
else:
name = None
Written using the ``None``-aware attribute operator::
client = maybe_get_client()
name = client?.name
Ternary Operator
----------------
Another common way to initialize default values is to use the ternary operator.
Here is an excerpt from the popular `Requests package
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81
a8149f5/requests/models.py#L212>`_::
data = [] if data is None else data
files = [] if files is None else files
headers = {} if headers is None else headers
params = {} if params is None else params
hooks = {} if hooks is None else hooks
This particular formulation has the undesirable effect of putting the operands
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
``[]`` as a fallback," but the code puts the fallback *before* the preferred
value.
The author of this package could have written it like this instead::
data = data if data is not None else []
files = files if files is not None else []
headers = headers if headers is not None else {}
params = params if params is not None else {}
hooks = hooks if hooks is not None else {}
This ordering of the operands is more intuitive, but it requires 4 extra
characters (for "not "). It also highlights the repetition of identifiers:
``data if data``, ``files if files``, etc.
When written using the ``None`` coalescing operator, the sample reads::
data = data ?? []
files = files ?? []
headers = headers ?? {}
params = params ?? {}
hooks = hooks ?? {}
Rejected Ideas
==============
``None``-aware Function Call
----------------------------
The ``None``-aware syntax applies to attribute and index access, so it seems
2015-10-20 21:44:00 -04:00
natural to ask if it should also apply to function invocation syntax. It might
be written as ``foo?()``, where ``foo`` is only called if it is not None.
This has been rejected on the basis of the proposed operators being intended
to aid traversal of partially populated hierarchical data structures, *not*
for traversal of arbitrary class hierarchies. This is reflected in the fact
that none of the other mainstream languages that already offer this syntax
have found it worthwhile to support a similar syntax for optional function
invocations.
A workaround similar to that used by C# would be to write
``maybe_none?.__call__(arguments)``. If the callable is ``None``, the
expression will not be evaluated. (The C# equivalent uses ``?.Invoke()`` on its
callable type.)
``?`` Unary Postfix Operator
----------------------------
To generalize the ``None``-aware behavior and limit the number of new operators
introduced, a unary, postfix operator spelled ``?`` was suggested. The idea is
2015-10-20 21:44:00 -04:00
that ``?`` might return a special object that could would override dunder
methods that return ``self``. For example, ``foo?`` would evaluate to ``foo`` if
it is not ``None``, otherwise it would evaluate to an instance of
``NoneQuestion``::
class NoneQuestion():
def __call__(self, *args, **kwargs):
return self
def __getattr__(self, name):
return self
def __getitem__(self, key):
2015-10-20 21:44:00 -04:00
return self
2015-10-20 21:44:00 -04:00
With this new operator and new type, an expression like ``foo?.bar[baz]``
evaluates to ``NoneQuestion`` if ``foo`` is None. This is a nifty
generalization, but it's difficult to use in practice since most existing code
won't know what ``NoneQuestion`` is.
Going back to one of the motivating examples above, consider the following::
>>> import json
>>> created = None
>>> json.dumps({'created': created?.isoformat()})``
The JSON serializer does not know how to serialize ``NoneQuestion``, nor will
any other API. This proposal actually requires *lots of specialized logic*
throughout the standard library and any third party library.
2015-10-20 21:44:00 -04:00
At the same time, the ``?`` operator may also be **too general**, in the sense
that it can be combined with any other operator. What should the following
expressions mean?::
>>> x? + 1
>>> x? -= 1
>>> x? == 1
>>> ~x?
This degree of generalization is not useful. The operators actually proposed
herein are intentionally limited to a few operators that are expected to make it
easier to write common code patterns.
Haskell-style ``Maybe``
-----------------------
Haskell has a concept called `Maybe <https://wiki.haskell.org/Maybe>`_ that
encapsulates the idea of an optional value without relying on any special
keyword (e.g. ``null``) or any special instance (e.g. ``None``). In Haskell, the
purpose of ``Maybe`` is to avoid separate handling of "something" and nothing".
A Python package called `pymaybe <https://pypi.org/p/pymaybe/>`_ provides a
rough approximation. The documentation shows the following example::
>>> maybe('VALUE').lower()
'value'
>>> maybe(None).invalid().method().or_else('unknown')
'unknown'
The function ``maybe()`` returns either a ``Something`` instance or a
``Nothing`` instance. Similar to the unary postfix operator described in the
previous section, ``Nothing`` overrides dunder methods in order to allow
chaining on a missing value.
Note that ``or_else()`` is eventually required to retrieve the underlying value
from ``pymaybe``'s wrappers. Furthermore, ``pymaybe`` does not short circuit any
evaluation. Although ``pymaybe`` has some strengths and may be useful in its own
right, it also demonstrates why a pure Python implementation of coalescing is
not nearly as powerful as support built into the language.
The idea of adding a builtin ``maybe`` type to enable this scenario is rejected.
No-Value Protocol
-----------------
These operators could be generalised to user-defined types by using a protocol
to indicate when a value represents no value. Such a protocol may be a dunder
method ``__has_value__(self)` that returns ``True`` if the value should be
treated as having a value, and ``False`` if the ``None``-aware operators should
With this generalization, ``object`` would implement a dunder method equivalent
to this::
def __has_value__(self):
return True
``NoneType`` would implement a dunder method equivalent to this::
2015-10-20 21:44:00 -04:00
def __has_value__(self):
return False
2015-10-20 21:44:00 -04:00
In the specification section, all uses of ``x is None`` would be replaces with
``not x.__has_value__()``.
2015-10-20 21:44:00 -04:00
This generalization would allow for domain-specific ``null`` objects to be
coalesced just like ``None``. For example the ``pyasn1`` package has a type
called ``Null`` that represents an ASN.1 ``null``::
2015-10-20 21:44:00 -04:00
>>> from pyasn1.type import univ
>>> univ.Null() ?? univ.Integer(123)
2015-10-20 21:44:00 -04:00
Integer(123)
However, as ``None`` is already defined as being the value that represents
"no value", and the current specification would not preclude switching to a
protocol in the future, this idea is rejected for now.
References
==========
.. [1] C# Reference: Operators
(https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)
.. [2] A Tour of the Dart Language: Operators
(https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: