PEP: 505
Title: None-aware operators
Version: $Revision$
Last-Modified: $Date$
Author: Mark E. Haase <mehaase@gmail.com>
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.8
PEP Deferral
============
Further consideration of this PEP has been deferred until Python 3.8 at the
earliest.
Abstract
========
Several modern programming languages have so-called "``null``-coalescing" or
"``null``-aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
(starting in version 7). These operators provide syntactic sugar for common
patterns involving null references.

* The "``null``-coalescing" operator is a binary operator that returns its left
  operand if it is not ``null``. Otherwise it returns its right operand.
* The "``null``-aware member access" operator accesses an instance member only
  if that instance is non-``null``. Otherwise it returns ``null``. (This is also
  called a "safe navigation" operator.)
* The "``null``-aware index access" operator accesses an element of a collection
  only if that collection is non-``null``. Otherwise it returns ``null``. (This
  is another type of "safe navigation" operator.)

The purpose of this PEP is to explore the possibility of implementing similar
operators in Python. It provides some background material and then offers
several competing alternatives for implementation.
The initial reaction to this idea has been mostly negative. Even if the proposal
is ultimately rejected, this PEP still serves a purpose: to fully document the
reasons why Python should not add this behavior, so that it can be pointed to in
the future when the question inevitably arises again. (This is the null
alternative, so to speak!)
This proposal advances multiple alternatives, and it should be considered
severable. It may be accepted in whole or in part. For example, the safe
navigation operators might be rejected even if the ``null``-coalescing operator
is approved, or vice-versa.
Of course, Python does not have ``null``; it has ``None``, which is conceptually
distinct. Although this PEP is inspired by "``null``-aware" operators in other
languages, it uses the term "``None``-aware" operators to describe some
hypothetical Python implementations.
Background
==========
Specialness of ``None``
-----------------------
The Python language does not currently define any special behavior for ``None``.
This PEP suggests making ``None`` a special case. This loss of generality is a
noticeable drawback of the proposal. A generalization of ``None``-aware
operators is set forth later in this document in order to avoid this
specialization.
Utility of ``None``
-------------------
One common criticism of adding special syntax for ``None`` is that ``None``
shouldn't be used in the first place: it's a code smell. A related criticism is
that ``None``-aware operators are used to silence errors (such as the novice
misunderstanding of an implicit ``return None``) akin to `PHP's @ operator
<http://php.net/manual/en/language.operators.errorcontrol.php>`_. Therefore,
the utility of ``None`` must be debated before discussing whether to add new
behavior around it.
Python does not have any concept of ``null``. Every Python identifier must
refer to an instance, so there cannot be any ``null`` references. Python does
have a special instance called ``None`` that can be used to represent missing
values, but ``None`` is conceptually distinct from ``null``.
The most frequent use of ``None`` in Python is to provide a default value for
optional arguments when some other default object is unwieldy. For example:
``def get(url, proxy=None):``. In this case, ``proxy`` is an optional
argument. If ``proxy`` is ``None``, then the request should be sent directly to
the server; otherwise, the request should be routed through the specified proxy
server. This use of ``None`` is preferred here to some other sentinel value or
the Null Object Pattern. [3]_
Examples of this form abound. Consider ``types.py`` in the standard library::
    def prepare_class(name, bases=(), kwds=None):
        if kwds is None:
            kwds = {}
        else:
            kwds = dict(kwds)
        ...
Another frequent use of ``None`` is interfacing with external systems. Many of
those other systems have a concept of ``null``. Therefore, Python code must have
a way of representing ``null``, and typically it is represented by ``None``. For
example, databases can have ``null`` values, and most Python database drivers
will convert ``null`` to ``None`` when retrieving data from a database, and will
convert from ``None`` back to ``null`` when sending data to a database.
This convention of interchanging ``null`` and ``None`` is widespread in Python.
It is canonized in the Python DBAPI (PEP-249). [4]_ The ``json`` module in the
standard library and the third party PyYAML package both use ``None`` to
represent their respective languages' ``null``.
The C language ``null`` often bleeds into Python, too, particularly for thin
wrappers around C libraries. For example, in ``pyopenssl``, the ``X509`` class
has a ``get_notBefore()`` `method <https://github.com/pyca/pyopenssl/blob/325787
7f8846e4357b495fa6c9344d01b11cf16d/OpenSSL/crypto.py#L1219>`_ that returns
either a timestamp or ``None``. This function is a thin wrapper around an
OpenSSL function with the return type ``ASN1_TIME *``. Because this C pointer
may be ``null``, the Python wrapper must be able to represent ``null``, and
``None`` is the chosen representation.
The representation of ``null`` is particularly noticeable when Python code is
marshalling data between two systems. For example, consider a Python server that
fetches data from a database and converts it to JSON for consumption by another
process. In this case, it's often desirable that ``null`` in the database can be
easily translated to ``null`` in JSON. If ``None`` is not used for this purpose,
then each package will have to define its own representation of ``null``, and
converting between these representations adds unnecessary complexity to the
Python glue code.
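As a small illustration of this convention, the following sketch carries a
missing value from an in-memory SQLite database to JSON using only the standard
library. The table and column names are made up for the example.

```python
import json
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site (id INTEGER, last_seen TEXT)")
conn.execute("INSERT INTO site VALUES (?, ?)", (1, None))  # None is stored as NULL

row = conn.execute("SELECT id, last_seen FROM site").fetchone()
conn.close()

# The driver hands NULL back as None, and json writes None as null.
payload = json.dumps({"id": row[0], "last_seen": row[1]})
print(row)      # (1, None)
print(payload)  # {"id": 1, "last_seen": null}
```

Neither library needs any configuration for this to work, which is the point:
``None`` already serves as the shared representation of ``null``.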
Therefore, the preference for avoiding ``None`` is nothing more than a
preference. ``None`` has legitimate uses, particularly in specific types of
software. Any hypothetical ``None``-aware operators should be construed as
syntactic sugar for simplifying common patterns involving ``None``, and *should
not be construed* as error handling behavior.
Behavior In Other Languages
---------------------------
Given that ``null``-aware operators exist in other modern languages, it may be
helpful to quickly understand how they work in those languages::
    /* Null-coalescing. */
    String s1 = null;
    String s2 = "hello";
    String s3 = s1 ?? s2;
    Console.WriteLine("s3 is: " + s3);
    // s3 is: hello

    /* Null-aware member access, a.k.a. safe navigation. */
    Console.WriteLine("s1.Length is: " + s1?.Length);
    Console.WriteLine("s2.Length is: " + s2?.Length);
    // s1.Length is:
    // s2.Length is: 5

    /* Null-aware index access, a.k.a. safe navigation. */
    Dictionary<string,string> d1 = null;
    Dictionary<string,string> d2 = new Dictionary<string, string>
    {
        { "foo", "bar" },
        { "baz", "bat" }
    };
    Console.WriteLine("d1[\"foo\"] is: " + d1?["foo"]);
    Console.WriteLine("d2[\"foo\"] is: " + d2?["foo"]);
    // d1["foo"] is:
    // d2["foo"] is: bar

    /* Short Circuiting */
    Console.WriteLine("s1 trimmed length is: " + s1?.Trim().Length);
    Console.WriteLine("s2 trimmed length is: " + s2?.Trim().Length);
    // s1 trimmed length is:
    // s2 trimmed length is: 5

    String s4 = s1 ?? s2 ?? DoError();
    Console.WriteLine("s4 is: " + s4);
    // s4 is: hello
A `working example <https://dotnetfiddle.net/SxQNG8>`_ can be viewed online.
Of utmost importance, notice the short circuiting behavior. The short circuiting
of ``??`` is similar to the short circuiting of boolean operators such as
``||`` or ``&&`` and should not be surprising. Helpfully, ``?.`` is *also* short
circuiting: ``s1?.Trim()`` evaluates to ``null``, but ``s1?.Trim().Length`` does
not attempt to dereference the ``null`` pointer.
Rationale
=========
Existing Alternatives
---------------------
Python does not have any specific ``None``-aware operators, but it does have
operators that can be used for a similar purpose. This section describes why
these alternatives may be undesirable for some common ``None`` patterns.
``or`` Operator
~~~~~~~~~~~~~~~
Similar behavior can be achieved with the ``or`` operator, but ``or`` checks
whether its left operand is false-y, not specifically ``None``. This can lead
to surprising behavior. Consider the scenario of computing the price of some
products a customer has in their shopping cart::
>>> price = 100
>>> default_quantity = 1
# If user didn't specify a quantity, then assume the default.
>>> requested_quantity = None
>>> (requested_quantity or default_quantity) * price
100
# The user added 5 items to the cart.
>>> requested_quantity = 5
>>> (requested_quantity or default_quantity) * price
500
# User removed 5 items from cart.
>>> requested_quantity = 0
>>> (requested_quantity or default_quantity) * price # oops!
100
An experienced Python developer should know how ``or`` works and be capable of
avoiding bugs like this. However, getting in the habit of using ``or`` for this
purpose still might cause an experienced developer to occasionally make this
mistake, especially when refactoring existing code and not carefully paying
attention to the possible values of the left-hand operand.
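For comparison, the bug disappears when the fallback is tied to an explicit
identity test against ``None``. This is a minimal sketch in current Python; the
helper name ``total_price`` is invented for this illustration.

```python
def total_price(requested_quantity, price, default_quantity=1):
    """Fall back to the default only when no quantity was given at all."""
    if requested_quantity is None:
        quantity = default_quantity
    else:
        quantity = requested_quantity  # 0 is a legitimate quantity
    return quantity * price


print(total_price(None, 100))  # 100
print(total_price(5, 100))     # 500
print(total_price(0, 100))     # 0
```

The explicit test is correct, but it takes four lines to say what ``or``
appears to say in one, which is why the shorter and riskier spelling stays
popular.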
For inexperienced developers, the problem is worse. The top Google hit for
"python null coalesce" is a `StackOverflow page
<http://stackoverflow.com/questions/4978738/is-there-a-python-equivalent-of-
the-c-sharp-null-coalescing-operator>`_, and the top answer says to use ``or``.
The top answer goes on to explain the caveats of using ``or`` like this, but how
many beginning developers go on to read all those caveats? The accepted answer
on `a more recent question <http://stackoverflow.com/questions/13710631/is-
there-shorthand-for-returning-a-default-value-if-none-in-python>`_ says to use
``or`` without any caveats at all. These two questions have a combined 26,000
views!
The common usage of ``or`` for the purpose of providing default values is
undeniable, and yet it is also booby-trapped for unsuspecting newcomers. This
suggests that a safe operator for providing default values would have positive
utility. While some critics claim that ``None``-aware operators will be abused
for error handling, they are no more prone to abuse than ``or`` is.
Ternary Operator
~~~~~~~~~~~~~~~~
Another common way to initialize default values is to use the ternary operator.
Here is an excerpt from the popular `Requests package
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81
a8149f5/requests/models.py#L212>`_::
data = [] if data is None else data
files = [] if files is None else files
headers = {} if headers is None else headers
params = {} if params is None else params
hooks = {} if hooks is None else hooks
This particular formulation has the undesirable effect of putting the operands
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
``[]`` as a fallback," but the code puts the fallback *before* the preferred
value.
The author of this package could have written it like this instead::
data = data if data is not None else []
files = files if files is not None else []
headers = headers if headers is not None else {}
params = params if params is not None else {}
hooks = hooks if hooks is not None else {}
This ordering of the operands is more intuitive, but it requires 4 extra
characters (for "not "). It also highlights the repetition of identifiers:
``data if data``, ``files if files``, etc. This example benefits from short
identifiers, but what if the tested expression is longer and/or has side
effects? This is addressed in the next section.
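To see why the repetition matters, consider a tested expression that is a
function call. In the sketch below, ``next_token`` is a hypothetical function
with a side effect: it consumes one item every time it is called.

```python
tokens = iter(["a", "b", "c"])


def next_token():
    """Hypothetical helper: consume and return one token, or None when empty."""
    return next(tokens, None)


# The call appears twice in the ternary, so two tokens are consumed and the
# value that was tested ("a") is not the value that is kept ("b").
value = next_token() if next_token() is not None else "<eof>"
print(value)  # b
```

Binding the result to a temporary name first avoids the double call, at the
cost of an extra line and an extra identifier.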
Motivating Examples
-------------------
The purpose of this PEP is to simplify some common patterns involving ``None``.
This section presents some examples of common ``None`` patterns and explains
the drawbacks.
This first example is from a Python web crawler that uses the popular Flask
framework as a front-end. This function retrieves information about a web site
from a SQL database and formats it as JSON to send to an HTTP client::
    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)
            return jsonify(
                first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
                id=site.id,
                is_active=site.is_active,
                last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
                url=site.url.rstrip('/')
            )
Both ``first_seen`` and ``last_seen`` are allowed to be ``null`` in the
database, and they are also allowed to be ``null`` in the JSON response. JSON
does not have a native way to represent a ``datetime``, so the server's contract
states that any non-``null`` date is represented as an ISO-8601 string.
Note that this code is invalid by PEP-8 standards: several lines are over the
line length limit. In fact, *including it in this document* violates the PEP
formatting standard! But it's not unreasonably indented, nor are any of the
identifiers excessively long. The excessive line length is due to the
repetition of identifiers on both sides of the ternary ``if`` and the verbosity
of the ternary itself (10 characters out of a 78 character line length).
One way to fix this code is to replace each ternary with a full ``if/else``
block::
    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)
            if site.first_seen is None:
                first_seen = None
            else:
                first_seen = site.first_seen.isoformat()
            if site.last_seen is None:
                last_seen = None
            else:
                last_seen = site.last_seen.isoformat()
            return jsonify(
                first_seen=first_seen,
                id=site.id,
                is_active=site.is_active,
                last_seen=last_seen,
                url=site.url.rstrip('/')
            )
This version definitely isn't *bad*. It is easy to read and understand. On the
other hand, adding 8 lines of code to express this common behavior feels a bit
heavy, especially for a deliberately simplified example. If a larger, more
complicated data model was being used, then it would get tedious to continually
write in this long form. The readability would start to suffer as the number of
lines in the function grows, and a refactoring would be forced.
Another alternative is to rename some of the identifiers::
    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)
            fs = site.first_seen
            ls = site.last_seen
            return jsonify(
                first_seen=fs.isoformat() if fs is not None else None,
                id=site.id,
                is_active=site.is_active,
                last_seen=ls.isoformat() if ls is not None else None,
                url=site.url.rstrip('/')
            )
This adds fewer lines of code than the previous example, but it comes at the
expense of introducing extraneous identifiers that amount to nothing more than
aliases. These new identifiers are short enough to fit a ternary expression onto
one line, but the identifiers are also less intuitive, e.g. ``fs`` versus
``first_seen``.
As a quick preview, consider an alternative rewrite using a new operator::
    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)
            return jsonify(
                first_seen=site.first_seen?.isoformat(),
                id=site.id,
                is_active=site.is_active,
                last_seen=site.last_seen?.isoformat(),
                url=site.url.rstrip('/')
            )
The ``?.`` operator behaves as a "safe navigation" operator, allowing a more
concise syntax where the expression ``site.first_seen`` is not duplicated.
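Until such an operator exists, the same effect needs a helper or a conditional.
This is a minimal sketch in current Python, where ``isoformat_or_none`` is a
hypothetical helper rather than anything provided by Flask or the standard
library.

```python
from datetime import datetime


def isoformat_or_none(value):
    """Return value.isoformat(), or None when value itself is None."""
    return None if value is None else value.isoformat()


print(isoformat_or_none(datetime(2015, 9, 18, 12, 30)))  # 2015-09-18T12:30:00
print(isoformat_or_none(None))                            # None
```

A helper like this removes the duplication, but a new one is needed for every
attribute or method that has to be reached through a possibly missing value.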
The next example is from a trending project on GitHub called `Grab
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/gr
ab/upload.py>`_, which is a Python scraping library::
    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            if ctype is None:
                return 'application/octet-stream'
            else:
                return ctype

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            if filename is None:
                self.filename = self.get_random_filename()
            else:
                self.filename = filename
            if content_type is None:
                self.content_type = self.find_content_type(self.filename)
            else:
                self.content_type = content_type

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            if filename is None:
                self.filename = os.path.split(path)[1]
            else:
                self.filename = filename
            if content_type is None:
                self.content_type = self.find_content_type(self.filename)
            else:
                self.content_type = content_type

.. note::

   I don't know the author of the Grab project. I used it as an example
   because it is a trending repo on GitHub and it has good examples of common
   ``None`` patterns.

This example contains several good examples of needing to provide default
values. It is a bit verbose as it is, and it is certainly not improved by the
ternary operator::
    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            return 'application/octet-stream' if ctype is None else ctype

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            self.filename = self.get_random_filename() if filename \
                is None else filename
            self.content_type = self.find_content_type(self.filename) \
                if content_type is None else content_type

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            self.filename = os.path.split(path)[1] if filename is \
                None else filename
            self.content_type = self.find_content_type(self.filename) \
                if content_type is None else content_type
The first ternary expression is tidy, but it reverses the intuitive order of
the operands: it should return ``ctype`` if it has a value and use the string
literal as fallback. The other ternary expressions are unintuitive and so
long that they must be wrapped. The overall readability is worsened, not
improved.
This code *might* be improved, though, if there was a syntactic shortcut for
this common need to supply a default value::
    class BaseUploadObject(object):
        def find_ctype(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            return ctype ?? 'application/octet-stream'

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            self.filename = filename ?? self.get_random_filename()
            self.content_type = content_type ?? self.find_ctype(self.filename)

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            self.filename = filename ?? os.path.split(path)[1]
            self.content_type = content_type ?? self.find_ctype(self.filename)
This syntax has an intuitive ordering of the operands, e.g. ``ctype`` -- the
preferred value -- comes before the fallback value. The terseness of the syntax
also makes for fewer lines of code and less code to visually parse.

.. note::

   I cheated on the last example: I renamed ``find_content_type`` to
   ``find_ctype`` in order to fit two of the lines under 80 characters. If you
   find this underhanded, you can go back and apply the same renaming to the
   previous 2 examples. You'll find that it doesn't change the
   conclusions.

Usage Of ``None`` In The Standard Library
-----------------------------------------
The previous sections show some code patterns that are claimed to be "common",
but how common are they? The attached script `find-pep505.py
<https://github.com/python/peps/blob/master/pep-0505/find-pep505.py>`_ is meant
to answer this question. It uses the ``ast`` module to search for variations of
the following patterns in any ``*.py`` file::
    >>> # None-coalescing if block
    ...
    >>> if a is None:
    ...     a = b

    >>> # [Possible] None-coalescing "or" operator
    ...
    >>> a or 'foo'
    >>> a or []
    >>> a or {}

    >>> # None-coalescing ternary
    ...
    >>> a if a is not None else b
    >>> b if a is None else a

    >>> # Safe navigation "and" operator
    ...
    >>> a and a.foo
    >>> a and a['foo']
    >>> a and a.foo()

    >>> # Safe navigation if block
    ...
    >>> if a is not None:
    ...     a.foo()

    >>> # Safe navigation ternary
    ...
    >>> a.foo if a is not None else b
    >>> b if a is None else a.foo
This script takes one or more names of Python source files to analyze::
$ python3 find-pep505.py test.py
$ find /usr/lib/python3.4 -name '*.py' | xargs python3 find-pep505.py
The script prints out any matches it finds. Sample::
    None-coalescing if block: /usr/lib/python3.4/inspect.py:594
        if _filename is None:
            _filename = getsourcefile(object) or getfile(object)

    [Possible] None-coalescing `or`: /usr/lib/python3.4/lib2to3/refactor.py:191
        self.explicit = explicit or []

    None-coalescing ternary: /usr/lib/python3.4/decimal.py:3909
        self.clamp = clamp if clamp is not None else dc.clamp

    Safe navigation `and`: /usr/lib/python3.4/weakref.py:512
        obj = info and info.weakref()

    Safe navigation `if` block: /usr/lib/python3.4/http/cookiejar.py:1895
        if k is not None:
            lc = k.lower()
        else:
            lc = None

    Safe navigation ternary: /usr/lib/python3.4/sre_parse.py:856
        literals = [None if s is None else s.encode('latin-1') for s in literals]

.. note::

   Coalescing with ``or`` is marked as a "possible" match, because it's not
   trivial to infer whether ``or`` is meant to coalesce False-y values
   (correct) or if it is meant to coalesce ``None`` (incorrect). On the other
   hand, we assume that ``and`` is always incorrect for safe navigation.

The script has been tested against `test.py
<https://github.com/python/peps/blob/master/pep-0505/test.py>`_ and the Python
3.4 standard library, but it should work on any arbitrary Python 3 source code.
The complete output from running it against the standard library is attached to
this proposal as `find-pep505.out
<https://github.com/python/peps/blob/master/pep-0505/find-pep505.out>`_.
The script counts how many matches it finds and prints the totals at the
end::
Total None-coalescing `if` blocks: 426
Total [possible] None-coalescing `or`: 119
Total None-coalescing ternaries: 21
Total Safe navigation `and`: 9
Total Safe navigation `if` blocks: 55
Total Safe navigation ternaries: 7
This is a total of 637 possible matches for these common code patterns in the
standard library. Allowing for some false positives and false negatives, it is
fair to say that these code patterns are definitely common in the standard
library.
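To make the detection approach concrete, here is a minimal sketch of how one of
the patterns above, the ``None``-coalescing ternary, might be found with the
``ast`` module. This is not the attached ``find-pep505.py``; the function names
are invented for illustration, and it assumes Python 3.8 or later, where the
literal ``None`` parses to ``ast.Constant``.

```python
import ast


def is_none(node):
    """True if node is the literal None."""
    return isinstance(node, ast.Constant) and node.value is None


def find_coalescing_ternaries(source):
    """Yield line numbers of expressions shaped like `a if a is not None else b`."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.IfExp):
            continue
        test = node.test
        if (
            isinstance(test, ast.Compare)
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.IsNot)
            and is_none(test.comparators[0])
            # The tested expression must be the same as the "true" branch.
            and ast.dump(test.left) == ast.dump(node.body)
        ):
            yield node.lineno


sample = (
    "x = a if a is not None else b\n"
    "y = a.foo if a is not None else b\n"
    "z = c.d if c.d is not None else []\n"
)
print(sorted(find_coalescing_ternaries(sample)))  # [1, 3]
```

Line 2 of the sample is skipped because its true branch (``a.foo``) differs from
the tested expression (``a``); that shape is a safe navigation ternary, not a
coalescing one.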
Rejected Ideas
--------------
Several related ideas were discussed on python-ideas, and some of these were
roundly rejected by the BDFL, the community, or both. For posterity's sake, some
of those ideas are recorded here.
``None``-aware Function Call
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``None``-aware syntax applies to attribute and index access, so it seems
natural to ask if it should also apply to function invocation syntax. It might
be written as ``foo?()``, where ``foo`` is only called if it is not None. This
idea was quickly rejected, for several reasons.
First, no other mainstream language has such syntax. Second, Python evaluates
arguments to a function before it looks up the function itself, so
``foo?(bar())`` would still call ``bar()`` even if ``foo`` is ``None``. This
behaviour is unexpected for a so-called "short-circuiting" operator.
``?`` Unary Postfix Operator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To generalize the ``None``-aware behavior and limit the number of new operators
introduced, a unary, postfix operator spelled ``?`` was suggested. The idea is
that ``?`` might return a special object that overrides dunder methods to
return ``self``. For example, ``foo?`` would evaluate to ``foo`` if it is not
``None``, otherwise it would evaluate to an instance of ``NoneQuestion``::
    class NoneQuestion():
        def __call__(self, *args, **kwargs):
            return self
        def __getattr__(self, name):
            return self
        def __getitem__(self, key):
            return self
With this new operator and new type, an expression like ``foo?.bar[baz]``
evaluates to ``NoneQuestion`` if ``foo`` is None. This is a nifty
generalization, but it's difficult to use in practice since most existing code
won't know what ``NoneQuestion`` is.
Going back to one of the motivating examples above, consider the following::
    >>> import json
    >>> created = None
    >>> json.dumps({'created': created?.isoformat()})
The JSON serializer does not know how to serialize ``NoneQuestion``, nor will
any other API. This proposal actually requires *lots of specialized logic*
throughout the standard library and any third party library.
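The failure is easy to reproduce today by emulating the proposed operator with
an ordinary function. In this sketch, ``question`` is a hypothetical stand-in
for the postfix ``?``, and the class repeats the definition given above.

```python
import json


class NoneQuestion:
    """Absorbs calls, attribute access, and indexing, as sketched above."""

    def __call__(self, *args, **kwargs):
        return self

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self


def question(obj):
    """Hypothetical stand-in for the postfix ``?`` operator."""
    return NoneQuestion() if obj is None else obj


created = None
value = question(created).isoformat()   # emulates created?.isoformat()
print(isinstance(value, NoneQuestion))  # True

try:
    json.dumps({"created": value})
except TypeError as exc:
    print("json.dumps failed:", exc)
```

The chained access succeeds, but the result is a ``NoneQuestion`` instance
rather than ``None``, so the serializer rejects it.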
At the same time, the ``?`` operator may also be **too general**, in the sense
that it can be combined with any other operator. What should the following
expressions mean?::
>>> x? + 1
>>> x? -= 1
>>> x? == 1
>>> ~x?
This degree of generalization is not useful. The operators actually proposed
herein are intentionally limited to a few operators that are expected to make it
easier to write common code patterns.
Haskell-style ``Maybe``
~~~~~~~~~~~~~~~~~~~~~~~
Haskell has a concept called `Maybe <https://wiki.haskell.org/Maybe>`_ that
encapsulates the idea of an optional value without relying on any special
keyword (e.g. ``null``) or any special instance (e.g. ``None``). In Haskell, the
purpose of ``Maybe`` is to avoid separate handling of "something" and "nothing".
The concept is so heavily intertwined with Haskell's lazy evaluation that it
doesn't translate cleanly into Python.
There is a Python package called `pymaybe
<https://pypi.python.org/pypi/pymaybe/0.1.1>`_ that provides a rough
approximation. The documentation shows the following example that appears
relevant to the discussion at hand::
>>> maybe('VALUE').lower()
'value'
>>> maybe(None).invalid().method().or_else('unknown')
'unknown'
The function ``maybe()`` returns either a ``Something`` instance or a
``Nothing`` instance. Similar to the unary postfix operator described in the
previous section, ``Nothing`` overrides dunder methods in order to allow
chaining on a missing value.
Note that ``or_else()`` is eventually required to retrieve the underlying value
from ``pymaybe``'s wrappers. Furthermore, ``pymaybe`` does not short circuit any
evaluation. Although ``pymaybe`` has some strengths and may be useful in its own
right, it also demonstrates why a pure Python implementation of coalescing is
not nearly as powerful as support built into the language.
Specification
=============
This PEP suggests 3 new operators be added to Python:
1. ``None``-coalescing operator
2. ``None``-aware attribute access
3. ``None``-aware index access/slicing
We will continue to assume the same spellings as in
the previous sections in order to focus on behavior before diving into the much
more contentious issue of how to spell these operators.
A generalization of these operators is also proposed below under the heading
"Generalized Coalescing".
Operator Spelling
-----------------
Despite significant support for the proposed operators, the majority of
discussion on python-ideas fixated on the spelling. Many alternative spellings
were proposed, both punctuation and keywords, but each alternative drew some
criticism. Spelling the operator as a keyword is problematic, because adding new
keywords to the language is not backwards compatible.
It is not impossible to add a new keyword, however, and we can look at several
other PEPs for inspiration. For example, `PEP-492
<https://www.python.org/dev/peps/pep-0492/>`_ introduced the new keywords
``async`` and ``await`` into Python 3.5. These new keywords are fully backwards
compatible, because that PEP also introduces a new lexical context such that
``async`` and ``await`` are only treated as keywords when used inside of an
``async def`` function. In other locations, ``async`` and ``await`` may be used
as identifiers.
It is also possible to craft a new operator out of existing keywords, as was
the case with `PEP-308 <https://www.python.org/dev/peps/pep-0308/>`_, which
created a ternary operator by cobbling together the ``if`` and ``else`` keywords
into a new operator.
In addition to the lexical acrobatics required to create a new keyword, keyword
operators are also undesirable for creating an assignment shortcut syntax. In
Dart, for example, ``x ??= y`` is an assignment shortcut that approximately
means ``x = x ?? y`` except that ``x`` is only evaluated once. If Python's
coalesce operator is a keyword, e.g. ``foo``, then the assignment shortcut would
be very ugly: ``x foo= y``.
Spelling new logical operators with punctuation is unlikely, for several
reasons. First, Python eschews punctuation for logical operators. For example,
it uses ``not`` instead of ``!``, ``or`` instead of ``||``, and ``… if … else …``
instead of ``… ? … : …``.
Second, nearly every single punctuation character on a standard keyboard already
has special meaning in Python. The only exceptions are ``$``, ``!``, ``?``, and
backtick (as of Python 3). This leaves few options for a new, single-character
operator.
Third, other projects in the Python universe assign special meaning to
punctuation. For example, `IPython
<https://ipython.org/ipython-doc/2/interactive/reference.html>`_ assigns
special meaning to ``%``, ``%%``, ``?``, ``??``, ``$``, and ``$$``, among
others. Out of deference to those projects and the large communities using them,
introducing conflicting syntax into Python is undesirable.
The spellings ``??`` and ``?.`` will be familiar to programmers who have seen
them in other popular programming languages. Any alternative punctuation will be
just as ugly but without the benefit of familiarity from other languages.
Therefore, this proposal spells the new operators using the same punctuation
that already exists in other languages.
``None``-Coalescing Operator
----------------------------
The ``None``-coalescing operator is a short-circuiting, binary operator that
behaves in the following way.
1. Evaluate the left operand first.
2. If the left operand is not ``None``, then return it immediately.
3. Else, evaluate the right operand and return the result.
Consider the following examples. We will continue to use the spelling ``??``
here, but keep in mind that alternative spellings will be discussed below::
>>> 1 ?? 2
1
>>> None ?? 2
2
Importantly, note that the right operand is not evaluated unless the left
operand is None::
>>> def err(): raise Exception('foo')
>>> 1 ?? err()
1
>>> None ?? err()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in err
Exception: foo
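In current Python, a plain function cannot reproduce this short circuiting
directly, because arguments are evaluated before the call is made. A rough
emulation has to wrap the right operand in a callable. This is a sketch, with
``coalesce_lazy`` as a hypothetical name.

```python
def coalesce_lazy(left, make_right):
    """Emulate `left ?? right`; make_right is only called when left is None."""
    if left is not None:
        return left
    return make_right()


def err():
    raise Exception('foo')


print(coalesce_lazy(1, err))           # 1, err is never called
print(coalesce_lazy(None, lambda: 2))  # 2
print(coalesce_lazy(0, lambda: 2))     # 0, only None triggers the fallback
```

The extra ``lambda`` at every call site is exactly the kind of noise that
dedicated syntax would remove.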
The operator is left associative. Combined with its short circuiting behavior,
this makes the operator easy to chain::
>>> timeout = None
>>> local_timeout = 60
>>> global_timeout = 300
>>> timeout ?? local_timeout ?? global_timeout
60
>>> local_timeout = None
>>> timeout ?? local_timeout ?? global_timeout
300
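A chain like this can be approximated today with a variadic helper that returns
the first argument that is not ``None``. This is a sketch with a hypothetical
name, ``first_not_none``; unlike the proposed operator, every argument is
evaluated before the call, so nothing is short circuited.

```python
def first_not_none(*values):
    """Return the first value that is not None, or None if there is none."""
    for value in values:
        if value is not None:
            return value
    return None


timeout = None
local_timeout = 60
global_timeout = 300
print(first_not_none(timeout, local_timeout, global_timeout))  # 60

local_timeout = None
print(first_not_none(timeout, local_timeout, global_timeout))  # 300
```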
The operator has higher precedence than the comparison operators ``==``, ``>``,
``is``, etc., but lower precedence than any bitwise or arithmetic operators.
This precedence is chosen for making "default value" expressions intuitive to
read and write::
    >>> not None ?? True
    >>> not (None ?? True) # Equivalent grouping
    >>> 1 == None ?? 1
    >>> 1 == (None ?? 1) # Equivalent grouping
    >>> 'foo' in None ?? ['foo', 'bar']
    >>> 'foo' in (None ?? ['foo', 'bar']) # Equivalent grouping
    >>> None ?? 1 + 2
    >>> None ?? (1 + 2) # Equivalent grouping
Recall the example above of calculating the cost of items in a shopping cart,
and the easy-to-miss bug. This type of bug is not possible with the
``None``-coalescing operator, because there is no implicit type coercion to
``bool``::

    >>> requested_quantity = 0
    >>> default_quantity = 1
    >>> price = 100
    >>> requested_quantity ?? default_quantity * price
    0

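The difference from ``or`` can be checked in current Python by spelling out
the ``None`` test by hand. This is a small sketch; the ``quantity_with_*``
names are introduced here for illustration only:

```python
requested_quantity = 0
default_quantity = 1
price = 100

# ``or`` treats the valid quantity 0 as missing and falls back to the default.
quantity_with_or = requested_quantity or default_quantity

# An explicit ``is not None`` test keeps 0, as the proposed operator would.
quantity_with_check = (
    requested_quantity if requested_quantity is not None else default_quantity
)

print(quantity_with_or * price)     # 100
print(quantity_with_check * price)  # 0
```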
The ``None``-coalescing operator also has a corresponding assignment shortcut.
The following assignments are semantically similar, except that ``foo`` is only
looked up once when using the assignment shortcut::

    >>> foo ??= []
    >>> foo = foo ?? []

The ``None``-coalescing operator improves readability, especially when handling
default function arguments. Consider again the example from the Requests
library, rewritten to use ``None``-coalescing::

    def __init__(self, data=None, files=None, headers=None, params=None, hooks=None):
        self.data = data ?? []
        self.files = files ?? []
        self.headers = headers ?? {}
        self.params = params ?? {}
        self.hooks = hooks ?? {}

The operator makes the intent easier to follow (by putting operands in an
intuitive order) and is more concise than the ternary operator, while still
preserving the short-circuit semantics of the code that it replaces.
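For comparison, the same defaults can be written in current Python with
explicit ``is None`` tests. This is a minimal sketch using a hypothetical,
simplified ``Request`` class; it is not the actual Requests source:

```python
class Request:
    """Hypothetical, simplified class; not the real Requests implementation."""

    def __init__(self, data=None, files=None, headers=None, params=None, hooks=None):
        # Each line spells out what ``x ?? default`` would express.
        self.data = [] if data is None else data
        self.files = [] if files is None else files
        self.headers = {} if headers is None else headers
        self.params = {} if params is None else params
        self.hooks = {} if hooks is None else hooks


shared_headers = {}
request = Request(headers=shared_headers)
# An empty dict passed by the caller is kept, not replaced by a new default.
print(request.headers is shared_headers)  # True
print(request.data)                       # []
```

The conditional expressions put the default before the argument, which is the
ordering problem the proposed operator is meant to avoid.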
``None``-Aware Attribute Access Operator
----------------------------------------
The ``None``-aware attribute access operator (also called "safe navigation")
checks its left operand. If the left operand is ``None``, then the operator
evaluates to ``None``. If the left operand is not ``None``, then the
operator accesses the attribute named by the right operand::

    >>> from datetime import date
    >>> d = date.today()
    >>> d.year
    2015

    >>> d = None
    >>> d.year
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'year'

    >>> d?.year
    None

The operator has the same precedence and associativity as the plain attribute
access operator ``.``, but this operator is also short-circuiting in a unique
way: if the left operand is ``None``, then any series of attribute access, index
access, slicing, or function call operators immediately to the right of it *are
not evaluated*::

    >>> name = ' The Black Knight '
    >>> name.strip()[4:].upper()
    'BLACK KNIGHT'

    >>> name = None
    >>> name?.strip()[4:].upper()
    None

If this operator did not short circuit in this way, then the second example
would partially evaluate ``name?.strip()`` to ``None()`` and then fail with
``TypeError: 'NoneType' object is not callable``.
To put it another way, the following expressions are semantically similar,
except that ``name`` is only looked up once on the first line::

    >>> name?.strip()[4:].upper()
    >>> name.strip()[4:].upper() if name is not None else None

.. note::

    C# implements its safe navigation operators with the same short-circuiting
    semantics, but Dart does not. In Dart, the second example (suitably
    translated) would fail. The C# semantics are obviously superior, given the
    original goal of writing common cases more concisely. The Dart semantics are
    nearly useless.

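The conditional-expression form of ``name?.strip()[4:].upper()`` runs in
current Python. The following is a minimal sketch that wraps it in a
hypothetical function ``shout`` so that both cases can be exercised:

```python
from typing import Optional


def shout(name: Optional[str]) -> Optional[str]:
    """Hypothetical helper mirroring ``name?.strip()[4:].upper()``."""
    # When name is None, the whole trailer chain (.strip(), [4:], .upper())
    # is skipped and the result is None.
    return name.strip()[4:].upper() if name is not None else None


print(shout(' The Black Knight '))  # BLACK KNIGHT
print(shout(None))                  # None
```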
This operator short circuits one or more attribute access, index access,
slicing, or function call operators that are adjacent to its right, but it
does not short circuit any other operators (logical, bitwise, arithmetic, etc.),
nor does it escape parentheses::

    >>> d = date.today()
    >>> d?.year.numerator + 1
    2016

    >>> d = None
    >>> d?.year.numerator + 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

    >>> (d?.year).numerator + 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'numerator'

Note that the error in the second example is not on the attribute access
``numerator``. In fact, that attribute access is never performed. The error
occurs when adding ``None + 1``, because the ``None``-aware attribute access
does not short circuit ``+``.
The third example fails because the operator does not escape parentheses. In
that example, the attribute access ``numerator`` is evaluated and fails because
``None`` does not have that attribute.
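Both failures can be reproduced in current Python by writing out the
conditional expressions that the proposed operator would correspond to. The
following is a sketch under that assumption; the helper names are hypothetical:

```python
from datetime import date


def year_plus_one(d):
    """Mirrors ``d?.year.numerator + 1``: only the trailers are skipped."""
    return (d.year.numerator if d is not None else None) + 1


def parenthesized(d):
    """Mirrors ``(d?.year).numerator + 1``: the parentheses end the skip."""
    year = d.year if d is not None else None
    return year.numerator + 1


print(year_plus_one(date(2015, 9, 18)))  # 2016

for func in (year_plus_one, parenthesized):
    try:
        func(None)
    except (TypeError, AttributeError) as exc:
        # TypeError for the first helper, AttributeError for the second.
        print(type(exc).__name__)
```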
Finally, observe that short circuiting adjacent operators is not at all the same
thing as propagating ``None`` throughout an expression::

    >>> user?.first_name.upper()

If ``user`` is not ``None``, then ``user.first_name`` is evaluated. If
``user.first_name`` evaluates to ``None``, then ``user.first_name.upper()`` is
an error! In English, this expression says, "``user`` is optional but if it has
a value, then it must have a ``first_name``, too."
If ``first_name`` is supposed to be an optional attribute, then the expression
must make that explicit::

    >>> user?.first_name?.upper()

The operator is not intended as an error silencing mechanism, and it would be
undesirable if its presence infected nearby operators.
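The distinction between the two spellings can be sketched in current Python.
The ``User`` class and both helper functions below are hypothetical and exist
only for this illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """Hypothetical record type used only for this illustration."""
    first_name: Optional[str] = None


def upper_required(user):
    """Mirrors ``user?.first_name.upper()``."""
    return user.first_name.upper() if user is not None else None


def upper_optional(user):
    """Mirrors ``user?.first_name?.upper()``."""
    first_name = user.first_name if user is not None else None
    return first_name.upper() if first_name is not None else None


print(upper_required(None))            # None
print(upper_optional(User()))          # None
print(upper_optional(User('arthur')))  # ARTHUR
try:
    upper_required(User())             # first_name is None
except AttributeError as exc:
    print('AttributeError:', exc)
```

Only the operand explicitly marked as optional is allowed to be ``None``; a
``None`` anywhere else still raises.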
``None``-Aware Index Access/Slicing Operator
--------------------------------------------
The ``None``-aware index access/slicing operator (also called "safe navigation")
is nearly identical to the ``None``-aware attribute access operator. It combines
the familiar square bracket syntax ``[]`` with new punctuation or a new keyword,
the spelling of which is discussed later::

    >>> person = {'name': 'Mark', 'age': 32}
    >>> person['name']
    'Mark'

    >>> person = None
    >>> person['name']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'NoneType' object is not subscriptable

    >>> person?.['name']
    None

The ``None``-aware slicing operator behaves similarly::

    >>> name = 'The Black Knight'
    >>> name[4:]
    'Black Knight'

    >>> name = None
    >>> name[4:]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'NoneType' object is not subscriptable

    >>> name?.[4:]
    None

These operators have the same precedence as the plain index access and slicing
operators. They also have the same short-circuiting behavior as the
``None``-aware attribute access.
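Index access and slicing can be emulated the same way in current Python. This
is a minimal sketch, assuming a hypothetical helper ``safe_index``; a slice such
as ``[4:]`` is passed as ``slice(4, None)``:

```python
def safe_index(container, key):
    """Hypothetical helper mirroring ``container?.[key]``."""
    return container[key] if container is not None else None


person = {'name': 'Mark', 'age': 32}
print(safe_index(person, 'name'))                      # Mark
print(safe_index(None, 'name'))                        # None
print(safe_index('The Black Knight', slice(4, None)))  # Black Knight

try:
    safe_index(person, 'email')
except KeyError:
    # Only a None container is special; a missing key is still an error.
    print('KeyError')
```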
Generalized Coalescing
----------------------
Making ``None`` a special case is too specialized and magical. The behavior can
be generalized by making the ``None``-aware operators invoke a dunder method,
e.g. ``__coalesce__(self)`` that returns ``True`` if an object should be
coalesced and ``False`` otherwise.
With this generalization, ``object`` would implement a dunder method equivalent
to this::

    def __coalesce__(self):
        return False

``NoneType`` would implement a dunder method equivalent to this::

    def __coalesce__(self):
        return True

If this generalization is accepted, then the operators will need to be renamed
such that the term ``None`` is not used, e.g. "Coalescing Operator", "Coalesced
Member Access Operator", etc.
The coalesce operator would invoke this dunder method. The following two
expressions are semantically similar, except ``foo`` is only looked up once when
using the coalesce operator::

    >>> foo ?? bar
    >>> bar if foo.__coalesce__() else foo

The coalesced attribute and index access operators would invoke the same dunder
method::

    >>> user?.first_name.upper()
    >>> None if user.__coalesce__() else user.first_name.upper()

This generalization allows for domain-specific ``null`` objects to be coalesced
just like ``None``. For example, the ``pyasn1`` package has a type called
``Null`` that represents an ASN.1 ``null``::

    >>> from pyasn1.type import univ
    >>> univ.Null() ?? univ.Integer(123)
    Integer(123)

In addition to making the proposed operators less specialized, this
generalization also makes it easier to work with the Null Object Pattern, [3]_
for those developers who prefer to avoid using ``None``.
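The generalized protocol can be emulated in current Python, where neither
``object`` nor ``NoneType`` defines ``__coalesce__``. The following is a sketch
under that assumption; ``Null``, ``should_coalesce``, and ``coalesce`` are
hypothetical names, and ``Null`` is a stand-in rather than the ``pyasn1`` type:

```python
class Null:
    """Hypothetical domain-specific null object (not the pyasn1 type)."""

    def __coalesce__(self):
        return True


def should_coalesce(value):
    """Hypothetical emulation of the proposed ``__coalesce__`` lookup."""
    if value is None:
        return True  # stands in for NoneType.__coalesce__
    # Look the method up on the type, as Python does for special methods.
    # Types without it behave like the proposed object.__coalesce__ (False).
    method = getattr(type(value), '__coalesce__', None)
    return method(value) if method is not None else False


def coalesce(left, right):
    """Mirrors ``left ?? right``; the right operand is evaluated eagerly here."""
    return right if should_coalesce(left) else left


print(coalesce(None, 123))    # 123
print(coalesce(0, 123))       # 0
print(coalesce(Null(), 123))  # 123
```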
Implementation
--------------
The author of this PEP is not competent with grammars or lexers, and given the
contentiousness of this proposal, the implementation details for CPython will be
deferred until we have a clearer idea that one or more of the proposed
enhancements will be approved.
...TBD...
References
==========
.. [1] C# Reference: Operators
   (https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)

.. [2] A Tour of the Dart Language: Operators
   (https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)

.. [3] Wikipedia: Null Object Pattern
   (https://en.wikipedia.org/wiki/Null_Object_pattern)

.. [4] PEP-249
   (https://www.python.org/dev/peps/pep-0249/)

.. [5] PEP-308
   (https://www.python.org/dev/peps/pep-0308/)

Copyright
=========
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: