PEP: 505
Title: None-aware operators
Version: $Revision$
Last-Modified: $Date$
Author: Mark E. Haase <mehaase@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.6

Abstract
========

Several modern programming languages have so-called "``null``-coalescing" or
"``null``-aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
(starting in version 7). These operators provide syntactic sugar for common
patterns involving null references.

* The "``null``-coalescing" operator is a binary operator that returns its left
  operand if it is not ``null``. Otherwise it returns its right operand.
* The "``null``-aware member access" operator accesses an instance member only
  if that instance is non-``null``. Otherwise it returns ``null``. (This is also
  called a "safe navigation" operator.)
* The "``null``-aware index access" operator accesses an element of a collection
  only if that collection is non-``null``. Otherwise it returns ``null``. (This
  is another type of "safe navigation" operator.)

The purpose of this PEP is to explore the possibility of implementing similar
operators in Python. It provides some background material and then offers
several competing alternatives for implementation.

The initial reaction to this idea has been mostly negative. Even if ultimately
rejected, this PEP still serves a purpose: to fully document the reasons why
Python should not add this behavior, so that it can be pointed to in the future
when the question inevitably arises again. (This is the null alternative, so to
speak!)

This proposal advances multiple alternatives, and it should be considered
severable. It may be accepted in whole or in part. For example, the safe
navigation operators might be rejected even if the ``null``-coalescing operator
is approved, or vice-versa.

Of course, Python does not have ``null``; it has ``None``, which is conceptually
distinct. Although this PEP is inspired by "``null``-aware" operators in other
languages, it uses the term "``None``-aware" operators to describe some
hypothetical Python implementations.

Background
==========

Specialness of ``None``
-----------------------

The Python language does not currently define any special behavior for ``None``.
This PEP suggests making ``None`` a special case. This loss of generality is a
noticeable drawback of the proposal. A generalization of ``None``-aware
operators is set forth later in this document in order to avoid this
specialization.

Utility of ``None``
-------------------

One common criticism of adding special syntax for ``None`` is that ``None``
shouldn't be used in the first place: it's a code smell. A related criticism is
that ``None``-aware operators are used to silence errors (such as the novice
misunderstanding of an implicit ``return None``) akin to `PHP's @ operator
<http://php.net/manual/en/language.operators.errorcontrol.php>`_. Therefore,
the utility of ``None`` must be debated before discussing whether to add new
behavior around it.

Python does not have any concept of ``null``. Every Python identifier must
refer to an instance, so there cannot be any ``null`` references. Python does
have a special instance called ``None`` that can be used to represent missing
values, but ``None`` is conceptually distinct from ``null``.

The most frequent use of ``None`` in Python is to provide a default value for
optional arguments when some other default object is unwieldy. For example:
``def get(url, proxy=None):``. In this case, ``proxy`` is an optional
argument. If ``proxy`` is ``None``, then the request should be sent directly to
the server; otherwise, the request should be routed through the specified proxy
server. This use of ``None`` is preferred here to some other sentinel value or
the Null Object Pattern. [3]_

Examples of this form abound. Consider ``types.py`` in the standard library::

    def prepare_class(name, bases=(), kwds=None):
        if kwds is None:
            kwds = {}
        else:
            kwds = dict(kwds)
        ...

Another frequent use of ``None`` is interfacing with external systems. Many of
those other systems have a concept of ``null``. Therefore, Python code must have
a way of representing ``null``, and typically it is represented by ``None``. For
example, databases can have ``null`` values, and most Python database drivers
will convert ``null`` to ``None`` when retrieving data from a database, and will
convert from ``None`` back to ``null`` when sending data to a database.

This convention of interchanging ``null`` and ``None`` is widespread in Python.
It is canonized in the Python DBAPI (PEP-249). [4]_ The ``json`` module in the
standard library and the third party PyYAML package both use ``None`` to
represent their respective languages' ``null``.

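As a small illustration of this convention, the following sketch round-trips a
database ``null`` through ``None`` into a JSON ``null`` using only the standard
library. The table and column names are hypothetical and chosen only for the
example.

```python
import json
import sqlite3

# Hypothetical in-memory table; the names are for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site (url TEXT, last_seen TEXT)")
conn.execute("INSERT INTO site VALUES (?, ?)", ("http://example.com", None))

# SQL NULL comes back from the driver as None...
url, last_seen = conn.execute("SELECT url, last_seen FROM site").fetchone()
conn.close()

# ...and the json module serializes None as the JSON literal null.
payload = json.dumps({"url": url, "last_seen": last_seen}, sort_keys=True)
print(payload)  # {"last_seen": null, "url": "http://example.com"}
```

No translation layer is needed between the two libraries because both agree on
``None`` as the representation of a missing value.
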
The C language ``null`` often bleeds into Python, too, particularly for thin
wrappers around C libraries. For example, in ``pyopenssl``, the ``X509`` class
has a ``get_notBefore()`` `method <https://github.com/pyca/pyopenssl/blob/325787
7f8846e4357b495fa6c9344d01b11cf16d/OpenSSL/crypto.py#L1219>`_ that returns
either a timestamp or ``None``. This function is a thin wrapper around an
OpenSSL function with the return type ``ASN1_TIME *``. Because this C pointer
may be ``null``, the Python wrapper must be able to represent ``null``, and
``None`` is the chosen representation.

The representation of ``null`` is particularly noticeable when Python code is
marshalling data between two systems. For example, consider a Python server that
fetches data from a database and converts it to JSON for consumption by another
process. In this case, it's often desirable that ``null`` in the database can be
easily translated to ``null`` in JSON. If ``None`` is not used for this purpose,
then each package will have to define its own representation of ``null``, and
converting between these representations adds unnecessary complexity to the
Python glue code.

Therefore, the preference for avoiding ``None`` is nothing more than a
preference. ``None`` has legitimate uses, particularly in specific types of
software. Any hypothetical ``None``-aware operators should be construed as
syntactic sugar for simplifying common patterns involving ``None``, and *should
not be construed* as error handling behavior.

Behavior In Other Languages
---------------------------

Given that ``null``-aware operators exist in other modern languages, it may be
helpful to quickly understand how they work in those languages. The following
example is written in C#::

    /* Null-coalescing. */

    String s1 = null;
    String s2 = "hello";
    String s3 = s1 ?? s2;
    Console.WriteLine("s3 is: " + s3);
    // s3 is: hello

    /* Null-aware member access, a.k.a. safe navigation. */

    Console.WriteLine("s1.Length is: " + s1?.Length);
    Console.WriteLine("s2.Length is: " + s2?.Length);
    // s1.Length is:
    // s2.Length is: 5

    /* Null-aware index access, a.k.a. safe navigation. */

    Dictionary<string,string> d1 = null;
    Dictionary<string,string> d2 = new Dictionary<string, string>
    {
        { "foo", "bar" },
        { "baz", "bat" }
    };

    Console.WriteLine("d1[\"foo\"] is: " + d1?["foo"]);
    Console.WriteLine("d2[\"foo\"] is: " + d2?["foo"]);
    // d1["foo"] is:
    // d2["foo"] is: bar

    /* Short Circuiting */

    Console.WriteLine("s1 trimmed length is: " + s1?.Trim().Length);
    Console.WriteLine("s2 trimmed length is: " + s2?.Trim().Length);
    // s1 trimmed length is:
    // s2 trimmed length is: 5

    String s4 = s1 ?? s2 ?? DoError();
    Console.WriteLine("s4 is: " + s4);
    // s4 is: hello

A `working example <https://dotnetfiddle.net/SxQNG8>`_ can be viewed online.

Of utmost importance, notice the short circuiting behavior. The short circuiting
of ``??`` is similar to short circuiting of other boolean operators such as
``||`` or ``&&`` and should not be surprising. Helpfully, ``?.`` is *also* short
circuiting: ``s1?.Trim()`` evaluates to null, but ``s1?.Trim().Length`` does not
attempt to dereference the ``null`` pointer.

Rationale
=========

Existing Alternatives
---------------------

Python does not have any specific ``None``-aware operators, but it does have
operators that can be used for a similar purpose. This section describes why
these alternatives may be undesirable for some common ``None`` patterns.

``or`` Operator
~~~~~~~~~~~~~~~

Similar behavior can be achieved with the ``or`` operator, but ``or`` checks
whether its left operand is false-y, not specifically ``None``. This can lead
to surprising behavior. Consider the scenario of computing the price of some
products a customer has in his/her shopping cart::

    >>> price = 100
    >>> default_quantity = 1
    # If user didn't specify a quantity, then assume the default.
    >>> requested_quantity = None
    >>> (requested_quantity or default_quantity) * price
    100
    # The user added 5 items to the cart.
    >>> requested_quantity = 5
    >>> (requested_quantity or default_quantity) * price
    500
    # User removed 5 items from cart.
    >>> requested_quantity = 0
    >>> (requested_quantity or default_quantity) * price # oops!
    100

An experienced Python developer should know how ``or`` works and be capable of
avoiding bugs like this. However, getting in the habit of using ``or`` for this
purpose still might cause an experienced developer to occasionally make this
mistake, especially when refactoring existing code and not carefully paying
attention to the possible values of the left-hand operand.

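For comparison, here is a minimal sketch of the same computation with an
explicit ``is None`` test, which treats a quantity of ``0`` as a real value
rather than as "missing". The helper name ``effective_quantity`` is
hypothetical and is introduced only for this illustration.

```python
def effective_quantity(requested_quantity, default_quantity=1):
    """Return the default only when no quantity was given at all."""
    if requested_quantity is None:
        return default_quantity
    return requested_quantity


price = 100
# None means "not specified", 5 and 0 are explicit quantities.
totals = [effective_quantity(q) * price for q in (None, 5, 0)]
print(totals)  # [100, 500, 0]
```

The explicit test is correct, but it is noticeably longer than the ``or``
idiom, which helps explain why ``or`` remains tempting.
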
For inexperienced developers, the problem is worse. The top Google hit for
"python null coalesce" is a `StackOverflow page
<http://stackoverflow.com/questions/4978738/is-there-a-python-equivalent-of-
the-c-sharp-null-coalescing-operator>`_, and the top answer says to use ``or``.
The top answer goes on to explain the caveats of using ``or`` like this, but how
many beginning developers go on to read all those caveats? The accepted answer
on `a more recent question <http://stackoverflow.com/questions/13710631/is-
there-shorthand-for-returning-a-default-value-if-none-in-python>`_ says to use
``or`` without any caveats at all. These two questions have a combined 26,000
views!

The common usage of ``or`` for the purpose of providing default values is
undeniable, and yet it is also booby-trapped for unsuspecting newcomers. This
suggests that a safe operator for providing default values would have positive
utility. While some critics claim that ``None``-aware operators will be abused
for error handling, they are no more prone to abuse than ``or`` is.

Ternary Operator
~~~~~~~~~~~~~~~~

Another common way to initialize default values is to use the ternary operator.
Here is an excerpt from the popular `Requests package
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81
a8149f5/requests/models.py#L212>`_::

    data = [] if data is None else data
    files = [] if files is None else files
    headers = {} if headers is None else headers
    params = {} if params is None else params
    hooks = {} if hooks is None else hooks

This particular formulation has the undesirable effect of putting the operands
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
``[]`` as a fallback," but the code puts the fallback *before* the preferred
value.

The author of this package could have written it like this instead::

    data = data if data is not None else []
    files = files if files is not None else []
    headers = headers if headers is not None else {}
    params = params if params is not None else {}
    hooks = hooks if hooks is not None else {}

This ordering of the operands is more intuitive, but it requires 4 extra
characters (for "not "). It also highlights the repetition of identifiers:
``data if data``, ``files if files``, etc. This example benefits from short
identifiers, but what if the tested expression is longer and/or has side
effects? This is addressed in the next section.

Motivating Examples
-------------------

The purpose of this PEP is to simplify some common patterns involving ``None``.
This section presents some examples of common ``None`` patterns and explains
the drawbacks.

This first example is from a Python web crawler that uses the popular Flask
framework as a front-end. This function retrieves information about a web site
from a SQL database and formats it as JSON to send to an HTTP client::

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            return jsonify(
                first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
                id=site.id,
                is_active=site.is_active,
                last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
                url=site.url.rstrip('/')
            )

Both ``first_seen`` and ``last_seen`` are allowed to be ``null`` in the
database, and they are also allowed to be ``null`` in the JSON response. JSON
does not have a native way to represent a ``datetime``, so the server's contract
states that any non-``null`` date is represented as an ISO-8601 string.

Note that this code is invalid by PEP-8 standards: several lines are over the
line length limit. In fact, *including it in this document* violates the PEP
formatting standard! But it's not unreasonably indented, nor are any of the
identifiers excessively long. The excessive line length is due to the
repetition of identifiers on both sides of the ternary ``if`` and the verbosity
of the ternary itself (10 characters out of a 78 character line length).

One way to fix this code is to replace each ternary with a full ``if/else``
block::

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            if site.first_seen is None:
                first_seen = None
            else:
                first_seen = site.first_seen.isoformat()

            if site.last_seen is None:
                last_seen = None
            else:
                last_seen = site.last_seen.isoformat()

            return jsonify(
                first_seen=first_seen,
                id=site.id,
                is_active=site.is_active,
                last_seen=last_seen,
                url=site.url.rstrip('/')
            )

This version definitely isn't *bad*. It is easy to read and understand. On the
other hand, adding 8 lines of code to express this common behavior feels a bit
heavy, especially for a deliberately simplified example. If a larger, more
complicated data model were being used, then it would get tedious to continually
write in this long form. The readability would start to suffer as the number of
lines in the function grows, and a refactoring would be forced.

Another alternative is to rename some of the identifiers::

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            fs = site.first_seen
            ls = site.last_seen

            return jsonify(
                first_seen=fs.isoformat() if fs is not None else None,
                id=site.id,
                is_active=site.is_active,
                last_seen=ls.isoformat() if ls is not None else None,
                url=site.url.rstrip('/')
            )

This adds fewer lines of code than the previous example, but it comes at the
expense of introducing extraneous identifiers that amount to nothing more than
aliases. These new identifiers are short enough to fit a ternary expression onto
one line, but the identifiers are also less intuitive, e.g. ``fs`` versus
``first_seen``.

As a quick preview, consider an alternative rewrite using a new operator::

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            return jsonify(
                first_seen=site.first_seen?.isoformat(),
                id=site.id,
                is_active=site.is_active,
                last_seen=site.last_seen?.isoformat(),
                url=site.url.rstrip('/')
            )

The ``?.`` operator behaves as a "safe navigation" operator, allowing a more
concise syntax where the expression ``site.first_seen`` is not duplicated.

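To make the intended result concrete, the following sketch shows roughly what
``value?.isoformat()`` would evaluate to, written as an ordinary function in
today's Python. The helper ``isoformat_or_none`` is hypothetical; it is not
part of this proposal or of any library.

```python
from datetime import datetime


def isoformat_or_none(value):
    """Hypothetical helper: roughly what ``value?.isoformat()`` would yield."""
    # The argument expression is written (and evaluated) only once by the caller.
    return value.isoformat() if value is not None else None


print(isoformat_or_none(datetime(2015, 9, 18, 12, 30)))  # 2015-09-18T12:30:00
print(isoformat_or_none(None))                           # None
```

A helper like this removes the duplication, but one such function would be
needed for every attribute or method, which is the repetition the operator is
meant to avoid.
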
The next example is from a trending project on GitHub called `Grab
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/gr
ab/upload.py>`_, which is a Python scraping library::

    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            if ctype is None:
                return 'application/octet-stream'
            else:
                return ctype

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            if filename is None:
                self.filename = self.get_random_filename()
            else:
                self.filename = filename
            if content_type is None:
                self.content_type = self.find_content_type(self.filename)
            else:
                self.content_type = content_type

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            if filename is None:
                self.filename = os.path.split(path)[1]
            else:
                self.filename = filename
            if content_type is None:
                self.content_type = self.find_content_type(self.filename)
            else:
                self.content_type = content_type

.. note::

    I don't know the author of the Grab project. I used it as an example
    because it is a trending repo on GitHub and it has good examples of common
    ``None`` patterns.

This example contains several good examples of needing to provide default
values. It is a bit verbose as it is, and it is certainly not improved by the
ternary operator::

    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            return 'application/octet-stream' if ctype is None else ctype

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            self.filename = self.get_random_filename() if filename \
                is None else filename
            self.content_type = self.find_content_type(self.filename) \
                if content_type is None else content_type

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            self.filename = os.path.split(path)[1] if filename is \
                None else filename
            self.content_type = self.find_content_type(self.filename) \
                if content_type is None else content_type

The first ternary expression is tidy, but it reverses the intuitive order of
the operands: it should return ``ctype`` if it has a value and use the string
literal as fallback. The other ternary expressions are unintuitive and so
long that they must be wrapped. The overall readability is worsened, not
improved.

This code *might* be improved, though, if there were a syntactic shortcut for
this common need to supply a default value::

    class BaseUploadObject(object):
        def find_ctype(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            return ctype ?? 'application/octet-stream'

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            self.filename = filename ?? self.get_random_filename()
            self.content_type = content_type ?? self.find_ctype(self.filename)

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            self.filename = filename ?? os.path.split(path)[1]
            self.content_type = content_type ?? self.find_ctype(self.filename)

This syntax has an intuitive ordering of the operands, e.g. ``ctype`` -- the
preferred value -- comes before the fallback value. The terseness of the syntax
also makes for fewer lines of code and less code to visually parse.

.. note::

    I cheated on the last example: I renamed ``find_content_type`` to
    ``find_ctype`` in order to fit two of the lines under 80 characters. If you
    find this underhanded, you can go back and apply the same renaming to the
    previous 2 examples. You'll find that it doesn't change the
    conclusions.

Usage Of ``None`` In The Standard Library
-----------------------------------------

The previous sections show some code patterns that are claimed to be "common",
but how common are they? The attached script `find-pep505.py
<https://github.com/python/peps/blob/master/pep-0505/find-pep505.py>`_ is meant
to answer this question. It uses the ``ast`` module to search for variations of
the following patterns in any ``*.py`` file::

    >>> # None-coalescing if block
    ...
    >>> if a is None:
    ...     a = b

    >>> # [Possible] None-coalescing "or" operator
    ...
    >>> a or 'foo'
    >>> a or []
    >>> a or {}

    >>> # None-coalescing ternary
    ...
    >>> a if a is not None else b
    >>> b if a is None else a

    >>> # Safe navigation "and" operator
    ...
    >>> a and a.foo
    >>> a and a['foo']
    >>> a and a.foo()

    >>> # Safe navigation if block
    ...
    >>> if a is not None:
    ...     a.foo()

    >>> # Safe navigation ternary
    ...
    >>> a.foo if a is not None else b
    >>> b if a is None else a.foo

This script takes one or more names of Python source files to analyze::

    $ python3 find-pep505.py test.py
    $ find /usr/lib/python3.4 -name '*.py' | xargs python3 find-pep505.py

The script prints out any matches it finds. Sample::

    None-coalescing if block: /usr/lib/python3.4/inspect.py:594
        if _filename is None:
            _filename = getsourcefile(object) or getfile(object)

    [Possible] None-coalescing `or`: /usr/lib/python3.4/lib2to3/refactor.py:191
        self.explicit = explicit or []

    None-coalescing ternary: /usr/lib/python3.4/decimal.py:3909
        self.clamp = clamp if clamp is not None else dc.clamp

    Safe navigation `and`: /usr/lib/python3.4/weakref.py:512
        obj = info and info.weakref()

    Safe navigation `if` block: /usr/lib/python3.4/http/cookiejar.py:1895
        if k is not None:
            lc = k.lower()
        else:
            lc = None

    Safe navigation ternary: /usr/lib/python3.4/sre_parse.py:856
        literals = [None if s is None else s.encode('latin-1') for s in literals]

.. note::

    Coalescing with ``or`` is marked as a "possible" match, because it's not
    trivial to infer whether ``or`` is meant to coalesce False-y values
    (correct) or if it is meant to coalesce ``None`` (incorrect). On the other
    hand, we assume that ``and`` is always incorrect for safe navigation.

The script has been tested against `test.py
<https://github.com/python/peps/blob/master/pep-0505/test.py>`_ and the Python
3.4 standard library, but it should work on any arbitrary Python 3 source code.
The complete output from running it against the standard library is attached to
this proposal as `find-pep505.out
<https://github.com/python/peps/blob/master/pep-0505/find-pep505.out>`_.

The script counts how many matches it finds and prints the totals at the
end::

    Total None-coalescing `if` blocks: 426
    Total [possible] None-coalescing `or`: 119
    Total None-coalescing ternaries: 21
    Total Safe navigation `and`: 9
    Total Safe navigation `if` blocks: 55
    Total Safe navigation ternaries: 7

This is a total of 637 possible matches for these common code patterns in the
standard library. Allowing for some false positives and false negatives, it is
fair to say that these code patterns are definitely common in the standard
library.

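For readers curious how such a search can work, here is a minimal sketch that
detects only the "``None``-coalescing ternary" pattern. It is *not* the
attached script: the function names are hypothetical, it handles a single
pattern, and it assumes Python 3.8 or later, where ``None`` is parsed as an
``ast.Constant`` node.

```python
import ast


def is_none(node):
    """True if the node is the literal None (ast.Constant, Python 3.8+)."""
    return isinstance(node, ast.Constant) and node.value is None


def find_none_coalescing_ternaries(source):
    """Yield line numbers of ``a if a is not None else b`` style ternaries."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.IfExp):
            continue
        test = node.test
        if not (isinstance(test, ast.Compare) and len(test.ops) == 1
                and is_none(test.comparators[0])):
            continue
        # Compare sub-expressions structurally via their AST dumps.
        tested = ast.dump(test.left)
        op = test.ops[0]
        # "a if a is not None else b": the tested expression is the body.
        if isinstance(op, ast.IsNot) and ast.dump(node.body) == tested:
            yield node.lineno
        # "b if a is None else a": the tested expression is the else branch.
        elif isinstance(op, ast.Is) and ast.dump(node.orelse) == tested:
            yield node.lineno


sample = (
    "x = a if a is not None else b\n"
    "y = b if a is None else a\n"
    "z = a.foo if a is not None else b\n"
)
print(sorted(find_none_coalescing_ternaries(sample)))  # [1, 2]
```

The third line of the sample is a safe navigation ternary rather than a
coalescing one, so this sketch deliberately does not report it.
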
Rejected Ideas
--------------

Several related ideas were discussed on python-ideas, and some of these were
roundly rejected by the BDFL, the community, or both. For posterity's sake, some
of those ideas are recorded here.

``None``-aware Function Call
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``None``-aware syntax applies to attribute and index access, so it seems
natural to ask if it should also apply to function invocation syntax. It might
be written as ``foo?()``, where ``foo`` is only called if it is not ``None``.
This idea was quickly rejected, for several reasons.

First, no other mainstream language has such syntax. Second, Python evaluates
all of the arguments to a function before it attempts the call itself, so under
the ordinary call protocol ``foo?(bar())`` would still call ``bar()`` even if
``foo`` is ``None``. This behaviour is unexpected for a so-called
"short-circuiting" operator.

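The evaluation order in question can be observed in current Python. The
following sketch uses hypothetical names ``foo`` and ``bar``; it shows that the
argument expression runs before the attempt to call ``None`` fails.

```python
calls = []


def bar():
    """Record that the argument expression was evaluated."""
    calls.append("bar")
    return 42


foo = None

try:
    foo(bar())  # bar() runs first; only then does calling None fail
except TypeError:
    pass

print(calls)  # ['bar']
```
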
``?`` Unary Postfix Operator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To generalize the ``None``-aware behavior and limit the number of new operators
introduced, a unary, postfix operator spelled ``?`` was suggested. The idea is
that ``?`` might return a special object that would override dunder methods to
return ``self``. For example, ``foo?`` would evaluate to ``foo`` if it is not
``None``, otherwise it would evaluate to an instance of ``NoneQuestion``::

    class NoneQuestion():
        def __call__(self, *args, **kwargs):
            return self

        def __getattr__(self, name):
            return self

        def __getitem__(self, key):
            return self

With this new operator and new type, an expression like ``foo?.bar[baz]``
evaluates to ``NoneQuestion`` if ``foo`` is ``None``. This is a nifty
generalization, but it's difficult to use in practice since most existing code
won't know what ``NoneQuestion`` is.

Going back to one of the motivating examples above, consider the following::

    >>> import json
    >>> created = None
    >>> json.dumps({'created': created?.isoformat()})

The JSON serializer does not know how to serialize ``NoneQuestion``, nor will
any other API. This proposal actually requires *lots of specialized logic*
throughout the standard library and any third party library.

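Since ``created?`` is not valid syntax today, the following sketch builds the
``NoneQuestion`` instance by hand to stand in for the result of
``created?.isoformat()``. It illustrates that the standard ``json`` module
rejects the placeholder object.

```python
import json


class NoneQuestion:
    """The placeholder type sketched above."""

    def __call__(self, *args, **kwargs):
        return self

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self


# Stand-in for what ``created?.isoformat()`` would produce when created is None.
value = NoneQuestion().isoformat()

try:
    json.dumps({'created': value})
    serialized = True
except TypeError:
    # json has no encoding rule for NoneQuestion instances.
    serialized = False

print(serialized)  # False
```
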
At the same time, the ``?`` operator may also be **too general**, in the sense
that it can be combined with any other operator. What should the following
expressions mean?::

    >>> x? + 1
    >>> x? -= 1
    >>> x? == 1
    >>> ~x?

This degree of generalization is not useful. The operators actually proposed
herein are intentionally limited to a few operators that are expected to make it
easier to write common code patterns.

Haskell-style ``Maybe``
~~~~~~~~~~~~~~~~~~~~~~~

Haskell has a concept called `Maybe <https://wiki.haskell.org/Maybe>`_ that
encapsulates the idea of an optional value without relying on any special
keyword (e.g. ``null``) or any special instance (e.g. ``None``). In Haskell, the
purpose of ``Maybe`` is to avoid separate handling of "something" and "nothing".
The concept is so heavily intertwined with Haskell's lazy evaluation that it
doesn't translate cleanly into Python.

There is a Python package called `pymaybe
<https://pypi.python.org/pypi/pymaybe/0.1.1>`_ that provides a rough
approximation. The documentation shows the following example that appears
relevant to the discussion at hand::

    >>> maybe('VALUE').lower()
    'value'

    >>> maybe(None).invalid().method().or_else('unknown')
    'unknown'

The function ``maybe()`` returns either a ``Something`` instance or a
``Nothing`` instance. Similar to the unary postfix operator described in the
previous section, ``Nothing`` overrides dunder methods in order to allow
chaining on a missing value.

Note that ``or_else()`` is eventually required to retrieve the underlying value
from ``pymaybe``'s wrappers. Furthermore, ``pymaybe`` does not short circuit any
evaluation. Although ``pymaybe`` has some strengths and may be useful in its own
right, it also demonstrates why a pure Python implementation of coalescing is
not nearly as powerful as support built into the language.

Specification
=============

This PEP suggests 3 new operators be added to Python:

1. ``None``-coalescing operator
2. ``None``-aware attribute access
3. ``None``-aware index access/slicing

We will continue to assume the same spellings as in
the previous sections in order to focus on behavior before diving into the much
more contentious issue of how to spell these operators.

A generalization of these operators is also proposed below under the heading
"Generalized Coalescing".

Operator Spelling
-----------------

Despite significant support for the proposed operators, the majority of
discussion on python-ideas fixated on the spelling. Many alternative spellings
were proposed, both punctuation and keywords, but each alternative drew some
criticism. Spelling the operator as a keyword is problematic, because adding new
keywords to the language is not backwards compatible.

It is not impossible to add a new keyword, however, and we can look at several
other PEPs for inspiration. For example, `PEP-492
<https://www.python.org/dev/peps/pep-0492/>`_ introduced the new keywords
``async`` and ``await`` into Python 3.5. These new keywords are fully backwards
compatible, because that PEP also introduces a new lexical context such that
``async`` and ``await`` are only treated as keywords when used inside of an
``async def`` function. In other locations, ``async`` and ``await`` may be used
as identifiers.

It is also possible to craft a new operator out of existing keywords, as was
the case with `PEP-308 <https://www.python.org/dev/peps/pep-0308/>`_, which
created a ternary operator by cobbling together the ``if`` and ``else`` keywords
into a new operator.

In addition to the lexical acrobatics required to create a new keyword, keyword
operators are also undesirable for creating an assignment shortcut syntax. In
Dart, for example, ``x ??= y`` is an assignment shortcut that approximately
means ``x = x ?? y`` except that ``x`` is only evaluated once. If Python's
coalesce operator is a keyword, e.g. ``foo``, then the assignment shortcut would
be very ugly: ``x foo= y``.

Spelling new logical operators with punctuation is unlikely, for several
reasons. First, Python eschews punctuation for logical operators. For example,
it uses ``not`` instead of ``!``, ``or`` instead of ``||``, and ``… if … else …``
instead of ``… ? … : …``.

Second, nearly every single punctuation character on a standard keyboard already
has special meaning in Python. The only exceptions are ``$``, ``!``, ``?``, and
backtick (as of Python 3). This leaves few options for a new, single-character
operator.

Third, other projects in the Python universe assign special meaning to
punctuation. For example, `IPython
<https://ipython.org/ipython-doc/2/interactive/reference.html>`_ assigns
special meaning to ``%``, ``%%``, ``?``, ``??``, ``$``, and ``$$``, among
others. Out of deference to those projects and the large communities using them,
introducing conflicting syntax into Python is undesirable.

The spellings ``??`` and ``?.`` will be familiar to programmers who have seen
them in other popular programming languages. Any alternative punctuation will be
just as ugly but without the benefit of familiarity from other languages.
Therefore, this proposal spells the new operators using the same punctuation
that already exists in other languages.

``None``-Coalescing Operator
----------------------------

The ``None``-coalescing operator is a short-circuiting, binary operator that
behaves in the following way.

1. Evaluate the left operand first.
2. If the left operand is not ``None``, then return it immediately.
3. Else, evaluate the right operand and return the result.

Consider the following examples. We will continue to use the spelling ``??``
here, but keep in mind that alternative spellings will be discussed below::

    >>> 1 ?? 2
    1
    >>> None ?? 2
    2

Importantly, note that the right operand is not evaluated unless the left
operand is ``None``::

    >>> def err(): raise Exception('foo')
    >>> 1 ?? err()
    1
    >>> None ?? err()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in err
    Exception: foo

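The three evaluation steps and the short-circuiting shown above can be
approximated in current Python by passing the right operand as a zero-argument
callable. This is only a sketch: ``coalesce`` and its ``make_default``
parameter are hypothetical names used for illustration, not part of this
proposal.

```python
def coalesce(left, make_default):
    """Approximate ``left ?? make_default()`` in current Python."""
    # Steps 1 and 2: the caller has already evaluated the left operand;
    # return it immediately unless it is None.
    if left is not None:
        return left
    # Step 3: only now is the right operand evaluated.
    return make_default()


def err():
    raise Exception('foo')


print(coalesce(1, err))             # err is never called
print(coalesce(None, lambda: 2))    # the fallback is evaluated
```

Wrapping the right operand in a callable is what keeps it lazy. A plain
two-argument function would evaluate both operands before the call.
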
The operator is left associative. Combined with its short-circuiting behavior,
this makes the operator easy to chain::

    >>> timeout = None
    >>> local_timeout = 60
    >>> global_timeout = 300
    >>> timeout ?? local_timeout ?? global_timeout
    60

    >>> local_timeout = None
    >>> timeout ?? local_timeout ?? global_timeout
    300

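A chain of this kind can be modelled in current Python as "return the first
non-``None`` value, evaluating candidates from left to right". The helper
below is a sketch under that assumption; ``coalesce_chain`` is a hypothetical
name, and each operand is wrapped in a callable so that later operands stay
unevaluated.

```python
def coalesce_chain(*thunks):
    """Return the first non-None result, calling thunks left to right."""
    for thunk in thunks:
        value = thunk()
        if value is not None:
            return value  # later thunks are never called
    # Every operand was None, so the chain itself evaluates to None.
    return None


timeout = None
local_timeout = 60
global_timeout = 300

result = coalesce_chain(
    lambda: timeout,
    lambda: local_timeout,
    lambda: global_timeout,
)
print(result)
```
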
The operator has higher precedence than the comparison operators ``==``, ``>``,
``is``, etc., but lower precedence than any bitwise or arithmetic operators.
This precedence is chosen for making "default value" expressions intuitive to
read and write::

    >>> not None ?? True
    >>> not (None ?? True) # Same precedence

    >>> 1 == None ?? 1
    >>> 1 == (None ?? 1) # Same precedence

    >>> 'foo' in None ?? ['foo', 'bar']
    >>> 'foo' in (None ?? ['foo', 'bar']) # Same precedence

    >>> None ?? 1 + 2
    >>> None ?? (1 + 2) # Same precedence

Recall the example above of calculating the cost of items in a shopping cart,
and the easy-to-miss bug. This type of bug is not possible with the
``None``-coalescing operator, because there is no implicit type coercion to
``bool``::

    >>> requested_quantity = 0
    >>> default_quantity = 1
    >>> price = 100
    >>> requested_quantity ?? default_quantity * price
    0

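The difference between a truthiness test and a ``None`` test can be checked in
current Python. The two helper functions below are hypothetical and exist only
to contrast the behaviors: the first falls back on any falsy value, while the
second falls back only on ``None``, as the proposed operator would.

```python
def quantity_with_or(requested, default):
    # Falls back whenever `requested` is falsy, so 0 is silently replaced.
    return requested or default


def quantity_with_none_check(requested, default):
    # Falls back only when `requested` is None.
    return default if requested is None else requested


price = 100
print(quantity_with_or(0, 1) * price)             # 100: the bug
print(quantity_with_none_check(0, 1) * price)     # 0: zero is respected
print(quantity_with_none_check(None, 1) * price)  # 100: default applied
```
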
The ``None``-coalescing operator also has a corresponding assignment shortcut.
The following assignments are semantically similar, except that ``foo`` is only
looked up once when using the assignment shortcut::

    >>> foo ??= []
    >>> foo = foo ?? []

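For comparison, this is roughly how the same intent is written in current
Python. The function ``append_item`` is a hypothetical example; the ``if``
statement inside it plays the role of ``items ??= []``.

```python
def append_item(item, items=None):
    # Current-Python spelling of the proposed ``items ??= []``.
    if items is None:
        items = []
    items.append(item)
    return items


existing = []
print(append_item(1))                        # a fresh list is created
print(append_item(2, existing) is existing)  # an empty list is kept, not replaced
```

Note that the common shortcut ``items = items or []`` would replace the
caller's empty list with a new one, which is the kind of truthiness bug the
proposed operator avoids.
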
The ``None``-coalescing operator improves readability, especially when handling
default function arguments. Consider again the example from the Requests
library, rewritten to use ``None``-coalescing::

    def __init__(self, data=None, files=None, headers=None, params=None, hooks=None):
        self.data = data ?? []
        self.files = files ?? []
        self.headers = headers ?? {}
        self.params = params ?? {}
        self.hooks = hooks ?? {}

The operator makes the intent easier to follow (by putting operands in an
intuitive order) and is more concise than the ternary operator, while still
preserving the short-circuit semantics of the code that it replaces.


``None``-Aware Attribute Access Operator
----------------------------------------

The ``None``-aware attribute access operator (also called "safe navigation")
checks its left operand. If the left operand is ``None``, then the operator
evaluates to ``None``. If the left operand is not ``None``, then the
operator accesses the attribute named by the right operand::

    >>> from datetime import date
    >>> d = date.today()
    >>> d.year
    2015

    >>> d = None
    >>> d.year
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'year'

    >>> d?.year
    None

The operator has the same precedence and associativity as the plain attribute
access operator ``.``, but this operator is also short-circuiting in a unique
way: if the left operand is ``None``, then any series of attribute access, index
access, slicing, or function call operators immediately to the right of it *are
not evaluated*::

    >>> name = ' The Black Knight '
    >>> name.strip()[4:].upper()
    'BLACK KNIGHT'

    >>> name = None
    >>> name?.strip()[4:].upper()
    None

If this operator did not short circuit in this way, then the second example
would partially evaluate ``name?.strip()`` to ``None()`` and then fail with
``TypeError: 'NoneType' object is not callable``.

To put it another way, the following expressions are semantically similar,
except that ``name`` is only looked up once on the first line::

    >>> name?.strip()[4:].upper()
    >>> name.strip()[4:].upper() if name is not None else None

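The "looked up once" behavior can be approximated in current Python by
evaluating the left operand a single time and handing the rest of the chain to
a helper as a callable. This is a sketch only; ``none_aware`` and ``get_name``
are hypothetical names introduced for illustration.

```python
def none_aware(value, rest):
    """Apply `rest` to `value` unless `value` is None."""
    # The entire trailing chain is skipped when the left operand is None.
    return None if value is None else rest(value)


lookups = []


def get_name():
    # Record each evaluation of the left operand.
    lookups.append('called')
    return ' The Black Knight '


result = none_aware(get_name(), lambda n: n.strip()[4:].upper())
print(result)         # BLACK KNIGHT
print(len(lookups))   # 1: the left operand was evaluated once
print(none_aware(None, lambda n: n.strip()[4:].upper()))  # None
```
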
.. note::

   C# implements its safe navigation operators with the same short-circuiting
   semantics, but Dart does not. In Dart, the second example (suitably
   translated) would fail. The C# semantics are obviously superior, given the
   original goal of writing common cases more concisely. The Dart semantics are
   nearly useless.

This operator short circuits one or more attribute access, index access,
slicing, or function call operators that are adjacent to its right, but it
does not short circuit any other operators (logical, bitwise, arithmetic, etc.),
nor does it escape parentheses::

    >>> d = date.today()
    >>> d?.year.numerator + 1
    2016

    >>> d = None
    >>> d?.year.numerator + 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

    >>> (d?.year).numerator + 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'numerator'

Note that the error in the second example is not on the attribute access
``numerator``. In fact, that attribute access is never performed. The error
occurs when adding ``None + 1``, because the ``None``-aware attribute access
does not short circuit ``+``.

The third example fails because the operator does not escape parentheses. In
that example, the attribute access ``numerator`` is evaluated and fails because
``None`` does not have that attribute.

Finally, observe that short circuiting adjacent operators is not at all the same
thing as propagating ``None`` throughout an expression::

    >>> user?.first_name.upper()

If ``user`` is not ``None``, then ``user.first_name`` is evaluated. If
``user.first_name`` evaluates to ``None``, then ``user.first_name.upper()`` is
an error! In English, this expression says, "``user`` is optional but if it has
a value, then it must have a ``first_name``, too."

If ``first_name`` is supposed to be an optional attribute, then the expression
must make that explicit::

    >>> user?.first_name?.upper()

The operator is not intended as an error silencing mechanism, and it would be
undesirable if its presence infected nearby operators.


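The distinction between the two spellings above can be reproduced in current
Python. The functions below are hypothetical stand-ins: the first models
``user?.first_name.upper()`` and the second models
``user?.first_name?.upper()``. ``SimpleNamespace`` is used only as a
convenient placeholder for a user object.

```python
from types import SimpleNamespace


def first_name_upper_required(user):
    # Models ``user?.first_name.upper()``: only ``user`` may be None.
    if user is None:
        return None
    return user.first_name.upper()


def first_name_upper_optional(user):
    # Models ``user?.first_name?.upper()``: ``first_name`` may be None too.
    if user is None:
        return None
    first_name = user.first_name
    if first_name is None:
        return None
    return first_name.upper()


anonymous = SimpleNamespace(first_name=None)
print(first_name_upper_optional(anonymous))  # None
try:
    first_name_upper_required(anonymous)
except AttributeError as exc:
    # None is not propagated past the first attribute access.
    print('error:', exc)
```
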
``None``-Aware Index Access/Slicing Operator
--------------------------------------------

The ``None``-aware index access/slicing operator (also called "safe navigation")
is nearly identical to the ``None``-aware attribute access operator. It combines
the familiar square bracket syntax ``[]`` with new punctuation or a new keyword,
the spelling of which is discussed later::

    >>> person = {'name': 'Mark', 'age': 32}
    >>> person['name']
    'Mark'

    >>> person = None
    >>> person['name']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'NoneType' object is not subscriptable

    >>> person?.['name']
    None

The ``None``-aware slicing operator behaves similarly::

    >>> name = 'The Black Knight'
    >>> name[4:]
    'Black Knight'

    >>> name = None
    >>> name[4:]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'NoneType' object is not subscriptable

    >>> name?.[4:]
    None

These operators have the same precedence as the plain index access and slicing
operators. They also have the same short-circuiting behavior as the
``None``-aware attribute access.


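Both forms can be approximated in current Python with one helper, because
indexing and slicing go through the same subscription mechanism. This is a
sketch; ``none_aware_getitem`` is a hypothetical name, and a slice such as
``[4:]`` is passed explicitly as ``slice(4, None)``.

```python
def none_aware_getitem(container, key):
    """Model ``container?.[key]``; `key` may be an index, a key, or a slice."""
    if container is None:
        return None
    # Any other failure (missing key, bad index) still raises as usual.
    return container[key]


print(none_aware_getitem({'name': 'Mark', 'age': 32}, 'name'))  # Mark
print(none_aware_getitem('The Black Knight', slice(4, None)))   # Black Knight
print(none_aware_getitem(None, 'name'))                         # None
```

As with attribute access, only a ``None`` container is handled. A missing key
in a real dictionary still raises ``KeyError``.
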
Generalized Coalescing
----------------------

Making ``None`` a special case is too specialized and magical. The behavior can
be generalized by making the ``None``-aware operators invoke a dunder method,
e.g. ``__coalesce__(self)`` that returns ``True`` if an object should be
coalesced and ``False`` otherwise.

With this generalization, ``object`` would implement a dunder method equivalent
to this::

    def __coalesce__(self):
        return False

``NoneType`` would implement a dunder method equivalent to this::

    def __coalesce__(self):
        return True

If this generalization is accepted, then the operators will need to be renamed
such that the term ``None`` is not used, e.g. "Coalescing Operator", "Coalesced
Member Access Operator", etc.

The coalesce operator would invoke this dunder method. The following two
expressions are semantically similar, except ``foo`` is only looked up once when
using the coalesce operator::

    >>> foo ?? bar
    >>> bar if foo.__coalesce__() else foo

The coalesced attribute and index access operators would invoke the same dunder
method::

    >>> user?.first_name.upper()
    >>> None if user.__coalesce__() else user.first_name.upper()

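The dispatch described above can be prototyped in current Python. Because
``object`` and ``NoneType`` do not actually define ``__coalesce__`` today, the
sketch below falls back to the defaults the text assigns to them. The names
``should_coalesce``, ``coalesce``, and ``Missing`` are hypothetical.

```python
def should_coalesce(obj):
    # Look the hook up on the type, as Python does for other special methods.
    hook = getattr(type(obj), '__coalesce__', None)
    if hook is None:
        # Assumed defaults from the text: None coalesces, other objects do not.
        return obj is None
    return hook(obj)


def coalesce(left, make_default):
    """Approximate ``left ?? make_default()`` with the generalized hook."""
    return make_default() if should_coalesce(left) else left


class Missing:
    """A hypothetical domain-specific null object."""

    def __coalesce__(self):
        return True


print(coalesce(Missing(), lambda: 'fallback'))  # fallback
print(coalesce(0, lambda: 'fallback'))          # 0
print(coalesce(None, lambda: 'fallback'))       # fallback
```
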
This generalization allows for domain-specific ``null`` objects to be coalesced
just like ``None``. For example, the ``pyasn1`` package has a type called
``Null`` that represents an ASN.1 ``null``::

    >>> from pyasn1.type import univ
    >>> univ.Null() ?? univ.Integer(123)
    Integer(123)

In addition to making the proposed operators less specialized, this
generalization also makes it easier to work with the Null Object Pattern, [3]_
for those developers who prefer to avoid using ``None``.


Implementation
--------------

The author of this PEP is not competent with grammars or lexers, and given the
contentiousness of this proposal, the implementation details for CPython will be
deferred until we have a clearer idea that one or more of the proposed
enhancements will be approved.

...TBD...


References
==========

.. [1] C# Reference: Operators
   (https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)

.. [2] A Tour of the Dart Language: Operators
   (https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)

.. [3] Wikipedia: Null Object Pattern
   (https://en.wikipedia.org/wiki/Null_Object_pattern)

.. [4] PEP-249
   (https://www.python.org/dev/peps/pep-0249/)

.. [5] PEP-308
   (https://www.python.org/dev/peps/pep-0308/)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: