Pep 572 update (#654)

- Remove changes to comprehension scope
- Make := in comprehensions assign to containing non-comprehension scope
- Clarify binding precedence (tighter than comma, not at same level as =)
- Remove mention of more complex targets in the future
- Explicitly disallow toplevel :=
- Rewrite section on differences with =, enumerating all of them
- Remove "here's how this could be written without :=" from examples

- Tweak first paragraph of "Syntax and semantics"
- Add "Exception cases" (various explicit prohibitions)
- Clarify that lambda is a containing scope
- Clarify that := and = just don't mix
- Added "Open questions" section
- Added two new rejected alternatives: "Allowing commas to the right"
  and "Always requiring parentheses"
- Minor edits

* Start a section on real code

* Correct/clarify "commas to the right" section

* Add Tim and Guido as authors

* Update abstract to mention :=

* Rule out targets conflicting with comprehension loop control

* Add timcode.txt as Appendix A

* Add os.fork() example

* Add TODOs about evaluation order
Guido van Rossum 2018-05-22 12:49:00 -07:00 committed by GitHub
parent 0e2f1f04f7
commit 6d81538ecb
1 changed files with 548 additions and 252 deletions


@ -1,6 +1,7 @@
PEP: 572
Title: Assignment Expressions
Author: Chris Angelico <rosuav@gmail.com>
Author: Chris Angelico <rosuav@gmail.com>, Tim Peters <tim.peters@gmail.com>,
Guido van Rossum <guido@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
@ -14,8 +15,7 @@ Abstract
========
This is a proposal for creating a way to assign to variables within an
expression. Additionally, the precise scope of comprehensions is adjusted, to
maintain consistency and follow expectations.
expression using the notation ``NAME := expr``.
Rationale
@ -25,11 +25,7 @@ Naming the result of an expression is an important part of programming,
allowing a descriptive name to be used in place of a longer expression,
and permitting reuse. Currently, this feature is available only in
statement form, making it unavailable in list comprehensions and other
expression contexts. Merely introducing a way to assign as an expression
would create bizarre edge cases around comprehensions, though, and to avoid
the worst of the confusions, we change the definition of comprehensions,
causing some edge cases to be interpreted differently, but maintaining the
existing behaviour in the majority of situations.
expression contexts.
Additionally, naming sub-parts of a large expression can assist an interactive
debugger, providing useful display hooks and partial results. Without a way to
@ -39,13 +35,83 @@ code; with assignment expressions, this merely requires the insertion of a few
the code be inadvertently changed as part of debugging (a common cause of
Heisenbugs), and is easier to dictate to another programmer.
The importance of real code
---------------------------
During the development of this PEP many people (supporters and critics
both) have had a tendency to focus on toy examples on the one hand,
and on overly complex examples on the other.
The danger of toy examples is twofold: they are often too abstract to
make anyone go "ooh, that's compelling", and they are easily refuted
with "I would never write it that way anyway".
The danger of overly complex examples is that they provide a
convenient strawman for critics of the proposal to shoot down ("that's
obfuscated").
Yet there is some use for both extremely simple and extremely complex
examples: they are helpful to clarify the intended semantics.
Therefore there will be some of each below.
However, in order to be *compelling*, examples should be rooted in
real code, i.e. code that was written without any thought of this PEP,
as part of a useful application, however large or small. Tim Peters
has been extremely helpful by going over his own personal code
repository and picking examples of code he had written that (in his
view) would have been *clearer* if rewritten with (sparing) use of
assignment expressions. His conclusion: the current proposal would
have allowed a modest but clear improvement in quite a few bits of
code.
Another use of real code is to observe indirectly how much value
programmers place on compactness. Guido van Rossum searched through a
Dropbox code base and discovered some evidence that programmers value
writing fewer lines over shorter lines.
Case in point: Guido found several examples where a programmer
repeated a subexpression, slowing down the program, in order to save
one line of code, e.g. instead of writing::

    match = re.match(pattern, data)
    group = match.group(1) if match else None

they would write::

    group = re.match(pattern, data).group(1) if re.match(pattern, data) else None

Another example illustrates that programmers sometimes do more work to
save an extra level of indentation::

    match1 = pattern1.match(data)
    match2 = pattern2.match(data)
    if match1:
        return match1.group(1)
    elif match2:
        return match2.group(2)

This code tries to match ``pattern2`` even if ``pattern1`` has a match
(in which case the match on ``pattern2`` is never used). The more
efficient rewrite would have been::

    match1 = pattern1.match(data)
    if match1:
        return match1.group(1)
    else:
        match2 = pattern2.match(data)
        if match2:
            return match2.group(2)

(TODO: Include Guido's evidence, and do a more systematic search.)
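As a minimal sketch of how both patterns above could be written with the
proposed syntax, assuming an interpreter that implements ``:=`` (the helper
names and the regular expressions are illustrative assumptions, not taken
from the Dropbox code base):

```python
import re

# Hypothetical patterns, chosen only to make the sketch runnable.
pattern1 = re.compile(r"id=(\d+)")
pattern2 = re.compile(r"(\w+)@(\w+)")


def first_group(pattern, data):
    # The match is computed once and named inside the condition.
    return match.group(1) if (match := pattern.match(data)) else None


def dispatch(data):
    # pattern2 is only tried when pattern1 fails, with no extra indentation.
    if match1 := pattern1.match(data):
        return match1.group(1)
    elif match2 := pattern2.match(data):
        return match2.group(2)
    return None
```

Neither function repeats a subexpression, and neither needs an extra line or
indentation level to avoid doing so.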
Syntax and semantics
====================
In any context where arbitrary Python expressions can be used, a **named
expression** can appear. This is of the form ``name := expr`` where
``expr`` is any valid Python expression, and ``name`` is an identifier.
In most contexts where arbitrary Python expressions can be used, a
**named expression** can appear. This is of the form ``NAME := expr``
where ``expr`` is any valid Python expression other than an
unparenthesized tuple, and ``NAME`` is an identifier.
The value of such a named expression is the same as the incorporated
expression, with the additional side-effect that the target is assigned
@ -62,140 +128,221 @@ that value::
# Share a subexpression between a comprehension filter clause and its output
filtered_data = [y for x in data if (y := f(x)) is not None]
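A small runnable sketch of the comprehension case, assuming an interpreter
that implements ``:=``; the function ``f`` and the input list are
hypothetical stand-ins:

```python
def f(x):
    # Hypothetical helper: returns None for odd inputs.
    return x * 10 if x % 2 == 0 else None


data = [1, 2, 3, 4]

# f(x) is called once per element; its result is both tested and emitted.
filtered_data = [y for x in data if (y := f(x)) is not None]

# The named expression evaluates to the value it binds.
value = (n := len(data)) + 1
```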
Exceptional cases
-----------------
Differences from regular assignment statements
----------------------------------------------
There are a few places where assignment expressions are not allowed,
in order to avoid ambiguities or user confusion:
- Unparenthesized assignment expressions are prohibited at the top
level of an expression statement; for example, this is not allowed::
y := f(x) # INVALID
This rule is included to simplify the choice for the user between an
assignment statement and an assignment expression -- there is no
syntactic position where both are valid.
- Unparenthesized assignment expressions are prohibited at the top
level in the right hand side of an assignment statement; for
example, the following is not allowed::
y0 = y1 := f(x) # INVALID
Again, this rule is included to avoid two visually similar ways of
saying the same thing.
- Unparenthesized assignment expressions are prohibited for the value
of a keyword argument in a call; for example, this is disallowed::
foo(x = y := f(x)) # INVALID
This rule is included to disallow excessively confusing code.
- TODO: Should we disallow using keyword arguments and top level
assignment expressions in the same call? E.g.::
# Should these be invalid?
foo(x=0, y := f(0))
bar(x := 0, y = f(x))
Regardless, ``foo(x := 0)`` should probably be valid (see below).
- Assignment expressions (even parenthesized or occurring inside other
constructs) are prohibited in function default values. For example,
the following examples are all invalid, even though the expressions
for the default values are valid in other contexts::
def foo(answer = p := 42): # INVALID
...
def bar(answer = (p := 42)): # INVALID
...
def baz(callback = (lambda arg: p := arg)): # INVALID
...
This rule is included to avoid side effects in a position whose
exact semantics are already confusing to many users (cf. the common
style recommendation against mutable default values). (TODO: Maybe
this should just be a style recommendation except for the
prohibition at the top level?)
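The first three prohibitions can be checked mechanically by compiling source
strings. This is only a sketch: it assumes an interpreter that implements
this proposal (CPython 3.8 or later is assumed here), and ``is_valid`` is a
hypothetical helper. The rule about function default values is left out,
since it is still marked as a TODO above.

```python
def is_valid(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Each prohibited form is paired with its parenthesized alternative.
results = {
    "y := f(x)": is_valid("y := f(x)"),
    "(y := f(x))": is_valid("(y := f(x))"),
    "y0 = y1 := f(x)": is_valid("y0 = y1 := f(x)"),
    "y0 = (y1 := f(x))": is_valid("y0 = (y1 := f(x))"),
    "foo(x = y := f(x))": is_valid("foo(x = y := f(x))"),
    "foo(x = (y := f(x)))": is_valid("foo(x = (y := f(x)))"),
}
```

Only compilation is tested, so the undefined names ``f`` and ``foo`` in the
source strings do not matter.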
Scope of the target
-------------------
An assignment expression does not introduce a new scope. In most
cases the scope in which the target will be bound is self-explanatory:
it is the current scope. If this scope contains a ``nonlocal`` or
``global`` declaration for the target, the assignment expression
honors that.
There is one special case: an assignment expression occurring in a
list, set or dict comprehension or in a generator expression (below
collectively referred to as "comprehensions") binds the target in the
containing scope, honoring a ``nonlocal`` or ``global`` declaration
for the target in that scope, if one exists. For the purpose of this
rule the containing scope of a nested comprehension is the scope that
contains the outermost comprehension. A lambda counts as a containing
scope.
The motivation for this special case is twofold. First, it allows us
to conveniently capture a "witness" for an ``any()`` expression, or a
counterexample for ``all()``, for example::

    if any((comment := line).startswith('#') for line in lines):
        print("First comment:", comment)
    else:
        print("There are no comments")

    if all((nonblank := line).strip() == '' for line in lines):
        print("All lines are blank")
    else:
        print("First non-blank line:", nonblank)

Second, it allows a compact way of updating mutable state from a
comprehension, for example::
# Compute partial sums in a list comprehension
total = 0
partial_sums = [total := total + v for v in values]
print("Total:", total)
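Both motivations can be exercised inside ordinary functions, where the
containing scope is the function's local scope. This is a sketch assuming an
interpreter that implements the scope rule described above; the function
names are hypothetical:

```python
def first_comment(lines):
    # `comment` is bound in first_comment's scope, not in the generator's.
    if any((comment := line).startswith("#") for line in lines):
        return comment
    return None


def partial_sums(values):
    total = 0
    # Each iteration rebinds `total` in partial_sums' scope.
    sums = [total := total + v for v in values]
    return sums, total
```

Because ``any()`` stops at the first true element, ``comment`` holds the
witness when the function returns it.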
An exception to this special case applies when the target name is the
same as a loop control variable for a comprehension containing it.
This is invalid. (This exception exists to rule out edge cases of the
above scope rules as illustrated by ``[i := i+1 for i in range(5)]``
or ``[[(j := j) for i in range(5)] for j in range(5)]``. Note that
this exception also applies to ``[i := 0 for i, j in stuff]``.)
A further exception applies when an assignment expression occurs in a
comprehension whose containing scope is a class scope. If the rules
above were to result in the target being assigned in that class's
scope, the assignment expression is expressly invalid.
(The reason for the latter exception is the implicit function created
for comprehensions -- there is currently no runtime mechanism for a
function to refer to a variable in the containing class scope, and we
do not want to add such a mechanism. If this issue ever gets resolved
this special case may be removed from the specification of assignment
expressions. Note that the problem already exists for *using* a
variable defined in the class scope from a comprehension.)
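The two exceptions can be observed at compile time. The sketch below assumes
an interpreter that implements them as described (CPython 3.8 or later is
assumed here); ``compiles`` is a hypothetical helper:

```python
def compiles(source):
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Target equals a loop control variable of a containing comprehension.
rebinds_own_variable = compiles("[i := i + 1 for i in range(5)]")
rebinds_outer_variable = compiles(
    "[[(j := j) for i in range(5)] for j in range(5)]"
)
rebinds_unpacked_variable = compiles("[i := 0 for i, j in stuff]")

# Target would be bound in a class scope.
in_class_body = compiles("class C:\n    xs = [(y := v) for v in range(3)]\n")

# The same comprehension is fine when the containing scope is a function.
in_function = compiles("def f():\n    return [(y := v) for v in range(3)]\n")
```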
Relative precedence of ``:=``
-----------------------------
The ``:=`` operator groups more tightly than a comma in all syntactic
positions where it is legal, but less tightly than all operators,
including ``or``, ``and`` and ``not``. As follows from section
"Exceptional cases" above, it is never allowed at the same level as
``=``. In case a different grouping is desired, parentheses should be
used.
The ``:=`` operator may be used directly in a positional function call
argument; however it is invalid directly in a keyword argument.
Some examples to clarify what's technically valid or invalid::
# INVALID
x := 0
# Valid alternative
(x := 0)
# INVALID
x = y := 0
# Valid alternative
x = (y := 0)
# Valid
len(lines := f.readlines())
# Valid (TODO: Should this be disallowed?)
foo(x := 3, cat='vector')
# INVALID
foo(cat=category := 'vector')
# Valid alternative
foo(cat=(category := 'vector'))
Most of the "valid" examples above are not recommended, since human
readers of Python source code who are quickly glancing at some code
may miss the distinction. But simple cases are not objectionable::
# Valid
if any(len(longline := line) >= 100 for line in lines):
print("Extremely long line:", longline)
This PEP recommends always putting spaces around ``:=``, similar to
PEP 8's recommendation for ``=`` when used for assignment, whereas the
latter disallows spaces around ``=`` used for keyword arguments.
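The grouping rules can be checked with a few one-liners. This is a sketch
assuming an interpreter that implements the precedence described in this
section; all variable names are illustrative:

```python
# := binds more tightly than a comma: this is ((x := 1), 2).
pair = (x := 1, 2)

# := binds less tightly than every other operator, so the whole
# right-hand expression is evaluated before the name is bound.
flag = (b := not 0)
arith = (s := 1 + 2 * 3)
choice = (c := 0 or "fallback")

# Mixing with = requires parentheses around the assignment expression.
outer = (inner := 0)
```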
Differences between assignment expressions and assignment statements
---------------------------------------------------------------------
Most importantly, since ``:=`` is an expression, it can be used in contexts
where statements are illegal, including lambda functions and comprehensions.
An assignment statement can assign to multiple targets, left-to-right::
Conversely, assignment expressions don't support the advanced features
found in assignment statements:
x = y = z = 0
- Multiple targets are not directly supported::
The equivalent assignment expression is parsed as separate binary operators,
and is therefore processed right-to-left, as if it were spelled thus::
x = y = z = 0 # Equivalent: (x := (y := (z := 0)))
assert 0 == (x := (y := (z := 0)))
- Single assignment targets more complex than a single ``NAME`` are
not supported::
Statement assignment can include annotations. This would be syntactically
noisy in expressions, and is of minor importance. An annotation can be
given separately from the assignment if needed::
# No equivalent
a[i] = x
self.rest = []
x:str = "" # works
(x:str := "") # SyntaxError
x:str # possibly before a loop
(x := "") # fine
- Iterable packing and unpacking (both regular or extended forms) are
not supported::
Augmented assignment is not supported in expression form::
# Equivalent needs extra parentheses
loc = x, y # Use (loc := (x, y))
info = name, phone, *rest # Use (info := (name, phone, *rest))
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
# No equivalent
px, py, pz = position
name, phone, email, *other_info = contact
Statement assignment is able to set attributes and subscripts, but
expression assignment is restricted to names. (This restriction may be
relaxed in a future version of Python.)
- Type annotations are not supported::
Otherwise, the semantics of assignment are identical in statement and
expression forms.
# No equivalent
p: Optional[int] = None
- Augmented assignment is not supported::
total += tax # Equivalent: (total := total + tax)
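The equivalents listed above can be run side by side. This is a sketch
assuming an interpreter that implements ``:=`` with the restrictions listed
in this section; the values and the ``compiles`` helper are hypothetical:

```python
def compiles(source):
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Multiple targets: nest the assignment expressions.
(x := (y := (z := 0)))

# Tuple packing: the tuple needs its own parentheses.
px, py = 3, 4
(loc := (px, py))

# Augmented assignment: spell out the operation.
total, tax = 100, 8
(total := total + tax)

# Targets other than a plain name have no expression form.
subscript_target = compiles("(a[i] := x)")
attribute_target = compiles("(self.rest := [])")
```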
Alterations to comprehensions
-----------------------------
The current behaviour of list/set/dict comprehensions and generator
expressions has some edge cases that would behave strangely if an assignment
expression were to be used. Therefore the proposed semantics are changed,
removing the current edge cases, and instead altering their behaviour *only*
in a class scope.
As of Python 3.7, the outermost iterable of any comprehension is evaluated
in the surrounding context, and then passed as an argument to the implicit
function that evaluates the comprehension.
Under this proposal, the entire body of the comprehension is evaluated in
its implicit function. Names not assigned to within the comprehension are
located in the surrounding scopes, as with normal lookups. As one special
case, a comprehension at class scope will **eagerly bind** any name which
is already defined in the class scope.
A list comprehension can be unrolled into an equivalent function. With
Python 3.7 semantics::
numbers = [x + y for x in range(3) for y in range(4)]
# Is approximately equivalent to
def <listcomp>(iterator):
result = []
for x in iterator:
for y in range(4):
result.append(x + y)
return result
numbers = <listcomp>(iter(range(3)))
Under the new semantics, this would instead be equivalent to::
def <listcomp>():
result = []
for x in range(3):
for y in range(4):
result.append(x + y)
return result
numbers = <listcomp>()
When a class scope is involved, a naive transformation into a function would
prevent name lookups (as the function would behave like a method)::
class X:
names = ["Fred", "Barney", "Joe"]
prefix = "> "
prefixed_names = [prefix + name for name in names]
With Python 3.7 semantics, this will evaluate the outermost iterable at class
scope, which will succeed; but it will evaluate everything else in a function::
class X:
names = ["Fred", "Barney", "Joe"]
prefix = "> "
def <listcomp>(iterator):
result = []
for name in iterator:
result.append(prefix + name)
return result
prefixed_names = <listcomp>(iter(names))
The name ``prefix`` is thus searched for at global scope, ignoring the class
name. Under the proposed semantics, this name will be eagerly bound; and the
same early binding then handles the outermost iterable as well. The list
comprehension is thus approximately equivalent to::
class X:
names = ["Fred", "Barney", "Joe"]
prefix = "> "
def <listcomp>(names=names, prefix=prefix):
result = []
for name in names:
result.append(prefix + name)
return result
prefixed_names = <listcomp>()
With list comprehensions, this is unlikely to cause any confusion. With
generator expressions, this has the potential to affect behaviour, as the
eager binding means that the name could be rebound between the creation of
the genexp and the first call to ``next()``. It is, however, more closely
aligned to normal expectations. The effect is ONLY seen with names that
are looked up from class scope; global names (e.g. ``range()``) will still
be late-bound as usual.
One consequence of this change is that certain bugs in genexps will not
be detected until the first call to ``next()``, where today they would be
caught upon creation of the generator.
Recommended use-cases
=====================
Examples
========
Simplifying list comprehensions
-------------------------------
@ -210,21 +357,8 @@ giving it a name on first use::
stuff = [[y := f(x), x/y] for x in range(5)]
# There are a number of less obvious ways to spell this in current
# versions of Python, such as:
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
# Extra 'for' loop - potentially could be optimized internally
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Using a mutable cache object (various forms possible)
c = {}
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
In all cases, the name is local to the comprehension; like iteration variables,
it cannot leak out into the surrounding context.
Note that in both cases the variable ``y`` is bound in the containing
scope (i.e. at the same level as ``results`` or ``stuff``).
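A runnable version of the second form, assuming an interpreter that
implements ``:=``; the function ``f`` is a hypothetical stand-in chosen so
that the division never fails:

```python
def f(x):
    # Hypothetical helper; never returns 0 for the inputs used below.
    return x + 1


# List display elements are evaluated left to right, so y is bound
# before x / y is computed.
stuff = [[y := f(x), x / y] for x in range(5)]

# y was bound in the containing scope and keeps its last value.
last_y = y
```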
Capturing condition values
@ -233,7 +367,7 @@ Capturing condition values
Assignment expressions can be used to good effect in the header of
an ``if`` or ``while`` statement::
# Proposed syntax
# Loop-and-a-half
while (command := input("> ")) != "quit":
print("You entered:", command)
@ -250,26 +384,64 @@ an ``if`` or ``while`` statement::
print("Fallback found:", match.group(0))
# Reading socket data until an empty string is returned
while data := sock.read():
while data := sock.recv():
print("Received data:", data)
# Equivalent in current Python, not caring about function return value
while input("> ") != "quit":
print("You entered a command.")
# To capture the return value in current Python demands a four-line
# loop header.
    while True:
        command = input("> ")
        if command == "quit":
            break
        print("You entered:", command)
Particularly with the ``while`` loop, this can remove the need to have an
infinite loop, an assignment, and a condition. It also creates a smooth
parallel between a loop which simply uses a function call as its condition,
and one which uses that as its condition but also uses the actual value.
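The same loop-and-a-half pattern can be tried without user input by reading
from an in-memory stream. This is a sketch assuming an interpreter that
implements ``:=``; the function names and chunk size are illustrative:

```python
import io


def read_chunks(stream, size):
    chunks = []
    # The read result is named and tested in the loop header.
    while chunk := stream.read(size):
        chunks.append(chunk)
    return chunks


def read_chunks_longhand(stream, size):
    chunks = []
    # Current-Python spelling: infinite loop, assignment, and condition.
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        chunks.append(chunk)
    return chunks
```

Both functions return the same list for the same input; only the shape of
the loop header differs.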
Fork
----
An example from the low-level UNIX world::
    if pid := os.fork():
        # Parent code
        ...
    else:
        # Child code
        ...
Open questions and TODOs
========================
- For precise semantics, the proposal requires evaluation order to be
well-defined. We're mostly good due to the rule that things
generally are evaluated from left to right, but there are some
corner cases:
1. In a dict comprehension ``{X: Y for ...}``, ``Y`` is evaluated
before ``X``. This is confusing and should be swapped. (In a
dict display ``{X: Y}`` the order is already ``X`` before
``Y``.)
2. It would be good to confirm definitively that in an assignment
statement, any subexpressions on the left hand side are
evaluated after the right hand side (e.g. ``a[X] = Y`` evaluates
``X`` after ``Y``). (This already seems to be the case.)
3. Also in multiple assignment statements (e.g. ``a[X] = a[Y] = Z``)
it would be good to confirm that ``a[X]`` is evaluated before
``a[Y]``. (This already seems to be the case.)
- Should we adopt Tim Peters's proposal to make the target scope be the
containing scope? It's cute, and has some useful applications, but
it requires a carefully formulated mouthful. (Current answer: yes.)
- Should we disallow combining keyword arguments and unparenthesized
assignment expressions in the same call? (Current answer: no.)
- Should we disallow ``(x := 0, y := 0)`` and ``foo(x := 0, y := 0)``,
requiring the fully parenthesized forms ``((x := 0), (y := 0))`` and
``foo((x := 0), (y := 0))`` instead? (Current answer: no.)
- If we were to change the previous answer to yes, should we still
allow ``len(lines := f.readlines())``? (I'd say yes.)
- Should we disallow assignment expressions anywhere in function
defaults? (Current answer: yes.)
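The evaluation-order corner cases in the first item above can be probed with
a small tracing helper. This is a sketch, not a specification: ``note`` and
``trace_order`` are hypothetical names, and the order observed for the dict
comprehension depends on the interpreter version, so it is recorded without
any claim about its value:

```python
def trace_order():
    order = []

    def note(label, value):
        # Record when this subexpression is evaluated, then pass it through.
        order.append(label)
        return value

    a = {}

    # Case 2: a[X] = Y
    a[note("X", 0)] = note("Y", 1)
    single = list(order)
    order.clear()

    # Case 3: a[X] = a[Y] = Z
    a[note("X", 0)] = a[note("Y", 1)] = note("Z", 2)
    chained = list(order)
    order.clear()

    # Case 1: {X: Y for ...}; the result is version dependent.
    {note("key", k): note("value", k) for k in [0]}
    comprehension = list(order)

    return single, chained, comprehension
```

In the two assignment statements the right hand side is recorded first,
followed by the subscripts from left to right.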
Rejected alternative proposals
==============================
@ -279,6 +451,17 @@ Below are a number of alternative syntaxes, some of them specific to
comprehensions, which have been rejected in favour of the one given above.
Changing the scope rules for comprehensions
-------------------------------------------
A previous version of this PEP proposed subtle changes to the scope
rules for comprehensions, to make them more usable in class scope and
to unify the scope of the "outermost iterable" and the rest of the
comprehension. However, this part of the proposal would have caused
backwards incompatibilities, and has been withdrawn so the PEP can
focus on assignment expressions.
Alternative spellings
---------------------
@ -426,102 +609,46 @@ Once find() returns -1, the loop terminates. If ``:=`` binds as loosely as
While this behaviour would be convenient in many situations, it is also harder
to explain than "the := operator behaves just like the assignment statement",
and as such, the precedence for ``:=`` has been made as close as possible to
that of ``=``.
that of ``=`` (with the exception that it binds tighter than comma).
Migration path
==============
Allowing commas to the right
----------------------------
The semantic changes to list/set/dict comprehensions, and more so to generator
expressions, may potentially require migration of code. In many cases, the
changes simply make legal what used to raise an exception, but there are some
edge cases that were previously legal and now are not, and a few corner cases
with altered semantics.
Some critics have claimed that the assignment expressions should allow
unparenthesized tuples on the right, so that these two would be equivalent::
(point := (x, y))
(point := x, y)
(With the current version of the proposal, the latter would be
equivalent to ``((point := x), y)``.)
However, adopting this stance would logically lead to the conclusion
that when used in a function call, assignment expressions also bind
less tightly than comma, so we'd have the following confusing equivalence::
foo(x := 1, y)
foo(x := (1, y))
The less confusing option is to make ``:=`` bind more tightly than comma.
The Outermost Iterable
----------------------
Always requiring parentheses
----------------------------
As of Python 3.7, the outermost iterable in a comprehension is special: it is
evaluated in the surrounding context, instead of inside the comprehension.
Thus it is permitted to contain a ``yield`` expression, to use a name also
used elsewhere, and to reference names from class scope. Also, in a genexp,
the outermost iterable is pre-evaluated, but the rest of the code is not
touched until the genexp is first iterated over. Class scope is now handled
more generally (see above), but if other changes require the old behaviour,
the iterable must be explicitly elevated from the comprehension::
It's been proposed to just always require parentheses around an
assignment expression. This would resolve many ambiguities, and
indeed parentheses will frequently be needed to extract the desired
subexpression. But in the following cases the extra parentheses feel
redundant::
# Python 3.7
def f(x):
return [x for x in x if x]
def g():
return [x for x in [(yield 1)]]
# With PEP 572
def f(x):
return [y for y in x if y]
def g():
sent_item = (yield 1)
return [x for x in [sent_item]]
# Top level in if
if match := pattern.match(line):
return match.group(1)
This more clearly shows that it is g(), not the comprehension, which is able
to yield values (and is thus a generator function). The entire comprehension
is consistently in a single scope.
The following expressions would, in Python 3.7, raise exceptions immediately.
With the removal of the outermost iterable's special casing, they are now
equivalent to the most obvious longhand form::
gen = (x for x in rage(10)) # NameError
gen = (x for x in 10) # TypeError (not iterable)
gen = (x for x in range(1/0)) # ZeroDivisionError
def <genexp>():
for x in rage(10):
yield x
gen = <genexp>() # No exception yet
tng = next(gen) # NameError
Open questions
==============
Importing names into comprehensions
-----------------------------------
A list comprehension can use and update local names, and they will retain
their values from one iteration to another. It would be convenient to use
this feature to create rolling or self-effecting data streams::
progressive_sums = [total := total + value for value in data]
This will fail with UnboundLocalError due to ``total`` not being initialized.
Simply initializing it outside of the comprehension is insufficient - unless
the comprehension is in class scope::
class X:
total = 0
progressive_sums = [total := total + value for value in data]
At other scopes, it may be beneficial to have a way to fetch a value from the
surrounding scope. Should this be automatic? Should it be controlled with a
keyword? Hypothetically (and using no new keywords), this could be written::
total = 0
progressive_sums = [total := total + value
import nonlocal total
for value in data]
Translated into longhand, this would become::
total = 0
def <listcomp>(total=total):
result = []
for value in data:
result.append(total := total + value)
return result
progressive_sums = <listcomp>()
i.e. utilizing the same early-binding technique that is used at class scope.
# Short call
len(lines := f.readlines())
Frequently Raised Objections
@ -565,10 +692,6 @@ create externally-visible names. This is no different from ``for`` loops or
other constructs, and can be solved the same way: ``del`` the name once it is
no longer needed, or prefix it with an underscore.
Names bound within a comprehension are local to that comprehension, even in
the outermost iterable, and can thus be used freely without polluting the
surrounding namespace.
(The author wishes to thank Guido van Rossum and Christoph Groth for their
suggestions to move the proposal in this direction. [2]_)
@ -590,11 +713,184 @@ benefit of style guides such as PEP 8, two recommendations are suggested.
Acknowledgements
================
The author wishes to thank Guido van Rossum and Nick Coghlan for their
considerable contributions to this proposal, and to members of the
The authors wish to thank Nick Coghlan and Steven D'Aprano for their
considerable contributions to this proposal, and members of the
core-mentorship mailing list for assistance with implementation.
Appendix A: Tim Peters's findings
=================================
Here's a brief essay Tim Peters wrote on the topic.
I dislike "busy" lines of code, and also dislike putting conceptually
unrelated logic on a single line. So, for example, instead of::
i = j = count = nerrors = 0
I prefer::
i = j = 0
count = 0
nerrors = 0
instead. So I suspected I'd find few places I'd want to use
assignment expressions. I didn't even consider them for lines already
stretching halfway across the screen. In other cases, "unrelated"
ruled::
mylast = mylast[1]
yield mylast[0]
is a vast improvement over the briefer::
yield (mylast := mylast[1])[0]
The original two statements are doing entirely different conceptual
things, and slamming them together is conceptually insane.
In other cases, combining related logic made it harder to understand,
such as rewriting::
while True:
old = total
total += term
if old == total:
return total
term *= mx2 / (i*(i+1))
i += 2
as the briefer::
while total != (total := total + term):
term *= mx2 / (i*(i+1))
i += 2
return total
The ``while`` test there is too subtle, crucially relying on strict
left-to-right evaluation in a non-short-circuiting or method-chaining
context. My brain isn't wired that way.
But cases like that were rare. Name binding is very frequent, and
"sparse is better than dense" does not mean "almost empty is better
than sparse". For example, I have many functions that return ``None``
or ``0`` to communicate "I have nothing useful to return in this case,
but since that's expected often I'm not going to annoy you with an
exception". This is essentially the same as regular expression search
functions returning ``None`` when there is no match. So there was lots
of code of the form::
result = solution(xs, n)
if result:
# use result
I find that clearer, and certainly a bit less typing and
pattern-matching reading, as::
if result := solution(xs, n):
# use result
It's also nice to trade away a small amount of horizontal whitespace
to get another _line_ of surrounding code on screen. I didn't give
much weight to this at first, but it was so very frequent it added up,
and I soon enough became annoyed that I couldn't actually run the
briefer code. That surprised me!
There are other cases where assignment expressions really shine.
Rather than pick another from my code, Kirill Balunov gave a lovely
example from the standard library's ``copy()`` function in ``copy.py``::

    reductor = dispatch_table.get(cls)
    if reductor:
        rv = reductor(x)
    else:
        reductor = getattr(x, "__reduce_ex__", None)
        if reductor:
            rv = reductor(4)
        else:
            reductor = getattr(x, "__reduce__", None)
            if reductor:
                rv = reductor()
            else:
                raise Error("un(shallow)copyable object of type %s" % cls)

The ever-increasing indentation is semantically misleading: the logic
is conceptually flat, "the first test that succeeds wins"::

    if reductor := dispatch_table.get(cls):
        rv = reductor(x)
    elif reductor := getattr(x, "__reduce_ex__", None):
        rv = reductor(4)
    elif reductor := getattr(x, "__reduce__", None):
        rv = reductor()
    else:
        raise Error("un(shallow)copyable object of type %s" % cls)

Using easy assignment expressions allows the visual structure of the
code to emphasize the conceptual flatness of the logic;
ever-increasing indentation obscured it.
A smaller example from my code delighted me, both allowing to put
inherently related logic in a single line, and allowing to remove an
annoying "artificial" indentation level::
diff = x - x_base
if diff:
g = gcd(diff, n)
if g > 1:
return g
became::
if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
return g
That ``if`` is about as long as I want my lines to get, but remains easy
to follow.
So, in all, in most lines binding a name, I wouldn't use assignment
expressions, but because that construct is so very frequent, that
leaves many places I would. In most of the latter, I found a small
win that adds up due to how often it occurs, and in the rest I found a
moderate to major win. I'd certainly use it more often than ternary
``if``, but significantly less often than augmented assignment.
A numeric example
-----------------
I have another example that quite impressed me at the time.
Where all variables are positive integers, and a is at least as large
as the n'th root of x, this algorithm returns the floor of the n'th
root of x (and roughly doubling the number of accurate bits per
iteration)::
while a > (d := x // a**(n-1)):
a = ((n-1)*a + d) // n
return a
It's not obvious why that works, but is no more obvious in the "loop
and a half" form. It's hard to prove correctness without building on
the right insight (the "arithmetic mean - geometric mean inequality"),
and knowing some non-trivial things about how nested floor functions
behave. That is, the challenges are in the math, not really in the
coding.
If you do know all that, then the assignment-expression form is easily
read as "while the current guess is too large, get a smaller guess",
where the "too large?" test and the new guess share an expensive
sub-expression.
To my eyes, the original form is harder to understand::

    while True:
        d = x // a**(n-1)
        if a <= d:
            break
        a = ((n-1)*a + d) // n
    return a

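For readers who want to run the algorithm, here is a self-contained sketch.
The function name, the argument checks, and the choice of initial guess (a
power of two that is at least as large as the true root) are assumptions
added for illustration; the loop itself is the one quoted above and requires
an interpreter that implements ``:=``:

```python
def floor_nth_root(x, n):
    """Return the floor of the n'th root of x, for integers x >= 1, n >= 1."""
    if x < 1 or n < 1:
        raise ValueError("x and n must be positive integers")
    # Assumed starting point: 2**ceil(bits / n) is never below the true root,
    # because x < 2**bits.
    a = 1 << -(-x.bit_length() // n)
    # "While the current guess is too large, get a smaller guess."
    while a > (d := x // a**(n-1)):
        a = ((n-1)*a + d) // n
    return a
```

A result ``r`` can be validated independently by checking that
``r**n <= x < (r+1)**n``.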
References
==========