1192 lines
40 KiB
ReStructuredText
1192 lines
40 KiB
ReStructuredText
PEP: 572
|
||
Title: Assignment Expressions
|
||
Author: Chris Angelico <rosuav@gmail.com>, Tim Peters <tim.peters@gmail.com>,
|
||
Guido van Rossum <guido@python.org>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 28-Feb-2018
|
||
Python-Version: 3.8
|
||
Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018, 04-Apr-2018, 17-Apr-2018,
|
||
25-Apr-2018
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This is a proposal for creating a way to assign to variables within an
|
||
expression using the notation ``NAME := expr``.
|
||
|
||
Pending Acceptance
|
||
------------------
|
||
|
||
This PEP will be accepted, however it needs some editing for clarity
|
||
and exact specification. A final draft will be posted to python-dev.
|
||
Note that alternate syntax proposals ("EXPR as NAME", "EXPR given
|
||
...") are no longer under consideration. Hopefully the final draft
|
||
will be posted well before the end of July 2018.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Naming the result of an expression is an important part of programming,
|
||
allowing a descriptive name to be used in place of a longer expression,
|
||
and permitting reuse. Currently, this feature is available only in
|
||
statement form, making it unavailable in list comprehensions and other
|
||
expression contexts.
|
||
|
||
Additionally, naming sub-parts of a large expression can assist an interactive
|
||
debugger, providing useful display hooks and partial results. Without a way to
|
||
capture sub-expressions inline, this would require refactoring of the original
|
||
code; with assignment expressions, this merely requires the insertion of a few
|
||
``name :=`` markers. Removing the need to refactor reduces the likelihood that
|
||
the code be inadvertently changed as part of debugging (a common cause of
|
||
Heisenbugs), and is easier to dictate to another programmer.
|
||
|
||
The importance of real code
|
||
---------------------------
|
||
|
||
During the development of this PEP many people (supporters and critics
|
||
both) have had a tendency to focus on toy examples on the one hand,
|
||
and on overly complex examples on the other.
|
||
|
||
The danger of toy examples is twofold: they are often too abstract to
|
||
make anyone go "ooh, that's compelling", and they are easily refuted
|
||
with "I would never write it that way anyway".
|
||
|
||
The danger of overly complex examples is that they provide a
|
||
convenient strawman for critics of the proposal to shoot down ("that's
|
||
obfuscated").
|
||
|
||
Yet there is some use for both extremely simple and extremely complex
|
||
examples: they are helpful to clarify the intended semantics.
|
||
Therefore there will be some of each below.
|
||
|
||
However, in order to be *compelling*, examples should be rooted in
|
||
real code, i.e. code that was written without any thought of this PEP,
|
||
as part of a useful application, however large or small. Tim Peters
|
||
has been extremely helpful by going over his own personal code
|
||
repository and picking examples of code he had written that (in his
|
||
view) would have been *clearer* if rewritten with (sparing) use of
|
||
assignment expressions. His conclusion: the current proposal would
|
||
have allowed a modest but clear improvement in quite a few bits of
|
||
code.
|
||
|
||
Another use of real code is to observe indirectly how much value
|
||
programmers place on compactness. Guido van Rossum searched through a
|
||
Dropbox code base and discovered some evidence that programmers value
|
||
writing fewer lines over shorter lines.
|
||
|
||
Case in point: Guido found several examples where a programmer
|
||
repeated a subexpression, slowing down the program, in order to save
|
||
one line of code, e.g. instead of writing::
|
||
|
||
match = re.match(data)
|
||
group = match.group(1) if match else None
|
||
|
||
they would write::
|
||
|
||
group = re.match(data).group(1) if re.match(data) else None
|
||
|
||
Another example illustrates that programmers sometimes do more work to
|
||
save an extra level of indentation::
|
||
|
||
match1 = pattern1.match(data)
|
||
match2 = pattern2.match(data)
|
||
if match1:
|
||
return match1.group(1)
|
||
elif match2:
|
||
return match2.group(2)
|
||
|
||
This code tries to match ``pattern2`` even if ``pattern1`` has a match
|
||
(in which case the match on ``pattern2`` is never used). The more
|
||
efficient rewrite would have been::
|
||
|
||
match1 = pattern1.match(data)
|
||
if match1:
|
||
return match1.group(1)
|
||
else:
|
||
match2 = pattern2.match(data)
|
||
if match2:
|
||
return match2.group(2)
|
||
|
||
(TODO: Include Guido's evidence, and do a more systematic search.)
|
||
|
||
|
||
Syntax and semantics
|
||
====================
|
||
|
||
In most contexts where arbitrary Python expressions can be used, a
|
||
**named expression** can appear. This is of the form ``NAME := expr``
|
||
where ``expr`` is any valid Python expression other than an
|
||
unparenthesized tuple, and ``NAME`` is an identifier.
|
||
|
||
The value of such a named expression is the same as the incorporated
|
||
expression, with the additional side-effect that the target is assigned
|
||
that value::
|
||
|
||
# Handle a matched regex
|
||
if (match := pattern.search(data)) is not None:
|
||
...
|
||
|
||
# A more explicit alternative to the 2-arg form of iter() invocation
|
||
while (value := read_next_item()) is not None:
|
||
...
|
||
|
||
# Share a subexpression between a comprehension filter clause and its output
|
||
filtered_data = [y for x in data if (y := f(x)) is not None]
|
||
|
||
Exceptional cases
|
||
-----------------
|
||
|
||
There are a few places where assignment expressions are not allowed,
|
||
in order to avoid ambiguities or user confusion:
|
||
|
||
- Unparenthesized assignment expressions are prohibited at the top
|
||
level of an expression statement; for example, this is not allowed::
|
||
|
||
y := f(x) # INVALID
|
||
|
||
This rule is included to simplify the choice for the user between an
|
||
assignment statements and an assignment expression -- there is no
|
||
syntactic position where both are valid.
|
||
|
||
- Unparenthesized assignment expressions are prohibited at the top
|
||
level in the right hand side of an assignment statement; for
|
||
example, the following is not allowed::
|
||
|
||
y0 = y1 := f(x) # INVALID
|
||
|
||
Again, this rule is included to avoid two visually similar ways of
|
||
saying the same thing.
|
||
|
||
- Unparenthesized assignment expressions are prohibited for the value
|
||
of a keyword argument in a call; for example, this is disallowed::
|
||
|
||
foo(x = y := f(x)) # INVALID
|
||
|
||
This rule is included to disallow excessively confusing code, and
|
||
because parsing keyword arguments is complex enough already.
|
||
|
||
- Assignment expressions (even parenthesized or occurring inside other
|
||
constructs) are prohibited in function default values. For example,
|
||
the following examples are all invalid, even though the expressions
|
||
for the default values are valid in other contexts::
|
||
|
||
def foo(answer = p := 42): # INVALID
|
||
...
|
||
|
||
def bar(answer = (p := 42)): # INVALID
|
||
...
|
||
|
||
def baz(callback = (lambda arg: p := arg)): # INVALID
|
||
...
|
||
|
||
This rule is included to avoid side effects in a position whose
|
||
exact semantics are already confusing to many users (cf. the common
|
||
style recommendation against mutable default values). (TODO: Maybe
|
||
this should just be a style recommendation except for the
|
||
prohibition at the top level?)
|
||
|
||
Scope of the target
|
||
-------------------
|
||
|
||
An assignment expression does not introduce a new scope. In most
|
||
cases the scope in which the target will be bound is self-explanatory:
|
||
it is the current scope. If this scope contains a ``nonlocal`` or
|
||
``global`` declaration for the target, the assignment expression
|
||
honors that.
|
||
|
||
There is one special case: an assignment expression occurring in a
|
||
list, set or dict comprehension or in a generator expression (below
|
||
collectively referred to as "comprehensions") binds the target in the
|
||
containing scope, honoring a ``nonlocal`` or ``global`` declaration
|
||
for the target in that scope, if one exists. For the purpose of this
|
||
rule the containing scope of a nested comprehension is the scope that
|
||
contains the outermost comprehension. A lambda counts as a containing
|
||
scope.
|
||
|
||
The motivation for this special case is twofold. First, it allows us
|
||
to conveniently capture a "witness" for an ``any()`` expression, or a
|
||
counterexample for ``all()``, for example::
|
||
|
||
if any((comment := line).startswith('#') for line in lines):
|
||
print("First comment:", comment)
|
||
else:
|
||
print("There are no comments")
|
||
|
||
if all((nonblank := line).strip() == '' for line in lines):
|
||
print("All lines are blank")
|
||
else:
|
||
print("First non-blank line:", nonblank)
|
||
|
||
Second, it allows a compact way of updating mutable state from a
|
||
comprehension, for example::
|
||
|
||
# Compute partial sums in a list comprehension
|
||
total = 0
|
||
partial_sums = [total := total + v for v in values]
|
||
print("Total:", total)
|
||
|
||
An exception to this special case applies when the target name is the
|
||
same as a loop control variable for a comprehension containing it.
|
||
This is invalid. (This exception exists to rule out edge cases of the
|
||
above scope rules as illustrated by ``[i := i+1 for i in range(5)]``
|
||
or ``[[(j := j) for i in range(5)] for j in range(5)]``. Note that
|
||
this exception also applies to ``[i := 0 for i, j in stuff]``.)
|
||
|
||
TODO: prohibit [... for i in i := ...]
|
||
|
||
A further exception applies when an assignment expression occurrs in a
|
||
comprehension whose containing scope is a class scope. If the rules
|
||
above were to result in the target being assigned in that class's
|
||
scope, the assignment expression is expressly invalid.
|
||
|
||
(The reason for the latter exception is the implicit function created
|
||
for comprehensions -- there is currently no runtime mechanism for a
|
||
function to refer to a variable in the containing class scope, and we
|
||
do not want to add such a mechanism. If this issue ever gets resolved
|
||
this special case may be removed from the specification of assignment
|
||
expressions. Note that the problem already exists for *using* a
|
||
variable defined in the class scope from a comprehension.)
|
||
|
||
See Appendix B for some examples of how the rules for targets in
|
||
comprehensions translate to equivalent code.
|
||
|
||
TODO: use a a subclass of SyntaxError for various prohibitions?
|
||
|
||
Relative precedence of ``:=``
|
||
-----------------------------
|
||
|
||
The ``:=`` operator groups more tightly than a comma in all syntactic
|
||
positions where it is legal, but less tightly than all operators,
|
||
including ``or``, ``and`` and ``not``. As follows from section
|
||
"Exceptional cases" above, it is never allowed at the same level as
|
||
``=``. In case a different grouping is desired, parentheses should be
|
||
used.
|
||
|
||
The ``:=`` operator may be used directly in a positional function call
|
||
argument; however it is invalid directly in a keyword argument.
|
||
|
||
Some examples to clarify what's technically valid or invalid::
|
||
|
||
# INVALID
|
||
x := 0
|
||
|
||
# Valid alternative
|
||
(x := 0)
|
||
|
||
# INVALID
|
||
x = y := 0
|
||
|
||
# Valid alternative
|
||
x = (y := 0)
|
||
|
||
# Valid
|
||
len(lines := f.readlines())
|
||
|
||
# Valid
|
||
foo(x := 3, cat='vector')
|
||
|
||
# INVALID
|
||
foo(cat=category := 'vector')
|
||
|
||
# Valid alternative
|
||
foo(cat=(category := 'vector'))
|
||
|
||
Most of the "valid" examples above are not recommended, since human
|
||
readers of Python source code who are quickly glancing at some code
|
||
may miss the distinction. But simple cases are not objectionable::
|
||
|
||
# Valid
|
||
if any(len(longline := line) >= 100 for line in lines):
|
||
print("Extremely long line:", longline)
|
||
|
||
This PEP recommends always putting spaces around ``:=``, similar to
|
||
PEP 8's recommendation for ``=`` when used for assignment, whereas the
|
||
latter disallows spaces around ``=`` used for keyword arguments.)
|
||
|
||
|
||
Differences between assignment expressions and assignment statements
|
||
---------------------------------------------------------------------
|
||
|
||
Most importantly, since ``:=`` is an expression, it can be used in contexts
|
||
where statements are illegal, including lambda functions and comprehensions.
|
||
|
||
Conversely, assignment expressions don't support the advanced features
|
||
found in assignment statements:
|
||
|
||
- Multiple targets are not directly supported::
|
||
|
||
x = y = z = 0 # Equivalent: (x := (y := (z := 0)))
|
||
|
||
- Single assignment targets more complex than a single ``NAME`` are
|
||
not supported::
|
||
|
||
# No equivalent
|
||
a[i] = x
|
||
self.rest = []
|
||
|
||
- Iterable packing and unpacking (both regular or extended forms) are
|
||
not supported::
|
||
|
||
# Equivalent needs extra parentheses
|
||
loc = x, y # Use (loc := (x, y))
|
||
info = name, phone, *rest # Use (info := (name, phone, *rest))
|
||
|
||
# No equivalent
|
||
px, py, pz = position
|
||
name, phone, email, *other_info = contact
|
||
|
||
- Type annotations are not supported::
|
||
|
||
# No equivalent
|
||
p: Optional[int] = None
|
||
|
||
- Augmented assignment is not supported::
|
||
|
||
total += tax # Equivalent: (total := total + tax)
|
||
|
||
|
||
Examples
|
||
========
|
||
|
||
Examples from the Python standard library
|
||
-----------------------------------------
|
||
|
||
site.py
|
||
^^^^^^^
|
||
|
||
*env_base* is only used on these lines, putting its assignment on the if
|
||
moves it as the "header" of the block.
|
||
|
||
- Current::
|
||
|
||
env_base = os.environ.get("PYTHONUSERBASE", None)
|
||
if env_base:
|
||
return env_base
|
||
|
||
- Improved::
|
||
|
||
if env_base := os.environ.get("PYTHONUSERBASE", None):
|
||
return env_base
|
||
|
||
_pydecimal.py
|
||
^^^^^^^^^^^^^
|
||
|
||
Avoid nested if and remove one indentation level.
|
||
|
||
- Current::
|
||
|
||
if self._is_special:
|
||
ans = self._check_nans(context=context)
|
||
if ans:
|
||
return ans
|
||
|
||
- Improved::
|
||
|
||
if self._is_special and (ans := self._check_nans(context=context)):
|
||
return ans
|
||
|
||
copy.py
|
||
^^^^^^^
|
||
|
||
Code looks more regular and avoid multiple nested if.
|
||
(See Appendix A for the origin of this example.)
|
||
|
||
- Current::
|
||
|
||
reductor = dispatch_table.get(cls)
|
||
if reductor:
|
||
rv = reductor(x)
|
||
else:
|
||
reductor = getattr(x, "__reduce_ex__", None)
|
||
if reductor:
|
||
rv = reductor(4)
|
||
else:
|
||
reductor = getattr(x, "__reduce__", None)
|
||
if reductor:
|
||
rv = reductor()
|
||
else:
|
||
raise Error(
|
||
"un(deep)copyable object of type %s" % cls)
|
||
|
||
- Improved::
|
||
|
||
if reductor := dispatch_table.get(cls):
|
||
rv = reductor(x)
|
||
elif reductor := getattr(x, "__reduce_ex__", None):
|
||
rv = reductor(4)
|
||
elif reductor := getattr(x, "__reduce__", None):
|
||
rv = reductor()
|
||
else:
|
||
raise Error("un(deep)copyable object of type %s" % cls)
|
||
|
||
datetime.py
|
||
^^^^^^^^^^^
|
||
|
||
*tz* is only used for ``s += tz``, moving its assignment inside the if
|
||
helps to show its scope.
|
||
|
||
- Current::
|
||
|
||
s = _format_time(self._hour, self._minute,
|
||
self._second, self._microsecond,
|
||
timespec)
|
||
tz = self._tzstr()
|
||
if tz:
|
||
s += tz
|
||
return s
|
||
|
||
- Improved::
|
||
|
||
s = _format_time(self._hour, self._minute,
|
||
self._second, self._microsecond,
|
||
timespec)
|
||
if tz := self._tzstr():
|
||
s += tz
|
||
return s
|
||
|
||
sysconfig.py
|
||
^^^^^^^^^^^^
|
||
|
||
Calling ``fp.readline()`` in the ``while`` condition and calling
|
||
``.match()`` on the if lines make the code more compact without making
|
||
it harder to understand.
|
||
|
||
- Current::
|
||
|
||
while True:
|
||
line = fp.readline()
|
||
if not line:
|
||
break
|
||
m = define_rx.match(line)
|
||
if m:
|
||
n, v = m.group(1, 2)
|
||
try:
|
||
v = int(v)
|
||
except ValueError:
|
||
pass
|
||
vars[n] = v
|
||
else:
|
||
m = undef_rx.match(line)
|
||
if m:
|
||
vars[m.group(1)] = 0
|
||
|
||
- Improved::
|
||
|
||
while line := fp.readline():
|
||
if m := define_rx.match(line):
|
||
n, v = m.group(1, 2)
|
||
try:
|
||
v = int(v)
|
||
except ValueError:
|
||
pass
|
||
vars[n] = v
|
||
elif m := undef_rx.match(line):
|
||
vars[m.group(1)] = 0
|
||
|
||
|
||
Simplifying list comprehensions
|
||
-------------------------------
|
||
|
||
A list comprehension can map and filter efficiently by capturing
|
||
the condition::
|
||
|
||
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
|
||
|
||
Similarly, a subexpression can be reused within the main expression, by
|
||
giving it a name on first use::
|
||
|
||
stuff = [[y := f(x), x/y] for x in range(5)]
|
||
|
||
Note that in both cases the variable ``y`` is bound in the containing
|
||
scope (i.e. at the same level as ``results`` or ``stuff``).
|
||
|
||
|
||
Capturing condition values
|
||
--------------------------
|
||
|
||
Assignment expressions can be used to good effect in the header of
|
||
an ``if`` or ``while`` statement::
|
||
|
||
# Loop-and-a-half
|
||
while (command := input("> ")) != "quit":
|
||
print("You entered:", command)
|
||
|
||
# Capturing regular expression match objects
|
||
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
|
||
# of this effect
|
||
if match := re.search(pat, text):
|
||
print("Found:", match.group(0))
|
||
# The same syntax chains nicely into 'elif' statements, unlike the
|
||
# equivalent using assignment statements.
|
||
elif match := re.search(otherpat, text):
|
||
print("Alternate found:", match.group(0))
|
||
elif match := re.search(third, text):
|
||
print("Fallback found:", match.group(0))
|
||
|
||
# Reading socket data until an empty string is returned
|
||
while data := sock.recv():
|
||
print("Received data:", data)
|
||
|
||
Particularly with the ``while`` loop, this can remove the need to have an
|
||
infinite loop, an assignment, and a condition. It also creates a smooth
|
||
parallel between a loop which simply uses a function call as its condition,
|
||
and one which uses that as its condition but also uses the actual value.
|
||
|
||
Fork
|
||
----
|
||
|
||
An example from the low-level UNIX world::
|
||
|
||
if pid := os.fork():
|
||
# Parent code
|
||
else:
|
||
# Child code
|
||
|
||
|
||
Open questions and TODOs
|
||
========================
|
||
|
||
- For precise semantics, the proposal requires evaluation order to be
|
||
well-defined. We're mostly good due to the rule that things
|
||
generally are evaluated from left to right, but there are some
|
||
corner cases:
|
||
|
||
1. In a dict comprehension ``{X: Y for ...}``, ``Y`` is evaluated
|
||
before ``X``. This is confusing and should be swapped. (In a
|
||
dict display ``{X: Y}}`` the order is already ``X`` before
|
||
``Y``.)
|
||
|
||
2. It would be good to confirm definitively that in an assignment
|
||
statement, any subexpressions on the left hand side are
|
||
evaluated after the right hand side (e.g. ``a[X] = Y`` evaluates
|
||
``X`` after ``Y``). (This already seems to be the case.)
|
||
|
||
3. Also in multiple assignment statements (e.g. ``a[X] = a[Y] = Z``)
|
||
it would be good to confirm that ``a[X]`` is evaluated before
|
||
``a[Y]``. (This already seems to be the case.)
|
||
|
||
- Should we adopt Tim Peters's proposal to make the target scope be the
|
||
containing scope? It's cute, and has some useful applications, but
|
||
it requires a carefully formulated mouthful. (Current answer: yes.)
|
||
|
||
- Should we disallow combining keyword arguments and unparenthesized
|
||
assignment expressions in the same call? (Current answer: no.)
|
||
|
||
- Should we disallow ``(x := 0, y := 0)`` and ``foo(x := 0, y := 0)``,
|
||
requiring the fully parenthesized forms ``((x := 0), (y := 0))`` and
|
||
``foo((x := 0), (y := 0))`` instead? (Current answer: no.)
|
||
|
||
- If we were to change the previous answer to yes, should we still
|
||
allow ``len(lines := f.readlines())``? (I'd say yes.)
|
||
|
||
- Should we disallow assignment expressions anywhere in function
|
||
defaults? (Current answer: yes.)
|
||
|
||
|
||
Rejected alternative proposals
|
||
==============================
|
||
|
||
Proposals broadly similar to this one have come up frequently on python-ideas.
|
||
Below are a number of alternative syntaxes, some of them specific to
|
||
comprehensions, which have been rejected in favour of the one given above.
|
||
|
||
|
||
Changing the scope rules for comprehensions
|
||
-------------------------------------------
|
||
|
||
A previous version of this PEP proposed subtle changes to the scope
|
||
rules for comprehensions, to make them more usable in class scope and
|
||
to unify the scope of the "outermost iterable" and the rest of the
|
||
comprehension. However, this part of the proposal would have caused
|
||
backwards incompatibilities, and has been withdrawn so the PEP can
|
||
focus on assignment expressions.
|
||
|
||
|
||
Alternative spellings
|
||
---------------------
|
||
|
||
Broadly the same semantics as the current proposal, but spelled differently.
|
||
|
||
1. ``EXPR as NAME``::
|
||
|
||
stuff = [[f(x) as y, x/y] for x in range(5)]
|
||
|
||
Since ``EXPR as NAME`` already has meaning in ``except`` and ``with``
|
||
statements (with different semantics), this would create unnecessary
|
||
confusion or require special-casing (eg to forbid assignment within the
|
||
headers of these statements).
|
||
|
||
TODO: Explain more why this is bad, since this keeps coming up (it's also readability)
|
||
|
||
2. ``EXPR -> NAME``::
|
||
|
||
stuff = [[f(x) -> y, x/y] for x in range(5)]
|
||
|
||
This syntax is inspired by languages such as R and Haskell, and some
|
||
programmable calculators. (Note that a left-facing arrow ``y <- f(x)`` is
|
||
not possible in Python, as it would be interpreted as less-than and unary
|
||
minus.) This syntax has a slight advantage over 'as' in that it does not
|
||
conflict with ``with`` and ``except`` statements, but otherwise is
|
||
equivalent.
|
||
|
||
3. Adorning statement-local names with a leading dot::
|
||
|
||
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
|
||
stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
|
||
|
||
This has the advantage that leaked usage can be readily detected, removing
|
||
some forms of syntactic ambiguity. However, this would be the only place
|
||
in Python where a variable's scope is encoded into its name, making
|
||
refactoring harder.
|
||
|
||
4. Adding a ``where:`` to any statement to create local name bindings::
|
||
|
||
value = x**2 + 2*x where:
|
||
x = spam(1, 4, 7, q)
|
||
|
||
Execution order is inverted (the indented body is performed first, followed
|
||
by the "header"). This requires a new keyword, unless an existing keyword
|
||
is repurposed (most likely ``with:``). See PEP 3150 for prior discussion
|
||
on this subject (with the proposed keyword being ``given:``).
|
||
|
||
5. ``TARGET from EXPR``::
|
||
|
||
stuff = [[y from f(x), x/y] for x in range(5)]
|
||
|
||
This syntax has fewer conflicts than ``as`` does (conflicting only with the
|
||
``raise Exc from Exc`` notation), but is otherwise comparable to it. Instead
|
||
of paralleling ``with expr as target:`` (which can be useful but can also be
|
||
confusing), this has no parallels, but is evocative.
|
||
|
||
|
||
Special-casing conditional statements
|
||
-------------------------------------
|
||
|
||
One of the most popular use-cases is ``if`` and ``while`` statements. Instead
|
||
of a more general solution, this proposal enhances the syntax of these two
|
||
statements to add a means of capturing the compared value::
|
||
|
||
if re.search(pat, text) as match:
|
||
print("Found:", match.group(0))
|
||
|
||
This works beautifully if and ONLY if the desired condition is based on the
|
||
truthiness of the captured value. It is thus effective for specific
|
||
use-cases (regex matches, socket reads that return `''` when done), and
|
||
completely useless in more complicated cases (eg where the condition is
|
||
``f(x) < 0`` and you want to capture the value of ``f(x)``). It also has
|
||
no benefit to list comprehensions.
|
||
|
||
Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction
|
||
of possible use-cases, even in ``if``/``while`` statements.
|
||
|
||
|
||
Special-casing comprehensions
|
||
-----------------------------
|
||
|
||
Another common use-case is comprehensions (list/set/dict, and genexps). As
|
||
above, proposals have been made for comprehension-specific solutions.
|
||
|
||
1. ``where``, ``let``, or ``given``::
|
||
|
||
stuff = [(y, x/y) where y = f(x) for x in range(5)]
|
||
stuff = [(y, x/y) let y = f(x) for x in range(5)]
|
||
stuff = [(y, x/y) given y = f(x) for x in range(5)]
|
||
|
||
This brings the subexpression to a location in between the 'for' loop and
|
||
the expression. It introduces an additional language keyword, which creates
|
||
conflicts. Of the three, ``where`` reads the most cleanly, but also has the
|
||
greatest potential for conflict (eg SQLAlchemy and numpy have ``where``
|
||
methods, as does ``tkinter.dnd.Icon`` in the standard library).
|
||
|
||
2. ``with NAME = EXPR``::
|
||
|
||
stuff = [(y, x/y) with y = f(x) for x in range(5)]
|
||
|
||
As above, but reusing the `with` keyword. Doesn't read too badly, and needs
|
||
no additional language keyword. Is restricted to comprehensions, though,
|
||
and cannot as easily be transformed into "longhand" for-loop syntax. Has
|
||
the C problem that an equals sign in an expression can now create a name
|
||
binding, rather than performing a comparison. Would raise the question of
|
||
why "with NAME = EXPR:" cannot be used as a statement on its own.
|
||
|
||
3. ``with EXPR as NAME``::
|
||
|
||
stuff = [(y, x/y) with f(x) as y for x in range(5)]
|
||
|
||
As per option 2, but using ``as`` rather than an equals sign. Aligns
|
||
syntactically with other uses of ``as`` for name binding, but a simple
|
||
transformation to for-loop longhand would create drastically different
|
||
semantics; the meaning of ``with`` inside a comprehension would be
|
||
completely different from the meaning as a stand-alone statement, while
|
||
retaining identical syntax.
|
||
|
||
Regardless of the spelling chosen, this introduces a stark difference between
|
||
comprehensions and the equivalent unrolled long-hand form of the loop. It is
|
||
no longer possible to unwrap the loop into statement form without reworking
|
||
any name bindings. The only keyword that can be repurposed to this task is
|
||
``with``, thus giving it sneakily different semantics in a comprehension than
|
||
in a statement; alternatively, a new keyword is needed, with all the costs
|
||
therein.
|
||
|
||
|
||
Lowering operator precedence
|
||
----------------------------
|
||
|
||
There are two logical precedences for the ``:=`` operator. Either it should
|
||
bind as loosely as possible, as does statement-assignment; or it should bind
|
||
more tightly than comparison operators. Placing its precedence between the
|
||
comparison and arithmetic operators (to be precise: just lower than bitwise
|
||
OR) allows most uses inside ``while`` and ``if`` conditions to be spelled
|
||
without parentheses, as it is most likely that you wish to capture the value
|
||
of something, then perform a comparison on it::
|
||
|
||
pos = -1
|
||
while pos := buffer.find(search_term, pos + 1) >= 0:
|
||
...
|
||
|
||
Once find() returns -1, the loop terminates. If ``:=`` binds as loosely as
|
||
``=`` does, this would capture the result of the comparison (generally either
|
||
``True`` or ``False``), which is less useful.
|
||
|
||
While this behaviour would be convenient in many situations, it is also harder
|
||
to explain than "the := operator behaves just like the assignment statement",
|
||
and as such, the precedence for ``:=`` has been made as close as possible to
|
||
that of ``=`` (with the exception that it binds tighter than comma).
|
||
|
||
|
||
Allowing commas to the right
|
||
----------------------------
|
||
|
||
Some critics have claimed that the assignment expressions should allow
|
||
unparenthesized tuples on the right, so that these two would be equivalent::
|
||
|
||
(point := (x, y))
|
||
(point := x, y)
|
||
|
||
(With the current version of the proposal, the latter would be
|
||
equivalent to ``((point := x), y)``.)
|
||
|
||
However, adopting this stance would logically lead to the conclusion
|
||
that when used in a function call, assignment expressions also bind
|
||
less tight than comma, so we'd have the following confusing equivalence::
|
||
|
||
foo(x := 1, y)
|
||
foo(x := (1, y))
|
||
|
||
The less confusing option is to make ``:=`` bind more tightly than comma.
|
||
|
||
|
||
Always requiring parentheses
|
||
----------------------------
|
||
|
||
It's been proposed to just always require parenthesize around an
|
||
assignment expression. This would resolve many ambiguities, and
|
||
indeed parentheses will frequently be needed to extract the desired
|
||
subexpression. But in the following cases the extra parentheses feel
|
||
redundant::
|
||
|
||
# Top level in if
|
||
if match := pattern.match(line):
|
||
return match.group(1)
|
||
|
||
# Short call
|
||
len(lines := f.readlines())
|
||
|
||
|
||
Frequently Raised Objections
|
||
============================
|
||
|
||
Why not just turn existing assignment into an expression?
|
||
---------------------------------------------------------
|
||
|
||
C and its derivatives define the ``=`` operator as an expression, rather than
|
||
a statement as is Python's way. This allows assignments in more contexts,
|
||
including contexts where comparisons are more common. The syntactic similarity
|
||
between ``if (x == y)`` and ``if (x = y)`` belies their drastically different
|
||
semantics. Thus this proposal uses ``:=`` to clarify the distinction.
|
||
|
||
|
||
This could be used to create ugly code!
|
||
---------------------------------------
|
||
|
||
So can anything else. This is a tool, and it is up to the programmer to use it
|
||
where it makes sense, and not use it where superior constructs can be used.
|
||
|
||
|
||
With assignment expressions, why bother with assignment statements?
|
||
-------------------------------------------------------------------
|
||
|
||
The two forms have different flexibilities. The ``:=`` operator can be used
|
||
inside a larger expression; the ``=`` statement can be augmented to ``+=`` and
|
||
its friends, can be chained, and can assign to attributes and subscripts.
|
||
|
||
|
||
Why not use a sublocal scope and prevent namespace pollution?
|
||
-------------------------------------------------------------
|
||
|
||
Previous revisions of this proposal involved sublocal scope (restricted to a
|
||
single statement), preventing name leakage and namespace pollution. While a
|
||
definite advantage in a number of situations, this increases complexity in
|
||
many others, and the costs are not justified by the benefits. In the interests
|
||
of language simplicity, the name bindings created here are exactly equivalent
|
||
to any other name bindings, including that usage at class or module scope will
|
||
create externally-visible names. This is no different from ``for`` loops or
|
||
other constructs, and can be solved the same way: ``del`` the name once it is
|
||
no longer needed, or prefix it with an underscore.
|
||
|
||
(The author wishes to thank Guido van Rossum and Christoph Groth for their
|
||
suggestions to move the proposal in this direction. [2]_)
|
||
|
||
|
||
Style guide recommendations
|
||
===========================
|
||
|
||
As expression assignments can sometimes be used equivalently to statement
|
||
assignments, the question of which should be preferred will arise. For the
|
||
benefit of style guides such as PEP 8, two recommendations are suggested.
|
||
|
||
1. If either assignment statements or assignment expressions can be
|
||
used, prefer statements; they are a clear declaration of intent.
|
||
|
||
2. If using assignment expressions would lead to ambiguity about
|
||
execution order, restructure it to use statements instead.
|
||
|
||
|
||
Acknowledgements
|
||
================
|
||
|
||
The authors wish to thank Nick Coghlan and Steven D'Aprano for their
|
||
considerable contributions to this proposal, and members of the
|
||
core-mentorship mailing list for assistance with implementation.
|
||
|
||
|
||
Appendix A: Tim Peters's findings
|
||
=================================
|
||
|
||
Here's a brief essay Tim Peters wrote on the topic.
|
||
|
||
I dislike "busy" lines of code, and also dislike putting conceptually
|
||
unrelated logic on a single line. So, for example, instead of::
|
||
|
||
i = j = count = nerrors = 0
|
||
|
||
I prefer::
|
||
|
||
i = j = 0
|
||
count = 0
|
||
nerrors = 0
|
||
|
||
instead. So I suspected I'd find few places I'd want to use
|
||
assignment expressions. I didn't even consider them for lines already
|
||
stretching halfway across the screen. In other cases, "unrelated"
|
||
ruled::
|
||
|
||
mylast = mylast[1]
|
||
yield mylast[0]
|
||
|
||
is a vast improvment over the briefer::
|
||
|
||
yield (mylast := mylast[1])[0]
|
||
|
||
The original two statements are doing entirely different conceptual
|
||
things, and slamming them together is conceptually insane.
|
||
|
||
In other cases, combining related logic made it harder to understand,
|
||
such as rewriting::
|
||
|
||
while True:
|
||
old = total
|
||
total += term
|
||
if old == total:
|
||
return total
|
||
term *= mx2 / (i*(i+1))
|
||
i += 2
|
||
|
||
as the briefer::
|
||
|
||
while total != (total := total + term):
|
||
term *= mx2 / (i*(i+1))
|
||
i += 2
|
||
return total
|
||
|
||
The ``while`` test there is too subtle, crucially relying on strict
|
||
left-to-right evaluation in a non-short-circuiting or method-chaining
|
||
context. My brain isn't wired that way.
|
||
|
||
But cases like that were rare. Name binding is very frequent, and
|
||
"sparse is better than dense" does not mean "almost empty is better
|
||
than sparse". For example, I have many functions that return ``None``
|
||
or ``0`` to communicate "I have nothing useful to return in this case,
|
||
but since that's expected often I'm not going to annoy you with an
|
||
exception". This is essentially the same as regular expression search
|
||
functions returning ``None`` when there is no match. So there was lots
|
||
of code of the form::
|
||
|
||
result = solution(xs, n)
|
||
if result:
|
||
# use result
|
||
|
||
I find that clearer, and certainly a bit less typing and
|
||
pattern-matching reading, as::
|
||
|
||
if result := solution(xs, n):
|
||
# use result
|
||
|
||
It's also nice to trade away a small amount of horizontal whitespace
|
||
to get another _line_ of surrounding code on screen. I didn't give
|
||
much weight to this at first, but it was so very frequent it added up,
|
||
and I soon enough became annoyed that I couldn't actually run the
|
||
briefer code. That surprised me!
|
||
|
||
There are other cases where assignment expressions really shine.
|
||
Rather than pick another from my code, Kirill Balunov gave a lovely
|
||
example from the standard library's ``copy()`` function in ``copy.py``::
|
||
|
||
reductor = dispatch_table.get(cls)
|
||
if reductor:
|
||
rv = reductor(x)
|
||
else:
|
||
reductor = getattr(x, "__reduce_ex__", None)
|
||
if reductor:
|
||
rv = reductor(4)
|
||
else:
|
||
reductor = getattr(x, "__reduce__", None)
|
||
if reductor:
|
||
rv = reductor()
|
||
else:
|
||
raise Error("un(shallow)copyable object of type %s" % cls)
|
||
|
||
The ever-increasing indentation is semantically misleading: the logic
|
||
is conceptually flat, "the first test that succeeds wins"::
|
||
|
||
if reductor := dispatch_table.get(cls):
|
||
rv = reductor(x)
|
||
elif reductor := getattr(x, "__reduce_ex__", None):
|
||
rv = reductor(4)
|
||
elif reductor := getattr(x, "__reduce__", None):
|
||
rv = reductor()
|
||
else:
|
||
raise Error("un(shallow)copyable object of type %s" % cls)
|
||
|
||
Using easy assignment expressions allows the visual structure of the
|
||
code to emphasize the conceptual flatness of the logic;
|
||
ever-increasing indentation obscured it.
|
||
|
||
A smaller example from my code delighted me, both allowing to put
|
||
inherently related logic in a single line, and allowing to remove an
|
||
annoying "artificial" indentation level::
|
||
|
||
diff = x - x_base
|
||
if diff:
|
||
g = gcd(diff, n)
|
||
if g > 1:
|
||
return g
|
||
|
||
became::
|
||
|
||
if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
|
||
return g
|
||
|
||
That ``if`` is about as long as I want my lines to get, but remains easy
|
||
to follow.
|
||
|
||
So, in all, in most lines binding a name, I wouldn't use assignment
|
||
expressions, but because that construct is so very frequent, that
|
||
leaves many places I would. In most of the latter, I found a small
|
||
win that adds up due to how often it occurs, and in the rest I found a
|
||
moderate to major win. I'd certainly use it more often than ternary
|
||
``if``, but significantly less often than augmented assignment.
|
||
|
||
A numeric example
|
||
-----------------
|
||
|
||
I have another example that quite impressed me at the time.
|
||
|
||
Where all variables are positive integers, and a is at least as large
|
||
as the n'th root of x, this algorithm returns the floor of the n'th
|
||
root of x (and roughly doubling the number of accurate bits per
|
||
iteration)::
|
||
|
||
while a > (d := x // a**(n-1)):
|
||
a = ((n-1)*a + d) // n
|
||
return a
|
||
|
||
It's not obvious why that works, but is no more obvious in the "loop
|
||
and a half" form. It's hard to prove correctness without building on
|
||
the right insight (the "arithmetic mean - geometric mean inequality"),
|
||
and knowing some non-trivial things about how nested floor functions
|
||
behave. That is, the challenges are in the math, not really in the
|
||
coding.
|
||
|
||
If you do know all that, then the assignment-expression form is easily
|
||
read as "while the current guess is too large, get a smaller guess",
|
||
where the "too large?" test and the new guess share an expensive
|
||
sub-expression.
|
||
|
||
To my eyes, the original form is harder to understand::
|
||
|
||
while True:
|
||
d = x // a**(n-1)
|
||
if a <= d:
|
||
break
|
||
a = ((n-1)*a + d) // n
|
||
return a
|
||
|
||
|
||
Appendix B: Rough code translations for comprehensions
|
||
======================================================
|
||
|
||
This appendix attempts to clarify (though not specify) the rules when
|
||
a target occurs in a comprehension or in a generator expression.
|
||
For a number of illustrative examples we show the original code,
|
||
containing a comprehension, and the translation, where the
|
||
comprehension has been replaced by an equivalent generator function
|
||
plus some scaffolding.
|
||
|
||
Since ``[x for ...]`` is equivalent to ``list(x for ...)`` these
|
||
examples all use list comprehensions without loss of generality.
|
||
And since these examples are meant to clarify edge cases of the rules,
|
||
they aren't trying to look like real code.
|
||
|
||
Note: comprehensions are already implemented via synthesizing nested
|
||
generator functions like those in this appendix. The new part is
|
||
adding appropriate declarations to establish the intended scope of
|
||
assignment expression targets (the same scope they resolve to as if
|
||
the assignment were performed in the block containing the outermost
|
||
comprehension). For type inference purposes, these illustrative
|
||
expansions do not imply that assignment expression targets are always
|
||
Optional (but they do indicate the target binding scope).
|
||
|
||
Let's start with a reminder of what code is generated for a generator
|
||
expression without assignment expression.
|
||
|
||
- Original code (EXPR usually references VAR)::
|
||
|
||
def f():
|
||
a = [EXPR for VAR in ITERABLE]
|
||
|
||
- Translation (let's not worry about name conflicts)::
|
||
|
||
def f():
|
||
def genexpr(iterator):
|
||
for VAR in iterator:
|
||
yield EXPR
|
||
a = list(genexpr(iter(ITERABLE)))
|
||
|
||
Let's add a simple assignment expression.
|
||
|
||
- Original code::
|
||
|
||
def f():
|
||
a = [TARGET := EXPR for VAR in ITERABLE]
|
||
|
||
- Translation::
|
||
|
||
def f():
|
||
if False:
|
||
TARGET = None # Dead code to ensure TARGET is a local variable
|
||
def genexpr(iterator):
|
||
nonlocal TARGET
|
||
for VAR in iterator:
|
||
TARGET = EXPR
|
||
yield TARGET
|
||
a = list(genexpr(iter(ITERABLE)))
|
||
|
||
Let's add a ``global TARGET`` declaration in ``f()``.
|
||
|
||
- Original code::
|
||
|
||
def f():
|
||
global TARGET
|
||
a = [TARGET := EXPR for VAR in ITERABLE]
|
||
|
||
- Translation::
|
||
|
||
def f():
|
||
global TARGET
|
||
def genexpr(iterator):
|
||
global TARGET
|
||
for VAR in iterator:
|
||
TARGET = EXPR
|
||
yield TARGET
|
||
a = list(genexpr(iter(ITERABLE)))
|
||
|
||
Or instead let's add a ``nonlocal TARGET`` declaration in ``f()``.
|
||
|
||
- Original code::
|
||
|
||
def g():
|
||
TARGET = ...
|
||
def f():
|
||
nonlocal TARGET
|
||
a = [TARGET := EXPR for VAR in ITERABLE]
|
||
|
||
- Translation::
|
||
|
||
def g():
|
||
TARGET = ...
|
||
def f():
|
||
nonlocal TARGET
|
||
def genexpr(iterator):
|
||
nonlocal TARGET
|
||
for VAR in iterator:
|
||
TARGET = EXPR
|
||
yield TARGET
|
||
a = list(genexpr(iter(ITERABLE)))
|
||
|
||
Finally, let's nest two comprehensions.
|
||
|
||
- Original code::
|
||
|
||
def f():
|
||
a = [[TARGET := i for i in range(3)] for j in range(2)]
|
||
# I.e., a = [[0, 1, 2], [0, 1, 2]]
|
||
print(TARGET) # prints 2
|
||
|
||
- Translation::
|
||
|
||
def f():
|
||
if False:
|
||
TARGET = None
|
||
def outer_genexpr(outer_iterator):
|
||
nonlocal TARGET
|
||
def inner_generator(inner_iterator):
|
||
nonlocal TARGET
|
||
for i in inner_iterator:
|
||
TARGET = i
|
||
yield i
|
||
for j in outer_iterator:
|
||
yield list(inner_generator(range(3)))
|
||
a = list(outer_genexpr(range(2)))
|
||
print(TARGET)
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Proof of concept / reference implementation
|
||
(https://github.com/Rosuav/cpython/tree/assignment-expressions)
|
||
.. [2] Pivotal post regarding inline assignment semantics
|
||
(https://mail.python.org/pipermail/python-ideas/2018-March/049409.html)
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|