Pep 572 update (#654)
- Remove changes to comprehension scope
- Make := in comprehensions assign to containing non-comprehension scope
- Clarify binding precedence (tighter than comma, not at same level as =)
- Remove mention of more complex targets in the future
- Explicitly disallow toplevel :=
- Rewrite section on differences with =, enumerating all of them
- Remove "here's how this could be written without :=" from examples
- Tweak first paragraph of "Syntax and semantics"
- Add "Exception cases" (various explicit prohibitions)
- Clarify that lambda is a containing scope
- Clarify that := and = just don't mix
- Added "Open questions" section
- Added two new rejected alternatives: "Allowing commas to the right" and "Always requiring parentheses"
- Minor edits
* Start a section on real code
* Correct/clarify "commas to the right" section
* Add Tim and Guido as authors
* Update abstract to mention :=
* Rule out targets conflicting with comprehension loop control
* Add timcode.txt as Appendix A
* Add os.fork() example
* Add TODOs about evaluation order
parent
0e2f1f04f7
commit
6d81538ecb
800
pep-0572.rst
|
@ -1,6 +1,7 @@
|
||||||
PEP: 572
|
PEP: 572
|
||||||
Title: Assignment Expressions
|
Title: Assignment Expressions
|
||||||
Author: Chris Angelico <rosuav@gmail.com>
|
Author: Chris Angelico <rosuav@gmail.com>, Tim Peters <tim.peters@gmail.com>,
|
||||||
|
Guido van Rossum <guido@python.org>
|
||||||
Status: Draft
|
Status: Draft
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
|
@ -14,8 +15,7 @@ Abstract
|
||||||
========
|
========
|
||||||
|
|
||||||
This is a proposal for creating a way to assign to variables within an
|
This is a proposal for creating a way to assign to variables within an
|
||||||
expression. Additionally, the precise scope of comprehensions is adjusted, to
|
expression using the notation ``NAME := expr``.
|
||||||
maintain consistency and follow expectations.
|
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
@ -25,11 +25,7 @@ Naming the result of an expression is an important part of programming,
|
||||||
allowing a descriptive name to be used in place of a longer expression,
|
allowing a descriptive name to be used in place of a longer expression,
|
||||||
and permitting reuse. Currently, this feature is available only in
|
and permitting reuse. Currently, this feature is available only in
|
||||||
statement form, making it unavailable in list comprehensions and other
|
statement form, making it unavailable in list comprehensions and other
|
||||||
expression contexts. Merely introducing a way to assign as an expression
|
expression contexts.
|
||||||
would create bizarre edge cases around comprehensions, though, and to avoid
|
|
||||||
the worst of the confusions, we change the definition of comprehensions,
|
|
||||||
causing some edge cases to be interpreted differently, but maintaining the
|
|
||||||
existing behaviour in the majority of situations.
|
|
||||||
|
|
||||||
Additionally, naming sub-parts of a large expression can assist an interactive
|
Additionally, naming sub-parts of a large expression can assist an interactive
|
||||||
debugger, providing useful display hooks and partial results. Without a way to
|
debugger, providing useful display hooks and partial results. Without a way to
|
||||||
|
@ -39,13 +35,83 @@ code; with assignment expressions, this merely requires the insertion of a few
|
||||||
the code be inadvertently changed as part of debugging (a common cause of
|
the code be inadvertently changed as part of debugging (a common cause of
|
||||||
Heisenbugs), and is easier to dictate to another programmer.
|
Heisenbugs), and is easier to dictate to another programmer.
|
||||||
|
|
||||||
|
The importance of real code
|
||||||
|
---------------------------
|
||||||
|
|
||||||
|
During the development of this PEP many people (supporters and critics
|
||||||
|
both) have had a tendency to focus on toy examples on the one hand,
|
||||||
|
and on overly complex examples on the other.
|
||||||
|
|
||||||
|
The danger of toy examples is twofold: they are often too abstract to
|
||||||
|
make anyone go "ooh, that's compelling", and they are easily refuted
|
||||||
|
with "I would never write it that way anyway".
|
||||||
|
|
||||||
|
The danger of overly complex examples is that they provide a
|
||||||
|
convenient strawman for critics of the proposal to shoot down ("that's
|
||||||
|
obfuscated").
|
||||||
|
|
||||||
|
Yet there is some use for both extremely simple and extremely complex
|
||||||
|
examples: they are helpful to clarify the intended semantics.
|
||||||
|
Therefore there will be some of each below.
|
||||||
|
|
||||||
|
However, in order to be *compelling*, examples should be rooted in
|
||||||
|
real code, i.e. code that was written without any thought of this PEP,
|
||||||
|
as part of a useful application, however large or small. Tim Peters
|
||||||
|
has been extremely helpful by going over his own personal code
|
||||||
|
repository and picking examples of code he had written that (in his
|
||||||
|
view) would have been *clearer* if rewritten with (sparing) use of
|
||||||
|
assignment expressions. His conclusion: the current proposal would
|
||||||
|
have allowed a modest but clear improvement in quite a few bits of
|
||||||
|
code.
|
||||||
|
|
||||||
|
Another use of real code is to observe indirectly how much value
|
||||||
|
programmers place on compactness. Guido van Rossum searched through a
|
||||||
|
Dropbox code base and discovered some evidence that programmers value
|
||||||
|
writing fewer lines over shorter lines.
|
||||||
|
|
||||||
|
Case in point: Guido found several examples where a programmer
|
||||||
|
repeated a subexpression, slowing down the program, in order to save
|
||||||
|
one line of code, e.g. instead of writing::
|
||||||
|
|
||||||
|
match = re.match(data)
|
||||||
|
group = match.group(1) if match else None
|
||||||
|
|
||||||
|
they would write::
|
||||||
|
|
||||||
|
group = re.match(data).group(1) if re.match(data) else None
|
||||||
|
|
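For comparison, here is a minimal sketch of how the one-line form could avoid the repeated call using the ``NAME := expr`` notation. The pattern and the sample strings are hypothetical, and the snippet assumes an interpreter that implements assignment expressions (CPython 3.8 or later):

```python
import re

# Hypothetical pattern, chosen only for illustration.
pattern = re.compile(r"id=(\d+)")


def first_group(data):
    # The condition of a conditional expression is evaluated first,
    # so ``m`` is already bound when ``m.group(1)`` is evaluated.
    return m.group(1) if (m := pattern.match(data)) else None


print(first_group("id=42"))
print(first_group("no id here"))
```

The match is performed once per call, and the result is still a single expression.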
||||||
|
Another example illustrates that programmers sometimes do more work to
|
||||||
|
save an extra level of indentation::
|
||||||
|
|
||||||
|
match1 = pattern1.match(data)
|
||||||
|
match2 = pattern2.match(data)
|
||||||
|
if match1:
|
||||||
|
return match1.group(1)
|
||||||
|
elif match2:
|
||||||
|
return match2.group(2)
|
||||||
|
|
||||||
|
This code tries to match ``pattern2`` even if ``pattern1`` has a match
|
||||||
|
(in which case the match on ``pattern2`` is never used). The more
|
||||||
|
efficient rewrite would have been::
|
||||||
|
|
||||||
|
match1 = pattern1.match(data)
|
||||||
|
if match1:
|
||||||
|
return match1.group(1)
|
||||||
|
else:
|
||||||
|
match2 = pattern2.match(data)
|
||||||
|
if match2:
|
||||||
|
return match2.group(2)
|
||||||
|
|
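As a sketch of how the same logic might look with assignment expressions, the second match can be deferred without adding a level of indentation. The two patterns below are hypothetical placeholders, and an interpreter with ``:=`` support (CPython 3.8 or later) is assumed:

```python
import re

# Hypothetical patterns, used only to make the sketch runnable.
pattern1 = re.compile(r"(\w+)@")
pattern2 = re.compile(r"(\w+)=(\w+)")


def extract(data):
    # pattern2 is only tried when pattern1 did not match.
    if match1 := pattern1.match(data):
        return match1.group(1)
    elif match2 := pattern2.match(data):
        return match2.group(2)
    return None


print(extract("user@host"))
print(extract("key=value"))
```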
||||||
|
(TODO: Include Guido's evidence, and do a more systematic search.)
|
||||||
|
|
||||||
|
|
||||||
Syntax and semantics
|
Syntax and semantics
|
||||||
====================
|
====================
|
||||||
|
|
||||||
In any context where arbitrary Python expressions can be used, a **named
|
In most contexts where arbitrary Python expressions can be used, a
|
||||||
expression** can appear. This is of the form ``name := expr`` where
|
**named expression** can appear. This is of the form ``NAME := expr``
|
||||||
``expr`` is any valid Python expression, and ``name`` is an identifier.
|
where ``expr`` is any valid Python expression other than an
|
||||||
|
unparenthesized tuple, and ``NAME`` is an identifier.
|
||||||
|
|
||||||
The value of such a named expression is the same as the incorporated
|
The value of such a named expression is the same as the incorporated
|
||||||
expression, with the additional side-effect that the target is assigned
|
expression, with the additional side-effect that the target is assigned
|
||||||
|
@ -62,140 +128,221 @@ that value::
|
||||||
# Share a subexpression between a comprehension filter clause and its output
|
# Share a subexpression between a comprehension filter clause and its output
|
||||||
filtered_data = [y for x in data if (y := f(x)) is not None]
|
filtered_data = [y for x in data if (y := f(x)) is not None]
|
||||||
|
|
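The following sketch makes the two effects of a named expression visible: it evaluates to the value of ``expr``, and it binds that value to the target. The helper ``f`` and the sample ``data`` are assumptions made for illustration, and CPython 3.8 or later is assumed:

```python
def f(x):
    # Hypothetical helper: returns None for odd inputs.
    return x * 10 if x % 2 == 0 else None


data = [1, 2, 3, 4]

# f(x) is called once per element; its result is tested and reused as y.
filtered_data = [y for x in data if (y := f(x)) is not None]

# The value of the whole named expression is the value that was bound.
result = (n := len(data))

print(filtered_data, result, n)
```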
||||||
|
Exceptional cases
|
||||||
|
-----------------
|
||||||
|
|
||||||
Differences from regular assignment statements
|
There are a few places where assignment expressions are not allowed,
|
||||||
----------------------------------------------
|
in order to avoid ambiguities or user confusion:
|
||||||
|
|
||||||
|
- Unparenthesized assignment expressions are prohibited at the top
|
||||||
|
level of an expression statement; for example, this is not allowed::
|
||||||
|
|
||||||
|
y := f(x) # INVALID
|
||||||
|
|
||||||
|
This rule is included to simplify the choice for the user between an
|
||||||
|
assignment statement and an assignment expression -- there is no
|
||||||
|
syntactic position where both are valid.
|
||||||
|
|
||||||
|
- Unparenthesized assignment expressions are prohibited at the top
|
||||||
|
level in the right hand side of an assignment statement; for
|
||||||
|
example, the following is not allowed::
|
||||||
|
|
||||||
|
y0 = y1 := f(x) # INVALID
|
||||||
|
|
||||||
|
Again, this rule is included to avoid two visually similar ways of
|
||||||
|
saying the same thing.
|
||||||
|
|
||||||
|
- Unparenthesized assignment expressions are prohibited for the value
|
||||||
|
of a keyword argument in a call; for example, this is disallowed::
|
||||||
|
|
||||||
|
foo(x = y := f(x)) # INVALID
|
||||||
|
|
||||||
|
This rule is included to disallow excessively confusing code.
|
||||||
|
|
||||||
|
- TODO: Should we disallow using keyword arguments and top level
|
||||||
|
assignment expressions in the same call? E.g.::
|
||||||
|
|
||||||
|
# Should these be invalid?
|
||||||
|
foo(x=0, y := f(0))
|
||||||
|
bar(x := 0, y = f(x))
|
||||||
|
|
||||||
|
Regardless, ``foo(x := 0)`` should probably be valid (see below).
|
||||||
|
|
||||||
|
- Assignment expressions (even parenthesized or occurring inside other
|
||||||
|
constructs) are prohibited in function default values. For example,
|
||||||
|
the following examples are all invalid, even though the expressions
|
||||||
|
for the default values are valid in other contexts::
|
||||||
|
|
||||||
|
def foo(answer = p := 42): # INVALID
|
||||||
|
...
|
||||||
|
|
||||||
|
def bar(answer = (p := 42)): # INVALID
|
||||||
|
...
|
||||||
|
|
||||||
|
def baz(callback = (lambda arg: p := arg)): # INVALID
|
||||||
|
...
|
||||||
|
|
||||||
|
This rule is included to avoid side effects in a position whose
|
||||||
|
exact semantics are already confusing to many users (cf. the common
|
||||||
|
style recommendation against mutable default values). (TODO: Maybe
|
||||||
|
this should just be a style recommendation except for the
|
||||||
|
prohibition at the top level?)
|
||||||
|
|
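Several of the prohibitions above can be checked mechanically by asking the compiler whether a source string is accepted. This is only a sketch: the helper ``is_valid`` is hypothetical, and the results reflect what an interpreter with ``:=`` support (CPython 3.8 or later) accepts, which may not match every rule in this draft. The parenthesized and lambda forms in default values are deliberately left out, since a released interpreter may not enforce the stricter rule drafted here:

```python
def is_valid(source):
    """Return True if ``source`` compiles, False if it is a SyntaxError."""
    try:
        # compile() only parses and compiles; f, x and foo are never looked up.
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Source text mapped to whether the rules above allow it.
checks = {
    "y := f(x)": False,
    "(y := f(x))": True,
    "y0 = y1 := f(x)": False,
    "y0 = (y1 := f(x))": True,
    "foo(x = y := f(x))": False,
    "foo(x = (y := f(x)))": True,
    "foo(x := 0)": True,
    "def foo(answer = p := 42): ...": False,
}

for source, expected in checks.items():
    print(f"{source!r}: compiles={is_valid(source)}, listed as valid={expected}")
```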
||||||
|
Scope of the target
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
An assignment expression does not introduce a new scope. In most
|
||||||
|
cases the scope in which the target will be bound is self-explanatory:
|
||||||
|
it is the current scope. If this scope contains a ``nonlocal`` or
|
||||||
|
``global`` declaration for the target, the assignment expression
|
||||||
|
honors that.
|
||||||
|
|
||||||
|
There is one special case: an assignment expression occurring in a
|
||||||
|
list, set or dict comprehension or in a generator expression (below
|
||||||
|
collectively referred to as "comprehensions") binds the target in the
|
||||||
|
containing scope, honoring a ``nonlocal`` or ``global`` declaration
|
||||||
|
for the target in that scope, if one exists. For the purpose of this
|
||||||
|
rule the containing scope of a nested comprehension is the scope that
|
||||||
|
contains the outermost comprehension. A lambda counts as a containing
|
||||||
|
scope.
|
||||||
|
|
||||||
|
The motivation for this special case is twofold. First, it allows us
|
||||||
|
to conveniently capture a "witness" for an ``any()`` expression, or a
|
||||||
|
counterexample for ``all()``, for example::
|
||||||
|
|
||||||
|
if any((comment := line).startswith('#') for line in lines):
|
||||||
|
print("First comment:", comment)
|
||||||
|
else:
|
||||||
|
print("There are no comments")
|
||||||
|
|
||||||
|
if all((nonblank := line).strip() == '' for line in lines):
|
||||||
|
print("All lines are blank")
|
||||||
|
else:
|
||||||
|
print("First non-blank line:", nonblank)
|
||||||
|
|
||||||
|
Second, it allows a compact way of updating mutable state from a
|
||||||
|
comprehension, for example::
|
||||||
|
|
||||||
|
# Compute partial sums in a list comprehension
|
||||||
|
total = 0
|
||||||
|
partial_sums = [total := total + v for v in values]
|
||||||
|
print("Total:", total)
|
||||||
|
|
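Both motivations can be exercised in a small runnable sketch. The function names and sample inputs are hypothetical, and CPython 3.8 or later is assumed; the point is that the targets remain visible after the comprehension or generator expression has finished:

```python
def first_comment(lines):
    # ``comment`` is bound in this function's scope, not inside the
    # generator expression, so it is still available after any() returns.
    if any((comment := line).startswith("#") for line in lines):
        return comment
    return None


def running_totals(values):
    total = 0
    # Each iteration rebinds ``total`` in the enclosing function scope.
    partial_sums = [total := total + v for v in values]
    return partial_sums, total


print(first_comment(["x = 1", "# note", "# later"]))
print(running_totals([1, 2, 3, 4]))
```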
||||||
|
An exception to this special case applies when the target name is the
|
||||||
|
same as a loop control variable for a comprehension containing it.
|
||||||
|
This is invalid. (This exception exists to rule out edge cases of the
|
||||||
|
above scope rules as illustrated by ``[i := i+1 for i in range(5)]``
|
||||||
|
or ``[[(j := j) for i in range(5)] for j in range(5)]``. Note that
|
||||||
|
this exception also applies to ``[i := 0 for i, j in stuff]``.)
|
||||||
|
|
||||||
|
A further exception applies when an assignment expression occurs in a
|
||||||
|
comprehension whose containing scope is a class scope. If the rules
|
||||||
|
above were to result in the target being assigned in that class's
|
||||||
|
scope, the assignment expression is expressly invalid.
|
||||||
|
|
||||||
|
(The reason for the latter exception is the implicit function created
|
||||||
|
for comprehensions -- there is currently no runtime mechanism for a
|
||||||
|
function to refer to a variable in the containing class scope, and we
|
||||||
|
do not want to add such a mechanism. If this issue ever gets resolved
|
||||||
|
this special case may be removed from the specification of assignment
|
||||||
|
expressions. Note that the problem already exists for *using* a
|
||||||
|
variable defined in the class scope from a comprehension.)
|
||||||
|
|
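The two exceptions can be observed by compiling the offending sources. The sketch below assumes CPython 3.8 or later, where these cases are reported as ``SyntaxError`` at compile time; the helper ``compiles`` is hypothetical:

```python
def compiles(source):
    """Return True if ``source`` compiles, False if it is a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Targets that clash with a loop control variable, and a comprehension
# whose containing scope is a class body.
rejected = [
    "[i := i+1 for i in range(5)]",
    "[[(j := j) for i in range(5)] for j in range(5)]",
    "[i := 0 for i, j in stuff]",
    "class C:\n    xs = [(y := n) for n in range(3)]",
]

# The same comprehension is fine when the containing scope is a function.
accepted = "def g():\n    return [(y := n) for n in range(3)]"

for source in rejected:
    print(repr(source), "compiles:", compiles(source))
print(repr(accepted), "compiles:", compiles(accepted))
```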
||||||
|
Relative precedence of ``:=``
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
The ``:=`` operator groups more tightly than a comma in all syntactic
|
||||||
|
positions where it is legal, but less tightly than all operators,
|
||||||
|
including ``or``, ``and`` and ``not``. As follows from section
|
||||||
|
"Exceptional cases" above, it is never allowed at the same level as
|
||||||
|
``=``. In case a different grouping is desired, parentheses should be
|
||||||
|
used.
|
||||||
|
|
||||||
|
The ``:=`` operator may be used directly in a positional function call
|
||||||
|
argument; however it is invalid directly in a keyword argument.
|
||||||
|
|
||||||
|
Some examples to clarify what's technically valid or invalid::
|
||||||
|
|
||||||
|
# INVALID
|
||||||
|
x := 0
|
||||||
|
|
||||||
|
# Valid alternative
|
||||||
|
(x := 0)
|
||||||
|
|
||||||
|
# INVALID
|
||||||
|
x = y := 0
|
||||||
|
|
||||||
|
# Valid alternative
|
||||||
|
x = (y := 0)
|
||||||
|
|
||||||
|
# Valid
|
||||||
|
len(lines := f.readlines())
|
||||||
|
|
||||||
|
# Valid (TODO: Should this be disallowed?)
|
||||||
|
foo(x := 3, cat='vector')
|
||||||
|
|
||||||
|
# INVALID
|
||||||
|
foo(cat=category := 'vector')
|
||||||
|
|
||||||
|
# Valid alternative
|
||||||
|
foo(cat=(category := 'vector'))
|
||||||
|
|
||||||
|
Most of the "valid" examples above are not recommended, since human
|
||||||
|
readers of Python source code who are quickly glancing at some code
|
||||||
|
may miss the distinction. But simple cases are not objectionable::
|
||||||
|
|
||||||
|
# Valid
|
||||||
|
if any(len(longline := line) >= 100 for line in lines):
|
||||||
|
print("Extremely long line:", longline)
|
||||||
|
|
||||||
|
This PEP recommends always putting spaces around ``:=``, similar to
|
||||||
|
PEP 8's recommendation for ``=`` when used for assignment, whereas the
|
||||||
|
latter disallows spaces around ``=`` used for keyword arguments.
|
||||||
|
|
||||||
|
|
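A short sketch of how the grouping rules in this section play out in practice, assuming CPython 3.8 or later. The variable names are arbitrary:

```python
# ``:=`` binds less tightly than ``or``: the whole ``0 or 5`` is assigned.
(a := 0 or 5)

# It also binds less tightly than comparisons, so parentheses are needed
# to capture the operand rather than the result of the comparison.
flag = (b := 3) > 2    # b is 3, flag is True
(c := 3 > 2)           # c is True

# It binds more tightly than a comma: this is the tuple ((d := 1), 2).
pair = (d := 1, 2)

# Directly usable as a positional argument.
count = len(items := ["x", "y", "z"])

print(a, b, flag, c, d, pair, count, items)
```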
||||||
|
Differences between assignment expressions and assignment statements
|
||||||
|
---------------------------------------------------------------------
|
||||||
|
|
||||||
Most importantly, since ``:=`` is an expression, it can be used in contexts
|
Most importantly, since ``:=`` is an expression, it can be used in contexts
|
||||||
where statements are illegal, including lambda functions and comprehensions.
|
where statements are illegal, including lambda functions and comprehensions.
|
||||||
|
|
||||||
An assignment statement can assign to multiple targets, left-to-right::
|
Conversely, assignment expressions don't support the advanced features
|
||||||
|
found in assignment statements:
|
||||||
|
|
||||||
x = y = z = 0
|
- Multiple targets are not directly supported::
|
||||||
|
|
||||||
The equivalent assignment expression is parsed as separate binary operators,
|
x = y = z = 0 # Equivalent: (x := (y := (z := 0)))
|
||||||
and is therefore processed right-to-left, as if it were spelled thus::
|
|
||||||
|
|
||||||
assert 0 == (x := (y := (z := 0)))
|
- Single assignment targets more complex than a single ``NAME`` are
|
||||||
|
not supported::
|
||||||
|
|
||||||
Statement assignment can include annotations. This would be syntactically
|
# No equivalent
|
||||||
noisy in expressions, and is of minor importance. An annotation can be
|
a[i] = x
|
||||||
given separately from the assignment if needed::
|
self.rest = []
|
||||||
|
|
||||||
x:str = "" # works
|
- Iterable packing and unpacking (both regular and extended forms) are
|
||||||
(x:str := "") # SyntaxError
|
not supported::
|
||||||
x:str # possibly before a loop
|
|
||||||
(x := "") # fine
|
|
||||||
|
|
||||||
Augmented assignment is not supported in expression form::
|
# Equivalent needs extra parentheses
|
||||||
|
loc = x, y # Use (loc := (x, y))
|
||||||
|
info = name, phone, *rest # Use (info := (name, phone, *rest))
|
||||||
|
|
||||||
>>> x +:= 1
|
# No equivalent
|
||||||
File "<stdin>", line 1
|
px, py, pz = position
|
||||||
x +:= 1
|
name, phone, email, *other_info = contact
|
||||||
^
|
|
||||||
SyntaxError: invalid syntax
|
|
||||||
|
|
||||||
Statement assignment is able to set attributes and subscripts, but
|
- Type annotations are not supported::
|
||||||
expression assignment is restricted to names. (This restriction may be
|
|
||||||
relaxed in a future version of Python.)
|
|
||||||
|
|
||||||
Otherwise, the semantics of assignment are identical in statement and
|
# No equivalent
|
||||||
expression forms.
|
p: Optional[int] = None
|
||||||
|
|
||||||
|
- Augmented assignment is not supported::
|
||||||
|
|
||||||
|
total += tax # Equivalent: (total := total + tax)
|
||||||
|
|
||||||
|
|
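The equivalences and gaps listed above can be sketched as follows, assuming CPython 3.8 or later. The helper ``compiles`` and the sample values are hypothetical:

```python
def compiles(source):
    """Return True if ``source`` compiles, False if it is a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Multiple targets become nested named expressions.
(x := (y := (z := 0)))

# A tuple on the right-hand side needs its own parentheses.
px, py = 3, 4
(loc := (px, py))

# Augmented assignment has to be spelled out in full.
total, tax = 100, 8
(total := total + tax)

# Forms with no assignment-expression equivalent are syntax errors.
unsupported = [
    "(a[i] := x)",               # subscript target
    "(self.rest := [])",         # attribute target
    "((px, py) := position)",    # unpacking target
    "(p: int := 1)",             # annotated target
    "(total +:= tax)",           # augmented form
]

print(x, y, z, loc, total)
print([compiles(source) for source in unsupported])
```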
||||||
Alterations to comprehensions
|
Examples
|
||||||
-----------------------------
|
========
|
||||||
|
|
||||||
The current behaviour of list/set/dict comprehensions and generator
|
|
||||||
expressions has some edge cases that would behave strangely if an assignment
|
|
||||||
expression were to be used. Therefore the proposed semantics are changed,
|
|
||||||
removing the current edge cases, and instead altering their behaviour *only*
|
|
||||||
in a class scope.
|
|
||||||
|
|
||||||
As of Python 3.7, the outermost iterable of any comprehension is evaluated
|
|
||||||
in the surrounding context, and then passed as an argument to the implicit
|
|
||||||
function that evaluates the comprehension.
|
|
||||||
|
|
||||||
Under this proposal, the entire body of the comprehension is evaluated in
|
|
||||||
its implicit function. Names not assigned to within the comprehension are
|
|
||||||
located in the surrounding scopes, as with normal lookups. As one special
|
|
||||||
case, a comprehension at class scope will **eagerly bind** any name which
|
|
||||||
is already defined in the class scope.
|
|
||||||
|
|
||||||
A list comprehension can be unrolled into an equivalent function. With
|
|
||||||
Python 3.7 semantics::
|
|
||||||
|
|
||||||
numbers = [x + y for x in range(3) for y in range(4)]
|
|
||||||
# Is approximately equivalent to
|
|
||||||
def <listcomp>(iterator):
|
|
||||||
result = []
|
|
||||||
for x in iterator:
|
|
||||||
for y in range(4):
|
|
||||||
result.append(x + y)
|
|
||||||
return result
|
|
||||||
numbers = <listcomp>(iter(range(3)))
|
|
||||||
|
|
||||||
Under the new semantics, this would instead be equivalent to::
|
|
||||||
|
|
||||||
def <listcomp>():
|
|
||||||
result = []
|
|
||||||
for x in range(3):
|
|
||||||
for y in range(4):
|
|
||||||
result.append(x + y)
|
|
||||||
return result
|
|
||||||
numbers = <listcomp>()
|
|
||||||
|
|
||||||
When a class scope is involved, a naive transformation into a function would
|
|
||||||
prevent name lookups (as the function would behave like a method)::
|
|
||||||
|
|
||||||
class X:
|
|
||||||
names = ["Fred", "Barney", "Joe"]
|
|
||||||
prefix = "> "
|
|
||||||
prefixed_names = [prefix + name for name in names]
|
|
||||||
|
|
||||||
With Python 3.7 semantics, this will evaluate the outermost iterable at class
|
|
||||||
scope, which will succeed; but it will evaluate everything else in a function::
|
|
||||||
|
|
||||||
class X:
|
|
||||||
names = ["Fred", "Barney", "Joe"]
|
|
||||||
prefix = "> "
|
|
||||||
def <listcomp>(iterator):
|
|
||||||
result = []
|
|
||||||
for name in iterator:
|
|
||||||
result.append(prefix + name)
|
|
||||||
return result
|
|
||||||
prefixed_names = <listcomp>(iter(names))
|
|
||||||
|
|
||||||
The name ``prefix`` is thus searched for at global scope, ignoring the class
|
|
||||||
name. Under the proposed semantics, this name will be eagerly bound; and the
|
|
||||||
same early binding then handles the outermost iterable as well. The list
|
|
||||||
comprehension is thus approximately equivalent to::
|
|
||||||
|
|
||||||
class X:
|
|
||||||
names = ["Fred", "Barney", "Joe"]
|
|
||||||
prefix = "> "
|
|
||||||
def <listcomp>(names=names, prefix=prefix):
|
|
||||||
result = []
|
|
||||||
for name in names:
|
|
||||||
result.append(prefix + name)
|
|
||||||
return result
|
|
||||||
prefixed_names = <listcomp>()
|
|
||||||
|
|
||||||
With list comprehensions, this is unlikely to cause any confusion. With
|
|
||||||
generator expressions, this has the potential to affect behaviour, as the
|
|
||||||
eager binding means that the name could be rebound between the creation of
|
|
||||||
the genexp and the first call to ``next()``. It is, however, more closely
|
|
||||||
aligned to normal expectations. The effect is ONLY seen with names that
|
|
||||||
are looked up from class scope; global names (eg ``range()``) will still
|
|
||||||
be late-bound as usual.
|
|
||||||
|
|
||||||
One consequence of this change is that certain bugs in genexps will not
|
|
||||||
be detected until the first call to ``next()``, where today they would be
|
|
||||||
caught upon creation of the generator.
|
|
||||||
|
|
||||||
|
|
||||||
Recommended use-cases
|
|
||||||
=====================
|
|
||||||
|
|
||||||
Simplifying list comprehensions
|
Simplifying list comprehensions
|
||||||
-------------------------------
|
-------------------------------
|
||||||
|
@ -210,21 +357,8 @@ giving it a name on first use::
|
||||||
|
|
||||||
stuff = [[y := f(x), x/y] for x in range(5)]
|
stuff = [[y := f(x), x/y] for x in range(5)]
|
||||||
|
|
||||||
# There are a number of less obvious ways to spell this in current
|
Note that in both cases the variable ``y`` is bound in the containing
|
||||||
# versions of Python, such as:
|
scope (i.e. at the same level as ``results`` or ``stuff``).
|
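A runnable sketch of the ``stuff`` example, showing that ``y`` remains bound in the containing scope once the comprehension has finished. The function ``f`` is a hypothetical stand-in that never returns zero, and CPython 3.8 or later is assumed:

```python
def f(x):
    # Hypothetical stand-in for an expensive computation.
    return x + 1


def build():
    # Within each element, ``y := f(x)`` is evaluated before ``x / y``.
    stuff = [[y := f(x), x / y] for x in range(5)]
    # ``y`` lives in build()'s scope and holds the last value assigned.
    return stuff, y


stuff, last_y = build()
print(stuff)
print(last_y)
```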
||||||
|
|
||||||
# Inline helper function
|
|
||||||
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
|
|
||||||
|
|
||||||
# Extra 'for' loop - potentially could be optimized internally
|
|
||||||
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
|
|
||||||
|
|
||||||
# Using a mutable cache object (various forms possible)
|
|
||||||
c = {}
|
|
||||||
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
|
|
||||||
|
|
||||||
In all cases, the name is local to the comprehension; like iteration variables,
|
|
||||||
it cannot leak out into the surrounding context.
|
|
||||||
|
|
||||||
|
|
||||||
Capturing condition values
|
Capturing condition values
|
||||||
|
@ -233,7 +367,7 @@ Capturing condition values
|
||||||
Assignment expressions can be used to good effect in the header of
|
Assignment expressions can be used to good effect in the header of
|
||||||
an ``if`` or ``while`` statement::
|
an ``if`` or ``while`` statement::
|
||||||
|
|
||||||
# Proposed syntax
|
# Loop-and-a-half
|
||||||
while (command := input("> ")) != "quit":
|
while (command := input("> ")) != "quit":
|
||||||
print("You entered:", command)
|
print("You entered:", command)
|
||||||
|
|
||||||
|
@ -250,26 +384,64 @@ an ``if`` or ``while`` statement::
|
||||||
print("Fallback found:", match.group(0))
|
print("Fallback found:", match.group(0))
|
||||||
|
|
||||||
# Reading socket data until an empty string is returned
|
# Reading socket data until an empty string is returned
|
||||||
while data := sock.read():
|
while data := sock.recv():
|
||||||
print("Received data:", data)
|
print("Received data:", data)
|
||||||
|
|
||||||
# Equivalent in current Python, not caring about function return value
|
|
||||||
while input("> ") != "quit":
|
|
||||||
print("You entered a command.")
|
|
||||||
|
|
||||||
# To capture the return value in current Python demands a four-line
|
|
||||||
# loop header.
|
|
||||||
while True:
|
|
||||||
command = input("> ");
|
|
||||||
if command == "quit":
|
|
||||||
break
|
|
||||||
print("You entered:", command)
|
|
||||||
|
|
||||||
Particularly with the ``while`` loop, this can remove the need to have an
|
Particularly with the ``while`` loop, this can remove the need to have an
|
||||||
infinite loop, an assignment, and a condition. It also creates a smooth
|
infinite loop, an assignment, and a condition. It also creates a smooth
|
||||||
parallel between a loop which simply uses a function call as its condition,
|
parallel between a loop which simply uses a function call as its condition,
|
||||||
and one which uses that as its condition but also uses the actual value.
|
and one which uses that as its condition but also uses the actual value.
|
||||||
|
|
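The same loop-and-a-half shape can be tried without a terminal or a socket. The sketch below substitutes an in-memory stream for the data source; ``read_chunks`` is a hypothetical helper and CPython 3.8 or later is assumed:

```python
import io


def read_chunks(stream, size):
    chunks = []
    # The loop ends when read() returns an empty (falsy) bytes object.
    while chunk := stream.read(size):
        chunks.append(chunk)
    return chunks


print(read_chunks(io.BytesIO(b"abcdefgh"), 3))
```

The header both performs the read and tests its result, so no ``while True`` with an inner ``break`` is needed.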
||||||
|
Fork
|
||||||
|
----
|
||||||
|
|
||||||
|
An example from the low-level UNIX world::
|
||||||
|
|
||||||
|
if pid := os.fork():
|
||||||
|
# Parent code
|
||||||
|
else:
|
||||||
|
# Child code
|
||||||
|
|
||||||
|
|
||||||
|
Open questions and TODOs
|
||||||
|
========================
|
||||||
|
|
||||||
|
- For precise semantics, the proposal requires evaluation order to be
|
||||||
|
well-defined. We're mostly good due to the rule that things
|
||||||
|
generally are evaluated from left to right, but there are some
|
||||||
|
corner cases:
|
||||||
|
|
||||||
|
1. In a dict comprehension ``{X: Y for ...}``, ``Y`` is evaluated
|
||||||
|
before ``X``. This is confusing and should be swapped. (In a
|
||||||
|
dict display ``{X: Y}`` the order is already ``X`` before
|
||||||
|
``Y``.)
|
||||||
|
|
||||||
|
2. It would be good to confirm definitively that in an assignment
|
||||||
|
statement, any subexpressions on the left hand side are
|
||||||
|
evaluated after the right hand side (e.g. ``a[X] = Y`` evaluates
|
||||||
|
``X`` after ``Y``). (This already seems to be the case.)
|
||||||
|
|
||||||
|
3. Also in multiple assignment statements (e.g. ``a[X] = a[Y] = Z``)
|
||||||
|
it would be good to confirm that ``a[X]`` is evaluated before
|
||||||
|
``a[Y]``. (This already seems to be the case.)
|
||||||
|
|
||||||
|
- Should we adopt Tim Peters's proposal to make the target scope be the
|
||||||
|
containing scope? It's cute, and has some useful applications, but
|
||||||
|
it requires a carefully formulated mouthful. (Current answer: yes.)
|
||||||
|
|
||||||
|
- Should we disallow combining keyword arguments and unparenthesized
|
||||||
|
assignment expressions in the same call? (Current answer: no.)
|
||||||
|
|
||||||
|
- Should we disallow ``(x := 0, y := 0)`` and ``foo(x := 0, y := 0)``,
|
||||||
|
requiring the fully parenthesized forms ``((x := 0), (y := 0))`` and
|
||||||
|
``foo((x := 0), (y := 0))`` instead? (Current answer: no.)
|
||||||
|
|
||||||
|
- If we were to change the previous answer to yes, should we still
|
||||||
|
allow ``len(lines := f.readlines())``? (I'd say yes.)
|
||||||
|
|
||||||
|
- Should we disallow assignment expressions anywhere in function
|
||||||
|
defaults? (Current answer: yes.)
|
||||||
|
|
||||||
|
|
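The evaluation-order corner cases raised in the first open question can be observed directly by recording when each subexpression runs. This is a sketch with a hypothetical ``note`` helper. The dict comprehension result shown in the comment is for CPython 3.8 and later, where the key is evaluated before the value; older interpreters use the order described in item 1:

```python
trace = []


def note(label, value):
    """Record ``label`` at evaluation time, then return ``value`` unchanged."""
    trace.append(label)
    return value


def run(action):
    trace.clear()
    action()
    return list(trace)


def subscript_store():
    a = {}
    a[note("X", 0)] = note("Y", 1)


def chained_store():
    a = {}
    a[note("X", 0)] = a[note("Y", 1)] = note("Z", 2)


def dict_comprehension():
    # On CPython 3.8+ this records X before Y.
    return {note("X", k): note("Y", k) for k in [0]}


print(run(subscript_store))
print(run(chained_store))
print(run(dict_comprehension))
```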
||||||
Rejected alternative proposals
|
Rejected alternative proposals
|
||||||
==============================
|
==============================
|
||||||
|
@ -279,6 +451,17 @@ Below are a number of alternative syntaxes, some of them specific to
|
||||||
comprehensions, which have been rejected in favour of the one given above.
|
comprehensions, which have been rejected in favour of the one given above.
|
||||||
|
|
||||||
|
|
||||||
|
Changing the scope rules for comprehensions
|
||||||
|
-------------------------------------------
|
||||||
|
|
||||||
|
A previous version of this PEP proposed subtle changes to the scope
|
||||||
|
rules for comprehensions, to make them more usable in class scope and
|
||||||
|
to unify the scope of the "outermost iterable" and the rest of the
|
||||||
|
comprehension. However, this part of the proposal would have caused
|
||||||
|
backwards incompatibilities, and has been withdrawn so the PEP can
|
||||||
|
focus on assignment expressions.
|
||||||
|
|
||||||
|
|
||||||
Alternative spellings
|
Alternative spellings
|
||||||
---------------------
|
---------------------
|
||||||
|
|
||||||
|
@ -426,102 +609,46 @@ Once find() returns -1, the loop terminates. If ``:=`` binds as loosely as
|
||||||
While this behaviour would be convenient in many situations, it is also harder
|
While this behaviour would be convenient in many situations, it is also harder
|
||||||
to explain than "the := operator behaves just like the assignment statement",
|
to explain than "the := operator behaves just like the assignment statement",
|
||||||
and as such, the precedence for ``:=`` has been made as close as possible to
|
and as such, the precedence for ``:=`` has been made as close as possible to
|
||||||
that of ``=``.
|
that of ``=`` (with the exception that it binds tighter than comma).
|
||||||
|
|
||||||
|
|
||||||
Migration path
|
Allowing commas to the right
|
||||||
==============
|
----------------------------
|
||||||
|
|
||||||
The semantic changes to list/set/dict comprehensions, and more so to generator
|
Some critics have claimed that the assignment expressions should allow
|
||||||
expressions, may potentially require migration of code. In many cases, the
|
unparenthesized tuples on the right, so that these two would be equivalent::
|
||||||
changes simply make legal what used to raise an exception, but there are some
|
|
||||||
edge cases that were previously legal and now are not, and a few corner cases
|
(point := (x, y))
|
||||||
with altered semantics.
|
(point := x, y)
|
||||||
|
|
||||||
|
(With the current version of the proposal, the latter would be
|
||||||
|
equivalent to ``((point := x), y)``.)
|
||||||
|
|
||||||
|
However, adopting this stance would logically lead to the conclusion
|
||||||
|
that when used in a function call, assignment expressions also bind
|
||||||
|
less tightly than comma, so we'd have the following confusing equivalence::
|
||||||
|
|
||||||
|
foo(x := 1, y)
|
||||||
|
foo(x := (1, y))
|
||||||
|
|
||||||
|
The less confusing option is to make ``:=`` bind more tightly than comma.
|
||||||
|
|
||||||
|
|
||||||
The Outermost Iterable
|
Always requiring parentheses
|
||||||
----------------------
|
----------------------------
|
||||||
|
|
||||||
As of Python 3.7, the outermost iterable in a comprehension is special: it is
|
It's been proposed to just always require parentheses around an
|
||||||
evaluated in the surrounding context, instead of inside the comprehension.
|
assignment expression. This would resolve many ambiguities, and
|
||||||
Thus it is permitted to contain a ``yield`` expression, to use a name also
|
indeed parentheses will frequently be needed to extract the desired
|
||||||
used elsewhere, and to reference names from class scope. Also, in a genexp,
|
subexpression. But in the following cases the extra parentheses feel
|
||||||
the outermost iterable is pre-evaluated, but the rest of the code is not
|
redundant::
|
||||||
touched until the genexp is first iterated over. Class scope is now handled
|
|
||||||
more generally (see above), but if other changes require the old behaviour,
|
|
||||||
the iterable must be explicitly elevated from the comprehension::
|
|
||||||
|
|
||||||
# Python 3.7
|
# Top level in if
|
||||||
def f(x):
|
if match := pattern.match(line):
|
||||||
return [x for x in x if x]
|
return match.group(1)
|
||||||
def g():
|
|
||||||
return [x for x in [(yield 1)]]
|
|
||||||
# With PEP 572
|
|
||||||
def f(x):
|
|
||||||
return [y for y in x if y]
|
|
||||||
def g():
|
|
||||||
sent_item = (yield 1)
|
|
||||||
return [x for x in [sent_item]]
|
|
||||||
|
|
||||||
This more clearly shows that it is g(), not the comprehension, which is able
|
# Short call
|
||||||
to yield values (and is thus a generator function). The entire comprehension
|
len(lines := f.readlines())
|
||||||
is consistently in a single scope.
|
|
||||||
|
|
||||||
The following expressions would, in Python 3.7, raise exceptions immediately.
|
|
||||||
With the removal of the outermost iterable's special casing, they are now
|
|
||||||
equivalent to the most obvious longhand form::
|
|
||||||
|
|
||||||
gen = (x for x in rage(10)) # NameError
|
|
||||||
gen = (x for x in 10) # TypeError (not iterable)
|
|
||||||
gen = (x for x in range(1/0)) # ZeroDivisionError
|
|
||||||
|
|
||||||
def <genexp>():
|
|
||||||
for x in rage(10):
|
|
||||||
yield x
|
|
||||||
gen = <genexp>() # No exception yet
|
|
||||||
tng = next(gen) # NameError
|
|
||||||
|
|
||||||
|
|
||||||
Open questions
|
|
||||||
==============
|
|
||||||
|
|
||||||
Importing names into comprehensions
|
|
||||||
-----------------------------------
|
|
||||||
|
|
||||||
A list comprehension can use and update local names, and they will retain
|
|
||||||
their values from one iteration to another. It would be convenient to use
|
|
||||||
this feature to create rolling or self-effecting data streams::
|
|
||||||
|
|
||||||
progressive_sums = [total := total + value for value in data]
|
|
||||||
|
|
||||||
This will fail with UnboundLocalError due to ``total`` not being initialized.
|
|
||||||
Simply initializing it outside of the comprehension is insufficient - unless
|
|
||||||
the comprehension is in class scope::
|
|
||||||
|
|
||||||
class X:
|
|
||||||
total = 0
|
|
||||||
progressive_sums = [total := total + value for value in data]
|
|
||||||
|
|
||||||
At other scopes, it may be beneficial to have a way to fetch a value from the
|
|
||||||
surrounding scope. Should this be automatic? Should it be controlled with a
|
|
||||||
keyword? Hypothetically (and using no new keywords), this could be written::
|
|
||||||
|
|
||||||
total = 0
|
|
||||||
progressive_sums = [total := total + value
|
|
||||||
import nonlocal total
|
|
||||||
for value in data]
|
|
||||||
|
|
||||||
Translated into longhand, this would become::
|
|
||||||
|
|
||||||
total = 0
|
|
||||||
def <listcomp>(total=total):
|
|
||||||
result = []
|
|
||||||
for value in data:
|
|
||||||
result.append(total := total + value)
|
|
||||||
return result
|
|
||||||
progressive_sums = <listcomp>()
|
|
||||||
|
|
||||||
i.e. utilizing the same early-binding technique that is used at class scope.
|
|
||||||
|
|
||||||
|
|
||||||
Frequently Raised Objections
|
Frequently Raised Objections
|
||||||
|
@ -565,10 +692,6 @@ create externally-visible names. This is no different from ``for`` loops or
|
||||||
other constructs, and can be solved the same way: ``del`` the name once it is
|
other constructs, and can be solved the same way: ``del`` the name once it is
|
||||||
no longer needed, or prefix it with an underscore.
|
no longer needed, or prefix it with an underscore.
|
||||||
|
|
||||||
Names bound within a comprehension are local to that comprehension, even in
|
|
||||||
the outermost iterable, and can thus be used freely without polluting the
|
|
||||||
surrounding namespace.
|
|
||||||
|
|
||||||
(The author wishes to thank Guido van Rossum and Christoph Groth for their
|
(The author wishes to thank Guido van Rossum and Christoph Groth for their
|
||||||
suggestions to move the proposal in this direction. [2]_)
|
suggestions to move the proposal in this direction. [2]_)
|
||||||
|
|
||||||
|
@ -590,11 +713,184 @@ benefit of style guides such as PEP 8, two recommendations are suggested.
|
||||||
Acknowledgements
|
Acknowledgements
|
||||||
================
|
================
|
||||||
|
|
||||||
The author wishes to thank Guido van Rossum and Nick Coghlan for their
|
The authors wish to thank Nick Coghlan and Steven D'Aprano for their
|
||||||
considerable contributions to this proposal, and to members of the
|
considerable contributions to this proposal, and members of the
|
||||||
core-mentorship mailing list for assistance with implementation.
|
core-mentorship mailing list for assistance with implementation.
|
||||||
|
|
||||||
|
|
||||||
|
Appendix A: Tim Peters's findings
|
||||||
|
=================================
|
||||||
|
|
||||||
|
Here's a brief essay Tim Peters wrote on the topic.
|
||||||
|
|
||||||
|
I dislike "busy" lines of code, and also dislike putting conceptually
|
||||||
|
unrelated logic on a single line. So, for example, instead of::
|
||||||
|
|
||||||
|
i = j = count = nerrors = 0
|
||||||
|
|
||||||
|
I prefer::
|
||||||
|
|
||||||
|
i = j = 0
|
||||||
|
count = 0
|
||||||
|
nerrors = 0
|
||||||
|
|
||||||
|
instead. So I suspected I'd find few places I'd want to use
|
||||||
|
assignment expressions. I didn't even consider them for lines already
|
||||||
|
stretching halfway across the screen. In other cases, "unrelated"
|
||||||
|
ruled::
|
||||||
|
|
||||||
|
mylast = mylast[1]
|
||||||
|
yield mylast[0]
|
||||||
|
|
||||||
|
is a vast improvement over the briefer::
|
||||||
|
|
||||||
|
yield (mylast := mylast[1])[0]
|
||||||
|
|
||||||
|
The original two statements are doing entirely different conceptual
|
||||||
|
things, and slamming them together is conceptually insane.
|
||||||
|
|
||||||
|
In other cases, combining related logic made it harder to understand,
|
||||||
|
such as rewriting::
|
||||||
|
|
||||||
|
while True:
|
||||||
|
old = total
|
||||||
|
total += term
|
||||||
|
if old == total:
|
||||||
|
return total
|
||||||
|
term *= mx2 / (i*(i+1))
|
||||||
|
i += 2
|
||||||
|
|
||||||
|
as the briefer::
|
||||||
|
|
||||||
|
while total != (total := total + term):
|
||||||
|
term *= mx2 / (i*(i+1))
|
||||||
|
i += 2
|
||||||
|
return total
|
||||||
|
|
||||||
|
The ``while`` test there is too subtle, crucially relying on strict
|
||||||
|
left-to-right evaluation in a non-short-circuiting or method-chaining
|
||||||
|
context. My brain isn't wired that way.
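
To make the dependence on left-to-right evaluation concrete, here is a runnable sketch of both loops. The essay does not show the surrounding setup, so the initial values below (a Taylor series for ``sin(x)``) and the function names are assumptions made only for illustration:

```python
import math


def sin_longhand(x):
    # Assumed setup: sum x - x**3/3! + x**5/5! - ... until it stops changing.
    mx2 = -x * x
    total, term, i = 0.0, x, 2
    while True:
        old = total
        total += term
        if old == total:
            return total
        term *= mx2 / (i * (i + 1))
        i += 2


def sin_walrus(x):
    mx2 = -x * x
    total, term, i = 0.0, x, 2
    # The left operand (the old total) is evaluated before the right operand
    # rebinds total, so the test compares the old value with the new one.
    while total != (total := total + term):
        term *= mx2 / (i * (i + 1))
        i += 2
    return total
```

Both functions perform the same floating-point operations in the same order, so they return identical results; the difference is only in how much the reader has to know about evaluation order.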
|
||||||
|
|
||||||
|
But cases like that were rare. Name binding is very frequent, and
|
||||||
|
"sparse is better than dense" does not mean "almost empty is better
|
||||||
|
than sparse". For example, I have many functions that return ``None``
|
||||||
|
or ``0`` to communicate "I have nothing useful to return in this case,
|
||||||
|
but since that's expected often I'm not going to annoy you with an
|
||||||
|
exception". This is essentially the same as regular expression search
|
||||||
|
functions returning ``None`` when there is no match. So there was lots
|
||||||
|
of code of the form::
|
||||||
|
|
||||||
|
result = solution(xs, n)
|
||||||
|
if result:
|
||||||
|
# use result
|
||||||
|
|
||||||
|
I find that clearer, and certainly a bit less typing and
|
||||||
|
pattern-matching reading, as::
|
||||||
|
|
||||||
|
if result := solution(xs, n):
|
||||||
|
# use result
|
||||||
|
|
||||||
|
It's also nice to trade away a small amount of horizontal whitespace
|
||||||
|
to get another _line_ of surrounding code on screen. I didn't give
|
||||||
|
much weight to this at first, but it was so very frequent it added up,
|
||||||
|
and I soon enough became annoyed that I couldn't actually run the
|
||||||
|
briefer code. That surprised me!
|
||||||
|
|
||||||
|
There are other cases where assignment expressions really shine.
|
||||||
|
Rather than pick another from my code, Kirill Balunov gave a lovely
|
||||||
|
example from the standard library's ``copy()`` function in ``copy.py``::
|
||||||
|
|
||||||
|
reductor = dispatch_table.get(cls)
|
||||||
|
if reductor:
|
||||||
|
rv = reductor(x)
|
||||||
|
else:
|
||||||
|
reductor = getattr(x, "__reduce_ex__", None)
|
||||||
|
if reductor:
|
||||||
|
rv = reductor(4)
|
||||||
|
else:
|
||||||
|
reductor = getattr(x, "__reduce__", None)
|
||||||
|
if reductor:
|
||||||
|
rv = reductor()
|
||||||
|
else:
|
||||||
|
raise Error("un(shallow)copyable object of type %s" % cls)
|
||||||
|
|
||||||
|
The ever-increasing indentation is semantically misleading: the logic
|
||||||
|
is conceptually flat, "the first test that succeeds wins"::
|
||||||
|
|
||||||
|
if reductor := dispatch_table.get(cls):
|
||||||
|
rv = reductor(x)
|
||||||
|
elif reductor := getattr(x, "__reduce_ex__", None):
|
||||||
|
rv = reductor(4)
|
||||||
|
elif reductor := getattr(x, "__reduce__", None):
|
||||||
|
rv = reductor()
|
||||||
|
else:
|
||||||
|
raise Error("un(shallow)copyable object of type %s" % cls)
|
||||||
|
|
||||||
|
Using easy assignment expressions allows the visual structure of the
|
||||||
|
code to emphasize the conceptual flatness of the logic;
|
||||||
|
ever-increasing indentation obscured it.
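
The same "first test that succeeds wins" shape can be tried in a self-contained form. This sketch is not taken from ``copy.py``; the ``classify`` function and its patterns are hypothetical, and it assumes Python 3.8 or later:

```python
import re


def classify(token):
    # Each branch binds m and tests it in one step; the chain stays flat.
    if m := re.fullmatch(r"\d+", token):
        return ("int", int(m.group()))
    elif m := re.fullmatch(r"\d+\.\d+", token):
        return ("float", float(m.group()))
    elif m := re.fullmatch(r"[A-Za-z_]\w*", token):
        return ("name", m.group())
    else:
        raise ValueError("unrecognized token: %r" % token)
```

Without assignment expressions, each failed match would push the next attempt one indentation level deeper, as in the longhand version above.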
|
||||||
|
|
||||||
|
A smaller example from my code delighted me, both allowing to put
|
||||||
|
inherently related logic in a single line, and allowing to remove an
|
||||||
|
annoying "artificial" indentation level::
|
||||||
|
|
||||||
|
diff = x - x_base
|
||||||
|
if diff:
|
||||||
|
g = gcd(diff, n)
|
||||||
|
if g > 1:
|
||||||
|
return g
|
||||||
|
|
||||||
|
became::
|
||||||
|
|
||||||
|
if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
|
||||||
|
return g
|
||||||
|
|
||||||
|
That ``if`` is about as long as I want my lines to get, but remains easy
|
||||||
|
to follow.
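
A runnable sketch of that one-liner, wrapped in a hypothetical ``find_factor`` function (the wrapper and its ``None`` fallback are assumptions, since the essay shows only the fragment):

```python
from math import gcd


def find_factor(x, x_base, n):
    # diff is bound first; because of short-circuiting, gcd() is only
    # called, and g only bound, when diff is nonzero.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None
```

Note that both assignment expressions need their parentheses here: without them, ``:=`` would capture the whole ``and`` or ``>`` expression instead of just the value being named.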
|
||||||
|
|
||||||
|
So, in all, in most lines binding a name, I wouldn't use assignment
|
||||||
|
expressions, but because that construct is so very frequent, that
|
||||||
|
leaves many places I would. In most of the latter, I found a small
|
||||||
|
win that adds up due to how often it occurs, and in the rest I found a
|
||||||
|
moderate to major win. I'd certainly use it more often than ternary
|
||||||
|
``if``, but significantly less often than augmented assignment.
|
||||||
|
|
||||||
|
A numeric example
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
I have another example that quite impressed me at the time.
|
||||||
|
|
||||||
|
Where all variables are positive integers, and a is at least as large
|
||||||
|
as the n'th root of x, this algorithm returns the floor of the n'th
|
||||||
|
root of x (and roughly doubling the number of accurate bits per
|
||||||
|
iteration)::
|
||||||
|
|
||||||
|
while a > (d := x // a**(n-1)):
|
||||||
|
a = ((n-1)*a + d) // n
|
||||||
|
return a
|
||||||
|
|
||||||
|
It's not obvious why that works, but is no more obvious in the "loop
|
||||||
|
and a half" form. It's hard to prove correctness without building on
|
||||||
|
the right insight (the "arithmetic mean - geometric mean inequality"),
|
||||||
|
and knowing some non-trivial things about how nested floor functions
|
||||||
|
behave. That is, the challenges are in the math, not really in the
|
||||||
|
coding.
|
||||||
|
|
||||||
|
If you do know all that, then the assignment-expression form is easily
|
||||||
|
read as "while the current guess is too large, get a smaller guess",
|
||||||
|
where the "too large?" test and the new guess share an expensive
|
||||||
|
sub-expression.
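
For experimentation, the loop can be wrapped in a function. This is a sketch under stated assumptions: the name ``iroot`` and the choice of starting guess are not from the essay. The guess is a power of two derived from the bit length of ``x``, which is always at least as large as the true root:

```python
def iroot(x, n):
    """Return the floor of the n'th root of x, for integers x >= 1, n >= 1."""
    # Assumed starting guess: x < 2**bits, so 2**ceil(bits / n) >= x**(1/n).
    bits = x.bit_length()
    a = 1 << -(-bits // n)
    # While the current guess is too large, get a smaller guess.
    while a > (d := x // a**(n - 1)):
        a = ((n - 1) * a + d) // n
    return a
```

All arithmetic is on integers, so the result is exact even for inputs far beyond floating-point range.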
|
||||||
|
|
||||||
|
To my eyes, the original form is harder to understand::
|
||||||
|
|
||||||
|
while True:
|
||||||
|
d = x // a**(n-1)
|
||||||
|
if a <= d:
|
||||||
|
break
|
||||||
|
a = ((n-1)*a + d) // n
|
||||||
|
return a
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
==========
|
==========
|
||||||
|
|
||||||
|
|