PEP 572: Wholesale rewrites to bring the document in line with reality

Chris Angelico 2018-04-09 11:40:59 +10:00
parent f58b37ce8e
commit 731071e222
1 changed files with 73 additions and 227 deletions

@ -1,5 +1,5 @@
PEP: 572
Title: Sublocal-Scoped Assignment Expressions
Title: Assignment Expressions
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
@ -12,120 +12,64 @@ Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018
Abstract
========
This is a proposal for permitting temporary name bindings
which are limited to a single statement.
This is a proposal for creating a way to assign to names within an expression.
Additionally, the precise scope of comprehensions is adjusted, to maintain
consistency and follow expectations.
Rationale
=========
Programmers generally prefer reusing code rather than duplicating it. When
an expression needs to be used twice in quick succession but never again,
it is convenient to assign it to a temporary name with sublocal scope.
By permitting name bindings to exist within a single statement only, we
make this both convenient and safe against name collisions.
This is particularly notable in list/dict/set comprehensions and generator
expressions, where refactoring a subexpression into an assignment statement
is not possible. There are currently several ways to create a temporary name
binding inside a list comprehension, none of which is universally
accepted as ideal. A statement-local name allows any subexpression to be
temporarily captured and then used multiple times.
Additionally, this syntax can in places be used to remove the need to write an
infinite loop with a ``break`` in it. Capturing part of a ``while`` loop's
condition can improve the clarity of the loop header while still making the
actual value available within the loop body.
Naming the result of an expression is an important part of programming,
allowing a descriptive name to be used in place of a longer expression,
and permitting reuse. Currently, this feature is available only in
statement form, making it unavailable in list comprehensions and other
expression contexts. Merely introducing a way to assign as an expression
would create bizarre edge cases around comprehensions, though. To avoid
the worst of the confusion, we change the definition of comprehensions:
some edge cases are interpreted differently, but the existing behaviour is
maintained in the majority of situations.
Syntax and semantics
====================
In any context where arbitrary Python expressions can be used, a **named
expression** can appear. This must be parenthesized for clarity, and is of
the form ``(expr as NAME)`` where ``expr`` is any valid Python expression,
and ``NAME`` is a simple name.
expression** can appear. This can be parenthesized for clarity, and is of
the form ``(target := expr)`` where ``expr`` is any valid Python expression,
and ``target`` is any valid assignment target.
The value of such a named expression is the same as the incorporated
expression, with the additional side-effect that NAME is bound to that
value for the remainder of the current statement. For example::
expression, with the additional side-effect that the target is assigned
that value.
# Similar to the boolean 'or' but checking for None specifically
x = "default" if (spam().ham as eggs) is None else eggs
x = "default" if (eggs := spam().ham) is None else eggs
# Even complex expressions can be built up piece by piece
y = ((spam() as eggs), (eggs.method() as cheese), cheese[eggs])
Just as function-local names shadow global names for the scope of the
function, sublocal names shadow other names for that statement. (This
includes other sublocal names.)
Assignment to sublocal names is ONLY through this syntax. Regular
assignment to the same name will remove the sublocal name and
affect the name in the surrounding scope (function, class, or module).
Sublocal names never appear in locals() or globals(), and cannot be
closed over by nested functions.
Execution order and its consequences
------------------------------------
Since the sublocal name binding lasts from its point of execution
to the end of the current statement, this can potentially cause confusion
when the actual order of execution does not match the programmer's
expectations. Some examples::
# A simple statement ends at the newline or semicolon.
a = (1 as y)
print(y) # NameError
# The assignment ignores the SLNB - this adds one to 'a'
a = (a + 1 as a)
# Compound statements usually enclose everything...
if (re.match(...) as m):
    print(m.groups(0))
print(m) # NameError
# ... except when function bodies are involved...
if (input("> ") as cmd):
    def run_cmd():
        print("Running command", cmd) # NameError
# ... but function *headers* are executed immediately
if (input("> ") as cmd):
    def run_cmd(cmd=cmd): # Capture the value in the default arg
        print("Running command", cmd) # Works
Function bodies, in this respect, behave the same way they do in class scope;
assigned names are not closed over by method definitions. Defining a function
inside a loop already has potentially-confusing consequences, and sublocals
do not change the existing situation.
y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
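As a runnable sketch of the first example above: this assumes an interpreter that implements ``:=`` (Python 3.8 or later, where the target is restricted to a simple name rather than any assignment target), and uses a hypothetical ``lookup()`` helper in place of ``spam().ham``.

```python
# Hypothetical stand-in for spam().ham: returns None when the key is missing.
def lookup(table, key):
    return table.get(key)

table = {"colour": "red"}

# The named expression evaluates to the looked-up value and, as a side
# effect, binds that same value to 'found'.
x = "default" if (found := lookup(table, "colour")) is None else found
y = "default" if (found := lookup(table, "size")) is None else found

print(x, y)
```

The name ``found`` remains bound after each statement, which is the key difference from the sublocal design this revision replaces.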
Differences from regular assignment statements
----------------------------------------------
Using ``(EXPR as NAME)`` is similar to ``NAME = EXPR``, but has a number of
important distinctions.
An assignment statement can assign to multiple targets::
* Assignment is a statement; sublocal creation is an expression whose value
is the same as the object bound to the new name.
* Sublocals disappear at the end of their enclosing statement, at which point
the name again refers to whatever it previously would have. Sublocals can
thus shadow other names without conflict.
* Sublocals cannot be closed over by nested functions, and are completely
ignored for this purpose.
* Sublocals do not appear in ``locals()`` or ``globals()``.
* A sublocal cannot be the target of any form of assignment, including
augmented. Attempting to do so will remove the sublocal and assign to the
fully-scoped name.
x = y = z = 0
In many respects, a sublocal variable is akin to a local variable in an
imaginary nested function, except that the overhead of creating and calling
a function is bypassed. As with names bound by ``for`` loops inside list
comprehensions, sublocal names cannot "leak" into their surrounding scope.
To do the same with assignment expressions, they must be parenthesized::
assert 0 == (x := (y := (z := 0)))
Augmented assignment is not supported in expression form::
>>> x +:= 1
File "<stdin>", line 1
x +:= 1
^
SyntaxError: invalid syntax
Otherwise, the semantics of assignment are unchanged by this proposal.
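The differences listed here can be checked with a short sketch, again assuming an interpreter that implements ``:=`` (Python 3.8 or later). The use of ``compile()`` to probe for a syntax error is an illustrative choice, not part of the proposal.

```python
# Statement form: one statement, three targets.
a = b = c = 0

# Expression form: each nested assignment has to be parenthesized.
result = (x := (y := (z := 0)))

# Augmented assignment has no expression spelling, so compiling it fails.
try:
    compile("x +:= 1", "<example>", "exec")
except SyntaxError:
    augmented_ok = False
else:
    augmented_ok = True

print(result, x, y, z, augmented_ok)
```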
Recommended use-cases
@ -169,8 +113,8 @@ These list comprehensions are all approximately equivalent::
c = {}
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
# Using a sublocal name
stuff = [[(f(x) as y), x/y] for x in range(5)]
# Using a temporary name
stuff = [[y := f(x), x/y] for x in range(5)]
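A runnable sketch of the temporary-name spelling, assuming an interpreter that implements ``:=`` (Python 3.8 or later). The function ``f`` and the ``calls`` list are hypothetical, added only to show that ``f`` runs once per element; the named expression is parenthesized here for clarity.

```python
calls = []

def f(x):
    # Hypothetical "expensive" function: record each call, then return x + 1.
    calls.append(x)
    return x + 1

# f(x) is evaluated once per element; its result is reused as y.
stuff = [[(y := f(x)), x / y] for x in range(5)]

print(stuff)
print(len(calls))
```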
If calling ``f(x)`` is expensive or has side effects, the clean operation of
the list comprehension gets muddled. Using a short-duration name binding
@ -182,9 +126,8 @@ part at the end of the comprehension instead of the beginning.
Capturing condition values
--------------------------
Since a sublocal created by an assignment expression extends to the full
current statement, even a block statement, this can be used to good effect
in the header of an ``if`` or ``while`` statement::
Assignment expressions can be used to good effect in the header of
an ``if`` or ``while`` statement::
# Current Python, not caring about function return value
while input("> ") != "quit":
@ -198,17 +141,17 @@ in the header of an ``if`` or ``while`` statement::
print("You entered:", command)
# Proposed alternative to the above
while (input("> ") as command) != "quit":
while (command := input("> ")) != "quit":
    print("You entered:", command)
# Capturing regular expression match objects
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
# of this effect
if (re.search(pat, text) as match):
if match := re.search(pat, text):
    print("Found:", match.group(0))
# Reading socket data until an empty string is returned
while (sock.read() as data):
while data := sock.read():
    print("Received data:", data)
Particularly with the ``while`` loop, this can remove the need to have an
@ -217,71 +160,6 @@ parallel between a loop which simply uses a function call as its condition,
and one which uses that as its condition but also uses the actual value.
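The loop and match patterns can be tried without a real socket. The sketch below assumes Python 3.8 or later and substitutes an in-memory ``io.BytesIO`` stream for ``sock``; the chunk size, pattern and sample text are made up for illustration.

```python
import io
import re

# In-memory stand-in for a socket; read() returns b"" once exhausted.
stream = io.BytesIO(b"abcdefghij")

chunks = []
while data := stream.read(4):
    # The value tested in the loop header is available in the body.
    chunks.append(data)

# The match object is captured and tested in a single header.
found = None
if match := re.search(r"\d+", "order 66 confirmed"):
    found = match.group(0)

print(chunks, found)
```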
Preventing temporaries from leaking
-----------------------------------
Inside a class definition, any name assigned to will become a class attribute.
Use of a sublocal name binding will prevent temporary variables from becoming
public attributes of the class.
(TODO: Get example)
Performance costs
=================
The cost of sublocals must be kept to a minimum, particularly when they are not
used; normal assignment should not be measurably penalized. The reference
implementation uses a linked list of sublocal cells, with the absence of such
a list being the normal case. This is used for code compilation only; once a
function's bytecode has been baked in, execution of that bytecode has no
performance cost compared to regular assignment.
Other Python implementations may choose to do things differently, but a zero
run-time cost is strongly recommended, as is a minimal compile-time cost in
the case where no sublocal names are used.
Forbidden special cases
=======================
In two situations, the use of SLNBs makes no sense, and could be confusing due
to the ``as`` keyword already having a different meaning in the same context.
1. Exception catching::
try:
    ...
except (Exception as e1) as e2:
    ...
The expression ``(Exception as e1)`` has the value ``Exception``, and
creates an SLNB ``e1 = Exception``. This is generally useless, and creates
the potential confusion in that these two statements do quite different
things:
except (Exception as e1):
except Exception as e2:
The latter captures the exception **instance**, while the former captures
the ``Exception`` **type** (not the type of the raised exception).
2. Context managers::
lock = threading.Lock()
with (lock as l) as m:
    ...
This captures the original Lock object as ``l``, and the result of calling
its ``__enter__`` method as ``m``. As with ``except`` statements, this
creates a situation in which parenthesizing an expression subtly changes
its semantics, with the additional pitfall that this will frequently work
(when ``x.__enter__()`` returns ``x``, e.g. with file objects).
Both of these are forbidden; creating SLNBs in the headers of these statements
will result in a SyntaxError.
Rejected alternative proposals
==============================
@ -295,19 +173,25 @@ Alternative spellings
Broadly the same semantics as the current proposal, but spelled differently.
1. ``EXPR as NAME`` without parentheses::
1. ``EXPR as NAME``, with or without parentheses::
stuff = [[f(x) as y, x/y] for x in range(5)]
Omitting the parentheses from this PEP's proposed syntax introduces many
Omitting the parentheses in this form of the proposal introduces many
syntactic ambiguities. Requiring them in all contexts leaves open the
option to make them optional in specific situations where the syntax is
unambiguous (cf. generator expressions as sole parameters in function
calls), but there is no plausible way to make them optional everywhere.
With the parentheses, this becomes a viable option, with its own tradeoffs
in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in
``except`` and ``with`` statements (with different semantics), this would
create unnecessary confusion or require special-casing.
2. Adorning statement-local names with a leading dot::
stuff = [[(f(x) as .y), x/.y] for x in range(5)]
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
This has the advantage that leaked usage can be readily detected, removing
some forms of syntactic ambiguity. However, this would be the only place
@ -323,7 +207,8 @@ Broadly the same semantics as the current proposal, but spelled differently.
Execution order is inverted (the indented body is performed first, followed
by the "header"). This requires a new keyword, unless an existing keyword
is repurposed (most likely ``with:``).
is repurposed (most likely ``with:``). See PEP 3150 for prior discussion
on this subject (with the proposed keyword being ``given:``).
Special-casing conditional statements
@ -395,81 +280,42 @@ any name bindings. The only keyword that can be repurposed to this task is
in a statement; alternatively, a new keyword is needed, with all the costs
therein.
Assignment expressions
======================
Rather than creating a statement-local name, these forms of name binding have
the exact same semantics as regular assignment: bind to a local name unless
there's a ``global`` or ``nonlocal`` declaration.
Frequently Raised Objections
============================
Syntax options:
Why not just turn existing assignment into an expression?
---------------------------------------------------------
1. ``(EXPR as NAME)`` as per the promoted proposal
2. C-style ``NAME = EXPR`` in any context
3. A new and dedicated operator with C-like semantics ``NAME := EXPR``
The C syntax has been long known to be a bug magnet. The syntactic similarity
C and its derivatives define the ``=`` operator as an expression, rather than
a statement as is Python's way. This allows assignments in more contexts,
including contexts where comparisons are more common. The syntactic similarity
between ``if (x == y)`` and ``if (x = y)`` belies their drastically different
semantics. While this can be mitigated with good tools, such tools would need
to be deployed for Python, and even with perfect tooling, one-character bugs
will still happen. Creating a new operator mitigates this, but creates a
disconnect between regular assignment statements and these new assignment
expressions, or would result in the old syntax being a short-hand usable in
certain situations only.
Regardless of the syntax, all of these have the problem that wide-scope names
can be assigned to from an expression. This creates strange edge cases and
unexpected behaviour, such as::
# Name bindings inside list comprehensions usually won't leak
x = [(y as local) for z in iter]
# But occasionally they will!
x = [y for z in (iter as leaky)]
# Function default arguments are evaluated in the surrounding scope,
# not the enclosing scope
def x(y = (1 as z)):
# z here is closing over the outer variable
# z is a regular variable here
# Assignment targets are evaluated after the values to be assigned
x[y] = f((1 as y))
The same peculiarities can be seen with function calls and global/nonlocal
declarations, but will become considerably more likely to occur.
semantics. Thus this proposal uses ``:=`` to clarify the distinction.
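The distinction can be observed directly: Python rejects ``=`` inside a condition at compile time, whereas ``:=`` is accepted by interpreters that implement it (Python 3.8 or later). The ``compiles()`` helper below is a hypothetical convenience written for this sketch.

```python
def compiles(source):
    # Return True if the source text compiles, False on a SyntaxError.
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

plain_assignment = compiles("if (x = 1): pass")    # '=' is not an expression
comparison = compiles("if (x == 1): pass")         # ordinary comparison
named_expression = compiles("if (x := 1): pass")   # explicit ':=' spelling

print(plain_assignment, comparison, named_expression)
```

A one-character slip from ``==`` to ``=`` therefore stays a syntax error instead of becoming a silent assignment.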
Other uses of sublocals
=======================
This could be used to create ugly code!
---------------------------------------
Once sublocal name bindings exist as a concept, they could potentially be
used in additional ways.
So can anything else. This is a tool, and it is up to the programmer to use it
where it makes sense, and not use it where superior constructs can be used.
Exception catching
------------------
With assignment expressions, why bother with assignment statements?
-------------------------------------------------------------------
Currently, ``except Exception as e:`` binds to a regular (usually local) name,
and then unbinds this name. This could be changed to bind to a sublocal name
whose scope ends at the end of the except block.
List/set/dict comprehensions
----------------------------
Rather than create an entire function scope, a comprehension could create
subscopes for the names it binds to. They would thus be protected against
name leakage just as they are today, but without the edge cases around
class scope and name references.
The two forms have different flexibilities. The ``:=`` operator can be used
inside a larger expression; the ``=`` operator can be chained more
conveniently, and closely parallels the inline operations ``+=`` and friends.
The assignment statement is a clear declaration of intent: this value is to
be assigned to this target, and that's it.
References
==========
.. [1] Proof of concept / reference implementation
(https://github.com/Rosuav/cpython/tree/statement-local-variables)
(https://github.com/Rosuav/cpython/tree/assignment-expressions)
Copyright