PEP 572: Wholesale rewrites to bring the document in line with reality
parent f58b37ce8e
commit 731071e222
300
pep-0572.rst
|
@@ -1,5 +1,5 @@
|
|||
PEP: 572
|
||||
Title: Sublocal-Scoped Assignment Expressions
|
||||
Title: Assignment Expressions
|
||||
Author: Chris Angelico <rosuav@gmail.com>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
|
@@ -12,120 +12,64 @@ Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018
|
|||
Abstract
|
||||
========
|
||||
|
||||
This is a proposal for permitting temporary name bindings
|
||||
which are limited to a single statement.
|
||||
This is a proposal for creating a way to assign to names within an expression.
|
||||
Additionally, the precise scope of comprehensions is adjusted, to maintain
|
||||
consistency and follow expectations.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
Programmers generally prefer reusing code rather than duplicating it. When
|
||||
an expression needs to be used twice in quick succession but never again,
|
||||
it is convenient to assign it to a temporary name with sublocal scope.
|
||||
By permitting name bindings to exist within a single statement only, we
|
||||
make this both convenient and safe against name collisions.
|
||||
|
||||
This is particularly notable in list/dict/set comprehensions and generator
|
||||
expressions, where refactoring a subexpression into an assignment statement
|
||||
is not possible. There are currently several ways to create a temporary name
|
||||
binding inside a list comprehension, none of which is universally
|
||||
accepted as ideal. A statement-local name allows any subexpression to be
|
||||
temporarily captured and then used multiple times.
|
||||
|
||||
Additionally, this syntax can in places be used to remove the need to write an
|
||||
infinite loop with a ``break`` in it. Capturing part of a ``while`` loop's
|
||||
condition can improve the clarity of the loop header while still making the
|
||||
actual value available within the loop body.
|
||||
Naming the result of an expression is an important part of programming,
|
||||
allowing a descriptive name to be used in place of a longer expression,
|
||||
and permitting reuse. Currently, this feature is available only in
|
||||
statement form, making it unavailable in list comprehensions and other
|
||||
expression contexts. Merely introducing a way to assign as an expression
|
||||
would create bizarre edge cases around comprehensions, though, and to avoid
|
||||
the worst of the confusions, we change the definition of comprehensions,
|
||||
causing some edge cases to be interpreted differently, but maintaining the
|
||||
existing behaviour in the majority of situations.
|
||||
|
||||
|
||||
Syntax and semantics
|
||||
====================
|
||||
|
||||
In any context where arbitrary Python expressions can be used, a **named
|
||||
expression** can appear. This must be parenthesized for clarity, and is of
|
||||
the form ``(expr as NAME)`` where ``expr`` is any valid Python expression,
|
||||
and ``NAME`` is a simple name.
|
||||
expression** can appear. This can be parenthesized for clarity, and is of
|
||||
the form ``(target := expr)`` where ``expr`` is any valid Python expression,
|
||||
and ``target`` is any valid assignment target.
|
||||
|
||||
The value of such a named expression is the same as the incorporated
|
||||
expression, with the additional side-effect that NAME is bound to that
|
||||
value for the remainder of the current statement. For example::
|
||||
expression, with the additional side-effect that the target is assigned
|
||||
that value.
|
||||
|
||||
# Similar to the boolean 'or' but checking for None specifically
|
||||
x = "default" if (spam().ham as eggs) is None else eggs
|
||||
x = "default" if (eggs := spam().ham) is None else eggs
|
||||
|
||||
# Even complex expressions can be built up piece by piece
|
||||
y = ((spam() as eggs), (eggs.method() as cheese), cheese[eggs])
|
||||
|
||||
Just as function-local names shadow global names for the scope of the
|
||||
function, sublocal names shadow other names for that statement. (This
|
||||
includes other sublocal names.)
|
||||
|
||||
Assignment to sublocal names is ONLY through this syntax. Regular
|
||||
assignment to the same name will remove the sublocal name and
|
||||
affect the name in the surrounding scope (function, class, or module).
|
||||
|
||||
Sublocal names never appear in locals() or globals(), and cannot be
|
||||
closed over by nested functions.
|
||||
|
||||
|
||||
Execution order and its consequences
|
||||
------------------------------------
|
||||
|
||||
Since the sublocal name binding lasts from its point of execution
|
||||
to the end of the current statement, this can potentially cause confusion
|
||||
when the actual order of execution does not match the programmer's
|
||||
expectations. Some examples::
|
||||
|
||||
# A simple statement ends at the newline or semicolon.
|
||||
a = (1 as y)
|
||||
print(y) # NameError
|
||||
|
||||
# The assignment ignores the SLNB - this adds one to 'a'
|
||||
a = (a + 1 as a)
|
||||
|
||||
# Compound statements usually enclose everything...
|
||||
if (re.match(...) as m):
|
||||
print(m.groups(0))
|
||||
print(m) # NameError
|
||||
|
||||
# ... except when function bodies are involved...
|
||||
if (input("> ") as cmd):
|
||||
def run_cmd():
|
||||
print("Running command", cmd) # NameError
|
||||
|
||||
# ... but function *headers* are executed immediately
|
||||
if (input("> ") as cmd):
|
||||
def run_cmd(cmd=cmd): # Capture the value in the default arg
|
||||
print("Running command", cmd) # Works
|
||||
|
||||
Function bodies, in this respect, behave the same way they do in class scope;
|
||||
assigned names are not closed over by method definitions. Defining a function
|
||||
inside a loop already has potentially-confusing consequences, and sublocals
|
||||
do not change the existing situation.
|
||||
y = ((eggs := spam()), (cheese := eggs.method()), cheese[eggs])
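The ``:=`` examples above can be tried directly. The following is a minimal sketch, assuming Python 3.8 or later; ``Spam`` and ``spam()`` are hypothetical stand-ins for the placeholder names in the text, not part of the proposal.

```python
# Hypothetical stand-ins for the spam()/ham placeholders used in the PEP text.
class Spam:
    def __init__(self, ham):
        self.ham = ham


def spam(ham=None):
    return Spam(ham)


# The named expression evaluates to the assigned value and also binds 'eggs'.
x = "default" if (eggs := spam().ham) is None else eggs

# When the attribute is not None, the captured value is reused as the result.
z = "default" if (eggs2 := spam("bacon").ham) is None else eggs2
```

In both statements the name stays bound after the statement ends, which is the behaviour of the ``:=`` form rather than the earlier sublocal form.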
|
||||
|
||||
|
||||
Differences from regular assignment statements
|
||||
----------------------------------------------
|
||||
|
||||
Using ``(EXPR as NAME)`` is similar to ``NAME = EXPR``, but has a number of
|
||||
important distinctions.
|
||||
An assignment statement can assign to multiple targets::
|
||||
|
||||
* Assignment is a statement; sublocal creation is an expression whose value
|
||||
is the same as the object bound to the new name.
|
||||
* Sublocals disappear at the end of their enclosing statement, at which point
|
||||
the name again refers to whatever it previously would have. Sublocals can
|
||||
thus shadow other names without conflict.
|
||||
* Sublocals cannot be closed over by nested functions, and are completely
|
||||
ignored for this purpose.
|
||||
* Sublocals do not appear in ``locals()`` or ``globals()``.
|
||||
* A sublocal cannot be the target of any form of assignment, including
|
||||
augmented. Attempting to do so will remove the sublocal and assign to the
|
||||
fully-scoped name.
|
||||
x = y = z = 0
|
||||
|
||||
In many respects, a sublocal variable is akin to a local variable in an
|
||||
imaginary nested function, except that the overhead of creating and calling
|
||||
a function is bypassed. As with names bound by ``for`` loops inside list
|
||||
comprehensions, sublocal names cannot "leak" into their surrounding scope.
|
||||
To do the same with assignment expressions, they must be parenthesized::
|
||||
|
||||
assert 0 == (x := (y := (z := 0)))
|
||||
|
||||
Augmented assignment is not supported in expression form::
|
||||
|
||||
>>> x +:= 1
|
||||
File "<stdin>", line 1
|
||||
x +:= 1
|
||||
^
|
||||
SyntaxError: invalid syntax
|
||||
|
||||
Otherwise, the semantics of assignment are unchanged by this proposal.
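A short sketch of the differences listed in this section, assuming Python 3.8 or later. It uses simple names only, and checks the augmented spelling with ``compile()`` so that the rejected form does not stop the script.

```python
# Statement form: one value bound to several targets in a single statement.
x = y = z = 0

# Expression form: each binding is its own parenthesized named expression.
assert 0 == (a := (b := (c := 0)))

# An augmented spelling such as "x +:= 1" is rejected at compile time.
try:
    compile("x +:= 1", "<example>", "exec")
except SyntaxError:
    augmented_rejected = True
else:
    augmented_rejected = False
```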
|
||||
|
||||
|
||||
Recommended use-cases
|
||||
|
@@ -169,8 +113,8 @@ These list comprehensions are all approximately equivalent::
|
|||
c = {}
|
||||
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
|
||||
|
||||
# Using a sublocal name
|
||||
stuff = [[(f(x) as y), x/y] for x in range(5)]
|
||||
# Using a temporary name
|
||||
stuff = [[y := f(x), x/y] for x in range(5)]
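To make the cost difference concrete, here is a hedged sketch assuming Python 3.8 or later. The function ``f`` is a hypothetical stand-in that counts its own calls, and it returns ``x + 1`` so the division never hits zero.

```python
def f(x):
    # Hypothetical stand-in for an expensive function; it counts its calls.
    f.calls += 1
    return x + 1


# Calling f(x) twice per element.
f.calls = 0
repeated = [[f(x), x / f(x)] for x in range(5)]
calls_repeated = f.calls

# Capturing the result once per element with an assignment expression.
f.calls = 0
captured = [[(y := f(x)), x / y] for x in range(5)]
calls_captured = f.calls
```

Both lists hold the same values; only the number of calls to ``f`` differs.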
|
||||
|
||||
If calling ``f(x)`` is expensive or has side effects, the clean operation of
|
||||
the list comprehension gets muddled. Using a short-duration name binding
|
||||
|
@@ -182,9 +126,8 @@ part at the end of the comprehension instead of the beginning.
|
|||
Capturing condition values
|
||||
--------------------------
|
||||
|
||||
Since a sublocal created by an assignment expression extends to the full
|
||||
current statement, even a block statement, this can be used to good effect
|
||||
in the header of an ``if`` or ``while`` statement::
|
||||
Assignment expressions can be used to good effect in the header of
|
||||
an ``if`` or ``while`` statement::
|
||||
|
||||
# Current Python, not caring about function return value
|
||||
while input("> ") != "quit":
|
||||
|
@@ -198,17 +141,17 @@ in the header of an ``if`` or ``while`` statement::
|
|||
print("You entered:", command)
|
||||
|
||||
# Proposed alternative to the above
|
||||
while (input("> ") as command) != "quit":
|
||||
while (command := input("> ")) != "quit":
|
||||
print("You entered:", command)
|
||||
|
||||
# Capturing regular expression match objects
|
||||
# See, for instance, Lib/pydoc.py, which uses a multiline spelling
|
||||
# of this effect
|
||||
if (re.search(pat, text) as match):
|
||||
if match := re.search(pat, text):
|
||||
print("Found:", match.group(0))
|
||||
|
||||
# Reading socket data until an empty string is returned
|
||||
while (sock.read() as data):
|
||||
while data := sock.read():
|
||||
print("Received data:", data)
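The ``if`` and ``while`` headers above can be exercised without a real socket. This is a sketch assuming Python 3.8 or later, with an in-memory ``io.StringIO`` standing in for ``sock`` and a made-up pattern and text for the regular expression case.

```python
import io
import re

# An in-memory stream stands in for the socket; read() returns "" at the end.
stream = io.StringIO("alpha\nbeta\ngamma\n")
chunks = []
while data := stream.read(6):
    chunks.append(data)

# Capture the match object in the condition and use it in the body.
found = None
if match := re.search(r"\d+", "order 66"):
    found = match.group(0)
```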
|
||||
|
||||
Particularly with the ``while`` loop, this can remove the need to have an
|
||||
|
@@ -217,71 +160,6 @@ parallel between a loop which simply uses a function call as its condition,
|
|||
and one which uses that as its condition but also uses the actual value.
|
||||
|
||||
|
||||
Preventing temporaries from leaking
|
||||
-----------------------------------
|
||||
|
||||
Inside a class definition, any name assigned to will become a class attribute.
|
||||
Use of a sublocal name binding will prevent temporary variables from becoming
|
||||
public attributes of the class.
|
||||
|
||||
(TODO: Get example)
|
||||
|
||||
|
||||
Performance costs
|
||||
=================
|
||||
|
||||
The cost of sublocals must be kept to a minimum, particularly when they are not
|
||||
used; normal assignment should not be measurably penalized. The reference
|
||||
implementation uses a linked list of sublocal cells, with the absence of such
|
||||
a list being the normal case. This is used for code compilation only; once a
|
||||
function's bytecode has been baked in, execution of that bytecode has no
|
||||
performance cost compared to regular assignment.
|
||||
|
||||
Other Python implementations may choose to do things differently, but a zero
|
||||
run-time cost is strongly recommended, as is a minimal compile-time cost in
|
||||
the case where no sublocal names are used.
|
||||
|
||||
|
||||
Forbidden special cases
|
||||
=======================
|
||||
|
||||
In two situations, the use of SLNBs makes no sense, and could be confusing due
|
||||
to the ``as`` keyword already having a different meaning in the same context.
|
||||
|
||||
1. Exception catching::
|
||||
|
||||
try:
|
||||
...
|
||||
except (Exception as e1) as e2:
|
||||
...
|
||||
|
||||
The expression ``(Exception as e1)`` has the value ``Exception``, and
|
||||
creates an SLNB ``e1 = Exception``. This is generally useless, and creates
|
||||
the potential confusion in that these two statements do quite different
|
||||
things:
|
||||
|
||||
except (Exception as e1):
|
||||
except Exception as e2:
|
||||
|
||||
The latter captures the exception **instance**, while the former captures
|
||||
the ``Exception`` **type** (not the type of the raised exception).
|
||||
|
||||
2. Context managers::
|
||||
|
||||
lock = threading.Lock()
|
||||
with (lock as l) as m:
|
||||
...
|
||||
|
||||
This captures the original Lock object as ``l``, and the result of calling
|
||||
its ``__enter__`` method as ``m``. As with ``except`` statements, this
|
||||
creates a situation in which parenthesizing an expression subtly changes
|
||||
its semantics, with the additional pitfall that this will frequently work
|
||||
(when ``x.__enter__()`` returns x, eg with file objects).
|
||||
|
||||
Both of these are forbidden; creating SLNBs in the headers of these statements
|
||||
will result in a SyntaxError.
|
||||
|
||||
|
||||
Rejected alternative proposals
|
||||
==============================
|
||||
|
||||
|
@@ -295,19 +173,25 @@ Alternative spellings
|
|||
|
||||
Broadly the same semantics as the current proposal, but spelled differently.
|
||||
|
||||
1. ``EXPR as NAME`` without parentheses::
|
||||
1. ``EXPR as NAME``, with or without parentheses::
|
||||
|
||||
stuff = [[f(x) as y, x/y] for x in range(5)]
|
||||
|
||||
Omitting the parentheses from this PEP's proposed syntax introduces many
|
||||
Omitting the parentheses in this form of the proposal introduces many
|
||||
syntactic ambiguities. Requiring them in all contexts leaves open the
|
||||
option to make them optional in specific situations where the syntax is
|
||||
unambiguous (cf generator expressions as sole parameters in function
|
||||
calls), but there is no plausible way to make them optional everywhere.
|
||||
|
||||
With the parentheses, this becomes a viable option, with its own tradeoffs
|
||||
in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in
|
||||
``except`` and ``with`` statements (with different semantics), this would
|
||||
create unnecessary confusion or require special-casing.
|
||||
|
||||
2. Adorning statement-local names with a leading dot::
|
||||
|
||||
stuff = [[(f(x) as .y), x/.y] for x in range(5)]
|
||||
stuff = [[(f(x) as .y), x/.y] for x in range(5)] # with "as"
|
||||
stuff = [[(.y := f(x)), x/.y] for x in range(5)] # with ":="
|
||||
|
||||
This has the advantage that leaked usage can be readily detected, removing
|
||||
some forms of syntactic ambiguity. However, this would be the only place
|
||||
|
@@ -323,7 +207,8 @@ Broadly the same semantics as the current proposal, but spelled differently.
|
|||
|
||||
Execution order is inverted (the indented body is performed first, followed
|
||||
by the "header"). This requires a new keyword, unless an existing keyword
|
||||
is repurposed (most likely ``with:``).
|
||||
is repurposed (most likely ``with:``). See PEP 3150 for prior discussion
|
||||
on this subject (with the proposed keyword being ``given:``).
|
||||
|
||||
|
||||
Special-casing conditional statements
|
||||
|
@@ -395,81 +280,42 @@ any name bindings. The only keyword that can be repurposed to this task is
|
|||
in a statement; alternatively, a new keyword is needed, with all the costs
|
||||
therein.
|
||||
|
||||
Assignment expressions
|
||||
======================
|
||||
|
||||
Rather than creating a statement-local name, these forms of name binding have
|
||||
the exact same semantics as regular assignment: bind to a local name unless
|
||||
there's a ``global`` or ``nonlocal`` declaration.
|
||||
Frequently Raised Objections
|
||||
============================
|
||||
|
||||
Syntax options:
|
||||
Why not just turn existing assignment into an expression?
|
||||
---------------------------------------------------------
|
||||
|
||||
1. ``(EXPR as NAME)`` as per the promoted proposal
|
||||
|
||||
2. C-style ``NAME = EXPR`` in any context
|
||||
|
||||
3. A new and dedicated operator with C-like semantics ``NAME := EXPR``
|
||||
|
||||
The C syntax has been long known to be a bug magnet. The syntactic similarity
|
||||
C and its derivatives define the ``=`` operator as an expression, rather than
|
||||
a statement as is Python's way. This allows assignments in more contexts,
|
||||
including contexts where comparisons are more common. The syntactic similarity
|
||||
between ``if (x == y)`` and ``if (x = y)`` belies their drastically different
|
||||
semantics. While this can be mitigated with good tools, such tools would need
|
||||
to be deployed for Python, and even with perfect tooling, one-character bugs
|
||||
will still happen. Creating a new operator mitigates this, but creates a
|
||||
disconnect between regular assignment statements and these new assignment
|
||||
expressions, or would result in the old syntax being a short-hand usable in
|
||||
certain situations only.
|
||||
|
||||
Regardless of the syntax, all of these have the problem that wide-scope names
|
||||
can be assigned to from an expression. This creates strange edge cases and
|
||||
unexpected behaviour, such as::
|
||||
|
||||
# Name bindings inside list comprehensions usually won't leak
|
||||
x = [(y as local) for z in iter]
|
||||
# But occasionally they will!
|
||||
x = [y for z in (iter as leaky)]
|
||||
|
||||
# Function default arguments are evaluated in the surrounding scope,
|
||||
# not the enclosing scope
|
||||
def x(y = (1 as z)):
|
||||
# z here is closing over the outer variable
|
||||
# z is a regular variable here
|
||||
|
||||
# Assignment targets are evaluated after the values to be assigned
|
||||
x[y] = f((1 as y))
|
||||
|
||||
The same peculiarities can be seen with function calls and global/nonlocal
|
||||
declarations, but will become considerably more likely to occur.
|
||||
semantics. Thus this proposal uses ``:=`` to clarify the distinction.
|
||||
|
||||
|
||||
Other uses of sublocals
|
||||
=======================
|
||||
This could be used to create ugly code!
|
||||
---------------------------------------
|
||||
|
||||
Once sublocal name bindings exist as a concept, they could potentially be
|
||||
used in additional ways.
|
||||
So can anything else. This is a tool, and it is up to the programmer to use it
|
||||
where it makes sense, and not use it where superior constructs can be used.
|
||||
|
||||
|
||||
Exception catching
|
||||
------------------
|
||||
With assignment expressions, why bother with assignment statements?
|
||||
-------------------------------------------------------------------
|
||||
|
||||
Currently, ``except Exception as e:`` binds to a regular (usually local) name,
|
||||
and then unbinds this name. This could be changed to bind to a sublocal name
|
||||
whose scope ends at the end of the except block.
|
||||
|
||||
|
||||
List/set/dict comprehensions
|
||||
----------------------------
|
||||
|
||||
Rather than create an entire function scope, a comprehension could create
|
||||
subscopes for the names it binds to. They would thus be protected against
|
||||
name leakage just as they are today, but without the edge cases around
|
||||
class scope and name references.
|
||||
The two forms have different flexibilities. The ``:=`` operator can be used
|
||||
inside a larger expression; the ``=`` operator can be chained more
|
||||
conveniently, and closely parallels the inline operations ``+=`` and friends.
|
||||
The assignment statement is a clear declaration of intent: this value is to
|
||||
be assigned to this target, and that's it.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [1] Proof of concept / reference implementation
|
||||
(https://github.com/Rosuav/cpython/tree/statement-local-variables)
|
||||
(https://github.com/Rosuav/cpython/tree/assignment-expressions)
|
||||
|
||||
|
||||
Copyright
|
||||
|
|