394 lines
15 KiB
ReStructuredText
394 lines
15 KiB
ReStructuredText
PEP: 572
|
||
Title: Syntax for Statement-Local Name Bindings
|
||
Author: Chris Angelico <rosuav@gmail.com>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 28-Feb-2018
|
||
Python-Version: 3.8
|
||
Post-History: 28-Feb-2018, 02-Mar-2018, 23-Mar-2018
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This is a proposal for permitting temporary name bindings
|
||
which are limited to a single statement.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Programmers generally prefer reusing code rather than duplicating it. When
|
||
an expression needs to be used twice in quick succession but never again,
|
||
it is convenient to assign it to a temporary name with small scope.
|
||
By permitting name bindings to exist within a single statement only, we
|
||
make this both convenient and safe against name collisions.
|
||
|
||
This is particularly notable in list/dict/set comprehensions and generator
|
||
expressions, where refactoring a subexpression into an assignment statement
|
||
is not possible. There are currently several ways to create a temporary name
|
||
binding inside a list comprehension, none of which is universally
|
||
accepted as ideal. A statement-local name allows any subexpression to be
|
||
temporarily captured and then used multiple times.
|
||
|
||
Additionally, this syntax can in places be used to remove the need to write an
|
||
infinite loop with a ``break`` in it. Capturing part of a ``while`` loop's
|
||
condition can improve the clarity of the loop header while still making the
|
||
actual value available within the loop body.
|
||
|
||
|
||
Syntax and semantics
|
||
====================
|
||
|
||
In any context where arbitrary Python expressions can be used, a **named
|
||
expression** can appear. This must be parenthesized for clarity, and is of
|
||
the form ``(expr as NAME)`` where ``expr`` is any valid Python expression,
|
||
and ``NAME`` is a simple name.
|
||
|
||
The value of such a named expression is the same as the incorporated
|
||
expression, with the additional side-effect that NAME is bound to that
|
||
value for the remainder of the current statement. For example::
|
||
|
||
# Similar to the boolean 'or' but checking for None specifically
|
||
x = "default" if (spam().ham as eggs) is None else eggs
|
||
|
||
# Even complex expressions can be built up piece by piece
|
||
y = ((spam() as eggs), (eggs.method() as cheese), cheese[eggs])
|
||
|
||
Just as function-local names shadow global names for the scope of the
|
||
function, statement-local names shadow other names for that statement.
|
||
(They can technically also shadow each other, though actually doing this
|
||
should not be encouraged.)
|
||
|
||
Assignment to statement-local names is ONLY through this syntax. Regular
|
||
assignment to the same name will remove the statement-local name and
|
||
affect the name in the surrounding scope (function, class, or module).
|
||
|
||
Statement-local names never appear in locals() or globals(), and cannot be
|
||
closed over by nested functions.
|
||
|
||
|
||
Execution order and its consequences
|
||
------------------------------------
|
||
|
||
Since the statement-local name binding lasts from its point of execution
|
||
to the end of the current statement, this can potentially cause confusion
|
||
when the actual order of execution does not match the programmer's
|
||
expectations. Some examples::
|
||
|
||
# A simple statement ends at the newline or semicolon.
|
||
a = (1 as y)
|
||
print(y) # NameError
|
||
|
||
# The assignment ignores the SLNB - this adds one to 'a'
|
||
a = (a + 1 as a)
|
||
|
||
# Compound statements usually enclose everything...
|
||
if (re.match(...) as m):
|
||
print(m.groups(0))
|
||
print(m) # NameError
|
||
|
||
# ... except when function bodies are involved...
|
||
if (input("> ") as cmd):
|
||
def run_cmd():
|
||
print("Running command", cmd) # NameError
|
||
|
||
# ... but function *headers* are executed immediately
|
||
if (input("> ") as cmd):
|
||
def run_cmd(cmd=cmd): # Capture the value in the default arg
|
||
print("Running command", cmd) # Works
|
||
|
||
Function bodies, in this respect, behave the same way they do in class scope;
|
||
assigned names are not closed over by method definitions. Defining a function
|
||
inside a loop already has potentially-confusing consequences, and SLNBs do not
|
||
materially worsen the existing situation.
|
||
|
||
|
||
Differences from regular assignment statements
|
||
----------------------------------------------
|
||
|
||
Using ``(EXPR as NAME)`` is similar to ``NAME = EXPR``, but has a number of
|
||
important distinctions.
|
||
|
||
* Assignment is a statement; an SLNB is an expression whose value is the same
|
||
as the object bound to the new name.
|
||
* SLNBs disappear at the end of their enclosing statement, at which point the
|
||
name again refers to whatever it previously would have. SLNBs can thus
|
||
shadow other names without conflict (although deliberately doing so will
|
||
often be a sign of bad code).
|
||
* SLNBs cannot be closed over by nested functions, and are completely ignored
|
||
for this purpose.
|
||
* SLNBs do not appear in ``locals()`` or ``globals()``.
|
||
* An SLNB cannot be the target of any form of assignment, including augmented.
|
||
Attempting to do so will remove the SLNB and assign to the fully-scoped name.
|
||
|
||
In many respects, an SLNB is akin to a local variable in an imaginary nested
|
||
function, except that the overhead of creating and calling a function is
|
||
bypassed. As with names bound by ``for`` loops inside list comprehensions,
|
||
SLNBs cannot "leak" into their surrounding scope.
|
||
|
||
|
||
Example usage
|
||
=============
|
||
|
||
These list comprehensions are all approximately equivalent::
|
||
|
||
# Calling the function twice
|
||
stuff = [[f(x), x/f(x)] for x in range(5)]
|
||
|
||
# External helper function
|
||
def pair(x, value): return [value, x/value]
|
||
stuff = [pair(x, f(x)) for x in range(5)]
|
||
|
||
# Inline helper function
|
||
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
|
||
|
||
# Extra 'for' loop - potentially could be optimized internally
|
||
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
|
||
|
||
# Iterating over a genexp
|
||
stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
|
||
|
||
# Expanding the comprehension into a loop
|
||
stuff = []
|
||
for x in range(5):
|
||
y = f(x)
|
||
stuff.append([y, x/y])
|
||
|
||
# Wrapping the loop in a generator function
|
||
def g():
|
||
for x in range(5):
|
||
y = f(x)
|
||
yield [y, x/y]
|
||
stuff = list(g())
|
||
|
||
# Using a statement-local name
|
||
stuff = [[(f(x) as y), x/y] for x in range(5)]
|
||
|
||
If calling ``f(x)`` is expensive or has side effects, the clean operation of
|
||
the list comprehension gets muddled. Using a short-duration name binding
|
||
retains the simplicity; while the extra ``for`` loop does achieve this, it
|
||
does so at the cost of dividing the expression visually, putting the named
|
||
part at the end of the comprehension instead of the beginning.
|
||
|
||
Statement-local name bindings can be used in any context, but should be
|
||
avoided where regular assignment can be used, just as ``lambda`` should be
|
||
avoided when ``def`` is an option. As the name's scope extends to the full
|
||
current statement, even a block statement, this can be used to good effect
|
||
in the header of an ``if`` or ``while`` statement::
|
||
|
||
# Current Python, not caring about function return value
|
||
while input("> ") != "quit":
|
||
print("You entered a command.")
|
||
|
||
# Current Python, capturing return value - four-line loop header
|
||
while True:
|
||
command = input("> ");
|
||
if command == "quit":
|
||
break
|
||
print("You entered:", command)
|
||
|
||
# Proposed alternative to the above
|
||
while (input("> ") as command) != "quit":
|
||
print("You entered:", command)
|
||
|
||
# See, for instance, Lib/pydoc.py
|
||
if (re.search(pat, text) as match):
|
||
print("Found:", match.group(0))
|
||
|
||
while (sock.read() as data):
|
||
print("Received data:", data)
|
||
|
||
Particularly with the ``while`` loop, this can remove the need to have an
|
||
infinite loop, an assignment, and a condition. It also creates a smooth
|
||
parallel between a loop which simply uses a function call as its condition,
|
||
and one which uses that as its condition but also uses the actual value.
|
||
|
||
|
||
Performance costs
|
||
=================
|
||
|
||
The cost of SLNBs must be kept to a minimum, particularly when they are not
|
||
used; the normal case MUST NOT be measurably penalized. SLNBs are expected
|
||
to be uncommon, and using many of them in a single function should definitely
|
||
be discouraged. Thus the current implementation uses a linked list of SLNB
|
||
cells, with the absence of such a list being the normal case. This list is
|
||
used for code compilation only; once a function's bytecode has been baked in,
|
||
execution of that bytecode has no performance cost compared to regular
|
||
assignment.
|
||
|
||
Other Python implementations may choose to do things differently, but a zero
|
||
run-time cost is strongly recommended, as is a minimal compile-time cost in
|
||
the case where no SLNBs are used.
|
||
|
||
|
||
Forbidden special cases
|
||
=======================
|
||
|
||
In two situations, the use of SLNBs makes no sense, and could be confusing due
|
||
to the ``as`` keyword already having a different meaning in the same context.
|
||
|
||
1. Exception catching::
|
||
|
||
try:
|
||
...
|
||
except (Exception as e1) as e2:
|
||
...
|
||
|
||
The expression ``(Exception as e1)`` has the value ``Exception``, and
|
||
creates an SLNB ``e1 = Exception``. This is generally useless, and creates
|
||
the potential confusion in that these two statements do quite different
|
||
things:
|
||
|
||
except (Exception as e1):
|
||
except Exception as e2:
|
||
|
||
The latter captures the exception **instance**, while the former captures
|
||
the ``Exception`` **type** (not the type of the raised exception).
|
||
|
||
2. Context managers::
|
||
|
||
lock = threading.Lock()
|
||
with (lock as l) as m:
|
||
...
|
||
|
||
This captures the original Lock object as ``l``, and the result of calling
|
||
its ``__enter__`` method as ``m``. As with ``except`` statements, this
|
||
creates a situation in which parenthesizing an expression subtly changes
|
||
its semantics, with the additional pitfall that this will frequently work
|
||
(when ``x.__enter__()`` returns x, eg with file objects).
|
||
|
||
Both of these are forbidden; creating SLNBs in the headers of these statements
|
||
will result in a SyntaxError.
|
||
|
||
|
||
Alternative proposals and variants
|
||
==================================
|
||
|
||
Proposals broadly similar to this one have come up frequently on python-ideas.
|
||
Below are a number of alternative syntaxes, some of them specific to
|
||
comprehensions, which have been rejected in favour of the one given above.
|
||
|
||
1. ``where``, ``let``, or ``given``, in comprehensions only::
|
||
|
||
stuff = [(y, x/y) where y = f(x) for x in range(5)]
|
||
stuff = [(y, x/y) let y = f(x) for x in range(5)]
|
||
stuff = [(y, x/y) given y = f(x) for x in range(5)]
|
||
|
||
This brings the subexpression to a location in between the 'for' loop and
|
||
the expression. It introduces an additional language keyword, which creates
|
||
conflicts. Of the three, ``where`` reads the most cleanly, but also has the
|
||
greatest potential for conflict (eg SQLAlchemy and numpy have ``where``
|
||
methods, as does ``tkinter.dnd.Icon`` in the standard library).
|
||
|
||
2. ``with NAME = EXPR``::
|
||
|
||
stuff = [(y, x/y) with y = f(x) for x in range(5)]
|
||
|
||
As above, but reusing the `with` keyword. Doesn't read too badly, and needs
|
||
no additional language keyword. Is restricted to comprehensions, though,
|
||
and cannot as easily be transformed into "longhand" for-loop syntax. Has
|
||
the C problem that an equals sign in an expression can now create a name
|
||
binding, rather than performing a comparison. Would raise the question of
|
||
why "with NAME = EXPR:" cannot be used as a statement on its own.
|
||
|
||
3. ``with EXPR as NAME``::
|
||
|
||
stuff = [(y, x/y) with f(x) as y for x in range(5)]
|
||
|
||
As per option 2, but using ``as`` rather than an equals sign. Aligns
|
||
syntactically with other uses of ``as`` for name binding, but a simple
|
||
transformation to for-loop longhand would create drastically different
|
||
semantics; the meaning of ``with`` inside a comprehension would be
|
||
completely different from the meaning as a stand-alone statement, while
|
||
retaining identical syntax.
|
||
|
||
4. ``EXPR as NAME`` without parentheses::
|
||
|
||
stuff = [[f(x) as y, x/y] for x in range(5)]
|
||
|
||
Omitting the parentheses from this PEP's proposed syntax introduces many
|
||
syntactic ambiguities. Requiring them in all contexts leaves open the
|
||
option to make them optional in specific situations where the syntax is
|
||
unambiguous (cf generator expressions as sole parameters in function
|
||
calls), but there is no plausible way to make them optional everywhere.
|
||
|
||
5. Adorning statement-local names with a leading dot::
|
||
|
||
stuff = [[(f(x) as .y), x/.y] for x in range(5)]
|
||
|
||
This has the advantage that leaked usage can be readily detected, removing
|
||
some forms of syntactic ambiguity. However, this would be the only place
|
||
in Python where a variable's scope is encoded into its name, making
|
||
refactoring harder. This syntax is quite viable, and could be promoted to
|
||
become the current recommendation if its advantages are found to outweigh
|
||
its cost.
|
||
|
||
6. Allowing ``(EXPR as NAME)`` to assign to any form of name.
|
||
|
||
This is exactly the same as the promoted proposal, save that the name is
|
||
bound in the same scope that it would otherwise have. Any expression can
|
||
assign to any name, just as it would if the ``=`` operator had been used.
|
||
Such variables would leak out of the statement into the enclosing function,
|
||
subject to the regular behaviour of comprehensions (since they implicitly
|
||
create a nested function, the name binding would be restricted to the
|
||
comprehension itself, just as with the names bound by ``for`` loops).
|
||
|
||
7. Enhancing ``if`` and ``while`` syntax to permit the capture of their
|
||
conditions::
|
||
|
||
if re.search(pat, text) as match:
|
||
print("Found:", match.group(0))
|
||
|
||
This works beautifully if and ONLY if the desired condition is based on the
|
||
truthiness of the captured value. It is thus effective for specific
|
||
use-cases (regex matches, socket reads that return `''` when done), and
|
||
completely useless in more complicated cases (eg where the condition is
|
||
``f(x) < 0`` and you want to capture the value of ``f(x)``). It also has
|
||
no benefit to list comprehensions.
|
||
|
||
8. Adding a ``where:`` to any statement to create local name bindings::
|
||
|
||
value = x**2 + 2*x where:
|
||
x = spam(1, 4, 7, q)
|
||
|
||
Execution order is inverted (the indented body is performed first, followed
|
||
by the "header"). This requires a new keyword, unless an existing keyword
|
||
is repurposed (most likely ``with:``).
|
||
|
||
|
||
Discrepancies in the current implementation
|
||
===========================================
|
||
|
||
1. SLNBs are implemented using a special (and mostly-invisible) name
|
||
mangling. They may sometimes appear in globals() and/or locals() with
|
||
their simple or mangled names (but buggily and unreliably). They should
|
||
be suppressed as though they were guinea pigs.
|
||
|
||
2. The forbidden special cases do not yet raise SyntaxError.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Proof of concept / reference implementation
|
||
(https://github.com/Rosuav/cpython/tree/statement-local-variables)
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|