python-peps/pep-0572.rst

297 lines
11 KiB
ReStructuredText
Raw Normal View History

2018-02-27 17:18:52 -05:00
PEP: 572
Title: Syntax for Statement-Local Name Bindings
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Feb-2018
Python-Version: 3.8
Post-History: 28-Feb-2018
Abstract
========
Programming is all about reusing code rather than duplicating it. When
an expression needs to be used twice in quick succession but never again,
it is convenient to assign it to a temporary name with small scope.
2018-02-27 17:18:52 -05:00
By permitting name bindings to exist within a single statement only, we
2018-02-28 00:22:34 -05:00
make this both convenient and safe against name collisions.
2018-02-27 17:18:52 -05:00
Rationale
=========
2018-02-28 00:22:34 -05:00
When a subexpression is used multiple times in a list comprehension, there
are currently several ways to spell this, none of which is universally
accepted as ideal. A statement-local name allows any subexpression to be
temporarily captured and then used multiple times.
2018-02-27 17:18:52 -05:00
Syntax and semantics
====================
In any context where arbitrary Python expressions can be used, a named
expression can appear. This must be parenthesized for clarity, and is of
2018-02-27 19:43:50 -05:00
the form ``(expr as NAME)`` where ``expr`` is any valid Python expression,
and ``NAME`` is a simple name.
2018-02-27 17:18:52 -05:00
The value of such a named expression is the same as the incorporated
expression, with the additional side-effect that NAME is bound to that
value in all retrievals for the remainder of the current statement.
2018-02-27 17:18:52 -05:00
Just as function-local names shadow global names for the scope of the
function, statement-local names shadow other names for that statement.
They can also shadow each other, though actually doing this should be
strongly discouraged in style guides.
Assignment to statement-local names is ONLY through this syntax. Regular
assignment to the same name will remove the statement-local name and
affect the name in the surrounding scope (function, class, or module).
Execution order and its consequences
------------------------------------
Since the statement-local name binding lasts from its point of execution
to the end of the current statement, this can potentially cause confusion
when the actual order of execution does not match the programmer's
expectations. Some examples::
# A simple statement ends at the newline or semicolon.
a = (1 as y)
print(y) # NameError
# The assignment ignores the SLNB - this adds one to 'a'
a = (a + 1 as a)
# Compound statements usually enclose everything...
if (re.match(...) as m):
print(m.groups(0))
print(m) # NameError
# ... except when function bodies are involved...
if (input("> ") as cmd):
def run_cmd():
print("Running command", cmd) # NameError
# ... but function *headers* are executed immediately
if (input("> ") as cmd):
def run_cmd(cmd=cmd): # Capture the value in the default arg
print("Running command", cmd) # Works
Some of these examples should be considered *bad code* and rejected by code
review and/or linters; they are not, however, illegal.
2018-02-27 17:18:52 -05:00
Example usage
=============
These list comprehensions are all approximately equivalent::
# Calling the function twice
stuff = [[f(x), x/f(x)] for x in range(5)]
2018-02-27 17:18:52 -05:00
2018-02-28 00:22:34 -05:00
# External helper function
def pair(x, value): return [value, x/value]
stuff = [pair(x, f(x)) for x in range(5)]
2018-02-27 17:18:52 -05:00
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
2018-02-27 17:18:52 -05:00
# Extra 'for' loop - see also Serhiy's optimization
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Iterating over a genexp
stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
2018-02-27 17:18:52 -05:00
# Expanding the comprehension into a loop
stuff = []
for x in range(5):
y = f(x)
stuff.append([y, x/y])
2018-02-27 17:18:52 -05:00
# Wrapping the loop in a generator function
def g():
for x in range(5):
y = f(x)
yield [y, y]
stuff = list(g)
2018-02-27 17:18:52 -05:00
# Using a statement-local name
stuff = [[(f(x) as y), x/y] for x in range(5)]
2018-02-27 17:18:52 -05:00
2018-02-27 19:43:50 -05:00
If calling ``f(x)`` is expensive or has side effects, the clean operation of
2018-02-27 17:18:52 -05:00
the list comprehension gets muddled. Using a short-duration name binding
2018-02-27 19:43:50 -05:00
retains the simplicity; while the extra ``for`` loop does achieve this, it
2018-02-27 17:18:52 -05:00
does so at the cost of dividing the expression visually, putting the named
part at the end of the comprehension instead of the beginning.
Statement-local name bindings can be used in any context, but should be
2018-02-27 19:43:50 -05:00
avoided where regular assignment can be used, just as ``lambda`` should be
avoided when ``def`` is an option. As the name's scope extends to the full
current statement, even a block statement, this can be used to good effect
in the header of an ``if`` or ``while`` statement::
while (input("> ") as command) != "quit":
print("You entered:", command)
# See, for instance, Lib/pydoc.py
if (re.search(pat, text) as match):
print("Found:", match.group(0))
while (sock.read() as data):
print("Received data:", data)
Particularly with the ``while`` loop, this can remove the need to have an
infinite loop, an assignment, and a condition. It also creates a smooth
parallel between a loop which simply uses a function call as its condition,
and one which uses that as its condition but also uses the actual value.
2018-02-27 17:18:52 -05:00
Performance costs
=================
The cost of SLNBs must be kept to a minimum, particularly when they are not
used; the normal case MUST NOT be measurably penalized. SLNBs are expected
to be uncommon, and using many of them in a single function should definitely
be discouraged. Thus the current implementation uses a linked list of SLNB
cells, with the absence of such a list being the normal case. This list is
used for code compilation only; once a function's bytecode has been baked in,
execution of that bytecode has no performance cost compared to regular
assignment.
Other Python implementations may choose to do things differently, but a zero
run-time cost is strongly recommended, as is a minimal compile-time cost in
the case where no SLNBs are used.
2018-02-27 17:18:52 -05:00
Open questions
==============
2018-02-27 19:43:50 -05:00
1. What happens if the name has already been used? ``(x, (1 as x), x)``
2018-02-27 17:18:52 -05:00
Currently, prior usage functions as if the named expression did not
exist (following the usual lookup rules); the new name binding will
shadow the other name from the point where it is evaluated until the
end of the statement. Is this acceptable? Should it raise a syntax
error or warning?
2. Syntactic confusion in ``except`` statements. While technically
2018-02-27 17:18:52 -05:00
unambiguous, it is potentially confusing to humans. In Python 3.7,
2018-02-27 19:43:50 -05:00
parenthesizing ``except (Exception as e):`` is illegal, and there is no
2018-02-27 17:18:52 -05:00
reason to capture the exception type (as opposed to the exception
instance, as is done by the regular syntax). Should this be made
outright illegal, to prevent confusion? Can it be left to linters?
It may also (and independently) be of value to use a subscope for the
normal except clause binding, such that ``except Exception as e:`` will
no longer unbind a previous use of the name ``e``.
2018-02-27 17:18:52 -05:00
3. Similar confusion in ``with`` statements, with the difference that there
2018-02-27 17:18:52 -05:00
is good reason to capture the result of an expression, and it is also
2018-02-27 19:43:50 -05:00
very common for ``__enter__`` methods to return ``self``. In many cases,
``with expr as name:`` will do the same thing as ``with (expr as name):``,
2018-02-27 17:18:52 -05:00
adding to the confusion.
Alternative proposals
=====================
Proposals of this nature have come up frequently on python-ideas. Below are
a number of alternative syntaxes, some of them specific to comprehensions,
which have been rejected in favour of the one given above.
1. ``where``, ``let``, ``given``::
stuff = [(y, x/y) where y = f(x) for x in range(5)]
stuff = [(y, x/y) let y = f(x) for x in range(5)]
stuff = [(y, x/y) given y = f(x) for x in range(5)]
This brings the subexpression to a location in between the 'for' loop and
the expression. It introduces an additional language keyword, which creates
conflicts. Of the three, ``where`` reads the most cleanly, but also has the
greatest potential for conflict (eg SQLAlchemy and numpy have ``where``
methods, as does ``tkinter.dnd.Icon`` in the standard library).
2. ``with``::
stuff = [(y, x/y) with y = f(x) for x in range(5)]
As above, but reusing the `with` keyword. Doesn't read too badly, and needs
no additional language keyword. Is restricted to comprehensions, though,
and cannot as easily be transformed into "longhand" for-loop syntax. Has
the C problem that an equals sign in an expression can now create a name
binding, rather than performing a comparison.
3. ``with... as``::
stuff = [(y, x/y) with f(x) as y for x in range(5)]
As per option 2, but using ``as`` in place of the equals sign. Aligns
syntactically with other uses of ``as`` for name binding, but a simple
transformation to for-loop longhand would create drastically different
semantics; the meaning of ``with`` inside a comprehension would be
completely different from the meaning as a stand-alone statement.
4. ``EXPR as NAME`` without parentheses::
stuff = [[f(x) as y, x/y] for x in range(5)]
Omitting the parentheses from this PEP's proposed syntax introduces many
syntactic ambiguities.
5. Adorning statement-local names with a leading dot::
stuff = [[(f(x) as .y), x/.y] for x in range(5)]
This has the advantage that leaked usage can be readily detected, removing
some forms of syntactic ambiguity. However, this would be the only place
in Python where a variable's scope is encoded into its name, making
refactoring harder. This syntax is quite viable, and could be promoted to
become the current recommendation if its advantages are found to outweigh
its cost.
6. Allowing ``(EXPR as NAME)`` to assign to any form of name.
This is exactly the same as the promoted proposal, save that the name is
bound in the same scope that it would otherwise have. Any expression can
assign to any name, just as it would if the ``=`` operator had been used.
Discrepancies in the current implementation
===========================================
1. SLNBs are implemented using a special (and mostly-invisible) name
mangling. This works perfectly inside functions (including list
comprehensions), but not at top level. Making this work at top-level
and in class definitions would be very much of value.
2. The interaction with locals() is currently slightly buggy. Ideally,
statement-local names will appear in locals() while they are active (and
shadow any other names from the same function); as an alternative, they
could simply never appear, even when in scope.
3. Assigning 'through' a SLNB is not implemented yet.
2018-02-27 17:18:52 -05:00
References
==========
.. [1] Proof of concept / reference implementation
(https://github.com/Rosuav/cpython/tree/statement-local-variables)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: