PEP 572: Massively simplify and remove verbiage

Chris Angelico 2018-04-20 12:53:15 +10:00
parent a58594429e
commit 1a3e52e197
1 changed file with 35 additions and 135 deletions


@@ -170,7 +170,7 @@ be late-bound as usual.
One consequence of this change is that certain bugs in genexps will not
be detected until the first call to ``next()``, where today they would be
caught upon creation of the generator. See 'open questions' below.
caught upon creation of the generator.
Recommended use-cases
@@ -179,16 +179,18 @@ Recommended use-cases
Simplifying list comprehensions
-------------------------------
These list comprehensions are all approximately equivalent::
A list comprehension can map and filter efficiently by capturing
the condition::
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
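As a rough illustration of the map-and-filter form above, the sketch below uses a hypothetical ``f`` and ``input_data`` (neither is defined by this document) and records each call to show that ``f`` runs once per item. It assumes an interpreter that accepts ``:=`` (Python 3.8 or later); in released Python the name ``y`` is bound in the containing scope, which may differ from the scoping described in this draft.

```python
# Hypothetical stand-ins for the f and input_data named in the text.
calls = []

def f(x):
    calls.append(x)  # record each call so the call count can be checked
    return x - 2

input_data = [1, 2, 3, 4]

# The condition captures f(x) as y, so the main expression can reuse it
# without calling f a second time.
results = [(x, y, x / y) for x in input_data if (y := f(x)) > 0]
```

Only the items whose ``f(x)`` is positive survive the filter, and ``calls`` ends up with one entry per input item.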
Similarly, a subexpression can be reused within the main expression, by
giving it a name on first use::
stuff = [[y := f(x), x/y] for x in range(5)]
# There are a number of less obvious ways to spell this in current
# versions of Python.
# External helper function
def pair(x, value): return [value, x/value]
stuff = [pair(x, f(x)) for x in range(5)]
# versions of Python, such as:
# Inline helper function
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
@@ -196,36 +198,12 @@ These list comprehensions are all approximately equivalent::
# Extra 'for' loop - potentially could be optimized internally
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
# Iterating over a genexp
stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
# Expanding the comprehension into a loop
stuff = []
for x in range(5):
    y = f(x)
    stuff.append([y, x/y])
# Wrapping the loop in a generator function
def g():
    for x in range(5):
        y = f(x)
        yield [y, x/y]
stuff = list(g())
# Using a mutable cache object (various forms possible)
c = {}
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
If calling ``f(x)`` is expensive or has side effects, the clean operation of
the list comprehension gets muddled. Using a short-duration name binding
retains the simplicity; while the extra ``for`` loop does achieve this, it
does so at the cost of dividing the expression visually, putting the named
part at the end of the comprehension instead of the beginning.
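The spellings that need no new syntax can be checked against each other directly. The following is a minimal sketch, assuming a hypothetical ``f`` that never returns zero for the inputs used; the variable names are illustrative only.

```python
def f(x):
    # Hypothetical stand-in: never zero for x in range(5), so x/y is safe.
    return x + 1

# Extra 'for' loop over a one-element list
via_extra_for = [[y, x/y] for x in range(5) for y in [f(x)]]

# Iterating over a genexp that pairs each x with f(x)
via_genexp = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]

# Expanding the comprehension into a loop
via_loop = []
for x in range(5):
    y = f(x)
    via_loop.append([y, x/y])

# Wrapping the loop in a generator function
def g():
    for x in range(5):
        y = f(x)
        yield [y, x/y]

via_generator = list(g())

all_equal = via_extra_for == via_genexp == via_loop == via_generator
```

Each form calls ``f`` exactly once per item; they differ only in where the name ``y`` is introduced and how much surrounding code is needed.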
Similarly, a list comprehension can map and filter efficiently by capturing
the condition::
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
In all cases, the name is local to the comprehension; like iteration variables,
it cannot leak out into the surrounding context.
Capturing condition values
@@ -279,22 +257,14 @@ Alternative spellings
Broadly the same semantics as the current proposal, but spelled differently.
1. ``EXPR as NAME``, with or without parentheses::
1. ``EXPR as NAME``::
stuff = [[f(x) as y, x/y] for x in range(5)]
Omitting the parentheses in this form of the proposal introduces many
syntactic ambiguities. Requiring them in all contexts leaves open the
option to make them optional in specific situations where the syntax is
unambiguous (cf generator expressions as sole parameters in function
calls), but there is no plausible way to make them optional everywhere.
With the parentheses, this becomes a viable option, with its own tradeoffs
in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in
``except`` and ``with`` statements (with different semantics), this would
create unnecessary confusion or require special-casing (most notably of
``with`` and ``except`` statements, where a nearly-identical syntax has
different semantics).
Since ``EXPR as NAME`` already has meaning in ``except`` and ``with``
statements (with different semantics), this would create unnecessary
confusion or require special-casing (e.g. to forbid assignment within the
headers of these statements).
2. ``EXPR -> NAME``::
@@ -315,9 +285,7 @@ Broadly the same semantics as the current proposal, but spelled differently.
This has the advantage that leaked usage can be readily detected, removing
some forms of syntactic ambiguity. However, this would be the only place
in Python where a variable's scope is encoded into its name, making
refactoring harder. This syntax is quite viable, and could be promoted to
become the current recommendation if its advantages are found to outweigh
its cost.
refactoring harder.
4. Adding a ``where:`` to any statement to create local name bindings::
@@ -444,17 +412,26 @@ edge cases that were previously legal and now are not, and a few corner cases
with altered semantics.
Yield inside comprehensions
---------------------------
The Outermost Iterable
----------------------
As of Python 3.7, the outermost iterable in a comprehension is permitted to
contain a 'yield' expression. If this is required, the iterable (or at least
the yield) must be explicitly elevated from the comprehension::
As of Python 3.7, the outermost iterable in a comprehension is special: it is
evaluated in the surrounding context, instead of inside the comprehension.
Thus it is permitted to contain a ``yield`` expression, to use a name also
used elsewhere, and to reference names from class scope. Also, in a genexp,
the outermost iterable is pre-evaluated, but the rest of the code is not
touched until the genexp is first iterated over. Class scope is now handled
more generally (see above), but if other changes require the old behaviour,
the iterable must be explicitly elevated from the comprehension::
# Python 3.7
def f(x):
    return [x for x in x if x]
def g():
    return [x for x in [(yield 1)]]
# With PEP 572
def f(x):
    return [y for y in x if y]
def g():
    sent_item = (yield 1)
    return [x for x in [sent_item]]
@@ -463,57 +440,13 @@ This more clearly shows that it is g(), not the comprehension, which is able
to yield values (and is thus a generator function). The entire comprehension
is consistently in a single scope.
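The elevated form can be exercised as an ordinary generator. This is a small sketch of how a caller might drive it; the value sent in (``"hello"``) is arbitrary and chosen only for illustration.

```python
def g():
    # The yield lives in g() itself, not inside the comprehension.
    sent_item = (yield 1)
    return [x for x in [sent_item]]

gen = g()
first = next(gen)  # runs g() up to the yield, which produces 1

try:
    gen.send("hello")  # resumes g(); the comprehension then runs to completion
except StopIteration as stop:
    result = stop.value  # the list returned by g()
```

The returned list travels on the ``StopIteration`` exception, as for any generator function that uses ``return`` with a value.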
Name reuse inside comprehensions
--------------------------------
If the same name is used in the outermost iterable and also as an iteration
variable, this will now raise UnboundLocalError when previously it referred
to the name in the surrounding scope. Example::
# Lib/typing.py
tvars = []
for t in types:
    if isinstance(t, TypeVar) and t not in tvars:
        tvars.append(t)
    if isinstance(t, _GenericAlias) and not t._special:
        tvars.extend([ty for ty in t.__parameters__ if ty not in tvars])
If the list comprehension uses the name ``t`` rather than ``ty``, this will
work in Python 3.7 but not with this proposal. As with other unwanted name
shadowing, the solution is to use distinct names.
Name lookups in class scope
---------------------------
A comprehension inside a class previously was able to 'see' class members ONLY
from the outermost iterable. Other name lookups would ignore the class and
potentially locate a name at an outer scope::
pattern = "<%d>"
class X:
    pattern = "[%d]"
    numbers = [pattern % n for n in range(5)]
In Python 3.7, ``X.numbers`` would show angle brackets; with PEP 572, it would
show square brackets. Maintaining the current behaviour here is best done by
using distinct names for the different forms of ``pattern``, as would be the
case with functions.
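The Python 3.7 side of this comparison can be observed in a released interpreter. The sketch below shows only that existing behaviour, not the behaviour proposed here; the class ``Y`` and its ``limit`` attribute are hypothetical additions that show the outermost iterable being evaluated in class scope.

```python
pattern = "<%d>"

class X:
    pattern = "[%d]"
    # The element expression runs inside the comprehension's own scope,
    # which skips the class body and finds the module-level pattern.
    numbers = [pattern % n for n in range(3)]

class Y:
    limit = 3
    # The outermost iterable is evaluated in the class body, so the
    # class-level name limit is visible here.
    values = [n for n in range(limit)]
```

Here ``X.numbers`` contains angle-bracketed strings, while ``Y.values`` is built from the class attribute.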
Generator expression bugs can be caught later
---------------------------------------------
Certain types of bugs in genexps were previously caught more quickly. Some are
now detected only at first iteration::
The following expressions would, in Python 3.7, raise exceptions immediately.
With the removal of the outermost iterable's special casing, they are now
equivalent to the most obvious longhand form::
gen = (x for x in rage(10)) # NameError
gen = (x for x in 10) # TypeError (not iterable)
gen = (x for x in range(1/0)) # Exception raised during evaluation
This brings such generator expressions in line with a simple translation to
function form::
gen = (x for x in range(1/0)) # ZeroDivisionError
def <genexp>():
    for x in rage(10):
@@ -521,40 +454,10 @@ function form::
gen = <genexp>() # No exception yet
tng = next(gen) # NameError
Detecting these errors more quickly is nontrivial. It is, however, the exact
same problem as generator functions currently suffer from, and this proposal
brings the genexp in line with the most natural longhand form.
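The difference in timing can be seen by comparing a genexp with its longhand generator function in a released interpreter, where the outermost iterable of a genexp is still evaluated on creation. This is a sketch of that existing behaviour only; the ``events`` list is an illustrative device for recording when each error surfaces.

```python
events = []

def make_genexp():
    # The outermost iterable, range(1 // 0), is evaluated when the
    # genexp is created, so the error surfaces immediately.
    return (x for x in range(1 // 0))

def longhand():
    # In a generator function nothing runs until the first next().
    for x in range(1 // 0):
        yield x

try:
    make_genexp()
except ZeroDivisionError:
    events.append("genexp: error at creation")

gen = longhand()
events.append("generator function: created without error")

try:
    next(gen)
except ZeroDivisionError:
    events.append("generator function: error at first next()")
```

Under the semantics described in this section, the genexp would behave like ``longhand()`` and report the error only on first iteration.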
Open questions
==============
Can the outermost iterable still be evaluated early?
----------------------------------------------------
As of Python 3.7, the outermost iterable in a genexp is evaluated early, and
the result passed to the implicit function as an argument. With PEP 572, this
would no longer be the case. Can we still, somehow, evaluate it before moving
on? One possible implementation would be::
gen = (x for x in rage(10))
# translates to
def <genexp>():
    iterable = iter(rage(10))
    yield None
    for x in iterable:
        yield x
gen = <genexp>()
next(gen)
This would pump the iterable up to just before the loop starts, evaluating
exactly as much as is evaluated outside the generator function in Python 3.7.
This would result in it being possible to call ``gen.send()`` immediately,
unlike with most generators, and may incur unnecessary overhead in the
common case where the iterable is pumped immediately (perhaps as part of a
larger expression).
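The pumping idea can be approximated today with an ordinary generator function. The helper ``primed`` below is hypothetical (it is not part of any proposal or library) and takes a zero-argument callable so that the iterable expression is evaluated inside the generator, as it would be under this proposal.

```python
def primed(make_iterable):
    """Hypothetical helper: evaluate the iterable, then pause before the loop."""
    def genexp():
        iterable = iter(make_iterable())  # errors in the iterable surface here
        yield None                        # pause point reached by the priming call
        for x in iterable:
            yield x

    gen = genexp()
    next(gen)  # pump the generator up to just before the loop starts
    return gen

# Normal use: the priming value is consumed, the real items remain.
values = list(primed(lambda: range(3)))

# A broken iterable fails at creation time rather than at first iteration.
try:
    primed(lambda: range(1 // 0))
except ZeroDivisionError:
    failed_early = True
else:
    failed_early = False

# Side effect noted in the text: send() works immediately on a primed generator.
first_sent = primed(lambda: range(3)).send("ignored")
```

The extra ``yield None`` and the priming ``next()`` are the overhead the text refers to; they are paid even when the caller iterates straight away.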
Importing names into comprehensions
-----------------------------------
@@ -657,9 +560,6 @@ in PEP 8 and/or other style guides.
2. If using assignment expressions would lead to ambiguity about
execution order, restructure it to use statements instead.
3. Chaining multiple assignment expressions should generally be avoided.
More than one assignment per expression can detract from readability.
Acknowledgements
================