PEP 572: Massively simplify and remove verbiage
parent a58594429e
commit 1a3e52e197
170
pep-0572.rst
|
@ -170,7 +170,7 @@ be late-bound as usual.
|
||||||
|
|
||||||
One consequence of this change is that certain bugs in genexps will not
|
One consequence of this change is that certain bugs in genexps will not
|
||||||
be detected until the first call to ``next()``, where today they would be
|
be detected until the first call to ``next()``, where today they would be
|
||||||
caught upon creation of the generator. See 'open questions' below.
|
caught upon creation of the generator.
|
||||||
|
|
||||||
|
|
||||||
Recommended use-cases
|
Recommended use-cases
|
||||||
|
@ -179,16 +179,18 @@ Recommended use-cases
|
||||||
Simplifying list comprehensions
|
Simplifying list comprehensions
|
||||||
-------------------------------
|
-------------------------------
|
||||||
|
|
||||||
These list comprehensions are all approximately equivalent::
|
A list comprehension can map and filter efficiently by capturing
|
||||||
|
the condition::
|
||||||
|
|
||||||
|
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
|
||||||
|
|
||||||
|
Similarly, a subexpression can be reused within the main expression, by
|
||||||
|
giving it a name on first use::
|
||||||
|
|
||||||
stuff = [[y := f(x), x/y] for x in range(5)]
|
stuff = [[y := f(x), x/y] for x in range(5)]
|
||||||
|
|
||||||
# There are a number of less obvious ways to spell this in current
|
# There are a number of less obvious ways to spell this in current
|
||||||
# versions of Python.
|
# versions of Python, such as:
|
||||||
|
|
||||||
# External helper function
|
|
||||||
def pair(x, value): return [value, x/value]
|
|
||||||
stuff = [pair(x, f(x)) for x in range(5)]
|
|
||||||
|
|
||||||
# Inline helper function
|
# Inline helper function
|
||||||
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
|
stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
|
||||||
|
@ -196,36 +198,12 @@ These list comprehensions are all approximately equivalent::
|
||||||
# Extra 'for' loop - potentially could be optimized internally
|
# Extra 'for' loop - potentially could be optimized internally
|
||||||
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
|
stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
|
||||||
|
|
||||||
# Iterating over a genexp
|
|
||||||
stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
|
|
||||||
|
|
||||||
# Expanding the comprehension into a loop
|
|
||||||
stuff = []
|
|
||||||
for x in range(5):
|
|
||||||
y = f(x)
|
|
||||||
stuff.append([y, x/y])
|
|
||||||
|
|
||||||
# Wrapping the loop in a generator function
|
|
||||||
def g():
|
|
||||||
for x in range(5):
|
|
||||||
y = f(x)
|
|
||||||
yield [y, x/y]
|
|
||||||
stuff = list(g())
|
|
||||||
|
|
||||||
# Using a mutable cache object (various forms possible)
|
# Using a mutable cache object (various forms possible)
|
||||||
c = {}
|
c = {}
|
||||||
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
|
stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
|
||||||
|
|
||||||
If calling ``f(x)`` is expensive or has side effects, the clean operation of
|
In all cases, the name is local to the comprehension; like iteration variables,
|
||||||
the list comprehension gets muddled. Using a short-duration name binding
|
it cannot leak out into the surrounding context.
|
||||||
retains the simplicity; while the extra ``for`` loop does achieve this, it
|
|
||||||
does so at the cost of dividing the expression visually, putting the named
|
|
||||||
part at the end of the comprehension instead of the beginning.
|
|
||||||
|
|
||||||
Similarly, a list comprehension can map and filter efficiently by capturing
|
|
||||||
the condition::
|
|
||||||
|
|
||||||
results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
|
|
||||||
|
|
||||||
|
|
||||||
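The map-and-filter comprehension shown in this hunk can be tried with a minimal sketch like the one below. The function ``f`` and the contents of ``input_data`` are hypothetical stand-ins, and the code needs Python 3.8 or later. Note that this draft describes the captured name as local to the comprehension; in released Python the target is instead bound in the containing scope, which does not affect the result here.

```python
def f(x):
    # Hypothetical stand-in for an expensive computation.
    return x - 2

input_data = [0, 1, 2, 3, 4, 5]

# Filter on f(x) and reuse the captured value in the output expression,
# so f() is called only once per item.
results = [(x, y, x / y) for x in input_data if (y := f(x)) > 0]

# Longhand loop with the same effect.
longhand = []
for x in input_data:
    y = f(x)
    if y > 0:
        longhand.append((x, y, x / y))

assert results == longhand
```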
Capturing condition values
|
Capturing condition values
|
||||||
|
@ -279,22 +257,14 @@ Alternative spellings
|
||||||
|
|
||||||
Broadly the same semantics as the current proposal, but spelled differently.
|
Broadly the same semantics as the current proposal, but spelled differently.
|
||||||
|
|
||||||
1. ``EXPR as NAME``, with or without parentheses::
|
1. ``EXPR as NAME``::
|
||||||
|
|
||||||
stuff = [[f(x) as y, x/y] for x in range(5)]
|
stuff = [[f(x) as y, x/y] for x in range(5)]
|
||||||
|
|
||||||
Omitting the parentheses in this form of the proposal introduces many
|
Since ``EXPR as NAME`` already has meaning in ``except`` and ``with``
|
||||||
syntactic ambiguities. Requiring them in all contexts leaves open the
|
statements (with different semantics), this would create unnecessary
|
||||||
option to make them optional in specific situations where the syntax is
|
confusion or require special-casing (eg to forbid assignment within the
|
||||||
unambiguous (cf generator expressions as sole parameters in function
|
headers of these statements).
|
||||||
calls), but there is no plausible way to make them optional everywhere.
|
|
||||||
|
|
||||||
With the parentheses, this becomes a viable option, with its own tradeoffs
|
|
||||||
in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in
|
|
||||||
``except`` and ``with`` statements (with different semantics), this would
|
|
||||||
create unnecessary confusion or require special-casing (most notably of
|
|
||||||
``with`` and ``except`` statements, where a nearly-identical syntax has
|
|
||||||
different semantics).
|
|
||||||
|
|
||||||
2. ``EXPR -> NAME``::
|
2. ``EXPR -> NAME``::
|
||||||
|
|
||||||
|
@ -315,9 +285,7 @@ Broadly the same semantics as the current proposal, but spelled differently.
|
||||||
This has the advantage that leaked usage can be readily detected, removing
|
This has the advantage that leaked usage can be readily detected, removing
|
||||||
some forms of syntactic ambiguity. However, this would be the only place
|
some forms of syntactic ambiguity. However, this would be the only place
|
||||||
in Python where a variable's scope is encoded into its name, making
|
in Python where a variable's scope is encoded into its name, making
|
||||||
refactoring harder. This syntax is quite viable, and could be promoted to
|
refactoring harder.
|
||||||
become the current recommendation if its advantages are found to outweigh
|
|
||||||
its cost.
|
|
||||||
|
|
||||||
4. Adding a ``where:`` to any statement to create local name bindings::
|
4. Adding a ``where:`` to any statement to create local name bindings::
|
||||||
|
|
||||||
|
@ -444,17 +412,26 @@ edge cases that were previously legal and now are not, and a few corner cases
|
||||||
with altered semantics.
|
with altered semantics.
|
||||||
|
|
||||||
|
|
||||||
Yield inside comprehensions
|
The Outermost Iterable
|
||||||
---------------------------
|
----------------------
|
||||||
|
|
||||||
As of Python 3.7, the outermost iterable in a comprehension is permitted to
|
As of Python 3.7, the outermost iterable in a comprehension is special: it is
|
||||||
contain a 'yield' expression. If this is required, the iterable (or at least
|
evaluated in the surrounding context, instead of inside the comprehension.
|
||||||
the yield) must be explicitly elevated from the comprehension::
|
Thus it is permitted to contain a ``yield`` expression, to use a name also
|
||||||
|
used elsewhere, and to reference names from class scope. Also, in a genexp,
|
||||||
|
the outermost iterable is pre-evaluated, but the rest of the code is not
|
||||||
|
touched until the genexp is first iterated over. Class scope is now handled
|
||||||
|
more generally (see above), but if other changes require the old behaviour,
|
||||||
|
the iterable must be explicitly elevated from the comprehension::
|
||||||
|
|
||||||
# Python 3.7
|
# Python 3.7
|
||||||
|
def f(x):
|
||||||
|
return [x for x in x if x]
|
||||||
def g():
|
def g():
|
||||||
return [x for x in [(yield 1)]]
|
return [x for x in [(yield 1)]]
|
||||||
# With PEP 572
|
# With PEP 572
|
||||||
|
def f(x):
|
||||||
|
return [y for y in x if y]
|
||||||
def g():
|
def g():
|
||||||
sent_item = (yield 1)
|
sent_item = (yield 1)
|
||||||
return [x for x in [sent_item]]
|
return [x for x in [sent_item]]
|
||||||
|
@ -463,57 +440,13 @@ This more clearly shows that it is g(), not the comprehension, which is able
|
||||||
to yield values (and is thus a generator function). The entire comprehension
|
to yield values (and is thus a generator function). The entire comprehension
|
||||||
is consistently in a single scope.
|
is consistently in a single scope.
|
||||||
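As a rough illustration of the elevated form, the sketch below drives ``g()`` by hand. The sent value ``"hello"`` and the names ``first`` and ``result`` are illustrative assumptions, not part of the PEP text.

```python
def g():
    # The yield happens in g() itself, so g() is visibly the generator.
    sent_item = (yield 1)
    return [x for x in [sent_item]]

gen = g()
first = next(gen)          # runs g() up to the yield

try:
    gen.send("hello")      # resumes g(); its return raises StopIteration
except StopIteration as stop:
    result = stop.value    # the list built by the comprehension
```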
|
|
||||||
|
The following expressions would, in Python 3.7, raise exceptions immediately.
|
||||||
Name reuse inside comprehensions
|
With the removal of the outermost iterable's special casing, they are now
|
||||||
--------------------------------
|
equivalent to the most obvious longhand form::
|
||||||
|
|
||||||
If the same name is used in the outermost iterable and also as an iteration
|
|
||||||
variable, this will now raise UnboundLocalError when previously it referred
|
|
||||||
to the name in the surrounding scope. Example::
|
|
||||||
|
|
||||||
# Lib/typing.py
|
|
||||||
tvars = []
|
|
||||||
for t in types:
|
|
||||||
if isinstance(t, TypeVar) and t not in tvars:
|
|
||||||
tvars.append(t)
|
|
||||||
if isinstance(t, _GenericAlias) and not t._special:
|
|
||||||
tvars.extend([ty for ty in t.__parameters__ if ty not in tvars])
|
|
||||||
|
|
||||||
If the list comprehension uses the name ``t`` rather than ``ty``, this will
|
|
||||||
work in Python 3.7 but not with this proposal. As with other unwanted name
|
|
||||||
shadowing, the solution is to use distinct names.
|
|
||||||
|
|
||||||
|
|
||||||
Name lookups in class scope
|
|
||||||
---------------------------
|
|
||||||
|
|
||||||
A comprehension inside a class previously was able to 'see' class members ONLY
|
|
||||||
from the outermost iterable. Other name lookups would ignore the class and
|
|
||||||
potentially locate a name at an outer scope::
|
|
||||||
|
|
||||||
pattern = "<%d>"
|
|
||||||
class X:
|
|
||||||
pattern = "[%d]"
|
|
||||||
numbers = [pattern % n for n in range(5)]
|
|
||||||
|
|
||||||
In Python 3.7, ``X.numbers`` would show angle brackets; with PEP 572, it would
|
|
||||||
show square brackets. Maintaining the current behaviour here is best done by
|
|
||||||
using distinct names for the different forms of ``pattern``, as would be the
|
|
||||||
case with functions.
|
|
||||||
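The behaviour described here as "Python 3.7" can be checked with a short sketch. The ``count`` attribute is an added assumption, included only to show that the outermost iterable does see class members while the rest of the comprehension does not. Released versions of Python kept this behaviour.

```python
pattern = "<%d>"

class X:
    pattern = "[%d]"
    count = 3
    # range(count) is the outermost iterable and is evaluated in class
    # scope; the lookup of 'pattern' skips the class and finds the global.
    numbers = [pattern % n for n in range(count)]
```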
|
|
||||||
|
|
||||||
Generator expression bugs can be caught later
|
|
||||||
---------------------------------------------
|
|
||||||
|
|
||||||
Certain types of bugs in genexps were previously caught more quickly. Some are
|
|
||||||
now detected only at first iteration::
|
|
||||||
|
|
||||||
gen = (x for x in rage(10)) # NameError
|
gen = (x for x in rage(10)) # NameError
|
||||||
gen = (x for x in 10) # TypeError (not iterable)
|
gen = (x for x in 10) # TypeError (not iterable)
|
||||||
gen = (x for x in range(1/0)) # Exception raised during evaluation
|
gen = (x for x in range(1/0)) # ZeroDivisionError
|
||||||
|
|
||||||
This brings such generator expressions in line with a simple translation to
|
|
||||||
function form::
|
|
||||||
|
|
||||||
def <genexp>():
|
def <genexp>():
|
||||||
for x in rage(10):
|
for x in rage(10):
|
||||||
|
@ -521,40 +454,10 @@ function form::
|
||||||
gen = <genexp>() # No exception yet
|
gen = <genexp>() # No exception yet
|
||||||
tng = next(gen) # NameError
|
tng = next(gen) # NameError
|
||||||
|
|
||||||
Detecting these errors more quickly is nontrivial. It is, however, the exact
|
|
||||||
same problem as generator functions currently suffer from, and this proposal
|
|
||||||
brings the genexp in line with the most natural longhand form.
|
|
||||||
|
|
||||||
|
|
||||||
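The timing difference can be observed with the sketch below, which contrasts a generator expression with its longhand generator function. It reflects released CPython, which kept early evaluation of the outermost iterable expression, so the two forms still differ there; the helper names are hypothetical.

```python
def make_genexp():
    # The outermost iterable expression is evaluated when the genexp is created.
    return (x for x in range(1 // 0))

def gen_function():
    # Nothing in the body runs until the first next() call.
    for x in range(1 // 0):
        yield x

try:
    make_genexp()
    genexp_failed_at_creation = False
except ZeroDivisionError:
    genexp_failed_at_creation = True

gen = gen_function()       # no exception yet
try:
    next(gen)
    function_failed_on_next = False
except ZeroDivisionError:
    function_failed_on_next = True
```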
Open questions
|
Open questions
|
||||||
==============
|
==============
|
||||||
|
|
||||||
Can the outermost iterable still be evaluated early?
|
|
||||||
----------------------------------------------------
|
|
||||||
|
|
||||||
As of Python 3.7, the outermost iterable in a genexp is evaluated early, and
|
|
||||||
the result passed to the implicit function as an argument. With PEP 572, this
|
|
||||||
would no longer be the case. Can we still, somehow, evaluate it before moving
|
|
||||||
on? One possible implementation would be::
|
|
||||||
|
|
||||||
gen = (x for x in rage(10))
|
|
||||||
# translates to
|
|
||||||
def <genexp>():
|
|
||||||
iterable = iter(rage(10))
|
|
||||||
yield None
|
|
||||||
for x in iterable:
|
|
||||||
yield x
|
|
||||||
gen = <genexp>()
|
|
||||||
next(gen)
|
|
||||||
|
|
||||||
This would pump the iterable up to just before the loop starts, evaluating
|
|
||||||
exactly as much as is evaluated outside the generator function in Py3.7.
|
|
||||||
This would result in it being possible to call ``gen.send()`` immediately,
|
|
||||||
unlike with most generators, and may incur unnecessary overhead in the
|
|
||||||
common case where the iterable is pumped immediately (perhaps as part of a
|
|
||||||
larger expression).
|
|
||||||
|
|
||||||
|
|
||||||
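The priming translation sketched in this removed section can be approximated in ordinary Python. The helper ``primed`` and its callable argument are hypothetical; this is only one way to model the idea, not the implementation the PEP had in mind.

```python
def primed(make_iterable):
    """Hypothetical helper: evaluate the iterable eagerly, iterate lazily."""
    def genexp():
        iterable = iter(make_iterable())  # evaluated during priming
        yield None                        # priming point
        for x in iterable:
            yield x

    gen = genexp()
    next(gen)  # pump up to the priming yield; errors surface here
    return gen

values = list(primed(lambda: range(3)))

try:
    primed(lambda: range(1 // 0))
    failed_early = False
except ZeroDivisionError:
    failed_early = True
```

Because the generator is already suspended at a ``yield`` when it is returned, ``send()`` with a value is accepted right away, which matches the drawback noted in the text.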
Importing names into comprehensions
|
Importing names into comprehensions
|
||||||
-----------------------------------
|
-----------------------------------
|
||||||
|
|
||||||
|
@ -657,9 +560,6 @@ in PEP 8 and/or other style guides.
|
||||||
2. If using assignment expressions would lead to ambiguity about
|
2. If using assignment expressions would lead to ambiguity about
|
||||||
execution order, restructure it to use statements instead.
|
execution order, restructure it to use statements instead.
|
||||||
|
|
||||||
3. Chaining multiple assignment expressions should generally be avoided.
|
|
||||||
More than one assignment per expression can detract from readability.
|
|
||||||
|
|
||||||
|
|
||||||
Acknowledgements
|
Acknowledgements
|
||||||
================
|
================
|
||||||
|
|