diff --git a/pep-0572.rst b/pep-0572.rst
index 7f1dae506..c95742654 100644
--- a/pep-0572.rst
+++ b/pep-0572.rst
@@ -170,7 +170,7 @@ be late-bound as usual.
 
 One consequence of this change is that certain bugs in genexps will not
 be detected until the first call to ``next()``, where today they would be
-caught upon creation of the generator. See 'open questions' below.
+caught upon creation of the generator.
 
 
 Recommended use-cases
@@ -179,16 +179,18 @@ Recommended use-cases
 Simplifying list comprehensions
 -------------------------------
 
-These list comprehensions are all approximately equivalent::
+A list comprehension can map and filter efficiently by capturing
+the condition::
+
+    results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
+
+Similarly, a subexpression can be reused within the main expression, by
+giving it a name on first use::
 
     stuff = [[y := f(x), x/y] for x in range(5)]
 
     # There are a number of less obvious ways to spell this in current
-    # versions of Python.
-
-    # External helper function
-    def pair(x, value): return [value, x/value]
-    stuff = [pair(x, f(x)) for x in range(5)]
+    # versions of Python, such as:
 
     # Inline helper function
     stuff = [(lambda y: [y,x/y])(f(x)) for x in range(5)]
@@ -196,36 +198,12 @@ These list comprehensions are all approximately equivalent::
     # Extra 'for' loop - potentially could be optimized internally
     stuff = [[y, x/y] for x in range(5) for y in [f(x)]]
 
-    # Iterating over a genexp
-    stuff = [[y, x/y] for x, y in ((x, f(x)) for x in range(5))]
-
-    # Expanding the comprehension into a loop
-    stuff = []
-    for x in range(5):
-        y = f(x)
-        stuff.append([y, x/y])
-
-    # Wrapping the loop in a generator function
-    def g():
-        for x in range(5):
-            y = f(x)
-            yield [y, x/y]
-    stuff = list(g())
-
     # Using a mutable cache object (various forms possible)
     c = {}
     stuff = [[c.update(y=f(x)) or c['y'], x/c['y']] for x in range(5)]
 
-If calling ``f(x)`` is expensive or has side effects, the clean operation of
-the list comprehension gets muddled. Using a short-duration name binding
-retains the simplicity; while the extra ``for`` loop does achieve this, it
-does so at the cost of dividing the expression visually, putting the named
-part at the end of the comprehension instead of the beginning.
-
-Similarly, a list comprehension can map and filter efficiently by capturing
-the condition::
-
-    results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
+In all cases, the name is local to the comprehension; like iteration variables,
+it cannot leak out into the surrounding context.
 
 
 Capturing condition values
@@ -279,22 +257,14 @@ Alternative spellings
 
 Broadly the same semantics as the current proposal, but spelled differently.
 
-1. ``EXPR as NAME``, with or without parentheses::
+1. ``EXPR as NAME``::
 
        stuff = [[f(x) as y, x/y] for x in range(5)]
 
-   Omitting the parentheses in this form of the proposal introduces many
-   syntactic ambiguities. Requiring them in all contexts leaves open the
-   option to make them optional in specific situations where the syntax is
-   unambiguous (cf generator expressions as sole parameters in function
-   calls), but there is no plausible way to make them optional everywhere.
-
-   With the parentheses, this becomes a viable option, with its own tradeoffs
-   in syntactic ambiguity. Since ``EXPR as NAME`` already has meaning in
-   ``except`` and ``with`` statements (with different semantics), this would
-   create unnecessary confusion or require special-casing (most notably of
-   ``with`` and ``except`` statements, where a nearly-identical syntax has
-   different semantics).
+   Since ``EXPR as NAME`` already has meaning in ``except`` and ``with``
+   statements (with different semantics), this would create unnecessary
+   confusion or require special-casing (eg to forbid assignment within the
+   headers of these statements).
 
 2. ``EXPR -> NAME``::
 
@@ -315,9 +285,7 @@ Broadly the same semantics as the current proposal, but spelled differently.
    This has the advantage that leaked usage can be readily detected, removing
    some forms of syntactic ambiguity. However, this would be the only place
    in Python where a variable's scope is encoded into its name, making
-   refactoring harder. This syntax is quite viable, and could be promoted to
-   become the current recommendation if its advantages are found to outweigh
-   its cost.
+   refactoring harder.
 
 4. Adding a ``where:`` to any statement to create local name bindings::
 
@@ -444,17 +412,26 @@ edge cases that were previously legal and now are not, and a few corner cases
 with altered semantics.
 
 
-Yield inside comprehensions
----------------------------
+The Outermost Iterable
+----------------------
 
-As of Python 3.7, the outermost iterable in a comprehension is permitted to
-contain a 'yield' expression. If this is required, the iterable (or at least
-the yield) must be explicitly elevated from the comprehension::
+As of Python 3.7, the outermost iterable in a comprehension is special: it is
+evaluated in the surrounding context, instead of inside the comprehension.
+Thus it is permitted to contain a ``yield`` expression, to use a name also
+used elsewhere, and to reference names from class scope. Also, in a genexp,
+the outermost iterable is pre-evaluated, but the rest of the code is not
+touched until the genexp is first iterated over. Class scope is now handled
+more generally (see above), but if other changes require the old behaviour,
+the iterable must be explicitly elevated from the comprehension::
 
     # Python 3.7
+    def f(x):
+        return [x for x in x if x]
     def g():
         return [x for x in [(yield 1)]]
     # With PEP 572
+    def f(x):
+        return [y for y in x if y]
     def g():
         sent_item = (yield 1)
         return [x for x in [sent_item]]
@@ -463,57 +440,13 @@ This more clearly shows that it is g(), not the comprehension, which is able
 to yield values (and is thus a generator function). The entire comprehension
 is consistently in a single scope.
 
-
-Name reuse inside comprehensions
----------------------------------
-
-If the same name is used in the outermost iterable and also as an iteration
-variable, this will now raise UnboundLocalError when previously it referred
-to the name in the surrounding scope. Example::
-
-    # Lib/typing.py
-    tvars = []
-    for t in types:
-        if isinstance(t, TypeVar) and t not in tvars:
-            tvars.append(t)
-        if isinstance(t, _GenericAlias) and not t._special:
-            tvars.extend([ty for ty in t.__parameters__ if ty not in tvars])
-
-If the list comprehension uses the name ``t`` rather than ``ty``, this will
-work in Python 3.7 but not with this proposal. As with other unwanted name
-shadowing, the solution is to use distinct names.
-
-
-Name lookups in class scope
----------------------------
-
-A comprehension inside a class previously was able to 'see' class members ONLY
-from the outermost iterable. Other name lookups would ignore the class and
-potentially locate a name at an outer scope::
-
-    pattern = "<%d>"
-    class X:
-        pattern = "[%d]"
-        numbers = [pattern % n for n in range(5)]
-
-In Python 3.7, ``X.numbers`` would show angle brackets; with PEP 572, it would
-show square brackets. Maintaining the current behaviour here is best done by
-using distinct names for the different forms of ``pattern``, as would be the
-case with functions.
-
-
-Generator expression bugs can be caught later
-----------------------------------------------
-
-Certain types of bugs in genexps were previously caught more quickly. Some are
-now detected only at first iteration::
+The following expressions would, in Python 3.7, raise exceptions immediately.
+With the removal of the outermost iterable's special casing, they are now
+equivalent to the most obvious longhand form::
 
     gen = (x for x in rage(10)) # NameError
     gen = (x for x in 10) # TypeError (not iterable)
-    gen = (x for x in range(1/0)) # Exception raised during evaluation
-
-This brings such generator expressions in line with a simple translation to
-function form::
+    gen = (x for x in range(1/0)) # ZeroDivisionError
 
     def ():
         for x in rage(10):
@@ -521,40 +454,10 @@ function form::
     gen = () # No exception yet
     tng = next(gen) # NameError
 
-Detecting these errors more quickly is nontrivial. It is, however, the exact
-same problem as generator functions currently suffer from, and this proposal
-brings the genexp in line with the most natural longhand form.
-
 
 Open questions
 ==============
 
-Can the outermost iterable still be evaluated early?
------------------------------------------------------
-
-As of Python 3.7, the outermost iterable in a genexp is evaluated early, and
-the result passed to the implicit function as an argument. With PEP 572, this
-would no longer be the case. Can we still, somehow, evaluate it before moving
-on? One possible implementation would be::
-
-    gen = (x for x in rage(10))
-    # translates to
-    def ():
-        iterable = iter(rage(10))
-        yield None
-        for x in iterable:
-            yield x
-    gen = ()
-    next(gen)
-
-This would pump the iterable up to just before the loop starts, evaluating
-exactly as much as is evaluated outside the generator function in Py3.7.
-This would result in it being possible to call ``gen.send()`` immediately,
-unlike with most generators, and may incur unnecessary overhead in the
-common case where the iterable is pumped immediately (perhaps as part of a
-larger expression).
-
-
 Importing names into comprehensions
 -----------------------------------
 
@@ -657,9 +560,6 @@ in PEP 8 and/or other style guides.
 2. If using assignment expressions would lead to ambiguity about execution
    order, restructure it to use statements instead.
 
-3. Chaining multiple assignment expressions should generally be avoided.
-   More than one assignment per expression can detract from readability.
-
 
 Acknowledgements
 ================