PEP: 3150
Title: Statement local namespaces (aka "given" clause)
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 2010-07-09
Python-Version: 3.3
Post-History: 2010-07-14
Resolution: TBD


Abstract
========

A recurring proposal ([1], [2], [3]) on python-ideas is the addition
of some form of statement local namespace.

This PEP is intended to serve as a focal point for those ideas, so we
can hopefully avoid retreading the same ground a couple of times a
year. Even if the proposal is never accepted, having a PEP to point
people to can be valuable (e.g. having PEP 315 helps greatly in avoiding
endless rehashing of loop-and-a-half arguments).

The ideas in this PEP are just a sketch of a way this concept might work.
They avoid some pitfalls that have been encountered in the past, but
have not themselves been subject to the test of implementation.


PEP Deferral
============

This PEP is currently deferred at least until the language moratorium
(PEP 3003) is officially lifted by Guido. Even after that, it will
require input from at least the four major Python implementations
(CPython, PyPy, Jython, IronPython) on the feasibility of implementing
the proposed semantics to get it moving again. Input from related
projects with a vested interest in Python's syntax (e.g. Cython) will
also be valuable.

That said, if a decision on acceptance or rejection had to be made
immediately, rejection would be far more likely. Unlike the previous
major syntax addition to Python (PEP 343's ``with`` statement), this
PEP has no "killer application" of code that is clearly and obviously
improved through the use of the new syntax. The ``with`` statement (in
conjunction with the generator enhancements in PEP 342) allowed
exception handling to be factored out into context managers in a way
that had never before been possible. Code using the new statement was
not only easier to read, but much easier to write correctly in the
first place.

In the case of this PEP, however, the "Two Ways to Do It" objection is a
strong one. The ability to break out subexpressions of a statement,
without having to worry about name clashes with the rest of a
function or script and without distracting from the operation that is
the ultimate aim of the statement, is potentially nice to have as a
language feature. Even so, it doesn't really provide significant
expressive power over and above what is already possible by assigning
subexpressions to ordinary local variables before the statement of
interest. In particular, explaining to new Python programmers when it
is best to use a ``given`` clause and when to use normal local variables
is likely to be challenging and an unnecessary distraction.

"It might be kinda, sorta, nice to have, sometimes" really isn't a strong
argument for a new syntactic construct (particularly one this complicated).


Proposal
========

This PEP proposes the addition of an optional ``given`` clause to the
syntax for simple statements which may contain an expression. The
current list of simple statements that would be affected by this
addition is as follows:

* expression statement
* assignment statement
* augmented assignment statement
* del statement
* return statement
* yield statement
* raise statement
* assert statement

The ``given`` clause would allow subexpressions to be referenced by
name in the header line, with the actual definitions following in
the indented clause. As a simple example::

    c = sqrt(a*a + b*b) given:
        a = retrieve_a()
        b = retrieve_b()


Rationale
=========

Some past language features (specifically function decorators
and list comprehensions) were motivated, at least in part, by
the desire to give the important parts of a statement more
prominence when reading code. In the case of function decorators,
information such as whether or not a method is a class or static
method can now be found in the function definition rather than
after the function body. List comprehensions similarly take the
expression being assigned to each member of the list and move it
to the beginning of the expression rather than leaving it buried
inside a ``for`` loop.

The rationale for the ``given`` clause is similar. Currently,
breaking out a subexpression requires naming that subexpression
*before* the actual statement of interest. The ``given`` clause
is designed to allow a programmer to highlight for the reader
the statement which is actually of interest (and presumably has
significance for later code) while hiding the most likely irrelevant
"calculation details" inside an indented suite.

Using the simple example from the proposal section, the current Python
equivalent would be::

    a = retrieve_a()
    b = retrieve_b()
    c = sqrt(a*a + b*b)

If later code is only interested in the value of ``c``, then the
details involved in retrieving the values of ``a`` and ``b`` may be an
unnecessary distraction to the reader (particularly if those
details are more complicated than the simple function calls
shown in the example).

To use a more illustrative example (courtesy of Alex Light),
which of the following is easier to comprehend?

Subexpressions up front?::

    sea = water()
    temp = get_temperature(sea)
    depth = get_depth(sea)
    purity = get_purity(sea)
    salinity = get_salinity(sea)
    size = get_size(sea)
    density = get_density(sea)
    desired_property = calc_value(temp, depth, purity,
                                  salinity, size, density)
    # Further operations using desired_property

Or subexpressions indented?::

    desired_property = calc_value(temp, depth, purity,
                                  salinity, size, density) given:
        sea = water()
        temp = get_temperature(sea)
        depth = get_depth(sea)
        purity = get_purity(sea)
        salinity = get_salinity(sea)
        size = get_size(sea)
        density = get_density(sea)
    # Further operations using desired_property

The ``given`` clause may also function as a more readable
alternative to some uses of lambda expressions and similar
constructs when passing one-off functions to operations
like ``sorted``.
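
As a rough illustration of the one-off function case, the sketch below
sorts some made-up records using the two spellings available today: an
inline lambda and a named helper. The data and the names ``records`` and
``_count`` are invented for this example, and the comment only suggests
how the call might read with the proposed clause, which no current
Python accepts.

```python
# Hypothetical sample data: (name, count) pairs.
records = [("pear", 3), ("apple", 7), ("fig", 1)]

# Spelling 1: an inline lambda as the sort key.
by_count_lambda = sorted(records, key=lambda item: item[1])


# Spelling 2: a named helper. It must be defined before the statement
# that uses it, and it stays bound in the enclosing namespace afterwards.
def _count(item):
    return item[1]


by_count_named = sorted(records, key=_count)

# With the proposed clause (not valid Python today) this might read:
#
#     by_count = sorted(records, key=count) given:
#         def count(item):
#             return item[1]

print(by_count_lambda == by_count_named)
```
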
One way to think of the proposed clause is as a middle
ground between normal in-line code and separation of an
operation out into a dedicated function.
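
To make that middle ground concrete, here is a minimal sketch of the two
existing extremes for the example from the Proposal section.
``retrieve_a`` and ``retrieve_b`` are stand-ins with made-up return
values, and ``hypotenuse`` is a name chosen only for this illustration.

```python
from math import sqrt


def retrieve_a():
    # Stand-in for the PEP's retrieve_a(); the value is made up.
    return 3.0


def retrieve_b():
    # Stand-in for the PEP's retrieve_b(); the value is made up.
    return 4.0


# Extreme 1: normal in-line code. The temporaries a and b stay bound
# in the enclosing namespace after c_inline has been computed.
a = retrieve_a()
b = retrieve_b()
c_inline = sqrt(a*a + b*b)


# Extreme 2: a dedicated function. The temporaries are hidden, but the
# operation now needs its own name and sits apart from where it is used.
def hypotenuse():
    a = retrieve_a()
    b = retrieve_b()
    return sqrt(a*a + b*b)


c_function = hypotenuse()
print(c_inline, c_function)
```

The proposed clause would sit between these: the temporaries would be
hidden as in the second form, while the statement of interest would stay
in-line as in the first.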


Keyword Choice
==============

This proposal initially used ``where`` based on the name of a similar
construct in Haskell. However, it has been pointed out that there
are existing Python libraries (such as NumPy [4]) that already use
``where`` in the SQL query condition sense, making that keyword choice
potentially confusing.

While ``given`` may also be used as a variable name (and hence would be
deprecated using the usual ``__future__`` dance for introducing
new keywords), it is associated much more strongly with the desired
"here are some extra variables this expression may use" semantics
for the new clause.

Reusing the ``with`` keyword has also been proposed. This has the
advantage of avoiding the addition of a new keyword, but also has
a high potential for confusion as the ``with`` clause and ``with``
statement would look similar but do completely different things.
That way lies C++ and Perl :)
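
For readers unfamiliar with the ``__future__`` dance mentioned in this
section, the sketch below inspects the record left behind by the
``with`` keyword, which went through that process. It only reads
metadata from the standard ``__future__`` and ``keyword`` modules; no
``given`` feature exists there, and the variable names are chosen for
this illustration.

```python
import __future__
import keyword

# The with statement was introduced behind a __future__ import first.
feature = __future__.with_statement

# (major, minor) of the release where the import first became available,
# and of the release where the keyword was always enabled.
optional_in = feature.getOptionalRelease()[:2]
mandatory_in = feature.getMandatoryRelease()[:2]
print(optional_in, mandatory_in)

# Today "with" is a reserved word, while "given" is an ordinary
# identifier, so existing code is free to use it as a name.
print(keyword.iskeyword("with"), keyword.iskeyword("given"))
given = "still a legal variable name"
```
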


Syntax Change
=============

Current::

    expr_stmt: testlist_star_expr (augassign (yield_expr|testlist) |
                 ('=' (yield_expr|testlist_star_expr))*)
    del_stmt: 'del' exprlist
    return_stmt: 'return' [testlist]
    yield_stmt: yield_expr
    raise_stmt: 'raise' [test ['from' test]]
    assert_stmt: 'assert' test [',' test]

New::

    expr_stmt: testlist_star_expr (augassign (yield_expr|testlist) |
                 ('=' (yield_expr|testlist_star_expr))*) [given_clause]
    del_stmt: 'del' exprlist [given_clause]
    return_stmt: 'return' [testlist] [given_clause]
    yield_stmt: yield_expr [given_clause]
    raise_stmt: 'raise' [test ['from' test]] [given_clause]
    assert_stmt: 'assert' test [',' test] [given_clause]
    given_clause: "given" ":" suite

(Note that expr_stmt in the grammar covers assignment and augmented
assignment in addition to simple expression statements)

The new clause is added as an optional element of the existing statements
rather than as a new kind of compound statement in order to avoid creating
an ambiguity in the grammar. It is applied only to the specific elements
listed so that nonsense like the following is disallowed::

    pass given:
        a = b = 1

However, even this is inadequate, as it creates problems for the definition
of simple_stmt (which allows chaining of multiple single line statements
with ";" rather than "\\n").

So the above syntax change should instead be taken as a statement of intent.
Any actual proposal would need to resolve the simple_stmt parsing problem
before it could be seriously considered. This would likely require a
non-trivial restructuring of the grammar, breaking up small_stmt and
flow_stmt to separate the statements that potentially contain arbitrary
subexpressions and then allowing a single one of those statements with
a ``given`` clause at the simple_stmt level. Something along the lines of::

    stmt: simple_stmt | given_stmt | compound_stmt
    simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
    small_stmt: (pass_stmt | flow_stmt | import_stmt |
                 global_stmt | nonlocal_stmt)
    flow_stmt: break_stmt | continue_stmt
    given_stmt: subexpr_stmt (given_clause |
                  (';' (small_stmt | subexpr_stmt))* [';']) NEWLINE
    subexpr_stmt: expr_stmt | del_stmt | flow_subexpr_stmt | assert_stmt
    flow_subexpr_stmt: return_stmt | raise_stmt | yield_stmt
    given_clause: "given" ":" suite

For reference, here are the current definitions at that level::

    stmt: simple_stmt | compound_stmt
    simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
    small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
                 import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
    flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
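
The simple_stmt chaining problem described in this section can be seen
with the existing parser. The sketch below uses only the standard
``ast`` module and the built-in ``compile``; the source strings and
variable names are made up for illustration. It shows that small
statements may be chained with ";" while a statement that owns an
indented suite may not follow one, which is the position a statement
carrying a ``given`` clause would be in.

```python
import ast

# Simple statements can be chained with ";" on one logical line.
chained = ast.parse("a = 1; b = 2; del a")
statement_types = [type(node).__name__ for node in chained.body]
print(statement_types)

# A statement that owns a suite cannot appear after ";", so any
# statement ending in a suite needs separate handling in the grammar.
try:
    compile("a = 1; if a: pass", "<sketch>", "exec")
except SyntaxError:
    suite_after_semicolon_allowed = False
else:
    suite_after_semicolon_allowed = True
print(suite_after_semicolon_allowed)
```
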


Common Objections
=================

* Two Ways To Do It: a lot of code may now be written with values
  defined either before the expression where they are used or
  afterwards in a ``given`` clause, creating two ways to do it,
  without an obvious way of choosing between them.

* Out of Order Execution: the ``given`` clause makes execution
  jump around a little strangely, as the body of the ``given``
  clause is executed before the simple statement in the clause
  header. The closest any other part of Python comes to this
  is the out of order evaluation in list comprehensions,
  generator expressions and conditional expressions.

These objections should not be dismissed lightly - the proposal
in this PEP needs to be subjected to the test of application to
a large code base (such as the standard library) in a search
for examples where the readability of real world code is genuinely
enhanced.

New PEP 8 guidelines would also need to be developed to provide
appropriate direction on when to use the ``given`` clause over
ordinary variable assignments.
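
The existing out of order evaluation cited in the second objection can
be observed directly. The following sketch records the order in which
subexpressions run; ``note`` and ``order`` are helpers invented for this
illustration.

```python
order = []


def note(label, value):
    # Record that this subexpression was evaluated, then pass it through.
    order.append(label)
    return value


# Conditional expression: the condition in the middle is evaluated
# before the value written to its left.
picked = (note("value-if-true", "yes")
          if note("condition", True)
          else note("value-if-false", "no"))

# List comprehension: the iterable on the right is evaluated before
# the element expression written on the left.
squares = [note("element", n * n) for n in note("iterable", [3])]

print(order)
```

In both cases the text on the left of the expression runs after text
further to the right, which is the same reading order a ``given`` clause
would ask of the reader, only across lines rather than within one.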


Possible Additions
==================

* The current proposal allows the addition of a ``given`` clause only
  for simple statements. Extending the idea to allow the use of
  compound statements would be quite possible, but doing so raises
  serious readability concerns (as values defined in the ``given``
  clause may be used well before they are defined, exactly the kind
  of readability trap that other features like decorators and ``with``
  statements are designed to eliminate).

* Currently only the outermost clause of comprehensions and generator
  expressions can reference the surrounding namespace when executed
  at class level. If this proposal is implemented successfully, the
  associated namespace semantics could allow that restriction to be
  lifted. There would be backwards compatibility implications in doing
  so as existing code may be relying on the behaviour of ignoring
  class level variables, but the idea is worth considering.
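
The class level restriction on comprehensions described in the second
item can be reproduced with a short sketch. The class and attribute
names below are invented for illustration, and ``scale_for_sketch`` is
deliberately a name that exists nowhere else so that the failed lookup
cannot fall through to a global.

```python
class Works:
    values = [1, 2, 3]
    # The outermost iterable is evaluated in the class body itself,
    # so the class level name "values" is visible here.
    doubled = [v * 2 for v in values]


try:
    class Fails:
        values = [1, 2, 3]
        scale_for_sketch = 10
        # The element expression runs in the comprehension's own scope,
        # which skips the class namespace when looking up names.
        scaled = [v * scale_for_sketch for v in values]
except NameError as exc:
    class_scope_error = exc
else:
    class_scope_error = None

print(Works.doubled, type(class_scope_error).__name__)
```
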


Torture Test
============

An implementation of this PEP must support execution of the following
code at module, class and function scope::

    b = {}
    a = b[f(a)] = x given:
        x = 42
        def f(x):
            return x
    assert "x" not in locals()
    assert "f" not in locals()
    assert a == 42
    assert d[42] == 42 given:
        d = b
    assert "d" not in locals()

Most naive implementations will choke on the first complex assignment,
while less naive but still broken implementations will fail when
the torture test is executed at class scope.

And yes, that's a perfectly well-defined assignment statement. Insane,
you might rightly say, but legal::

    >>> def f(x): return x
    ...
    >>> x = 42
    >>> b = {}
    >>> a = b[f(a)] = x
    >>> a
    42
    >>> b
    {42: 42}
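
The reason the chained assignment is well-defined is that the right hand
side is evaluated once and the targets are then assigned from left to
right, so ``a`` is already bound when ``f(a)`` is evaluated. The sketch
below traces that order; ``trace`` and ``TracingDict`` are helpers
invented for this illustration.

```python
trace = []


class TracingDict(dict):
    # A dict that records each item assignment it receives.
    def __setitem__(self, key, value):
        trace.append(("setitem", key, value))
        super().__setitem__(key, value)


def f(x):
    # Record the argument f was called with, then return it unchanged.
    trace.append(("f", x))
    return x


x = 42
b = TracingDict()

# x is evaluated first, then "a" is bound, then the subscript target
# b[f(a)] is evaluated and assigned using the freshly bound "a".
a = b[f(a)] = x

print(trace)
```
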


Possible Implementation Strategy
================================

AKA How Class Scopes Screw You When Attempting To Implement This

The natural idea when setting out to implement this concept is to
use an ordinary nested function scope. This doesn't work for the
two reasons mentioned in the Torture Test section above:

* Non-local variables are not your friend because they ignore class scopes
  and (when writing back to the outer scope) aren't really on speaking
  terms with module scopes either.

* Return-based semantics struggle with complex assignment statements
  like the one in the torture test.
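
Both halves of the first point can be demonstrated with current Python.
In the sketch below, the function, class and variable names are invented
for illustration; the first part shows a nested function skipping the
class namespace, and the second shows that ``nonlocal`` refuses to bind
to a module level name.

```python
def make_reader():
    label = "function scope"

    class Holder:
        label = "class scope"

        def read(self):
            # Name lookup from a nested function skips Holder's
            # namespace and finds the enclosing function's "label".
            return label

    return Holder().read()


seen_by_method = make_reader()

# nonlocal cannot be used to write back to a module level name;
# the compiler rejects the declaration outright.
module_source = (
    "counter = 0\n"
    "def bump():\n"
    "    nonlocal counter\n"
    "    counter += 1\n"
)
try:
    compile(module_source, "<sketch>", "exec")
except SyntaxError:
    nonlocal_reaches_module = False
else:
    nonlocal_reaches_module = True

print(seen_by_method, nonlocal_reaches_module)
```
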

The most promising approach is one based on symtable analysis and
copy-in-copy-out referencing semantics to move any required name
bindings between the inner and outer scopes. The torture test above
would then translate to something like the following::

    b = {}
    def _anon1(b): # 'b' reference copied in
        x = 42
        def f(x):
            return x
        a = b[f(a)] = x
        return a # 'a' reference copied out
    a = _anon1(b)
    assert "x" not in locals()
    assert "f" not in locals()
    assert a == 42
    def _anon2(b): # 'b' reference copied in
        d = b
        assert d[42] == 42
        # Nothing to copy out (not an assignment)
    _anon2(b)
    assert "d" not in locals()

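
As a hedged check of the copy-in-copy-out idea, the same translation can
be written by hand inside a class body, which is the scope where a plain
closure would fail. This is only a manual expansion, not the output of
any real compiler; the class name ``AtClassScope`` and the explicit
``del`` cleanup are assumptions of the sketch, since a real
implementation would never bind the helper names in the first place.

```python
class AtClassScope:
    b = {}

    def _anon1(b):  # 'b' reference copied in as an argument
        x = 42

        def f(x):
            return x

        a = b[f(a)] = x
        return a  # 'a' reference copied out

    a = _anon1(b)
    del _anon1  # hand-written cleanup of the helper name
    assert "x" not in locals()
    assert "f" not in locals()
    assert a == 42

    def _anon2(b):  # 'b' reference copied in
        d = b
        assert d[42] == 42
        # Nothing to copy out (not an assignment)

    _anon2(b)
    del _anon2  # hand-written cleanup of the helper name
    assert "d" not in locals()


print(AtClassScope.a, AtClassScope.b)
```

Passing ``b`` in as an argument is what makes this work at class scope:
the helper never has to look the name up through the class namespace.
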
However, as noted in the abstract, an actual implementation of
this idea has never been tried.


Reference Implementation
========================

None as yet. If you want a crash course in Python namespace
semantics and code compilation, feel free to try ;)


TO-DO
=====

* Mention two-suite in-order variants (and explain why they're even more
  pointless than the specific idea in the PEP)
* Mention PEP 359 and possible uses for locals() in the ``given`` clause
* Define the expected semantics of ``break``, ``continue``, ``return``
  and ``yield`` in a ``given`` clause (i.e. syntax errors at the clause
  level, but allowed inside the appropriate compound statements)
* Define the expected semantics of ``nonlocal`` and ``global`` in the
  ``given`` clause
* Define the name lookup semantics for function definitions in a
  ``given`` clause at function, class and module scope.


References
==========

.. [1] http://mail.python.org/pipermail/python-ideas/2010-June/007476.html

.. [2] http://mail.python.org/pipermail/python-ideas/2010-July/007584.html

.. [3] http://mail.python.org/pipermail/python-ideas/2009-July/005132.html

.. [4] http://mail.python.org/pipermail/python-ideas/2010-July/007596.html


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: