PEP: 211
Title: Adding A New Outer Product Operator
Version: $Revision$
Author: gvwilson@ddj.com (Greg Wilson)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Created: 15-Jul-2000
Post-History:


Introduction

    This PEP describes a proposal to define "@" (pronounced "across")
    as a new outer product operator in Python 2.2.  When applied to
    sequences (or other iterable objects), this operator will combine
    their iterators, so that:

        for (i, j) in S @ T:
            pass

    will be equivalent to:

        for i in S:
            for j in T:
                pass

    Classes will be able to overload this operator using the special
    methods "__across__", "__racross__", and "__iacross__".  In
    particular, the new Numeric module (PEP 0209) will overload this
    operator for multi-dimensional arrays to implement matrix
    multiplication.
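
    The overloading described above can be tried out on a current
    interpreter with a small sketch.  Everything here is illustrative:
    the class name "Seq" is hypothetical, and because no released
    Python recognizes "__across__", the sketch aliases the proposed
    methods to "__matmul__" and "__rmatmul__", which Python 3.5 and
    later call for "@".  It is not the implementation this PEP would
    require.

```python
# Hypothetical sketch: "Seq" is an illustrative name, not part of the proposal.

class Seq:
    """Wrap an iterable so that '@' yields pairs, as proposed for sequences."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return iter(self.items)

    def __across__(self, other):
        # Outer product: each element of self paired with each element of other.
        return Seq((i, j) for i in self for j in other)

    def __racross__(self, other):
        # Reflected form, used when the left operand does not implement "@".
        return Seq((i, j) for i in other for j in self)

    # Assumption: Python 3.5+ dispatches "@" to __matmul__/__rmatmul__, so
    # aliasing lets the proposed method names run on a current interpreter.
    __matmul__ = __across__
    __rmatmul__ = __racross__


S = Seq([1, 2])
T = Seq("ab")

# The proposed spelling ...
pairs = [(i, j) for (i, j) in S @ T]

# ... and the loop nest it is meant to replace.
nested = []
for i in S:
    for j in T:
        nested.append((i, j))

print(pairs == nested)
```

    The two lists compare equal, which is the equivalence the
    Introduction claims.  A plain list on the left, as in
    "[1, 2] @ T", falls through to the reflected method.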


Background

    Number-crunching is now just a small part of computing, but many
    programmers --- including many Python users --- still need to
    express complex mathematical operations in code.  Most numerical
    languages, such as APL, Fortran-90, MATLAB, IDL, and Mathematica,
    therefore provide two forms of the common arithmetic operators.
    One form works element-by-element, e.g. multiplies corresponding
    elements of its matrix arguments.  The other implements the
    "mathematical" definition of that operation, e.g. performs
    row-column matrix multiplication.

    Zhu and Lielens have proposed doubling up Python's operators in
    this way [1].  Their proposal would create six new binary infix
    operators, and six new in-place operators.

    The original version of this proposal was much more conservative.
    The author consulted the developers of GNU Octave [2], an open
    source clone of MATLAB.  Its developers agreed that providing an
    infix operator for matrix multiplication was important: numerical
    programmers really do care whether they have to write "mmul(A,B)"
    instead of "A op B".

    On the other hand, when asked how important it was to have infix
    operators for matrix solution and other operations, Prof. James
    Rawlings replied [3]:

        I DON'T think it's a must have, and I do a lot of matrix
        inversion.  I cannot remember if its A\b or b\A so I always
        write inv(A)*b instead.  I recommend dropping \.

    Based on this discussion, and feedback from students at the US
    national laboratories and elsewhere, we recommended adding only
    one new operator, for matrix multiplication, to Python.


Iterators

    The planned addition of iterators to Python 2.2 opens up a broader
    scope for this proposal.  As part of the discussion of PEP 0201
    "Lockstep Iteration" [4], the author of this proposal conducted an
    informal usability experiment [5].  The results showed that users
    are psychologically receptive to "cross-product" loop syntax.  For
    example, most users expected:

        S = [10, 20, 30]
        T = [1, 2, 3]
        for x in S; y in T:
            print x+y,

    to print "11 12 13 21 22 23 31 32 33".  We believe that users will
    have the same reaction to:

        for (x, y) in S @ T:
            print x+y

    i.e. that they will naturally interpret this as a tidy way to
    write loop nests.

    This is where iterators come in.  Actually constructing the
    cross-product of two (or more) sequences before executing the loop
    would be very expensive.  On the other hand, "@" could be defined
    to get its arguments' iterators, and then create an outer iterator
    which returns tuples of the values returned by the inner
    iterators.
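
    A minimal sketch of such an outer iterator, written as a generator
    function so that it runs on a current Python 3 interpreter.  The
    name "across" and its signature are assumptions made for
    illustration, and the sketch assumes the inner argument can be
    iterated more than once (a list or tuple, say).

```python
def across(outer, inner):
    """Lazily yield (x, y) for every x in outer and every y in inner.

    Hypothetical helper: `inner` is restarted once per outer value, so it
    is assumed to be re-iterable (e.g. a list or tuple).
    """
    for x in outer:
        for y in inner:
            yield (x, y)


S = [10, 20, 30]
T = [1, 2, 3]

pairs = across(S, T)   # no tuples have been built yet
first = next(pairs)    # only the first tuple exists at this point

# Consume the remaining tuples one at a time; the full cross-product is
# never held in memory by the iterator itself.
sums = [first[0] + first[1]] + [x + y for (x, y) in pairs]
print(sums)
```

    The sums come out in the order "11 12 13 21 22 23 31 32 33", the
    same order users expected in the experiment above.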


Discussion

    1. Adding a named function "across" would have less impact on
       Python than a new infix operator.  However, this would not make
       Python more appealing to numerical programmers, who really do
       care whether they can write matrix multiplication using an
       operator, or whether they have to write it as a function call.

    2. "@" would have to be chainable in the same way as comparison
       operators, i.e.:

           (1, 2) @ (3, 4) @ (5, 6)

       would have to return (1, 3, 5) ... (2, 4, 6), and *not*
       ((1, 3), 5) ... ((2, 4), 6).  This should not require special
       support from the parser, as the outer iterator created by the
       first "@" could easily be taught how to combine itself with
       ordinary iterators.

    3. There would have to be some way to distinguish restartable
       iterators from ones that couldn't be restarted.  For example,
       if S is an input stream (e.g. a file), and L is a list, then "S
       @ L" is straightforward, but "L @ S" is not, since iteration
       through the stream cannot be repeated.  This could be treated
       as an error, or handled by having the outer iterator detect
       non-restartable inner iterators and cache their values.

    4. Whiteboard testing of this proposal in front of three novice
       Python users (all of them experienced programmers) indicates
       that users will expect:

           "ab" @ "cd"

       to return four strings, not four tuples, each a pair of
       characters.  Opinion was divided on what:

           ("a", "b") @ "cd"

       ought to return...
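
    Points 2 and 3 can be sketched together.  The class below is a
    hypothetical illustration, not a proposed implementation: the name
    "Across" is invented, "__matmul__" stands in for the proposed
    "__across__" so that the code runs on Python 3.5 or later, and
    every operand after the first is cached as a tuple up front, which
    is the simpler of the two strategies mentioned in point 3.

```python
class Across:
    """Hypothetical outer iterator supporting flat chaining with '@'."""

    def __init__(self, first):
        # The first operand is walked only once, so it may be a one-shot stream.
        self.first = first
        # Later operands are replayed for every outer value, so they are cached.
        self.rest = []

    def __matmul__(self, other):
        # Stand-in for the proposed __across__: chaining extends the existing
        # product instead of nesting the earlier tuples inside new ones.
        new = Across(self.first)
        new.rest = self.rest + [tuple(other)]
        return new

    def __iter__(self):
        def walk(prefix, remaining):
            # Depth-first expansion: the rightmost operand varies fastest.
            if not remaining:
                yield prefix
                return
            for value in remaining[0]:
                yield from walk(prefix + (value,), remaining[1:])

        for head in self.first:
            yield from walk((head,), self.rest)


chained = Across((1, 2)) @ (3, 4) @ (5, 6)
triples = list(chained)
print(triples[0], triples[-1])

stream = iter(["x", "y"])   # one-shot, like lines read from a file
replayed = list(Across([1, 2]) @ stream)
print(replayed)
```

    The first and last triples are (1, 3, 5) and (2, 4, 6), flat
    rather than nested.  In the second example the one-shot stream on
    the right is exhausted once while being cached, so both outer
    values still see "x" and "y".  Caching trades memory for
    correctness; raising an error instead, the other option in point
    3, would avoid that cost.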


Alternatives

    1. Do nothing --- keep Python simple.

       This is always the default choice.

    2. Add a named function instead of an operator.

       Python is not primarily a numerical language; it may not be worth
       complexifying it for this special case.  However, support for real
       matrix multiplication *is* frequently requested, and the proposed
       semantics for "@" for built-in sequence types would simplify
       expression of a very common idiom (nested loops).

    3. Introduce prefixed forms of all existing operators, such as
       "~*" and "~+", as proposed in PEP 0225 [1].

       Our objections to this are that there isn't enough demand to
       justify the additional complexity (see Rawlings' comments [3]),
       and that the proposed syntax fails the "low toner" readability
       test.


Acknowledgments

    I am grateful to Huaiyu Zhu for initiating this discussion, and to
    James Rawlings and students in various Python courses for their
    discussions of what numerical programmers really care about.


References

    [1] http://python.sourceforge.net/peps/pep-0225.html
    [2] http://bevo.che.wisc.edu/octave/
    [3] http://www.egroups.com/message/python-numeric/4
    [4] http://python.sourceforge.net/peps/pep-0201.html
    [5] http://mail.python.org/pipermail/python-dev/2000-July/006427.html



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: