PEP: 211
Title: Adding A New Outer Product Operator
Version: $Revision$
Last-Modified: $Date$
Author: gvwilson@ddj.com (Greg Wilson)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Created: 15-Jul-2000
Post-History:


Introduction

This PEP describes a proposal to define "@" (pronounced "across")
as a new outer product operator in Python 2.2.  When applied to
sequences (or other iterable objects), this operator will combine
their iterators, so that:

    for (i, j) in S @ T:
        pass

will be equivalent to:

    for i in S:
        for j in T:
            pass

Classes will be able to overload this operator using the special
methods "__across__", "__racross__", and "__iacross__".  In
particular, the new Numeric module (PEP 0209) will overload this
operator for multi-dimensional arrays to implement matrix
multiplication.


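The equivalence stated in this section can be checked with a small
sketch.  It uses the standard library's itertools.product as a
stand-in for the proposed "S @ T"; that function is not the operator
this PEP proposes, and the sample values of S and T are assumptions
chosen only for illustration.  The sketch is written for a current
Python 3 interpreter.

```python
from itertools import product

# Sample inputs (illustrative values only).
S = [10, 20, 30]
T = [1, 2, 3]

# The explicit loop nest from the text.
nested = []
for i in S:
    for j in T:
        nested.append((i, j))

# Stand-in for "for (i, j) in S @ T": product() yields the same pairs,
# with the rightmost iterable varying fastest.
crossed = []
for (i, j) in product(S, T):
    crossed.append((i, j))

print(crossed == nested)
```

Note that itertools.product reads each input completely before it
yields its first pair, so it only approximates the iterator-based
behaviour described here.
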
Background

Number-crunching is now just a small part of computing, but many
programmers --- including many Python users --- still need to
express complex mathematical operations in code.  Most numerical
languages, such as APL, Fortran-90, MATLAB, IDL, and Mathematica,
therefore provide two forms of the common arithmetic operators.
One form works element-by-element, e.g. multiplies corresponding
elements of its matrix arguments.  The other implements the
"mathematical" definition of that operation, e.g. performs
row-column matrix multiplication.

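The difference between the two forms of multiplication can be made
concrete with a minimal sketch on nested lists.  The function names
elementwise_mul and matrix_mul are hypothetical helpers introduced
for this illustration; they are not taken from Numeric or any other
library.

```python
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]


def elementwise_mul(a, b):
    """Multiply corresponding elements of two equally shaped matrices."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def matrix_mul(a, b):
    """Row-column product: result[i][j] = sum over k of a[i][k] * b[k][j]."""
    cols_b = list(zip(*b))  # columns of b, so each can be paired with a row of a
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


print(elementwise_mul(A, B))  # [[5, 12], [21, 32]]
print(matrix_mul(A, B))       # [[19, 22], [43, 50]]
```
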
Zhu and Lielens have proposed doubling up Python's operators in
this way [1].  Their proposal would create six new binary infix
operators, and six new in-place operators.

The original version of this proposal was much more conservative.
The author consulted the developers of GNU Octave [2], an open
source clone of MATLAB.  Its developers agreed that providing an
infix operator for matrix multiplication was important: numerical
programmers really do care whether they have to write "mmul(A,B)"
instead of "A op B".

On the other hand, when asked how important it was to have infix
operators for matrix solution and other operations, Prof. James
Rawlings replied [3]:

    I DON'T think it's a must have, and I do a lot of matrix
    inversion.  I cannot remember if its A\b or b\A so I always
    write inv(A)*b instead.  I recommend dropping \.

Based on this discussion, and feedback from students at the US
national laboratories and elsewhere, we recommended adding only
one new operator, for matrix multiplication, to Python.


Iterators

The planned addition of iterators to Python 2.2 opens up a broader
scope for this proposal.  As part of the discussion of PEP 201,
Lockstep Iteration [4], the author of this proposal conducted an
informal usability experiment [5].  The results showed that users
are psychologically receptive to "cross-product" loop syntax.  For
example, most users expected:

    S = [10, 20, 30]
    T = [1, 2, 3]
    for x in S; y in T:
        print x+y,

to print "11 12 13 21 22 23 31 32 33".  We believe that users will
have the same reaction to:

    for (x, y) in S @ T:
        print x+y

i.e. that they will naturally interpret this as a tidy way to
write loop nests.

This is where iterators come in.  Actually constructing the
cross-product of two (or more) sequences before executing the loop
would be very expensive.  On the other hand, "@" could be defined
to get its arguments' iterators, and then create an outer iterator
which returns tuples of the values returned by the inner
iterators.


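One way to picture such an outer iterator is as a generator that
produces each pair only when it is asked for.  The sketch below is
an illustration under stated assumptions, not the proposed
implementation: "across" is a hypothetical function name, the code
targets a current Python 3 interpreter, and the inner argument is
assumed to be restartable (a list, tuple, or string).

```python
from itertools import count, islice


def across(outer, inner):
    """Lazily yield (x, y) for every x in outer and every y in inner.

    Assumes `inner` can be iterated repeatedly from the start, which
    holds for lists, tuples and strings but not for files or generators.
    """
    for x in outer:
        for y in inner:
            yield (x, y)


S = [10, 20, 30]
T = [1, 2, 3]
sums = [x + y for (x, y) in across(S, T)]
print(sums)  # [11, 12, 13, 21, 22, 23, 31, 32, 33]

# Because pairs are produced on demand, the outer argument may even be
# unbounded; only the first three pairs are ever computed here.
first_three = list(islice(across(count(), "ab"), 3))
print(first_three)  # [(0, 'a'), (0, 'b'), (1, 'a')]
```
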
Discussion

1. Adding a named function "across" would have less impact on
   Python than a new infix operator.  However, this would not make
   Python more appealing to numerical programmers, who really do
   care whether they can write matrix multiplication using an
   operator, or whether they have to write it as a function call.

2. "@" would have to be chainable in the same way as comparison
   operators, i.e.:

       (1, 2) @ (3, 4) @ (5, 6)

   would have to return (1, 3, 5) ... (2, 4, 6), and *not*
   ((1, 3), 5) ... ((2, 4), 6).  This should not require special
   support from the parser, as the outer iterator created by the
   first "@" could easily be taught how to combine itself with
   ordinary iterators.

3. There would have to be some way to distinguish restartable
   iterators from ones that cannot be restarted.  For example,
   if S is an input stream (e.g. a file), and L is a list, then
   "S @ L" is straightforward, but "L @ S" is not, since iteration
   through the stream cannot be repeated.  This could be treated
   as an error, or handled by having the outer iterator detect
   non-restartable inner iterators and cache their values.

4. Whiteboard testing of this proposal in front of three novice
   Python users (all of them experienced programmers) indicates
   that users will expect:

       "ab" @ "cd"

   to return four strings, not four tuples of pairs of
   characters.  Opinion was divided on what:

       ("a", "b") @ "cd"

   ought to return...


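Points 2 and 3 can be explored with a small sketch.  It is only one
possible reading of those points, not the design this PEP commits
to.  The class name Across is hypothetical.  The sketch relies on
the special methods that released versions of Python (3.5 and
later) use for "@", namely __matmul__ and __rmatmul__, rather than
the "__across__" names proposed here.  It treats an object as
non-restartable when iter(obj) returns the object itself, and
caches such objects in a list, which gives up laziness for them.
Because built-in sequences do not define "@", at least one operand
has to be wrapped explicitly.

```python
class Across:
    """Hypothetical chainable outer-product iterable (illustration only)."""

    def __init__(self, *factors):
        # A one-shot iterator returns itself from iter(); cache its values
        # so that it can be walked once per combination of outer values.
        self.factors = tuple(
            list(f) if iter(f) is f else f for f in factors
        )

    def __matmul__(self, other):
        # Chaining: add the right operand as one more factor instead of
        # nesting, so results are flat tuples such as (1, 3, 5).
        if isinstance(other, Across):
            return Across(*(self.factors + other.factors))
        return Across(*(self.factors + (other,)))

    def __rmatmul__(self, other):
        # Called for "plain_sequence @ Across(...)".
        return Across(other, *self.factors)

    def __iter__(self):
        return self._walk(0, ())

    def _walk(self, depth, prefix):
        # Depth-first walk over the factors; the rightmost varies fastest.
        if depth == len(self.factors):
            yield prefix
            return
        for value in self.factors[depth]:
            yield from self._walk(depth + 1, prefix + (value,))


chained = list(Across((1, 2)) @ (3, 4) @ (5, 6))
print(chained[0], chained[-1])  # (1, 3, 5) (2, 4, 6)

stream = iter(["x", "y"])       # stands in for a non-restartable input
cached = list(Across([1, 2]) @ stream)
print(cached)  # [(1, 'x'), (1, 'y'), (2, 'x'), (2, 'y')]
```

The sketch deliberately leaves the question raised in point 4
open: strings are treated like any other sequence, so "ab" and
"cd" would produce tuples of characters rather than strings.
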
Alternatives

1. Do nothing --- keep Python simple.

   This is always the default choice.

2. Add a named function instead of an operator.

   Python is not primarily a numerical language; it may not be worth
   complicating it for this special case.  However, support for real
   matrix multiplication *is* frequently requested, and the proposed
   semantics for "@" for built-in sequence types would simplify
   expression of a very common idiom (nested loops).

3. Introduce prefixed forms of all existing operators, such as
   "~*" and "~+", as proposed in PEP 225 [1].

   Our objections to this are that there isn't enough demand to
   justify the additional complexity (see Rawlings' comments [3]),
   and that the proposed syntax fails the "low toner" readability
   test.


Acknowledgments

I am grateful to Huaiyu Zhu for initiating this discussion, and to
James Rawlings and students in various Python courses for their
discussions of what numerical programmers really care about.


References

[1] PEP 225, Elementwise/Objectwise Operators, Zhu, Lielens
    http://www.python.org/peps/pep-0225.html

[2] http://bevo.che.wisc.edu/octave/

[3] http://www.egroups.com/message/python-numeric/4

[4] PEP 201, Lockstep Iteration, Warsaw
    http://www.python.org/peps/pep-0201.html

[5] http://mail.python.org/pipermail/python-dev/2000-July/006427.html


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: