Long overdue update from Greg Wilson 22-Apr-2001

This commit is contained in:
Barry Warsaw 2001-06-05 16:50:09 +00:00
parent cab29f1501
commit e0e0bb6439
1 changed files with 118 additions and 260 deletions

View File

@ -1,7 +1,7 @@
PEP: 211
Title: Adding New Linear Algebra Operators to Python
Title: Adding A New Outer Product Operator
Version: $Revision$
Author: gvwilson@nevex.com (Greg Wilson)
Author: gvwilson@ddj.com (Greg Wilson)
Status: Draft
Type: Standards Track
Python-Version: 2.1
@ -11,314 +11,172 @@ Post-History:
Introduction
This PEP describes a conservative proposal to add linear algebra
operators to Python 2.0. It discusses why such operators are
desirable, and why a minimalist approach should be adopted at this
point. This PEP summarizes discussions held in mailing list
forums, and provides URLs for further information, where
appropriate. The CVS revision history of this file contains the
definitive historical record.
This PEP describes a proposal to define "@" (pronounced "across")
as a new outer product operator in Python 2.2. When applied to
sequences (or other iterable objects), this operator will combine
their iterators, so that:
for (i, j) in S @ T:
pass
Summary
will be equivalent to:
Add a single new infix binary operator '@' ("across"), and
corresponding special methods "__across__()", "__racross__()", and
"__iacross__()". This operator will perform mathematical matrix
multiplication on NumPy arrays, and generate cross-products when
applied to built-in sequence types. No existing operator
definitions will be changed.
for i in S:
for j in T:
pass
Classes will be able to overload this operator using the special
methods "__across__", "__racross__", and "__iacross__". In
particular, the new Numeric module (PEP 0209) will overload this
operator for multi-dimensional arrays to implement matrix
multiplication.
Background
The first high-level programming language, Fortran, was invented
to do arithmetic. While this is now just a small part of
computing, there are still many programmers who need to express
complex mathematical operations in code.
Number-crunching is now just a small part of computing, but many
programmers --- including many Python users --- still need to
express complex mathematical operations in code. Most numerical
languages, such as APL, Fortran-90, MATLAB, IDL, and Mathematica,
therefore provide two forms of the common arithmetic operators.
One form works element-by-element, e.g. multiplies corresponding
elements of its matrix arguments. The other implements the
"mathematical" definition of that operation, e.g. performs
row-column matrix multiplication.
The most influential of Fortran's successors was APL [1]. Its
author, Kenneth Iverson, designed the language as a notation for
expressing matrix algebra, and received the 1980 Turing Award for
his work.
Zhu and Lielens have proposed doubling up Python's operators in
this way [1]. Their proposal would create six new binary infix
operators, and six new in-place operators.
APL's operators supported both familiar algebraic operations, such
as vector dot product and matrix multiplication, and a wide range
of structural operations, such as stitching vectors together to
create arrays. Even by programming's standards, APL is
exceptionally cryptic: many of its symbols did not exist on
standard keyboards, and expressions have to be read right to left.
The original version of this proposal was much more conservative.
The author consulted the developers of GNU Octave [2], an open
source clone of MATLAB. Its developers agreed that providing an
infix operator for matrix multiplication was important: numerical
programmers really do care whether they have to write "mmul(A,B)"
instead of "A op B".
Most subsequent work numerical languages, such as Fortran-90,
MATLAB, and Mathematica, have tried to provide the power of APL
without the obscurity. Python's NumPy [2] has most of the
features that users of such languages expect, but these are
provided through named functions and methods, rather than
overloaded operators. This makes NumPy clumsier than most
alternatives.
The author of this PEP therefore consulted the developers of GNU
Octave [3], an open source clone of MATLAB. When asked how
important it was to have infix operators for matrix solution,
Prof. James Rawlings replied [4]:
On the other hand, when asked how important it was to have infix
operators for matrix solution and other operations, Prof. James
Rawlings replied [3]:
I DON'T think it's a must have, and I do a lot of matrix
inversion. I cannot remember if its A\b or b\A so I always
write inv(A)*b instead. I recommend dropping \.
Rawlings' feedback on other operators was similar. It is worth
noting in this context that notations such as "/" and "\" for
matrix solution were invented by programmers, not mathematicians,
and have not been adopted by the latter.
Based on this discussion, and feedback from classes at the US
national laboratories and elsewhere, we recommend only adding a
matrix multiplication operator to Python at this time. If there
is significant user demand for syntactic support for other
operations, these can be added in a later release.
Based on this discussion, and feedback from students at the US
national laboratories and elsewhere, we recommended adding only
one new operator, for matrix multiplication, to Python.
Requirements
Iterators
The most important requirement is minimal impact on existing
Python programs and users: the proposal must not break existing
code (except possibly NumPy).
The planned addition of iterators to Python 2.2 opens up a broader
scope for this proposal. As part of the discussion of PEP 0201
"Lockstep Iteration" [4], the author of this proposal conducted an
informal usability experiment [5]. The results showed that users
are psychologically receptive to "cross-product" loop syntax. For
example, most users expected:
The second most important requirement is the ability to handle all
common cases cleanly and clearly. There are nine such cases:
S = [10, 20, 30]
T = [1, 2, 3]
for x in S; y in T:
print x+y,
|5 6| * 9 = |45 54| MS: matrix-scalar multiplication
|7 8| |63 72|
to print "11 12 13 21 22 23 31 32 33". We believe that users will
have the same reaction to:
9 * |5 6| = |45 54| SM: scalar-matrix multiplication
|7 8| |63 72|
for (x, y) in S @ T:
print x+y
|2 3| * |4 5| = |8 15| VE: vector elementwise multiplication
i.e. that they will naturally interpret this as a tidy way to
write loop nests.
This is where iterators come in. Actually constructing the
cross-product of two (or more) sequences before executing the loop
would be very expensive. On the other hand, "@" could be defined
to get its arguments' iterators, and then create an outer iterator
which returns tuples of the values returned by the inner
iterators.
|2 3| * |4| = 23 VD: vector dot product
|5|
Discussion
|2| * |4 5| = | 8 10| VO: vector outer product
|3| |12 15|
1. Adding a named function "across" would have less impact on
Python than a new infix operator. However, this would not make
Python more appealing to numerical programmers, who really do
care whether they can write matrix multiplication using an
operator, or whether they have to write it as a function call.
|1 2| * |5 6| = | 5 12| ME: matrix elementwise multiplication
|3 4| |7 8| |21 32|
2. "@" would have be chainable in the same way as comparison
operators, i.e.:
|1 2| * |5 6| = |19 22| MM: mathematical matrix multiplication
|3 4| |7 8| |43 50|
(1, 2) @ (3, 4) @ (5, 6)
|1 2| * |5 6| = |19 22| VM: vector-matrix multiplication
|7 8|
would have to return (1, 3, 5) ... (2, 4, 6), and *not*
((1, 3), 5) ... ((2, 4), 6). This should not require special
support from the parser, as the outer iterator created by the
first "@" could easily be taught how to combine itself with
ordinary iterators.
|5 6| * |1| = |17| MV: matrix-vector multiplication
|7 8| |2| |23|
3. There would have to be some way to distinguish restartable
iterators from ones that couldn't be restarted. For example,
if S is an input stream (e.g. a file), and L is a list, then "S
@ L" is straightforward, but "L @ S" is not, since iteration
through the stream cannot be repeated. This could be treated
as an error, or by having the outer iterator detect
non-restartable inner iterators and cache their values.
Note that 1-dimensional vectors are treated as rows in VM, as
columns in MV, and as both in VD and VO. Both are special cases
of 2-dimensional matrices (Nx1 and 1xN respectively). We will
therefore define the new operator only for 2-dimensional arrays,
and provide an easy (and efficient) way for users to treat
1-dimensional structures as 2-dimensional.
4. Whiteboard testing of this proposal in front of three novice
Python users (all of them experienced programmers) indicates
that users will expect:
Third, we must avoid confusion between Python's notation and those
of MATLAB and Fortran-90. In particular, mathematical matrix
multiplication (case MM) should not be represented as '.*', since:
"ab" @ "cd"
(a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw
forms to mean "mathematical"; and
to return four strings, not four tuples of pairs of
characters. Opinion was divided on what:
(b) even if the Python parser can be taught how to handle dotted
forms, '1.*A' will still be visually ambiguous.
("a", "b") @ "cd"
Proposal
The meanings of all existing operators will be unchanged. In
particular, 'A*B' will continue to be interpreted elementwise.
This takes care of the cases MS, SM, VE, and ME, and ensures
minimal impact on existing programs.
A new operator '@' (pronounced "across") will be added to Python,
along with special methods "__across__()", "__racross__()", and
"__iacross__()", with the usual semantics. (We recommend using
"@", rather than the times-like "><", because of the ease with
which the latter could be mis-typed as inequality "<>".)
No new operators will be defined to mean "solve a set of linear
equations", or "invert a matrix".
(Optional) When applied to sequences, the "@" operator will return
a tuple of tuples containing the cross-product of their elements
in left-to-right order:
>>> [1, 2] @ (3, 4)
((1, 3), (1, 4), (2, 3), (2, 4))
>>> [1, 2] @ (3, 4) @ (5, 6)
((1, 3, 5), (1, 3, 6),
(1, 4, 5), (1, 4, 6),
(2, 3, 5), (2, 3, 6),
(2, 4, 5), (2, 4, 6))
This will require the same kind of special support from the parser
as chained comparisons (such as "a<b<c<=d"). However, it will
permit:
>>> for (i, j) in [1, 2] @ [3, 4]:
>>> print i, j
1 3
1 4
2 3
2 4
as a short-hand for the common nested loop idiom:
>>> for i in [1, 2]:
>>> for j in [3, 4]:
>>> print i, j
Response to the 'lockstep loop' questionnaire [5] indicated that
newcomers would be comfortable with this (so comfortable, in fact,
that most of them interpreted most multi-loop 'zip' syntaxes [6]
as implementing single-stage nesting).
ought to return...
Alternatives
01. Don't add new operators.
1. Do nothing --- keep Python simple.
This is always the default choice.
2. Add a named function instead of an operator.
Python is not primarily a numerical language; it may not be worth
complexifying it for this special case. NumPy's success is proof
that users can and will use functions and methods for linear
algebra. However, support for real matrix multiplication is
frequently requested, as:
complexifying it for this special case. However, support for real
matrix multiplication *is* frequently requested, and the proposed
semantics for "@" for built-in sequence types would simplify
expression of a very common idiom (nested loops).
* functional forms are cumbersome for lengthy formulas, and do not
respect the operator precedence rules of conventional mathematics;
and
3. Introduce prefixed forms of all existing operators, such as
"~*" and "~+", as proposed in PEP 0225 [1].
* method forms are asymmetric in their operands.
What's more, the proposed semantics for "@" for built-in sequence
types would simplify expression of a very common idiom (nested
loops). User testing during discussion of 'lockstep loops'
indicated that both new and experienced users would understand
this immediately.
02. Introduce prefixed forms of all existing operators, such as
"~*" and "~+", as proposed in PEP 0225 [7].
This proposal would duplicate all built-in mathematical operators
with matrix equivalents, as in numerical languages such as
MATLAB. Our objections to this are:
* Python is not primarily a numerical programming language. While
the (self-selected) participants in the discussions that led to
PEP 0225 may want all of these new operators, the majority of
Python users would be indifferent. The extra complexity they
would introduce into the language therefore does not seem
merited. (See also Rawlings' comments, quoted in the Background
section, about these operators not being essential.)
* The proposed syntax is difficult to read (i.e. passes the "low
toner" readability test).
03. Retain the existing meaning of all operators, but create a
behavioral accessor for arrays, such that:
A * B
is elementwise multiplication (ME), but:
A.m() * B.m()
is mathematical multiplication (MM). The method "A.m()" would
return an object that aliased A's memory (for efficiency), but
which had a different implementation of __mul__().
This proposal was made by Moshe Zadka, and is also considered by
PEP 0225 [7]. Its advantage is that it has no effect on the
existing implementation of Python: changes are localized in the
Numeric module. The disadvantages are
* The semantics of "A.m() * B", "A + B.m()", and so on would have
to be defined, and there is no "obvious" choice for them.
* Aliasing objects to trigger different operator behavior feels
less Pythonic than either calling methods (as in the existing
Numeric module) or using a different operator. This PEP is
primarily about look and feel, and about making Python more
attractive to people who are not already using it.
Related Proposals
0207 : Rich Comparisons
It may become possible to overload comparison operators
such as '<' so that an expression such as 'A < B' returns
an array, rather than a scalar value.
0209 : Adding Multidimensional Arrays
Multidimensional arrays are currently an extension to
Python, rather than a built-in type.
0225 : Elementwise/Objectwise Operators
A larger proposal that addresses the same subject, but
which proposes many more additions to the language.
Our objections to this are that there isn't enough demand to
justify the additional complexity (see Rawlings' comments [3]),
and that the proposed syntax fails the "low toner" readability
test.
Acknowledgments
I am grateful to Huaiyu Zhu [8] for initiating this discussion,
and for some of the ideas and terminology included below.
I am grateful to Huaiyu Zhu for initiating this discussion, and to
James Rawlings and students in various Python courses for their
discussions of what numerical programmers really care about.
References
[1] http://www.acm.org/sigapl/whyapl.htm
[2] http://numpy.sourceforge.net
[3] http://bevo.che.wisc.edu/octave/
[4] http://www.egroups.com/message/python-numeric/4
[5] http://www.python.org/pipermail/python-dev/2000-July/013139.html
[6] PEP-0201.txt "Lockstep Iteration"
[7] http://www.python.org/pipermail/python-list/2000-August/112529.html
Appendix: NumPy
NumPy will overload "@" to perform mathematical multiplication of
arrays where shapes permit, and to throw an exception otherwise.
Its implementation of "@" will treat built-in sequence types as if
they were column vectors. This takes care of the cases MM and MV.
An attribute "T" will be added to the NumPy array type, such that
"m.T" is:
(a) the transpose of "m" for a 2-dimensional array
(b) the 1xN matrix transpose of "m" if "m" is a 1-dimensional
array; or
(c) a runtime error for an array with rank >= 3.
This attribute will alias the memory of the base object. NumPy's
"transpose()" function will be extended to turn built-in sequence
types into row vectors. This takes care of the VM, VD, and VO
cases. We propose an attribute because:
(a) the resulting notation is similar to the 'superscript T' (at
least, as similar as ASCII allows), and
(b) it signals that the transposition aliases the original object.
NumPy will define a value "inv", which will be recognized by the
exponentiation operator, such that "A ** inv" is the inverse of
"A". This is similar in spirit to NumPy's existing "newaxis"
value.
[1] http://python.sourceforge.net/peps/pep-0225.html
[2] http://bevo.che.wisc.edu/octave/
[3] http://www.egroups.com/message/python-numeric/4
[4] http://python.sourceforge.net/peps/pep-0201.html
[5] http://mail.python.org/pipermail/python-dev/2000-July/006427.html