Greg's latest revision.

Barry Warsaw 2000-11-22 22:01:47 +00:00
parent 6efa7422c1
commit 8daf85c43a
1 changed file with 141 additions and 168 deletions


@@ -11,73 +11,81 @@ Post-History:
Introduction
This PEP describes a conservative proposal to add linear algebra
operators to Python 2.0. It discusses why such operators are
desirable, and why a minimalist approach should be adopted at this
point. This PEP summarizes discussions held in mailing list
forums, and provides URLs for further information, where
appropriate. The CVS revision history of this file contains the
definitive historical record.
Proposal
Add a single new infix binary operator '@' ("across"), and
corresponding special methods "__across__()", "__racross__()", and
"__iacross__()". This operator will perform mathematical matrix
multiplication on NumPy arrays, and generate cross-products when
applied to built-in sequence types. No existing operator
definitions will be changed.
Background
The first high-level programming language, Fortran, was invented
to do arithmetic. While this is now just a small part of
computing, there are still many programmers who need to express
complex mathematical operations in code.
The most influential of Fortran's successors was APL [1]. Its
author, Kenneth Iverson, designed the language as a notation for
expressing matrix algebra, and received the 1980 Turing Award for
his work.
APL's operators supported both familiar algebraic operations, such
as vector dot product and matrix multiplication, and a wide range
of structural operations, such as stitching vectors together to
create arrays. Even by programming's standards, APL is
exceptionally cryptic: many of its symbols did not exist on
standard keyboards, and expressions have to be read right to left.
Most subsequent work on numerical languages, such as Fortran-90,
MATLAB, and Mathematica, has tried to provide the power of APL
without the obscurity. Python's NumPy [2] has most of the
features that users of such languages expect, but these are
provided through named functions and methods, rather than
overloaded operators. This makes NumPy clumsier than most
alternatives.
The author of this PEP therefore consulted the developers of GNU
Octave [3], an open source clone of MATLAB. When asked how
important it was to have infix operators for matrix solution,
Prof. James Rawlings replied [4]:
I DON'T think it's a must have, and I do a lot of matrix
inversion. I cannot remember if its A\b or b\A so I always
write inv(A)*b instead. I recommend dropping \.
Rawlings' feedback on other operators was similar. It is worth
noting in this context that notations such as "/" and "\" for
matrix solution were invented by programmers, not mathematicians,
and have not been adopted by the latter.
Based on this discussion, and feedback from classes at the US
national laboratories and elsewhere, we recommend only adding a
matrix multiplication operator to Python at this time. If there
is significant user demand for syntactic support for other
operations, these can be added in a later release.
Requirements
The most important requirement is minimal impact on existing
Python programs and users: the proposal must not break existing
code (except possibly NumPy).
The second most important requirement is the ability to handle all
common cases cleanly and clearly. There are nine such cases:
|5 6| * 9 = |45 54|        MS: matrix-scalar multiplication
|7 8|       |63 72|
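As an illustration of the MS case using today's tools (a sketch only;
NumPy is assumed and is not part of the proposal):

    import numpy as np

    A = np.array([[5, 6],
                  [7, 8]])
    print(A * 9)        # elementwise scalar multiplication, the MS case
    # [[45 54]
    #  [63 72]]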
@@ -108,60 +116,23 @@ Requirements
Note that 1-dimensional vectors are treated as rows in VM, as
columns in MV, and as both in VD and VO. Both are special cases
of 2-dimensional matrices (Nx1 and 1xN respectively). We will
therefore define the new operator only for 2-dimensional arrays,
and provide an easy (and efficient) way for users to treat
1-dimensional structures as 2-dimensional.
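A minimal sketch, assuming NumPy, of the kind of easy and efficient
conversion intended here (both forms are views, not copies):

    import numpy as np

    v = np.array([1.0, 2.0, 3.0])    # 1-dimensional, shape (3,)
    col = v.reshape(-1, 1)           # treated as an Nx1 column, shape (3, 1)
    row = v.reshape(1, -1)           # treated as a 1xN row, shape (1, 3)
    # Equivalently: v[:, np.newaxis] and v[np.newaxis, :]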
Third, we must avoid confusion between Python's notation and those
of MATLAB and Fortran-90. In particular, mathematical matrix
multiplication (case MM) should not be represented as '.*', since:
(a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw
forms to mean "mathematical" [4]; and
forms to mean "mathematical"; and
(b) even if the Python parser can be taught how to handle dotted
forms, '1.*A' will still be visually ambiguous.
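The ambiguity in (b) is easy to demonstrate: in current Python the dot
already binds to the number, so a '.*' operator would have to fight
the tokenizer as well as the reader (the name 'A' below is purely
illustrative):

    A = 4
    print(1.*A)   # parsed as the float literal "1." times A: 1.0 * 4 -> 4.0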
Proposal
The meanings of all existing operators will be unchanged. In
particular, 'A*B' will continue to be interpreted elementwise.
@@ -169,14 +140,13 @@ Proposal:
minimal impact on existing programs.
A new operator '@' (pronounced "across") will be added to Python,
along with special methods "__across__()", "__racross__()", and
"__iacross__()", with the usual semantics.
NumPy will overload "@" to perform mathematical multiplication of
arrays where shapes permit, and to throw an exception otherwise.
Its implementation of "@" will treat built-in sequence types as if
they were column vectors. This takes care of the cases MM and MV.
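For comparison, the function spelling available today already treats a
built-in sequence as a column vector in the MV case (a sketch; NumPy's
dot() is assumed):

    import numpy as np

    M = np.array([[1, 2],
                  [3, 4]])
    v = [5, 6]                 # a built-in sequence
    print(np.dot(M, v))        # MV: v acts as a column vector -> [17 39]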
An attribute "T" will be added to the NumPy array type, such that
"m.T" is:
@@ -204,22 +174,22 @@ Proposal:
operator, such that "A ** inv" is the inverse of "A". This is
similar in spirit to NumPy's existing "newaxis" value.
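One rough way the proposed "A ** inv" spelling could be prototyped
today is sketched below; the 'inv' sentinel and the 'Mat' wrapper are
hypothetical names, and numpy.linalg.inv does the real work:

    import numpy as np

    class _Inverse:
        pass                             # sentinel for the proposed 'inv' value

    inv = _Inverse()

    class Mat(np.ndarray):
        def __pow__(self, other):
            if other is inv:
                return np.linalg.inv(self)   # "A ** inv" -> inverse of A
            return super().__pow__(other)

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]]).view(Mat)
    print(A ** inv)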
(Optional) When applied to sequences, the "@" operator will return
a tuple of tuples containing the cross-product of their elements
in left-to-right order:
>>> [1, 2] @ (3, 4)
((1, 3), (1, 4), (2, 3), (2, 4))
>>> [1, 2] @ (3, 4) @ (5, 6)
((1, 3, 5), (1, 3, 6),
(1, 4, 5), (1, 4, 6),
(2, 3, 5), (2, 3, 6),
(2, 4, 5), (2, 4, 6))
This will require the same kind of special support from the parser
as chained comparisons (such as "a<b<c<=d"). However, it would
permit the following:
as chained comparisons (such as "a<b<c<=d"). However, it will
permit:
>>> for (i, j) in [1, 2] @ [3, 4]:
...     print i, j
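The eager part of this behaviour can be approximated today with
itertools.product, which produces the same tuples in the same order (a
sketch; only the tuple-of-tuples packaging matches, not the proposed
parser-level chaining):

    from itertools import product

    print(tuple(product([1, 2], (3, 4))))
    # ((1, 3), (1, 4), (2, 3), (2, 4))

    for i, j in product([1, 2], [3, 4]):
        print(i, j)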
@@ -240,44 +210,48 @@ Proposal:
as implementing single-stage nesting).
Alternatives
01. Don't add new operators.
Python is not primarily a numerical language; it may not be worth
complexifying it for this special case. NumPy's success is proof
that users can and will use functions and methods for linear
algebra. However, support for real matrix multiplication is
frequently requested, as:
* functional forms are cumbersome for lengthy formulas, and do not
respect the operator precedence rules of conventional mathematics;
and
* method forms are asymmetric in their operands.
What's more, the proposed semantics for "@" for built-in sequence
types would simplify expression of a very common idiom (nested
loops). User testing during discussion of 'lockstep loops'
indicated that both new and experienced users would understand
this immediately.
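To illustrate the first bullet, compare the function spelling that
works today with the infix spelling this PEP proposes (NumPy's dot()
stands in for mathematical multiplication; the '@' line is
hypothetical and shown only as a comment):

    import numpy as np

    A = np.eye(2)
    B = np.eye(2)
    x = np.array([1.0, 2.0])
    b = np.array([0.5, 0.5])

    y = np.dot(np.dot(A, B), x) + b    # function spelling, works today
    # y = A @ B @ x + b                # proposed infix spelling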
02. Introduce prefixed forms of all existing operators, such as
"~*" and "~+", as proposed in PEP 0225 [7].
This proposal would duplicate all built-in mathematical operators
with matrix equivalents, as in numerical languages such as
MATLAB. Our objections to this are:
* Python is not primarily a numerical programming language. While
the (self-selected) participants in the discussions that led to
PEP 0225 may want all of these new operators, the majority of
Python users would be indifferent. The extra complexity they
would introduce into the language therefore does not seem
merited. (See also Rawlings' comments, quoted in the Background
section, about these operators not being essential.)
* The proposed syntax is difficult to read (i.e. passes the "low
toner" readability test).
03. Retain the existing meaning of all operators, but create a
behavioral accessor for arrays, such that:
A * B
@@ -289,42 +263,23 @@ Alternatives:
return an object that aliased A's memory (for efficiency), but
which had a different implementation of __mul__().
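A minimal sketch of such an accessor, assuming NumPy (the names below
are illustrative, and the accessor is spelled as a function rather
than an array method for brevity):

    import numpy as np

    class _MathView:
        # Aliases the array (no copy) and gives '*' mathematical semantics.
        def __init__(self, a):
            self._a = a

        def __mul__(self, other):
            return np.dot(self._a, np.asarray(other))

    def m(a):
        return _MathView(a)

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[5, 6], [7, 8]])
    print(A * B)       # elementwise, unchanged
    print(m(A) * B)    # mathematical matrix multiplication via the accessor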
This proposal was made by Moshe Zadka, and is also considered by
PEP 0225 [7]. Its advantage is that it has no effect on the
existing implementation of Python: changes are localized in the
Numeric module. The disadvantages are:
* The semantics of "A.m() * B", "A + B.m()", and so on would have
to be defined, and there is no "obvious" choice for them.
* Aliasing objects to trigger different operator behavior feels
less Pythonic than either calling methods (as in the existing
Numeric module) or using a different operator. This PEP is
primarily about look and feel, and about making Python more
attractive to people who are not already using it.
Related Proposals
0203 : Augmented Assignments
If new operators for linear algebra are introduced, it may
make sense to introduce augmented assignment forms for
them.
0207 : Rich Comparisons
It may become possible to overload comparison operators
@@ -336,24 +291,42 @@ Related Proposals
Multidimensional arrays are currently an extension to
Python, rather than a built-in type.
0225 : Elementwise/Objectwise Operators
A (much) larger proposal that addresses the same subject.
Acknowledgments
I am grateful to Huaiyu Zhu [8] for initiating this discussion,
and for some of the ideas and terminology included below.
References
[1] http://www.acm.org/sigapl/whyapl.htm
[2] http://numpy.sourceforge.net
[3] http://bevo.che.wisc.edu/octave/
[4] http://www.egroups.com/message/python-numeric/4
[5] http://www.python.org/pipermail/python-dev/2000-July/013139.html
[6] PEP-0201.txt "Lockstep Iteration"
[7] http://www.python.org/pipermail/python-list/2000-August/112529.html
[8] Huaiyu Zhu is 'hzhu@users.sourceforge.net'
Appendix: Other Operations
We considered syntactic support for three other operations:
(a) transposition: A^T => A[j, i] for A[i, j]
(b) inverse: A^-1 => A' such that A' * A = I (the identity matrix)
(c) solution: A/b => x such that A * x = b
    A\b => x such that x * A = b
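All three operations are already available today through NumPy
functions rather than syntax (a sketch; numpy.linalg is assumed):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 4.0]])
    b = np.array([5.0, 6.0])

    At   = A.T                       # (a) transposition
    Ainv = np.linalg.inv(A)          # (b) inverse
    x    = np.linalg.solve(A, b)     # (c) solve A * x = b
    x2   = np.linalg.solve(A.T, b)   # solve x * A = b via the transpose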
Local Variables: