PEP: 211 Title: Adding New Linear Algebra Operators to Python Version: $Revision$ Author: gvwilson@nevex.com (Greg Wilson) Status: Draft Type: Standards Track Python-Version: 2.1 Created: 15-Jul-2000 Post-History: Introduction This PEP describes a conservative proposal to add linear algebra operators to Python 2.0. It discusses why such operators are desirable, and why a minimalist approach should be adopted at this point. This PEP summarizes discussions held in mailing list forums, and provides URLs for further information, where appropriate. The CVS revision history of this file contains the definitive historical record. Proposal Add a single new infix binary operator '@' ("across"), and corresponding special methods "__across__()", "__racross__()", and "__iacross__()". This operator will perform mathematical matrix multiplication on NumPy arrays, and generate cross-products when applied to built-in sequence types. No existing operator definitions will be changed. Background The first high-level programming language, Fortran, was invented to do arithmetic. While this is now just a small part of computing, there are still many programmers who need to express complex mathematical operations in code. The most influential of Fortran's successors was APL [1]. Its author, Kenneth Iverson, designed the language as a notation for expressing matrix algebra, and received the 1980 Turing Award for his work. APL's operators supported both familiar algebraic operations, such as vector dot product and matrix multiplication, and a wide range of structural operations, such as stitching vectors together to create arrays. Even by programming's standards, APL is exceptionally cryptic: many of its symbols did not exist on standard keyboards, and expressions have to be read right to left. Most subsequent work numerical languages, such as Fortran-90, MATLAB, and Mathematica, have tried to provide the power of APL without the obscurity. Python's NumPy [2] has most of the features that users of such languages expect, but these are provided through named functions and methods, rather than overloaded operators. This makes NumPy clumsier than most alternatives. The author of this PEP therefore consulted the developers of GNU Octave [3], an open source clone of MATLAB. When asked how important it was to have infix operators for matrix solution, Prof. James Rawlings replied [4]: I DON'T think it's a must have, and I do a lot of matrix inversion. I cannot remember if its A\b or b\A so I always write inv(A)*b instead. I recommend dropping \. Rawlings' feedback on other operators was similar. It is worth noting in this context that notations such as "/" and "\" for matrix solution were invented by programmers, not mathematicians, and have not been adopted by the latter. Based on this discussion, and feedback from classes at the US national laboratories and elsewhere, we recommend only adding a matrix multiplication operator to Python at this time. If there is significant user demand for syntactic support for other operations, these can be added in a later release. Requirements The most important requirement is minimal impact on existing Python programs and users: the proposal must not break existing code (except possibly NumPy). The second most important requirement is the ability to handle all common cases cleanly and clearly. There are nine such cases: |5 6| * 9 = |45 54| MS: matrix-scalar multiplication |7 8| |63 72| 9 * |5 6| = |45 54| SM: scalar-matrix multiplication |7 8| |63 72| |2 3| * |4 5| = |8 15| VE: vector elementwise multiplication |2 3| * |4| = 23 VD: vector dot product |5| |2| * |4 5| = | 8 10| VO: vector outer product |3| |12 15| |1 2| * |5 6| = | 5 12| ME: matrix elementwise multiplication |3 4| |7 8| |21 32| |1 2| * |5 6| = |19 22| MM: mathematical matrix multiplication |3 4| |7 8| |43 50| |1 2| * |5 6| = |19 22| VM: vector-matrix multiplication |7 8| |5 6| * |1| = |17| MV: matrix-vector multiplication |7 8| |2| |23| Note that 1-dimensional vectors are treated as rows in VM, as columns in MV, and as both in VD and VO. Both are special cases of 2-dimensional matrices (Nx1 and 1xN respectively). We will therefore define the new operator only for 2-dimensional arrays, and provide an easy (and efficient) way for users to treat 1-dimensional structures as 2-dimensional. Third, we must avoid confusion between Python's notation and those of MATLAB and Fortran-90. In particular, mathematical matrix multiplication (case MM) should not be represented as '.*', since: (a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw forms to mean "mathematical"; and (b) even if the Python parser can be taught how to handle dotted forms, '1.*A' will still be visually ambiguous. Proposal The meanings of all existing operators will be unchanged. In particular, 'A*B' will continue to be interpreted elementwise. This takes care of the cases MS, SM, VE, and ME, and ensures minimal impact on existing programs. A new operator '@' (pronounced "across") will be added to Python, along with special methods "__across__()", "__racross__()", and "__iacross__()", with the usual semantics. NumPy will overload "@" to perform mathematical multiplication of arrays where shapes permit, and to throw an exception otherwise. Its implementation of "@" will treat built-in sequence types as if they were column vectors. This takes care of the cases MM and MV. An attribute "T" will be added to the NumPy array type, such that "m.T" is: (a) the transpose of "m" for a 2-dimensional array (b) the 1xN matrix transpose of "m" if "m" is a 1-dimensional array; or (c) a runtime error for an array with rank >= 3. This attribute will alias the memory of the base object. NumPy's "transpose()" function will be extended to turn built-in sequence types into row vectors. This takes care of the VM, VD, and VO cases. We propose an attribute because: (a) the resulting notation is similar to the 'superscript T' (at least, as similar as ASCII allows), and (b) it signals that the transposition aliases the original object. No new operators will be defined to mean "solve a set of linear equations", or "invert a matrix". Instead, NumPy will define a value "inv", which will be recognized by the exponentiation operator, such that "A ** inv" is the inverse of "A". This is similar in spirit to NumPy's existing "newaxis" value. (Optional) When applied to sequences, the "@" operator will return a tuple of tuples containing the cross-product of their elements in left-to-right order: >>> [1, 2] @ (3, 4) ((1, 3), (1, 4), (2, 3), (2, 4)) >>> [1, 2] @ (3, 4) @ (5, 6) ((1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)) This will require the same kind of special support from the parser as chained comparisons (such as "a>> for (i, j) in [1, 2] @ [3, 4]: >>> print i, j 1 3 1 4 2 3 2 4 as a short-hand for the common nested loop idiom: >>> for i in [1, 2]: >>> for j in [3, 4]: >>> print i, j Response to the 'lockstep loop' questionnaire [5] indicated that newcomers would be comfortable with this (so comfortable, in fact, that most of them interpreted most multi-loop 'zip' syntaxes [6] as implementing single-stage nesting). Alternatives 01. Don't add new operators. Python is not primarily a numerical language; it may not be worth complexifying it for this special case. NumPy's success is proof that users can and will use functions and methods for linear algebra. However, support for real matrix multiplication is frequently requested, as: * functional forms are cumbersome for lengthy formulas, and do not respect the operator precedence rules of conventional mathematics; and * method forms are asymmetric in their operands. What's more, the proposed semantics for "@" for built-in sequence types would simplify expression of a very common idiom (nested loops). User testing during discussion of 'lockstep loops' indicated that both new and experienced users would understand this immediately. 02. Introduce prefixed forms of all existing operators, such as "~*" and "~+", as proposed in PEP 0225 [7]. This proposal would duplicate all built-in mathematical operators with matrix equivalents, as in numerical languages such as MATLAB. Our objections to this are: * Python is not primarily a numerical programming language. While the (self-selected) participants in the discussions that led to PEP 0225 may want all of these new operators, the majority of Python users would be indifferent. The extra complexity they would introduce into the language therefore does not seem merited. (See also Rawlings' comments, quoted in the Background section, about these operators not being essential.) * The proposed syntax is difficult to read (i.e. passes the "low toner" readability test). 03. Retain the existing meaning of all operators, but create a behavioral accessor for arrays, such that: A * B is elementwise multiplication (ME), but: A.m() * B.m() is mathematical multiplication (MM). The method "A.m()" would return an object that aliased A's memory (for efficiency), but which had a different implementation of __mul__(). This proposal was made by Moshe Zadka, and is also considered by PEP 0225 [7]. Its advantage is that it has no effect on the existing implementation of Python: changes are localized in the Numeric module. The disadvantages are * The semantics of "A.m() * B", "A + B.m()", and so on would have to be defined, and there is no "obvious" choice for them. * Aliasing objects to trigger different operator behavior feels less Pythonic than either calling methods (as in the existing Numeric module) or using a different operator. This PEP is primarily about look and feel, and about making Python more attractive to people who are not already using it. Related Proposals 0207 : Rich Comparisons It may become possible to overload comparison operators such as '<' so that an expression such as 'A < B' returns an array, rather than a scalar value. 0209 : Adding Multidimensional Arrays Multidimensional arrays are currently an extension to Python, rather than a built-in type. 0225 : Elementwise/Objectwise Operators A (much) larger proposal that addresses the same subject. Acknowledgments I am grateful to Huaiyu Zhu [8] for initiating this discussion, and for some of the ideas and terminology included below. References [1] http://www.acm.org/sigapl/whyapl.htm [2] http://numpy.sourceforge.net [3] http://bevo.che.wisc.edu/octave/ [4] http://www.egroups.com/message/python-numeric/4 [5] http://www.python.org/pipermail/python-dev/2000-July/013139.html [6] PEP-0201.txt "Lockstep Iteration" [7] http://www.python.org/pipermail/python-list/2000-August/112529.html Appendix: Other Operations We considered syntactic support for three other operations: T (a) transposition: A => A[j, i] for A[i, j] -1 (b) inverse: A => A' such that A' * A = I (the identity matrix) (c) solution: A/b => x such that A * x = b A\b => x such that x * A = b Local Variables: mode: indented-text indent-tabs-mode: nil End: