PEP: 211 Title: Adding New Linear Algebra Operators to Python Version: $Revision$ Owner: gvwilson@nevex.com (Greg Wilson) Python-Version: 2.1 Created: 15-Jul-2000 Status: Draft Post-History: Introduction This PEP describes a proposal to add linear algebra operators to Python 2.0. It discusses why such operators are desirable, and alternatives that have been considered and discarded. This PEP summarizes discussions held in mailing list forums, and provides URLs for further information, where appropriate. The CVS revision history of this file contains the definitive historical record. Proposal Add a single new infix binary operator '@' ("across"), and corresponding special methods "__across__()" and "__racross__()". This operator will perform mathematical matrix multiplication on NumPy arrays, and generate cross-products when applied to built-in sequence types. No existing operator definitions will be changed. Background Computers were invented to do arithmetic, as was the first high-level programming language, Fortran. While Fortran was a great advance on its machine-level predecessors, there was still a very large gap between its syntax and the notation used by mathematicians. The most influential effort to close this gap was APL [1]: The language [APL] was invented by Kenneth E. Iverson while at Harvard University. The language, originally titled "Iverson Notation", was designed to overcome the inherent ambiguities and points of confusion found when dealing with standard mathematical notation. It was later described in 1962 in a book simply titled "A Programming Language" (hence APL). Towards the end of the sixties, largely through the efforts of IBM, the computer community gained its first exposure to APL. Iverson received the Turing Award in 1980 for this work. APL's operators supported both familiar algebraic operations, such as vector dot product and matrix multiplication, and a wide range of structural operations, such as stitching vectors together to create arrays. Its notation was exceptionally cryptic: many of its symbols did not exist on standard keyboards, and expressions had to be read right to left. Most subsequent work on numerical languages, such as Fortran-90, MATLAB, and Mathematica, has tried to provide the power of APL without the obscurity. Python's NumPy [2] has most of the features that users of such languages expect, but these are provided through named functions and methods, rather than overloaded operators. This makes NumPy clumsier than its competitors. One way to make NumPy more competitive is to provide greater syntactic support in Python itself for linear algebra. This proposal therefore examines the requirements that new linear algebra operators in Python must satisfy, and proposes a syntax and semantics for those operators. Requirements The most important requirement is that there be minimal impact on the existing definition of Python. The proposal must not break existing programs, except possibly those that use NumPy. The second most important requirement is to be able to do both elementwise and mathematical matrix multiplication using infix notation. The nine cases that must be handled are: |5 6| * 9 = |45 54| MS: matrix-scalar multiplication |7 8| |63 72| 9 * |5 6| = |45 54| SM: scalar-matrix multiplication |7 8| |63 72| |2 3| * |4 5| = |8 15| VE: vector elementwise multiplication |2 3| * |4| = 23 VD: vector dot product |5| |2| * |4 5| = | 8 10| VO: vector outer product |3| |12 15| |1 2| * |5 6| = | 5 12| ME: matrix elementwise multiplication |3 4| |7 8| |21 32| |1 2| * |5 6| = |19 22| MM: mathematical matrix multiplication |3 4| |7 8| |43 50| |1 2| * |5 6| = |19 22| VM: vector-matrix multiplication |7 8| |5 6| * |1| = |17| MV: matrix-vector multiplication |7 8| |2| |23| Note that 1-dimensional vectors are treated as rows in VM, as columns in MV, and as both in VD and VO. Both are special cases of 2-dimensional matrices (Nx1 and 1xN respectively). It may therefore be reasonable to define the new operator only for 2-dimensional arrays, and provide an easy (and efficient) way for users to convert 1-dimensional structures to 2-dimensional. Behavior of a new multiplication operator for built-in types may then: (a) be a parsing error (possible only if a constant is one of the arguments, since names are untyped in Python); (b) generate a runtime error; or (c) be derived by plausible extension from its behavior in the two-dimensional case. Third, syntactic support should be considered for three other operations: T (a) transposition: A => A[j, i] for A[i, j] -1 (b) inverse: A => A' such that A' * A = I (the identity matrix) (c) solution: A/b => x such that A * x = b A\b => x such that x * A = b With regard to (c), it is worth noting that the two syntaxes used were invented by programmers, not mathematicians. Mathematicians do not have a standard, widely-used notation for matrix solution. It is also worth noting that dozens of matrix inversion and solution algorithms are widely used. MATLAB and its kin bind their inversion and/or solution operators to one which is reasonably robust in most cases, and require users to call functions or methods to access others. Fourth, confusion between Python's notation and those of MATLAB and Fortran-90 should be avoided. In particular, mathematical matrix multiplication (case MM) should not be represented as '.*', since: (a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw forms to mean "mathematical" [4]; and (b) even if the Python parser can be taught how to handle dotted forms, '1.*A' will still be visually ambiguous [4]. One anti-requirement is that new operators are not needed for addition, subtraction, bitwise operations, and so on, since mathematicians already treat them elementwise. Proposal: The meanings of all existing operators will be unchanged. In particular, 'A*B' will continue to be interpreted elementwise. This takes care of the cases MS, SM, VE, and ME, and ensures minimal impact on existing programs. A new operator '@' (pronounced "across") will be added to Python, along with two special methods, "__across__()" and "__racross__()", with the usual semantics. NumPy will overload "@" to perform mathematical multiplication of arrays where shapes permit, and to throw an exception otherwise. The matrix class's implementation of "@" will treat built-in sequence types as if they were column vectors. This takes care of the cases MM and MV. An attribute "T" will be added to the NumPy array type, such that "m.T" is: (a) the transpose of "m" for a 2-dimensional array (b) the 1xN matrix transpose of "m" if "m" is a 1-dimensional array; or (c) a runtime error for an array with rank >= 3. This attribute will alias the memory of the base object. NumPy's "transpose()" function will be extended to turn built-in sequence types into row vectors. This takes care of the VM, VD, and VO cases. We propose an attribute because: (a) the resulting notation is similar to the 'superscript T' (at least, as similar as ASCII allows), and (b) it signals that the transposition aliases the original object. No new operators will be defined to mean "solve a set of linear equations", or "invert a matrix". Instead, NumPy will define a value "inv", which will be recognized by the exponentiation operator, such that "A ** inv" is the inverse of "A". This is similar in spirit to NumPy's existing "newaxis" value. (Optional) When applied to sequences, the operator will return a list of tuples containing the cross-product of their elements in left-to-right order: >>> [1, 2] @ (3, 4) [(1, 3), (1, 4), (2, 3), (2, 4)] >>> [1, 2] @ (3, 4) @ (5, 6) [(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)] This will require the same kind of special support from the parser as chained comparisons (such as "a>> for (i, j) in [1, 2] @ [3, 4]: >>> print i, j 1 3 1 4 2 3 2 4 as a short-hand for the common nested loop idiom: >>> for i in [1, 2]: >>> for j in [3, 4]: >>> print i, j Response to the 'lockstep loop' questionnaire [5] indicated that newcomers would be comfortable with this (so comfortable, in fact, that most of them interpreted most multi-loop 'zip' syntaxes [6] as implementing single-stage nesting). Alternatives: 01. Don't add new operators --- stick to functions and methods. Python is not primarily a numerical language. It is not worth complexifying the language for this special case --- NumPy's success is proof that users can and will use functions and methods for linear algebra. On the positive side, this maintains Python's simplicity. Its weakness is that support for real matrix multiplication (and, to a lesser extent, other linear algebra operations) is frequently requested, as functional forms are cumbersome for lengthy formulas, and do not respect the operator precedence rules of conventional mathematics. In addition, the method form is asymmetric in its operands. 02. Introduce prefixed forms of existing operators, such as "@*" or "~*", or used boxed forms, such as "[*]" or "%*%". There are (at least) three objections to this. First, either form seems to imply that all operators exist in both forms. This is more new entities than the problem merits, and would require the addition of many new overloadable methods, such as __at_mul__. Second, while it is certainly possible to invent semantics for these new operators for built-in types, this would be a case of the tail wagging the dog, i.e. of letting the existence of a feature "create" a need for it. Finally, the boxed forms make human parsing more complex, e.g.: A[*] = B vs. A[:] = B 03. (From Moshe Zadka [7], and also considered by Huaiyu Zhou [8] in his proposal [9]) Retain the existing meaning of all operators, but create a behavioral accessor for arrays, such that: A * B is elementwise multiplication (ME), but: A.m() * B.m() is mathematical multiplication (MM). The method "A.m()" would return an object that aliased A's memory (for efficiency), but which had a different implementation of __mul__(). The advantage of this method is that it has no effect on the existing implementation of Python: changes are localized in the Numeric module. The disadvantages are: (a) The semantics of "A.m() * B", "A + B.m()", and so on would have to be defined, and there is no "obvious" choice for them. (b) Aliasing objects to trigger different operator behavior feels less Pythonic than either calling methods (as in the existing Numeric module) or using a different operator. This PEP is primarily about look and feel, and about making Python more attractive to people who are not already using it. 04. (From a proposal [9] by Huaiyu Zhou [8]) Introduce a "delayed inverse" attribute, similar to the "transpose" attribute advocated in the third part of this proposal. The expression "a.I" would be a delayed handle on the inverse of the matrix "a", which would be evaluated in context as required. For example, "a.I * b" and "b * a.I" would solve sets of linear equations, without actually calculating the inverse. The main drawback of this proposal it is reliance on lazy evaluation, and even more on "smart" lazy evaluation (i.e. the operation performed depends on the context in which the evaluation is done). The BDFL has so far resisted introducing LE into Python. Related Proposals 0203 : Augmented Assignments If new operators for linear algebra are introduced, it may make sense to introduce augmented assignment forms for them. 0207 : Rich Comparisons It may become possible to overload comparison operators such as '<' so that an expression such as 'A < B' returns an array, rather than a scalar value. 0209 : Adding Multidimensional Arrays Multidimensional arrays are currently an extension to Python, rather than a built-in type. Acknowledgments: I am grateful to Huaiyu Zhu [8] for initiating this discussion, and for some of the ideas and terminology included below. References: [1] http://www.acm.org/sigapl/whyapl.htm [2] http://numpy.sourceforge.net [3] PEP-0203.txt "Augmented Assignments". [4] http://bevo.che.wisc.edu/octave/doc/octave_9.html#SEC69 [5] http://www.python.org/pipermail/python-dev/2000-July/013139.html [6] PEP-0201.txt "Lockstep Iteration" [7] Moshe Zadka is 'moshez@math.huji.ac.il'. [8] Huaiyu Zhu is 'hzhu@users.sourceforge.net' [9] http://www.python.org/pipermail/python-list/2000-August/112529.html Local Variables: mode: indented-text indent-tabs-mode: nil End: