Long overdue update from Greg Wilson 22-Apr-2001

2001-06-05 16:50:09 +00:00 · 2001-06-05 16:50:09 +00:00 · e0e0bb6439
parent cab29f1501
commit e0e0bb6439
1 changed files with 118 additions and 260 deletions
--- a/pep-0211.txt
+++ b/pep-0211.txt
@ -1,7 +1,7 @@
 PEP: 211
-Title: Adding New Linear Algebra Operators to Python
+Title: Adding A New Outer Product Operator
 Version: $Revision$
-Author: gvwilson@nevex.com (Greg Wilson)
+Author: gvwilson@ddj.com (Greg Wilson)
 Status: Draft
 Type: Standards Track
 Python-Version: 2.1
@ -11,314 +11,172 @@ Post-History:

 Introduction

-    This PEP describes a conservative proposal to add linear algebra
-    operators to Python 2.0.  It discusses why such operators are
-    desirable, and why a minimalist approach should be adopted at this
-    point.  This PEP summarizes discussions held in mailing list
-    forums, and provides URLs for further information, where
-    appropriate.  The CVS revision history of this file contains the
-    definitive historical record.
+    This PEP describes a proposal to define "@" (pronounced "across")
+    as a new outer product operator in Python 2.2.  When applied to
+    sequences (or other iterable objects), this operator will combine
+    their iterators, so that:

+        for (i, j) in S @ T:
+            pass

-Summary
+    will be equivalent to:

-    Add a single new infix binary operator '@' ("across"), and
-    corresponding special methods "__across__()", "__racross__()", and
-    "__iacross__()".  This operator will perform mathematical matrix
-    multiplication on NumPy arrays, and generate cross-products when
-    applied to built-in sequence types.  No existing operator
-    definitions will be changed.
+        for i in S:
+            for j in T:
+                pass
+
+    Classes will be able to overload this operator using the special
+    methods "__across__", "__racross__", and "__iacross__".  In
+    particular, the new Numeric module (PEP 0209) will overload this
+    operator for multi-dimensional arrays to implement matrix
+    multiplication.


 Background

-    The first high-level programming language, Fortran, was invented
-    to do arithmetic.  While this is now just a small part of
-    computing, there are still many programmers who need to express
-    complex mathematical operations in code.
+    Number-crunching is now just a small part of computing, but many
+    programmers --- including many Python users --- still need to
+    express complex mathematical operations in code.  Most numerical
+    languages, such as APL, Fortran-90, MATLAB, IDL, and Mathematica,
+    therefore provide two forms of the common arithmetic operators.
+    One form works element-by-element, e.g. multiplies corresponding
+    elements of its matrix arguments.  The other implements the
+    "mathematical" definition of that operation, e.g. performs
+    row-column matrix multiplication.

-    The most influential of Fortran's successors was APL [1].  Its
-    author, Kenneth Iverson, designed the language as a notation for
-    expressing matrix algebra, and received the 1980 Turing Award for
-    his work.
+    Zhu and Lielens have proposed doubling up Python's operators in
+    this way [1].  Their proposal would create six new binary infix
+    operators, and six new in-place operators.

-    APL's operators supported both familiar algebraic operations, such
-    as vector dot product and matrix multiplication, and a wide range
-    of structural operations, such as stitching vectors together to
-    create arrays.  Even by programming's standards, APL is
-    exceptionally cryptic: many of its symbols did not exist on
-    standard keyboards, and expressions have to be read right to left.
+    The original version of this proposal was much more conservative.
+    The author consulted the developers of GNU Octave [2], an open
+    source clone of MATLAB.  Its developers agreed that providing an
+    infix operator for matrix multiplication was important: numerical
+    programmers really do care whether they have to write "mmul(A,B)"
+    instead of "A op B".

-    Most subsequent work numerical languages, such as Fortran-90,
-    MATLAB, and Mathematica, have tried to provide the power of APL
-    without the obscurity.  Python's NumPy [2] has most of the
-    features that users of such languages expect, but these are
-    provided through named functions and methods, rather than
-    overloaded operators.  This makes NumPy clumsier than most
-    alternatives.
-
-    The author of this PEP therefore consulted the developers of GNU
-    Octave [3], an open source clone of MATLAB.  When asked how
-    important it was to have infix operators for matrix solution,
-    Prof. James Rawlings replied [4]:
+    On the other hand, when asked how important it was to have infix
+    operators for matrix solution and other operations, Prof. James
+    Rawlings replied [3]:

        I DON'T think it's a must have, and I do a lot of matrix
        inversion. I cannot remember if its A\b or b\A so I always
        write inv(A)*b instead. I recommend dropping \.

-    Rawlings' feedback on other operators was similar.  It is worth
-    noting in this context that notations such as "/" and "\" for
-    matrix solution were invented by programmers, not mathematicians,
-    and have not been adopted by the latter.
-
-    Based on this discussion, and feedback from classes at the US
-    national laboratories and elsewhere, we recommend only adding a
-    matrix multiplication operator to Python at this time.  If there
-    is significant user demand for syntactic support for other
-    operations, these can be added in a later release.
+    Based on this discussion, and feedback from students at the US
+    national laboratories and elsewhere, we recommended adding only
+    one new operator, for matrix multiplication, to Python.


-Requirements
+Iterators

-    The most important requirement is minimal impact on existing
-    Python programs and users: the proposal must not break existing
-    code (except possibly NumPy).
+    The planned addition of iterators to Python 2.2 opens up a broader
+    scope for this proposal.  As part of the discussion of PEP 0201
+    "Lockstep Iteration" [4], the author of this proposal conducted an
+    informal usability experiment [5].  The results showed that users
+    are psychologically receptive to "cross-product" loop syntax.  For
+    example, most users expected:

-    The second most important requirement is the ability to handle all
-    common cases cleanly and clearly.  There are nine such cases:
+        S = [10, 20, 30]
+        T = [1, 2, 3]
+        for x in S; y in T:
+            print x+y,

-        |5 6| *   9   = |45 54|      MS: matrix-scalar multiplication
-        |7 8|           |63 72|
+    to print "11 12 13 21 22 23 31 32 33".  We believe that users will
+    have the same reaction to:

-          9   * |5 6| = |45 54|      SM: scalar-matrix multiplication
-                |7 8|   |63 72|
+        for (x, y) in S @ T:
+            print x+y

-        |2 3| * |4 5| = |8 15|       VE: vector elementwise multiplication
+    i.e. that they will naturally interpret this as a tidy way to
+    write loop nests.
+
+    This is where iterators come in.  Actually constructing the
+    cross-product of two (or more) sequences before executing the loop
+    would be very expensive.  On the other hand, "@" could be defined
+    to get its arguments' iterators, and then create an outer iterator
+    which returns tuples of the values returned by the inner
+    iterators.


-        |2 3| *  |4|  =   23         VD: vector dot product
-                 |5|
+Discussion

-         |2|  * |4 5| = | 8 10|      VO: vector outer product
-         |3|            |12 15|
+    1. Adding a named function "across" would have less impact on
+       Python than a new infix operator.  However, this would not make
+       Python more appealing to numerical programmers, who really do
+       care whether they can write matrix multiplication using an
+       operator, or whether they have to write it as a function call.

-        |1 2| * |5 6| = | 5 12|      ME: matrix elementwise multiplication
-        |3 4|   |7 8|   |21 32|
+    2. "@" would have be chainable in the same way as comparison
+       operators, i.e.:

-        |1 2| * |5 6| = |19 22|      MM: mathematical matrix multiplication
-        |3 4|   |7 8|   |43 50|
+        (1, 2) @ (3, 4) @ (5, 6)

-        |1 2| * |5 6| = |19 22|      VM: vector-matrix multiplication
-                |7 8|
+       would have to return (1, 3, 5) ... (2, 4, 6), and *not*
+       ((1, 3), 5) ... ((2, 4), 6).  This should not require special
+       support from the parser, as the outer iterator created by the
+       first "@" could easily be taught how to combine itself with
+       ordinary iterators.

-        |5 6| *  |1|  =   |17|       MV: matrix-vector multiplication
-        |7 8|    |2|      |23|
+    3. There would have to be some way to distinguish restartable
+       iterators from ones that couldn't be restarted.  For example,
+       if S is an input stream (e.g. a file), and L is a list, then "S
+       @ L" is straightforward, but "L @ S" is not, since iteration
+       through the stream cannot be repeated.  This could be treated
+       as an error, or by having the outer iterator detect
+       non-restartable inner iterators and cache their values.

-    Note that 1-dimensional vectors are treated as rows in VM, as
-    columns in MV, and as both in VD and VO.  Both are special cases
-    of 2-dimensional matrices (Nx1 and 1xN respectively).  We will
-    therefore define the new operator only for 2-dimensional arrays,
-    and provide an easy (and efficient) way for users to treat
-    1-dimensional structures as 2-dimensional.
+    4. Whiteboard testing of this proposal in front of three novice
+       Python users (all of them experienced programmers) indicates
+       that users will expect:

-    Third, we must avoid confusion between Python's notation and those
-    of MATLAB and Fortran-90.  In particular, mathematical matrix
-    multiplication (case MM) should not be represented as '.*', since:
+        "ab" @ "cd"

-    (a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw
-        forms to mean "mathematical"; and
+       to return four strings, not four tuples of pairs of
+       characters.  Opinion was divided on what:

-    (b) even if the Python parser can be taught how to handle dotted
-        forms, '1.*A' will still be visually ambiguous.
+        ("a", "b") @ "cd"

-
-Proposal
-
-    The meanings of all existing operators will be unchanged.  In
-    particular, 'A*B' will continue to be interpreted elementwise.
-    This takes care of the cases MS, SM, VE, and ME, and ensures
-    minimal impact on existing programs.
-
-    A new operator '@' (pronounced "across") will be added to Python,
-    along with special methods "__across__()", "__racross__()", and
-    "__iacross__()", with the usual semantics.  (We recommend using
-    "@", rather than the times-like "><", because of the ease with
-    which the latter could be mis-typed as inequality "<>".)
-
-    No new operators will be defined to mean "solve a set of linear
-    equations", or "invert a matrix".
-
-    (Optional) When applied to sequences, the "@" operator will return
-    a tuple of tuples containing the cross-product of their elements
-    in left-to-right order:
-
-    >>> [1, 2] @ (3, 4)
-    ((1, 3), (1, 4), (2, 3), (2, 4))
-
-    >>> [1, 2] @ (3, 4) @ (5, 6)
-    ((1, 3, 5), (1, 3, 6), 
-     (1, 4, 5), (1, 4, 6),
-     (2, 3, 5), (2, 3, 6),
-     (2, 4, 5), (2, 4, 6))
-
-    This will require the same kind of special support from the parser
-    as chained comparisons (such as "a<b<c<=d").  However, it will
-    permit:
-
-    >>> for (i, j) in [1, 2] @ [3, 4]:
-    >>>     print i, j
-    1 3
-    1 4
-    2 3
-    2 4
-
-    as a short-hand for the common nested loop idiom:
-
-    >>> for i in [1, 2]:
-    >>>    for j in [3, 4]:
-    >>>        print i, j
-
-    Response to the 'lockstep loop' questionnaire [5] indicated that
-    newcomers would be comfortable with this (so comfortable, in fact,
-    that most of them interpreted most multi-loop 'zip' syntaxes [6]
-    as implementing single-stage nesting).
+       ought to return...


 Alternatives

-    01. Don't add new operators.
+    1. Do nothing --- keep Python simple.
+
+    This is always the default choice.
+
+    2. Add a named function instead of an operator.

    Python is not primarily a numerical language; it may not be worth
-    complexifying it for this special case.  NumPy's success is proof
-    that users can and will use functions and methods for linear
-    algebra.  However, support for real matrix multiplication is
-    frequently requested, as:
+    complexifying it for this special case.  However, support for real
+    matrix multiplication *is* frequently requested, and the proposed
+    semantics for "@" for built-in sequence types would simplify
+    expression of a very common idiom (nested loops).

-    * functional forms are cumbersome for lengthy formulas, and do not
-      respect the operator precedence rules of conventional mathematics;
-      and
+    3. Introduce prefixed forms of all existing operators, such as
+       "~*" and "~+", as proposed in PEP 0225 [1].

-    * method forms are asymmetric in their operands.
-
-    What's more, the proposed semantics for "@" for built-in sequence
-    types would simplify expression of a very common idiom (nested
-    loops).  User testing during discussion of 'lockstep loops'
-    indicated that both new and experienced users would understand
-    this immediately.
-
-    02. Introduce prefixed forms of all existing operators, such as
-        "~*" and "~+", as proposed in PEP 0225 [7].
-
-    This proposal would duplicate all built-in mathematical operators
-    with matrix equivalents, as in numerical languages such as
-    MATLAB.  Our objections to this are:
-
-    * Python is not primarily a numerical programming language.  While
-      the (self-selected) participants in the discussions that led to
-      PEP 0225 may want all of these new operators, the majority of
-      Python users would be indifferent.  The extra complexity they
-      would introduce into the language therefore does not seem
-      merited. (See also Rawlings' comments, quoted in the Background
-      section, about these operators not being essential.)
-
-    * The proposed syntax is difficult to read (i.e. passes the "low
-      toner" readability test).
-
-    03. Retain the existing meaning of all operators, but create a
-        behavioral accessor for arrays, such that:
-
-            A * B
-
-        is elementwise multiplication (ME), but:
-
-            A.m() * B.m()
-
-        is mathematical multiplication (MM).  The method "A.m()" would
-        return an object that aliased A's memory (for efficiency), but
-        which had a different implementation of __mul__().
-
-    This proposal was made by Moshe Zadka, and is also considered by
-    PEP 0225 [7].  Its advantage is that it has no effect on the
-    existing implementation of Python: changes are localized in the
-    Numeric module.  The disadvantages are
-
-    * The semantics of "A.m() * B", "A + B.m()", and so on would have
-      to be defined, and there is no "obvious" choice for them.
-
-    * Aliasing objects to trigger different operator behavior feels
-      less Pythonic than either calling methods (as in the existing
-      Numeric module) or using a different operator.  This PEP is
-      primarily about look and feel, and about making Python more
-      attractive to people who are not already using it.
-
-
-Related Proposals
-
-    0207 :  Rich Comparisons
-
-            It may become possible to overload comparison operators
-            such as '<' so that an expression such as 'A < B' returns
-            an array, rather than a scalar value.
-
-    0209 :  Adding Multidimensional Arrays
-
-            Multidimensional arrays are currently an extension to
-            Python, rather than a built-in type.
-
-    0225 :  Elementwise/Objectwise Operators
-
-            A larger proposal that addresses the same subject, but
-            which proposes many more additions to the language.
+    Our objections to this are that there isn't enough demand to
+    justify the additional complexity (see Rawlings' comments [3]),
+    and that the proposed syntax fails the "low toner" readability
+    test.


 Acknowledgments

-    I am grateful to Huaiyu Zhu [8] for initiating this discussion,
-    and for some of the ideas and terminology included below.
+    I am grateful to Huaiyu Zhu for initiating this discussion, and to
+    James Rawlings and students in various Python courses for their
+    discussions of what numerical programmers really care about.


 References

-    [1] http://www.acm.org/sigapl/whyapl.htm
-    [2] http://numpy.sourceforge.net
-    [3] http://bevo.che.wisc.edu/octave/
-    [4] http://www.egroups.com/message/python-numeric/4
-    [5] http://www.python.org/pipermail/python-dev/2000-July/013139.html
-    [6] PEP-0201.txt "Lockstep Iteration"
-    [7] http://www.python.org/pipermail/python-list/2000-August/112529.html
-
-
-Appendix: NumPy
-
-    NumPy will overload "@" to perform mathematical multiplication of
-    arrays where shapes permit, and to throw an exception otherwise.
-    Its implementation of "@" will treat built-in sequence types as if
-    they were column vectors.  This takes care of the cases MM and MV.
-
-    An attribute "T" will be added to the NumPy array type, such that
-    "m.T" is:
-
-    (a) the transpose of "m" for a 2-dimensional array
-
-    (b) the 1xN matrix transpose of "m" if "m" is a 1-dimensional
-        array; or
-
-    (c) a runtime error for an array with rank >= 3.
-
-    This attribute will alias the memory of the base object.  NumPy's
-    "transpose()" function will be extended to turn built-in sequence
-    types into row vectors.  This takes care of the VM, VD, and VO
-    cases.  We propose an attribute because:
-
-    (a) the resulting notation is similar to the 'superscript T' (at
-        least, as similar as ASCII allows), and
-
-    (b) it signals that the transposition aliases the original object.
-
-    NumPy will define a value "inv", which will be recognized by the
-    exponentiation operator, such that "A ** inv" is the inverse of
-    "A".  This is similar in spirit to NumPy's existing "newaxis"
-    value.
+    [1] http://python.sourceforge.net/peps/pep-0225.html
+    [2] http://bevo.che.wisc.edu/octave/
+    [3] http://www.egroups.com/message/python-numeric/4
+    [4] http://python.sourceforge.net/peps/pep-0201.html
+    [5] http://mail.python.org/pipermail/python-dev/2000-July/006427.html