
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="color-scheme" content="light dark">
<title>PEP 465 &ndash; A dedicated infix operator for matrix multiplication | peps.python.org</title>
<link rel="shortcut icon" href="../_static/py.png">
<link rel="canonical" href="https://peps.python.org/pep-0465/">
<link rel="stylesheet" href="../_static/style.css" type="text/css">
<link rel="stylesheet" href="../_static/mq.css" type="text/css">
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light">
<link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark">
<link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss">
<meta property="og:title" content='PEP 465 &ndash; A dedicated infix operator for matrix multiplication | peps.python.org'>
<meta property="og:description" content="This PEP proposes a new binary operator to be used for matrix multiplication, called @. (Mnemonic: @ is * for mATrices.)">
<meta property="og:type" content="website">
<meta property="og:url" content="https://peps.python.org/pep-0465/">
<meta property="og:site_name" content="Python Enhancement Proposals (PEPs)">
<meta property="og:image" content="https://peps.python.org/_static/og-image.png">
<meta property="og:image:alt" content="Python PEPs">
<meta property="og:image:width" content="200">
<meta property="og:image:height" content="200">
<meta name="description" content="This PEP proposes a new binary operator to be used for matrix multiplication, called @. (Mnemonic: @ is * for mATrices.)">
<meta name="theme-color" content="#3776ab">
</head>
<body>
<svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
<symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all">
<title>Following system colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="9"></circle>
<path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path>
</svg>
</symbol>
<symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all">
<title>Selected dark colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path stroke="none" d="M0 0h24v24H0z" fill="none"></path>
<path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path>
</svg>
</symbol>
<symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all">
<title>Selected light colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="5"></circle>
<line x1="12" y1="1" x2="12" y2="3"></line>
<line x1="12" y1="21" x2="12" y2="23"></line>
<line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
<line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
<line x1="1" y1="12" x2="3" y2="12"></line>
<line x1="21" y1="12" x2="23" y2="12"></line>
<line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
<line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
</svg>
</symbol>
</svg>
<script>
document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto"
</script>
<section id="pep-page-section">
<header>
<h1>Python Enhancement Proposals</h1>
<ul class="breadcrumbs">
<li><a href="https://www.python.org/" title="The Python Programming Language">Python</a> &raquo; </li>
<li><a href="../pep-0000/">PEP Index</a> &raquo; </li>
<li>PEP 465</li>
</ul>
<button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())">
<svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg>
<span class="visually-hidden">Toggle light / dark / auto colour theme</span>
</button>
</header>
<article>
<section id="pep-content">
<h1 class="page-title">PEP 465 &ndash; A dedicated infix operator for matrix multiplication</h1>
<dl class="rfc2822 field-list simple">
<dt class="field-odd">Author<span class="colon">:</span></dt>
<dd class="field-odd">Nathaniel J. Smith &lt;njs&#32;&#97;t&#32;pobox.com&gt;</dd>
<dt class="field-even">Status<span class="colon">:</span></dt>
<dd class="field-even"><abbr title="Accepted and implementation complete, or no longer active">Final</abbr></dd>
<dt class="field-odd">Type<span class="colon">:</span></dt>
<dd class="field-odd"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd>
<dt class="field-even">Created<span class="colon">:</span></dt>
<dd class="field-even">20-Feb-2014</dd>
<dt class="field-odd">Python-Version<span class="colon">:</span></dt>
<dd class="field-odd">3.5</dd>
<dt class="field-even">Post-History<span class="colon">:</span></dt>
<dd class="field-even">13-Mar-2014</dd>
<dt class="field-odd">Resolution<span class="colon">:</span></dt>
<dd class="field-odd"><a class="reference external" href="https://mail.python.org/archives/list/python-dev&#64;python.org/message/D63NDWHPF7OC2Z455MPHOW6QLLSNQUJ5/">Python-Dev message</a></dd>
</dl>
<hr class="docutils" />
<section id="contents">
<details><summary>Table of Contents</summary><ul class="simple">
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#specification">Specification</a></li>
<li><a class="reference internal" href="#motivation">Motivation</a><ul>
<li><a class="reference internal" href="#executive-summary">Executive summary</a></li>
<li><a class="reference internal" href="#background-what-s-wrong-with-the-status-quo">Background: What's wrong with the status quo?</a></li>
<li><a class="reference internal" href="#why-should-matrix-multiplication-be-infix">Why should matrix multiplication be infix?</a></li>
<li><a class="reference internal" href="#transparent-syntax-is-especially-crucial-for-non-expert-programmers">Transparent syntax is especially crucial for non-expert programmers</a></li>
<li><a class="reference internal" href="#but-isn-t-matrix-multiplication-a-pretty-niche-requirement">But isn't matrix multiplication a pretty niche requirement?</a></li>
<li><a class="reference internal" href="#so-is-good-for-matrix-formulas-but-how-common-are-those-really">So <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is good for matrix formulas, but how common are those really?</a></li>
<li><a class="reference internal" href="#but-isn-t-it-weird-to-add-an-operator-with-no-stdlib-uses">But isn't it weird to add an operator with no stdlib uses?</a></li>
</ul>
</li>
<li><a class="reference internal" href="#compatibility-considerations">Compatibility considerations</a></li>
<li><a class="reference internal" href="#intended-usage-details">Intended usage details</a><ul>
<li><a class="reference internal" href="#semantics">Semantics</a></li>
<li><a class="reference internal" href="#adoption">Adoption</a></li>
</ul>
</li>
<li><a class="reference internal" href="#implementation-details">Implementation details</a></li>
<li><a class="reference internal" href="#rationale-for-specification-details">Rationale for specification details</a><ul>
<li><a class="reference internal" href="#choice-of-operator">Choice of operator</a></li>
<li><a class="reference internal" href="#precedence-and-associativity">Precedence and associativity</a></li>
<li><a class="reference internal" href="#non-definitions-for-built-in-types">(Non)-Definitions for built-in types</a></li>
<li><a class="reference internal" href="#non-definition-of-matrix-power">Non-definition of matrix power</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rejected-alternatives-to-adding-a-new-operator">Rejected alternatives to adding a new operator</a></li>
<li><a class="reference internal" href="#discussions-of-this-pep">Discussions of this PEP</a></li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</details></section>
<section id="abstract">
<h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2>
<p>This PEP proposes a new binary operator to be used for matrix
multiplication, called <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>. (Mnemonic: <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is <code class="docutils literal notranslate"><span class="pre">*</span></code> for
mATrices.)</p>
</section>
<section id="specification">
<h2><a class="toc-backref" href="#specification" role="doc-backlink">Specification</a></h2>
<p>A new binary operator is added to the Python language, together
with the corresponding in-place version:</p>
<table class="docutils align-default">
<thead>
<tr class="row-odd"><th class="head">Op</th>
<th class="head">Precedence/associativity</th>
<th class="head">Methods</th>
</tr>
</thead>
<tbody>
<tr class="row-even"><td><code class="docutils literal notranslate"><span class="pre">&#64;</span></code></td>
<td>Same as <code class="docutils literal notranslate"><span class="pre">*</span></code></td>
<td><code class="docutils literal notranslate"><span class="pre">__matmul__</span></code>, <code class="docutils literal notranslate"><span class="pre">__rmatmul__</span></code></td>
</tr>
<tr class="row-odd"><td><code class="docutils literal notranslate"><span class="pre">&#64;=</span></code></td>
<td>n/a</td>
<td><code class="docutils literal notranslate"><span class="pre">__imatmul__</span></code></td>
</tr>
</tbody>
</table>
<p>No implementations of these methods are added to the builtin or
standard library types. However, a number of projects have reached
consensus on the recommended semantics for these operations; see
<a class="reference internal" href="#intended-usage-details">Intended usage details</a> below for details.</p>
<p>For details on how this operator will be implemented in CPython, see
<a class="reference internal" href="#implementation-details">Implementation details</a>.</p>
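<p>As a rough illustration of how the three methods in the table are dispatched, here is a minimal sketch. The <code class="docutils literal notranslate"><span class="pre">Mat</span></code> class is hypothetical and exists only for this example; it stores a list of rows, does no shape checking, and is not the semantics recommended for real array libraries.</p>

```python
class Mat:
    """Hypothetical toy matrix type, for illustration only (no shape checks)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Called for `self @ other`.
        if not isinstance(other, Mat):
            return NotImplemented
        cols = list(zip(*other.rows))
        return Mat([[sum(a * b for a, b in zip(row, col)) for col in cols]
                    for row in self.rows])

    def __rmatmul__(self, other):
        # Called for `other @ self` when `other` does not handle `@` itself.
        # As an assumption of this sketch, a plain list of rows is accepted.
        if isinstance(other, list):
            return Mat(other) @ self
        return NotImplemented

    def __imatmul__(self, other):
        # Called for `self @= other`; this sketch updates `self` in place.
        self.rows = (self @ other).rows
        return self


a = Mat([[1, 2], [3, 4]])
b = Mat([[11, 12], [13, 14]])
product = a @ b                     # dispatches to a.__matmul__(b)
from_list = [[1, 0], [0, 1]] @ b    # list lacks __matmul__, so b.__rmatmul__ runs
a @= b                              # dispatches to a.__imatmul__(b)
```

<p>Returning <code class="docutils literal notranslate"><span class="pre">NotImplemented</span></code> for unsupported operands lets Python try the other operand's reflected method, as it does for the existing binary operators.</p>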
</section>
<section id="motivation">
<h2><a class="toc-backref" href="#motivation" role="doc-backlink">Motivation</a></h2>
<section id="executive-summary">
<h3><a class="toc-backref" href="#executive-summary" role="doc-backlink">Executive summary</a></h3>
<p>In numerical code, there are two important operations which compete
for use of Python's <code class="docutils literal notranslate"><span class="pre">*</span></code> operator: elementwise multiplication, and
matrix multiplication. In the nearly twenty years since the Numeric
library was first proposed, there have been many attempts to resolve
this tension <a class="footnote-reference brackets" href="#hugunin" id="id1">[13]</a>; none have been really satisfactory.
Currently, most numerical Python code uses <code class="docutils literal notranslate"><span class="pre">*</span></code> for elementwise
multiplication, and function/method syntax for matrix multiplication;
however, this leads to ugly and unreadable code in common
circumstances. The problem is bad enough that significant amounts of
code continue to use the opposite convention (which has the virtue of
producing ugly and unreadable code in <em>different</em> circumstances), and
this API fragmentation across codebases then creates yet more
problems. There does not seem to be any <em>good</em> solution to the
problem of designing a numerical API within current Python syntax,
only a landscape of options that are bad in different ways. The
minimal change to Python syntax which is sufficient to resolve these
problems is the addition of a single new infix operator for matrix
multiplication.</p>
<p>Matrix multiplication has a singular combination of features which
distinguish it from other binary operations, which together provide a
uniquely compelling case for the addition of a dedicated infix
operator:</p>
<ul class="simple">
<li>Just as for the existing numerical operators, there exists a vast
body of prior art supporting the use of infix notation for matrix
multiplication across all fields of mathematics, science, and
engineering; <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> harmoniously fills a hole in Python's existing
operator system.</li>
<li><code class="docutils literal notranslate"><span class="pre">&#64;</span></code> greatly clarifies real-world code.</li>
<li><code class="docutils literal notranslate"><span class="pre">&#64;</span></code> provides a smoother onramp for less experienced users, who are
particularly harmed by hard-to-read code and API fragmentation.</li>
<li><code class="docutils literal notranslate"><span class="pre">&#64;</span></code> benefits a substantial and growing portion of the Python user
community.</li>
<li><code class="docutils literal notranslate"><span class="pre">&#64;</span></code> will be used frequently; in fact, evidence suggests it may
be used more frequently than <code class="docutils literal notranslate"><span class="pre">//</span></code> or the bitwise operators.</li>
<li><code class="docutils literal notranslate"><span class="pre">&#64;</span></code> allows the Python numerical community to reduce fragmentation,
and finally standardize on a single consensus duck type for all
numerical array objects.</li>
</ul>
</section>
<section id="background-what-s-wrong-with-the-status-quo">
<h3><a class="toc-backref" href="#background-what-s-wrong-with-the-status-quo" role="doc-backlink">Background: What's wrong with the status quo?</a></h3>
<p>When we crunch numbers on a computer, we usually have lots and lots of
numbers to deal with. Trying to deal with them one at a time is
cumbersome and slow, especially when using an interpreted language.
Instead, we want the ability to write down simple operations that
apply to large collections of numbers all at once. The <em>n-dimensional
array</em> is the basic object that all popular numeric computing
environments use to make this possible. Python has several libraries
that provide such arrays, with numpy being at present the most
prominent.</p>
<p>When working with n-dimensional arrays, there are two different ways
we might want to define multiplication. One is elementwise
multiplication:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span>     <span class="p">[[</span><span class="mi">11</span><span class="p">,</span> <span class="mi">12</span><span class="p">],</span>     <span class="p">[[</span><span class="mi">1</span> <span class="o">*</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="mi">12</span><span class="p">],</span>
 <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]]</span>  <span class="n">x</span>   <span class="p">[</span><span class="mi">13</span><span class="p">,</span> <span class="mi">14</span><span class="p">]]</span>  <span class="o">=</span>   <span class="p">[</span><span class="mi">3</span> <span class="o">*</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">4</span> <span class="o">*</span> <span class="mi">14</span><span class="p">]]</span>
</pre></div>
</div>
<p>and the other is <a class="reference external" href="https://en.wikipedia.org/wiki/Matrix_multiplication">matrix multiplication</a>:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span>     <span class="p">[[</span><span class="mi">11</span><span class="p">,</span> <span class="mi">12</span><span class="p">],</span>     <span class="p">[[</span><span class="mi">1</span> <span class="o">*</span> <span class="mi">11</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">1</span> <span class="o">*</span> <span class="mi">12</span> <span class="o">+</span> <span class="mi">2</span> <span class="o">*</span> <span class="mi">14</span><span class="p">],</span>
 <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]]</span>  <span class="n">x</span>   <span class="p">[</span><span class="mi">13</span><span class="p">,</span> <span class="mi">14</span><span class="p">]]</span>  <span class="o">=</span>   <span class="p">[</span><span class="mi">3</span> <span class="o">*</span> <span class="mi">11</span> <span class="o">+</span> <span class="mi">4</span> <span class="o">*</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">3</span> <span class="o">*</span> <span class="mi">12</span> <span class="o">+</span> <span class="mi">4</span> <span class="o">*</span> <span class="mi">14</span><span class="p">]]</span>
</pre></div>
</div>
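<p>The two definitions can be worked through on the same pair of 2x2 inputs with a short pure-Python sketch. The function names <code class="docutils literal notranslate"><span class="pre">elementwise_multiply</span></code> and <code class="docutils literal notranslate"><span class="pre">matrix_multiply</span></code> are made up for this example, and nested lists stand in for real array objects.</p>

```python
def elementwise_multiply(a, b):
    """Multiply corresponding entries of two equally shaped 2-D lists."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def matrix_multiply(a, b):
    """Row-by-column product of two 2-D lists (inner dimensions must match)."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[11, 12], [13, 14]]
elementwise = elementwise_multiply(a, b)   # [[11, 24], [39, 56]]
matrix = matrix_multiply(a, b)             # [[37, 40], [85, 92]]
```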
<p>Elementwise multiplication is useful because it lets us easily and
quickly perform many multiplications on a large collection of values,
without writing a slow and cumbersome <code class="docutils literal notranslate"><span class="pre">for</span></code> loop. And this works as
part of a very general schema: when using the array objects provided
by numpy or other numerical libraries, all Python operators work
elementwise on arrays of all dimensionalities. The result is that one
can write functions using straightforward code like <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">*</span> <span class="pre">b</span> <span class="pre">+</span> <span class="pre">c</span> <span class="pre">/</span> <span class="pre">d</span></code>,
treating the variables as if they were simple values, but then
immediately use this function to efficiently perform this calculation
on large collections of values, while keeping them organized using
whatever arbitrarily complex array layout works best for the problem
at hand.</p>
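<p>A small sketch may make this schema concrete. The <code class="docutils literal notranslate"><span class="pre">Array1D</span></code> class below is a hypothetical, deliberately minimal stand-in for a real array type (one dimension, no broadcasting); the point is only that <code class="docutils literal notranslate"><span class="pre">combine</span></code> is written once and works for plain numbers and for arrays.</p>

```python
class Array1D:
    """Hypothetical minimal array type whose operators act elementwise."""

    def __init__(self, values):
        self.values = list(values)

    def _zip_with(self, other, op):
        # Apply `op` to each pair of corresponding elements.
        return Array1D(op(x, y) for x, y in zip(self.values, other.values))

    def __add__(self, other):
        return self._zip_with(other, lambda x, y: x + y)

    def __mul__(self, other):
        return self._zip_with(other, lambda x, y: x * y)

    def __truediv__(self, other):
        return self._zip_with(other, lambda x, y: x / y)


def combine(a, b, c, d):
    # Written as if the arguments were simple values.
    return a * b + c / d


scalar_result = combine(2.0, 3.0, 8.0, 4.0)                 # 8.0
array_result = combine(Array1D([1.0, 2.0]), Array1D([3.0, 4.0]),
                       Array1D([8.0, 9.0]), Array1D([4.0, 3.0]))
# array_result.values is [5.0, 11.0]
```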
<p>Matrix multiplication is more of a special case. It's only defined on
2d arrays (also known as “matrices”), and multiplication is the only
operation that has an important “matrix” version: “matrix addition”
is the same as elementwise addition; there is no such thing as “matrix
bitwise-or” or “matrix floordiv”; “matrix division” and “matrix
to-the-power-of” can be defined but are not very useful, etc.
However, matrix multiplication is still used very heavily across all
numerical application areas; mathematically, it's one of the most
fundamental operations there is.</p>
<p>Because Python syntax currently allows for only a single
multiplication operator <code class="docutils literal notranslate"><span class="pre">*</span></code>, libraries providing array-like objects
must decide: either use <code class="docutils literal notranslate"><span class="pre">*</span></code> for elementwise multiplication, or use
<code class="docutils literal notranslate"><span class="pre">*</span></code> for matrix multiplication. And, unfortunately, it turns out
that when doing general-purpose number crunching, both operations are
used frequently, and there are major advantages to using infix rather
than function call syntax in both cases. Thus it is not at all clear
which convention is optimal, or even acceptable; often it varies on a
case-by-case basis.</p>
<p>Nonetheless, network effects mean that it is very important that we
pick <em>just one</em> convention. In numpy, for example, it is technically
possible to switch between the conventions, because numpy provides two
different types with different <code class="docutils literal notranslate"><span class="pre">__mul__</span></code> methods. For
<code class="docutils literal notranslate"><span class="pre">numpy.ndarray</span></code> objects, <code class="docutils literal notranslate"><span class="pre">*</span></code> performs elementwise multiplication,
and matrix multiplication must use a function call (<code class="docutils literal notranslate"><span class="pre">numpy.dot</span></code>).
For <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> objects, <code class="docutils literal notranslate"><span class="pre">*</span></code> performs matrix multiplication,
and elementwise multiplication requires function syntax. Writing code
using <code class="docutils literal notranslate"><span class="pre">numpy.ndarray</span></code> works fine. Writing code using
<code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> also works fine. But trouble begins as soon as we
try to integrate these two pieces of code together. Code that expects
an <code class="docutils literal notranslate"><span class="pre">ndarray</span></code> and gets a <code class="docutils literal notranslate"><span class="pre">matrix</span></code>, or vice-versa, may crash or
return incorrect results. Keeping track of which functions expect
which types as inputs, and return which types as outputs, and then
converting back and forth all the time, is incredibly cumbersome and
impossible to get right at any scale. Functions that defensively try
to handle both types as input and do the right thing find themselves floundering
into a swamp of <code class="docutils literal notranslate"><span class="pre">isinstance</span></code> and <code class="docutils literal notranslate"><span class="pre">if</span></code> statements.</p>
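<p>The hazard can be reproduced without numpy. In the sketch below, <code class="docutils literal notranslate"><span class="pre">ElementwiseArray</span></code> and <code class="docutils literal notranslate"><span class="pre">MatrixArray</span></code> are hypothetical stand-ins for the two conventions (they are not the numpy types), and <code class="docutils literal notranslate"><span class="pre">square</span></code> is an invented helper. The same function body silently computes two different things depending on which type it receives.</p>

```python
class ElementwiseArray:
    """Hypothetical stand-in for the convention where * is elementwise."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, other):
        return ElementwiseArray(
            [[x * y for x, y in zip(ra, rb)]
             for ra, rb in zip(self.rows, other.rows)])


class MatrixArray:
    """Hypothetical stand-in for the convention where * is the matrix product."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, other):
        cols = list(zip(*other.rows))
        return MatrixArray(
            [[sum(x * y for x, y in zip(row, col)) for col in cols]
             for row in self.rows])


def square(x):
    # The author had one convention in mind, but nothing here says which.
    return x * x


data = [[1, 2], [3, 4]]
as_elementwise = square(ElementwiseArray(data)).rows   # [[1, 4], [9, 16]]
as_matrix = square(MatrixArray(data)).rows             # [[7, 10], [15, 22]]
```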
<p><a class="pep reference internal" href="../pep-0238/" title="PEP 238 &ndash; Changing the Division Operator">PEP 238</a> split <code class="docutils literal notranslate"><span class="pre">/</span></code> into two operators: <code class="docutils literal notranslate"><span class="pre">/</span></code> and <code class="docutils literal notranslate"><span class="pre">//</span></code>. Imagine the
chaos that would have resulted if it had instead split <code class="docutils literal notranslate"><span class="pre">int</span></code> into
two types: <code class="docutils literal notranslate"><span class="pre">classic_int</span></code>, whose <code class="docutils literal notranslate"><span class="pre">__div__</span></code> implemented floor
division, and <code class="docutils literal notranslate"><span class="pre">new_int</span></code>, whose <code class="docutils literal notranslate"><span class="pre">__div__</span></code> implemented true
division. This, in a more limited way, is the situation that Python
number-crunchers currently find themselves in.</p>
<p>In practice, the vast majority of projects have settled on the
convention of using <code class="docutils literal notranslate"><span class="pre">*</span></code> for elementwise multiplication, and function
call syntax for matrix multiplication (e.g., using <code class="docutils literal notranslate"><span class="pre">numpy.ndarray</span></code>
instead of <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code>). This reduces the problems caused by API
fragmentation, but it doesn't eliminate them. The strong desire to
use infix notation for matrix multiplication has caused a number of
specialized array libraries to continue to use the opposing convention
(e.g., scipy.sparse, pyoperators, pyviennacl) despite the problems
this causes, and <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> itself still gets used in
introductory programming courses, often appears in StackOverflow
answers, and so forth. Well-written libraries thus must continue to
be prepared to deal with both types of objects, and, of course, are
also stuck using unpleasant funcall syntax for matrix multiplication.
After nearly two decades of trying, the numerical community has still
not found any way to resolve these problems within the constraints of
current Python syntax (see <a class="reference internal" href="#rejected-alternatives-to-adding-a-new-operator">Rejected alternatives to adding a new
operator</a> below).</p>
<p>This PEP proposes the minimum effective change to Python syntax that
will allow us to drain this swamp. It splits <code class="docutils literal notranslate"><span class="pre">*</span></code> into two
operators, just as was done for <code class="docutils literal notranslate"><span class="pre">/</span></code>: <code class="docutils literal notranslate"><span class="pre">*</span></code> for elementwise
multiplication, and <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> for matrix multiplication. (Why not the
reverse? Because this way is compatible with the existing consensus,
and because it gives us a consistent rule that all the built-in
numeric operators also apply in an elementwise manner to arrays; the
reverse convention would lead to more special cases.)</p>
<p>So that's why matrix multiplication doesn't and can't just use <code class="docutils literal notranslate"><span class="pre">*</span></code>.
Now, in the rest of this section, we'll explain why it nonetheless
meets the high bar for adding a new operator.</p>
</section>
<section id="why-should-matrix-multiplication-be-infix">
<h3><a class="toc-backref" href="#why-should-matrix-multiplication-be-infix" role="doc-backlink">Why should matrix multiplication be infix?</a></h3>
<p>Right now, most numerical code in Python uses syntax like
<code class="docutils literal notranslate"><span class="pre">numpy.dot(a,</span> <span class="pre">b)</span></code> or <code class="docutils literal notranslate"><span class="pre">a.dot(b)</span></code> to perform matrix multiplication.
This obviously works, so why do people make such a fuss about it, even
to the point of creating API fragmentation and compatibility swamps?</p>
<p>Matrix multiplication shares two features with ordinary arithmetic
operations like addition and multiplication on numbers: (a) it is used
very heavily in numerical programs (often multiple times per line of
code), and (b) it has an ancient and universally adopted tradition of
being written using infix syntax. This is because, for typical
formulas, this notation is dramatically more readable than any
function call syntax. Here's an example to demonstrate:</p>
<p>One of the most useful tools for testing a statistical hypothesis is
the linear hypothesis test for OLS regression models. It doesn't
really matter what all those words I just said mean; if we find
ourselves having to implement this thing, what we'll do is look up
some textbook or paper on it, and encounter many mathematical formulas
that look like:</p>
<div class="formula">
<i>S</i> = (<i>H</i><i>β</i> &minus; <i>r</i>)<sup><i>T</i></sup>(<i>H</i><i>V</i><i>H</i><sup><i>T</i></sup>)<sup>&minus;1</sup>(<i>H</i><i>β</i> &minus; <i>r</i>)
</div>
<p>Here the various variables are all vectors or matrices (details for
the curious: <a class="footnote-reference brackets" href="#lht" id="id2">[5]</a>).</p>
<p>Now we need to write code to perform this calculation. In current
numpy, matrix multiplication can be performed using either the
function or method call syntax. Neither provides a particularly
readable translation of the formula:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
<span class="kn">from</span> <span class="nn">numpy.linalg</span> <span class="kn">import</span> <span class="n">inv</span><span class="p">,</span> <span class="n">solve</span>
<span class="c1"># Using dot function:</span>
<span class="n">S</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">((</span><span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="p">,</span> <span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span><span class="o">.</span><span class="n">T</span><span class="p">,</span>
           <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">inv</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="p">,</span> <span class="n">V</span><span class="p">),</span> <span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)),</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="p">,</span> <span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span><span class="p">))</span>
<span class="c1"># Using dot method:</span>
<span class="n">S</span> <span class="o">=</span> <span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span><span class="o">.</span><span class="n">T</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">inv</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">V</span><span class="p">)</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)))</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span>
</pre></div>
</div>
<p>With the <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> operator, the direct translation of the above formula
becomes:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">S</span> <span class="o">=</span> <span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">beta</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span><span class="o">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">inv</span><span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">V</span> <span class="o">@</span> <span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)</span> <span class="o">@</span> <span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">beta</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span>
</pre></div>
</div>
<p>Notice that there is now a transparent, 1-to-1 mapping between the
symbols in the original formula and the code that implements it.</p>
<p>Of course, an experienced programmer will probably notice that this is
not the best way to compute this expression. The repeated computation
of <span class="formula"><i>H</i><i>β</i> − <i>r</i></span> should perhaps be factored out; and,
expressions of the form <code class="docutils literal notranslate"><span class="pre">dot(inv(A),</span> <span class="pre">B)</span></code> should almost always be
replaced by the more numerically stable <code class="docutils literal notranslate"><span class="pre">solve(A,</span> <span class="pre">B)</span></code>. When using
<code class="docutils literal notranslate"><span class="pre">&#64;</span></code>, performing these two refactorings gives us:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="c1"># Version 1 (as above)</span>
<span class="n">S</span> <span class="o">=</span> <span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">beta</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span><span class="o">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">inv</span><span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">V</span> <span class="o">@</span> <span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)</span> <span class="o">@</span> <span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">beta</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span>
<span class="c1"># Version 2</span>
<span class="n">trans_coef</span> <span class="o">=</span> <span class="n">H</span> <span class="o">@</span> <span class="n">beta</span> <span class="o">-</span> <span class="n">r</span>
<span class="n">S</span> <span class="o">=</span> <span class="n">trans_coef</span><span class="o">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">inv</span><span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">V</span> <span class="o">@</span> <span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)</span> <span class="o">@</span> <span class="n">trans_coef</span>
<span class="c1"># Version 3</span>
<span class="n">S</span> <span class="o">=</span> <span class="n">trans_coef</span><span class="o">.</span><span class="n">T</span> <span class="o">@</span> <span class="n">solve</span><span class="p">(</span><span class="n">H</span> <span class="o">@</span> <span class="n">V</span> <span class="o">@</span> <span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">trans_coef</span><span class="p">)</span>
</pre></div>
</div>
<p>Notice that when comparing between each pair of steps, it’s very easy
to see exactly what was changed. If we apply the equivalent
transformations to the code using the .dot method, then the changes
are much harder to read out or verify for correctness:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="c1"># Version 1 (as above)</span>
<span class="n">S</span> <span class="o">=</span> <span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span><span class="o">.</span><span class="n">T</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">inv</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">V</span><span class="p">)</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)))</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span>
<span class="c1"># Version 2</span>
<span class="n">trans_coef</span> <span class="o">=</span> <span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">-</span> <span class="n">r</span>
<span class="n">S</span> <span class="o">=</span> <span class="n">trans_coef</span><span class="o">.</span><span class="n">T</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">inv</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">V</span><span class="p">)</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)))</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">trans_coef</span><span class="p">)</span>
<span class="c1"># Version 3</span>
<span class="n">S</span> <span class="o">=</span> <span class="n">trans_coef</span><span class="o">.</span><span class="n">T</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">solve</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">V</span><span class="p">)</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">H</span><span class="o">.</span><span class="n">T</span><span class="p">)),</span> <span class="n">trans_coef</span><span class="p">)</span>
</pre></div>
</div>
<p>Readability counts! The statements using <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> are shorter, contain
more whitespace, can be directly and easily compared both to each
other and to the textbook formula, and contain only meaningful
parentheses. This last point is particularly important for
readability: when using function-call syntax, the required parentheses
on every operation create visual clutter that makes it very difficult
to parse out the overall structure of the formula by eye, even for a
relatively simple formula like this one. Eyes are terrible at parsing
non-regular languages. I made and caught many errors while trying to
write out the dot formulas above. I know they still contain at
least one error, maybe more. (Exercise: find it. Or them.) The
<code class="docutils literal notranslate"><span class="pre">&#64;</span></code> examples, by contrast, are not only correct, they’re obviously
correct at a glance.</p>
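<p>The claim that the three <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> versions compute the same quantity can be checked numerically. The following is a minimal sketch, assuming numpy on a Python version that implements this PEP; the sizes <code class="docutils literal notranslate"><span class="pre">q</span></code> and <code class="docutils literal notranslate"><span class="pre">k</span></code>, the random inputs, and the way <code class="docutils literal notranslate"><span class="pre">V</span></code> is constructed are illustrative assumptions, not part of the original example:</p>

```python
import numpy as np
from numpy.linalg import inv, solve

rng = np.random.default_rng(0)

# Hypothetical sizes: q linear restrictions on k coefficients.
q, k = 3, 5
H = rng.standard_normal((q, k))
beta = rng.standard_normal(k)
r = rng.standard_normal(q)

# Build a symmetric positive-definite V so that H @ V @ H.T is invertible.
A = rng.standard_normal((k, k))
V = A @ A.T + k * np.eye(k)

# Version 1: direct translation of the formula.
S1 = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

# Version 2: factor out the repeated sub-expression.
trans_coef = H @ beta - r
S2 = trans_coef.T @ inv(H @ V @ H.T) @ trans_coef

# Version 3: replace inv(A) @ b with solve(A, b).
S3 = trans_coef.T @ solve(H @ V @ H.T, trans_coef)

print(np.allclose(S1, S2), np.allclose(S2, S3))
```

<p>All three results agree up to floating-point rounding, which is what the side-by-side comparison of the formulas suggests.</p>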
<p>If we are even more sophisticated programmers, and writing code that
we expect to be reused, then considerations of speed or numerical
accuracy might lead us to prefer some particular order of evaluation.
Because <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> makes it possible to omit irrelevant parentheses, we can
be certain that if we <em>do</em> write something like <code class="docutils literal notranslate"><span class="pre">(H</span> <span class="pre">&#64;</span> <span class="pre">V)</span> <span class="pre">&#64;</span> <span class="pre">H.T</span></code>,
then our readers will know that the parentheses must have been added
intentionally to accomplish some meaningful purpose. In the <code class="docutils literal notranslate"><span class="pre">dot</span></code>
examples, it’s impossible to know which nesting decisions are
important, and which are arbitrary.</p>
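<p>To see why the order of evaluation can matter for speed, here is a small sketch that counts scalar multiplications for the two ways of parenthesizing a three-factor product. The helper <code class="docutils literal notranslate"><span class="pre">matmul_cost</span></code>, the shapes, and the textbook cost model (n * m * p multiplications for an n x m by m x p product) are assumptions made for illustration; real libraries may use different algorithms:</p>

```python
def matmul_cost(shape_a, shape_b):
    """Return (multiplications, result shape) for a textbook 2d product."""
    n, m = shape_a
    m2, p = shape_b
    if m != m2:
        raise ValueError("inner dimensions do not match")
    return n * m * p, (n, p)


# Hypothetical shapes: two square matrices and a column vector.
A, B, v = (1000, 1000), (1000, 1000), (1000, 1)

# (A @ B) @ v: this is also how an unparenthesized A @ B @ v is evaluated.
cost_ab, shape_ab = matmul_cost(A, B)
cost_ab_v, _ = matmul_cost(shape_ab, v)
left_first = cost_ab + cost_ab_v

# A @ (B @ v): the explicit parentheses avoid the matrix-matrix product.
cost_bv, shape_bv = matmul_cost(B, v)
cost_a_bv, _ = matmul_cost(A, shape_bv)
right_first = cost_bv + cost_a_bv

print(left_first, right_first)
```

<p>Under this cost model the second grouping needs several hundred times fewer multiplications, so a reader who sees explicit parentheses around <code class="docutils literal notranslate"><span class="pre">B</span> <span class="pre">&#64;</span> <span class="pre">v</span></code> can reasonably infer they were added on purpose.</p>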
<p>Infix <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> dramatically improves matrix code usability at all stages
of programmer interaction.</p>
</section>
<section id="transparent-syntax-is-especially-crucial-for-non-expert-programmers">
<h3><a class="toc-backref" href="#transparent-syntax-is-especially-crucial-for-non-expert-programmers" role="doc-backlink">Transparent syntax is especially crucial for non-expert programmers</a></h3>
<p>A large proportion of scientific code is written by people who are
experts in their domain, but are not experts in programming. And
there are many university courses run each year with titles like “Data
analysis for social scientists” which assume no programming
background, and teach some combination of mathematical techniques,
introduction to programming, and the use of programming to implement
these mathematical techniques, all within a 10-15 week period. These
courses are more and more often being taught in Python rather than
special-purpose languages like R or Matlab.</p>
<p>For these kinds of users, whose programming knowledge is fragile, the
existence of a transparent mapping between formulas and code often
means the difference between succeeding and failing to write that code
at all. This is so important that such classes often use the
<code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> type which defines <code class="docutils literal notranslate"><span class="pre">*</span></code> to mean matrix
multiplication, even though this type is buggy and heavily
disrecommended by the rest of the numpy community for the
fragmentation that it causes. This pedagogical use case is, in fact,
the <em>only</em> reason <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> remains a supported part of numpy.
Adding <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> will benefit both beginning and advanced users with
better syntax; and furthermore, it will allow both groups to
standardize on the same notation from the start, providing a smoother
on-ramp to expertise.</p>
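<p>The notational split described above comes down to what <code class="docutils literal notranslate"><span class="pre">*</span></code> means. As a minimal sketch, assuming numpy's ordinary array type on a Python version that implements this PEP (the matrices are arbitrary illustrative values), elementwise and matrix multiplication can sit side by side on the same objects:</p>

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# On ordinary arrays, * multiplies matching entries.
elementwise = A * B

# @ is the matrix product: rows of A combined with columns of B.
matrix_product = A @ B

print(elementwise)
print(matrix_product)
```

<p>Here <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">&#64;</span> <span class="pre">B</span></code> gives the same result as <code class="docutils literal notranslate"><span class="pre">A.dot(B)</span></code>, so beginners and experts can use one array type and still write formulas in infix form.</p>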
</section>
<section id="but-isn-t-matrix-multiplication-a-pretty-niche-requirement">
<h3><a class="toc-backref" href="#but-isn-t-matrix-multiplication-a-pretty-niche-requirement" role="doc-backlink">But isn’t matrix multiplication a pretty niche requirement?</a></h3>
<p>The world is full of continuous data, and computers are increasingly
called upon to work with it in sophisticated ways. Arrays are the
lingua franca of finance, machine learning, 3d graphics, computer
vision, robotics, operations research, econometrics, meteorology,
computational linguistics, recommendation systems, neuroscience,
astronomy, bioinformatics (including genetics, cancer research, drug
discovery, etc.), physics engines, quantum mechanics, geophysics,
network analysis, and many other application areas. In most or all of
these areas, Python is rapidly becoming a dominant player, in large
part because of its ability to elegantly mix traditional discrete data
structures (hash tables, strings, etc.) on an equal footing with
modern numerical data types and algorithms.</p>
<p>We all live in our own little sub-communities, so some Python users
may be surprised to realize the sheer extent to which Python is used
for number crunching, especially since much of this particular
sub-community’s activity occurs outside of traditional Python/FOSS
channels. So, to give some rough idea of just how many numerical
Python programmers are actually out there, here are two numbers: In
2013, there were 7 international conferences organized specifically on
numerical Python <a class="footnote-reference brackets" href="#scipy-conf" id="id3">[3]</a> <a class="footnote-reference brackets" href="#pydata-conf" id="id4">[4]</a>. At PyCon 2014, ~20%
of the tutorials appear to involve the use of matrices
<a class="footnote-reference brackets" href="#pycon-tutorials" id="id5">[6]</a>.</p>
<p>To quantify this further, we used Github’s “search” function to look
at what modules are actually imported across a wide range of
real-world code (i.e., all the code on Github). We checked for
imports of several popular stdlib modules, a variety of numerically
oriented modules, and various other extremely high-profile modules
like django and lxml (the latter of which is the #1 most downloaded
package on PyPI). Starred lines indicate packages which export
array- or matrix-like objects which will adopt <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> if this PEP is
approved:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">Count</span> <span class="n">of</span> <span class="n">Python</span> <span class="n">source</span> <span class="n">files</span> <span class="n">on</span> <span class="n">Github</span> <span class="n">matching</span> <span class="n">given</span> <span class="n">search</span> <span class="n">terms</span>
<span class="p">(</span><span class="k">as</span> <span class="n">of</span> <span class="mi">2014</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">10</span><span class="p">,</span> <span class="o">~</span><span class="mi">21</span><span class="p">:</span><span class="mi">00</span> <span class="n">UTC</span><span class="p">)</span>
<span class="o">================</span> <span class="o">==========</span> <span class="o">===============</span> <span class="o">=======</span> <span class="o">===========</span>
<span class="n">module</span> <span class="s2">&quot;import X&quot;</span> <span class="s2">&quot;from X import&quot;</span> <span class="n">total</span> <span class="n">total</span><span class="o">/</span><span class="n">numpy</span>
<span class="o">================</span> <span class="o">==========</span> <span class="o">===============</span> <span class="o">=======</span> <span class="o">===========</span>
<span class="n">sys</span> <span class="mi">2374638</span> <span class="mi">63301</span> <span class="mi">2437939</span> <span class="mf">5.85</span>
<span class="n">os</span> <span class="mi">1971515</span> <span class="mi">37571</span> <span class="mi">2009086</span> <span class="mf">4.82</span>
<span class="n">re</span> <span class="mi">1294651</span> <span class="mi">8358</span> <span class="mi">1303009</span> <span class="mf">3.12</span>
<span class="n">numpy</span> <span class="o">**************</span> <span class="mi">337916</span> <span class="o">**********</span> <span class="mi">79065</span> <span class="o">*</span> <span class="mi">416981</span> <span class="o">*******</span> <span class="mf">1.00</span>
<span class="n">warnings</span> <span class="mi">298195</span> <span class="mi">73150</span> <span class="mi">371345</span> <span class="mf">0.89</span>
<span class="n">subprocess</span> <span class="mi">281290</span> <span class="mi">63644</span> <span class="mi">344934</span> <span class="mf">0.83</span>
<span class="n">django</span> <span class="mi">62795</span> <span class="mi">219302</span> <span class="mi">282097</span> <span class="mf">0.68</span>
<span class="n">math</span> <span class="mi">200084</span> <span class="mi">81903</span> <span class="mi">281987</span> <span class="mf">0.68</span>
<span class="n">threading</span> <span class="mi">212302</span> <span class="mi">45423</span> <span class="mi">257725</span> <span class="mf">0.62</span>
<span class="n">pickle</span><span class="o">+</span><span class="n">cPickle</span> <span class="mi">215349</span> <span class="mi">22672</span> <span class="mi">238021</span> <span class="mf">0.57</span>
<span class="n">matplotlib</span> <span class="mi">119054</span> <span class="mi">27859</span> <span class="mi">146913</span> <span class="mf">0.35</span>
<span class="n">sqlalchemy</span> <span class="mi">29842</span> <span class="mi">82850</span> <span class="mi">112692</span> <span class="mf">0.27</span>
<span class="n">pylab</span> <span class="o">***************</span> <span class="mi">36754</span> <span class="o">**********</span> <span class="mi">41063</span> <span class="o">**</span> <span class="mi">77817</span> <span class="o">*******</span> <span class="mf">0.19</span>
<span class="n">scipy</span> <span class="o">***************</span> <span class="mi">40829</span> <span class="o">**********</span> <span class="mi">28263</span> <span class="o">**</span> <span class="mi">69092</span> <span class="o">*******</span> <span class="mf">0.17</span>
<span class="n">lxml</span> <span class="mi">19026</span> <span class="mi">38061</span> <span class="mi">57087</span> <span class="mf">0.14</span>
<span class="n">zlib</span> <span class="mi">40486</span> <span class="mi">6623</span> <span class="mi">47109</span> <span class="mf">0.11</span>
<span class="n">multiprocessing</span> <span class="mi">25247</span> <span class="mi">19850</span> <span class="mi">45097</span> <span class="mf">0.11</span>
<span class="n">requests</span> <span class="mi">30896</span> <span class="mi">560</span> <span class="mi">31456</span> <span class="mf">0.08</span>
<span class="n">jinja2</span> <span class="mi">8057</span> <span class="mi">24047</span> <span class="mi">32104</span> <span class="mf">0.08</span>
<span class="n">twisted</span> <span class="mi">13858</span> <span class="mi">6404</span> <span class="mi">20262</span> <span class="mf">0.05</span>
<span class="n">gevent</span> <span class="mi">11309</span> <span class="mi">8529</span> <span class="mi">19838</span> <span class="mf">0.05</span>
<span class="n">pandas</span> <span class="o">**************</span> <span class="mi">14923</span> <span class="o">***********</span> <span class="mi">4005</span> <span class="o">**</span> <span class="mi">18928</span> <span class="o">*******</span> <span class="mf">0.05</span>
<span class="n">sympy</span> <span class="mi">2779</span> <span class="mi">9537</span> <span class="mi">12316</span> <span class="mf">0.03</span>
<span class="n">theano</span> <span class="o">***************</span> <span class="mi">3654</span> <span class="o">***********</span> <span class="mi">1828</span> <span class="o">***</span> <span class="mi">5482</span> <span class="o">*******</span> <span class="mf">0.01</span>
<span class="o">================</span> <span class="o">==========</span> <span class="o">===============</span> <span class="o">=======</span> <span class="o">===========</span>
</pre></div>
</div>
<p>These numbers should be taken with several grains of salt (see
footnote for discussion: <a class="footnote-reference brackets" href="#github-details" id="id6">[12]</a>), but, to the extent they
can be trusted, they suggest that <code class="docutils literal notranslate"><span class="pre">numpy</span></code> might be the single
most-imported non-stdlib module in the entire Pythonverse; it’s even
more-imported than such stdlib stalwarts as <code class="docutils literal notranslate"><span class="pre">subprocess</span></code>, <code class="docutils literal notranslate"><span class="pre">math</span></code>,
<code class="docutils literal notranslate"><span class="pre">pickle</span></code>, and <code class="docutils literal notranslate"><span class="pre">threading</span></code>. And numpy users represent only a
subset of the broader numerical community that will benefit from the
<code class="docutils literal notranslate"><span class="pre">&#64;</span></code> operator. Matrices may once have been a niche data type
restricted to Fortran programs running in university labs and military
clusters, but those days are long gone. Number crunching is a
mainstream part of modern Python usage.</p>
<p>In addition, there is some precedent for adding an infix operator to
handle a more-specialized arithmetic operation: the floor division
operator <code class="docutils literal notranslate"><span class="pre">//</span></code>, like the bitwise operators, is very useful under
certain circumstances when performing exact calculations on discrete
values. But it seems likely that there are many Python programmers
who have never had reason to use <code class="docutils literal notranslate"><span class="pre">//</span></code> (or, for that matter, the
bitwise operators). <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is no more niche than <code class="docutils literal notranslate"><span class="pre">//</span></code>.</p>
</section>
<section id="so-is-good-for-matrix-formulas-but-how-common-are-those-really">
<h3><a class="toc-backref" href="#so-is-good-for-matrix-formulas-but-how-common-are-those-really" role="doc-backlink">So <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is good for matrix formulas, but how common are those really?</a></h3>
<p>We’ve seen that <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> makes matrix formulas dramatically easier to
work with for both experts and non-experts, that matrix formulas
appear in many important applications, and that numerical libraries
like numpy are used by a substantial proportion of Python’s user base.
But numerical libraries aren’t just about matrix formulas, and being
important doesn’t necessarily mean taking up a lot of code: if matrix
formulas only occurred in one or two places in the average
numerically-oriented project, then it still wouldn’t be worth adding a
new operator. So how common is matrix multiplication, really?</p>
<p>When the going gets tough, the tough get empirical. To get a rough
estimate of how useful the <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> operator will be, the table below
shows the rate at which different Python operators are actually used
in the stdlib, and also in two high-profile numerical packages (the
scikit-learn machine learning library and the nipy neuroimaging
library), normalized by source lines of code (SLOC). Rows are sorted
by the “combined” column, which pools all three code bases together.
The combined column is thus strongly weighted towards the stdlib,
which is much larger than both projects put together (stdlib: 411575
SLOC, scikit-learn: 50924 SLOC, nipy: 37078 SLOC). <a class="footnote-reference brackets" href="#sloc-details" id="id7">[7]</a></p>
<p>The <code class="docutils literal notranslate"><span class="pre">dot</span></code> row (marked <code class="docutils literal notranslate"><span class="pre">******</span></code>) counts how common matrix multiply
operations are in each codebase.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="o">====</span> <span class="o">======</span> <span class="o">============</span> <span class="o">====</span> <span class="o">========</span>
<span class="n">op</span> <span class="n">stdlib</span> <span class="n">scikit</span><span class="o">-</span><span class="n">learn</span> <span class="n">nipy</span> <span class="n">combined</span>
<span class="o">====</span> <span class="o">======</span> <span class="o">============</span> <span class="o">====</span> <span class="o">========</span>
<span class="o">=</span> <span class="mi">2969</span> <span class="mi">5536</span> <span class="mi">4932</span> <span class="mi">3376</span> <span class="o">/</span> <span class="mi">10</span><span class="p">,</span><span class="mi">000</span> <span class="n">SLOC</span>
<span class="o">-</span> <span class="mi">218</span> <span class="mi">444</span> <span class="mi">496</span> <span class="mi">261</span>
<span class="o">+</span> <span class="mi">224</span> <span class="mi">201</span> <span class="mi">348</span> <span class="mi">231</span>
<span class="o">==</span> <span class="mi">177</span> <span class="mi">248</span> <span class="mi">334</span> <span class="mi">196</span>
<span class="o">*</span> <span class="mi">156</span> <span class="mi">284</span> <span class="mi">465</span> <span class="mi">192</span>
<span class="o">%</span> <span class="mi">121</span> <span class="mi">114</span> <span class="mi">107</span> <span class="mi">119</span>
<span class="o">**</span> <span class="mi">59</span> <span class="mi">111</span> <span class="mi">118</span> <span class="mi">68</span>
<span class="o">!=</span> <span class="mi">40</span> <span class="mi">56</span> <span class="mi">74</span> <span class="mi">44</span>
<span class="o">/</span> <span class="mi">18</span> <span class="mi">121</span> <span class="mi">183</span> <span class="mi">41</span>
<span class="o">&gt;</span> <span class="mi">29</span> <span class="mi">70</span> <span class="mi">110</span> <span class="mi">39</span>
<span class="o">+=</span> <span class="mi">34</span> <span class="mi">61</span> <span class="mi">67</span> <span class="mi">39</span>
<span class="o">&lt;</span> <span class="mi">32</span> <span class="mi">62</span> <span class="mi">76</span> <span class="mi">38</span>
<span class="o">&gt;=</span> <span class="mi">19</span> <span class="mi">17</span> <span class="mi">17</span> <span class="mi">18</span>
<span class="o">&lt;=</span> <span class="mi">18</span> <span class="mi">27</span> <span class="mi">12</span> <span class="mi">18</span>
<span class="n">dot</span> <span class="o">*****</span> <span class="mi">0</span> <span class="o">**********</span> <span class="mi">99</span> <span class="o">**</span> <span class="mi">74</span> <span class="o">******</span> <span class="mi">16</span>
<span class="o">|</span> <span class="mi">18</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">15</span>
<span class="o">&amp;</span> <span class="mi">14</span> <span class="mi">0</span> <span class="mi">6</span> <span class="mi">12</span>
<span class="o">&lt;&lt;</span> <span class="mi">10</span> <span class="mi">1</span> <span class="mi">1</span> <span class="mi">8</span>
<span class="o">//</span> <span class="mi">9</span> <span class="mi">9</span> <span class="mi">1</span> <span class="mi">8</span>
<span class="o">-=</span> <span class="mi">5</span> <span class="mi">21</span> <span class="mi">14</span> <span class="mi">8</span>
<span class="o">*=</span> <span class="mi">2</span> <span class="mi">19</span> <span class="mi">22</span> <span class="mi">5</span>
<span class="o">/=</span> <span class="mi">0</span> <span class="mi">23</span> <span class="mi">16</span> <span class="mi">4</span>
<span class="o">&gt;&gt;</span> <span class="mi">4</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">3</span>
<span class="o">^</span> <span class="mi">3</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">3</span>
<span class="o">~</span> <span class="mi">2</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">2</span>
<span class="o">|=</span> <span class="mi">3</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">2</span>
<span class="o">&amp;=</span> <span class="mi">1</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">1</span>
<span class="o">//=</span> <span class="mi">1</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">1</span>
<span class="o">^=</span> <span class="mi">1</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span>
<span class="o">**=</span> <span class="mi">0</span> <span class="mi">2</span> <span class="mi">0</span> <span class="mi">0</span>
<span class="o">%=</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span>
<span class="o">&lt;&lt;=</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span>
<span class="o">&gt;&gt;=</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mi">0</span>
<span class="o">====</span> <span class="o">======</span> <span class="o">============</span> <span class="o">====</span> <span class="o">========</span>
</pre></div>
</div>
<p>These two numerical packages alone contain ~780 uses of matrix
multiplication. Within these packages, matrix multiplication is used
more heavily than most comparison operators (<code class="docutils literal notranslate"><span class="pre">&lt;</span></code> <code class="docutils literal notranslate"><span class="pre">!=</span></code> <code class="docutils literal notranslate"><span class="pre">&lt;=</span></code>
<code class="docutils literal notranslate"><span class="pre">&gt;=</span></code>). Even when we dilute these counts by including the stdlib
into our comparisons, matrix multiplication is still used more often
in total than any of the bitwise operators, and 2x as often as <code class="docutils literal notranslate"><span class="pre">//</span></code>.
This is true even though the stdlib, which contains a fair amount of
integer arithmetic and no matrix operations, makes up more than 80% of
the combined code base.</p>
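<p>The figures quoted in this paragraph can be roughly reproduced from the table and the SLOC totals given above. The sketch below assumes the “combined” column is a SLOC-weighted pooling of the three per-codebase rates; because it starts from the rounded rates in the table, the results are approximate:</p>

```python
# SLOC totals and per-10,000-SLOC rates copied from the text and table above.
sloc = {"stdlib": 411575, "scikit-learn": 50924, "nipy": 37078}
rates = {
    "dot": {"stdlib": 0, "scikit-learn": 99, "nipy": 74},
    "//": {"stdlib": 9, "scikit-learn": 9, "nipy": 1},
}

total_sloc = sum(sloc.values())


def estimated_uses(op, names):
    """Convert per-10,000-SLOC rates back into approximate raw counts."""
    return sum(rates[op][name] * sloc[name] / 10_000 for name in names)


def combined_rate(op):
    """Pool all code bases: uses per 10,000 SLOC over the combined code."""
    return estimated_uses(op, sloc) / total_sloc * 10_000


print(round(combined_rate("dot")))                             # dot, combined
print(round(combined_rate("//")))                              # //, combined
print(round(estimated_uses("dot", ["scikit-learn", "nipy"])))  # raw dot uses
print(round(sloc["stdlib"] / total_sloc, 2))                   # stdlib share
```

<p>The pooled rates come out at about 16 and 8 per 10,000 SLOC, the two numerical packages account for just under 780 estimated uses, and the stdlib share of the combined code base is about 0.82, in line with the statements above.</p>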
<p>By coincidence, the numeric libraries make up approximately the same
proportion of the combined codebase as numeric tutorials make up of
PyCon 2014’s tutorial schedule, which suggests that the combined
column may not be <em>wildly</em> unrepresentative of new Python code in
general. While it’s impossible to know for certain, from this data it
seems entirely possible that across all Python code currently being
written, matrix multiplication is already used more often than <code class="docutils literal notranslate"><span class="pre">//</span></code>
and the bitwise operations.</p>
</section>
<section id="but-isn-t-it-weird-to-add-an-operator-with-no-stdlib-uses">
<h3><a class="toc-backref" href="#but-isn-t-it-weird-to-add-an-operator-with-no-stdlib-uses" role="doc-backlink">But isn’t it weird to add an operator with no stdlib uses?</a></h3>
<p>It’s certainly unusual (though extended slicing existed for some time
before builtin types gained support for it, <code class="docutils literal notranslate"><span class="pre">Ellipsis</span></code> is still unused
within the stdlib, etc.). But the important thing is whether a change
will benefit users, not where the software is being downloaded from.
It’s clear from the above that <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> will be used, and used heavily.
And this PEP provides the critical piece that will allow the Python
numerical community to finally reach consensus on a standard duck type
for all array-like objects, which is a necessary precondition to ever
adding a numerical array type to the stdlib.</p>
</section>
</section>
<section id="compatibility-considerations">
<h2><a class="toc-backref" href="#compatibility-considerations" role="doc-backlink">Compatibility considerations</a></h2>
<p>Currently, the only legal use of the <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> token in Python code is at
statement beginning in decorators. The new operators are both infix;
the one place they can never occur is at statement beginning.
Therefore, no existing code will be broken by the addition of these
operators, and there is no possible parsing ambiguity between
decorator-&#64; and the new operators.</p>
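<p>The absence of ambiguity can be observed with the standard <code class="docutils literal notranslate"><span class="pre">ast</span></code> module. This is a small sketch, assuming an interpreter that implements this PEP (Python 3.5 or later); the names in the parsed snippet are placeholders and are never executed:</p>

```python
import ast

# One decorator use and both infix forms in the same piece of source.
source = '''
@staticmethod
def identity(x):
    return x

y = a @ b
a @= b
'''

tree = ast.parse(source)
func, assign, augassign = tree.body

# At statement beginning, @ introduces a decorator expression.
print(type(func.decorator_list[0]).__name__)

# Between two operands, @ and @= are parsed as matrix multiplication.
print(type(assign.value.op).__name__)
print(type(augassign.op).__name__)
```

<p>The parser decides between the two roles purely by position, so existing decorator code parses exactly as before.</p>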
<p>Another important kind of compatibility is the mental cost paid by
users to update their understanding of the Python language after this
change, particularly for users who do not work with matrices and thus
do not benefit. Here again, <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> has minimal impact: even
comprehensive tutorials and references will only need to add a
sentence or two to fully document this PEP’s changes for a
non-numerical audience.</p>
</section>
<section id="intended-usage-details">
<h2><a class="toc-backref" href="#intended-usage-details" role="doc-backlink">Intended usage details</a></h2>
<p>This section is informative, rather than normative; it documents the
consensus of a number of libraries that provide array- or matrix-like
objects on how <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> will be implemented.</p>
<p>This section uses the numpy terminology for describing arbitrary
multidimensional arrays of data, because it is a superset of all other
commonly used models. In this model, the <em>shape</em> of any array is
represented by a tuple of integers. Because matrices are
two-dimensional, they have len(shape) == 2, while 1d vectors have
len(shape) == 1, and scalars have shape == (), i.e., they are “0
dimensional”. Any array contains prod(shape) total entries. Notice
that <a class="reference external" href="https://en.wikipedia.org/wiki/Empty_product">prod(()) == 1</a> (for the same reason that sum(()) == 0); scalars
are just an ordinary kind of array, not a special case. Notice also
that we distinguish between a single scalar value (shape == (),
analogous to <code class="docutils literal notranslate"><span class="pre">1</span></code>), a vector containing only a single entry (shape ==
(1,), analogous to <code class="docutils literal notranslate"><span class="pre">[1]</span></code>), a matrix containing only a single entry
(shape == (1, 1), analogous to <code class="docutils literal notranslate"><span class="pre">[[1]]</span></code>), etc., so the dimensionality
of any array is always well-defined. Other libraries with more
restricted representations (e.g., those that support 2d arrays only)
might implement only a subset of the functionality described here.</p>
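<p>The shape conventions above can be sketched in plain Python, with nested lists standing in for arrays. The helper <code class="docutils literal notranslate"><span class="pre">shape_of</span></code> is a hypothetical name introduced only for this illustration, and it assumes regularly nested (non-ragged) lists; <code class="docutils literal notranslate"><span class="pre">math.prod</span></code> requires Python 3.8 or later.</p>

```python
import math


def shape_of(obj):
    """Return the shape of a regularly nested list; a bare number has shape ()."""
    shape = []
    while isinstance(obj, list):
        shape.append(len(obj))
        # Descend into the first entry; assumes all entries have the same shape.
        obj = obj[0] if obj else None
    return tuple(shape)


scalar_shape = shape_of(1)        # scalar: 0 dimensions
vector_shape = shape_of([1])      # vector with a single entry
matrix_shape = shape_of([[1]])    # matrix with a single entry

# The number of entries is the product of the shape, and the empty
# product is 1, so a scalar holds exactly one entry.
entry_counts = [math.prod(s) for s in (scalar_shape, vector_shape, matrix_shape)]

print(scalar_shape, vector_shape, matrix_shape, entry_counts)
```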
<section id="semantics">
<h3><a class="toc-backref" href="#semantics" role="doc-backlink">Semantics</a></h3>
<p>The recommended semantics for <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> for different inputs are:</p>
<ul>
<li>2d inputs are conventional matrices, and so the semantics are
obvious: we apply conventional matrix multiplication. If we write
<code class="docutils literal notranslate"><span class="pre">arr(2,</span> <span class="pre">3)</span></code> to represent an arbitrary 2x3 array, then <code class="docutils literal notranslate"><span class="pre">arr(2,</span> <span class="pre">3)</span>
<span class="pre">&#64;</span> <span class="pre">arr(3,</span> <span class="pre">4)</span></code> returns an array with shape (2, 4).</li>
<li>1d vector inputs are promoted to 2d by prepending or appending a 1
to the shape, the operation is performed, and then the added
dimension is removed from the output. The 1 is always added on the
“outside” of the shape: prepended for left arguments, and appended
for right arguments. The result is that matrix &#64; vector and vector
&#64; matrix are both legal (assuming compatible shapes), and both
return 1d vectors; vector &#64; vector returns a scalar. This is
clearer with examples.<ul class="simple">
<li><code class="docutils literal notranslate"><span class="pre">arr(2,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3,</span> <span class="pre">1)</span></code> is a regular matrix product, and returns
an array with shape (2, 1), i.e., a column vector.</li>
<li><code class="docutils literal notranslate"><span class="pre">arr(2,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3)</span></code> performs the same computation as the
previous (i.e., treats the 1d vector as a matrix containing a
single <em>column</em>, shape = (3, 1)), but returns the result with
shape (2,), i.e., a 1d vector.</li>
<li><code class="docutils literal notranslate"><span class="pre">arr(1,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3,</span> <span class="pre">2)</span></code> is a regular matrix product, and returns
an array with shape (1, 2), i.e., a row vector.</li>
<li><code class="docutils literal notranslate"><span class="pre">arr(3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3,</span> <span class="pre">2)</span></code> performs the same computation as the
previous (i.e., treats the 1d vector as a matrix containing a
single <em>row</em>, shape = (1, 3)), but returns the result with shape
(2,), i.e., a 1d vector.</li>
<li><code class="docutils literal notranslate"><span class="pre">arr(1,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3,</span> <span class="pre">1)</span></code> is a regular matrix product, and returns
an array with shape (1, 1), i.e., a single value in matrix form.</li>
<li><code class="docutils literal notranslate"><span class="pre">arr(3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3)</span></code> performs the same computation as the
previous, but returns the result with shape (), i.e., a single
scalar value, not in matrix form. So this is the standard inner
product on vectors.</li>
</ul>
<p>An infelicity of this definition for 1d vectors is that it makes
<code class="docutils literal notranslate"><span class="pre">&#64;</span></code> non-associative in some cases (<code class="docutils literal notranslate"><span class="pre">(Mat1</span> <span class="pre">&#64;</span> <span class="pre">vec)</span> <span class="pre">&#64;</span> <span class="pre">Mat2</span></code> !=
<code class="docutils literal notranslate"><span class="pre">Mat1</span> <span class="pre">&#64;</span> <span class="pre">(vec</span> <span class="pre">&#64;</span> <span class="pre">Mat2)</span></code>). But this seems to be a case where
practicality beats purity: non-associativity only arises for strange
expressions that would never be written in practice; if they are
written anyway then there is a consistent rule for understanding
what will happen (<code class="docutils literal notranslate"><span class="pre">Mat1</span> <span class="pre">&#64;</span> <span class="pre">vec</span> <span class="pre">&#64;</span> <span class="pre">Mat2</span></code> is parsed as <code class="docutils literal notranslate"><span class="pre">(Mat1</span> <span class="pre">&#64;</span> <span class="pre">vec)</span>
<span class="pre">&#64;</span> <span class="pre">Mat2</span></code>, just like <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">-</span> <span class="pre">c</span></code>); and, not supporting 1d vectors
would rule out many important use cases that do arise very commonly
in practice. No-one wants to explain to new users why, to solve the
simplest linear system in the obvious way, they have to type
<code class="docutils literal notranslate"><span class="pre">(inv(A)</span> <span class="pre">&#64;</span> <span class="pre">b[:,</span> <span class="pre">np.newaxis]).flatten()</span></code> instead of <code class="docutils literal notranslate"><span class="pre">inv(A)</span> <span class="pre">&#64;</span> <span class="pre">b</span></code>,
or perform an ordinary least-squares regression by typing
<code class="docutils literal notranslate"><span class="pre">solve(X.T</span> <span class="pre">&#64;</span> <span class="pre">X,</span> <span class="pre">X.T</span> <span class="pre">&#64;</span> <span class="pre">y[:,</span> <span class="pre">np.newaxis]).flatten()</span></code> instead of
<code class="docutils literal notranslate"><span class="pre">solve(X.T</span> <span class="pre">&#64;</span> <span class="pre">X,</span> <span class="pre">X.T</span> <span class="pre">&#64;</span> <span class="pre">y)</span></code>. No-one wants to type <code class="docutils literal notranslate"><span class="pre">(a[np.newaxis,</span> <span class="pre">:]</span>
<span class="pre">&#64;</span> <span class="pre">b[:,</span> <span class="pre">np.newaxis])[0,</span> <span class="pre">0]</span></code> instead of <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">&#64;</span> <span class="pre">b</span></code> every time they
compute an inner product, or <code class="docutils literal notranslate"><span class="pre">(a[np.newaxis,</span> <span class="pre">:]</span> <span class="pre">&#64;</span> <span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">b[:,</span>
<span class="pre">np.newaxis])[0,</span> <span class="pre">0]</span></code> for general quadratic forms instead of <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">&#64;</span>
<span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">b</span></code>. In addition, sage and sympy (see below) use these
non-associative semantics with an infix matrix multiplication
operator (they use <code class="docutils literal notranslate"><span class="pre">*</span></code>), and they report that they haven’t
experienced any problems caused by it.</p>
</li>
<li>For inputs with more than 2 dimensions, we treat the last two
dimensions as being the dimensions of the matrices to multiply, and
broadcast across the other dimensions. This provides a convenient
way to quickly compute many matrix products in a single operation.
For example, <code class="docutils literal notranslate"><span class="pre">arr(10,</span> <span class="pre">2,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(10,</span> <span class="pre">3,</span> <span class="pre">4)</span></code> performs 10 separate
matrix multiplies, each of which multiplies a 2x3 and a 3x4 matrix
to produce a 2x4 matrix, and then returns the 10 resulting matrices
together in an array with shape (10, 2, 4). The intuition here is
that we treat these 3d arrays of numbers as if they were 1d arrays
<em>of matrices</em>, and then apply matrix multiplication in an
elementwise manner, where now each element is a whole matrix.
Note that broadcasting is not limited to perfectly aligned arrays;
in more complicated cases, it allows several simple but powerful
tricks for controlling how arrays are aligned with each other; see
<a class="footnote-reference brackets" href="#broadcasting" id="id8">[10]</a> for details. (In particular, it turns out that
when broadcasting is taken into account, the standard scalar *
matrix product is a special case of the elementwise multiplication
operator <code class="docutils literal notranslate"><span class="pre">*</span></code>.)<p>If one operand is &gt;2d, and another operand is 1d, then the above
rules apply unchanged, with 1d-&gt;2d promotion performed before
broadcasting. E.g., <code class="docutils literal notranslate"><span class="pre">arr(10,</span> <span class="pre">2,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3)</span></code> first promotes to
<code class="docutils literal notranslate"><span class="pre">arr(10,</span> <span class="pre">2,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(3,</span> <span class="pre">1)</span></code>, then broadcasts the right argument to
create the aligned operation <code class="docutils literal notranslate"><span class="pre">arr(10,</span> <span class="pre">2,</span> <span class="pre">3)</span> <span class="pre">&#64;</span> <span class="pre">arr(10,</span> <span class="pre">3,</span> <span class="pre">1)</span></code>,
multiplies to get an array with shape (10, 2, 1), and finally
removes the added dimension, returning an array with shape (10, 2).
Similarly, <code class="docutils literal notranslate"><span class="pre">arr(2)</span> <span class="pre">&#64;</span> <span class="pre">arr(10,</span> <span class="pre">2,</span> <span class="pre">3)</span></code> produces an intermediate array
with shape (10, 1, 3), and a final array with shape (10, 3).</p>
</li>
<li>0d (scalar) inputs raise an error. Scalar * matrix multiplication
is a mathematically and algorithmically distinct operation from
matrix &#64; matrix multiplication, and is already covered by the
elementwise <code class="docutils literal notranslate"><span class="pre">*</span></code> operator. Allowing scalar &#64; matrix would thus
both require an unnecessary special case, and violate TOOWTDI.</li>
</ul>
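<p>The shape rules listed above can be summarized in a short pure-Python sketch that computes only the <em>result shape</em> of <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">&#64;</span> <span class="pre">b</span></code>, not its values. The function names <code class="docutils literal notranslate"><span class="pre">broadcast_shapes</span></code> and <code class="docutils literal notranslate"><span class="pre">matmul_shape</span></code> are hypothetical, and the choice of <code class="docutils literal notranslate"><span class="pre">ValueError</span></code> is an assumption of this sketch: the text above only says that an error is raised.</p>

```python
from itertools import zip_longest


def broadcast_shapes(a, b):
    """Broadcast two shape tuples, aligning them from the right."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def matmul_shape(a, b):
    """Return the shape of `arr(*a) @ arr(*b)` under the recommended semantics."""
    if len(a) == 0 or len(b) == 0:
        # 0d (scalar) inputs are rejected; the exception type is an assumption.
        raise ValueError("scalar operands are not allowed for @")
    a_was_1d, b_was_1d = len(a) == 1, len(b) == 1
    if a_was_1d:
        a = (1,) + a   # left argument: prepend a 1 (treat as a single row)
    if b_was_1d:
        b = b + (1,)   # right argument: append a 1 (treat as a single column)
    if a[-1] != b[-2]:
        raise ValueError(f"inner dimensions differ: {a[-1]} vs {b[-2]}")
    # Everything except the last two dimensions is broadcast.
    batch = broadcast_shapes(a[:-2], b[:-2])
    # Keep only the matrix dimensions that were not added by promotion.
    core = []
    if not a_was_1d:
        core.append(a[-2])
    if not b_was_1d:
        core.append(b[-1])
    return batch + tuple(core)


print(matmul_shape((2, 3), (3, 4)))
print(matmul_shape((2, 3), (3,)))
print(matmul_shape((3,), (3,)))
print(matmul_shape((10, 2, 3), (10, 3, 4)))
print(matmul_shape((2,), (10, 2, 3)))
```

<p>Each call corresponds to one of the examples in the list above, so the printed shapes can be compared against the shapes stated there.</p>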
</section>
<section id="adoption">
<h3><a class="toc-backref" href="#adoption" role="doc-backlink">Adoption</a></h3>
<p>We group existing Python projects which provide array- or matrix-like
types based on what API they currently use for elementwise and matrix
multiplication.</p>
<p><strong>Projects which currently use * for elementwise multiplication, and
function/method calls for matrix multiplication:</strong></p>
<p>The developers of the following projects have expressed an intention
to implement <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> on their array-like types using the above
semantics:</p>
<ul class="simple">
<li>numpy</li>
<li>pandas</li>
<li>blaze</li>
<li>theano</li>
</ul>
<p>The following projects have been alerted to the existence of the PEP,
but it’s not yet known what they plan to do if it’s accepted. We
don’t anticipate that they’ll have any objections, though, since
everything proposed here is consistent with how they already do
things:</p>
<ul class="simple">
<li>pycuda</li>
<li>panda3d</li>
</ul>
<p><strong>Projects which currently use * for matrix multiplication, and
function/method calls for elementwise multiplication:</strong></p>
<p>The following projects have expressed an intention, if this PEP is
accepted, to migrate from their current API to the elementwise-<code class="docutils literal notranslate"><span class="pre">*</span></code>,
matmul-<code class="docutils literal notranslate"><span class="pre">&#64;</span></code> convention (i.e., this is a list of projects whose API
fragmentation will probably be eliminated if this PEP is accepted):</p>
<ul class="simple">
<li>numpy (<code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code>)</li>
<li>scipy.sparse</li>
<li>pyoperators</li>
<li>pyviennacl</li>
</ul>
<p>The following projects have been alerted to the existence of the PEP,
but it’s not known what they plan to do if it’s accepted (i.e., this
is a list of projects whose API fragmentation may or may not be
eliminated if this PEP is accepted):</p>
<ul class="simple">
<li>cvxopt</li>
</ul>
<p><strong>Projects which currently use * for matrix multiplication, and which
don’t really care about elementwise multiplication of matrices:</strong></p>
<p>There are several projects which implement matrix types, but from a
very different perspective than the numerical libraries discussed
above. These projects focus on computational methods for analyzing
matrices in the sense of abstract mathematical objects (i.e., linear
maps over free modules over rings), rather than as big bags full of
numbers that need crunching. And it turns out that from the abstract
math point of view, there isn’t much use for elementwise operations in
the first place; as discussed in the Background section above,
elementwise operations are motivated by the bag-of-numbers approach.
So these projects don’t encounter the basic problem that this PEP
exists to address, making it mostly irrelevant to them; while they
appear superficially similar to projects like numpy, they’re actually
doing something quite different. They use <code class="docutils literal notranslate"><span class="pre">*</span></code> for matrix
multiplication (and for group actions, and so forth), and if this PEP
is accepted, their expressed intention is to continue doing so, while
perhaps adding <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> as an alias. These projects include:</p>
<ul class="simple">
<li>sympy</li>
<li>sage</li>
</ul>
</section>
</section>
<section id="implementation-details">
<h2><a class="toc-backref" href="#implementation-details" role="doc-backlink">Implementation details</a></h2>
<p>New functions <code class="docutils literal notranslate"><span class="pre">operator.matmul</span></code> and <code class="docutils literal notranslate"><span class="pre">operator.__matmul__</span></code> are
added to the standard library, with the usual semantics.</p>
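<p>As a hedged illustration of the “usual semantics”, the following sketch defines a hypothetical minimal <code class="docutils literal notranslate"><span class="pre">Vec</span></code> type (not part of any library) whose <code class="docutils literal notranslate"><span class="pre">__matmul__</span></code> computes an inner product, and checks that the operator and the two <code class="docutils literal notranslate"><span class="pre">operator</span></code> functions agree. It assumes Python 3.5 or later.</p>

```python
import operator


class Vec:
    """Hypothetical minimal 1d vector type, for illustration only."""

    def __init__(self, *entries):
        self.entries = entries

    def __matmul__(self, other):
        # Decline unknown operand types so Python can try other.__rmatmul__.
        if not isinstance(other, Vec):
            return NotImplemented
        if len(self.entries) != len(other.entries):
            raise ValueError("length mismatch")
        return sum(x * y for x, y in zip(self.entries, other.entries))


a, b = Vec(1, 2, 3), Vec(4, 5, 6)

via_operator = a @ b
via_function = operator.matmul(a, b)
via_dunder = operator.__matmul__(a, b)

print(via_operator, via_function, via_dunder)
```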
<p>A corresponding function <code class="docutils literal notranslate"><span class="pre">PyObject*</span> <span class="pre">PyObject_MatrixMultiply(PyObject</span>
<span class="pre">*o1,</span> <span class="pre">PyObject</span> <span class="pre">*o2)</span></code> is added to the C API.</p>
<p>A new AST node is added named <code class="docutils literal notranslate"><span class="pre">MatMult</span></code>, along with a new token
<code class="docutils literal notranslate"><span class="pre">ATEQUAL</span></code> and new bytecode opcodes <code class="docutils literal notranslate"><span class="pre">BINARY_MATRIX_MULTIPLY</span></code> and
<code class="docutils literal notranslate"><span class="pre">INPLACE_MATRIX_MULTIPLY</span></code>.</p>
<p>Two new type slots are added; whether this is to <code class="docutils literal notranslate"><span class="pre">PyNumberMethods</span></code>
or a new <code class="docutils literal notranslate"><span class="pre">PyMatrixMethods</span></code> struct remains to be determined.</p>
</section>
<section id="rationale-for-specification-details">
<h2><a class="toc-backref" href="#rationale-for-specification-details" role="doc-backlink">Rationale for specification details</a></h2>
<section id="choice-of-operator">
<h3><a class="toc-backref" href="#choice-of-operator" role="doc-backlink">Choice of operator</a></h3>
<p>Why <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> instead of some other spelling? There isn’t any consensus
across other programming languages about how this operator should be
named <a class="footnote-reference brackets" href="#matmul-other-langs" id="id9">[11]</a>; here we discuss the various options.</p>
<p>Restricting ourselves only to symbols present on US English keyboards,
the punctuation characters that don’t already have a meaning in Python
expression context are: <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>, backtick, <code class="docutils literal notranslate"><span class="pre">$</span></code>, <code class="docutils literal notranslate"><span class="pre">!</span></code>, and <code class="docutils literal notranslate"><span class="pre">?</span></code>. Of
these options, <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is clearly the best; <code class="docutils literal notranslate"><span class="pre">!</span></code> and <code class="docutils literal notranslate"><span class="pre">?</span></code> are already
heavily freighted with inapplicable meanings in the programming
context, backtick has been banned from Python by BDFL pronouncement
(see <a class="pep reference internal" href="../pep-3099/" title="PEP 3099 Things that will Not Change in Python 3000">PEP 3099</a>), and <code class="docutils literal notranslate"><span class="pre">$</span></code> is uglier, even more dissimilar to <code class="docutils literal notranslate"><span class="pre">*</span></code> and
<span class="formula">⋅</span>, and has Perl/PHP baggage. <code class="docutils literal notranslate"><span class="pre">$</span></code> is probably the
second-best option of these, though.</p>
<p>Symbols which are not present on US English keyboards start at a
significant disadvantage (having to spend 5 minutes at the beginning
of every numeric Python tutorial just going over keyboard layouts is
not a hassle anyone really wants). Plus, even if we somehow overcame
the typing problem, it’s not clear there are any that are actually
better than <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>. Some options that have been suggested include:</p>
<ul class="simple">
<li>U+00D7 MULTIPLICATION SIGN: <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">×</span> <span class="pre">B</span></code></li>
<li>U+22C5 DOT OPERATOR: <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">⋅</span> <span class="pre">B</span></code></li>
<li>U+2297 CIRCLED TIMES: <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">⊗</span> <span class="pre">B</span></code></li>
<li>U+00B0 DEGREE: <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">°</span> <span class="pre">B</span></code></li>
</ul>
<p>What we need, though, is an operator that means “matrix
multiplication, as opposed to scalar/elementwise multiplication”.
There is no conventional symbol with this meaning in either
programming or mathematics, where these operations are usually
distinguished by context. (And U+2297 CIRCLED TIMES is actually used
conventionally to mean exactly the wrong things: elementwise
multiplication (the “Hadamard product”) or outer product, rather
than matrix/inner product like our operator). <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> at least has the
virtue that it <em>looks</em> like a funny non-commutative operator; a naive
user who knows maths but not programming couldn’t look at <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">*</span> <span class="pre">B</span></code>
versus <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">×</span> <span class="pre">B</span></code>, or <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">*</span> <span class="pre">B</span></code> versus <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">⋅</span> <span class="pre">B</span></code>, or <code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">*</span> <span class="pre">B</span></code> versus
<code class="docutils literal notranslate"><span class="pre">A</span> <span class="pre">°</span> <span class="pre">B</span></code> and guess which one is the usual multiplication, and which
one is the special case.</p>
<p>Finally, there is the option of using multi-character tokens. Some
options:</p>
<ul class="simple">
<li>Matlab and Julia use a <code class="docutils literal notranslate"><span class="pre">.*</span></code> operator. Aside from being visually
confusable with <code class="docutils literal notranslate"><span class="pre">*</span></code>, this would be a terrible choice for us
because in Matlab and Julia, <code class="docutils literal notranslate"><span class="pre">*</span></code> means matrix multiplication and
<code class="docutils literal notranslate"><span class="pre">.*</span></code> means elementwise multiplication, so using <code class="docutils literal notranslate"><span class="pre">.*</span></code> for matrix
multiplication would make us exactly backwards from what Matlab and
Julia users expect.</li>
<li>APL apparently used <code class="docutils literal notranslate"><span class="pre">+.×</span></code>, which by combining a multi-character
token, confusing attribute-access-like . syntax, and a unicode
character, ranks somewhere below U+2603 SNOWMAN on our candidate
list. If we like the idea of combining addition and multiplication
operators as being evocative of how matrix multiplication actually
works, then something like <code class="docutils literal notranslate"><span class="pre">+*</span></code> could be used, though this may
be too easy to confuse with <code class="docutils literal notranslate"><span class="pre">*+</span></code>, which is just multiplication
combined with the unary <code class="docutils literal notranslate"><span class="pre">+</span></code> operator.</li>
<li><a class="pep reference internal" href="../pep-0211/" title="PEP 211 Adding A New Outer Product Operator">PEP 211</a> suggested <code class="docutils literal notranslate"><span class="pre">~*</span></code>. This has the downside that it sort of
suggests that there is a unary <code class="docutils literal notranslate"><span class="pre">*</span></code> operator that is being combined
with unary <code class="docutils literal notranslate"><span class="pre">~</span></code>, but it could work.</li>
<li>R uses <code class="docutils literal notranslate"><span class="pre">%*%</span></code> for matrix multiplication. In R this forms part of a
general extensible infix system in which all tokens of the form
<code class="docutils literal notranslate"><span class="pre">%foo%</span></code> are user-defined binary operators. We could steal the
token without stealing the system.</li>
<li>Some other plausible candidates that have been suggested: <code class="docutils literal notranslate"><span class="pre">&gt;&lt;</span></code> (=
ascii drawing of the multiplication sign ×); the footnote operator
<code class="docutils literal notranslate"><span class="pre">[*]</span></code> or <code class="docutils literal notranslate"><span class="pre">|*|</span></code> (but when used in context, the use of vertical
grouping symbols tends to recreate the nested parentheses visual
clutter that was noted as one of the major downsides of the function
syntax we’re trying to get away from); <code class="docutils literal notranslate"><span class="pre">^*</span></code>.</li>
</ul>
<p>So, it doesn’t matter much, but <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> seems as good as or better than any
of the alternatives:</p>
<ul class="simple">
<li>It’s a friendly character that Pythoneers are already used to typing
in decorators, but the decorator usage and the math expression
usage are sufficiently dissimilar that it would be hard to confuse
them in practice.</li>
<li>It’s widely accessible across keyboard layouts (and thanks to its
use in email addresses, this is true even of weird keyboards like
those in phones).</li>
<li>It’s round like <code class="docutils literal notranslate"><span class="pre">*</span></code> and <span class="formula">⋅</span>.</li>
<li>The mATrices mnemonic is cute.</li>
<li>The swirly shape is reminiscent of the simultaneous sweeps over rows
and columns that define matrix multiplication.</li>
<li>Its asymmetry is evocative of its non-commutative nature.</li>
<li>Whatever, we have to pick something.</li>
</ul>
</section>
<section id="precedence-and-associativity">
<h3><a class="toc-backref" href="#precedence-and-associativity" role="doc-backlink">Precedence and associativity</a></h3>
<p>There was a long discussion <a class="footnote-reference brackets" href="#associativity-discussions" id="id10">[15]</a> about
whether <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> should be right- or left-associative (or even something
more exotic <a class="footnote-reference brackets" href="#group-associativity" id="id11">[18]</a>). Almost all Python operators are
left-associative, so following this convention would be the simplest
approach, but there were two arguments that suggested matrix
multiplication might be worth making right-associative as a special
case:</p>
<p>First, matrix multiplication has a tight conceptual association with
function application/composition, so many mathematically sophisticated
users have an intuition that an expression like <span class="formula"><i>R</i><i>S</i><i>x</i></span> proceeds
from right-to-left, with first <span class="formula"><i>S</i></span> transforming the vector
<span class="formula"><i>x</i></span>, and then <span class="formula"><i>R</i></span> transforming the result. This isn’t
universally agreed (and not all number-crunchers are steeped in the
pure-math conceptual framework that motivates this intuition
<a class="footnote-reference brackets" href="#oil-industry-versus-right-associativity" id="id12">[16]</a>), but at the least this
intuition is more common than for other operations like <span class="formula">2⋅3⋅4</span> which everyone reads as going from left-to-right.</p>
<p>Second, if expressions like <code class="docutils literal notranslate"><span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">vec</span></code> appear often in code,
then programs will run faster (and efficiency-minded programmers will
be able to use fewer parentheses) if this is evaluated as <code class="docutils literal notranslate"><span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">(Mat</span>
<span class="pre">&#64;</span> <span class="pre">vec)</span></code> than if it is evaluated like <code class="docutils literal notranslate"><span class="pre">(Mat</span> <span class="pre">&#64;</span> <span class="pre">Mat)</span> <span class="pre">&#64;</span> <span class="pre">vec</span></code>.</p>
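<p>The size of this effect can be estimated with a rough operation count. The sketch below assumes the schoolbook algorithm, in which multiplying a <code class="docutils literal notranslate"><span class="pre">rows</span> <span class="pre">x</span> <span class="pre">inner</span></code> matrix by an <code class="docutils literal notranslate"><span class="pre">inner</span> <span class="pre">x</span> <span class="pre">cols</span></code> matrix takes <code class="docutils literal notranslate"><span class="pre">rows</span> <span class="pre">*</span> <span class="pre">inner</span> <span class="pre">*</span> <span class="pre">cols</span></code> scalar multiplications; the function names are hypothetical, and real libraries may use faster kernels.</p>

```python
def matmul_cost(rows, inner, cols):
    """Scalar multiplications for a (rows x inner) @ (inner x cols) product."""
    return rows * inner * cols


def left_grouping_cost(n):
    """Cost of (Mat @ Mat) @ vec for n x n matrices and a length-n vector."""
    return matmul_cost(n, n, n) + matmul_cost(n, n, 1)


def right_grouping_cost(n):
    """Cost of Mat @ (Mat @ vec): two matrix-vector products."""
    return matmul_cost(n, n, 1) + matmul_cost(n, n, 1)


n = 1000
print(left_grouping_cost(n))    # n**3 + n**2
print(right_grouping_cost(n))   # 2 * n**2
```

<p>Under these assumptions, for <code class="docutils literal notranslate"><span class="pre">n</span> <span class="pre">=</span> <span class="pre">1000</span></code> the left grouping needs about 500 times as many multiplications as the right grouping.</p>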
<p>However, weighing against these arguments are the following:</p>
<p>Regarding the efficiency argument, empirically, we were unable to find
any evidence that <code class="docutils literal notranslate"><span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">Mat</span> <span class="pre">&#64;</span> <span class="pre">vec</span></code> type expressions actually
dominate in real-life code. Parsing a number of large projects that
use numpy, we found that when forced by numpy’s current funcall syntax
to choose an order of operations for nested calls to <code class="docutils literal notranslate"><span class="pre">dot</span></code>, people
actually use left-associative nesting slightly <em>more</em> often than
right-associative nesting <a class="footnote-reference brackets" href="#numpy-associativity-counts" id="id13">[17]</a>. And anyway,
writing parentheses isn’t so bad: if an efficiency-minded programmer
is going to take the trouble to think through the best way to evaluate
some expression, they probably <em>should</em> write down the parentheses
regardless of whether they’re needed, just to make it obvious to the
next reader that the order of operations matters.</p>
<p>In addition, it turns out that other languages, including those with
much more of a focus on linear algebra, overwhelmingly make their
matmul operators left-associative. Specifically, the <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> equivalent
is left-associative in R, Matlab, Julia, IDL, and Gauss. The only
exceptions we found are Mathematica, in which <code class="docutils literal notranslate"><span class="pre">a</span> <span class="pre">&#64;</span> <span class="pre">b</span> <span class="pre">&#64;</span> <span class="pre">c</span></code> would be
parsed non-associatively as <code class="docutils literal notranslate"><span class="pre">dot(a,</span> <span class="pre">b,</span> <span class="pre">c)</span></code>, and APL, in which all
operators are right-associative. There do not seem to exist any
languages that make <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> right-associative and <code class="docutils literal notranslate"><span class="pre">*</span></code>
left-associative. And these decisions don’t seem to be controversial:
I’ve never seen anyone complaining about this particular aspect of
any of these other languages, and the left-associativity of <code class="docutils literal notranslate"><span class="pre">*</span></code>
doesn’t seem to bother users of the existing Python libraries that use
<code class="docutils literal notranslate"><span class="pre">*</span></code> for matrix multiplication. So, at the least we can conclude from
this that making <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> left-associative will certainly not cause any
disasters. Making <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> right-associative, OTOH, would be exploring
new and uncertain ground.</p>
<p>And another advantage of left-associativity is that it is much easier
to learn and remember that <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> acts like <code class="docutils literal notranslate"><span class="pre">*</span></code>, than it is to
remember first that <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is unlike other Python operators by being
right-associative, and then on top of this, also have to remember
whether it is more tightly or more loosely binding than
<code class="docutils literal notranslate"><span class="pre">*</span></code>. (Right-associativity forces us to choose a precedence, and
intuitions were about equally split on which precedence made more
sense. So this suggests that no matter which choice we made, no-one
would be able to guess or remember it.)</p>
<p>On net, therefore, the general consensus of the numerical community is
that while matrix multiplication is something of a special case, it's
not special enough to break the rules, and <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> should parse like
<code class="docutils literal notranslate"><span class="pre">*</span></code> does.</p>
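<p>As a minimal sketch of what "parses like <code class="docutils literal notranslate"><span class="pre">*</span></code>" means in practice, the
following uses a hypothetical <code class="docutils literal notranslate"><span class="pre">Traced</span></code> class (not part of this PEP or of
any library) whose operators only record how an expression was grouped,
and then checks the same grouping with the standard <code class="docutils literal notranslate"><span class="pre">ast</span></code> module:</p>

```python
import ast


class Traced:
    """Hypothetical operand that records how an expression was grouped."""

    def __init__(self, label):
        self.label = label

    def __matmul__(self, other):
        # Build a fully parenthesized label so the grouping is visible.
        return Traced(f"({self.label} @ {other.label})")

    def __mul__(self, other):
        return Traced(f"({self.label} * {other.label})")


a, b, c = Traced("a"), Traced("b"), Traced("c")

chained = (a @ b @ c).label   # grouped from the left
mixed = (a * b @ c).label     # same precedence as *, so still left to right

# The parser agrees: the outer BinOp has another BinOp as its left operand.
tree = ast.parse("a @ b @ c", mode="eval").body
left_is_binop = isinstance(tree.left, ast.BinOp)
```

<p>Here <code class="docutils literal notranslate"><span class="pre">chained</span></code> comes out as <code class="docutils literal notranslate"><span class="pre">((a</span> <span class="pre">&#64;</span> <span class="pre">b)</span> <span class="pre">&#64;</span> <span class="pre">c)</span></code> and <code class="docutils literal notranslate"><span class="pre">mixed</span></code> as
<code class="docutils literal notranslate"><span class="pre">((a</span> <span class="pre">*</span> <span class="pre">b)</span> <span class="pre">&#64;</span> <span class="pre">c)</span></code>, which is the grouping a reader familiar with <code class="docutils literal notranslate"><span class="pre">*</span></code>
would expect.</p>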
</section>
<section id="non-definitions-for-built-in-types">
<h3><a class="toc-backref" href="#non-definitions-for-built-in-types" role="doc-backlink">(Non)-Definitions for built-in types</a></h3>
<p>No <code class="docutils literal notranslate"><span class="pre">__matmul__</span></code> or <code class="docutils literal notranslate"><span class="pre">__matpow__</span></code> are defined for builtin numeric
types (<code class="docutils literal notranslate"><span class="pre">float</span></code>, <code class="docutils literal notranslate"><span class="pre">int</span></code>, etc.) or for the <code class="docutils literal notranslate"><span class="pre">numbers.Number</span></code>
hierarchy, because these types represent scalars, and the consensus
semantics for <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> are that it should raise an error on scalars.</p>
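<p>A short check of this behavior, assuming an interpreter that already
implements this PEP; the variable names are only for illustration:</p>

```python
import operator

# The builtin scalar types do not provide the matmul hooks at all.
has_hooks = [hasattr(t, "__matmul__") or hasattr(t, "__rmatmul__")
             for t in (int, float, complex)]

try:
    operator.matmul(3, 4.0)   # same as evaluating 3 @ 4.0
    outcome = "no error"
except TypeError as exc:
    outcome = type(exc).__name__
```

<p>Since neither operand defines the hook, the interpreter falls back to
its generic "unsupported operand type(s)" <code class="docutils literal notranslate"><span class="pre">TypeError</span></code>.</p>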
<p>We do not for now define a <code class="docutils literal notranslate"><span class="pre">__matmul__</span></code> method on the standard
<code class="docutils literal notranslate"><span class="pre">memoryview</span></code> or <code class="docutils literal notranslate"><span class="pre">array.array</span></code> objects, for several reasons. Of
course this could be added if someone wants it, but these types would
require quite a bit of additional work beyond <code class="docutils literal notranslate"><span class="pre">__matmul__</span></code> before
they could be used for numeric work (e.g., they have no way to do
addition or scalar multiplication either!), and adding such
functionality is beyond the scope of this PEP. In addition, providing
a quality implementation of matrix multiplication is highly
non-trivial. Naive nested loop implementations are very slow and
shipping such an implementation in CPython would just create a trap
for users. But the alternative (providing a modern, competitive
matrix multiply) would require that CPython link to a BLAS library,
which brings a set of new complications. In particular, several
popular BLAS libraries (including the one that ships by default on
OS X) currently break the use of <code class="docutils literal notranslate"><span class="pre">multiprocessing</span></code> <a class="footnote-reference brackets" href="#blas-fork" id="id14">[8]</a>.
Together, these considerations mean that the cost/benefit of adding
<code class="docutils literal notranslate"><span class="pre">__matmul__</span></code> to these types just isn't there, so for now we'll
continue to delegate these problems to numpy and friends, and defer a
more systematic solution to a future proposal.</p>
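<p>For concreteness, this is roughly what a "naive nested loop"
implementation looks like. It is a sketch for illustration only, written
for plain lists of rows; <code class="docutils literal notranslate"><span class="pre">naive_matmul</span></code> is a hypothetical name and nothing
like it is proposed for the standard library:</p>

```python
def naive_matmul(a, b):
    """Multiply two matrices given as lists of rows (illustration only)."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    # Three nested loops: O(n * k * m) interpreted Python operations.
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]


product = naive_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

<p>Every multiply and add here goes through the interpreter, which is why
an implementation of this shape cannot compete with a BLAS-backed one.</p>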
<p>There are also non-numeric Python builtins which define <code class="docutils literal notranslate"><span class="pre">__mul__</span></code>
(<code class="docutils literal notranslate"><span class="pre">str</span></code>, <code class="docutils literal notranslate"><span class="pre">list</span></code>, …). We do not define <code class="docutils literal notranslate"><span class="pre">__matmul__</span></code> for these
types either, because why would we even do that.</p>
</section>
<section id="non-definition-of-matrix-power">
<h3><a class="toc-backref" href="#non-definition-of-matrix-power" role="doc-backlink">Non-definition of matrix power</a></h3>
<p>Earlier versions of this PEP also proposed a matrix power operator,
<code class="docutils literal notranslate"><span class="pre">&#64;&#64;</span></code>, analogous to <code class="docutils literal notranslate"><span class="pre">**</span></code>. But on further consideration, it was
decided that the utility of this was sufficiently unclear that it
would be better to leave it out for now, and only revisit the issue if,
once we have more experience with <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>, it turns out that <code class="docutils literal notranslate"><span class="pre">&#64;&#64;</span></code>
is truly missed. <a class="footnote-reference brackets" href="#atat-discussion" id="id15">[14]</a></p>
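<p>Without <code class="docutils literal notranslate"><span class="pre">&#64;&#64;</span></code>, a positive integer matrix power can still be spelled
as repeated <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>. The sketch below assumes a hypothetical <code class="docutils literal notranslate"><span class="pre">Mat2</span></code> class and
a hypothetical <code class="docutils literal notranslate"><span class="pre">matpow</span></code> helper, and also confirms that <code class="docutils literal notranslate"><span class="pre">&#64;&#64;</span></code> is rejected by
the parser:</p>

```python
import functools
import operator


class Mat2:
    """Hypothetical 2x2 matrix type, just enough to support @."""

    def __init__(self, a, b, c, d):
        self.entries = (a, b, c, d)

    def __matmul__(self, other):
        a, b, c, d = self.entries
        e, f, g, h = other.entries
        return Mat2(a * e + b * g, a * f + b * h,
                    c * e + d * g, c * f + d * h)


def matpow(m, n):
    """Positive integer matrix power spelled as repeated @ (n of at least 1)."""
    return functools.reduce(operator.matmul, [m] * n)


fib = matpow(Mat2(1, 1, 1, 0), 5).entries

# "@@" is not a token; the parser sees "@ @" and rejects it.
try:
    compile("m @@ 5", "demo", "eval")
    atat_ok = True
except SyntaxError:
    atat_ok = False
```

<p>Array libraries typically offer a dedicated function for this (numpy
has <code class="docutils literal notranslate"><span class="pre">numpy.linalg.matrix_power</span></code>), so the missing operator costs little.</p>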
</section>
</section>
<section id="rejected-alternatives-to-adding-a-new-operator">
<h2><a class="toc-backref" href="#rejected-alternatives-to-adding-a-new-operator" role="doc-backlink">Rejected alternatives to adding a new operator</a></h2>
<p>Over the past few decades, the Python numeric community has explored a
variety of ways to resolve the tension between matrix and elementwise
multiplication operations. <a class="pep reference internal" href="../pep-0211/" title="PEP 211 Adding A New Outer Product Operator">PEP 211</a> and <a class="pep reference internal" href="../pep-0225/" title="PEP 225 Elementwise/Objectwise Operators">PEP 225</a>, both proposed in 2000
and last seriously discussed in 2008 <a class="footnote-reference brackets" href="#threads-2008" id="id16">[9]</a>, were early
attempts to add new operators to solve this problem, but suffered from
serious flaws; in particular, at that time the Python numerical
community had not yet reached consensus on the proper API for array
objects, or on what operators might be needed or useful (e.g., <a class="pep reference internal" href="../pep-0225/" title="PEP 225 Elementwise/Objectwise Operators">PEP 225</a>
proposes 6 new operators with unspecified semantics). Experience
since then has now led to consensus that the best solution, for both
numeric Python and core Python, is to add a single infix operator for
matrix multiply (together with the other new operators this implies
like <code class="docutils literal notranslate"><span class="pre">&#64;=</span></code>).</p>
<p>We review some of the rejected alternatives here.</p>
<p><strong>Use a second type that defines __mul__ as matrix multiplication:</strong>
As discussed above (<a class="reference internal" href="#background-what-s-wrong-with-the-status-quo">Background: What's wrong with the status quo?</a>),
this has been tried for many years via the <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> type
(and its predecessors in Numeric and numarray). The result is a
strong consensus among both numpy developers and developers of
downstream packages that <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> should essentially never be
used, because of the problems caused by having conflicting duck types
for arrays. (Of course one could then argue we should <em>only</em> define
<code class="docutils literal notranslate"><span class="pre">__mul__</span></code> to be matrix multiplication, but then we'd have the same
problem with elementwise multiplication.) There have been several
pushes to remove <code class="docutils literal notranslate"><span class="pre">numpy.matrix</span></code> entirely; the only counter-arguments
have come from educators who find that its problems are outweighed by
the need to provide a simple and clear mapping between mathematical
notation and code for novices (see <a class="reference internal" href="#transparent-syntax-is-especially-crucial-for-non-expert-programmers">Transparent syntax is especially
crucial for non-expert programmers</a>). But, of course, starting out
newbies with a dispreferred syntax and then expecting them to
transition later causes its own problems. The two-type solution is
worse than the disease.</p>
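<p>The duck-typing conflict can be sketched with two hypothetical
stand-in classes (these are not numpy's types, only minimal models of an
array whose <code class="docutils literal notranslate"><span class="pre">*</span></code> is elementwise and a matrix whose <code class="docutils literal notranslate"><span class="pre">*</span></code> is a matrix
product). A helper written against one silently computes something else
when handed the other:</p>

```python
class ElementwiseArr:
    """Hypothetical stand-in for an array type where * is elementwise."""

    def __init__(self, rows):
        self.rows = rows

    def __mul__(self, other):
        return type(self)([[x * y for x, y in zip(r, s)]
                           for r, s in zip(self.rows, other.rows)])


class MatrixArr(ElementwiseArr):
    """Hypothetical stand-in for a matrix type where * is a matrix product."""

    def __mul__(self, other):
        cols = list(zip(*other.rows))
        return type(self)([[sum(x * y for x, y in zip(r, c)) for c in cols]
                           for r in self.rows])


def squared_entries(a):
    # Written by a library author who expects elementwise semantics.
    return (a * a).rows


data = [[1, 2], [3, 4]]
as_array = squared_entries(ElementwiseArr(data))
as_matrix = squared_entries(MatrixArr(data))
```

<p>No error is raised in either case; the two calls simply return
different numbers, which is what makes the two-type approach hazardous
for downstream code.</p>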
<p><strong>Add lots of new operators, or add a new generic syntax for defining
infix operators:</strong> In addition to being generally un-Pythonic and
repeatedly rejected by BDFL fiat, this would be using a sledgehammer
to smash a fly. The scientific Python community has consensus that
adding one operator for matrix multiplication is enough to fix the one
otherwise unfixable pain point. (In retrospect, we all think <a class="pep reference internal" href="../pep-0225/" title="PEP 225 Elementwise/Objectwise Operators">PEP 225</a>
was a bad idea too, or at least far more complex than it needed to
be.)</p>
<p><strong>Add a new &#64; (or whatever) operator that has some other meaning in
general Python, and then overload it in numeric code:</strong> This was the
approach taken by <a class="pep reference internal" href="../pep-0211/" title="PEP 211 Adding A New Outer Product Operator">PEP 211</a>, which proposed defining <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> to be the
equivalent of <code class="docutils literal notranslate"><span class="pre">itertools.product</span></code>. The problem with this is that
when taken on its own terms, it's pretty clear that
<code class="docutils literal notranslate"><span class="pre">itertools.product</span></code> doesn't actually need a dedicated operator. It
hasn't even been deemed worthy of a builtin. (During discussions of
this PEP, a similar suggestion was made to define <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> as a general
purpose function composition operator, and this suffers from the same
problem; <code class="docutils literal notranslate"><span class="pre">functools.compose</span></code> isn't even useful enough to exist.)
Matrix multiplication has a uniquely strong rationale for inclusion as
an infix operator. There almost certainly don't exist any other
binary operations that will ever justify adding any other infix
operators to Python.</p>
<p><strong>Add a .dot method to array types so as to allow “pseudo-infix”
A.dot(B) syntax:</strong> This has been in numpy for some years, and in many
cases it's better than dot(A, B). But it's still much less readable
than real infix notation, and in particular still suffers from an
extreme overabundance of parentheses. See <a class="reference internal" href="#why-should-matrix-multiplication-be-infix">Why should matrix
multiplication be infix?</a> above.</p>
<p><strong>Use a with block to toggle the meaning of * within a single code
block</strong>: E.g., numpy could define a special context object so that
we'd have:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">c</span> <span class="o">=</span> <span class="n">a</span> <span class="o">*</span> <span class="n">b</span> <span class="c1"># element-wise multiplication</span>
<span class="k">with</span> <span class="n">numpy</span><span class="o">.</span><span class="n">mul_as_dot</span><span class="p">:</span>
    <span class="n">c</span> <span class="o">=</span> <span class="n">a</span> <span class="o">*</span> <span class="n">b</span> <span class="c1"># matrix multiplication</span>
</pre></div>
</div>
<p>However, this has two serious problems: first, it requires that every
array-like type's <code class="docutils literal notranslate"><span class="pre">__mul__</span></code> method know how to check some global
state (<code class="docutils literal notranslate"><span class="pre">numpy.mul_is_currently_dot</span></code> or whatever). This is fine if
<code class="docutils literal notranslate"><span class="pre">a</span></code> and <code class="docutils literal notranslate"><span class="pre">b</span></code> are numpy objects, but the world contains many
non-numpy array-like objects. So this either requires non-local
coupling (every numpy competitor library has to import numpy and
then check <code class="docutils literal notranslate"><span class="pre">numpy.mul_is_currently_dot</span></code> on every operation) or
else it breaks duck-typing, with the above code doing radically
different things depending on whether <code class="docutils literal notranslate"><span class="pre">a</span></code> and <code class="docutils literal notranslate"><span class="pre">b</span></code> are numpy
objects or some other sort of object. Second, and worse, <code class="docutils literal notranslate"><span class="pre">with</span></code>
blocks are dynamically scoped, not lexically scoped; i.e., any
function that gets called inside the <code class="docutils literal notranslate"><span class="pre">with</span></code> block will suddenly find
itself executing inside the mul_as_dot world, and crash and burn
horribly, if you're lucky. So this is a construct that could only
be used safely in rather limited cases (no function calls), and which
would make it very easy to shoot yourself in the foot without warning.</p>
<p><strong>Use a language preprocessor that adds extra numerically-oriented
operators and perhaps other syntax:</strong> (As per recent BDFL suggestion:
<a class="footnote-reference brackets" href="#preprocessor" id="id17">[1]</a>) This suggestion seems based on the idea that
numerical code needs a wide variety of syntax additions. In fact,
given <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>, most numerical users don't need any other operators or
syntax; it solves the one really painful problem that cannot be solved
by other means, and that causes painful reverberations through the
larger ecosystem. Defining a new language (presumably with its own
parser which would have to be kept in sync with Python's, etc.), just
to support a single binary operator, is neither practical nor
desirable. In the numerical context, Python's competition is
special-purpose numerical languages (Matlab, R, IDL, etc.). Compared
to these, Python's killer feature is exactly that one can mix
specialized numerical code with code for XML parsing, web page
generation, database access, network programming, GUI libraries, and
so forth, and we also gain major benefits from the huge variety of
tutorials, reference material, introductory classes, etc., which use
Python. Fragmenting “numerical Python” from “real Python” would be a
major source of confusion. A major motivation for this PEP is to
<em>reduce</em> fragmentation. Having to set up a preprocessor would be an
especially prohibitive complication for unsophisticated users. And we
use Python because we like Python! We don't want
almost-but-not-quite-Python.</p>
<p><strong>Use overloading hacks to define a “new infix operator” like *dot*,
as in a well-known Python recipe:</strong> (See: <a class="footnote-reference brackets" href="#infix-hack" id="id18">[2]</a>) Beautiful is
better than ugly. This is… not beautiful. And not Pythonic. And
especially unfriendly to beginners, who are just trying to wrap their
heads around the idea that there's a coherent underlying system behind
these magic incantations that they're learning, when along comes an
evil hack like this that violates that system, creates bizarre error
messages when accidentally misused, and whose underlying mechanisms
can't be understood without deep knowledge of how object oriented
systems work.</p>
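<p>For readers who have not seen the trick, here is a minimal sketch in
the spirit of the referenced recipe. The class and function names are
assumptions made for this illustration, and the details may differ from
the recipe itself:</p>

```python
import functools


class Infix:
    """Sketch of the overloading trick: x *op* y calls op.function(x, y)."""

    def __init__(self, function):
        self.function = function

    def __rmul__(self, left):
        # "x *op" binds the left operand and returns a new Infix.
        return Infix(functools.partial(self.function, left))

    def __mul__(self, right):
        # "(x *op) * y" supplies the right operand and calls through.
        return self.function(right)


def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


dot = Infix(_dot)

result = [1, 2, 3] *dot* [4, 5, 6]

# Misuse 1: a missing right-hand "*" leaves a half-applied object behind.
leftover = [1, 2, 3] *dot

# Misuse 2: a missing left operand fails deep inside the helper.
try:
    dot * [4, 5, 6]
    message = ""
except TypeError as exc:
    message = str(exc)
```

<p>The happy path works, but the two misuse cases show the cost: one
yields an opaque <code class="docutils literal notranslate"><span class="pre">Infix</span></code> object instead of an error, and the other raises
a <code class="docutils literal notranslate"><span class="pre">TypeError</span></code> about a missing argument to <code class="docutils literal notranslate"><span class="pre">_dot</span></code>, which says nothing
about the operator syntax the user actually typed.</p>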
<p><strong>Use a special “facade” type to support syntax like arr.M * arr:</strong>
This is very similar to the previous proposal, in that the <code class="docutils literal notranslate"><span class="pre">.M</span></code>
attribute would basically return the same object as <code class="docutils literal notranslate"><span class="pre">arr</span> <span class="pre">*dot</span></code> would,
and thus suffers the same objections about magicalness. This
approach also has some non-obvious complexities: for example, while
<code class="docutils literal notranslate"><span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr</span></code> must return an array, <code class="docutils literal notranslate"><span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr.M</span></code> and
<code class="docutils literal notranslate"><span class="pre">arr</span> <span class="pre">*</span> <span class="pre">arr.M</span></code> must return facade objects, or else <code class="docutils literal notranslate"><span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr</span></code>
and <code class="docutils literal notranslate"><span class="pre">arr</span> <span class="pre">*</span> <span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr</span></code> will not work. But this means that facade
objects must be able to recognize both other array objects and other
facade objects (which creates additional complexity for writing
interoperating array types from different libraries, which must now
recognize both each other's array types and their facade types). It
also creates pitfalls for users who may easily type <code class="docutils literal notranslate"><span class="pre">arr</span> <span class="pre">*</span> <span class="pre">arr.M</span></code> or
<code class="docutils literal notranslate"><span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr.M</span></code> and expect to get back an array object; instead,
they will get a mysterious object that throws errors when they attempt
to use it. Basically, with this approach users must be careful to
think of <code class="docutils literal notranslate"><span class="pre">.M*</span></code> as an indivisible unit that acts as an infix operator;
and as infix-operator-like token strings go, at least <code class="docutils literal notranslate"><span class="pre">*dot*</span></code>
is prettier looking (look at its cute little ears!).</p>
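<p>A deliberately small sketch of the facade idea follows. The <code class="docutils literal notranslate"><span class="pre">Arr</span></code> and
<code class="docutils literal notranslate"><span class="pre">Facade</span></code> classes are hypothetical, and the sketch only covers the
<code class="docutils literal notranslate"><span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr</span></code> and <code class="docutils literal notranslate"><span class="pre">arr.M</span> <span class="pre">*</span> <span class="pre">arr.M</span></code> cases, not the full set of
combinations discussed above:</p>

```python
def _matmul(x, y):
    cols = list(zip(*y))
    return [[sum(p * q for p, q in zip(r, c)) for c in cols] for r in x]


class Arr:
    """Hypothetical array type: * is elementwise, .M gives a facade."""

    def __init__(self, rows):
        self.rows = rows

    @property
    def M(self):
        return Facade(self.rows)

    def __mul__(self, other):
        return Arr([[p * q for p, q in zip(r, s)]
                    for r, s in zip(self.rows, other.rows)])


class Facade:
    """Hypothetical facade: * means matrix multiplication."""

    def __init__(self, rows):
        self.rows = rows

    def __mul__(self, other):
        product = _matmul(self.rows, other.rows)
        # Facade * Facade stays a facade; Facade * Arr yields an array.
        return Facade(product) if isinstance(other, Facade) else Arr(product)


arr = Arr([[1, 2], [3, 4]])
ok = arr.M * arr             # an Arr, as intended
pitfall = arr.M * arr.M      # a Facade, not the array a user might expect
```

<p>Even in this reduced form, the result type depends on which operand
happened to carry <code class="docutils literal notranslate"><span class="pre">.M</span></code>, which is the pitfall described above.</p>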
</section>
<section id="discussions-of-this-pep">
<h2><a class="toc-backref" href="#discussions-of-this-pep" role="doc-backlink">Discussions of this PEP</a></h2>
<p>Collected here for reference:</p>
<ul class="simple">
<li>Github pull request containing much of the original discussion and
drafting: <a class="reference external" href="https://github.com/numpy/numpy/pull/4351">https://github.com/numpy/numpy/pull/4351</a></li>
<li>sympy mailing list discussions of an early draft:<ul>
<li><a class="reference external" href="https://groups.google.com/forum/#!topic/sympy/22w9ONLa7qo">https://groups.google.com/forum/#!topic/sympy/22w9ONLa7qo</a></li>
<li><a class="reference external" href="https://groups.google.com/forum/#!topic/sympy/4tGlBGTggZY">https://groups.google.com/forum/#!topic/sympy/4tGlBGTggZY</a></li>
</ul>
</li>
<li>sage-devel mailing list discussions of an early draft:
<a class="reference external" href="https://groups.google.com/forum/#!topic/sage-devel/YxEktGu8DeM">https://groups.google.com/forum/#!topic/sage-devel/YxEktGu8DeM</a></li>
<li>13-Mar-2014 python-ideas thread:
<a class="reference external" href="https://mail.python.org/pipermail/python-ideas/2014-March/027053.html">https://mail.python.org/pipermail/python-ideas/2014-March/027053.html</a></li>
<li>numpy-discussion thread on whether to keep <code class="docutils literal notranslate"><span class="pre">&#64;&#64;</span></code>:
<a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069448.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069448.html</a></li>
<li>numpy-discussion threads on precedence/associativity of <code class="docutils literal notranslate"><span class="pre">&#64;</span></code>:<ul>
<li><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html</a></li>
<li><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html</a></li>
</ul>
</li>
</ul>
</section>
<section id="references">
<h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2>
<aside class="footnote-list brackets">
<aside class="footnote brackets" id="preprocessor" role="doc-footnote">
<dt class="label" id="preprocessor">[<a href="#id17">1</a>]</dt>
<dd>From a comment by GvR on a G+ post by GvR; the
comment itself does not seem to be directly linkable: <a class="reference external" href="https://plus.google.com/115212051037621986145/posts/hZVVtJ9bK3u">https://plus.google.com/115212051037621986145/posts/hZVVtJ9bK3u</a></aside>
<aside class="footnote brackets" id="infix-hack" role="doc-footnote">
<dt class="label" id="infix-hack">[<a href="#id18">2</a>]</dt>
<dd><a class="reference external" href="http://code.activestate.com/recipes/384122-infix-operators/">http://code.activestate.com/recipes/384122-infix-operators/</a>
<a class="reference external" href="http://www.sagemath.org/doc/reference/misc/sage/misc/decorators.html#sage.misc.decorators.infix_operator">http://www.sagemath.org/doc/reference/misc/sage/misc/decorators.html#sage.misc.decorators.infix_operator</a></aside>
<aside class="footnote brackets" id="scipy-conf" role="doc-footnote">
<dt class="label" id="scipy-conf">[<a href="#id3">3</a>]</dt>
<dd><a class="reference external" href="http://conference.scipy.org/past.html">http://conference.scipy.org/past.html</a></aside>
<aside class="footnote brackets" id="pydata-conf" role="doc-footnote">
<dt class="label" id="pydata-conf">[<a href="#id4">4</a>]</dt>
<dd><a class="reference external" href="http://pydata.org/events/">http://pydata.org/events/</a></aside>
<aside class="footnote brackets" id="lht" role="doc-footnote">
<dt class="label" id="lht">[<a href="#id2">5</a>]</dt>
<dd>In this formula, <span class="formula"><i>β</i></span> is a vector or matrix of
regression coefficients, <span class="formula"><i>V</i></span> is the estimated
variance/covariance matrix for these coefficients, and we want to
test the null hypothesis that <span class="formula"><i>H</i><i>β</i>=<i>r</i></span>; a large <span class="formula"><i>S</i></span>
then indicates that this hypothesis is unlikely to be true. For
example, in an analysis of human height, the vector <span class="formula"><i>β</i></span>
might contain one value which was the average height of the
measured men, and another value which was the average height of the
measured women, and then setting <span class="formula"><i>H</i>=[1,1],<i>r</i>=0</span> would
let us test whether men and women are the same height on
average. Compare to eq. 2.139 in
<a class="reference external" href="http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/tutorials/xegbohtmlnode17.html">http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/tutorials/xegbohtmlnode17.html</a><p>Example code is adapted from <a class="reference external" href="https://github.com/rerpy/rerpy/blob/0d274f85e14c3b1625acb22aed1efa85d122ecb7/rerpy/incremental_ls.py#L202">https://github.com/rerpy/rerpy/blob/0d274f85e14c3b1625acb22aed1efa85d122ecb7/rerpy/incremental_ls.py#L202</a></p>
</aside>
<aside class="footnote brackets" id="pycon-tutorials" role="doc-footnote">
<dt class="label" id="pycon-tutorials">[<a href="#id5">6</a>]</dt>
<dd>Out of the 36 tutorials scheduled for PyCon 2014
(<a class="reference external" href="https://us.pycon.org/2014/schedule/tutorials/">https://us.pycon.org/2014/schedule/tutorials/</a>), we guess that the
8 below will almost certainly deal with matrices:<ul class="simple">
<li>Dynamics and control with Python</li>
<li>Exploring machine learning with Scikit-learn</li>
<li>How to formulate a (science) problem and analyze it using Python
code</li>
<li>Diving deeper into Machine Learning with Scikit-learn</li>
<li>Data Wrangling for Kaggle Data Science Competitions: An etude</li>
<li>Hands-on with Pydata: how to build a minimal recommendation
engine.</li>
<li>Python for Social Scientists</li>
<li>Bayesian statistics made simple</li>
</ul>
<p>In addition, the following tutorials could easily involve matrices:</p>
<ul class="simple">
<li>Introduction to game programming</li>
<li>mrjob: Snakes on a Hadoop <em>(“We'll introduce some data science
concepts, such as user-user similarity, and show how to calculate
these metrics…”)</em></li>
<li>Mining Social Web APIs with IPython Notebook</li>
<li>Beyond Defaults: Creating Polished Visualizations Using Matplotlib</li>
</ul>
<p>This gives an estimated range of 8 to 12 / 36 = 22% to 33% of
tutorials dealing with matrices; saying ~20% then gives us some
wiggle room in case our estimates are high.</p>
</aside>
<aside class="footnote brackets" id="sloc-details" role="doc-footnote">
<dt class="label" id="sloc-details">[<a href="#id7">7</a>]</dt>
<dd>SLOCs were defined as physical lines which contain
at least one token that is not a COMMENT, NEWLINE, ENCODING,
INDENT, or DEDENT. Counts were made by using the <code class="docutils literal notranslate"><span class="pre">tokenize</span></code> module
from Python 3.2.3 to examine the tokens in all files ending <code class="docutils literal notranslate"><span class="pre">.py</span></code>
underneath some directory. Only tokens which occur at least once
in the source trees are included in the table. The counting script
is available <a class="reference external" href="http://hg.python.org/peps/file/tip/pep-0465/scan-ops.py">in the PEP repository</a>.<p>Matrix multiply counts were estimated by counting how often certain
tokens which are used as matrix multiply function names occurred in
each package. This creates a small number of false positives for
scikit-learn, because we also count instances of the wrappers
around <code class="docutils literal notranslate"><span class="pre">dot</span></code> that this package uses, and so there are a few dozen
tokens which actually occur in <code class="docutils literal notranslate"><span class="pre">import</span></code> or <code class="docutils literal notranslate"><span class="pre">def</span></code> statements.</p>
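<p>The counting rule described above can be sketched as follows. This is
not the counting script from the PEP repository; it is an independent
illustration, <code class="docutils literal notranslate"><span class="pre">count_sloc_and_names</span></code> is a hypothetical name, and treating
<code class="docutils literal notranslate"><span class="pre">NL</span></code> and <code class="docutils literal notranslate"><span class="pre">ENDMARKER</span></code> as non-source tokens is an assumption added here so
that blank lines are not counted:</p>

```python
import io
import tokenize

# Token types that do not make a line count as source. NL and ENDMARKER are
# added here as an assumption; the text lists only the first five.
NON_SOURCE = {tokenize.COMMENT, tokenize.NEWLINE, tokenize.ENCODING,
              tokenize.INDENT, tokenize.DEDENT, tokenize.NL,
              tokenize.ENDMARKER}


def count_sloc_and_names(source, names):
    """Return (number of SLOCs, occurrences of the given NAME tokens)."""
    sloc_lines = set()
    hits = 0
    readline = io.BytesIO(source.encode("utf-8")).readline
    for tok in tokenize.tokenize(readline):
        if tok.type in NON_SOURCE:
            continue
        # A multi-line token marks every physical line it spans.
        sloc_lines.update(range(tok.start[0], tok.end[0] + 1))
        if tok.type == tokenize.NAME and tok.string in names:
            hits += 1
    return len(sloc_lines), hits


sample = (
    "# comment only\n"
    "\n"
    "from numpy import dot\n"
    "c = dot(a, b)\n"
)
sloc, dot_uses = count_sloc_and_names(sample, {"dot"})
```

<p>On this sample the sketch reports two SLOCs and two <code class="docutils literal notranslate"><span class="pre">dot</span></code> tokens, one
of which sits in the <code class="docutils literal notranslate"><span class="pre">import</span></code> statement. That is the kind of false
positive mentioned above.</p>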
<p>All counts were made using the latest development version of each
project as of 21 Feb 2014.</p>
<p>stdlib is the contents of the Lib/ directory in commit
d6aa3fa646e2 to the cpython hg repository, and treats the following
tokens as indicating matrix multiply: n/a.</p>
<p>scikit-learn is the contents of the sklearn/ directory in commit
69b71623273ccfc1181ea83d8fb9e05ae96f57c7 to the scikit-learn
repository (<a class="reference external" href="https://github.com/scikit-learn/scikit-learn">https://github.com/scikit-learn/scikit-learn</a>), and
treats the following tokens as indicating matrix multiply: <code class="docutils literal notranslate"><span class="pre">dot</span></code>,
<code class="docutils literal notranslate"><span class="pre">fast_dot</span></code>, <code class="docutils literal notranslate"><span class="pre">safe_sparse_dot</span></code>.</p>
<p>nipy is the contents of the nipy/ directory in commit
5419911e99546401b5a13bd8ccc3ad97f0d31037 to the nipy repository
(<a class="reference external" href="https://github.com/nipy/nipy/">https://github.com/nipy/nipy/</a>), and treats the following tokens as
indicating matrix multiply: <code class="docutils literal notranslate"><span class="pre">dot</span></code>.</p>
</aside>
<aside class="footnote brackets" id="blas-fork" role="doc-footnote">
<dt class="label" id="blas-fork">[<a href="#id14">8</a>]</dt>
<dd>BLAS libraries have a habit of secretly spawning
threads, even when used from single-threaded programs. And threads
play very poorly with <code class="docutils literal notranslate"><span class="pre">fork()</span></code>; the usual symptom is that
attempting to perform linear algebra in a child process causes an
immediate deadlock.</aside>
<aside class="footnote brackets" id="threads-2008" role="doc-footnote">
<dt class="label" id="threads-2008">[<a href="#id16">9</a>]</dt>
<dd><a class="reference external" href="http://fperez.org/py4science/numpy-pep225/numpy-pep225.html">http://fperez.org/py4science/numpy-pep225/numpy-pep225.html</a></aside>
<aside class="footnote brackets" id="broadcasting" role="doc-footnote">
<dt class="label" id="broadcasting">[<a href="#id8">10</a>]</dt>
<dd><a class="reference external" href="http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html">http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html</a></aside>
<aside class="footnote brackets" id="matmul-other-langs" role="doc-footnote">
<dt class="label" id="matmul-other-langs">[<a href="#id9">11</a>]</dt>
<dd><a class="reference external" href="http://mail.scipy.org/pipermail/scipy-user/2014-February/035499.html">http://mail.scipy.org/pipermail/scipy-user/2014-February/035499.html</a></aside>
<aside class="footnote brackets" id="github-details" role="doc-footnote">
<dt class="label" id="github-details">[<a href="#id6">12</a>]</dt>
<dd>Counts were produced by manually entering the
string <code class="docutils literal notranslate"><span class="pre">&quot;import</span> <span class="pre">foo&quot;</span></code> or <code class="docutils literal notranslate"><span class="pre">&quot;from</span> <span class="pre">foo</span> <span class="pre">import&quot;</span></code> (with quotes) into
the Github code search page, e.g.:
<a class="reference external" href="https://github.com/search?q=%22import+numpy%22&amp;ref=simplesearch&amp;type=Code">https://github.com/search?q=%22import+numpy%22&amp;ref=simplesearch&amp;type=Code</a>
on 2014-04-10 at ~21:00 UTC. The reported values are the numbers
given in the “Languages” box on the lower-left corner, next to
“Python”. This also causes some undercounting (e.g., leaving out
Cython code, and possibly one should also count HTML docs and so
forth), but these effects are negligible (e.g., only ~1% of numpy
usage appears to occur in Cython code, and probably even less for
the other modules listed). The use of this box is crucial,
however, because these counts appear to be stable, while the
“overall” counts listed at the top of the page (“We've found ___
code results”) are highly variable even for a single search:
simply reloading the page can cause this number to vary by a factor
of 2 (!!). (They do seem to settle down if one reloads the page
repeatedly, but nonetheless this is spooky enough that it seemed
better to avoid these numbers.)<p>These numbers should of course be taken with multiple grains of
salt; it's not clear how representative Github is of Python code in
general, and limitations of the search tool make it impossible to
get precise counts. AFAIK this is the best data set currently
available, but it'd be nice if it were better. In particular:</p>
<ul class="simple">
<li>Lines like <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">sys,</span> <span class="pre">os</span></code> will only be counted in the <code class="docutils literal notranslate"><span class="pre">sys</span></code>
row.</li>
<li>A file containing both <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">X</span></code> and <code class="docutils literal notranslate"><span class="pre">from</span> <span class="pre">X</span> <span class="pre">import</span></code> will be
counted twice.</li>
<li>Imports of the form <code class="docutils literal notranslate"><span class="pre">from</span> <span class="pre">X.foo</span> <span class="pre">import</span> <span class="pre">...</span></code> are missed. We
could catch these by instead searching for “from X”, but this is
a common phrase in English prose, so we'd end up with false
positives from comments, strings, etc. For many of the modules
considered this shouldn't matter too much (for example, the
stdlib modules have flat namespaces), but it might especially
lead to undercounting of django, scipy, and twisted.</li>
</ul>
<p>Also, it's possible there exist other non-stdlib modules we didn't
think to test that are even more-imported than numpy, though we
tried quite a few of the obvious suspects. If you find one, let us
know! The modules tested here were chosen based on a combination
of intuition and the top-100 list at pypi-ranking.info.</p>
<p>Fortunately, it doesn’t really matter if it turns out that numpy
is, say, merely the <em>third</em> most-imported non-stdlib module, since
the point is just that numeric programming is a common and
mainstream activity.</p>
<p>Finally, we should point out the obvious: whether a package is
import<em>ed</em> is rather different from whether it’s import<em>ant</em>.
No one’s claiming numpy is “the most important package” or anything
like that. Certainly more packages depend on distutils, e.g., than
depend on numpy, and far fewer source files import distutils than
import numpy. But this is fine for our present purposes. Most
source files don’t import distutils because most source files don’t
care how they’re distributed, so long as they are; these source
files thus don’t care about details of how distutils’ API works.
This PEP is in some sense about changing how numpy’s and related
packages’ APIs work, so the relevant metric is to look at source
files that are choosing to directly interact with that API, which
is sort of like what we get by looking at import statements.</p>
</aside>
<aside class="footnote brackets" id="hugunin" role="doc-footnote">
<dt class="label" id="hugunin">[<a href="#id1">13</a>]</dt>
<dd>The first such proposal occurs in Jim Hugunin’s very
first email to the matrix SIG in 1995, which lays out the first
draft of what became Numeric. He suggests using <code class="docutils literal notranslate"><span class="pre">*</span></code> for
elementwise multiplication, and <code class="docutils literal notranslate"><span class="pre">%</span></code> for matrix multiplication:
<a class="reference external" href="https://mail.python.org/pipermail/matrix-sig/1995-August/000002.html">https://mail.python.org/pipermail/matrix-sig/1995-August/000002.html</a></aside>
<aside class="footnote brackets" id="atat-discussion" role="doc-footnote">
<dt class="label" id="atat-discussion">[<a href="#id15">14</a>]</dt>
<dd><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069502.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069502.html</a></aside>
<aside class="footnote brackets" id="associativity-discussions" role="doc-footnote">
<dt class="label" id="associativity-discussions">[<a href="#id10">15</a>]</dt>
<dd><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html</a>
<a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069605.html</a></aside>
<aside class="footnote brackets" id="oil-industry-versus-right-associativity" role="doc-footnote">
<dt class="label" id="oil-industry-versus-right-associativity">[<a href="#id12">16</a>]</dt>
<dd><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069610.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069610.html</a></aside>
<aside class="footnote brackets" id="numpy-associativity-counts" role="doc-footnote">
<dt class="label" id="numpy-associativity-counts">[<a href="#id13">17</a>]</dt>
<dd><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html</a></aside>
<aside class="footnote brackets" id="group-associativity" role="doc-footnote">
<dt class="label" id="group-associativity">[<a href="#id11">18</a>]</dt>
<dd><a class="reference external" href="http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069530.html">http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069530.html</a></aside>
</aside>
</section>
<section id="copyright">
<h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2>
<p>This document has been placed in the public domain.</p>
</section>
</section>
<hr class="docutils" />
<p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0465.rst">https://github.com/python/peps/blob/main/peps/pep-0465.rst</a></p>
<p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0465.rst">2023-09-09 17:39:29 GMT</a></p>
</article>
<nav id="pep-sidebar">
<h2>Contents</h2>
<ul>
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#specification">Specification</a></li>
<li><a class="reference internal" href="#motivation">Motivation</a><ul>
<li><a class="reference internal" href="#executive-summary">Executive summary</a></li>
<li><a class="reference internal" href="#background-what-s-wrong-with-the-status-quo">Background: What’s wrong with the status quo?</a></li>
<li><a class="reference internal" href="#why-should-matrix-multiplication-be-infix">Why should matrix multiplication be infix?</a></li>
<li><a class="reference internal" href="#transparent-syntax-is-especially-crucial-for-non-expert-programmers">Transparent syntax is especially crucial for non-expert programmers</a></li>
<li><a class="reference internal" href="#but-isn-t-matrix-multiplication-a-pretty-niche-requirement">But isn’t matrix multiplication a pretty niche requirement?</a></li>
<li><a class="reference internal" href="#so-is-good-for-matrix-formulas-but-how-common-are-those-really">So <code class="docutils literal notranslate"><span class="pre">&#64;</span></code> is good for matrix formulas, but how common are those really?</a></li>
<li><a class="reference internal" href="#but-isn-t-it-weird-to-add-an-operator-with-no-stdlib-uses">But isn’t it weird to add an operator with no stdlib uses?</a></li>
</ul>
</li>
<li><a class="reference internal" href="#compatibility-considerations">Compatibility considerations</a></li>
<li><a class="reference internal" href="#intended-usage-details">Intended usage details</a><ul>
<li><a class="reference internal" href="#semantics">Semantics</a></li>
<li><a class="reference internal" href="#adoption">Adoption</a></li>
</ul>
</li>
<li><a class="reference internal" href="#implementation-details">Implementation details</a></li>
<li><a class="reference internal" href="#rationale-for-specification-details">Rationale for specification details</a><ul>
<li><a class="reference internal" href="#choice-of-operator">Choice of operator</a></li>
<li><a class="reference internal" href="#precedence-and-associativity">Precedence and associativity</a></li>
<li><a class="reference internal" href="#non-definitions-for-built-in-types">(Non)-Definitions for built-in types</a></li>
<li><a class="reference internal" href="#non-definition-of-matrix-power">Non-definition of matrix power</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rejected-alternatives-to-adding-a-new-operator">Rejected alternatives to adding a new operator</a></li>
<li><a class="reference internal" href="#discussions-of-this-pep">Discussions of this PEP</a></li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
<br>
<a id="source" href="https://github.com/python/peps/blob/main/peps/pep-0465.rst">Page Source (GitHub)</a>
</nav>
</section>
<script src="../_static/colour_scheme.js"></script>
<script src="../_static/wrap_tables.js"></script>
<script src="../_static/sticky_banner.js"></script>
</body>
</html>