Totally new version of the Numbers ABC PEP, received fresh from Jeffrey.

Guido van Rossum 2007-05-17 00:23:41 +00:00
parent 94353f3e7b
commit 9efd40b2e2
1 changed files with 282 additions and 389 deletions

@ -1,5 +1,5 @@
PEP: 3141 PEP: 3141
Title: A Type Hierarchy for Numbers (and other algebraic entities) Title: A Type Hierarchy for Numbers
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Jeffrey Yasskin <jyasskin@gmail.com> Author: Jeffrey Yasskin <jyasskin@gmail.com>
@ -13,18 +13,13 @@ Post-History: Not yet posted
Abstract Abstract
======== ========
This proposal defines a hierarchy of Abstract Base Classes (ABCs) (see This proposal defines a hierarchy of Abstract Base Classes (ABCs) (PEP
PEP 3119) to represent numbers and other algebraic entities similar to 3119) to represent number-like classes. It proposes a hierarchy of
numbers. It proposes: ``Number :> Complex :> Real :> Rational :> Integer`` where ``A :> B``
means "A is a supertype of B", and a pair of ``Exact``/``Inexact``
* A hierarchy of algebraic concepts, including monoids, groups, rings, classes to capture the difference between ``floats`` and
and fields with successively more operators and constraints on their ``ints``. These types are significantly inspired by Scheme's numeric
operators. This will be added as a new library module named tower [#schemetower]_.
"algebra".
* A hierarchy of specifically numeric types, which can be converted to
and from the native Python types. This will be added as a new
library module named "numbers".
Rationale Rationale
========= =========
@ -32,439 +27,337 @@ Rationale
Functions that take numbers as arguments should be able to determine Functions that take numbers as arguments should be able to determine
the properties of those numbers, and if and when overloading based on the properties of those numbers, and if and when overloading based on
types is added to the language, should be overloadable based on the types is added to the language, should be overloadable based on the
types of the arguments. This PEP defines some abstract base classes types of the arguments. For example, slicing requires its arguments to
that are useful in numerical calculations. A function can check that be ``Integers``, and the functions in the ``math`` module require
variable is an instance of one of these classes and then rely on the their arguments to be ``Real``.
properties specified for them. Of course, the language cannot check
these properties, so where I say something is "guaranteed", I really
just mean that it's one of those properties a user should be able to
rely on.
This PEP tries to find a balance between providing fine-grained
distinctions and specifying types that few people will ever use.
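A minimal sketch of the kind of check the rationale describes, assuming the ``numbers`` module as it later shipped in the standard library (where the integer ABC is named ``Integral`` rather than ``Integer``). The helper names ``describe`` and ``safe_sqrt`` are hypothetical and only illustrate the idea:

```python
import math
import numbers
from fractions import Fraction


def describe(x):
    """Return the name of the narrowest numeric ABC that x satisfies."""
    # Walk the tower from the most specific ABC to the most general one,
    # so the first match is the narrowest.
    for abc in (numbers.Integral, numbers.Rational, numbers.Real,
                numbers.Complex, numbers.Number):
        if isinstance(x, abc):
            return abc.__name__
    return "not a number"


def safe_sqrt(x):
    """Accept any Real, as the math module does; reject everything else."""
    if not isinstance(x, numbers.Real):
        raise TypeError("expected a Real, got " + type(x).__name__)
    return math.sqrt(x)
```

A caller can then rely on the properties of the ABC it checked for, for example ``safe_sqrt(Fraction(9, 4))`` works even though ``Fraction`` is not a ``float``.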
Specification Specification
============= =============
This PEP specifies a set of Abstract Base Classes with default
implementations. If the reader prefers to think in terms of Roles (PEP
3133), the default implementations for (for example) the Real ABC
would be moved to a RealDefault class, with Real keeping just the
method declarations.
Although this PEP uses terminology from PEP 3119, the hierarchy is Although this PEP uses terminology from PEP 3119, the hierarchy is
meaningful for any systematic method of defining sets of intended to be meaningful for any systematic method of defining sets
classes. **Todo:** link to the Interfaces PEP when it's ready. I'm of classes, including Interfaces. I'm also using the extra notation
also using the extra notation from PEP 3107 (annotations) to specify from PEP 3107 (Function Annotations) to specify some types.
some types.
Object oriented systems have a general problem in constraining
functions that take two arguments. To take addition as an example,
``int(3) + int(4)`` is defined, and ``vector(1,2,3) + vector(3,4,5)``
is defined, but ``int(3) + vector(3,4,5)`` doesn't make much sense. So
``a + b`` is not guaranteed to be defined for any two instances of
``AdditiveGroup``, but it is guaranteed to be defined when ``type(a)
== type(b)``. On the other hand, ``+`` does make sense for any sorts
of numbers, so the ``Complex`` ABC refines the properties for plus so
that ``a + b`` is defined whenever ``isinstance(a,Complex) and
isinstance(b,Complex)``, even if ``type(a) != type(b)``.
Monoids (http://en.wikipedia.org/wiki/Monoid) consist of a set with an Exact vs. Inexact Classes
associative operation, and an identity element under that -------------------------
operation. **Open issue**: Is a @classmethod the best way to define
constants that depend only on the type?::
class MonoidUnderPlus(Abstract): Floating point values may not exactly obey several of the properties
"""+ is associative but not necessarily commutative and has an you would expect. For example, it is possible for ``(X + -X) + 3 ==
identity given by plus_identity(). 3``, but ``X + (-X + 3) == 0``. On the range of values that most
functions deal with this isn't a problem, but it is something to be
aware of.
Subclasses follow the laws: Therefore, I define ``Exact`` and ``Inexact`` ABCs to mark whether
types have this problem. Every instance of ``Integer`` and
``Rational`` should be Exact, but ``Reals`` and ``Complexes`` may or
may not be. (Do we really only need one of these, and the other is
defined as ``not`` the first?)::
a + (b + c) === (a + b) + c class Exact(metaclass=MetaABC): pass
a.plus_identity() + a === a === a + a.plus_identity() class Inexact(metaclass=MetaABC): pass
Sequences are monoids under plus (in Python) but are not
AdditiveGroups.
"""
@abstractmethod
def __add__(self, other):
raise NotImplementedError
@classmethod
@abstractmethod
def plus_identity(cls):
raise NotImplementedError
I skip ordinary non-commutative groups here because I don't have any
common examples of groups that use ``+`` as their operator but aren't
commutative. If we find some, the class can be added later.::
class AdditiveGroup(MonoidUnderPlus):
"""Defines a commutative group whose operator is +, and whose inverses
are produced by -x.
See http://en.wikipedia.org/wiki/Abelian_group.
Where a, b, and c are instances of the same subclass of
AdditiveGroup, the operations should follow these laws, where
'zero' is a.__class__.zero().
a + b === b + a
(a + b) + c === a + (b + c)
zero + a === a
a + (-a) === zero
a - b === a + -b
Some abstract subclasses, such as Complex, may extend the
definition of + to heterogeneous subclasses, but AdditiveGroup only
guarantees it's defined on arguments of exactly the same types.
Vectors are AdditiveGroups but are not Rings.
"""
@abstractmethod
def __add__(self, other):
"""Associative commutative operation, whose inverse is negation."""
raise NotImplementedError
**Open issue:** Do we want to give people a choice of which of the
following to define, or should we pick one arbitrarily?::
# AdditiveGroup, continued
def __neg__(self):
"""Must define this or __sub__()."""
return self.zero() - self
def __sub__(self, other):
"""Must define this or __neg__()."""
return self + -other
@classmethod
@abstractmethod
def zero(cls):
"""A better name for +'s identity as we move into more mathematical
domains."""
raise NotImplementedError
@classmethod
def plus_identity(cls):
return cls.zero()
Including Semiring (http://en.wikipedia.org/wiki/Semiring) would help
a little with defining a type for the natural numbers. That can be
split out once someone needs it (see ``IntegralDomain`` for how).::
class Ring(AdditiveGroup):
"""A mathematical ring over the operations + and *.
See http://en.wikipedia.org/wiki/Ring_%28mathematics%29.
In addition to the requirements of the AdditiveGroup superclass, a
Ring has an associative but not necessarily commutative
multiplication operation with identity (one) that distributes over
addition. A Ring can be constructed from any integer 'i' by adding
'one' to itself 'i' times. When R is a subclass of Ring, the
additive identity is R(0), and the multiplicative identity is
R(1).
Matrices are Rings but not Commutative Rings or Division
Rings. The quaternions are a Division Ring but not a
Field. The integers are a Commutative Ring but not a Field.
"""
@abstractmethod
def __init__(self, i:int):
"""An instance of a Ring may be constructed from an integer.
This may be a lossy conversion, as in the case of the integers
modulo N."""
pass
@abstractmethod
def __mul__(self, other):
"""Satisfies:
a * (b * c) === (a * b) * c
one * a === a
a * one === a
a * (b + c) === a * b + a * c
where one == a.__class__(1)
"""
raise NotImplementedError
@classmethod
def zero(cls):
return cls(0)
@classmethod
def one(cls):
return cls(1)
I'm skipping both CommutativeRing and DivisionRing here.::
class Field(Ring):
"""The class Field adds to Ring the requirement that * be a
commutative group operation except that zero does not have an
inverse.
See http://en.wikipedia.org/wiki/Field_%28mathematics%29.
Practically, that means we can define division on a Field. The
additional laws are:
a * b === b * a
a / a === a.__class__(1) # when a != a.__class__(0)
Division lets us construct a Field from any Python float,
although the conversion is likely to be lossy. Some Fields
include the real numbers, rationals, and integers mod a
prime. Python's ``float`` resembles a Field closely.
"""
def __init__(self, f:float):
"""A Field should be constructible from any rational number, which
includes Python floats."""
pass
@abstractmethod
def __div__(self, divisor):
raise NotImplementedError
Division is somewhat complicated in Python. You have both __floordiv__
and __div__, and ints produce floats when they're divided. For the
purposes of this hierarchy, ``__floordiv__(a, b)`` is defined by
``floor(__div__(a, b))``, and, since int is not a subclass of Field,
it's allowed to do whatever it wants with __div__.
There are four more reasonable classes that I'm skipping here in the
interest of keeping the initial library simple. They are:
``Algebraic``
Rational powers of its elements are defined (and maybe a few other
operations)
(http://en.wikipedia.org/wiki/Algebraic_number). Complex numbers
are the most well-known algebraic set. Real numbers are _not_
algebraic, but Python does define these operations on floats,
which makes defining this class somewhat difficult.
``Transcendental``
The elementary functions
(http://en.wikipedia.org/wiki/Elementary_function) are
defined. These are basically arbitrary powers, trig functions, and
logs, the contents of ``cmath``.
The following two classes can be reasonably combined with ``Integral``
for now.
``IntegralDomain``
Defines __divmod__.
(http://darcs.haskell.org/numericprelude/docs/html/Algebra-IntegralDomain.html#t%3AC)
``PrincipalIdealDomain``
Defines gcd and lcm.
(http://darcs.haskell.org/numericprelude/docs/html/Algebra-PrincipalIdealDomain.html#t%3AC)
If someone needs to split them later, they can use code like::
import numbers
class IntegralDomain(Ring): ...
numbers.Integral.__bases__ = (IntegralDomain,) + numbers.Integral.__bases__
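The floating point caveat from the Exact vs. Inexact discussion can be checked directly. This is a small sketch, assuming IEEE 754 doubles for ``float``; the value ``1e20`` is just a hypothetical stand-in for ``X``, chosen because it is large enough that adding 3 to it is lost to rounding:

```python
from fractions import Fraction

X = 1e20  # hypothetical "large" inexact value

# X cancels first, so the 3 survives.
left = (X + -X) + 3
# Here 3 is absorbed when added to -X, and then X cancels to zero.
right = X + (-X + 3)

# The same expressions with an exact rational type agree with each other.
Xq = Fraction(10) ** 20
exact_left = (Xq + -Xq) + 3
exact_right = Xq + (-Xq + 3)
```

With these values ``left`` and ``right`` differ, while ``exact_left`` and ``exact_right`` are equal, which is the distinction the ``Exact`` and ``Inexact`` markers are meant to capture.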
Finally, we get to numbers. This is where we switch from the "algebra" Numeric Classes
module to the "numbers" module.:: ---------------
class Complex(Ring, Hashable): We begin with a Number class to make it easy for people to be fuzzy
"""The ``Complex`` ABC indicates that the value lies somewhere about what kind of number they expect. This class only helps with
on the complex plane, not that it in fact has a complex overloading; it doesn't provide any operations. **Open question:**
component: ``int`` is a subclass of ``Complex``. Because these Should it specify ``__add__``, ``__sub__``, ``__neg__``, ``__mul__``,
actually represent complex numbers, they can be converted to and ``__abs__`` like Haskell's ``Num`` class?::
the ``complex`` type.
``Complex`` finally gets around to requiring its subtypes to class Number(metaclass=MetaABC): pass
be immutable so they can be hashed in a standard way.
``Complex`` also requires its operations to accept
heterogeneous arguments. Subclasses should override the
operators to be more accurate when they can, but should fall
back on the default definitions to handle arguments of
different (Complex) types.
**Open issue:** __abs__ doesn't fit here because it doesn't Some types (primarily ``float``) define "Not a Number" (NaN) values
exist for the Gaussian integers that return false for any comparison, including equality with
(http://en.wikipedia.org/wiki/Gaussian_integer). In fact, it themselves, and are maintained through operations. Because this
only exists for algebraic complex numbers and real numbers. We doesn't work well with the Reals (which are otherwise totally ordered
could define it in both places, or leave it out of the by ``<``), Guido suggested we might put NaN in its own type. It is
``Complex`` classes entirely and let it be a custom extension conceivable that this can still be represented by C doubles but be
of the ``complex`` type. included in a different ABC at runtime. **Open issue:** Is this a good
idea?::
The Gaussian integers are ``Complex`` but not a ``Field``. class NotANumber(Number):
"""Implement IEEE 754 semantics."""
def __lt__(self, other): return False
def __eq__(self, other): return False
...
def __add__(self, other): return self
def __radd__(self, other): return self
...
Complex numbers are immutable and hashable. Implementors should be
careful that they make equal numbers equal and hash them to the same
values. This may be subtle if there are two different extensions of
the real numbers::
class Complex(Hashable, Number):
"""A ``Complex`` should define the operations that work on the
Python ``complex`` type. If it is given heterogeneous
arguments, it may fall back on this class's definition of the
operations: addition, subtraction, negation, and
multiplication. These operators should never return a
TypeError as long as both arguments are instances of Complex
(or even just implement __complex__).
""" """
@abstractmethod @abstractmethod
def __complex__(self): def __complex__(self):
"""Any Complex can be converted to a native complex object.""" """This operation gives the arithmetic operations a fallback.
raise NotImplementedError """
return complex(self.real, self.imag)
@property
def real(self):
return complex(self).real
@property
def imag(self):
return complex(self).imag
I define the reversed operations here so that they serve as the final
fallback for operations involving instances of Complex. **Open
issue:** Should Complex's operations check for ``isinstance(other,
Complex)``? Duck typing seems to imply that we should just try
__complex__ and succeed if it works, but stronger typing might be
justified for the operators. TODO: analyze the combinations of normal
and reversed operations with real and virtual subclasses of Complex::
def __radd__(self, other):
"""Should this catch any type errors and return
NotImplemented instead?"""
return complex(other) + complex(self)
def __rsub__(self, other):
return complex(other) - complex(self)
def __neg__(self):
return -complex(self)
def __rmul__(self, other):
return complex(other) * complex(self)
def __rdiv__(self, other):
return complex(other) / complex(self)
def __abs__(self):
return abs(complex(self))
def conjugate(self):
return complex(self).conjugate()
def __hash__(self): def __hash__(self):
"""Two "equal" values of different complex types should
hash in the same way."""
return hash(complex(self)) return hash(complex(self))
@abstractmethod
def real(self) => Real:
raise NotImplementedError
@abstractmethod The ``Real`` ABC indicates that the value is on the real line, and
def imag(self) => Real: supports the operations of the ``float`` builtin. Real numbers are
raise NotImplementedError totally ordered. (NaNs were handled above.)::
class Real(Complex, metaclass=TotallyOrderedABC):
@abstractmethod @abstractmethod
def __add__(self, other): def __float__(self):
"""The other Ring operations should be implemented similarly.""" """Any Real can be converted to a native float object."""
if isinstance(other, Complex): raise NotImplementedError
return complex(self) + complex(other) def __complex__(self):
"""Which gives us an easy way to define the conversion to
complex."""
return complex(float(self))
@property
def real(self): return self
@property
def imag(self): return 0
def __radd__(self, other):
if isinstance(other, Real):
return float(other) + float(self)
else:
return super(Real, self).__radd__(other)
def __rsub__(self, other):
if isinstance(other, Real):
return float(other) - float(self)
else:
return super(Real, self).__rsub__(other)
def __neg__(self):
return -float(self)
def __rmul__(self, other):
if isinstance(other, Real):
return float(other) * float(self)
else:
return super(Real, self).__rmul__(other)
def __rdiv__(self, other):
if isinstance(other, Real):
return float(other) / float(self)
else:
return super(Real, self).__rdiv__(other)
def __rdivmod__(self, other):
"""Implementing divmod() for your type is sufficient to
get floordiv and mod too.
"""
if isinstance(other, Real):
return divmod(float(other), float(self))
else:
return super(Real, self).__rdivmod__(other)
def __rfloordiv__(self, other):
return divmod(other, self)[0]
def __rmod__(self, other):
return divmod(other, self)[1]
def __trunc__(self):
"""Do we want properfraction, floor, ceiling, and round?"""
return trunc(float(self))
def __abs__(self):
return abs(float(self))
There is no way to define only the reversed comparison operators, so
these operations take precedence over any defined in the other
type. :( ::
def __lt__(self, other):
"""The comparison operators in Python seem to be more
strict about their input types than other functions. I'm
guessing here that we want types to be incompatible even
if they define a __float__ operation, unless they also
declare themselves to be Real numbers.
"""
if isinstance(other, Real):
return float(self) < float(other)
else: else:
return NotImplemented return NotImplemented
``FractionalComplex(Complex, Field)`` might fit here, except that it def __le__(self, other):
wouldn't give us any new operations.:: if isinstance(other, Real):
return float(self) <= float(other)
else:
return NotImplemented
class Real(Complex, TotallyOrdered): def __eq__(self, other):
"""Numbers along the real line. Some subclasses of this class if isinstance(other, Real):
may contain NaNs that are not ordered with the rest of the return float(self) == float(other)
instances of that type. Oh well. **Open issue:** what problems else:
will that cause? Is it worth it in order to get a return NotImplemented
straightforward type hierarchy?
There is no built-in rational type, but it's straightforward to write,
so we provide an ABC for it::
class Rational(Real, Exact):
"""rational.numerator and rational.denominator should be in
lowest terms.
""" """
@abstractmethod @abstractmethod
@property
def numerator(self):
raise NotImplementedError
@abstractmethod
@property
def denominator(self):
raise NotImplementedError
def __float__(self): def __float__(self):
raise NotImplementedError return self.numerator / self.denominator
def __complex__(self):
return complex(float(self))
def real(self) => self.__class__:
return self
def imag(self) => self.__class__:
return self.__class__(0)
def __abs__(self) => self.__class__:
if self < 0: return -self
else: return self
class FractionalReal(Real, Field): class Integer(Rational):
"""Rationals and floats. This class provides concrete
definitions of the other four methods from properfraction and
allows you to convert fractional reals to integers in a
disciplined way.
"""
@abstractmethod
def properfraction(self) => (int, self.__class__):
"""Returns a pair (n,f) such that self == n+f, and:
* n is an integral number with the same sign as self; and
* f is a fraction with the same type and sign as self, and with
absolute value less than 1.
"""
raise NotImplementedError
def floor(self) => int:
n, r = self.properfraction()
return n - 1 if r < 0 else n
def ceiling(self) => int: ...
def __trunc__(self) => int: ...
def round(self) => int: ...
**Open issue:** What's the best name for this class? RealIntegral? Integer?::
class Integral(Real):
"""Integers!"""
@abstractmethod @abstractmethod
def __int__(self): def __int__(self):
raise NotImplementedError raise NotImplementedError
def __float__(self): def __float__(self):
return float(int(self)) return float(int(self))
@property
def numerator(self): return self
@property
def denominator(self): return 1
@abstractmethod def __ror__(self, other):
def __or__(self, other): return int(other) | int(self)
raise NotImplementedError def __rxor__(self, other):
@abstractmethod return int(other) ^ int(self)
def __xor__(self, other): def __rand__(self, other):
raise NotImplementedError return int(other) & int(self)
@abstractmethod def __rlshift__(self, other):
def __and__(self, other): return int(other) << int(self)
raise NotImplementedError def __rrshift__(self, other):
@abstractmethod return int(other) >> int(self)
def __lshift__(self, other):
raise NotImplementedError
@abstractmethod
def __rshift__(self, other):
raise NotImplementedError
@abstractmethod
def __invert__(self): def __invert__(self):
raise NotImplementedError return ~int(self)
def __radd__(self, other):
Floating point values may not exactly obey several of the properties """All of the Real methods need to be overridden here too
you would expect from their superclasses. For example, it is possible in order to get a more exact type for their results.
for ``(large_val + -large_val) + 3 == 3``, but ``large_val +
(-large_val + 3) == 0``. On the values most functions deal with this
isn't a problem, but it is something to be aware of. Types like this
inherit from ``FloatingReal`` so that functions that care can know to
use a numerically stable algorithm on them. **Open issue:** Is this
the proper way to handle floating types?::
class FloatingReal:
"""A "floating" number is one that is represented as
``mantissa * radix**exponent`` where mantissa, radix, and
exponent are all integers. Subclasses of FloatingReal don't
follow all the rules you'd expect numbers to follow. If you
really care about the answer, you have to use numerically
stable algorithms, whatever those are.
**Open issue:** What other operations would be useful here?
These include floats and Decimals.
""" """
@classmethod if isinstance(other, Integer):
@abstractmethod return int(other) + int(self)
def radix(cls) => int: else:
raise NotImplementedError return super(Integer, self).__radd__(other)
...
@classmethod def __hash__(self):
@abstractmethod """Surprisingly, hash() needs to be overridden too, since
def digits(cls) => int: there are integers that float can't represent."""
"""The number of significant digits of base cls.radix().""" return hash(int(self))
raise NotImplementedError
@classmethod
@abstractmethod
def exponentRange(cls) => (int, int):
"""A pair of the (lowest,highest) values possible in the exponent."""
raise NotImplementedError
@abstractmethod
def decode(self) => (int, int):
"""Returns a pair (mantissa, exponent) such that
mantissa*self.radix()**exponent == self."""
raise NotImplementedError
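The remark that ``hash()`` must be overridden for integers, "since there are integers that float can't represent", can be illustrated with the built-in types. This is only a sketch using present-day CPython behaviour and assuming IEEE 754 doubles, not part of the proposed ABCs; the name ``big`` is hypothetical:

```python
from fractions import Fraction

# 2**53 + 1 is the smallest positive integer that a double cannot hold.
big = 2**53 + 1
as_float = float(big)  # rounds to the nearest representable double

# The conversion is lossy, so hashing via float(self) would make two
# different integers collide and would break "equal values hash equally".
lossy = as_float == float(2**53)
still_distinct = big != as_float

# Values that really are equal across numeric types hash identically.
same_hash = hash(1) == hash(1.0) == hash(Fraction(1, 1)) == hash(complex(1, 0))
```

This is why an integer type should hash through ``int(self)`` rather than inherit a definition that routes through ``float`` or ``complex``.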
Adding More Numeric ABCs
------------------------
Inspiration There are, of course, more possible ABCs for numbers, and this would
=========== be a poor hierarchy if it precluded the possibility of adding
http://hackage.haskell.org/trac/haskell-prime/wiki/StandardClasses those. You can add ``MyFoo`` between ``Complex`` and ``Real`` with::
http://repetae.net/john/recent/out/classalias.html
class MyFoo(Complex): ...
MyFoo.register(Real)
TODO(jyasskin): Check this.
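The registration idiom above can be tried against the ``numbers`` module as it later shipped in the standard library. This is a sketch under that assumption; ``MyFoo`` is the hypothetical intermediate ABC from the text and is never instantiated here:

```python
import numbers


class MyFoo(numbers.Complex):
    """Hypothetical ABC that sits between Complex and Real."""


# Declare every Real to be a (virtual) MyFoo, which slots MyFoo in
# between Complex and Real without touching either class.
MyFoo.register(numbers.Real)

float_is_myfoo = isinstance(1.5, MyFoo)    # float is a Real, hence a MyFoo
int_is_myfoo = isinstance(3, MyFoo)        # int is a Real too
complex_is_myfoo = isinstance(1j, MyFoo)   # complex is only a Complex
```

Under these assumptions the first two checks succeed and the last one fails, so existing ``Real`` types pick up the new ABC while plain ``Complex`` types do not.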
Rejected Alternatives
=====================
The initial version of this PEP defined an algebraic hierarchy
inspired by a Haskell Numeric Prelude [#numericprelude]_ including
MonoidUnderPlus, AdditiveGroup, Ring, and Field, and mentioned several
other possible algebraic types before getting to the numbers. I had
expected this to be useful to people using vectors and matrices, but
the NumPy community really wasn't interested. The numbers then had a
much more branching structure to include things like the Gaussian
Integers and Z/nZ, which could be Complex but wouldn't necessarily
support things like division. The community decided that this was too
much complication for Python, so the proposal has been scaled back to
resemble the Scheme numeric tower much more closely.
References References
========== ==========
.. [1] Introducing Abstract Base Classes .. [#pep3119] Introducing Abstract Base Classes
(http://www.python.org/dev/peps/pep-3119/) (http://www.python.org/dev/peps/pep-3119/)
.. [2] Function Annotations .. [#pep3107] Function Annotations
(http://www.python.org/dev/peps/pep-3107/) (http://www.python.org/dev/peps/pep-3107/)
.. [3] Possible Python 3K Class Tree?, wiki page created by Bill Janssen .. [3] Possible Python 3K Class Tree?, wiki page created by Bill Janssen
(http://wiki.python.org/moin/AbstractBaseClasses) (http://wiki.python.org/moin/AbstractBaseClasses)
.. [4] NumericPrelude: An experimental alternative .. [#numericprelude] NumericPrelude: An experimental alternative hierarchy of numeric type classes
hierarchy of numeric type classes
(http://darcs.haskell.org/numericprelude/docs/html/index.html) (http://darcs.haskell.org/numericprelude/docs/html/index.html)
.. [#schemetower] The Scheme numerical tower
(http://www.swiss.ai.mit.edu/ftpdir/scheme-reports/r5rs-html/r5rs_8.html#SEC50)
Acknowledgements Acknowledgements
---------------- ================
Thanks to Neal Norwitz for helping me through the PEP process. Thanks to Neal Norwitz for encouraging me to write this PEP in the
first place, to Travis Oliphant for pointing out that the numpy people
The Haskell Numeric Prelude [4]_ nicely condensed a lot didn't really care about the algebraic concepts, and to Guido van
of experience with the Haskell numeric hierarchy into a form that was Rossum, Collin Winter, and lots of other people on the mailing list
relatively easily adaptable to Python. for refining the concept.
Copyright Copyright
========= =========