update from Facundo Batista

David Goodger 2004-06-16 14:27:00 +00:00
parent d62b41a649
commit cf742400b9
1 changed files with 556 additions and 44 deletions


@ -18,25 +18,27 @@ The idea is to have a Decimal data type, for every use where decimals
are needed but binary floating point is too inexact.
The Decimal data type will support the Python standard functions and
operations, and must comply the decimal arithmetic ANSI standard
operations, and must comply with the decimal arithmetic ANSI standard
X3.274-1996 [1]_.
Decimal will be floating point (as opposed to fixed point) and will
have bounded precision (the precision is the upper limit on the
quantity of significant digits in a result).
number of significant digits in a result). However, precision is
user-settable, and a notion of significant trailing zeroes is supported
so that fixed-point usage is also possible.
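As a small illustration of significant trailing zeroes, the following sketch uses the ``decimal`` module as shipped with current Python 3 (an assumption; its API may differ from the draft described here). The variable names are only illustrative.

```python
from decimal import Decimal

# Two amounts written with two decimal places each.
price = Decimal("1.30")
total = price + Decimal("1.20")

# The trailing zero is kept, so the result still reads as a
# two-decimal-place (fixed-point style) value.
print(total)             # 2.50
print(total.as_tuple())  # exponent is -2, digits are (2, 5, 0)
```

The sum is numerically equal to ``Decimal("2.5")``, but its exponent records the two decimal places.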
This work is based on code and test functions written by Eric Price,
Aahz and Tim Peters. Actually I'll work on the Decimal.py code in the
sandbox (at python/nondist/sandbox/decimal in the SourceForge CVS
repository). Much of the explanation in this PEP is taken from
Cowlishaw's work [2]_ and comp.lang.python.
Cowlishaw's work [2]_, comp.lang.python and python-dev.
Motivation
==========
Here I'll expose the reasons of why I think a Decimal data type is
needed and why others numeric data types are not enough.
needed and why other numeric data types are not enough.
I wanted a Money data type, and after proposing a pre-PEP in
comp.lang.python, the community agreed to have a numeric data type
@ -47,7 +49,7 @@ purpose of this PEP to have a data type that can be used as Money
without further effort.
One of the biggest advantages of implementing a standard is that
someone already thought all the creepy cases for you. And to a
someone already thought out all the creepy cases for you. And to a
standard GvR redirected me: Mike Cowlishaw's General Decimal
Arithmetic specification [2]_. This document defines a general
purpose decimal arithmetic. A correct implementation of this
@ -85,7 +87,9 @@ big or too small. For example, with a precision of 5::
1234 ==> 1234e0
12345 ==> 12345e0
123456 ==> 12345e1
123456 ==> 12346e1
(note that in the last line the number got rounded to fit in five digits).
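The rounding in the last line can be reproduced with a minimal sketch, assuming the ``decimal`` module of current Python 3 and its ``Context.create_decimal()`` method (the draft API may differ in detail):

```python
from decimal import Context

# A context limited to five significant digits (default rounding is
# round-half-even in the modern module).
ctx = Context(prec=5)

exact = ctx.create_decimal(12345)     # fits in five digits
rounded = ctx.create_decimal(123456)  # needs rounding

print(exact.as_tuple())    # digits (1, 2, 3, 4, 5), exponent 0
print(rounded.as_tuple())  # digits (1, 2, 3, 4, 6), exponent 1
```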
In contrast, we have the example of a ``long`` integer with infinite
precision, meaning that you can have the number as big as you want,
@ -108,12 +112,12 @@ This generated adverse reactions in comp.lang.python, as everybody
wants to have support for the ``/`` operator in a numeric data type.
With this exposed maybe you're thinking "Hey! Can we just store the 1
and the 3 as numerator and denominator?", which take us to the next
and the 3 as numerator and denominator?", which takes us to the next
point.
Why not rational
----------------
Why not rational?
-----------------
Rational numbers are stored using two integer numbers, the numerator
and the denominator. This implies that the arithmetic operations
@ -238,17 +242,23 @@ The context is a set of parameters and rules that the user can select
and which govern the results of operations (for example, the precision
to be used).
The context gets that name because surrounds the Decimal numbers.
It's up to the implementation to work with one or several contexts,
The context gets that name because it surrounds the Decimal numbers,
with parts of context acting as input to, and output of, operations.
It's up to the application to work with one or several contexts,
but definitely the idea is not to get a context per Decimal number.
For example, a typical use would be to set the context's precision to
20 digits at the start of a program, and never explicitly use context
again.
These definitions don't affect the internal storage of the Decimal
numbers, just the way that the arithmetic operations are performed.
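A sketch of that distinction, assuming the ``decimal`` module of current Python 3 (``localcontext()`` is used here only to avoid changing the global context; a program would typically set ``getcontext().prec`` once):

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 20

    # Construction is not limited by the precision: all 25 digits are stored.
    stored = Decimal("1.234567890123456789012345")

    # Arithmetic results are limited to 20 significant digits.
    third = Decimal(1) / Decimal(3)
    rounded = +stored  # unary plus is an operation, so it rounds

print(len(stored.as_tuple().digits))   # 25
print(len(third.as_tuple().digits))    # 20
print(len(rounded.as_tuple().digits))  # 20
```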
The context is defined by the following parameters:
The context is mainly defined by the following parameters (see
`Context Attributes`_ for all context attributes):
- Precision: The maximum number of significant digits that can result
from an arithmetic operation (integer > 0).
from an arithmetic operation (integer > 0). There is no maximum for
this value.
- Rounding: The name of the algorithm to be used when rounding is
necessary, one of "round-down", "round-half-up", "round-half-even",
@ -281,7 +291,7 @@ Extended Default Context:
- flags: all set to 0
- trap-enablers: all set to 0
- precision: is set to the designated single precision
- precision: is set to 9
- rounding: is set to round-half-even
@ -301,7 +311,6 @@ Division by zero division-by-zero [sign,inf]
Division impossible invalid-operation [0,qNaN]
Division undefined invalid-operation [0,qNaN]
Inexact inexact unchanged
Insufficient storage [0,qNaN]
Invalid context invalid-operation [0,qNaN]
Invalid operation invalid-operation [0,qNaN] (or [s,qNaN] or [s,qNaN,d]
when the cause is a signaling NaN)
@ -311,6 +320,53 @@ Subnormal subnormal unchanged
Underflow underflow see spec [2]_
==================== ================= ===================================
Note: the "Insufficient storage" condition in the standard describes
implementation-specific behaviour (not having enough storage to keep
the internals of the number), so this implementation will raise
MemoryError instead.
Regarding Overflow and Underflow, there's been a long discussion in
python-dev about artificial limits. The general consensus is to keep
the artificial limits only if there are important reasons to do that.
Tim Peters gives us three:
...eliminating bounds on exponents effectively means overflow
(and underflow) can never happen. But overflow *is* a valuable
safety net in real life fp use, like a canary in a coal mine,
giving danger signs early when a program goes insane.
Virtually all implementations of 854 use (and as IBM's standard
even suggests) "forbidden" exponent values to encode non-finite
numbers (infinities and NaNs). A bounded exponent can do this at
virtually no extra storage cost. If the exponent is unbounded,
then additional bits have to be used instead. This cost remains
hidden until more time- and space- efficient implementations are
attempted.
Big as it is, the IBM standard is a tiny start at supplying a
complete numeric facility. Having no bound on exponent size will
enormously complicate the implementations of, e.g., decimal sin()
and cos() (there's then no a priori limit on how many digits of
pi effectively need to be known in order to perform argument
reduction).
Edward Loper gives us an example of when the limits would be crossed:
probabilities.
That said, Robert Brewer and Andrew Lentvorski want the limits to be
easily modifiable by the users. Actually, this is quite possible::
>>> d1 = Decimal("1e999999999") # at the exponent limit
>>> d1
Decimal( (0, (1,), 999999999L) )
>>> d1 * 10 # exceed the limit, got infinity
Decimal( (0, (0,), 'F') )
>>> getcontext().Emax = 1000000000 # increase the limit
>>> d1 * 10 # does not exceed any more
Decimal( (0, (1, 0), 999999999L) )
>>> d1 * 100 # exceed again
Decimal( (0, (0,), 'F') )
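The same experiment can be sketched against the ``decimal`` module of current Python 3. This is an assumption-laden variant: the exponent limits are deliberately small (99 and 100) so that it runs on any platform, and all traps are disabled so that overflow yields Infinity rather than raising an exception.

```python
from decimal import Context, Decimal, Overflow

# Small artificial limits; traps=[] means signals only set flags.
ctx = Context(prec=9, Emin=-99, Emax=99, traps=[])

d1 = ctx.create_decimal("1e99")          # at the exponent limit
over = ctx.multiply(d1, Decimal(10))     # exceeds the limit
overflowed = bool(ctx.flags[Overflow])   # the signal was recorded

ctx.Emax = 100                           # increase the limit
ok = ctx.multiply(d1, Decimal(10))       # no longer exceeds it
over_again = ctx.multiply(d1, Decimal(100))  # exceeds it again

print(over, overflowed, ok.is_finite(), over_again)
```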
Rounding Algorithms
-------------------
@ -345,7 +401,7 @@ by 1 if its rightmost digit is odd (to make an even digit)::
``round-ceiling``: If all of the discarded digits are zero or if the
sign is negative the result is unchanged; otherwise, the result is
incremented by 1::
incremented by 1 (round toward positive infinity)::
1.123 --> 1.13
1.128 --> 1.13
@ -354,7 +410,8 @@ incremented by 1::
``round-floor``: If all of the discarded digits are zero or if the
sign is positive the result is unchanged; otherwise, the absolute
value of the result is incremented by 1::
value of the result is incremented by 1 (round toward negative
infinity)::
1.123 --> 1.12
1.128 --> 1.12
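A sketch of both directed roundings, assuming the ``ROUND_CEILING`` and ``ROUND_FLOOR`` constants and the ``quantize()`` method of the ``decimal`` module in current Python 3. The helper ``round_to_cents`` is hypothetical and exists only for this illustration.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR

CENT = Decimal("0.01")

def round_to_cents(text, mode):
    """Round the decimal written in `text` to two places using `mode`."""
    return Decimal(text).quantize(CENT, rounding=mode)

# Ceiling rounds toward positive infinity, floor toward negative infinity.
print(round_to_cents("1.123", ROUND_CEILING))   # 1.13
print(round_to_cents("-1.123", ROUND_CEILING))  # -1.12
print(round_to_cents("1.128", ROUND_FLOOR))     # 1.12
print(round_to_cents("-1.123", ROUND_FLOOR))    # -1.13
```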
@ -386,7 +443,7 @@ Rationale
I must separate the requirements in two sections. The first is to
comply with the ANSI standard. All the requirements for this are
specified in the Mike Cowlishaw's work [2]_. He also provided a
**comprehensive** suite of test cases.
**very large** suite of test cases.
The second section of requirements (standard Python functions support,
usability, etc.) is detailed from here, where I'll include all the
@ -398,7 +455,8 @@ Explicit construction
The explicit construction does not get affected by the context (there
is no rounding, no limits by the precision, etc.), because the context
affects just operations' results.
affects just operations' results. The only exception to this is when
you're `Creating from Context`_.
From int or long
@ -413,14 +471,20 @@ There's no loss and no need to specify any other information::
From string
'''''''''''
Strings with floats in normal and engineering notation will be
supported. In this transformation there is no loss of information, as
the string is directly converted to Decimal (there is not an
intermediate conversion through float)::
Strings containing Python decimal integer literals and Python float
literals will be supported. In this transformation there is no loss
of information, as the string is directly converted to Decimal (there
is not an intermediate conversion through float)::
Decimal("-12")
Decimal("23.2e-7")
Also, you can construct in this way all special values (Infinity and
Not a Number)::
Decimal("Inf")
Decimal("NaN")
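A short check of the lossless string conversion, assuming the ``decimal`` module of current Python 3 (method names such as ``is_nan()`` come from that module, not from this draft):

```python
from decimal import Decimal

small = Decimal("23.2e-7")

# The digits and exponent come straight from the string; no binary
# float is involved, so nothing is lost.
print(small.as_tuple())  # digits (2, 3, 2), exponent -8

# Special values are built from strings as well.
infinity = Decimal("Inf")
not_a_number = Decimal("NaN")
print(infinity.is_infinite(), not_a_number.is_nan())  # True True
```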
From float
''''''''''
@ -444,13 +508,13 @@ Roth, it's easy to implement:
to do it are quite well known.
But If I *really* want my number to be
``Decimal('110000000000000008881784197001252...e-51')``, why can not
``Decimal('110000000000000008881784197001252...e-51')``, why can't I
write ``Decimal(1.1)``? Why should I expect Decimal to be "rounding"
it? Remember that ``1.1`` *is* binary floating point, so I can
predict the result. It's not intuitive to a beginner, but that's the
way it is.
Anyway, Paul Moore shown that (1) can't be, because::
Anyway, Paul Moore showed that (1) can't work, because::
(1) says D(1.1) == D('1.1')
but 1.1 == 1.1000000000000001
@ -460,7 +524,7 @@ Anyway, Paul Moore shown that (1) can't be, because::
which is wrong, because if I write ``Decimal('1.1')`` it is exact, not
``D(1.1000000000000001)``. He also proposed to have an explicit
conversion to float. bokr says you need to put the precision in the
constructor and mwilson has the idea to::
constructor and mwilson agreed::
d = Decimal (1.1, 1) # take float value to 1 decimal place
d = Decimal (1.1) # gets `places` from pre-set context
@ -475,15 +539,16 @@ So, the accepted solution through c.l.p is that you can not call Decimal
with a float. Instead you must use a method: Decimal.from_float(). The
syntax::
Decimal.from_float(floatNumber, [positions])
where ``floatNumber`` is the float number origin of the construction and
``positions`` is the positions after the decimal point where you apply a
round-half-up rounding, if any. In this way you can do, for example::
Decimal.from_float(floatNumber, [decimal_places])
Decimal.from_float(1.1, 2): The same that doing Decimal('1.1').
Decimal.from_float(1.1, 16): The same that doing Decimal('1.1000000000000001').
Decimal.from_float(1.1): The same that doing Decimal('110000000000000008881784197001252...e-51').
where ``floatNumber`` is the float number origin of the construction
and ``decimal_places`` is the number of digits after the decimal
point where you apply a round-half-up rounding, if any. In this way
you can do, for example::
Decimal.from_float(1.1, 2): The same as doing Decimal('1.1').
Decimal.from_float(1.1, 16): The same as doing Decimal('1.1000000000000001').
Decimal.from_float(1.1): The same as doing Decimal('1100000000000000088817841970012523233890533447265625e-51').
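In the ``decimal`` module of current Python 3, ``Decimal.from_float()`` takes only the float, so the sketch below emulates the optional second argument with a hypothetical helper, ``from_float_places``, built on ``quantize()`` with round-half-up. It is an illustration of the behaviour described above, not the draft implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

def from_float_places(value, decimal_places):
    """Hypothetical helper: exact float conversion, then round-half-up
    to `decimal_places` digits after the decimal point."""
    quantum = Decimal((0, (1,), -decimal_places))  # e.g. 0.01 for 2 places
    return Decimal.from_float(value).quantize(quantum, rounding=ROUND_HALF_UP)

exact = Decimal.from_float(1.1)        # every binary digit preserved
two = from_float_places(1.1, 2)        # numerically equal to Decimal('1.1')
sixteen = from_float_places(1.1, 16)   # Decimal('1.1000000000000001')

print(exact)
print(two, sixteen)
```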
From tuples
@ -500,6 +565,11 @@ and the exponent is a signed int or long::
Decimal((1, (3, 2, 2, 5), -2)) # for -32.25
Of course, you can construct in this way all special values::
Decimal( (0, (0,), 'F') ) # for Infinity
Decimal( (0, (0,), 'n') ) # for Not a Number
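The tuple forms can be tried as follows, assuming the ``decimal`` module of current Python 3 accepts the same ``'F'`` and ``'n'`` exponent markers (it does at the time of writing, but the printed representation may differ from the draft):

```python
from decimal import Decimal

# (sign, digits, exponent): sign 1 means negative.
negative = Decimal((1, (3, 2, 2, 5), -2))

# Special values use a letter in place of the exponent.
infinity = Decimal((0, (0,), 'F'))
not_a_number = Decimal((0, (0,), 'n'))

print(negative)                                        # -32.25
print(infinity.is_infinite(), not_a_number.is_nan())   # True True
```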
From Decimal
''''''''''''
@ -513,11 +583,99 @@ Syntax for All Cases
::
Decimal(value1)
Decimal.from_float(value2, [decimal_digits])
Decimal.from_float(value2, [decimal_places])
where ``value1`` can be int, long, string, tuple or Decimal,
``value1`` can be only float, and ``decimal_digits`` is an optional
int.
where ``value1`` can be int, long, string, 3-tuple or Decimal,
``value2`` can only be float, and ``decimal_places`` is an optional
non-negative int.
Creating from Context
'''''''''''''''''''''
This item arose in python-dev from two sources in parallel. Ka-Ping
Yee proposes to pass the context as an argument at instance creation
(he wants the context he passes to be used only in creation time: "It
would not be persistent"). Tony Meyer asks that from_string honor the
context if it receives a parameter "honour_context" with a True value.
(I don't like it, because the doc specifies that the context be
honored, and I don't want the method's compliance with the
specification to depend on the value of an argument.)
Tim Peters gives us a reason to have a creation that uses context:
In general number-crunching, literals may be given to high
precision, but that precision isn't free and *usually* isn't
needed
Casey Duncan wants to use another method, not a bool arg:
I find boolean arguments a general anti-pattern, especially given
we have class methods. Why not use an alternate constructor like
Decimal.rounded_to_context("3.14159265").
In the process of deciding the syntax of that, Tim came up with a
better idea: he proposes not to have a method in Decimal to create
with a different context, but having instead a method in Context to
create a Decimal instance. Basically, instead of::
D.using_context(number, context)
it will be::
context.create_decimal(number)
From Tim:
While all operations in the spec except for the two to-string
operations use context, no operations in the spec support an
optional local context. That the Decimal() constructor ignores
context by default is an extension to the spec. We must supply a
context-honoring from-string operation to meet the spec. I
recommend against any concept of "local context" in any operation
-- it complicates the model and isn't necessary.
So, we decided to use a context method to create a Decimal that will
use that particular context only for its creation (for further
operations it will use the context of the thread). But what should
the method be called?
Tim Peters proposes three methods to create from diverse sources
(from_string, from_int, from_float). I proposed to use one method,
``create_decimal()``, without caring about the data type. Michael
Chermside: "The name just fits my brain. The fact that it uses the
context is obvious from the fact that it's Context method".
The community agreed with that. I think that it's OK because a newbie
will not be using the creation method from Context (the separate
method in Decimal to construct from float is just to prevent newbies
from encountering binary floating point issues).
So, in short, if you want to create a Decimal instance using a
particular context (that will be used just at creation time and not
any further), you'll have to use a method of that context::
# n is any datatype accepted in Decimal(n) plus float
mycontext.create_decimal(n)
Example::
>>> # create a standard decimal instance
>>> Decimal("11.2233445566778899")
Decimal( (0, (1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9), -16) )
>>>
>>> # create a decimal instance using the thread context
>>> thread_context = getcontext()
>>> thread_context.prec
9
>>> thread_context.create_decimal("11.2233445566778899")
Decimal( (0, (1, 1, 2, 2, 3, 3, 4, 4, 6), -7L) )
>>>
>>> # create a decimal instance using other context
>>> other_context = thread_context.copy()
>>> other_context.prec = 4
>>> other_context.create_decimal("11.2233445566778899")
Decimal( (0, (1, 1, 2, 2), -2L) )
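The session above can be approximated with the ``decimal`` module of current Python 3. Two assumptions: the contexts are built explicitly with ``Context(prec=9)`` rather than taken from the thread (the modern default precision is 28, not 9), and the results print in string form rather than as tuples.

```python
from decimal import Context, Decimal

# The plain constructor ignores any context: all 18 digits are kept.
plain = Decimal("11.2233445566778899")

# Creation through a context rounds to that context's precision.
nine_digits = Context(prec=9)
from_nine = nine_digits.create_decimal("11.2233445566778899")

# A copy with a different precision, used only at creation time.
other_context = nine_digits.copy()
other_context.prec = 4
from_four = other_context.create_decimal("11.2233445566778899")

print(plain)      # 11.2233445566778899
print(from_nine)  # 11.2233446
print(from_four)  # 11.22
```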
Implicit construction
@ -632,21 +790,376 @@ Python Usability
- Decimal should support unary operators (``-, +, abs``).
- repr() should round trip, meaning that::
m = Decimal(...)
m == eval(repr(m))
- Decimal should be immutable.
- Decimal should support the built-in methods:
- min, max
- float, int, long
- str, repr
- hash
- copy, deepcopy
- bool (0 is false, otherwise true)
- Calling repr() should do round trip, meaning that::
There's been some discussion in python-dev about the behaviour of
``hash()``. The community agrees that if the values are the same, the
hashes of those values should also be the same. So, while Decimal(25)
== 25 is True, hash(Decimal(25)) should be equal to hash(25).
m = Decimal(...)
m == eval(repr(m))
The detail is that you can NOT compare Decimal to floats or strings,
so we should not worry about them giving the same hashes. In short::
- Decimal should be immutable.
hash(n) == hash(Decimal(n)) # Only if n is int, long, or Decimal
Regarding str() and repr() behaviour, Ka-Ping Yee proposes that repr()
have the same behaviour as str() and Tim Peters proposes that str()
behave like the to-scientific-string operation from the Spec.
This is possible, because (from Aahz): "The string form already
contains all the necessary information to reconstruct a Decimal
object".
And it also complies with the Spec; Tim Peters:
There's no requirement to have a method *named* "to_sci_string",
the only requirement is that *some* way to spell to-sci-string's
functionality be supplied. The meaning of to-sci-string is
precisely specified by the standard, and is a good choice for both
str(Decimal) and repr(Decimal).
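Both properties, equal hashes for equal values and a string form that reconstructs the number, can be checked with a short sketch against the ``decimal`` module of current Python 3 (whose ``repr()`` output differs from ``str()``, unlike the proposal above):

```python
from decimal import Decimal

# Equal values hash equally, so either can be used as the same dict key.
m = Decimal(25)
same_hash = (m == 25) and (hash(m) == hash(25))

# repr() round-trips through eval(), and str() carries enough
# information to rebuild the same digits and exponent.
d = Decimal("123.4500")
from_repr = eval(repr(d), {"Decimal": Decimal})
from_str = Decimal(str(d))

print(same_hash)                                  # True
print(from_repr.as_tuple() == d.as_tuple())       # True
print(from_str.as_tuple() == d.as_tuple())        # True
```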
Documentation
=============
This section explains all the public methods and attributes of Decimal
and Context.
Decimal Attributes
------------------
Decimal has no public attributes. The internal information is stored
in slots and should not be accessed by end users.
Decimal Methods
---------------
Following are the conversion and arithmetic operations defined in the
Spec, and how that functionality can be achieved with the actual
implementation.
- to-scientific-string: Use builtin function ``str()``::
>>> d = Decimal('123456789012.345')
>>> str(d)
'1.23456789E+11'
- to-engineering-string: Use method ``to_eng_string()``::
>>> d = Decimal('123456789012.345')
>>> d.to_eng_string()
'123.456789E+9'
- to-number: Use Context method ``create_decimal()``. The standard
constructor or ``from_float()`` constructor cannot be used because
these do not use the context (as is specified in the Spec for this
conversion).
- abs: Use builtin function ``abs()``::
>>> d = Decimal('-15.67')
>>> abs(d)
Decimal('15.67')
- add: Use operator ``+``::
>>> d = Decimal('15.6')
>>> d + 8
Decimal('23.6')
- subtract: Use operator ``-``::
>>> d = Decimal('15.6')
>>> d - 8
Decimal('7.6')
- compare: Use method ``compare()``. This method (and not the
built-in function cmp()) should only be used when dealing with
*special values*::
>>> d = Decimal('-15.67')
>>> nan = Decimal('NaN')
>>> d.compare(23)
'-1'
>>> d.compare(nan)
'NaN'
>>> cmp(d, 23)
-1
>>> cmp(d, nan)
1
- divide: Use operator ``/``::
>>> d = Decimal('-15.67')
>>> d / 2
Decimal('-7.835')
- divide-integer: Use operator ``//``::
>>> d = Decimal('-15.67')
>>> d // 2
Decimal('-7')
- max: Use method ``max()``. Only use this method (and not the
built-in function max()) when dealing with *special values*::
>>> d = Decimal('15')
>>> nan = Decimal('NaN')
>>> d.max(8)
Decimal('15')
>>> d.max(nan)
Decimal('NaN')
- min: Use method ``min()``. Only use this method (and not the
built-in function min()) when dealing with *special values*::
>>> d = Decimal('15')
>>> nan = Decimal('NaN')
>>> d.min(8)
Decimal('8')
>>> d.min(nan)
Decimal('NaN')
- minus: Use unary operator ``-``::
>>> d = Decimal('-15.67')
>>> -d
Decimal('15.67')
- plus: Use unary operator ``+``::
>>> d = Decimal('-15.67')
>>> +d
Decimal('-15.67')
- multiply: Use operator ``*``::
>>> d = Decimal('5.7')
>>> d * 3
Decimal('17.1')
- normalize: Use method ``normalize()``::
>>> d = Decimal('123.45000')
>>> d.normalize()
Decimal('123.45')
>>> d = Decimal('120.00')
>>> d.normalize()
Decimal('1.2E+2')
- quantize: Use method ``quantize()``::
>>> d = Decimal('2.17')
>>> d.quantize(Decimal('0.001'))
Decimal('2.170')
>>> d.quantize(Decimal('0.1'))
Decimal('2.2')
- remainder: Use operator ``%``::
>>> d = Decimal('10')
>>> d % 3
Decimal('1')
>>> d % 6
Decimal('4')
- remainder-near: Use method ``remainder_near()``::
>>> d = Decimal('10')
>>> d.remainder_near(3)
Decimal('1')
>>> d.remainder_near(6)
Decimal('-2')
- round-to-integral-value: Use method ``to_integral()``::
>>> d = Decimal('-123.456')
>>> d.to_integral()
Decimal('-123')
- same-quantum: Use method ``same_quantum()``::
>>> d = Decimal('123.456')
>>> d.same_quantum(Decimal('0.001'))
True
>>> d.same_quantum(Decimal('0.01'))
False
- square-root: Use method ``sqrt()``::
>>> d = Decimal('123.456')
>>> d.sqrt()
Decimal('11.1110756')
- power: Use operator ``**``::
>>> d = Decimal('12.56')
>>> d ** 2
Decimal('157.7536')
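Several of the operations listed above can be exercised together in a sketch that assumes the ``decimal`` module of current Python 3 with a precision of 9. Return types there may differ from the draft (for instance, ``compare()`` returns a Decimal rather than a string).

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 9
    d = Decimal("-15.67")
    results = {
        "abs": abs(d),
        "divide": d / 2,
        "divide-integer": d // 2,  # truncates toward zero
        "remainder": Decimal("10") % 6,
        "remainder-near": Decimal("10").remainder_near(Decimal(6)),
        "normalize": Decimal("120.00").normalize(),
        "quantize": Decimal("2.17").quantize(Decimal("0.1")),
        "compare-nan": d.compare(Decimal("NaN")),
    }

for name, value in results.items():
    print(name, value)
```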
Following are other methods and why they exist:
- ``adjusted()``: Returns the adjusted exponent. This concept is
defined in the Spec: the adjusted exponent is the value of the
exponent of a number when that number is expressed as though in
scientific notation with one digit before any decimal point::
>>> d = Decimal('12.56')
>>> d.adjusted()
1
- ``from_float()``: Class method to create instances from float data
types::
>>> d = Decimal.from_float(12.35)
>>> d
Decimal('12.3500000')
- ``as_tuple()``: Show the internal structure of the Decimal, the
triple tuple. This method is not required by the Spec, but Tim
Peters proposed it and the community agreed to have it (it's useful
for developing and debugging)::
>>> d = Decimal('123.4')
>>> d.as_tuple()
(0, (1, 2, 3, 4), -1)
>>> d = Decimal('-2.34e5')
>>> d.as_tuple()
(1, (2, 3, 4), 3)
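A sketch tying ``as_tuple()`` and ``adjusted()`` together, assuming the ``decimal`` module of current Python 3 (where ``as_tuple()`` returns a named tuple that the constructor accepts back):

```python
from decimal import Decimal

d = Decimal("-2.34e5")
sign, digits, exponent = d.as_tuple()

# The adjusted exponent is the exponent plus the digits after the first.
adjusted_by_hand = exponent + len(digits) - 1

# The tuple is a complete description: it rebuilds an equal Decimal.
rebuilt = Decimal(d.as_tuple())

print(sign, digits, exponent)          # 1 (2, 3, 4) 3
print(d.adjusted(), adjusted_by_hand)  # 5 5
print(rebuilt == d)                    # True
```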
Context Attributes
------------------
These are the attributes that can be changed to modify the context.
- ``prec`` (int): the precision::
>>> c.prec
9
- ``rounding`` (str): rounding type (how to round)::
>>> c.rounding
'half_even'
- ``trap_enablers`` (dict): if trap_enablers[exception] = 1, then an
exception is raised when it is caused::
>>> c.trap_enablers[Underflow]
0
>>> c.trap_enablers[Clamped]
0
- ``flags`` (dict): when an exception is caused, flags[exception] is
incremented (whether or not the trap_enabler is set). Should be
reset by the user of Decimal instance::
>>> c.flags[Underflow]
0
>>> c.flags[Clamped]
0
- ``Emin`` (int): minimum exponent::
>>> c.Emin
-999999999
- ``Emax`` (int): maximum exponent::
>>> c.Emax
999999999
- ``capitals`` (int): boolean flag to use 'E' (True/1) or 'e'
(False/0) in the string (for example, '1.32e+2' or '1.32E+2')::
>>> c.capitals
1
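The attributes can be seen working together in the sketch below. It assumes the ``decimal`` module of current Python 3, where ``trap_enablers`` is spelled ``traps`` and flags are booleans rather than counters; the attribute names in this draft may therefore not match exactly.

```python
from decimal import Context, Decimal, Inexact, ROUND_HALF_EVEN

# Start with every trap disabled so signals only set flags.
c = Context(prec=9, rounding=ROUND_HALF_EVEN, traps=[])

third = c.divide(Decimal(1), Decimal(3))  # inexact result
flag_raised = bool(c.flags[Inexact])      # the flag records it

# Flags stay set until the user clears them.
c.clear_flags()

# With the trap enabled, the same operation raises instead.
c.traps[Inexact] = True
try:
    c.divide(Decimal(1), Decimal(3))
    trapped = False
except Inexact:
    trapped = True

# capitals selects 'E' or 'e' in the string form.
c.capitals = 0
lower = c.to_sci_string(Decimal("1.32E+3"))

print(third, flag_raised, trapped, lower)
```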
Context Methods
---------------
The following methods comply with Decimal functionality from the Spec.
Be aware that the operations that are called through a specific
context use that context and not the thread context.
To use these methods, take note that the syntax changes when the
operator is binary or unary, for example::
>>> mycontext.abs(Decimal('-2'))
'2'
>>> mycontext.multiply(Decimal('2.3'), 5)
'11.5'
So, the following are the Spec operations and conversions and how to
achieve them through a context (where ``d`` is a Decimal instance and
``n`` a number that can be used in an `Implicit construction`_):
- to-scientific-string: ``to_sci_string(d)``
- to-engineering-string: ``to_eng_string(d)``
- to-number: ``create_decimal(number)``, see `Explicit construction`_
for ``number``.
- abs: ``abs(d)``
- add: ``add(d, n)``
- subtract: ``subtract(d, n)``
- compare: ``compare(d, n)``
- divide: ``divide(d, n)``
- divide-integer: ``divide_int(d, n)``
- max: ``max(d, n)``
- min: ``min(d, n)``
- minus: ``minus(d)``
- plus: ``plus(d)``
- multiply: ``multiply(d, n)``
- normalize: ``normalize(d)``
- quantize: ``quantize(d, d)``
- remainder: ``remainder(d, n)``
- remainder-near: ``remainder_near(d, n)``
- round-to-integral-value: ``to_integral(d)``
- same-quantum: ``same_quantum(d, d)``
- square-root: ``sqrt(d)``
- power: ``power(d, n)``
The following methods are to support decimal functionality through
Context:
- ``divmod(d, n)``
- ``eq(d, d)``
- ``gt(d, d)``
- ``lt(d, d)``
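That operations called through a specific context use that context, and not the thread context, can be sketched as follows. The sketch assumes the ``decimal`` module of current Python 3, where these methods return Decimal instances and the ``eq``, ``gt`` and ``lt`` helpers are not available.

```python
from decimal import Context, Decimal, localcontext

mycontext = Context(prec=3)

with localcontext() as thread_ctx:
    thread_ctx.prec = 28
    # Through mycontext: rounded to 3 digits.
    via_context = mycontext.divide(Decimal(1), Decimal(3))
    # Through the operator: rounded by the thread context (28 digits).
    via_operator = Decimal(1) / Decimal(3)

# Unary and binary operations take one or two operands respectively.
absolute = mycontext.abs(Decimal("-2"))
product = mycontext.multiply(Decimal("2.3"), Decimal(5))

print(via_context)                          # 0.333
print(len(via_operator.as_tuple().digits))  # 28
print(absolute, product)                    # 2 11.5
```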
These are methods that return useful information from the Context:
- ``Etiny()``: Minimum exponent considering precision::
>>> c.Emin
-999999999
>>> c.Etiny()
-1000000007
- ``Etop()``: Maximum exponent considering precision::
>>> c.Emax
999999999
>>> c.Etop()
999999991
- ``copy()``: Returns a copy of the context.
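The relationship between these values can be sketched with the ``decimal`` module of current Python 3. Smaller exponent limits than the defaults above are assumed here so that the example runs on any platform; the arithmetic is the same (``Etiny = Emin - prec + 1`` and ``Etop = Emax - prec + 1``).

```python
from decimal import Context

c = Context(prec=9, Emin=-999, Emax=999)

etiny = c.Etiny()  # Emin - prec + 1
etop = c.Etop()    # Emax - prec + 1

# copy() gives an independent context: changing it leaves `c` alone.
clone = c.copy()
clone.prec = 4

print(etiny, etop)         # -1007 991
print(c.prec, clone.prec)  # 9 4
```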
Reference Implementation
@ -656,7 +1169,6 @@ To be included later:
- code
- test code
- documentation
References