diff --git a/pep-0327.txt b/pep-0327.txt index f06e7f183..3d6fe4c8e 100644 --- a/pep-0327.txt +++ b/pep-0327.txt @@ -18,25 +18,27 @@ The idea is to have a Decimal data type, for every use where decimals are needed but binary floating point is too inexact. The Decimal data type will support the Python standard functions and -operations, and must comply the decimal arithmetic ANSI standard +operations, and must comply with the decimal arithmetic ANSI standard X3.274-1996 [1]_. Decimal will be floating point (as opposed to fixed point) and will have bounded precision (the precision is the upper limit on the -quantity of significant digits in a result). +number of significant digits in a result). However, precision is +user-settable, and a notion of significant trailing zeroes is supported +so that fixed-point usage is also possible. This work is based on code and test functions written by Eric Price, Aahz and Tim Peters. Actually I'll work on the Decimal.py code in the sandbox (at python/nondist/sandbox/decimal in the SourceForge CVS repository). Much of the explanation in this PEP is taken from -Cowlishaw's work [2]_ and comp.lang.python. +Cowlishaw's work [2]_, comp.lang.python and python-dev. Motivation ========== Here I'll expose the reasons of why I think a Decimal data type is -needed and why others numeric data types are not enough. +needed and why other numeric data types are not enough. I wanted a Money data type, and after proposing a pre-PEP in comp.lang.python, the community agreed to have a numeric data type @@ -47,7 +49,7 @@ purpose of this PEP to have a data type that can be used as Money without further effort. One of the biggest advantages of implementing a standard is that -someone already thought all the creepy cases for you. And to a +someone already thought out all the creepy cases for you. And to a standard GvR redirected me: Mike Cowlishaw's General Decimal Arithmetic specification [2]_. 
This document defines a general purpose decimal arithmetic. A correct implementation of this @@ -85,7 +87,9 @@ big or too small. For example, with a precision of 5:: 1234 ==> 1234e0 12345 ==> 12345e0 - 123456 ==> 12345e1 + 123456 ==> 12346e1 + +(note that in the last line the number got rounded to fit in five digits). In contrast, we have the example of a ``long`` integer with infinite precision, meaning that you can have the number as big as you want, @@ -108,12 +112,12 @@ This generated adverse reactions in comp.lang.python, as everybody wants to have support for the ``/`` operator in a numeric data type. With this exposed maybe you're thinking "Hey! Can we just store the 1 -and the 3 as numerator and denominator?", which take us to the next +and the 3 as numerator and denominator?", which takes us to the next point. -Why not rational ----------------- +Why not rational? +----------------- Rational numbers are stored using two integer numbers, the numerator and the denominator. This implies that the arithmetic operations @@ -238,17 +242,23 @@ The context is a set of parameters and rules that the user can select and which govern the results of operations (for example, the precision to be used). -The context gets that name because surrounds the Decimal numbers. -It's up to the implementation to work with one or several contexts, +The context gets that name because it surrounds the Decimal numbers, +with parts of context acting as input to, and output of, operations. +It's up to the application to work with one or several contexts, but definitely the idea is not to get a context per Decimal number. +For example, a typical use would be to set the context's precision to +20 digits at the start of a program, and never explicitly use context +again. These definitions don't affect the internal storage of the Decimal numbers, just the way that the arithmetic operations are performed. 
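The distinction drawn here between storage and arithmetic can be sketched with the ``decimal`` module as it later shipped in the Python standard library. That is an assumption: the draft API discussed in this patch differs in places, for instance in how ``repr()`` looks. Construction keeps every digit, and the context's precision applies only once an operation is performed.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

# Work on a temporary copy of the thread context so nothing global changes.
with localcontext() as ctx:
    ctx.prec = 5                      # at most five significant digits per result
    ctx.rounding = ROUND_HALF_EVEN    # set explicitly so the sketch is deterministic

    stored = Decimal("123456")        # construction ignores the context
    rounded = +stored                 # unary plus is an operation, so it rounds

print(stored)    # all six digits are kept
print(rounded)   # rounded to five significant digits
```

This mirrors the ``123456 ==> 12346e1`` line in the precision example: the sixth digit is only lost when an operation produces a new result.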
-The context is defined by the following parameters: +The context is mainly defined by the following parameters (see +`Context Attributes`_ for all context attributes): - Precision: The maximum number of significant digits that can result - from an arithmetic operation (integer > 0). + from an arithmetic operation (integer > 0). There is no maximum for + this value. - Rounding: The name of the algorithm to be used when rounding is necessary, one of "round-down", "round-half-up", "round-half-even", @@ -281,7 +291,7 @@ Extended Default Context: - flags: all set to 0 - trap-enablers: all set to 0 -- precision: is set to the designated single precision +- precision: is set to 9 - rounding: is set to round-half-even @@ -301,7 +311,6 @@ Division by zero division-by-zero [sign,inf] Division impossible invalid-operation [0,qNaN] Division undefined invalid-operation [0,qNaN] Inexact inexact unchanged -Insufficient storage [0,qNaN] Invalid context invalid-operation [0,qNaN] Invalid operation invalid-operation [0,qNaN] (or [s,qNaN] or [s,qNaN,d] when the cause is a signaling NaN) @@ -311,6 +320,53 @@ Subnormal subnormal unchanged Underflow underflow see spec [2]_ ==================== ================= =================================== +Note: when the standard talks about "Insufficient storage", as long as +this is implementation-specific behaviour about not having enough +storage to keep the internals of the number, this implementation will +raise MemoryError. + +Regarding Overflow and Underflow, there's been a long discussion in +python-dev about artificial limits. The general consensus is to keep +the artificial limits only if there are important reasons to do that. +Tim Peters gives us three: + + ...eliminating bounds on exponents effectively means overflow + (and underflow) can never happen. But overflow *is* a valuable + safety net in real life fp use, like a canary in a coal mine, + giving danger signs early when a program goes insane. 
+ + Virtually all implementations of 854 use (and as IBM's standard + even suggests) "forbidden" exponent values to encode non-finite + numbers (infinities and NaNs). A bounded exponent can do this at + virtually no extra storage cost. If the exponent is unbounded, + then additional bits have to be used instead. This cost remains + hidden until more time- and space- efficient implementations are + attempted. + + Big as it is, the IBM standard is a tiny start at supplying a + complete numeric facility. Having no bound on exponent size will + enormously complicate the implementations of, e.g., decimal sin() + and cos() (there's then no a priori limit on how many digits of + pi effectively need to be known in order to perform argument + reduction). + +Edward Loper gives us an example of when the limits are to be crossed: +probabilities. + +That said, Robert Brewer and Andrew Lentvorski want the limits to be +easily modifiable by the users. Actually, this is quite possible:: + + >>> d1 = Decimal("1e999999999") # at the exponent limit + >>> d1 + Decimal( (0, (1,), 999999999L) ) + >>> d1 * 10 # exceed the limit, got infinity + Decimal( (0, (0,), 'F') ) + >>> getcontext().Emax = 1000000000 # increase the limit + >>> d1 * 10 # does not exceed any more + Decimal( (0, (1, 0), 999999999L) ) + >>> d1 * 100 # exceed again + Decimal( (0, (0,), 'F') ) + Rounding Algorithms ------------------- @@ -345,7 +401,7 @@ by 1 if its rightmost digit is odd (to make an even digit):: ``round-ceiling``: If all of the discarded digits are zero or if the sign is negative the result is unchanged; otherwise, the result is -incremented by 1:: +incremented by 1 (round toward positive infinity):: 1.123 --> 1.13 1.128 --> 1.13 @@ -354,7 +410,8 @@ incremented by 1:: ``round-floor``: If all of the discarded digits are zero or if the sign is positive the result is unchanged; otherwise, the absolute -value of the result is incremented by 1:: +value of the result is incremented by 1 (round toward negative 
+infinity):: 1.123 --> 1.12 1.128 --> 1.12 @@ -386,7 +443,7 @@ Rationale I must separate the requirements in two sections. The first is to comply with the ANSI standard. All the requirements for this are specified in the Mike Cowlishaw's work [2]_. He also provided a -**comprehensive** suite of test cases. +**very large** suite of test cases. The second section of requirements (standard Python functions support, usability, etc.) is detailed from here, where I'll include all the @@ -398,7 +455,8 @@ Explicit construction The explicit construction does not get affected by the context (there is no rounding, no limits by the precision, etc.), because the context -affects just operations' results. +affects just operations' results. The only exception to this is when +you're `Creating from Context`_. From int or long @@ -413,14 +471,20 @@ There's no loss and no need to specify any other information:: From string ''''''''''' -Strings with floats in normal and engineering notation will be -supported. In this transformation there is no loss of information, as -the string is directly converted to Decimal (there is not an -intermediate conversion through float):: +Strings containing Python decimal integer literals and Python float +literals will be supported. In this transformation there is no loss +of information, as the string is directly converted to Decimal (there +is not an intermediate conversion through float):: Decimal("-12") Decimal("23.2e-7") +Also, you can construct in this way all special values (Infinity and +Not a Number):: + + Decimal("Inf") + Decimal("NaN") + From float '''''''''' @@ -444,13 +508,13 @@ Roth, it's easy to implement: to do it are quite well known. But If I *really* want my number to be -``Decimal('110000000000000008881784197001252...e-51')``, why can not +``Decimal('110000000000000008881784197001252...e-51')``, why can't I write ``Decimal(1.1)``? Why should I expect Decimal to be "rounding" it? 
Remember that ``1.1`` *is* binary floating point, so I can predict the result. It's not intuitive to a beginner, but that's the way it is. -Anyway, Paul Moore shown that (1) can't be, because:: +Anyway, Paul Moore showed that (1) can't work, because:: (1) says D(1.1) == D('1.1') but 1.1 == 1.1000000000000001 @@ -460,7 +524,7 @@ Anyway, Paul Moore shown that (1) can't be, because:: which is wrong, because if I write ``Decimal('1.1')`` it is exact, not ``D(1.1000000000000001)``. He also proposed to have an explicit conversion to float. bokr says you need to put the precision in the -constructor and mwilson has the idea to:: +constructor and mwilson agreed:: d = Decimal (1.1, 1) # take float value to 1 decimal place d = Decimal (1.1) # gets `places` from pre-set context @@ -475,15 +539,16 @@ So, the accepted solution through c.l.p is that you can not call Decimal with a float. Instead you must use a method: Decimal.from_float(). The syntax:: - Decimal.from_float(floatNumber, [positions]) - -where ``floatNumber`` is the float number origin of the construction and -``positions`` is the positions after the decimal point where you apply a -round-half-up rounding, if any. In this way you can do, for example:: + Decimal.from_float(floatNumber, [decimal_places]) - Decimal.from_float(1.1, 2): The same that doing Decimal('1.1'). - Decimal.from_float(1.1, 16): The same that doing Decimal('1.1000000000000001'). - Decimal.from_float(1.1): The same that doing Decimal('110000000000000008881784197001252...e-51'). +where ``floatNumber`` is the float number origin of the construction +and ``decimal_places`` are the number of digits after the decimal +point where you apply a round-half-up rounding, if any. In this way +you can do, for example:: + + Decimal.from_float(1.1, 2): The same as doing Decimal('1.1'). + Decimal.from_float(1.1, 16): The same as doing Decimal('1.1000000000000001'). 
+ Decimal.from_float(1.1): The same as doing Decimal('1100000000000000088817841970012523233890533447265625e-51'). From tuples @@ -500,6 +565,11 @@ and the exponent is a signed int or long:: Decimal((1, (3, 2, 2, 5), -2)) # for -32.25 +Of course, you can construct in this way all special values:: + + Decimal( (0, (0,), 'F') ) # for Infinity + Decimal( (0, (0,), 'n') ) # for Not a Number + From Decimal '''''''''''' @@ -513,11 +583,99 @@ Syntax for All Cases :: Decimal(value1) - Decimal.from_float(value2, [decimal_digits]) + Decimal.from_float(value2, [decimal_places]) -where ``value1`` can be int, long, string, tuple or Decimal, -``value1`` can be only float, and ``decimal_digits`` is an optional -int. +where ``value1`` can be int, long, string, 3-tuple or Decimal, +``value2`` can only be float, and ``decimal_places`` is an optional +non negative int. + + +Creating from Context +''''''''''''''''''''' + +This item arose in python-dev from two sources in parallel. Ka-Ping +Yee proposes to pass the context as an argument at instance creation +(he wants the context he passes to be used only in creation time: "It +would not be persistent"). Tony Meyer asks from_string to honor the +context if it receives a parameter "honour_context" with a True value. +(I don't like it, because the doc specifies that the context be +honored and I don't want the method to comply with the specification +regarding the value of an argument.) + +Tim Peters gives us a reason to have a creation that uses context: + + In general number-crunching, literals may be given to high + precision, but that precision isn't free and *usually* isn't + needed + +Casey Duncan wants to use another method, not a bool arg: + + I find boolean arguments a general anti-pattern, especially given + we have class methods. Why not use an alternate constructor like + Decimal.rounded_to_context("3.14159265"). 
+ +In the process of deciding the syntax of that, Tim came up with a +better idea: he proposes not to have a method in Decimal to create +with a different context, but having instead a method in Context to +create a Decimal instance. Basically, instead of:: + + D.using_context(number, context) + +it will be:: + + context.create_decimal(number) + +From Tim: + + While all operations in the spec except for the two to-string + operations use context, no operations in the spec support an + optional local context. That the Decimal() constructor ignores + context by default is an extension to the spec. We must supply a + context-honoring from-string operation to meet the spec. I + recommend against any concept of "local context" in any operation + -- it complicates the model and isn't necessary. + +So, we decided to use a context method to create a Decimal that will +use (only to be created) that context in particular (for further +operations it will use the context of the thread). But, a method with +what name? + +Tim Peters proposes three methods to create from diverse sources +(from_string, from_int, from_float). I proposed to use one method, +``create_decimal()``, without caring about the data type. Michael +Chermside: "The name just fits my brain. The fact that it uses the +context is obvious from the fact that it's Context method". + +The community agreed with that. I think that it's OK because a newbie +will not be using the creation method from Context (the separate +method in Decimal to construct from float is just to prevent newbies +from encountering binary floating point issues). 
+ +So, in short, if you want to create a Decimal instance using a +particular context (that will be used just at creation time and not +any further), you'll have to use a method of that context:: + + # n is any datatype accepted in Decimal(n) plus float + mycontext.create_decimal(n) + +Example:: + + >>> # create a standard decimal instance + >>> Decimal("11.2233445566778899") + Decimal( (0, (1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9), -16) ) + >>> + >>> # create a decimal instance using the thread context + >>> thread_context = getcontext() + >>> thread_context.prec + 9 + >>> thread_context.create_decimal("11.2233445566778899") + Decimal( (0, (1, 1, 2, 2, 3, 3, 4, 4, 6), -7L) ) + >>> + >>> # create a decimal instance using other context + >>> other_context = thread_context.copy() + >>> other_context.prec = 4 + >>> other_context.create_decimal("11.2233445566778899") + Decimal( (0, (1, 1, 2, 2), -2L) ) Implicit construction @@ -632,21 +790,376 @@ Python Usability - Decimal should support unary operators (``-, +, abs``). +- repr() should round trip, meaning that:: + + m = Decimal(...) + m == eval(repr(m)) + +- Decimal should be immutable. + - Decimal should support the built-in methods: - min, max - float, int, long - str, repr - hash - - copy, deepcopy - bool (0 is false, otherwise true) -- Calling repr() should do round trip, meaning that:: +There's been some discussion in python-dev about the behaviour of +``hash()``. The community agrees that if the values are the same, the +hashes of those values should also be the same. So, while Decimal(25) +== 25 is True, hash(Decimal(25)) should be equal to hash(25). - m = Decimal(...) - m == eval(repr(m)) +The detail is that you can NOT compare Decimal to floats or strings, +so we should not worry about them giving the same hashes. In short:: -- Decimal should be immutable. 
+ hash(n) == hash(Decimal(n)) # Only if n is int, long, or Decimal + +Regarding str() and repr() behaviour, Ka-Ping Yee proposes that repr() +have the same behaviour as str() and Tim Peters proposes that str() +behave like the to-scientific-string operation from the Spec. + +This is possible, because (from Aahz): "The string form already +contains all the necessary information to reconstruct a Decimal +object". + +And it also complies with the Spec; Tim Peters: + + There's no requirement to have a method *named* "to_sci_string", + the only requirement is that *some* way to spell to-sci-string's + functionality be supplied. The meaning of to-sci-string is + precisely specified by the standard, and is a good choice for both + str(Decimal) and repr(Decimal). + + +Documentation +============= + +This section explains all the public methods and attributes of Decimal +and Context. + + +Decimal Attributes +------------------ + +Decimal has no public attributes. The internal information is stored +in slots and should not be accessed by end users. + + +Decimal Methods +--------------- + +Following are the conversion and arithmetic operations defined in the +Spec, and how that functionality can be achieved with the actual +implementation. + +- to-scientific-string: Use builtin function ``str()``:: + + >>> d = Decimal('123456789012.345') + >>> str(d) + '1.23456789E+11' + +- to-engineering-string: Use method ``to_eng_string()``:: + + >>> d = Decimal('123456789012.345') + >>> d.to_eng_string() + '123.456789E+9' + +- to-number: Use Context method ``create_decimal()``. The standard + constructor or ``from_float()`` constructor cannot be used because + these do not use the context (as is specified in the Spec for this + conversion). 
+ +- abs: Use builtin function ``abs()``:: + + >>> d = Decimal('-15.67') + >>> abs(d) + Decimal('15.67') + +- add: Use operator ``+``:: + + >>> d = Decimal('15.6') + >>> d + 8 + Decimal('23.6') + +- subtract: Use operator ``-``:: + + >>> d = Decimal('15.6') + >>> d - 8 + Decimal('7.6') + +- compare: Use method ``compare()``. This method (and not the + built-in function cmp()) should only be used when dealing with + *special values*:: + + >>> d = Decimal('-15.67') + >>> nan = Decimal('NaN') + >>> d.compare(23) + '-1' + >>> d.compare(nan) + 'NaN' + >>> cmp(d, 23) + -1 + >>> cmp(d, nan) + 1 + +- divide: Use operator ``/``:: + + >>> d = Decimal('-15.67') + >>> d / 2 + Decimal('-7.835') + +- divide-integer: Use operator ``//``:: + + >>> d = Decimal('-15.67') + >>> d // 2 + Decimal('-7') + +- max: Use method ``max()``. Only use this method (and not the + built-in function max()) when dealing with *special values*:: + + >>> d = Decimal('15') + >>> nan = Decimal('NaN') + >>> d.max(8) + Decimal('15') + >>> d.max(nan) + Decimal('NaN') + +- min: Use method ``min()``. 
Only use this method (and not the + built-in function min()) when dealing with *special values*:: + + >>> d = Decimal('15') + >>> nan = Decimal('NaN') + >>> d.min(8) + Decimal('8') + >>> d.min(nan) + Decimal('NaN') + +- minus: Use unary operator ``-``:: + + >>> d = Decimal('-15.67') + >>> -d + Decimal('15.67') + +- plus: Use unary operator ``+``:: + + >>> d = Decimal('-15.67') + >>> +d + Decimal('-15.67') + +- multiply: Use operator ``*``:: + + >>> d = Decimal('5.7') + >>> d * 3 + Decimal('17.1') + +- normalize: Use method ``normalize()``:: + + >>> d = Decimal('123.45000') + >>> d.normalize() + Decimal('123.45') + >>> d = Decimal('120.00') + >>> d.normalize() + Decimal('1.2E+2') + +- quantize: Use method ``quantize()``:: + + >>> d = Decimal('2.17') + >>> d.quantize(Decimal('0.001')) + Decimal('2.170') + >>> d.quantize(Decimal('0.1')) + Decimal('2.2') + +- remainder: Use operator ``%``:: + + >>> d = Decimal('10') + >>> d % 3 + Decimal('1') + >>> d % 6 + Decimal('4') + +- remainder-near: Use method ``remainder_near()``:: + + >>> d = Decimal('10') + >>> d.remainder_near(3) + Decimal('1') + >>> d.remainder_near(6) + Decimal('-2') + +- round-to-integral-value: Use method ``to_integral()``:: + + >>> d = Decimal('-123.456') + >>> d.to_integral() + Decimal('-123') + +- same-quantum: Use method ``same_quantum()``:: + + >>> d = Decimal('123.456') + >>> d.same_quantum(Decimal('0.001')) + True + >>> d.same_quantum(Decimal('0.01')) + False + +- square-root: Use method ``sqrt()``:: + + >>> d = Decimal('123.456') + >>> d.sqrt() + Decimal('11.1110756') + +- power: Use operator ``**``:: + + >>> d = Decimal('12.56') + >>> d ** 2 + Decimal('157.7536') + +Following are other methods and why they exist: + +- ``adjusted()``: Returns the adjusted exponent. 
This concept is + defined in the Spec: the adjusted exponent is the value of the + exponent of a number when that number is expressed as though in + scientific notation with one digit before any decimal point:: + + >>> d = Decimal('12.56') + >>> d.adjusted() + 1 + +- ``from_float()``: Class method to create instances from float data + types:: + + >>> d = Decimal.from_float(12.35) + >>> d + Decimal('12.3500000') + +- ``as_tuple()``: Show the internal structure of the Decimal, the + triple tuple. This method is not required by the Spec, but Tim + Peters proposed it and the community agreed to have it (it's useful + for developing and debugging):: + + >>> d = Decimal('123.4') + >>> d.as_tuple() + (0, (1, 2, 3, 4), -1) + >>> d = Decimal('-2.34e5') + >>> d.as_tuple() + (1, (2, 3, 4), 3) + + +Context Attributes +------------------ + +These are the attributes that can be changed to modify the context. + +- ``prec`` (int): the precision:: + + >>> c.prec + 9 + +- ``rounding`` (str): rounding type (how to round):: + + >>> c.rounding + 'half_even' + +- ``trap_enablers`` (dict): if trap_enablers[exception] = 1, then an + exception is raised when it is caused:: + + >>> c.trap_enablers[Underflow] + 0 + >>> c.trap_enablers[Clamped] + 0 + +- ``flags`` (dict): when an exception is caused, flags[exception] is + incremented (whether or not the trap_enabler is set). Should be + reset by the user of Decimal instance:: + + >>> c.flags[Underflow] + 0 + >>> c.flags[Clamped] + 0 + +- ``Emin`` (int): minimum exponent:: + + >>> c.Emin + -999999999 + +- ``Emax`` (int): maximum exponent:: + + >>> c.Emax + 999999999 + +- ``capitals`` (int): boolean flag to use 'E' (True/1) or 'e' + (False/0) in the string (for example, '1.32e+2' or '1.32E+2'):: + + >>> c.capitals + 1 + + +Context Methods +--------------- + +The following methods comply with Decimal functionality from the Spec. +Be aware that the operations that are called through a specific +context use that context and not the thread context. 
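A small sketch of that last point, again assuming the ``decimal`` module of today's standard library rather than the draft implementation described in this patch. An operation invoked as a method of a specific context uses that context's precision, while the same operation written with an operator uses the thread context. The precisions 4 and 9 are illustrative values only.

```python
from decimal import Context, Decimal, localcontext

# A standalone context with four digits of precision (illustrative value).
four_digits = Context(prec=4)

with localcontext() as thread_ctx:
    thread_ctx.prec = 9               # precision of the temporary thread context

    via_context = four_digits.divide(Decimal(1), Decimal(3))
    via_operator = Decimal(1) / Decimal(3)

print(via_context)    # four significant digits
print(via_operator)   # nine significant digits
```

Using ``localcontext()`` keeps the sketch from changing the precision of the surrounding program.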
+ +To use these methods, take note that the syntax changes when the +operator is binary or unary, for example:: + + >>> mycontext.abs(Decimal('-2')) + '2' + >>> mycontext.multiply(Decimal('2.3'), 5) + '11.5' + +So, the following are the Spec operations and conversions and how to +achieve them through a context (where ``d`` is a Decimal instance and +``n`` a number that can be used in an `Implicit construction`_): + +- to-scientific-string: ``to_sci_string(d)`` +- to-engineering-string: ``to_eng_string(d)`` +- to-number: ``create_decimal(number)``, see `Explicit construction`_ + for ``number``. +- abs: ``abs(d)`` +- add: ``add(d, n)`` +- subtract: ``subtract(d, n)`` +- compare: ``compare(d, n)`` +- divide: ``divide(d, n)`` +- divide-integer: ``divide_int(d, n)`` +- max: ``max(d, n)`` +- min: ``min(d, n)`` +- minus: ``minus(d)`` +- plus: ``plus(d)`` +- multiply: ``multiply(d, n)`` +- normalize: ``normalize(d)`` +- quantize: ``quantize(d, d)`` +- remainder: ``remainder(d)`` +- remainder-near: ``remainder_near(d)`` +- round-to-integral-value: ``to_integral(d)`` +- same-quantum: ``same_quantum(d, d)`` +- square-root: ``sqrt(d)`` +- power: ``power(d, n)`` + +The following methods are to support decimal functionality through +Context: + +- ``divmod(d, n)`` +- ``eq(d, d)`` +- ``gt(d, d)`` +- ``lt(d, d)`` + +These are methods that return useful information from the Context: + +- ``Etiny()``: Minimum exponent considering precision. + + >>> c.Emin + -999999999 + >>> c.Etiny() + -1000000007 + +- ``Etop()``: Maximum exponent considering precision. + + >>> c.Emax + 999999999 + >>> c.Etop() + 999999991 + +- ``copy()``: Returns a copy of the context. Reference Implementation @@ -656,7 +1169,6 @@ To be included later: - code - test code -- documentation References