Update PEP 485 from Chris Barker's edits

This commit is contained in:
Chris Angelico 2015-02-05 10:17:42 +11:00
parent f19d84a049
commit 1641c51ddd
1 changed file with 397 additions and 87 deletions


@ -15,8 +15,10 @@ Abstract
========
This PEP proposes the addition of a function to the standard library
that determines whether one value is approximately equal or "close"
to another value.
that determines whether one value is approximately equal or "close" to
another value. It is also proposed that an assertion be added to the
``unittest.TestCase`` class to provide easy access for those using
unittest for testing.
Rationale
@ -37,27 +39,35 @@ the standard library.
Existing Implementations
------------------------
The standard library includes the ``unittest.TestCase.assertAlmostEqual``
method, but it:
* Is buried in the unittest.TestCase class
* Is an assertion, so you can't easily use it as a general test at the
  command line, etc.
* Is an absolute difference test. Often the measure of difference
  requires, particularly for floating point numbers, a relative error,
  i.e. "Are these two values within x% of each other?", rather than an
  absolute error, particularly when the magnitude of the values is
  unknown a priori.
The numpy package has the ``allclose()`` and ``isclose()`` functions,
but they are only available with numpy.
The statistics package tests include an implementation, used for its
unit tests.
One can also find discussion and sample implementations on Stack
Overflow and other help sites.
Many other non-Python systems provide such a test, including the
Boost C++ library and the APL language (reference?).
These existing implementations indicate that this is a common need and
not trivial to write oneself, making it a candidate for the standard
library.
Proposed Implementation
@ -68,21 +78,22 @@ python-ideas list [1]_.
The new function will have the following signature::
    is_close(a, b, rel_tolerance=1e-9, abs_tolerance=0.0)
``a`` and ``b``: are the two values to be tested for relative
closeness.
``rel_tolerance``: is the relative tolerance -- it is the amount of
error allowed, relative to the magnitudes of ``a`` and ``b``. For
example, to set a tolerance of 5%, pass ``rel_tolerance=0.05``. The
default tolerance is 1e-9, which assures that the two values are the
same within about 9 decimal digits.
``abs_tolerance``: is a minimum absolute tolerance level -- useful for
comparisons near zero.
Modulo error checking, etc., the function will return the result of::
    abs(a-b) <= max( rel_tolerance * min(abs(a), abs(b)), abs_tolerance )
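
A minimal sketch of such an implementation (error checking and the
special handling of non-finite values, discussed below, are omitted;
the names follow the proposed signature)::

    def is_close(a, b, rel_tolerance=1e-9, abs_tolerance=0.0):
        # Symmetric "strong" test: scale the tolerance by the smaller
        # of the two magnitudes, with abs_tolerance acting as a floor
        # for comparisons near zero.
        return abs(a - b) <= max(rel_tolerance * min(abs(a), abs(b)),
                                 abs_tolerance)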
Handling of non-finite numbers
@ -108,7 +119,7 @@ accommodate these types:
* ``int``
* ``Fraction``
* ``complex``: for complex, ``abs(z)`` will be used for scaling and
comparison.
@ -116,35 +127,85 @@ accommodate these types:
Behavior near zero
------------------
Relative comparison is problematic if either value is zero. By
definition, no value is small relative to zero. And computationally,
if either value is zero, the difference is the absolute value of the
other value, and the computed absolute tolerance will be
``rel_tolerance`` times that value. Since ``rel_tolerance`` is always
less than one, the difference will never be less than the tolerance.
However, while mathematically correct, there are many use cases where
a user will need to know if a computed value is "close" to zero. This
calls for an absolute tolerance test. If the user needs to call this
function inside a loop or comprehension, where some, but not all, of
the expected values may be zero, it is important that both a relative
tolerance and absolute tolerance can be tested for with a single
function with a single set of parameters.
There is a similar issue if the two values to be compared straddle zero:
if a is approximately equal to -b, then a and b will never be computed
as "close".
To handle this case, an optional parameter, ``abs_tolerance``, can be
used to set a minimum tolerance used in the case of very small or zero
computed absolute tolerance. That is, the values will always be
considered close if the difference between them is less than
``abs_tolerance``.
The default absolute tolerance value is set to zero because there is
no value that is appropriate for the general case. It is impossible to
know an appropriate value without knowing the likely values expected
for a given use case. If all the values tested are on the order of
one, then a value of about 1e-8 might be appropriate, but that would
be far too large if expected values are on the order of 1e-12 or
smaller.
Any non-zero default might result in users' tests passing totally
inappropriately. If, on the other hand, a test against zero fails the
first time with defaults, the user will be prompted to select an
appropriate value for the problem at hand in order to get the test to
pass.
NOTE: the author of this PEP has resolved to go back over many of his
tests that use the numpy ``allclose()`` function, which provides a
default absolute tolerance, and make sure that the default value is
appropriate.
If the user sets the ``rel_tolerance`` parameter to 0.0, then only the
absolute tolerance will affect the result. While not the goal of the
function, it does allow it to be used as a purely absolute tolerance
check as well.
unittest assertion
-------------------
[need text here]
Implementation
--------------
A sample implementation is available (as of Jan 22, 2015) on GitHub:
https://github.com/PythonCHB/close_pep/blob/master
This implementation has a flag that lets the user select which
relative tolerance test to apply -- this PEP does not suggest that
that flag be retained, but rather that the strong test be selected.
Relative Difference
===================
There are essentially two ways to think about how close two numbers
are to each other:
Absolute difference: simply ``abs(a-b)``
Relative difference: ``abs(a-b)/scale_factor`` [2]_.
The absolute difference is trivial enough that this proposal focuses
on the relative difference.
Usually, the scale factor is some function of the values under
consideration, for instance:
1) The absolute value of one of the input values
@ -152,7 +213,106 @@ consideration, for instance:
3) The minimum absolute value of the two.
4) The absolute value of the arithmetic mean of the two
These lead to the following possibilities for determining if two
values, a and b, are close to each other.
1) ``abs(a-b) <= tol*abs(a)``
2) ``abs(a-b) <= tol * max( abs(a), abs(b) )``
3) ``abs(a-b) <= tol * min( abs(a), abs(b) )``
4) ``abs(a-b) <= tol * abs(a + b)/2``
NOTE: (2) and (3) can also be written as:
2) ``(abs(a-b) <= tol*abs(a)) or (abs(a-b) <= tol*abs(b))``
3) ``(abs(a-b) <= tol*abs(a)) and (abs(a-b) <= tol*abs(b))``
(Boost refers to these as the "weak" and "strong" formulations [3]_)
These can be a tiny bit more computationally efficient, and thus are
used in the example code.
Each of these formulations can lead to slightly different results.
However, if the tolerance value is small, the differences are quite
small. In fact, often less than available floating point precision.
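
As a sketch, the four tests can be written side by side (``tol`` is
the relative tolerance; these are illustrative, not proposed API)::

    def close_one_sided(a, b, tol):  # (1) scaled by one input (asymmetric)
        return abs(a - b) <= tol * abs(a)

    def close_weak(a, b, tol):       # (2) larger magnitude -- Boost "weak"
        return abs(a - b) <= tol * max(abs(a), abs(b))

    def close_strong(a, b, tol):     # (3) smaller magnitude -- Boost "strong"
        return abs(a - b) <= tol * min(abs(a), abs(b))

    def close_mean(a, b, tol):       # (4) absolute value of the mean
        return abs(a - b) <= tol * abs(a + b) / 2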
How much difference does it make?
---------------------------------
When selecting a method to determine closeness, one might want to know
how much of a difference it could make to use one test or the other
-- i.e. how many values are there (or what range of values) that will
pass one test, but not the other.
The largest difference is between options (2) and (3) where the
allowable absolute difference is scaled by either the larger or
smaller of the values.
Define ``delta`` to be the difference between the allowable absolute
tolerance defined by the larger value and that defined by the smaller
value. That is, the amount that the two input values need to be
different in order to get a different result from the two tests.
``tol`` is the relative tolerance value.
Assume that ``a`` is the larger value and that both ``a`` and ``b``
are positive, to make the analysis a bit easier. ``delta`` is
therefore::
    delta = tol * (a-b)

or::

    delta / tol = (a-b)

The largest absolute difference that would pass the test: ``(a-b)``,
equals the tolerance times the larger value::

    (a-b) = tol * a

Substituting into the expression for delta::

    delta / tol = tol * a

so::

    delta = tol**2 * a
For example, for ``a = 10``, ``b = 9``, ``tol = 0.1`` (10%):
maximum tolerance ``tol * a == 0.1 * 10 == 1.0``
minimum tolerance ``tol * b == 0.1 * 9.0 == 0.9``
delta = ``1.0 - 0.9 = 0.1``, which equals ``tol**2 * a = 0.1**2 * 10 = 0.1``
The absolute difference between the maximum and minimum tolerance
tests in this case could be substantial. However, the primary use case
for the proposed function is testing the results of computations. In
that case a relative tolerance of much smaller magnitude is likely to
be selected.
For example, a relative tolerance of ``1e-8`` is about half the
precision available in a Python float. In that case, the difference
between the two tests is ``(1e-8)**2 * a``, or ``1e-16 * a``, which is
close to the limit of precision of a Python float. If the relative
tolerance is set to the proposed default of 1e-9 (or smaller), the
difference between the two tests will be lost to the limits of
precision of floating point. That is, each of the four methods will
yield exactly the same results for all values of a and b.
In addition, in common use, tolerances are defined to 1 significant
figure -- that is, 1e-8 is specifying about 8 decimal digits of
accuracy. So the difference between the various possible tests is well
below the precision to which the tolerance is specified.
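
This can be checked directly with the sketches above (arbitrarily
chosen values)::

    a, b = 10.0, 9.0
    close_weak(a, b, 0.1), close_strong(a, b, 0.1)   # (True, False)

    a, b = 1.0, 1.0 + 2e-9
    # With tol=1e-9 the four formulations agree for these values:
    {f(a, b, 1e-9) for f in
     (close_one_sided, close_weak, close_strong, close_mean)}  # {False}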
Symmetry
@ -161,46 +321,113 @@ Symmetry
A relative comparison can be either symmetric or non-symmetric. For a
symmetric algorithm:
``is_close(a,b)`` is always the same as ``is_close(b,a)``
If a relative closeness test uses only one of the values (such as (1)
above), then the result is asymmetric, i.e. ``is_close(a,b)`` is not
necessarily the same as ``is_close(b,a)``.
"Is this computed or measured value within some tolerance of a known
value?"
Which approach is most appropriate depends on what question is being
asked. If the question is: "are these two numbers close to each
other?", there is no obvious ordering, and a symmetric test is most
appropriate.
However, if the question is: "Is the computed value within x% of this
known value?", then it is appropriate to scale the tolerance to the
known value, and an asymmetric test is most appropriate.
From the previous section, it is clear that either approach would
yield the same or similar results in the common use cases. In that
case, the goal of this proposal is to provide a function that is least
likely to produce surprising results.
The symmetric approach provides an appealing consistency -- it
mirrors the symmetry of equality, and is less likely to confuse
people. A symmetric test also relieves the user of the need to think
about the order in which to set the arguments. It was also pointed
out that there may be some cases where the order of evaluation may not
be well defined, for instance in the case of comparing a set of values
all against each other.
For the question: "Is the value of a within 10% of b?", Using b to
scale the percent error clearly defines the result.
There may be cases when a user does need to know that a value is
within a particular range of a known value. In that case, it is easy
enough to simply write the test directly::
    if a-b <= tol*a:
(assuming a > b in this case). There is little need to provide a
function for this particular case.
This proposal uses a symmetric test.
Which symmetric test?
---------------------
There are three symmetric tests considered:
The case that uses the arithmetic mean of the two values requires
that the values either be added together before dividing by 2, which
could result in extra overflow to inf for very large numbers, or that
each value be divided by two before being added together, which could
result in underflow to zero for very small numbers. This effect would
only occur at the very limit of float values, but it was decided there
was no benefit to the method worth reducing the range of
functionality.
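
These concerns can be demonstrated at the extremes of the float range
(a sketch; the specific values are chosen only for illustration)::

    import sys

    big = sys.float_info.max
    (big + big) / 2        # inf: the sum overflows before the division

    tiny = 5e-324          # smallest positive (subnormal) float
    tiny / 2 + tiny / 2    # 0.0: halving first underflows to zero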
This leaves the Boost "weak" test (2), which uses the larger of the
two values to scale the tolerance, and the Boost "strong" test (3),
which uses the smaller of the values to scale the tolerance. For small
tolerances they yield essentially the same result, but this proposal
uses the Boost "strong" test: it is symmetric and provides a slightly
stricter criterion for tolerance.
Defaults
========
Default values are required for the relative and absolute tolerance.
Relative Tolerance Default
--------------------------
The relative tolerance required for two values to be considered
"close" is entirely use-case dependent. Nevertheless, the relative
tolerance needs to be less than 1.0, and greater than 1e-16
(approximate precision of a Python float). The value of 1e-9 was
selected because it is the largest relative tolerance for which the
various possible methods will yield the same result, and it is also
about half of the precision available to a Python float. In the
general case, a good numerical algorithm is not expected to lose more
than about half of available digits of accuracy, and if a much larger
tolerance is acceptable, the user should be selecting an appropriate
value in that case. Thus 1e-9 is expected to "just work" for many
cases.
Absolute tolerance default
--------------------------
The absolute tolerance value will be used primarily for comparing to
zero. The absolute tolerance required to determine if a value is
"close" to zero is entirely use-case dependent. There are also
essentially no bounds to the useful range -- expected values could
conceivably be anywhere within the limits of a Python float. Thus a
default of 0.0 is selected.
If, for a given use case, a user needs to compare to zero, the test
will be guaranteed to fail the first time, and the user can select an
appropriate value.
It was suggested that comparing to zero is, in fact, a common use case
(evidence suggests that the numpy functions are often used with zero).
In this case, it would be desirable to have a "useful" default. Values
around 1e-8 were suggested, being about half of floating point
precision for values on the order of 1.
However, to quote The Zen: "In the face of ambiguity, refuse the
temptation to guess." Guessing that users will most often be concerned
with values close to 1.0 would lead to spurious passing tests when used
with smaller values -- this is potentially more damaging than
requiring the user to thoughtfully select an appropriate value.
Expected Uses
@ -208,10 +435,23 @@ Expected Uses
The primary expected use case is various forms of testing -- "are the
results computed near what I expect as a result?" This sort of test
may or may not be part of a formal unit testing suite. Such testing
could be done one-off at the command line, in an IPython notebook, as
part of doctests, or with simple asserts in an ``if __name__ ==
"__main__"`` block.
The function might also be used to determine if a measured value is
close to an expected value.
The proposed ``unittest.TestCase`` assertion would of course be used
in unit testing.
It would also be an appropriate function to use for the termination
criterion of a simple iterative solution to an implicit function::
    guess = something
    while True:
        new_guess = implicit_function(guess, *args)
        if is_close(new_guess, guess):
            break
        guess = new_guess
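
For example, computing a square root with Newton's method,
terminating with the ``is_close`` sketch from above (the starting
guess is arbitrary)::

    def sqrt_iter(x, guess=1.0):
        # Iterate until successive guesses converge.
        while True:
            new_guess = (guess + x / guess) / 2
            if is_close(new_guess, guess):
                return new_guess
            guess = new_guess

    sqrt_iter(2.0)   # 1.4142135623730951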
Inappropriate uses
@ -238,8 +478,8 @@ Tests that values are approximately (or not approximately) equal by
computing the difference, rounding to the given number of decimal
places (default 7), and comparing to zero.
This method is purely an absolute tolerance test, and does not address
the need for a relative tolerance test.
numpy ``isclose()``
--------------------
@ -262,13 +502,16 @@ all_close, for similar use cases as this proposal:
In this approach, the absolute and relative tolerances are added
together, rather than the ``or`` method used in this proposal. This is
computationally simpler, and if the relative tolerance is larger than
the absolute tolerance, then the addition will have no effect. However,
if the absolute and relative tolerances are of similar magnitude, then
the allowed difference will be about twice as large as expected.
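
The doubling effect can be seen with equal absolute and relative
tolerances (a sketch; the values are chosen only for illustration)::

    rel_tol = abs_tol = 1e-8
    a, b = 1.0, 1.0 + 1.5e-8

    # numpy-style: the bounds add, allowing a difference of ~2e-8
    abs(a - b) <= abs_tol + rel_tol * abs(b)                    # True

    # proposed: the larger of the two bounds, ~1e-8
    abs(a - b) <= max(rel_tol * min(abs(a), abs(b)), abs_tol)   # False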
This makes the function harder to understand, with no computational
advantage in this context.
Even more critically, if the values passed in are small compared to
the absolute tolerance, then the relative tolerance will be
completely swamped, perhaps unexpectedly.
This is why, in this proposal, the absolute tolerance defaults to zero
-- the user will be required to choose a value appropriate for the
@ -279,25 +522,92 @@ Boost floating-point comparison
-------------------------------
The Boost project ( [3]_ ) provides a floating point comparison
function. It is a symmetric approach, with both "weak" (larger of the
two relative errors) and "strong" (smaller of the two relative errors)
options. This proposal uses the Boost "strong" approach. There is no
need to complicate the API by providing the option to select different
methods when the results will be similar in most cases, and the user
is unlikely to know which to select in any case.
Alternate Proposals
-------------------
A Recipe
'''''''''
The primary alternate proposal was to not provide a standard library
function at all, but rather, provide a recipe for users to refer to.
This would have the advantage that the recipe could provide and
explain the various options, and let the user select that which is
most appropriate. However, that would require anyone needing such a
test to, at the very least, copy the function into their code base,
and select the comparison method to use.
In addition, adding the function to the standard library allows it to
be used in the ``unittest.TestCase.assertIsClose()`` method, providing
a substantial convenience to those using unittest.
``zero_tol``
''''''''''''
One possibility was to provide a zero tolerance parameter, rather than
the absolute tolerance parameter. This would be an absolute tolerance
that would only be applied in the case of one of the arguments being
exactly zero. This would have the advantage of retaining the full
relative tolerance behavior for all non-zero values, while allowing
tests against zero to work. However, it would also result in the
potentially surprising result that a small value could be "close" to
zero, but not "close" to an even smaller value. e.g., 1e-10 is "close"
to zero, but not "close" to 1e-11.
No absolute tolerance
'''''''''''''''''''''
Given the issues with comparing to zero, another possibility would
have been to only provide a relative tolerance, and let every
comparison to zero fail. In this case, the user would need to do a
simple absolute test: ``abs(val) < zero_tol`` in the case where the
comparison involved zero.
However, this would not allow the same call to be used for a sequence
of values, such as in a loop or comprehension, or in the proposed
``TestCase.assertIsClose()`` method, making the function far less
useful. It is noted that the default ``abs_tolerance=0.0`` achieves
the same effect if the default is not overridden.
Other tests
''''''''''''
The other tests considered are all discussed in the Relative
Difference section above.
It was decided that a method that clearly defined which value was used
to scale the relative error would be more appropriate for the standard
library.
References
==========
.. [1] Python-ideas list discussion threads
   https://mail.python.org/pipermail/python-ideas/2015-January/030947.html
   https://mail.python.org/pipermail/python-ideas/2015-January/031124.html
   https://mail.python.org/pipermail/python-ideas/2015-January/031313.html
.. [2] Wikipedia page on relative difference
   http://en.wikipedia.org/wiki/Relative_change_and_difference
.. [3] Boost project floating-point comparison algorithms
   http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html
.. Bruce Dawson's discussion of floating point.
   https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
Copyright