From 1641c51dddfa4dcefe1758ea2cc8a8120b15b098 Mon Sep 17 00:00:00 2001
From: Chris Angelico
Date: Thu, 5 Feb 2015 10:17:42 +1100
Subject: [PATCH] Update PEP 485 from Chris Barker's edits

---
 pep-0485.txt | 484 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 397 insertions(+), 87 deletions(-)

diff --git a/pep-0485.txt b/pep-0485.txt
index 595dc0bcc..2dffef1ed 100644
--- a/pep-0485.txt
+++ b/pep-0485.txt
@@ -15,8 +15,10 @@ Abstract
 ========
 
 This PEP proposes the addition of a function to the standard library
-that determines whether one value is approximately equal or "close"
-to another value.
+that determines whether one value is approximately equal or "close" to
+another value. It is also proposed that an assertion be added to the
+``unittest.TestCase`` class to provide easy access for those using
+unittest for testing.
 
 
 Rationale
@@ -37,27 +39,35 @@ the standard library.
 Existing Implementations
 ------------------------
 
-The standard library includes the
-``unittest.TestCase.assertAlmostEqual`` method, but it:
+The standard library includes the ``unittest.TestCase.assertAlmostEqual``
+method, but it:
 
 * Is buried in the unittest.TestCase class
 
-* Is an assertion, so you can't use it as a general test (easily)
+* Is an assertion, so you can't easily use it as a general test at the
+  command line, etc.
 
-* Uses number of decimal digits or an absolute delta, which are
-  particular use cases that don't provide a general relative error.
+* Is an absolute difference test. Often the measure of difference
+  requires, particularly for floating point numbers, a relative error,
+  i.e. "Are these two values within x% of each other?", rather than an
+  absolute error, particularly when the magnitude of the values is
+  unknown a priori.
 
-The numpy package has the ``allclose()`` and ``isclose()`` functions.
+The numpy package has the ``allclose()`` and ``isclose()`` functions,
+but they are only available with numpy.
 
 The statistics package tests include an implementation, used for its
 unit tests.
 
 One can also find discussion and sample implementations on Stack
-Overflow, and other help sites.
+Overflow and other help sites.
 
-These existing implementations indicate that this is a common need,
-and not trivial to write oneself, making it a candidate for the
-standard library.
+Many other non-Python systems provide such a test, including the Boost
+C++ library and the APL language (reference?).
+
+These existing implementations indicate that this is a common need and
+not trivial to write oneself, making it a candidate for the standard
+library.
 
 
 Proposed Implementation
 =======================
@@ -68,21 +78,22 @@ python-ideas list [1]_.
 
 The new function will have the following signature::
 
-   is_close_to(actual, expected, tol=1e-8, abs_tol=0.0)
+   is_close(a, b, rel_tolerance=1e-9, abs_tolerance=0.0)
 
-``actual``: is the value that has been computed, measured, etc.
+``a`` and ``b``: are the two values to be tested for relative closeness.
 
-``expected``: is the "known" value.
+``rel_tolerance``: is the relative tolerance -- it is the amount of
+error allowed, relative to the magnitude of a and b. For example, to
+set a tolerance of 5%, pass rel_tolerance=0.05. The default tolerance
+is 1e-9, which assures that the two values are the same within about
+9 decimal digits.
 
-``tol``: is the relative tolerance -- it is the amount of error
-allowed, relative to the magnitude of the expected value.
-
-``abs_tol``: is an minimum absolute tolerance level -- useful for
+``abs_tolerance``: is a minimum absolute tolerance level -- useful for
 comparisons near zero.
 
 Modulo error checking, etc, the function will return the result of::
 
-   abs(expected-actual) <= max(tol*expected, abs_tol)
+   abs(a-b) <= max( rel_tolerance * min(abs(a), abs(b)), abs_tolerance )
 
 
 Handling of non-finite numbers
@@ -108,7 +119,7 @@ accommodate these types:
 
 * ``int``
 
 * ``Fraction``
- 
+
 * ``complex``: for complex, ``abs(z)`` will be used for scaling and
   comparison.
 
@@ -116,35 +127,85 @@ accommodate these types:
 Behavior near zero
 ------------------
 
-Relative comparison is problematic if either value is zero. In this
-case, the difference is relative to zero, and thus will always be
-smaller than the prescribed tolerance. To handle this case, an
-optional parameter, ``abs_tol`` (default 0.0) can be used to set a
-minimum tolerance to be used in the case of very small relative
-tolerance. That is, the values will be considered close if::
+Relative comparison is problematic if either value is zero. By
+definition, no value is small relative to zero. And computationally,
+if either value is zero, the difference is the absolute value of the
+other value, and the computed absolute tolerance will be rel_tolerance
+times that value. rel_tolerance is always less than one, so the
+difference will never be less than the tolerance.
 
-   abs(a-b) <= abs(tol*expected) or abs(a-b) <= abs_tol
+However, while mathematically correct, there are many use cases where
+a user will need to know if a computed value is "close" to zero. This
+calls for an absolute tolerance test. If the user needs to call this
+function inside a loop or comprehension, where some, but not all, of
+the expected values may be zero, it is important that both a relative
+tolerance and an absolute tolerance can be tested for with a single
+function with a single set of parameters.
 
-If the user sets the rel_tol parameter to 0.0, then only the absolute
-tolerance will effect the result, so this function provides an
-absolute tolerance check as well.
+There is a similar issue if the two values to be compared straddle
+zero: if a is approximately equal to -b, then a and b will never be
+computed as "close".
+
+To handle this case, an optional parameter, ``abs_tolerance``, can be
+used to set a minimum tolerance used in the case of very small or zero
+computed absolute tolerance. That is, the values will always be
+considered close if the difference between them is less than the
+abs_tolerance.
+
+The default absolute tolerance value is set to zero because there is
+no value that is appropriate for the general case. It is impossible to
+know an appropriate value without knowing the likely values expected
+for a given use case. If all the values tested are on the order of
+one, then a value of about 1e-8 might be appropriate, but that would
+be far too large if expected values are on the order of 1e-12 or
+smaller.
+
+Any non-zero default might result in users' tests passing totally
+inappropriately. If, on the other hand, a test against zero fails the
+first time with defaults, a user will be prompted to select an
+appropriate value for the problem at hand in order to get the test to
+pass.
+
+NOTE: the author of this PEP has resolved to go back over many of his
+tests that use the numpy ``allclose()`` function, which provides a
+default abs_tolerance, and make sure that the default value is
+appropriate.
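+
+For example (a minimal sketch of the comparison expression given
+above, using the proposed parameter names -- not the full sample
+implementation linked below), the behavior near zero with and without
+an absolute tolerance looks like::
+
+   def is_close(a, b, rel_tolerance=1e-9, abs_tolerance=0.0):
+       # relative ("strong") test, with an absolute-tolerance floor
+       return abs(a-b) <= max(rel_tolerance * min(abs(a), abs(b)),
+                              abs_tolerance)
+
+   # nothing is relatively close to zero...
+   assert not is_close(1e-9, 0.0)
+   # ...but the absolute tolerance takes over near zero
+   assert is_close(1e-9, 0.0, abs_tolerance=1e-8)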
+
+If the user sets the rel_tolerance parameter to 0.0, then only the
+absolute tolerance will affect the result. While not the goal of the
+function, it does allow it to be used as a purely absolute tolerance
+check as well.
+
+unittest assertion
+-------------------
+
+[need text here]
+
+implementation
+--------------
 
 A sample implementation is available (as of Jan 22, 2015) on gitHub:
 
-https://github.com/PythonCHB/close_pep/blob/master/is_close_to.py
+https://github.com/PythonCHB/close_pep/blob/master
 
+This implementation has a flag that lets the user select which
+relative tolerance test to apply -- this PEP does not suggest that
+that be retained, but rather that the strong test be selected.
 
 
 Relative Difference
 ===================
 
 There are essentially two ways to think about how close two numbers
-are to each-other: absolute difference: simply ``abs(a-b)``, and
-relative difference: ``abs(a-b)/scale_factor`` [2]_. The absolute
-difference is trivial enough that this proposal focuses on the
-relative difference.
+are to each other:
+
+Absolute difference: simply ``abs(a-b)``
+
+Relative difference: ``abs(a-b)/scale_factor`` [2]_.
+
+The absolute difference is trivial enough that this proposal focuses
+on the relative difference.
 
 Usually, the scale factor is some function of the values under
-consideration, for instance:
+consideration, for instance:
 
 1) The absolute value of one of the input values
 
@@ -152,7 +213,106 @@ consideration, for instance:
 
 3) The minimum absolute value of the two.
 
-4) The arithmetic mean of the two
+4) The absolute value of the arithmetic mean of the two
+
+These lead to the following possibilities for determining if two
+values, a and b, are close to each other.
+
+1) ``abs(a-b) <= tol*abs(a)``
+
+2) ``abs(a-b) <= tol * max( abs(a), abs(b) )``
+
+3) ``abs(a-b) <= tol * min( abs(a), abs(b) )``
+
+4) ``abs(a-b) <= tol * abs(a + b)/2``
+
+NOTE: (2) and (3) can also be written as:
+
+2) ``(abs(a-b) <= tol*abs(a)) or (abs(a-b) <= tol*abs(b))``
+
+3) ``(abs(a-b) <= tol*abs(a)) and (abs(a-b) <= tol*abs(b))``
+
+(Boost refers to these as the "weak" and "strong" formulations [3]_.)
+These alternate forms can be a tiny bit more computationally
+efficient, and thus are used in the example code.
+
+Each of these formulations can lead to slightly different results.
+However, if the tolerance value is small, the differences are quite
+small -- in fact, often less than the available floating point
+precision.
+
+How much difference does it make?
+---------------------------------
+
+When selecting a method to determine closeness, one might want to know
+how much of a difference it could make to use one test or the other
+-- i.e. how many values are there (or what range of values) that will
+pass one test, but not the other.
+
+The largest difference is between options (2) and (3), where the
+allowable absolute difference is scaled by either the larger or
+smaller of the values.
+
+Define ``delta`` to be the difference between the allowable absolute
+tolerance defined by the larger value and that defined by the smaller
+value. That is, the amount that the two input values need to be
+different in order to get a different result from the two tests.
+``tol`` is the relative tolerance value.
+
+Assume that ``a`` is the larger value and that both ``a`` and ``b``
+are positive, to make the analysis a bit easier.
+``delta`` is therefore::
+
+   delta = tol * (a-b)
+
+
+or::
+
+   delta / tol = (a-b)
+
+
+The largest absolute difference that would pass the test, ``(a-b)``,
+equals the tolerance times the larger value::
+
+   (a-b) = tol * a
+
+
+Substituting into the expression for delta::
+
+   delta / tol = tol * a
+
+
+so::
+
+   delta = tol**2 * a
+
+
+For example, for ``a = 10``, ``b = 9``, ``tol = 0.1`` (10%):
+
+maximum tolerance ``tol * a == 0.1 * 10 == 1.0``
+
+minimum tolerance ``tol * b == 0.1 * 9.0 == 0.9``
+
+delta = ``1.0 - 0.9 = 0.1`` or, equivalently, ``tol**2 * a = 0.1**2 * 10 = 0.1``
+
+The absolute difference between the maximum and minimum tolerance
+tests in this case could be substantial. However, the primary use
+case for the proposed function is testing the results of computations.
+In that case, a relative tolerance of much smaller magnitude is likely
+to be selected.
+
+For example, a relative tolerance of ``1e-8`` is about half the
+precision available in a python float. In that case, the difference
+between the two tests is ``1e-8**2 * a`` or ``1e-16 * a``, which is
+close to the limit of precision of a python float. If the relative
+tolerance is set to the proposed default of 1e-9 (or smaller), the
+difference between the two tests will be lost to the limits of
+precision of floating point. That is, each of the four methods will
+yield exactly the same results for all values of a and b.
+
+In addition, in common use, tolerances are defined to 1 significant
+figure -- that is, 1e-8 is specifying about 8 decimal digits of
+accuracy. So the difference between the various possible tests is well
+below the precision to which the tolerance is specified.
 
 
 Symmetry
@@ -161,46 +321,113 @@ Symmetry
 ========
 
 A relative comparison can be either symmetric or non-symmetric. For a
 symmetric algorithm:
 
-``is_close_to(a,b)`` is always equal to ``is_close_to(b,a)``
+``is_close_to(a,b)`` is always the same as ``is_close_to(b,a)``
 
-This is an appealing consistency -- it mirrors the symmetry of
-equality, and is less likely to confuse people. However, often the
-question at hand is:
+If a relative closeness test uses only one of the values (such as (1)
+above), then the result is asymmetric, i.e. ``is_close_to(a,b)`` is
+not necessarily the same as ``is_close_to(b,a)``.
 
-"Is this computed or measured value within some tolerance of a known
-value?"
+Which approach is most appropriate depends on what question is being
+asked. If the question is: "Are these two numbers close to each
+other?", there is no obvious ordering, and a symmetric test is most
+appropriate.
 
-In this case, the user wants the relative tolerance to be specifically
-scaled against the known value. It is also easier for the user to
-reason about.
+However, if the question is: "Is the computed value within x% of this
+known value?", then it is appropriate to scale the tolerance to the
+known value, and an asymmetric test is most appropriate.
 
-This proposal uses this asymmetric test to allow this specific
-definition of relative tolerance.
+From the previous section, it is clear that either approach would
+yield the same or similar results in the common use cases. In that
+case, the goal of this proposal is to provide a function that is least
+likely to produce surprising results.
 
-Example:
+The symmetric approach provides an appealing consistency -- it
+mirrors the symmetry of equality, and is less likely to confuse
+people. A symmetric test also relieves the user of the need to think
+about the order in which to set the arguments. It was also pointed
+out that there may be some cases where the order of evaluation may not
+be well defined, for instance in the case of comparing a set of values
+all against each other.
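+
+For illustration (example values only), consider the asymmetric test
+(1) from the previous section with ``a = 9.0``, ``b = 10.0`` and a 10%
+tolerance::
+
+   tol = 0.1
+   a, b = 9.0, 10.0
+
+   abs(a-b) <= tol * abs(a)                 # False: 1.0 > 0.9
+   abs(b-a) <= tol * abs(b)                 # True:  1.0 <= 1.0
+
+   # the symmetric "strong" test gives one answer either way round:
+   abs(a-b) <= tol * min(abs(a), abs(b))    # False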
 
-For the question: "Is the value of a within 10% of b?", Using b to
-scale the percent error clearly defines the result.
+There may be cases when a user does need to know that a value is
+within a particular range of a known value. In that case, it is easy
+enough to simply write the test directly::
 
-However, as this approach is not symmetric, a may be within 10% of b,
-but b is not within 10% of a. Consider the case::
+   if a-b <= tol*a:
 
-   a = 9.0
-   b = 10.0
+(assuming a > b in this case). There is little need to provide a
+function for this particular case.
 
-The difference between a and b is 1.0. 10% of a is 0.9, so b is not
-within 10% of a. But 10% of b is 1.0, so a is within 10% of b.
+This proposal uses a symmetric test.
 
-Casual users might reasonably expect that if a is close to b, then b
-would also be close to a. However, in the common cases, the tolerance
-is quite small and often poorly defined, i.e. 1e-8, defined to only
-one significant figure, so the result will be very similar regardless
-of the order of the values. And if the user does care about the
-precise result, s/he can take care to always pass in the two
-parameters in sorted order.
+Which symmetric test?
+---------------------
 
-This proposed implementation uses asymmetric criteria with the scaling
-value clearly identified.
+There are three symmetric tests considered:
+
+The case that uses the arithmetic mean of the two values requires that
+the values either be added together before dividing by 2, which could
+result in extra overflow to inf for very large numbers, or that each
+value be divided by two before being added together, which could
+result in underflow to zero for very small numbers. This effect would
+only occur at the very limit of float values, but it was decided there
+was no benefit to the method worth reducing the range of
+functionality.
+
+This leaves the Boost "weak" test (2), which uses the larger of the
+two values to scale the tolerance, and the Boost "strong" test (3),
+which uses the smaller of the values to scale the tolerance. For a
+small tolerance, they yield the same result, but this proposal uses
+the Boost "strong" test: it is symmetric and provides a slightly
+stricter criterion for tolerance.
+
+
+Defaults
+========
+
+Default values are required for the relative and absolute tolerance.
+
+Relative Tolerance Default
+--------------------------
+
+The relative tolerance required for two values to be considered
+"close" is entirely use-case dependent. Nevertheless, the relative
+tolerance needs to be less than 1.0, and greater than 1e-16
+(approximate precision of a python float). The value of 1e-9 was
+selected because it is the largest relative tolerance for which the
+various possible methods will yield the same result, and it is also
+about half of the precision available to a python float. In the
+general case, a good numerical algorithm is not expected to lose more
+than about half of the available digits of accuracy, and if a much
+larger tolerance is acceptable, the user should be selecting an
+appropriate value for the problem at hand. Thus 1e-9 is expected to
+"just work" for many cases.
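+
+To illustrate the "half the precision" guideline (a sketch using only
+the standard library; values are approximate)::
+
+   import sys
+
+   sys.float_info.epsilon          # ~2.2e-16: full precision of a float
+   sys.float_info.epsilon ** 0.5   # ~1.5e-8: roughly half the digits
+
+(A float carries roughly 15-16 significant decimal digits; 1e-9
+corresponds to about 9 of them.)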
+
+Absolute tolerance default
+--------------------------
+
+The absolute tolerance value will be used primarily for comparing to
+zero. The absolute tolerance required to determine if a value is
+"close" to zero is entirely use-case dependent. There are also
+essentially no bounds to the useful range -- expected values would
+conceivably be anywhere within the limits of a python float. Thus a
+default of 0.0 is selected.
+
+If, for a given use case, a user needs to compare to zero, the test
+will be guaranteed to fail the first time, and the user can select an
+appropriate value.
+
+It was suggested that comparing to zero is, in fact, a common use case
+(evidence suggests that the numpy functions are often used with zero).
+In this case, it would be desirable to have a "useful" default. Values
+around 1e-8 were suggested, being about half of floating point
+precision for values on the order of 1.
+
+However, to quote The Zen: "In the face of ambiguity, refuse the
+temptation to guess." Guessing that users will most often be concerned
+with values close to 1.0 would lead to spurious passing tests when used
+with smaller values -- this is potentially more damaging than
+requiring the user to thoughtfully select an appropriate value.
 
 
 Expected Uses
@@ -208,10 +435,23 @@ Expected Uses
 =============
 
 The primary expected use case is various forms of testing -- "are the
 results computed near what I expect as a result?" This sort of test
-may or may not be part of a formal unit testing suite.
+may or may not be part of a formal unit testing suite. Such testing
+could be done one-off at the command line, in an IPython notebook, as
+part of doctests, or with simple asserts in an
+``if __name__ == "__main__"`` block.
 
-The function might be used also to determine if a measured value is
-within an expected value.
+The proposed unittest.TestCase assertion would of course be used in
+unit testing.
+
+It would also be an appropriate function to use for the termination
+criterion of a simple iterative solution to an implicit function::
+
+   guess = something
+   while True:
+       new_guess = implicit_function(guess, *args)
+       if is_close(new_guess, guess):
+           break
+       guess = new_guess
 
 
 Inappropriate uses
@@ -238,8 +478,8 @@ Tests that values are approximately (or not approximately) equal by
 computing the difference, rounding to the given number of decimal
 places (default 7), and comparing to zero.
 
-This method was not selected for this proposal, as the use of decimal
-digits is a specific, not generally useful or flexible test.
+This method is purely an absolute tolerance test, and does not address
+the need for a relative tolerance test.
 
 numpy ``is_close()``
 --------------------
@@ -262,13 +502,16 @@ all_close, for similar use cases as this proposal:
 
 In this approach, the absolute and relative tolerance are added
 together, rather than the ``or`` method used in this proposal. This is
 computationally more simple, and if relative tolerance is larger than
-the absolute tolerance, then the addition will have no effect. But if
-the absolute and relative tolerances are of similar magnitude, then
+the absolute tolerance, then the addition will have no effect. However,
+if the absolute and relative tolerances are of similar magnitude, then
 the allowed difference will be about twice as large as expected.
 
-Also, if the value passed in are small compared to the absolute
-tolerance, then the relative tolerance will be completely swamped,
-perhaps unexpectedly.
+This makes the function harder to understand, with no computational
+advantage in this context.
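+
+A short sketch (with made-up values, for illustration only) of how the
+two approaches differ when the absolute and relative tolerances are of
+similar magnitude::
+
+   a, b = 1.0, 1.0000000015        # differ by roughly 1.5e-9
+   rel_tol, abs_tol = 1e-9, 1e-9
+
+   # numpy-style: the two tolerances are added together
+   abs(a-b) <= abs_tol + rel_tol * abs(b)                    # True
+
+   # this proposal: the larger of the two tolerances governs
+   abs(a-b) <= max(rel_tol * min(abs(a), abs(b)), abs_tol)   # False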
+
+Even more critically, if the values passed in are small compared to
+the absolute tolerance, then the relative tolerance will be
+completely swamped, perhaps unexpectedly.
 
 This is why, in this proposal, the absolute tolerance defaults to zero
 -- the user will be required to choose a value appropriate for the
 
@@ -279,25 +522,92 @@ Boost floating-point comparison
 -------------------------------
 
 The Boost project ( [3]_ ) provides a floating point comparison
-function. Is is a symetric approach, with both "weak" (larger of the
+function. It is a symmetric approach, with both "weak" (larger of the
 two relative errors) and "strong" (smaller of the two relative errors)
-options.
+options. This proposal uses the Boost "strong" approach. There is no
+need to complicate the API by providing the option to select different
+methods when the results will be similar in most cases, and the user
+is unlikely to know which to select in any case.
+
+
+Alternate Proposals
+-------------------
+
+
+A Recipe
+'''''''''
+
+The primary alternate proposal was to not provide a standard library
+function at all, but rather to provide a recipe for users to refer to.
+This would have the advantage that the recipe could provide and
+explain the various options, and let the user select that which is
+most appropriate. However, that would require anyone needing such a
+test to, at the very least, copy the function into their code base,
+and select the comparison method to use.
+
+In addition, adding the function to the standard library allows it to
+be used in the ``unittest.TestCase.assertIsClose()`` method, providing
+a substantial convenience to those using unittest.
+
+
+``zero_tol``
+''''''''''''
+
+One possibility was to provide a zero tolerance parameter, rather than
+the absolute tolerance parameter. This would be an absolute tolerance
+that would only be applied in the case of one of the arguments being
+exactly zero. This would have the advantage of retaining the full
+relative tolerance behavior for all non-zero values, while allowing
+tests against zero to work. However, it would also result in the
+potentially surprising result that a small value could be "close" to
+zero, but not "close" to an even smaller value: e.g., 1e-10 is "close"
+to zero, but not "close" to 1e-11.
+
+
+No absolute tolerance
+'''''''''''''''''''''
+
+Given the issues with comparing to zero, another possibility would
+have been to only provide a relative tolerance, and let every
+comparison to zero fail. In this case, the user would need to do a
+simple absolute test: ``abs(val) < zero_tol`` in the case where the
+comparison involved zero.
+
+However, this would not allow the same call to be used for a sequence
+of values, such as in a loop or comprehension, or in the
+``TestCase.assertIsClose()`` method, making the function far less
+useful. It is noted that the default abs_tolerance=0.0 achieves the
+same effect if the default is not overridden.
+
+Other tests
+''''''''''''
+
+The other tests considered are all discussed in the Relative
+Difference section above.
 
-It was decided that a method that clearly defined which value was used
-to scale the relative error would be more appropriate for the standard
-library.
 
 References
 ==========
 
-.. [1] Python-ideas list discussion thread
-   (https://mail.python.org/pipermail/python-ideas/2015-January/030947.html)
+.. [1] Python-ideas list discussion threads
 
-.. [2] Wikipedaia page on relative difference
-   (http://en.wikipedia.org/wiki/Relative_change_and_difference)
+
+   https://mail.python.org/pipermail/python-ideas/2015-January/030947.html
+
+   https://mail.python.org/pipermail/python-ideas/2015-January/031124.html
+
+   https://mail.python.org/pipermail/python-ideas/2015-January/031313.html
+
+.. [2] Wikipedia page on relative difference
+
+   http://en.wikipedia.org/wiki/Relative_change_and_difference
 
 .. [3] Boost project floating-point comparison algorithms
-   (http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html)
+
+   http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html
+
+.. Bruce Dawson's discussion of floating point.
+
+   https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
 
 Copyright