PEP: 485
Title: A Function for testing approximate equality
Version: $Revision$
Last-Modified: $Date$
Author: Christopher Barker <Chris.Barker@noaa.gov>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Jan-2015
Python-Version: 3.5
Post-History:
|
|
Abstract
========

This PEP proposes the addition of a function to the standard library
that determines whether one value is approximately equal or "close" to
another value.
|
|
Rationale
=========

Floating point values have limited precision, which leaves them
unable to represent some values exactly and allows errors to
accumulate with repeated computation.  As a result, it is common
advice to only use an equality comparison in very specific situations.
Often an inequality comparison fits the bill, but there are times
(often in testing) where the programmer wants to determine whether a
computed value is "close" to an expected value, without requiring them
to be exactly equal. This need is common enough, particularly in
testing, and the right way to meet it is not always obvious, so such a
function would be a useful addition to the standard library.
|
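As a small illustration of the problem described above, the following
sketch shows exact equality failing on values that are "obviously"
equal, while a closeness test succeeds. It assumes Python 3.5 or
later, where the function proposed here is available as
``math.isclose``; the variable names are only for illustration.

```python
import math

# 0.1, 0.2 and 0.3 cannot be represented exactly in binary floating point.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)
close_enough = math.isclose(total, 0.3)

# Error accumulates over repeated additions of an inexact value.
acc = 0.0
for _ in range(10):
    acc += 0.1
exactly_one = (acc == 1.0)
close_to_one = math.isclose(acc, 1.0)

print(exactly_equal, close_enough, exactly_one, close_to_one)
```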
|
Existing Implementations
------------------------

The standard library includes the ``unittest.TestCase.assertAlmostEqual``
method, but it:

* Is buried in the ``unittest.TestCase`` class

* Is an assertion, so you can't (easily) use it as a general test at
  the command line, etc.

* Is an absolute difference test. Often the measure of difference
  requires, particularly for floating point numbers, a relative error,
  i.e. "Are these two values within x% of each other?", rather than an
  absolute error, particularly when the magnitude of the values is
  unknown a priori.

The numpy package has the ``allclose()`` and ``isclose()`` functions,
but they are only available with numpy.

The statistics package tests include an implementation, used for its
unit tests.

One can also find discussion and sample implementations on Stack
Overflow and other help sites.

Many other non-Python systems provide such a test, including the Boost C++
library and the APL language [4]_.

These existing implementations indicate that this is a common need and
not trivial to write oneself, making it a candidate for the standard
library.
|
|
Proposed Implementation
=======================

NOTE: this PEP is the result of extended discussions on the
python-ideas list [1]_.

The new function will go into the math module, and have the following
signature::

    isclose(a, b, rel_tol=1e-9, abs_tol=0.0)

``a`` and ``b``: are the two values to be tested for relative closeness

``rel_tol``: is the relative tolerance -- it is the amount of error
allowed, relative to the larger absolute value of a or b. For example,
to set a tolerance of 5%, pass rel_tol=0.05. The default tolerance is 1e-9,
which assures that the two values are the same within about 9 decimal
digits. ``rel_tol`` must not be negative.

``abs_tol``: is a minimum absolute tolerance level -- useful for
comparisons near zero.

Modulo error checking, etc., the function will return the result of::

    abs(a-b) <= max( rel_tol * max(abs(a), abs(b)), abs_tol )

The name, ``isclose``, is selected for consistency with the existing
``isnan`` and ``isinf``.
|
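The expression above can be wrapped into a function roughly as
follows. This is only a sketch for float arguments, not the reference
implementation: the rejection of negative tolerances with
``ValueError`` and the explicit infinity check are assumptions made
here to cover the "error checking, etc." that the text leaves out.

```python
import math


def isclose(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Illustrative pure-Python sketch of the proposed test (floats only)."""
    # Assumption: negative tolerances are treated as an error.
    if rel_tol < 0.0 or abs_tol < 0.0:
        raise ValueError("tolerances must be non-negative")
    # Exactly equal values (including two equal infinities) are close.
    if a == b:
        return True
    # Otherwise an infinity is not close to anything; without this check
    # the scaled tolerance would itself be infinite.
    if math.isinf(a) or math.isinf(b):
        return False
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


print(isclose(1.0, 1.0 + 1e-10))          # within the default 1e-9
print(isclose(100.0, 104.0, rel_tol=0.05))  # within 5% of the larger value
```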
|
Handling of non-finite numbers
------------------------------

The IEEE 754 special values of NaN, inf, and -inf will be handled
according to IEEE rules.  Specifically, NaN is not considered close to
any other value, including NaN.  inf and -inf are only considered close
to themselves.
|
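The rules above can be checked directly, assuming Python 3.5 or later
where the function is available as ``math.isclose``. The variable
names are illustrative only.

```python
import math

nan = float("nan")
inf = float("inf")

nan_vs_nan = math.isclose(nan, nan)        # NaN is close to nothing
nan_vs_one = math.isclose(nan, 1.0)
inf_vs_inf = math.isclose(inf, inf)        # inf is close only to itself
inf_vs_neg_inf = math.isclose(inf, -inf)
# Even a very generous relative tolerance does not make inf "close"
# to a large finite value.
inf_vs_big = math.isclose(inf, 1.7e308, rel_tol=0.5)

print(nan_vs_nan, nan_vs_one, inf_vs_inf, inf_vs_neg_inf, inf_vs_big)
```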
|
Non-float types
---------------

The primary use-case is expected to be floating point numbers.
However, users may want to compare other numeric types similarly. In
theory, it should work for any type that supports ``abs()``,
multiplication, comparisons, and subtraction.  The code will be written
and tested to accommodate these types:

* ``Decimal``

* ``int``

* ``Fraction``

* ``complex``: for complex, the absolute value of the complex values
  will be used for scaling and comparison. If a complex tolerance is
  passed in, the absolute value will be used as the tolerance.
|
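To illustrate the "in theory" claim above, the comparison expression
from the Proposed Implementation section can be applied unchanged to
the listed types. ``close_generic`` is a hypothetical helper written
for this sketch; it omits all error checking and is not the proposed
function itself.

```python
from decimal import Decimal
from fractions import Fraction


def close_generic(a, b, rel_tol, abs_tol=0):
    # Uses only abs(), subtraction, multiplication and comparison.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


dec_close = close_generic(Decimal("1.000000001"), Decimal("1"), Decimal("1e-8"))
dec_far = close_generic(Decimal("1.1"), Decimal("1"), Decimal("1e-8"))
frac_close = close_generic(Fraction(1, 3), Fraction(333, 1000), Fraction(1, 100))
int_close = close_generic(1000, 1001, Fraction(1, 100))
# For complex values, abs() gives the magnitude used for scaling.
cplx_close = close_generic(1 + 1j, 1 + 1.0000000001j, 1e-9)

print(dec_close, dec_far, frac_close, int_close, cplx_close)
```

Tolerances are passed in a type compatible with the values (for
example a ``Decimal`` tolerance for ``Decimal`` values), since mixing
``Decimal`` and ``float`` in arithmetic raises ``TypeError``.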
|
Behavior near zero
------------------

Relative comparison is problematic if either value is zero. By
definition, no value is small relative to zero. And computationally,
if either value is zero, the difference is the absolute value of the
other value, and the computed absolute tolerance will be ``rel_tol``
times that value. When ``rel_tol`` is less than one, the difference will
never be less than the tolerance.

However, while this behavior is mathematically correct, there are many
use cases where a user will need to know if a computed value is "close"
to zero. This calls for an absolute tolerance test. If the user needs
to call this function inside a loop or comprehension, where some, but
not all, of the expected values may be zero, it is important that both
a relative tolerance and absolute tolerance can be tested for with a
single function with a single set of parameters.

There is a similar issue if the two values to be compared straddle zero:
if a is approximately equal to -b, then a and b will never be computed
as "close".

To handle this case, an optional parameter, ``abs_tol``, can be
used to set a minimum tolerance used in the case of very small or zero
computed relative tolerance. That is, the values will always be
considered close if the difference between them is no greater than
``abs_tol``.

The default absolute tolerance value is set to zero because there is
no value that is appropriate for the general case. It is impossible to
know an appropriate value without knowing the likely values expected
for a given use case. If all the values tested are on order of one,
then a value of about 1e-9 might be appropriate, but that would be far
too large if expected values are on order of 1e-9 or smaller.

Any non-zero default might result in users' tests passing totally
inappropriately. If, on the other hand, a test against zero fails the
first time with defaults, a user will be prompted to select an
appropriate value for the problem at hand in order to get the test to
pass.

NOTE: the author of this PEP has resolved to go back over many of
his tests that use the numpy ``allclose()`` function, which provides
a default absolute tolerance, and make sure that the default value is
appropriate.

If the user sets the ``rel_tol`` parameter to 0.0, then only the
absolute tolerance will affect the result. While not the goal of the
function, it does allow it to be used as a purely absolute tolerance
check as well.
|
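The behavior described in this section can be seen in a short
example, assuming Python 3.5 or later where the function is available
as ``math.isclose``. The sample values and the choice of
``abs_tol=1e-9`` are assumptions for data of order one, not
recommendations.

```python
import math

# With the default abs_tol=0.0, no non-zero value is close to zero,
# however tiny it is.
tiny_default = math.isclose(1e-300, 0.0)
# Values that straddle zero are likewise never relatively close.
straddle_default = math.isclose(1e-12, -1e-12)

# One call with both tolerances handles zero and non-zero expected
# values in the same comprehension.
expected = [0.0, 1.0, 1000.0]
computed = [1e-12, 1.0 + 1e-12, 1000.0 + 1e-9]
results = [math.isclose(c, e, rel_tol=1e-9, abs_tol=1e-9)
           for c, e in zip(computed, expected)]

# Setting rel_tol to 0.0 leaves a purely absolute test.
absolute_only = math.isclose(1000.0, 1000.5, rel_tol=0.0, abs_tol=1.0)

print(tiny_default, straddle_default, results, absolute_only)
```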
|
Implementation
--------------

A sample implementation is available (as of Jan 22, 2015) on GitHub:

https://github.com/PythonCHB/close_pep/blob/master/is_close.py

This implementation has a flag that lets the user select which
relative tolerance test to apply -- this PEP does not suggest that
the flag be retained, but rather that the weak test be selected.

There are also drafts of this PEP and test code, etc. there:

https://github.com/PythonCHB/close_pep
|
|
Relative Difference
===================

There are essentially two ways to think about how close two numbers
are to each other:

Absolute difference: simply ``abs(a-b)``

Relative difference: ``abs(a-b)/scale_factor`` [2]_.

The absolute difference is trivial enough that this proposal focuses
on the relative difference.

Usually, the scale factor is some function of the values under
consideration, for instance:

1) The absolute value of one of the input values

2) The maximum absolute value of the two

3) The minimum absolute value of the two.

4) The absolute value of the arithmetic mean of the two

This leads to the following possibilities for determining if two
values, a and b, are close to each other.

1) ``abs(a-b) <= tol*abs(a)``

2) ``abs(a-b) <= tol * max( abs(a), abs(b) )``

3) ``abs(a-b) <= tol * min( abs(a), abs(b) )``

4) ``abs(a-b) <= tol * abs(a + b)/2``

NOTE: (2) and (3) can also be written as:

2) ``(abs(a-b) <= abs(tol*a)) or (abs(a-b) <= abs(tol*b))``

3) ``(abs(a-b) <= abs(tol*a)) and (abs(a-b) <= abs(tol*b))``

(Boost refers to these as the "weak" and "strong" formulations [3]_)
These can be a tiny bit more computationally efficient, and thus are
used in the example code.

Each of these formulations can lead to slightly different results.
However, if the tolerance value is small, the differences are quite
small, often less than the available floating point precision.
|
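The four formulations listed above can be written out side by side to
see where they agree and where they differ. The function names are
chosen for this sketch only ("weak" and "strong" follow the Boost
terminology mentioned above), and no error checking is included.

```python
def asymmetric(a, b, tol):      # (1) scaled by the first argument only
    return abs(a - b) <= tol * abs(a)


def weak(a, b, tol):            # (2) scaled by the larger magnitude
    return abs(a - b) <= tol * max(abs(a), abs(b))


def strong(a, b, tol):          # (3) scaled by the smaller magnitude
    return abs(a - b) <= tol * min(abs(a), abs(b))


def mean_scaled(a, b, tol):     # (4) scaled by the magnitude of the mean
    return abs(a - b) <= tol * abs(a + b) / 2


tests = (asymmetric, weak, strong, mean_scaled)

# With a large tolerance (10%), the formulations disagree at the boundary.
large_tol = [t(10.0, 9.0, 0.1) for t in tests]
# With a small tolerance, they all give the same answer.
small_tol = [t(1.0, 1.0 + 1e-10, 1e-9) for t in tests]

print(large_tol)
print(small_tol)
```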
|
How much difference does it make?
---------------------------------

When selecting a method to determine closeness, one might want to know
how much of a difference it could make to use one test or the other
-- i.e. how many values are there (or what range of values) that will
pass one test, but not the other.

The largest difference is between options (2) and (3) where the
allowable absolute difference is scaled by either the larger or
smaller of the values.

Define ``delta`` to be the difference between the allowable absolute
tolerance defined by the larger value and that defined by the smaller
value. That is, the amount that the two input values need to be
different in order to get a different result from the two tests.
``tol`` is the relative tolerance value.

Assume that ``a`` is the larger value and that both ``a`` and ``b``
are positive, to make the analysis a bit easier. ``delta`` is
therefore::

    delta = tol * (a-b)

or::

    delta / tol = (a-b)

The largest absolute difference that would pass the test: ``(a-b)``,
equals the tolerance times the larger value::

    (a-b) = tol * a

Substituting into the expression for delta::

    delta / tol = tol * a

so::

    delta = tol**2 * a

For example, for ``a = 10``, ``b = 9``, ``tol = 0.1`` (10%):

maximum tolerance ``tol * a == 0.1 * 10 == 1.0``

minimum tolerance ``tol * b == 0.1 * 9.0 == 0.9``

delta = ``(1.0 - 0.9) = 0.1``  or  ``tol**2 * a = 0.1**2 * 10 = 0.1``

The absolute difference between the maximum and minimum tolerance
tests in this case could be substantial. However, the primary use
case for the proposed function is testing the results of computations.
In that case a relative tolerance is likely to be selected of much
smaller magnitude.

For example, a relative tolerance of ``1e-8`` is about half the
precision available in a Python float. In that case, the difference
between the two tests is ``1e-8**2 * a`` or ``1e-16 * a``, which is
close to the limit of precision of a Python float. If the relative
tolerance is set to the proposed default of 1e-9 (or smaller), the
difference between the two tests will be lost to the limits of
precision of floating point. That is, each of the four methods will
yield exactly the same results for all values of a and b.

In addition, in common use, tolerances are defined to 1 significant
figure -- that is, 1e-9 is specifying about 9 decimal digits of
accuracy. So the difference between the various possible tests is well
below the precision to which the tolerance is specified.
|
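The arithmetic in this section can be reproduced numerically. The
sketch below recomputes ``delta`` for the worked example and then
compares the ``delta`` for the default tolerance against the float
resolution near ``a``; using ``sys.float_info.epsilon * a`` as a
stand-in for that resolution is an approximation chosen here.

```python
import sys

a, b, tol = 10.0, 9.0, 0.1
max_tolerance = tol * a          # tolerance scaled by the larger value
min_tolerance = tol * b          # tolerance scaled by the smaller value
delta = max_tolerance - min_tolerance
delta_formula = tol ** 2 * a     # the closed form derived above

# With the proposed default of 1e-9, delta is about 1e-18 * a, which is
# below the approximate spacing of floats near a.
small_delta = (1e-9) ** 2 * a
float_resolution = sys.float_info.epsilon * a

print(delta, delta_formula)
print(small_delta, float_resolution, small_delta < float_resolution)
```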
|
Symmetry
--------

A relative comparison can be either symmetric or non-symmetric. For a
symmetric algorithm:

``isclose(a,b)`` is always the same as ``isclose(b,a)``

If a relative closeness test uses only one of the values (such as (1)
above), then the result is asymmetric, i.e. ``isclose(a,b)`` is not
necessarily the same as ``isclose(b,a)``.

Which approach is most appropriate depends on what question is being
asked. If the question is: "are these two numbers close to each
other?", there is no obvious ordering, and a symmetric test is most
appropriate.

However, if the question is: "Is the computed value within x% of this
known value?", then it is appropriate to scale the tolerance to the
known value, and an asymmetric test is most appropriate.

From the previous section, it is clear that either approach would
yield the same or similar results in the common use cases. In that
case, the goal of this proposal is to provide a function that is least
likely to produce surprising results.

The symmetric approach provides an appealing consistency -- it
mirrors the symmetry of equality, and is less likely to confuse
people. A symmetric test also relieves the user of the need to think
about the order in which to set the arguments.  It was also pointed
out that there may be some cases where the order of evaluation may not
be well defined, for instance in the case of comparing a set of values
all against each other.

There may be cases when a user does need to know that a value is
within a particular range of a known value. In that case, it is easy
enough to simply write the test directly::

    if a-b <= tol*a:

(assuming a > b in this case). There is little need to provide a
function for this particular case.

This proposal uses a symmetric test.
|
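The difference between the two kinds of test shows up when the
arguments are swapped. In this sketch ``within_of_known`` is a
hypothetical asymmetric helper that scales by the known value, and the
symmetric results come from ``math.isclose`` (assuming Python 3.5 or
later).

```python
import math


def within_of_known(computed, known, tol):
    # Asymmetric: the tolerance is scaled by the known value only.
    return abs(computed - known) <= tol * abs(known)


# |9 - 10| = 1.0 against a limit of 0.1 * 10 = 1.0
forward = within_of_known(9.0, 10.0, 0.1)
# |10 - 9| = 1.0 against a limit of 0.1 * 9 = 0.9
backward = within_of_known(10.0, 9.0, 0.1)

# The symmetric test gives the same answer in either order.
sym_forward = math.isclose(9.0, 10.0, rel_tol=0.1)
sym_backward = math.isclose(10.0, 9.0, rel_tol=0.1)

print(forward, backward, sym_forward, sym_backward)
```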
|
Which symmetric test?
---------------------

Three symmetric tests were considered: (2), (3), and (4) above.

The case that uses the arithmetic mean of the two values requires that
the values either be added together before dividing by 2, which could
result in extra overflow to inf for very large numbers, or that
each value be divided by two before being added together, which
could result in underflow to zero for very small numbers. This effect
would only occur at the very limit of float values, but it was decided
there was no benefit to the method worth reducing the range of
functionality or adding the complexity of checking values to determine
the order of computation.

This leaves the Boost "weak" test (2), which uses the larger value to
scale the tolerance, or the Boost "strong" test (3), which uses the
smaller of the values to scale the tolerance. For small tolerance,
they yield the same result, but this proposal uses the Boost "weak"
test case: it is symmetric and provides a more useful result for very
large tolerances.
|
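The overflow and underflow concerns with the arithmetic mean can be
demonstrated at the extremes of the float range. The specific values
below are chosen for this sketch and assume IEEE 754 double precision
floats, as used by CPython on common platforms.

```python
import math

big = 1.5e308
sum_first = (big + big) / 2          # the sum overflows to inf before halving
halve_first = big / 2 + big / 2      # halving first stays finite

tiny = 5e-324                        # smallest positive subnormal double
tiny_sum_first = (tiny + tiny) / 2   # summing first preserves the value
tiny_halve_first = tiny / 2 + tiny / 2   # each half underflows to 0.0

print(sum_first, halve_first)
print(tiny_sum_first, tiny_halve_first)
```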
|
Large tolerances
----------------

The most common use case is expected to be small tolerances -- on order of the
default 1e-9. However, there may be use cases where a user wants to know if two
fairly disparate values are within a particular range of each other: "is a
within 200% (rel_tol = 2.0) of b?" In this case, the strong test would never
indicate that two values are within that range of each other if one of them is
zero. The weak case, however, would use the larger (non-zero) value for the
test, and thus return true if one value is zero. For example: is 0 within 200%
of 10? 200% of ten is 20, so the range within 200% of ten is -10 to +30. Zero
falls within that range, so it will return True.
|
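The "is 0 within 200% of 10?" example can be checked with minimal
versions of the two tests. ``weak`` and ``strong`` are sketch
functions named after the Boost terms, with no error checking; the
last line uses ``math.isclose`` and assumes Python 3.5 or later.

```python
import math


def weak(a, b, tol):
    return abs(a - b) <= tol * max(abs(a), abs(b))


def strong(a, b, tol):
    return abs(a - b) <= tol * min(abs(a), abs(b))


# Weak: 10 <= 2.0 * 10, so zero counts as within 200% of ten.
weak_result = weak(0.0, 10.0, 2.0)
# Strong: 10 <= 2.0 * 0 can never hold when one value is zero.
strong_result = strong(0.0, 10.0, 2.0)
library_result = math.isclose(0.0, 10.0, rel_tol=2.0)

print(weak_result, strong_result, library_result)
```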
|
Defaults
========

Default values are required for the relative and absolute tolerance.

Relative Tolerance Default
--------------------------

The relative tolerance required for two values to be considered
"close" is entirely use-case dependent. Nevertheless, the relative
tolerance needs to be greater than 1e-16 (approximate precision of a
Python float). The value of 1e-9 was selected because it is the
largest relative tolerance for which the various possible methods will
yield the same result, and it is also about half of the precision
available to a Python float. In the general case, a good numerical
algorithm is not expected to lose more than about half of available
digits of accuracy, and if a much larger tolerance is acceptable, the
user should be considering the proper value in that case. Thus 1e-9 is
expected to "just work" for many cases.

Absolute tolerance default
--------------------------

The absolute tolerance value will be used primarily for comparing to
zero. The absolute tolerance required to determine if a value is
"close" to zero is entirely use-case dependent. There are also
essentially no bounds to the useful range -- expected values would
conceivably be anywhere within the limits of a Python float. Thus a
default of 0.0 is selected.

If, for a given use case, a user needs to compare to zero, the test
will be guaranteed to fail the first time, and the user can select an
appropriate value.

It was suggested that comparing to zero is, in fact, a common use case
(evidence suggests that the numpy functions are often used with zero).
In this case, it would be desirable to have a "useful" default. Values
around 1e-8 were suggested, being about half of floating point
precision for values of around 1.

However, to quote The Zen: "In the face of ambiguity, refuse the
temptation to guess." Guessing that users will most often be concerned
with values close to 1.0 would lead to spurious passing tests when used
with smaller values -- this is potentially more damaging than
requiring the user to thoughtfully select an appropriate value.
|
|
Expected Uses
=============

The primary expected use case is various forms of testing -- "are the
results computed near what I expect as a result?" This sort of test
may or may not be part of a formal unit testing suite. Such testing
could be used one-off at the command line, in an IPython notebook,
part of doctests, or simple asserts in an ``if __name__ == "__main__"``
block.

It would also be an appropriate function to use for the termination
criteria for a simple iterative solution to an implicit function::

    guess = something
    while True:
        new_guess = implicit_function(guess, *args)
        if isclose(new_guess, guess):
            break
        guess = new_guess
|
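A runnable variant of the loop above might look like the following
sketch, which solves ``x = cos(x)`` by fixed-point iteration. The
``fixed_point`` helper, its ``max_iter`` safety limit, and the choice
of ``math.cos`` as the implicit function are all assumptions made for
this example; it relies on ``math.isclose`` from Python 3.5 or later.

```python
import math


def fixed_point(func, guess, max_iter=1000):
    """Iterate guess = func(guess) until successive values are close."""
    for _ in range(max_iter):
        new_guess = func(guess)
        if math.isclose(new_guess, guess):
            return new_guess
        guess = new_guess
    # Hypothetical safeguard: give up rather than loop forever.
    raise RuntimeError("iteration did not converge")


root = fixed_point(math.cos, 1.0)
print(root, math.cos(root))
```

The iteration limit is there because a closeness test alone cannot
terminate a loop whose iterates never settle.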
|
Inappropriate uses
------------------

One use case for floating point comparison is testing the accuracy of
a numerical algorithm. However, in this case, the numerical analyst
ideally would be doing careful error propagation analysis, and should
understand exactly what to test for. It is also likely that ULP (Unit
in the Last Place) comparison may be called for. While this function
may prove useful in such situations, it is not intended to be used in
that way.
|
|
Other Approaches
================

``unittest.TestCase.assertAlmostEqual``
---------------------------------------

(https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertAlmostEqual)

Tests that values are approximately (or not approximately) equal by
computing the difference, rounding to the given number of decimal
places (default 7), and comparing to zero.

This method is purely an absolute tolerance test, and does not address
the need for a relative tolerance test.
|
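The consequence of a purely absolute test can be seen by applying the
rounding rule described above to values of very different magnitude.
``almost_equal`` is a stand-in written for this sketch that follows
that description; it is not the ``unittest`` method itself. The
comparison calls use ``math.isclose`` from Python 3.5 or later.

```python
import math


def almost_equal(first, second, places=7):
    # Round the difference to `places` decimal places, compare to zero.
    return round(abs(first - second), places) == 0


# Two small values that differ by a factor of two still "match".
small_abs = almost_equal(1e-10, 2e-10)
small_rel = math.isclose(1e-10, 2e-10)

# Two large values that agree to about 13 digits do not.
large_abs = almost_equal(1e10, 1e10 + 1e-3)
large_rel = math.isclose(1e10, 1e10 + 1e-3)

print(small_abs, small_rel, large_abs, large_rel)
```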
|
numpy ``isclose()``
-------------------

http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.isclose.html

The numpy package provides the vectorized functions isclose() and
allclose(), for similar use cases as this proposal:

``isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)``

    Returns a boolean array where two arrays are element-wise equal
    within a tolerance.

    The tolerance values are positive, typically very small numbers.
    The relative difference (rtol * abs(b)) and the absolute
    difference atol are added together to compare against the
    absolute difference between a and b

In this approach, the absolute and relative tolerance are added
together, rather than the ``or`` method used in this proposal. This is
computationally simpler, and if the relative tolerance term is much
larger than the absolute tolerance, then the addition will have little
effect. However, if the absolute and relative tolerances are of similar
magnitude, then the allowed difference will be about twice as large as
expected.

This makes the function harder to understand, with no computational
advantage in this context.

Even more critically, if the values passed in are small compared to
the absolute tolerance, then the relative tolerance will be
completely swamped, perhaps unexpectedly.

This is why, in this proposal, the absolute tolerance defaults to zero
-- the user will be required to choose a value appropriate for the
values at hand.
|
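The two ways of combining the tolerances can be compared with scalar
stand-ins, so that the sketch does not depend on numpy.
``additive_close`` follows the formula in the numpy description quoted
above and ``max_close`` follows the expression proposed in this PEP;
both names, and the reuse of numpy's default values in ``max_close``,
are choices made for this example.

```python
def additive_close(a, b, rtol=1e-05, atol=1e-08):
    # Tolerances are added together, scaled by b only.
    return abs(a - b) <= atol + rtol * abs(b)


def max_close(a, b, rel_tol=1e-05, abs_tol=1e-08):
    # The larger of the two tolerances is used.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


# Values much smaller than atol: the relative tolerance is swamped, and
# values differing by a factor of two are reported as close.
swamped = additive_close(1e-10, 2e-10)

# With b = 1e-3, rtol * abs(b) equals atol (1e-8), so the additive test
# allows a difference of about 2e-8, twice either tolerance alone.
doubled_additive = additive_close(1e-3 + 1.9e-8, 1e-3)
doubled_max = max_close(1e-3 + 1.9e-8, 1e-3)

print(swamped, doubled_additive, doubled_max)
```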
|
Boost floating-point comparison
-------------------------------

The Boost project ( [3]_ ) provides a floating point comparison
function. It is a symmetric approach, with both "weak" (larger of the
two relative errors) and "strong" (smaller of the two relative errors)
options. This proposal uses the Boost "weak" approach. There is no
need to complicate the API by providing the option to select different
methods when the results will be similar in most cases, and the user
is unlikely to know which to select in any case.
|
|
Alternate Proposals
-------------------

A Recipe
'''''''''

The primary alternate proposal was to not provide a standard library
function at all, but rather, provide a recipe for users to refer to.
This would have the advantage that the recipe could provide and
explain the various options, and let the user select that which is
most appropriate. However, that would require anyone needing such a
test to, at the very least, copy the function into their code base,
and select the comparison method to use.
|
|
``zero_tol``
''''''''''''

One possibility was to provide a zero tolerance parameter, rather than
the absolute tolerance parameter. This would be an absolute tolerance
that would only be applied in the case of one of the arguments being
exactly zero. This would have the advantage of retaining the full
relative tolerance behavior for all non-zero values, while allowing
tests against zero to work. However, it would also result in the
potentially surprising result that a small value could be "close" to
zero, but not "close" to an even smaller value. For example, 1e-10 is
"close" to zero, but not "close" to 1e-11.
|
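The surprising result described above is easy to reproduce with a
sketch of the rejected alternative. ``isclose_zero_tol`` is a
hypothetical function written only for this illustration; the
``zero_tol=1e-9`` value is likewise an assumption.

```python
def isclose_zero_tol(a, b, rel_tol=1e-9, zero_tol=0.0):
    # Hypothetical variant: zero_tol applies only when one argument is
    # exactly zero; otherwise only the relative test is used.
    if a == 0.0 or b == 0.0:
        return abs(a - b) <= zero_tol
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))


close_to_zero = isclose_zero_tol(1e-10, 0.0, zero_tol=1e-9)
close_to_smaller = isclose_zero_tol(1e-10, 1e-11, zero_tol=1e-9)

print(close_to_zero, close_to_smaller)
```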
|
No absolute tolerance
'''''''''''''''''''''

Given the issues with comparing to zero, another possibility would
have been to only provide a relative tolerance, and let comparison to
zero fail. In this case, the user would need to do a simple absolute
test: ``abs(val) < zero_tol`` in the case where the comparison involved
zero.

However, this would not allow the same call to be used for a sequence
of values, such as in a loop or comprehension, making the function far
less useful. It is noted that the default abs_tol=0.0 achieves the
same effect if the default is not overridden.

Other tests
''''''''''''

The other tests considered are all discussed in the Relative Difference
section above.
|
|
References
==========

.. [1] Python-ideas list discussion threads

   https://mail.python.org/pipermail/python-ideas/2015-January/030947.html

   https://mail.python.org/pipermail/python-ideas/2015-January/031124.html

   https://mail.python.org/pipermail/python-ideas/2015-January/031313.html

.. [2] Wikipedia page on relative difference

   http://en.wikipedia.org/wiki/Relative_change_and_difference

.. [3] Boost project floating-point comparison algorithms

   http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html

.. [4] 1976. R. H. Lathwell. APL comparison tolerance. Proceedings of
   the eighth international conference on APL Pages 255 - 258

   http://dl.acm.org/citation.cfm?doid=800114.803685

.. Bruce Dawson's discussion of floating point.

   https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
|
|
Copyright
=========

This document has been placed in the public domain.
|
|
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8