[Doc] Add a chart about the relative error of the percentiles aggregation.

Adrien Grand 2014-03-14 12:22:48 +01:00
parent d80dd00424
commit eef71da650
2 changed files with 10 additions and 0 deletions

Binary file not shown.


@ -146,6 +146,16 @@ the percentiles. It is effectively trading accuracy for memory savings. The
exact level of inaccuracy is difficult to generalize, since it depends on your
data distribution and volume of data being aggregated.
The following chart shows the relative error on a uniform distribution depending
on the number of collected values and the requested percentile:
image:images/percentiles_error.png[]
It shows that precision is better for extreme percentiles. The error diminishes
for a large number of values because the law of large numbers makes the distribution of
values more and more uniform, so the t-digest tree can do a better job at summarizing
it. This would not be the case on more skewed distributions.
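As a rough illustration of how such a relative error can be measured, the following Python sketch compares an exact percentile with one estimated from a compressed summary of uniformly distributed values. It is not the t-digest algorithm and does not use any Elasticsearch API: the `summarize` helper is a hypothetical stand-in that keeps only the means of equal-count buckets, and every function name, the bucket count and the seed are assumptions made for this example.

```python
import random


def exact_percentile(sorted_values, p):
    """Linear-interpolation percentile of already-sorted data, p in [0, 100]."""
    rank = (p / 100.0) * (len(sorted_values) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (rank - lo) * (sorted_values[hi] - sorted_values[lo])


def summarize(sorted_values, buckets):
    """Hypothetical stand-in for a digest: means of `buckets` equal-count chunks.

    Assumes len(sorted_values) >= buckets so that no chunk is empty.
    """
    n = len(sorted_values)
    means = []
    for i in range(buckets):
        chunk = sorted_values[i * n // buckets:(i + 1) * n // buckets]
        means.append(sum(chunk) / len(chunk))
    return means


def approx_percentile(means, p):
    """Estimate a percentile from bucket means, placing mean i at quantile (i + 0.5) / k."""
    k = len(means)
    pos = min(max(p / 100.0 * k - 0.5, 0.0), k - 1.0)
    lo = int(pos)
    hi = min(lo + 1, k - 1)
    return means[lo] + (pos - lo) * (means[hi] - means[lo])


def relative_error(approx, exact):
    """Relative error of an estimate with respect to the exact value."""
    return abs(approx - exact) / abs(exact)


def measure(n, p, buckets=100, seed=42):
    """Relative error of the summarized estimate on n uniform values in [0, 1)."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    values = sorted(rng.random() for _ in range(n))
    exact = exact_percentile(values, p)
    approx = approx_percentile(summarize(values, buckets), p)
    return relative_error(approx, exact)


if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        for p in (1, 50, 99):
            print(n, p, measure(n, p))
```

Because this stand-in uses buckets of equal size, it should not be expected to reproduce the better precision at extreme percentiles shown in the chart. That behaviour comes from t-digest keeping smaller clusters near the tails, which this sketch does not attempt to model.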
==== Compression
Approximate algorithms must balance memory utilization with estimation accuracy.