From d9817461420de4be034895e208c5292c7aaf7c19 Mon Sep 17 00:00:00 2001
From: Zachary Tong <polyfractal@elastic.co>
Date: Mon, 22 Oct 2018 13:15:45 -0400
Subject: [PATCH] [Docs] clarification about cardinality accuracy (#34616)

Adds a bit more clarification about how accuracy is dependent
on the dataset in question.

Closes #18231
---
 .../metrics/cardinality-aggregation.asciidoc     | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
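
Note (placed in the free-form area before the diff, so it is not part of the applied change): the paragraph added below says that accuracy depends on the leading zeros of hashed values. A minimal sketch of that idea follows. It is plain HyperLogLog without the bias corrections that HyperLogLog++ adds, and it is not the Elasticsearch implementation. The names `hash64` and `estimate_cardinality`, the register count `p`, and the use of SHA-1 as the hash are all assumptions made for illustration.

```python
import hashlib
import math


def hash64(value: str) -> int:
    """Return a 64-bit integer hash (SHA-1 is an illustrative choice only)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def estimate_cardinality(values, p: int = 12) -> float:
    """Estimate the number of distinct values with a basic HyperLogLog sketch."""
    m = 1 << p                      # number of registers
    registers = [0] * m
    for value in values:
        h = hash64(value)
        index = h >> (64 - p)                   # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)        # remaining 64 - p bits
        # Rank = number of leading zeros in `rest`, plus one.
        rank = (64 - p) - rest.bit_length() + 1
        registers[index] = max(registers[index], rank)

    alpha = 0.7213 / (1 + 1.079 / m)            # standard constant for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # For small counts, fall back to linear counting on the empty registers.
    empty = registers.count(0)
    if raw <= 2.5 * m and empty:
        return m * math.log(m / empty)
    return raw
```

Because each register only remembers the longest run of leading zeros it has seen, two datasets with the same true cardinality can produce slightly different estimates if their hashes happen to distribute differently, which is the dataset dependence the new paragraph describes.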

diff --git a/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc b/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc
index 96822f6ea9c..a451c6da0db 100644
--- a/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc
+++ b/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc
@@ -150,10 +150,18 @@ public static void main(String[] args) {
 
 image:images/cardinality_error.png[]
 
-For all 3 thresholds, counts have been accurate up to the configured threshold
-(although not guaranteed, this is likely to be the case). Please also note that
-even with a threshold as low as 100, the error remains very low, even when
-counting millions of items.
+For all 3 thresholds, counts have been accurate up to the configured threshold.
+Although not guaranteed, this is likely to be the case.  Accuracy in practice depends
+on the dataset in question.  In general, most datasets show consistently good
+accuracy. With a threshold as low as 100, for example, the error in the graph
+above stays between 1% and 6%.
+
+The HyperLogLog++ algorithm depends on the leading zeros of hashed
+values, so the exact distribution of hashes in a dataset can affect the
+accuracy of the cardinality estimate.
+
+Please also note that even with a threshold as low as 100, the error remains
+very low, even when counting millions of items.
 
 ==== Pre-computed hashes