Docs: Update execution hint docs for Significant terms agg

Copied over the relevant pieces from the terms agg.

Closes #8532
2025-02-26 14:54:56 +00:00 · 2014-11-18 13:13:22 +01:00 · 2014-11-18 13:13:22 +01:00 · 1e16375d04
commit 1e16375d04
parent 7f27664ae0
1 changed file with 22 additions and 7 deletions
--- a/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc
+++ b/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc
@@ -445,12 +445,26 @@ described in the <<search-aggregations-bucket-terms-aggregation,terms aggregatio

 ===== Execution hint

-There are two mechanisms by which terms aggregations can be executed: either by using field values directly in order to aggregate
-data per-bucket (`map`), or by using ordinals of the field values instead of the values themselves (`ordinals`). Although the
-latter execution mode can be expected to be slightly faster, it is only available for use when the underlying data source exposes
-those terms ordinals. Moreover, it may actually be slower if most field values are unique. Elasticsearch tries to have sensible
-defaults when it comes to the execution mode that should be used, but in case you know that an execution mode may perform better
-than the other one, you have the ability to provide Elasticsearch with a hint:
+
+There are different mechanisms by which terms aggregations can be executed:
+
+ - by using field values directly in order to aggregate data per-bucket (`map`)
+ - by using ordinals of the field and preemptively allocating one bucket per ordinal value (`global_ordinals`)
+ - by using ordinals of the field and dynamically allocating one bucket per ordinal value (`global_ordinals_hash`)
+ 
+Elasticsearch tries to have sensible defaults, so this generally doesn't need to be configured.
+
+`map` should only be considered when very few documents match a query. Otherwise the ordinals-based execution modes
+are significantly faster. By default, `map` is only used when running an aggregation on scripts, since they don't have
+ordinals.
+
+`global_ordinals` is the second fastest option, but the fact that it preemptively allocates buckets can be memory-intensive,
+especially if you have one or more sub aggregations. It is used by default on top-level terms aggregations.
+
+`global_ordinals_hash`, in contrast to `global_ordinals`, allocates buckets dynamically,
+so memory usage is linear in the number of values of the documents that are part of the aggregation scope. It is used by default
+in inner aggregations.
+

 [source,js]
 --------------------------------------------------
@@ -466,6 +480,7 @@ than the other one, you have the ability to provide Elasticsearch with a hint:
 }
 --------------------------------------------------

-<1> the possible values are `map` and `ordinals`
+<1> the possible values are `map`, `global_ordinals` and `global_ordinals_hash`

 Please note that Elasticsearch will ignore this execution hint if it is not applicable.
+
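
For anyone checking the updated callout against a real request, the following is a minimal sketch of a request body that sets the hint on a significant terms aggregation. It is not part of this commit: the helper name `significant_terms_request`, the aggregation name `tags`, and the field name are hypothetical, and the client-side check only mirrors the three values listed in the `<1>` callout. Elasticsearch itself ignores a hint that is not applicable, as the docs above note.

```python
import json

# Values listed in the updated <1> callout of this commit.
ALLOWED_HINTS = ("map", "global_ordinals", "global_ordinals_hash")


def significant_terms_request(field, execution_hint=None):
    """Build a search body with one significant_terms aggregation.

    Hypothetical helper for illustration only; the aggregation name
    "tags" is an arbitrary choice.
    """
    agg = {"field": field}
    if execution_hint is not None:
        # Client-side sanity check (an assumption of this sketch,
        # not something the server requires).
        if execution_hint not in ALLOWED_HINTS:
            raise ValueError(
                "unknown execution_hint %r, expected one of %s"
                % (execution_hint, ", ".join(ALLOWED_HINTS))
            )
        agg["execution_hint"] = execution_hint
    return {"aggs": {"tags": {"significant_terms": agg}}}


if __name__ == "__main__":
    body = significant_terms_request("tags", execution_hint="global_ordinals_hash")
    print(json.dumps(body, indent=2))
```

Leaving `execution_hint` unset keeps the default behaviour described in the diff, which is usually the right choice.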