Docs: Update execution hint docs for Significant terms agg

copied over the relevant pieces from the terms agg

Closes #8532
Boaz Leskes 2014-11-18 13:13:22 +01:00
parent 7f27664ae0
commit 1e16375d04
1 changed files with 22 additions and 7 deletions

@ -445,12 +445,26 @@ described in the <<search-aggregations-bucket-terms-aggregation,terms aggregatio
===== Execution hint
-There are two mechanisms by which terms aggregations can be executed: either by using field values directly in order to aggregate
-data per-bucket (`map`), or by using ordinals of the field values instead of the values themselves (`ordinals`). Although the
-latter execution mode can be expected to be slightly faster, it is only available for use when the underlying data source exposes
-those terms ordinals. Moreover, it may actually be slower if most field values are unique. Elasticsearch tries to have sensible
-defaults when it comes to the execution mode that should be used, but in case you know that an execution mode may perform better
-than the other one, you have the ability to provide Elasticsearch with a hint:
+There are different mechanisms by which terms aggregations can be executed:
+- by using field values directly in order to aggregate data per-bucket (`map`)
+- by using ordinals of the field and preemptively allocating one bucket per ordinal value (`global_ordinals`)
+- by using ordinals of the field and dynamically allocating one bucket per ordinal value (`global_ordinals_hash`)
+Elasticsearch tries to have sensible defaults so this is something that generally doesn't need to be configured.
+`map` should only be considered when very few documents match a query. Otherwise the ordinals-based execution modes
+are significantly faster. By default, `map` is only used when running an aggregation on scripts, since they don't have
+ordinals.
+`global_ordinals` is the second fastest option, but the fact that it preemptively allocates buckets can be memory-intensive,
+especially if you have one or more sub aggregations. It is used by default on top-level terms aggregations.
+`global_ordinals_hash`, in contrast to `global_ordinals` and `global_ordinals_low_cardinality`, allocates buckets dynamically,
+so memory usage is linear in the number of values of the documents that are part of the aggregation scope. It is used by default
+in inner aggregations.
[source,js]
--------------------------------------------------
@ -466,6 +480,7 @@ than the other one, you have the ability to provide Elasticsearch with a hint:
}
--------------------------------------------------
-<1> the possible values are `map` and `ordinals`
+<1> the possible values are `map`, `global_ordinals` and `global_ordinals_hash`
Please note that Elasticsearch will ignore this execution hint if it is not applicable.
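
For illustration, a request that passes the hint might look like the following minimal sketch. The `execution_hint` parameter and its possible values come from the text above; the aggregation name `significant_tags` and the field name `tags` are assumptions chosen for the example and are not taken from this commit.

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "significant_tags" : {
            "significant_terms" : {
                "field" : "tags",
                "execution_hint" : "map"
            }
        }
    }
}
--------------------------------------------------

Here `map` could be swapped for `global_ordinals` or `global_ordinals_hash`. Since the defaults are described as sensible, the hint is probably only worth setting when one mode is known to perform better for the data at hand.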