diff --git a/docs/reference/search/aggregations/bucket/terms-aggregation.asciidoc b/docs/reference/search/aggregations/bucket/terms-aggregation.asciidoc
index 5d40021eba7..4310847d026 100644
--- a/docs/reference/search/aggregations/bucket/terms-aggregation.asciidoc
+++ b/docs/reference/search/aggregations/bucket/terms-aggregation.asciidoc
@@ -395,15 +395,32 @@ this would typically be too costly in terms of RAM.
 
 ==== Execution hint
 
-added[1.2.0] The `global_ordinals` execution mode
+added[1.2.0] Added the `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality` execution modes
 
-There are three mechanisms by which terms aggregations can be executed: either by using field values directly in order to aggregate
-data per-bucket (`map`), by using ordinals of the field values instead of the values themselves (`ordinals`) or by using global
-ordinals of the field (`global_ordinals`). The latter is faster, especially for fields with many unique
-values. However it can be slower if only a few documents match, when for example a terms aggregator is nested in another
-aggregator, this applies for both `ordinals` and `global_ordinals` execution modes. Elasticsearch tries to have sensible
-defaults when it comes to the execution mode that should be used, but in case you know that one execution mode may
-perform better than the other one, you have the ability to "hint" it to Elasticsearch:
+deprecated[1.3.0] Removed the `ordinals` execution mode
+
+There are different mechanisms by which terms aggregations can be executed:
+
+ - by using field values directly in order to aggregate data per-bucket (`map`)
+ - by using ordinals of the field and preemptively allocating one bucket per ordinal value (`global_ordinals`)
+ - by using ordinals of the field and dynamically allocating one bucket per ordinal value (`global_ordinals_hash`)
+ - by using per-segment ordinals to compute counts and remap these counts to global counts using global ordinals (`global_ordinals_low_cardinality`)
+
+Elasticsearch tries to have sensible defaults so this is something that generally doesn't need to be configured.
+
+`map` should only be considered when very few documents match a query. Otherwise the ordinals-based execution modes
+are significantly faster. By default, `map` is only used when running an aggregation on scripts, since they don't have
+ordinals.
+
+`global_ordinals_low_cardinality` only works for leaf terms aggregations but is usually the fastest execution mode. Memory
+usage is linear with the number of unique values in the field, so it is only enabled by default on low-cardinality fields.
+
+`global_ordinals` is the second fastest option, but the fact that it preemptively allocates buckets can be memory-intensive,
+especially if you have one or more sub aggregations. It is used by default on top-level terms aggregations.
+
+`global_ordinals_hash` on the contrary to `global_ordinals` and `global_ordinals_low_cardinality` allocates buckets dynamically
+so memory usage is linear to the number of values of the documents that are part of the aggregation scope. It is used by default
+in inner aggregations.
 
 [source,js]
 --------------------------------------------------
@@ -419,6 +436,6 @@ perform better than the other one, you have the ability to "hint" it to Elastics
 }
 --------------------------------------------------
 
-<1> the possible values are `map`, `ordinals` and `global_ordinals`
+<1> the possible values are `map`, `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality`
 
-Please note that Elasticsearch will ignore this execution hint if it is not applicable.
+Please note that Elasticsearch will ignore this execution hint if it is not applicable and that there is no backward compatibility guarantee on these hints.