Docs: Update documentation about execution hints for the terms aggregation.
parent f3114fe774
commit abeefbddea

@@ -395,15 +395,32 @@ this would typically be too costly in terms of RAM.

==== Execution hint

added[1.2.0] The `global_ordinals` execution mode
added[1.2.0] Added the `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality` execution modes

There are three mechanisms by which terms aggregations can be executed: either by using field values directly in order to aggregate
data per-bucket (`map`), by using ordinals of the field values instead of the values themselves (`ordinals`) or by using global
ordinals of the field (`global_ordinals`). The latter is faster, especially for fields with many unique
values. However it can be slower if only a few documents match, when for example a terms aggregator is nested in another
aggregator, this applies for both `ordinals` and `global_ordinals` execution modes. Elasticsearch tries to have sensible
defaults when it comes to the execution mode that should be used, but in case you know that one execution mode may
perform better than the other one, you have the ability to "hint" it to Elasticsearch:
deprecated[1.3.0] Removed the `ordinals` execution mode

There are different mechanisms by which terms aggregations can be executed:

- by using field values directly in order to aggregate data per-bucket (`map`)
- by using global ordinals of the field and preemptively allocating one bucket per ordinal value (`global_ordinals`)
- by using global ordinals of the field and dynamically allocating one bucket per ordinal value (`global_ordinals_hash`)
- by using per-segment ordinals to compute counts, then remapping these counts to global counts using global ordinals (`global_ordinals_low_cardinality`)

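The difference between keying buckets by value, preallocating one bucket per ordinal, and allocating buckets on demand can be illustrated with a toy model. This is only a sketch in plain Python, not Elasticsearch's implementation: the function names, the list-of-lists document layout, and the way ordinals are built here are all assumptions made for illustration, and `global_ordinals_low_cardinality` is not modelled.

```python
from collections import defaultdict


def count_map(doc_values):
    """`map`-style: key buckets directly by the term value."""
    counts = defaultdict(int)
    for terms in doc_values:
        for term in terms:
            counts[term] += 1
    return dict(counts)


def build_ordinals(doc_values):
    """Assign each unique term an ordinal (its rank in sorted order)."""
    terms = sorted({t for doc in doc_values for t in doc})
    return terms, {t: i for i, t in enumerate(terms)}


def count_global_ordinals(doc_values, matching_docs):
    """`global_ordinals`-style: one preallocated slot per ordinal."""
    terms, ord_of = build_ordinals(doc_values)
    counts = [0] * len(terms)  # allocated up front, even for terms never matched
    for doc_id in matching_docs:
        for term in doc_values[doc_id]:
            counts[ord_of[term]] += 1
    return {terms[o]: c for o, c in enumerate(counts) if c}


def count_global_ordinals_hash(doc_values, matching_docs):
    """`global_ordinals_hash`-style: slots created only for ordinals seen."""
    terms, ord_of = build_ordinals(doc_values)
    counts = {}  # grows only with the ordinals actually encountered
    for doc_id in matching_docs:
        for term in doc_values[doc_id]:
            o = ord_of[term]
            counts[o] = counts.get(o, 0) + 1
    return {terms[o]: c for o, c in counts.items()}
```

In this model all three functions return the same counts for the same matching documents; what differs is how much bucket storage is allocated, which is the trade-off the list above describes.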
Elasticsearch tries to pick sensible defaults, so this generally doesn't need to be configured.

`map` should only be considered when very few documents match a query. Otherwise the ordinals-based execution modes
are significantly faster. By default, `map` is only used when running an aggregation on scripts, since they don't have
ordinals.

`global_ordinals_low_cardinality` only works for leaf terms aggregations but is usually the fastest execution mode. Memory
usage is linear in the number of unique values in the field, so it is only enabled by default on low-cardinality fields.

`global_ordinals` is the second fastest option, but because it preemptively allocates buckets it can be memory-intensive,
especially if you have one or more sub-aggregations. It is used by default on top-level terms aggregations.

`global_ordinals_hash`, unlike `global_ordinals` and `global_ordinals_low_cardinality`, allocates buckets dynamically,
so memory usage is linear in the number of values of the documents that are part of the aggregation scope. It is used by default
in inner aggregations.

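The defaults described above can be summarised as a small decision function. This is a hedged sketch of the prose only, not the selection logic in the Elasticsearch source: the function name, its parameters, the order in which the rules are checked, and the `low_cardinality_threshold` value are all assumptions, since the text does not say what counts as "low-cardinality".

```python
def default_execution_mode(is_script, is_leaf, is_top_level, unique_values,
                           low_cardinality_threshold=1000):
    """Return the execution mode the documentation describes as the default.

    low_cardinality_threshold is a made-up illustrative value, not an
    Elasticsearch setting.
    """
    if is_script:
        # Scripts have no ordinals, so only `map` applies.
        return "map"
    if is_leaf and unique_values <= low_cardinality_threshold:
        # Fastest mode, but restricted to leaf aggregations on small fields.
        return "global_ordinals_low_cardinality"
    if is_top_level:
        # Preallocates one bucket per ordinal.
        return "global_ordinals"
    # Inner aggregations: allocate buckets only as ordinals are seen.
    return "global_ordinals_hash"
```

For example, under these assumptions a top-level terms aggregation with sub-aggregations on a field with a million unique values would come out as `global_ordinals`, while the same aggregation nested inside another one would come out as `global_ordinals_hash`.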
[source,js]
--------------------------------------------------

@@ -419,6 +436,6 @@ perform better than the other one, you have the ability to "hint" it to Elastics
}
--------------------------------------------------

<1> the possible values are `map`, `ordinals` and `global_ordinals`
<1> the possible values are `map`, `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality`

Please note that Elasticsearch will ignore this execution hint if it is not applicable.
Please note that Elasticsearch will ignore this execution hint if it is not applicable and that there is no backward compatibility guarantee on these hints.

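Since the request body itself is elided from this hunk, the following sketch shows one way a client might build a terms aggregation request that carries an `execution_hint`. The helper name `terms_agg_body`, the aggregation name and the field name `tags` are hypothetical. The set of accepted hints is taken from the callout above as of this commit, and because there is no backward compatibility guarantee on these hints it may not match other versions. The client-side check is a choice made for this example; Elasticsearch itself is described above as ignoring a hint that is not applicable.

```python
import json

# Hint values listed in the updated callout; assumed valid only for this version.
VALID_HINTS = {
    "map",
    "global_ordinals",
    "global_ordinals_hash",
    "global_ordinals_low_cardinality",
}


def terms_agg_body(agg_name, field, execution_hint=None):
    """Build a search request body with a single terms aggregation."""
    terms = {"field": field}
    if execution_hint is not None:
        if execution_hint not in VALID_HINTS:
            raise ValueError("unknown execution_hint: %r" % execution_hint)
        terms["execution_hint"] = execution_hint
    return {"aggs": {agg_name: {"terms": terms}}}


# Hypothetical aggregation and field names, for illustration only.
print(json.dumps(terms_agg_body("tags", "tags", "map"), indent=2))
```

Leaving `execution_hint` unset keeps the default selection, which the text above recommends in most cases.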