OpenSearch/docs/reference/aggregations/bucket
Luca Cavanna 42ea644903
Remove single shard optimization when suggesting shard_size (#37041)
When executing terms aggregations we set the shard_size, meaning the
number of buckets to collect on each shard, to a value that's higher than
the number of requested buckets, to guarantee some basic level of
precision. We have an optimization in place so that we leave shard_size
set to size whenever we are searching against a single shard, in which
case maximum precision is guaranteed by definition.

This optimization requires access to the total number of shards that
the search is executing against. In the context of cross-cluster search,
once we introduce multiple reduction steps (one per cluster), each
cluster will only know the number of its local shards. This is problematic,
as we should only optimize if we are searching against a single shard in a
single cluster. If we are searching against one shard per cluster, the
current code would apply the optimization on every cluster and reduce the
number of terms collected, causing a loss of precision.

While discussing how to address the CCS scenario, we decided that we do
not want to introduce further complexity caused by this single shard
optimization, as it benefits only a minority of cases and the benefits
are modest.

This commit removes the single shard optimization, meaning that the
heuristic for how many buckets to collect on each shard is always
enabled, even when searching against a single shard.
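
To make the change concrete, here is a minimal sketch of the shard_size suggestion before and after this commit. It is an illustration, not the actual implementation: the function names are hypothetical, and the formula size * 1.5 + 10 is assumed from the default described in the terms aggregation documentation, not stated in this commit message.

```python
# Upper bound, assuming the suggested value must fit in a 32-bit signed int.
INT_MAX = 2**31 - 1


def _heuristic_shard_size(size: int) -> int:
    """Assumed default heuristic: over-fetch on each shard for precision."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    return min(INT_MAX, int(size * 1.5 + 10))


def suggest_shard_size_before(size: int, number_of_shards: int) -> int:
    """Old behaviour (sketch): skip the heuristic on a single shard."""
    if number_of_shards == 1:
        # Single shard: collecting exactly `size` buckets is already exact.
        return size
    return _heuristic_shard_size(size)


def suggest_shard_size_after(size: int) -> int:
    """New behaviour (sketch): the heuristic applies regardless of shard count."""
    return _heuristic_shard_size(size)


if __name__ == "__main__":
    for shards in (1, 5):
        print(shards, suggest_shard_size_before(10, shards), suggest_shard_size_after(10))
```

Under these assumptions, a request for 10 buckets against a single shard used to collect 10 buckets on that shard and now collects 25, the same as a multi-shard search.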

This will cause more buckets to be collected when searching against a single
shard compared to before. If that becomes a problem for some users, they
can work around that by setting the shard_size equal to the size.
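
The workaround could look like the following sketch, which builds a terms aggregation request body with shard_size pinned to size. The helper name, the aggregation name, and the field name `genre` are hypothetical, chosen only for illustration.

```python
import json


def terms_agg_request(field: str, size: int, pin_shard_size: bool = True) -> dict:
    """Build a search body with one terms aggregation on `field`.

    When `pin_shard_size` is True, shard_size is set explicitly to `size`,
    so each shard collects only `size` buckets instead of the suggested value.
    """
    terms = {"field": field, "size": size}
    if pin_shard_size:
        terms["shard_size"] = size
    # "size": 0 at the top level skips returning hits; only buckets are needed.
    return {"size": 0, "aggs": {"by_" + field: {"terms": terms}}}


if __name__ == "__main__":
    # `genre` is a placeholder field name.
    print(json.dumps(terms_agg_request("genre", 5), indent=2))
```

Pinning shard_size this way only preserves full precision when the search really does hit a single shard; on multiple shards it trades accuracy for fewer collected buckets.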

Relates to #32125
2019-01-02 17:45:49 +01:00
adjacency-matrix-aggregation.asciidoc Docs - removed experimental/beta markers from adjacency matrix aggregation (#34599) 2018-10-19 09:33:59 +01:00
autodatehistogram-aggregation.asciidoc Add interval response parameter to AutoDateInterval histogram (#33254) 2018-09-05 07:35:59 -04:00
children-aggregation.asciidoc Make hits.total an object in the search response (#35849) 2018-12-05 19:49:06 +01:00
composite-aggregation.asciidoc Make sure to use the type _doc in the REST documentation. (#34662) 2018-10-22 11:54:04 -07:00
datehistogram-aggregation.asciidoc [Docs] Fix typo in datehistogram-aggregation.asciidoc (#35855) 2018-11-23 15:16:53 +01:00
daterange-aggregation.asciidoc Document and test date_range "missing" support (#28983) 2018-03-13 12:58:30 -07:00
diversified-sampler-aggregation.asciidoc Adds deprecation logging to ScriptDocValues#getValues. (#34279) 2018-11-27 14:30:13 -05:00
filter-aggregation.asciidoc Update filter-aggregation.asciidoc (#24138) 2017-04-17 18:46:13 -04:00
filters-aggregation.asciidoc filters agg docs duplicated 'bucket' word removal (#30677) 2018-05-17 15:21:50 +01:00
geodistance-aggregation.asciidoc Make sure to use the type _doc in the REST documentation. (#34662) 2018-10-22 11:54:04 -07:00
geohashgrid-aggregation.asciidoc Make sure to use the type _doc in the REST documentation. (#34662) 2018-10-22 11:54:04 -07:00
global-aggregation.asciidoc CONSOLE-ify global-aggregation.asciidoc 2017-01-20 14:36:51 -05:00
histogram-aggregation.asciidoc [DOC] Fix mathematical representation on interval (range) (#27450) 2017-11-21 17:06:26 +00:00
iprange-aggregation.asciidoc Ensure that ip_range aggregations always return bucket keys. (#30701) 2018-05-24 08:55:14 -07:00
missing-aggregation.asciidoc CONSOLEify some more aggregation docs 2017-05-16 17:25:24 -04:00
nested-aggregation.asciidoc fixing typo in nested-aggregation.asciidoc (#26481) 2017-09-04 06:42:44 +02:00
parent-aggregation.asciidoc Make hits.total an object in the search response (#35849) 2018-12-05 19:49:06 +01:00
range-aggregation.asciidoc [Docs] Convert remaining code snippets in docs (#26422) 2017-08-30 12:11:10 +02:00
reverse-nested-aggregation.asciidoc [Doc] Fixs typo in reverse-nested-aggregation.asciidoc (#28348) 2018-01-24 17:54:02 +01:00
sampler-aggregation.asciidoc Update experimental labels in the docs (#25727) 2017-07-18 14:06:22 +02:00
significantterms-aggregation.asciidoc Remove single shard optimization when suggesting shard_size (#37041) 2019-01-02 17:45:49 +01:00
significanttext-aggregation.asciidoc Remove single shard optimization when suggesting shard_size (#37041) 2019-01-02 17:45:49 +01:00
terms-aggregation.asciidoc Remove single shard optimization when suggesting shard_size (#37041) 2019-01-02 17:45:49 +01:00