Historically only two things happened in the final reduction:
empty buckets were filled, and pipeline aggs were reduced (since it
was the final reduction, this was safe). Usage of the final reduction
is growing however. Auto-date-histo might need to perform
many reductions on final-reduce to merge down buckets, CCS
may need to side-step the final reduction if sending to a
different cluster, etc
Having pipelines generate their output in the final reduce was
convenient, but is becoming increasingly difficult to manage
as the rest of the agg framework advances.
This commit decouples pipeline aggs from the final reduction by
introducing a new "top level" reduce, which should be called
at the beginning of the reduce cycle (e.g. from the SearchPhaseController).
This will only reduce pipeline aggs on the final reduce after
the non-pipeline agg tree has been fully reduced.
By separating pipeline reduction into their own set of methods,
aggregations are free to use the final reduction for whatever
purpose without worrying about generating pipeline results
which are non-reducible
This is a pure code rearrangement refactor. Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValueSourceType; we extract an interface for future extensibility). ValueSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.
Backport of #47468 to 7.x
This PR adds a new metric aggregation called string_stats that operates on string terms of a document and returns the following:
min_length: The length of the shortest term
max_length: The length of the longest term
avg_length: The average length of all terms
distribution: The probability distribution of all characters appearing in all terms
entropy: The total Shannon entropy value calculated for all terms
This aggregation has been implemented as an analytics plugin.
This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them.
The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`.
Relates #46523