Adrien Grand 866a5459f0 Make significant terms work on fields that are indexed with points. #18031
It will keep using the caching terms enum for keyword/text fields and fall back
to IndexSearcher.count for fields that do not use the inverted index for
searching (such as numbers and ip addresses). Note that this probably means that
significant terms aggregations on these fields will be less efficient than they
used to be. It should be ok under a sampler aggregation though.

This moves tests back to the state they were in before numbers started using
points, and also adds a new test that significant terms aggs fail if a field is
not indexed.

In the long term, we might want to follow the approach that Robert initially
proposed, which consists of collecting all documents from the background filter
in order to compute frequencies using doc values. This would also mean that
significant terms aggregations no longer require fields to be indexed.
2016-05-11 16:52:58 +02:00


[[breaking_50_aggregations_changes]]
=== Aggregation changes

==== Significant terms on numeric fields

Numeric fields have been refactored to use a different data structure that
performs better for range queries. However, since this data structure does
not record document frequencies, numeric fields need to fall back to running
queries in order to count the matching documents in the background set,
which may degrade performance.

It is recommended to use <<keyword,`keyword`>> fields instead, either directly
or through a <<multi-fields,multi-field>> if the numeric representation is
still needed for sorting, range queries or numeric aggregations like
<<search-aggregations-metrics-stats-aggregation,`stats` aggregations>>.
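
As a minimal sketch of the multi-field approach, the requests below map a
numeric field together with a `keyword` sub-field and then run a
`significant_terms` aggregation on that sub-field, so that the numeric
representation remains available for sorting and range queries. The index
name `my_index`, the type name `my_type`, the field names `status_code` and
`message`, the sub-field name `keyword` and the query text are hypothetical
and chosen only for illustration:

[source,js]
--------------------------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "message": {
          "type": "text"
        },
        "status_code": {
          "type": "integer",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

GET my_index/_search
{
  "size": 0,
  "query": {
    "match": {
      "message": "timeout"
    }
  },
  "aggs": {
    "significant_status_codes": {
      "significant_terms": {
        "field": "status_code.keyword"
      }
    }
  }
}
--------------------------------------------------

With a mapping along these lines, the aggregation reads document frequencies
from the inverted index of `status_code.keyword`, while `status_code` itself
can still be used wherever a numeric value is required.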