OpenSearch/docs/reference/aggregations
Zachary Tong ea1794832f Add RareTerms aggregation (#35718)
This adds a `rare_terms` aggregation. It is designed to identify the
long tail of keywords, i.e. terms that are "rare" or have low doc
counts.

This aggregation is designed to be more memory efficient than the
alternative, which is setting a terms aggregation to size: LONG_MAX
(or worse, ordering a terms agg by count ascending, which has
unbounded error).

This aggregation works by maintaining a map of terms that have
been seen. A counter associated with each term is incremented
each time the term is seen again. If the counter surpasses a predefined
threshold, the term is removed from the map and inserted into a cuckoo
filter. If a later term is found in the cuckoo filter, we assume it
was previously removed from the map and is "common".

Once collection is done, the keys remaining in the map are the "rare" terms.
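
The collection scheme described above can be sketched roughly as follows.
This is only an illustration, not the code from the commit: the function
name `rare_terms` and the parameter `max_doc_count` are chosen here for
the example, and a plain Python set stands in for the cuckoo filter. A set
is exact, whereas a real cuckoo filter is probabilistic and can report
false positives, so it may occasionally treat a rare term as "common".

```python
def rare_terms(terms, max_doc_count=1):
    """Return {term: count} for terms seen at most max_doc_count times.

    Sketch only: `common` is an exact set standing in for the cuckoo
    filter, so this version uses more memory than the real aggregation
    would and never produces false positives.
    """
    counts = {}     # candidate rare terms -> occurrences so far
    common = set()  # stand-in for the cuckoo filter of "common" terms

    for term in terms:
        # A term found in the filter is assumed to have been evicted
        # from the map earlier, so it is "common" and is skipped.
        if term in common:
            continue

        seen = counts.get(term, 0) + 1
        if seen > max_doc_count:
            # Counter surpassed the threshold: move the term from the
            # map into the filter.
            counts.pop(term, None)
            common.add(term)
        else:
            counts[term] = seen

    # Whatever is still in the map is "rare".
    return counts


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "b", "d"]
    print(rare_terms(stream, max_doc_count=1))
```

The map only ever holds terms at or below the threshold, which is where
the memory saving over an unbounded terms aggregation comes from; the
filter records evicted terms far more compactly than the map would.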
2019-07-01 10:30:02 -04:00
..
bucket Add RareTerms aggregation (#35718) 2019-07-01 10:30:02 -04:00
matrix Allow `_doc` as a type. (#27816) 2017-12-14 17:47:53 +01:00
metrics [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
pipeline [DOCS] Remove unneeded `ifdef::asciidoctor[]` conditionals (#42758) 2019-05-31 11:08:54 -04:00
bucket.asciidoc Add RareTerms aggregation (#35718) 2019-07-01 10:30:02 -04:00
matrix.asciidoc refactor matrix agg documentation from modules to main agg section 2016-06-06 07:39:00 -05:00
metrics.asciidoc median absolute deviation agg (#34482) 2018-10-30 07:22:52 -07:00
misc.asciidoc [7.x Backport] Force selection of calendar or fixed intervals (#41906) 2019-05-20 12:07:29 -04:00
pipeline.asciidoc [7.x Backport] Force selection of calendar or fixed intervals (#41906) 2019-05-20 12:07:29 -04:00