OpenSearch/docs/reference/aggregations/bucket
Zachary Tong ea1794832f Add RareTerms aggregation (#35718)
This adds a `rare_terms` aggregation. It is designed to identify
the long tail of keywords, i.e. terms that are "rare" or have low
doc counts.

This aggregation is designed to be more memory efficient than the
alternatives: setting size: LONG_MAX on a terms aggregation
(or worse, ordering a terms agg by count ascending, which has
unbounded error).

This aggregation works by maintaining a map of terms that have
been seen. A counter associated with each term is incremented
each time the term is seen again. If the counter surpasses a predefined
threshold, the term is removed from the map and inserted into a cuckoo
filter. If a later term is found in the cuckoo filter, we assume it
was previously removed from the map and is "common".

The map keys are the "rare" terms after collection is done.
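
The collection scheme described above can be sketched roughly as
follows. This is an illustration only, not the code from the commit:
the function name `rare_terms_sketch` and the `threshold` parameter are
hypothetical, and a plain Python `set` stands in for the cuckoo filter.
A real cuckoo filter is probabilistic and can report false positives,
which an exact set does not.

```python
def rare_terms_sketch(terms, threshold=1):
    """Return {term: count} for terms seen at most `threshold` times.

    Hypothetical helper for illustration; not the actual implementation.
    """
    counts = {}     # map of candidate "rare" terms -> times seen so far
    common = set()  # stand-in for the cuckoo filter (exact, no false positives)

    for term in terms:
        # A term already in the filter was evicted earlier: treat it as "common".
        if term in common:
            continue

        seen = counts.get(term, 0) + 1
        if seen > threshold:
            # Counter surpassed the threshold: move the term from map to filter.
            counts.pop(term, None)
            common.add(term)
        else:
            counts[term] = seen

    # Whatever is still in the map when collection ends is "rare".
    return counts


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "d", "d"]
    print(rare_terms_sketch(stream, threshold=1))  # {'b': 1, 'c': 1}
```

Memory stays bounded by the number of candidate rare terms plus the
filter, since a term's counter is dropped as soon as it is classified
as "common".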
2019-07-01 10:30:02 -04:00
..
adjacency-matrix-aggregation.asciidoc Use new bulk API endpoint in the docs (#37698) 2019-01-23 09:46:28 +01:00
autodatehistogram-aggregation.asciidoc [backport] Adds a minimum interval to `auto_date_histogram`. (#42814) (#43285) 2019-06-19 07:06:45 -04:00
children-aggregation.asciidoc Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
composite-aggregation.asciidoc [7.x Backport] Force selection of calendar or fixed intervals (#41906) 2019-05-20 12:07:29 -04:00
datehistogram-aggregation.asciidoc [7.x Backport] Force selection of calendar or fixed intervals (#41906) 2019-05-20 12:07:29 -04:00
daterange-aggregation.asciidoc [DOCS] Fixes callout for Asciidoctor migration (#41127) 2019-04-11 12:06:10 -07:00
diversified-sampler-aggregation.asciidoc [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
filter-aggregation.asciidoc Update filter-aggregation.asciidoc (#24138) 2017-04-17 18:46:13 -04:00
filters-aggregation.asciidoc [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
geodistance-aggregation.asciidoc Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
geohashgrid-aggregation.asciidoc Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
geotilegrid-aggregation.asciidoc geotile_grid implementation (#37842) 2019-01-31 19:11:30 -05:00
global-aggregation.asciidoc CONSOLE-ify global-aggregation.asciidoc 2017-01-20 14:36:51 -05:00
histogram-aggregation.asciidoc [DOC] Fix mathematical representation on interval (range) (#27450) 2017-11-21 17:06:26 +00:00
iprange-aggregation.asciidoc Ensure that ip_range aggregations always return bucket keys. (#30701) 2018-05-24 08:55:14 -07:00
missing-aggregation.asciidoc CONSOLEify some more aggregation docs 2017-05-16 17:25:24 -04:00
nested-aggregation.asciidoc Remove `include_type_name` in asciidoc where possible (#37568) 2019-01-18 09:34:11 +01:00
parent-aggregation.asciidoc Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
range-aggregation.asciidoc [Docs] Convert remaining code snippets in docs (#26422) 2017-08-30 12:11:10 +02:00
rare-terms-aggregation.asciidoc Add RareTerms aggregation (#35718) 2019-07-01 10:30:02 -04:00
reverse-nested-aggregation.asciidoc Remove `include_type_name` in asciidoc where possible (#37568) 2019-01-18 09:34:11 +01:00
sampler-aggregation.asciidoc [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
significantterms-aggregation.asciidoc [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
significanttext-aggregation.asciidoc [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
terms-aggregation.asciidoc Use the breadth first collection mode for significant terms aggs. (#29042) 2019-04-11 15:56:02 -07:00