OpenSearch/docs/reference
Jim Ferenczi 623367d793
Add composite aggregator (#26800)
* This change adds a module called `aggs-composite` that defines a new aggregation named `composite`.
The `composite` aggregation is a multi-bucket aggregation that creates composite buckets from multiple sources.
The sources for each bucket can be defined as:
  * A `terms` source: values are extracted from a field or a script.
  * A `date_histogram` source: values are extracted from a date field and rounded to the provided interval.
This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation into composite buckets.
A composite bucket is composed of one value per source and is built for each document from the combinations of values in the provided sources.
For instance the following aggregation:

````
"test_agg": {
  "terms": {
    "field": "field1"
  },
  "aggs": {
    "nested_test_agg": {
      "terms": {
        "field": "field2"
      }
    }
  }
}
````
... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve **all** the combinations of `field1`, `field2` in the matching documents:

````
"composite_agg": {
  "composite": {
    "sources": [
      {
        "field1": {
          "terms": {
            "field": "field1"
          }
        }
      },
      {
        "field2": {
          "terms": {
            "field": "field2"
          }
        }
      }
    ]
  }
}
````

The response of the aggregation looks like this:

````
"aggregations": {
  "composite_agg": {
    "buckets": [
      {
        "key": {
          "field1": "alabama",
          "field2": "almanach"
        },
        "doc_count": 100
      },
      {
        "key": {
          "field1": "alabama",
          "field2": "calendar"
        },
        "doc_count": 1
      },
      {
        "key": {
          "field1": "arizona",
          "field2": "calendar"
        },
        "doc_count": 1
      }
    ]
  }
}
````
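
How a composite bucket is "built for each document as the combinations of values in the provided sources" can be illustrated with a small Python sketch. This is not the module's implementation. The function name `composite_buckets`, the sample documents, and the choice to skip documents that lack a value for a source are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def composite_buckets(docs, sources):
    """Count composite keys: one value per source, all combinations per document."""
    counts = Counter()
    for doc in docs:
        # Normalise each source field to a list, since a field may be multi-valued.
        value_lists = []
        for field in sources:
            values = doc.get(field, [])
            if not isinstance(values, list):
                values = [values]
            value_lists.append(sorted(set(values)))
        # In this sketch a document with no value for a source yields no key,
        # because the product over an empty list is empty.
        for key in product(*value_lists):
            counts[key] += 1
    # Return buckets in ascending order of the composite key.
    return [
        {"key": dict(zip(sources, key)), "doc_count": count}
        for key, count in sorted(counts.items())
    ]


# Hypothetical documents; the first one has two values for field2.
docs = [
    {"field1": "alabama", "field2": ["almanach", "calendar"]},
    {"field1": "alabama", "field2": "almanach"},
    {"field1": "arizona", "field2": "calendar"},
]
buckets = composite_buckets(docs, ["field1", "field2"])
```

The multi-valued document contributes to two composite keys, which is why a single document can be counted in several buckets.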

By default this aggregation returns 10 buckets sorted in ascending order of the composite key.
Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance, the following aggregation returns all composite keys that sort after `alabama, calendar`:

````
"composite_agg": {
  "composite": {
    "after": {"field1": "alabama", "field2": "calendar"},
    "size": 100,
    "sources": [
      {
        "field1": {
          "terms": {
            "field": "field1"
          }
        }
      },
      {
        "field2": {
          "terms": {
            "field": "field2"
          }
        }
      }
    ]
  }
}
````
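
The `after` and `size` semantics described above can be simulated in a few lines of Python. This is a sketch over an in-memory list of already sorted composite keys, not client code for the aggregation. The helper names `composite_page` and `all_pages` are hypothetical.

```python
def composite_page(sorted_keys, size=10, after=None):
    """Return up to `size` keys that sort strictly after `after`."""
    if after is not None:
        sorted_keys = [key for key in sorted_keys if key > after]
    return sorted_keys[:size]


def all_pages(sorted_keys, size):
    """Walk every page by feeding the last key of each page back in as `after`."""
    after = None
    while True:
        page = composite_page(sorted_keys, size=size, after=after)
        if not page:
            break
        yield page
        after = page[-1]


# Composite keys as (field1, field2) tuples, in ascending order.
keys = [
    ("alabama", "almanach"),
    ("alabama", "calendar"),
    ("arizona", "calendar"),
]
next_page = composite_page(keys, size=100, after=("alabama", "calendar"))
pages = list(all_pages(keys, size=2))
```

Because the buckets are sorted by composite key, the last key of one page is enough to resume from; no offset or server-side cursor is needed in this model.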

This aggregation is optimized for indices whose index sorting matches the composite source definition.
For instance, the aggregation above could run faster on indices that define an index sort like this:

````
"settings": {
  "index.sort.field": ["field1", "field2"]
}
````

In this case the `composite` aggregation can early terminate on each segment.
This aggregation also accepts multi-valued fields, but it disables early termination for these fields even if the index sorting matches the source definitions.
This is mandatory because index sorting picks only one value per document to perform the sort.
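
Why multi-valued fields defeat early termination can be shown with a toy model in Python. The sketch assumes the index sort orders documents by their smallest value for the field. That choice, the helper names, and the sample documents are assumptions for illustration, not a description of the actual collector.

```python
def keys_in_index_order(docs, field):
    """Yield every value of `field`, visiting documents in index-sort order."""
    # Assumption: the index sort uses one value per document, here the smallest.
    for doc in sorted(docs, key=lambda d: min(d[field])):
        yield from sorted(doc[field])


def first_n_with_early_termination(docs, field, n):
    """Stop after n distinct keys; only correct if keys arrive in sorted order."""
    seen = []
    for key in keys_in_index_order(docs, field):
        if key not in seen:
            seen.append(key)
        if len(seen) == n:
            break  # early termination
    return seen


single_valued = [{"field1": ["a"]}, {"field1": ["b"]}, {"field1": ["c"]}]
multi_valued = [{"field1": ["a", "z"]}, {"field1": ["b"]}, {"field1": ["c"]}]

ok = first_n_with_early_termination(single_valued, "field1", 2)
wrong = first_n_with_early_termination(multi_valued, "field1", 2)
```

With single-valued documents the first two keys seen are also the two smallest. With the multi-valued document, `z` is seen before `b`, so stopping early would return the wrong buckets.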
2017-11-16 15:13:36 +01:00
aggregations Add composite aggregator (#26800) 2017-11-16 15:13:36 +01:00
analysis Add limits for ngram and shingle settings (#27211) 2017-11-07 08:14:55 -05:00
cat Add docs on full_id parameter in cat nodes API 2017-10-13 13:49:25 -04:00
cluster Add cgroup memory usage/limit to OS stats on Linux (#26166) 2017-10-03 12:08:36 +01:00
docs action.auto_create_index can be set as a dynamic cluster setting (#27026) 2017-10-17 20:44:18 +00:00
how-to Add documentation about disabling `_field_names`. (#26813) 2017-10-06 16:49:15 +02:00
images Docs/windows installer (#27369) 2017-11-15 21:35:54 +11:00
index-modules Logging: Unify log rotation for index/search slow log (#27298) 2017-11-15 10:01:32 +01:00
indices Add ability to split shards (#26931) 2017-11-06 11:37:55 +01:00
ingest add json-processor support for non-map json types (#27335) 2017-11-13 10:28:19 -08:00
mapping [Docs] Restore section about multi-level parent/child relation in parent-join (#27392) 2017-11-16 11:29:16 +01:00
migration Fail queries with scroll that explicitely set request_cache (#27342) 2017-11-10 16:02:06 +01:00
modules Fixed references to Multi Index Syntax (#27283) 2017-11-06 19:15:36 +01:00
query-dsl Added new terms_set query 2017-11-01 10:55:18 +01:00
release-notes Migrate migration docs from 6.0 to 7.0 (#26227) 2017-08-16 13:12:44 -06:00
search Fix profiling naming issues (#27133) 2017-11-06 16:37:33 -05:00
setup Docs/windows installer (#27369) 2017-11-15 21:35:54 +11:00
testing Docs: Replace deprecated pluginList with Arrays.asList (#24270) 2017-04-24 13:30:37 +02:00
aggregations.asciidoc Update aggregation.asciidoc (#24042) 2017-04-11 09:02:38 -04:00
analysis.asciidoc Add the ability to set an analyzer on keyword fields. (#21919) 2016-12-30 09:36:10 +01:00
api-conventions.asciidoc [Docs] Fix minor paragraph indentation error for multiple Indices params (#25535) 2017-11-06 10:20:20 +01:00
cat.asciidoc Enforce that responses in docs are valid json (#26249) 2017-08-17 09:02:10 -04:00
cluster.asciidoc [docs] include two cluster doc pages missing from index (#25180) 2017-06-12 12:33:56 -07:00
docs.asciidoc Inclusion of link to Multi Delete (#22619) 2017-01-16 12:58:59 +01:00
getting-started.asciidoc Tests: Improve size regex in documentation test (#26879) 2017-11-13 10:21:53 +01:00
glossary.asciidoc Remove usage of multi-types from the docs and added a page explaining type removal (#25543) 2017-07-05 12:30:19 +02:00
gs-index.asciidoc [DOCS] Adding index file for GS "mini book". 2017-07-18 13:44:08 -07:00
how-to.asciidoc Correct grammar in list in how-to docs 2017-01-17 20:57:22 -05:00
index-modules.asciidoc Add limits for ngram and shingle settings (#27211) 2017-11-07 08:14:55 -05:00
index-shared1.asciidoc [DOCS] Split index-shared.asciidoc into multiple smaller files (#25302) 2017-06-19 15:14:53 -07:00
index-shared2.asciidoc [DOCS] Added index-shared4 and index-shared5.asciidoc 2017-09-20 10:54:26 -07:00
index-shared3.asciidoc [DOCS] Added index-shared4 and index-shared5.asciidoc 2017-09-20 10:54:26 -07:00
index-shared4.asciidoc [DOCS] Added index-shared4 and index-shared5.asciidoc 2017-09-20 10:54:26 -07:00
index-shared5.asciidoc [DOCS] Added index-shared4 and index-shared5.asciidoc 2017-09-20 10:54:26 -07:00
index.asciidoc [DOCS] Added index-shared4 and index-shared5.asciidoc 2017-09-20 10:54:26 -07:00
indices.asciidoc add split index reference in indices.asciidoc 2017-11-06 12:55:41 +01:00
ingest.asciidoc [Docs] Update ingest.asciidoc (#26599) 2017-09-15 11:15:31 +02:00
mapping.asciidoc Remove the _all metadata field (#26356) 2017-08-28 17:43:59 +02:00
modules.asciidoc [DOCS] Remove edit link from ML node 2017-09-14 16:18:29 -07:00
query-dsl.asciidoc
redirects.asciidoc Mark filtered query example as not to be used (#25661) 2017-07-14 11:45:21 +02:00
release-notes.asciidoc Migrate migration docs from 6.0 to 7.0 (#26227) 2017-08-16 13:12:44 -06:00
search.asciidoc Enable adaptive replica selection by default (#26522) 2017-09-07 09:25:05 -06:00
setup.asciidoc Reorganised setup docs into better order 2017-07-21 11:24:46 +02:00
testing.asciidoc