[[search-aggregations]]
= Aggregations

[partintro]
--
An aggregation summarizes your data as metrics, statistics, or other analytics.
Aggregations help you answer questions like:

* What's the average load time for my website?
* Who are my most valuable customers based on transaction volume?
* What would be considered a large file on my network?
* How many products are in each product category?

{es} organizes aggregations into three categories:

* <> aggregations that calculate metrics, such as a sum or average, from field
values.

* <> aggregations that group documents into buckets, also called bins, based on
field values, ranges, or other criteria.

* <> aggregations that take input from other aggregations instead of documents
or fields.

[discrete]
[[run-an-agg]]
=== Run an aggregation

You can run aggregations as part of a <> by specifying the <>'s `aggs`
parameter. The following search runs a <> on `my-field`:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

Aggregation results are in the response's `aggregations` object:

[source,console-result]
----
{
  "took": 78,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [...]
  },
  "aggregations": {
    "my-agg-name": {                           <1>
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/"took": 78/"took": "$body.took"/]
// TESTRESPONSE[s/\.\.\.$/"took": "$body.took", "timed_out": false, "_shards": "$body._shards", /]
// TESTRESPONSE[s/"hits": \[\.\.\.\]/"hits": "$body.hits.hits"/]
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":"get","doc_count":5\}\]/]

<1> Results for the `my-agg-name` aggregation.
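
As a rough sketch of how a client might read these results, the following
Python snippet pulls bucket keys and document counts out of a response that has
already been parsed into a dictionary. The `bucket_counts` helper and the
hard-coded `response` are illustrative assumptions and are not part of any {es}
client. The single bucket reuses the `get` key and count of `5` that the test
substitution for the preceding response fills in.

[source,python]
----
# Hypothetical, trimmed copy of a search response parsed from JSON.
# Only the "aggregations" object matters for this sketch.
response = {
    "aggregations": {
        "my-agg-name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [{"key": "get", "doc_count": 5}],
        }
    }
}


def bucket_counts(response, agg_name):
    """Map each bucket key to its doc_count for a bucket-style aggregation.

    Returns an empty dict if the aggregation or its buckets are missing.
    """
    agg = response.get("aggregations", {}).get(agg_name, {})
    return {bucket["key"]: bucket["doc_count"] for bucket in agg.get("buckets", [])}


counts = bucket_counts(response, "my-agg-name")
print(counts)
----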

[discrete]
[[change-agg-scope]]
=== Change an aggregation's scope

Use the `query` parameter to limit the documents on which an aggregation runs:

[source,console]
----
GET /my-index-000001/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

[discrete]
[[return-only-agg-results]]
=== Return only aggregation results

By default, searches containing an aggregation return both search hits and
aggregation results. To return only aggregation results, set `size` to `0`:

[source,console]
----
GET /my-index-000001/_search
{
  "size": 0,
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

[discrete]
[[run-multiple-aggs]]
=== Run multiple aggregations

You can specify multiple aggregations in the same request:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-first-agg-name": {
      "terms": {
        "field": "my-field"
      }
    },
    "my-second-agg-name": {
      "avg": {
        "field": "my-other-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]
// TEST[s/my-other-field/http.response.bytes/]

[discrete]
[[run-sub-aggs]]
=== Run sub-aggregations

Bucket aggregations support bucket or metric sub-aggregations. For example, a
terms aggregation with an <> sub-aggregation calculates an average value for
each bucket of documents. There is no level or depth limit for nesting
sub-aggregations.

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      },
      "aggs": {
        "my-sub-agg-name": {
          "avg": {
            "field": "my-other-field"
          }
        }
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/_search/_search?size=0/]
// TEST[s/my-field/http.request.method/]
// TEST[s/my-other-field/http.response.bytes/]

The response nests sub-aggregation results under their parent aggregation:

[source,console-result]
----
{
  ...
  "aggregations": {
    "my-agg-name": {                           <1>
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "foo",
          "doc_count": 5,
          "my-sub-agg-name": {                 <2>
            "value": 75.0
          }
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
// TESTRESPONSE[s/"key": "foo"/"key": "get"/]
// TESTRESPONSE[s/"value": 75.0/"value": $body.aggregations.my-agg-name.buckets.0.my-sub-agg-name.value/]

<1> Results for the parent aggregation, `my-agg-name`.
<2> Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.

[discrete]
[[add-metadata-to-an-agg]]
=== Add custom metadata

Use the `meta` object to associate custom metadata with an aggregation:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      },
      "meta": {
        "my-metadata-field": "foo"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/_search/_search?size=0/]

The response returns the `meta` object in place:

[source,console-result]
----
{
  ...
  "aggregations": {
    "my-agg-name": {
      "meta": {
        "my-metadata-field": "foo"
      },
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]

[discrete]
[[return-agg-type]]
=== Return the aggregation type

By default, aggregation results include the aggregation's name but not its type.
To return the aggregation type, use the `typed_keys` query parameter.

[source,console]
----
GET /my-index-000001/_search?typed_keys
{
  "aggs": {
    "my-agg-name": {
      "histogram": {
        "field": "my-field",
        "interval": 1000
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/typed_keys/typed_keys&size=0/]
// TEST[s/my-field/http.response.bytes/]

The response returns the aggregation type as a prefix to the aggregation's name.

IMPORTANT: Some aggregations return a different aggregation type from the
type in the request. For example, the terms, <>, and <> aggregations return
different aggregation types depending on the data type of the aggregated field.

[source,console-result]
----
{
  ...
  "aggregations": {
    "histogram#my-agg-name": {                 <1>
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":1070000.0,"doc_count":5\}\]/]

<1> The aggregation type, `histogram`, followed by a `#` separator and the
aggregation's name, `my-agg-name`.

[discrete]
[[use-scripts-in-an-agg]]
=== Use scripts in an aggregation

Some aggregations support <>. You can use a `script` to extract or generate
values for the aggregation:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "histogram": {
        "interval": 1000,
        "script": {
          "source": "doc['my-field'].value.length()"
        }
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

If you also specify a `field`, the `script` modifies the field values used in
the aggregation. The following aggregation uses a script to modify `my-field`
values:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "histogram": {
        "field": "my-field",
        "interval": 1000,
        "script": "_value / 1000"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.response.bytes/]

Some aggregations only work on specific data types. Use the `value_type`
parameter to specify a data type for a script-generated value or an unmapped
field.
`value_type` accepts the following values:

* `boolean`
* `date`
* `double`, used for all floating-point numbers
* `long`, used for all integers
* `ip`
* `string`

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "histogram": {
        "field": "my-field",
        "interval": 1000,
        "script": "_value / 1000",
        "value_type": "long"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.response.bytes/]

[discrete]
[[agg-caches]]
=== Aggregation caches

For faster responses, {es} caches the results of frequently run aggregations in
the <>. To get cached results, use the same <> for each search. If you don't
need search hits, <> to avoid filling the cache.

{es} routes searches with the same preference string to the same shards. If the
shards' data doesn’t change between searches, the shards return cached
aggregation results.

[discrete]
[[limits-for-long-values]]
=== Limits for `long` values

When running aggregations, {es} uses <> values to hold and represent numeric
data. As a result, aggregations on <> numbers greater than +2^53^+ are
approximate.
--

include::aggregations/bucket.asciidoc[]

include::aggregations/metrics.asciidoc[]

include::aggregations/pipeline.asciidoc[]