2013-11-24 06:13:08 -05:00
|
|
|
|
[[search-aggregations]]
|
2015-05-01 16:04:55 -04:00
|
|
|
|
= Aggregations
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2015-05-01 16:04:55 -04:00
|
|
|
|
[partintro]
|
|
|
|
|
--
|
2020-10-30 10:31:16 -04:00
|
|
|
|
An aggregation summarizes your data as metrics, statistics, or other analytics.
|
|
|
|
|
Aggregations help you answer questions like:
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
* What's the average load time for my website?
|
|
|
|
|
* Who are my most valuable customers based on transaction volume?
|
|
|
|
|
* What would be considered a large file on my network?
|
|
|
|
|
* How many products are in each product category?
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
{es} organizes aggregations into three categories:
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
* <<search-aggregations-metrics,Metric>> aggregations that calculate metrics,
|
|
|
|
|
such as a sum or average, from field values.
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
* <<search-aggregations-bucket,Bucket>> aggregations that
|
|
|
|
|
group documents into buckets, also called bins, based on field values, ranges,
|
|
|
|
|
or other criteria.
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
* <<search-aggregations-pipeline,Pipeline>> aggregations that take input from
|
|
|
|
|
other aggregations instead of documents or fields.
|
2016-06-02 17:30:50 -04:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[discrete]
|
|
|
|
|
[[run-an-agg]]
|
|
|
|
|
=== Run an aggregation
|
|
|
|
|
|
|
|
|
|
You can run aggregations as part of a <<search-your-data,search>> by specifying the <<search-search,search API>>'s `aggs` parameter. The
|
|
|
|
|
following search runs a
|
|
|
|
|
<<search-aggregations-bucket-terms-aggregation,terms aggregation>> on
|
|
|
|
|
`my-field`:
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"terms": {
|
|
|
|
|
"field": "my-field"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.request.method/]
|
|
|
|
|
|
|
|
|
|
Aggregation results are in the response's `aggregations` object:
|
|
|
|
|
|
|
|
|
|
[source,console-result]
|
|
|
|
|
----
|
|
|
|
|
{
|
|
|
|
|
"took": 78,
|
|
|
|
|
"timed_out": false,
|
|
|
|
|
"_shards": {
|
|
|
|
|
"total": 1,
|
|
|
|
|
"successful": 1,
|
|
|
|
|
"skipped": 0,
|
|
|
|
|
"failed": 0
|
|
|
|
|
},
|
|
|
|
|
"hits": {
|
|
|
|
|
"total": {
|
|
|
|
|
"value": 5,
|
|
|
|
|
"relation": "eq"
|
|
|
|
|
},
|
|
|
|
|
"max_score": 1.0,
|
|
|
|
|
"hits": [...]
|
|
|
|
|
},
|
|
|
|
|
"aggregations": {
|
|
|
|
|
"my-agg-name": { <1>
|
|
|
|
|
"doc_count_error_upper_bound": 0,
|
|
|
|
|
"sum_other_doc_count": 0,
|
|
|
|
|
"buckets": []
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TESTRESPONSE[s/"took": 78/"took": "$body.took"/]
|
|
|
|
|
// TESTRESPONSE[s/\.\.\.$/"took": "$body.took", "timed_out": false, "_shards": "$body._shards", /]
|
|
|
|
|
// TESTRESPONSE[s/"hits": \[\.\.\.\]/"hits": "$body.hits.hits"/]
|
|
|
|
|
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":"get","doc_count":5\}\]/]
|
|
|
|
|
|
|
|
|
|
<1> Results for the `my-agg-name` aggregation.
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
|
[[change-agg-scope]]
|
|
|
|
|
=== Change an aggregation's scope
|
|
|
|
|
|
|
|
|
|
Use the `query` parameter to limit the documents on which an aggregation runs:
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"query": {
|
|
|
|
|
"range": {
|
|
|
|
|
"@timestamp": {
|
|
|
|
|
"gte": "now-1d/d",
|
|
|
|
|
"lt": "now/d"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"terms": {
|
|
|
|
|
"field": "my-field"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.request.method/]
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
|
[[return-only-agg-results]]
|
|
|
|
|
=== Return only aggregation results
|
|
|
|
|
|
|
|
|
|
By default, searches containing an aggregation return both search hits and
|
|
|
|
|
aggregation results. To return only aggregation results, set `size` to `0`:
|
2015-05-01 16:04:55 -04:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"size": 0,
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"terms": {
|
|
|
|
|
"field": "my-field"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.request.method/]
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[discrete]
|
|
|
|
|
[[run-multiple-aggs]]
|
|
|
|
|
=== Run multiple aggregations
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
You can specify multiple aggregations in the same request:
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-first-agg-name": {
|
|
|
|
|
"terms": {
|
|
|
|
|
"field": "my-field"
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
"my-second-agg-name": {
|
|
|
|
|
"avg": {
|
|
|
|
|
"field": "my-other-field"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.request.method/]
|
|
|
|
|
// TEST[s/my-other-field/http.response.bytes/]
|
2018-03-28 09:01:45 -04:00
|
|
|
|
|
2020-07-23 12:42:33 -04:00
|
|
|
|
[discrete]
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[[run-sub-aggs]]
|
|
|
|
|
=== Run sub-aggregations
|
|
|
|
|
|
|
|
|
|
Bucket aggregations support bucket or metric sub-aggregations. For example, a
|
|
|
|
|
terms aggregation with an <<search-aggregations-metrics-avg-aggregation,avg>>
|
|
|
|
|
sub-aggregation calculates an average value for each bucket of documents. There
|
|
|
|
|
is no level or depth limit for nesting sub-aggregations.
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"terms": {
|
|
|
|
|
"field": "my-field"
|
|
|
|
|
},
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-sub-agg-name": {
|
|
|
|
|
"avg": {
|
|
|
|
|
"field": "my-other-field"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/_search/_search?size=0/]
|
|
|
|
|
// TEST[s/my-field/http.request.method/]
|
|
|
|
|
// TEST[s/my-other-field/http.response.bytes/]
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
The response nests sub-aggregation results under their parent aggregation:
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[source,console-result]
|
|
|
|
|
----
|
|
|
|
|
{
|
|
|
|
|
...
|
|
|
|
|
"aggregations": {
|
|
|
|
|
"my-agg-name": { <1>
|
|
|
|
|
"doc_count_error_upper_bound": 0,
|
|
|
|
|
"sum_other_doc_count": 0,
|
|
|
|
|
"buckets": [
|
|
|
|
|
{
|
|
|
|
|
"key": "foo",
|
|
|
|
|
"doc_count": 5,
|
|
|
|
|
"my-sub-agg-name": { <2>
|
|
|
|
|
"value": 75.0
|
|
|
|
|
}
|
2013-11-24 06:13:08 -05:00
|
|
|
|
}
|
2020-10-30 10:31:16 -04:00
|
|
|
|
]
|
2013-11-24 06:13:08 -05:00
|
|
|
|
}
|
2020-10-30 10:31:16 -04:00
|
|
|
|
}
|
2013-11-24 06:13:08 -05:00
|
|
|
|
}
|
2020-10-30 10:31:16 -04:00
|
|
|
|
----
|
|
|
|
|
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
|
|
|
|
|
// TESTRESPONSE[s/"key": "foo"/"key": "get"/]
|
|
|
|
|
// TESTRESPONSE[s/"value": 75.0/"value": $body.aggregations.my-agg-name.buckets.0.my-sub-agg-name.value/]
|
|
|
|
|
|
|
|
|
|
<1> Results for the parent aggregation, `my-agg-name`.
|
|
|
|
|
<2> Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-07-23 12:42:33 -04:00
|
|
|
|
[discrete]
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[[add-metadata-to-an-agg]]
|
|
|
|
|
=== Add custom metadata
|
2013-11-24 06:13:08 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
Use the `meta` object to associate custom metadata with an aggregation:
|
2014-10-29 08:55:33 -04:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"terms": {
|
|
|
|
|
"field": "my-field"
|
|
|
|
|
},
|
|
|
|
|
"meta": {
|
|
|
|
|
"my-metadata-field": "foo"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/_search/_search?size=0/]
|
2013-11-29 06:35:25 -05:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
The response returns the `meta` object in place:
|
2015-04-16 09:07:40 -04:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[source,console-result]
|
|
|
|
|
----
|
|
|
|
|
{
|
|
|
|
|
...
|
|
|
|
|
"aggregations": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"meta": {
|
|
|
|
|
"my-metadata-field": "foo"
|
|
|
|
|
},
|
|
|
|
|
"doc_count_error_upper_bound": 0,
|
|
|
|
|
"sum_other_doc_count": 0,
|
|
|
|
|
"buckets": []
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
|
2015-04-16 09:07:40 -04:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
[discrete]
|
|
|
|
|
[[return-agg-type]]
|
|
|
|
|
=== Return the aggregation type
|
|
|
|
|
|
|
|
|
|
By default, aggregation results include the aggregation's name but not its type.
|
|
|
|
|
To return the aggregation type, use the `typed_keys` query parameter.
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search?typed_keys
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"histogram": {
|
|
|
|
|
"field": "my-field",
|
|
|
|
|
"interval": 1000
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/typed_keys/typed_keys&size=0/]
|
|
|
|
|
// TEST[s/my-field/http.response.bytes/]
|
|
|
|
|
|
|
|
|
|
The response returns the aggregation type as a prefix to the aggregation's name.
|
|
|
|
|
|
|
|
|
|
IMPORTANT: Some aggregations return a different aggregation type from the
|
|
|
|
|
type in the request. For example, the terms,
|
|
|
|
|
<<search-aggregations-bucket-significantterms-aggregation,significant terms>>,
|
|
|
|
|
and <<search-aggregations-metrics-percentile-aggregation,percentiles>>
|
|
|
|
|
aggregations return different aggregations types depending on the data type of
|
|
|
|
|
the aggregated field.
|
|
|
|
|
|
|
|
|
|
[source,console-result]
|
|
|
|
|
----
|
|
|
|
|
{
|
|
|
|
|
...
|
|
|
|
|
"aggregations": {
|
|
|
|
|
"histogram#my-agg-name": { <1>
|
|
|
|
|
"buckets": []
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
|
|
|
|
|
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":1070000.0,"doc_count":5\}\]/]
|
|
|
|
|
|
|
|
|
|
<1> The aggregation type, `histogram`, followed by a `#` separator and the aggregation's name, `my-agg-name`.
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
|
[[use-scripts-in-an-agg]]
|
|
|
|
|
=== Use scripts in an aggregation
|
|
|
|
|
|
|
|
|
|
Some aggregations support <<modules-scripting,scripts>>. You can
|
|
|
|
|
use a `script` to extract or generate values for the aggregation:
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"histogram": {
|
|
|
|
|
"interval": 1000,
|
|
|
|
|
"script": {
|
|
|
|
|
"source": "doc['my-field'].value.length()"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.request.method/]
|
|
|
|
|
|
|
|
|
|
If you also specify a `field`, the `script` modifies the field values used in
|
|
|
|
|
the aggregation. The following aggregation uses a script to modify `my-field`
|
|
|
|
|
values:
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"histogram": {
|
|
|
|
|
"field": "my-field",
|
|
|
|
|
"interval": 1000,
|
|
|
|
|
"script": "_value / 1000"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.response.bytes/]
|
|
|
|
|
|
|
|
|
|
Some aggregations only work on specific data types. Use the `value_type`
|
|
|
|
|
parameter to specify a data type for a script-generated value or an unmapped
|
|
|
|
|
field. `value_type` accepts the following values:
|
|
|
|
|
|
|
|
|
|
* `boolean`
|
|
|
|
|
* `date`
|
|
|
|
|
* `double`, used for all floating-point numbers
|
|
|
|
|
* `long`, used for all integers
|
|
|
|
|
* `ip`
|
|
|
|
|
* `string`
|
|
|
|
|
|
|
|
|
|
[source,console]
|
|
|
|
|
----
|
|
|
|
|
GET /my-index-000001/_search
|
|
|
|
|
{
|
|
|
|
|
"aggs": {
|
|
|
|
|
"my-agg-name": {
|
|
|
|
|
"histogram": {
|
|
|
|
|
"field": "my-field",
|
|
|
|
|
"interval": 1000,
|
|
|
|
|
"script": "_value / 1000",
|
|
|
|
|
"value_type": "long"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
----
|
|
|
|
|
// TEST[setup:my_index]
|
|
|
|
|
// TEST[s/my-field/http.response.bytes/]
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
|
[[agg-caches]]
|
|
|
|
|
=== Aggregation caches
|
|
|
|
|
|
|
|
|
|
For faster responses, {es} caches the results of frequently run aggregations in
|
|
|
|
|
the <<shard-request-cache,shard request cache>>. To get cached results, use the
|
|
|
|
|
same <<shard-and-node-preference,`preference` string>> for each search. If you
|
|
|
|
|
don't need search hits, <<return-only-agg-results,set `size` to `0`>> to avoid
|
|
|
|
|
filling the cache.
|
2016-06-02 17:30:50 -04:00
|
|
|
|
|
2020-10-30 10:31:16 -04:00
|
|
|
|
{es} routes searches with the same preference string to the same shards. If the
|
|
|
|
|
shards' data doesn’t change between searches, the shards return cached
|
|
|
|
|
aggregation results.
|
|
|
|
|
|
|
|
|
|
[discrete]
|
|
|
|
|
[[limits-for-long-values]]
|
|
|
|
|
=== Limits for `long` values
|
|
|
|
|
|
|
|
|
|
When running aggregations, {es} uses <<number,`double`>> values to hold and
|
|
|
|
|
represent numeric data. As a result, aggregations on <<number,`long`>> numbers
|
|
|
|
|
greater than +2^53^+ are approximate.
|
|
|
|
|
--
|
|
|
|
|
|
|
|
|
|
include::aggregations/bucket.asciidoc[]
|
|
|
|
|
|
|
|
|
|
include::aggregations/metrics.asciidoc[]
|
|
|
|
|
|
|
|
|
|
include::aggregations/pipeline.asciidoc[]
|