[[search-aggregations-metrics-sum-aggregation]]
=== Sum Aggregation

A `single-value` metrics aggregation that sums the numeric values extracted from the aggregated documents.
These values can be extracted either from specific numeric or <<histogram,histogram>> fields in the documents,
or be generated by a provided script.

Assuming the data consists of documents representing sales records, we can sum
the sale price of all hats with:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
  "query": {
    "constant_score": {
      "filter": {
        "match": { "type": "hat" }
      }
    }
  },
  "aggs": {
    "hat_prices": { "sum": { "field": "price" } }
  }
}
--------------------------------------------------
// TEST[setup:sales]
Resulting in:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "hat_prices": {
      "value": 450.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
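As a rough mental model of what the request computes, the following Python sketch filters a list of documents and sums one field. The sample documents and the `sum_field` helper are hypothetical and not part of Elasticsearch; only the `type` and `price` field names come from the example above, and the prices were chosen so that the total matches the response shown.

```python
# Hypothetical sample documents; only the field names mirror the example.
sales = [
    {"type": "hat", "price": 200},
    {"type": "hat", "price": 200},
    {"type": "t-shirt", "price": 150},
    {"type": "hat", "price": 50},
]


def sum_field(docs, field):
    """Sum a numeric field, skipping documents that do not have it."""
    return float(sum(doc[field] for doc in docs if field in doc))


# Rough equivalent of the `match` filter on `type`, followed by the `sum` metric.
hats = [doc for doc in sales if doc["type"] == "hat"]
hat_prices = sum_field(hats, "price")
```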
The name of the aggregation (`hat_prices` above) also serves as the key by which the aggregation result can be retrieved from the returned response.

==== Script

We could also use a script to fetch the sales price:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
  "query": {
    "constant_score": {
      "filter": {
        "match": { "type": "hat" }
      }
    }
  },
  "aggs": {
    "hat_prices": {
      "sum": {
        "script": {
          "source": "doc.price.value"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]
This interprets the `script` parameter as an `inline` script written in the `painless` script language, with no script parameters. To use a stored script, use the following syntax:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
  "query": {
    "constant_score": {
      "filter": {
        "match": { "type": "hat" }
      }
    }
  },
  "aggs": {
    "hat_prices": {
      "sum": {
        "script": {
          "id": "my_script",
          "params": {
            "field": "price"
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales,stored_example_script]
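The request above passes `params` to the stored script but does not show the script itself. One plausible reading, assumed here purely for illustration, is that `my_script` looks up the field named by `params.field` on each document. The Python function below is a hypothetical stand-in for such a script, and the documents are made up.

```python
# Hypothetical stand-in for the stored script. The assumed behavior is
# "return the value of the field named in params"; the source does not
# show the real script body.
def my_script(doc, params):
    return doc[params["field"]]


docs = [{"price": 200}, {"price": 50}]  # hypothetical documents
params = {"field": "price"}             # same params as in the request

# The `sum` metric adds up whatever the script returns for each document.
total = float(sum(my_script(doc, params) for doc in docs))
```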
===== Value Script

It is also possible to access the field value from the script using `_value`.
For example, this will sum the square of the prices for all hats:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
  "query": {
    "constant_score": {
      "filter": {
        "match": { "type": "hat" }
      }
    }
  },
  "aggs": {
    "square_hats": {
      "sum": {
        "field": "price",
        "script": {
          "source": "_value * _value"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]
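In other words, the script is applied to each field value before the values are added together. A minimal Python sketch of that arithmetic, using hypothetical hat prices, could look like this:

```python
# Hypothetical hat prices; `value` plays the role of `_value` in the script.
prices = [200, 200, 50]

# Apply "_value * _value" to each value, then sum the results.
square_hats = float(sum(value * value for value in prices))
```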
==== Missing value

The `missing` parameter defines how documents that are missing a value should
be treated. By default, documents missing the value are ignored, but it is
also possible to treat them as if they had a value. For example, this treats
all hat sales without a price as having a price of `100`.

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
  "query": {
    "constant_score": {
      "filter": {
        "match": { "type": "hat" }
      }
    }
  },
  "aggs": {
    "hat_prices": {
      "sum": {
        "field": "price",
        "missing": 100
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]
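The difference between the default behavior and a `missing` substitute can be sketched in Python as follows. The `sum_with_missing` helper and the sample documents are hypothetical and only mimic the behavior described above.

```python
def sum_with_missing(docs, field, missing=None):
    """Sum `field` over docs; substitute `missing` when the field is absent."""
    total = 0.0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            value = missing
        if value is None:
            continue  # default behavior: ignore documents without a value
        total += value
    return total


# Hypothetical hat sales; the second document has no price.
hats = [{"price": 200}, {}, {"price": 50}]

ignored = sum_with_missing(hats, "price")                   # no substitute
substituted = sum_with_missing(hats, "price", missing=100)  # treat as 100
```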
[[search-aggregations-metrics-sum-aggregation-histogram-fields]]
==== Histogram fields

When `sum` is computed on <<histogram,histogram fields>>, each element in the `values` array is multiplied by the number
in the same position in the `counts` array, and the result of the aggregation is the sum of these products.

For example, for the following index that stores pre-aggregated histograms with latency metrics for different networks:

[source,console]
--------------------------------------------------
PUT metrics_index/_doc/1
{
  "network.name" : "net-1",
  "latency_histo" : {
      "values" : [0.1, 0.2, 0.3, 0.4, 0.5], <1>
      "counts" : [3, 7, 23, 12, 6] <2>
   }
}

PUT metrics_index/_doc/2
{
  "network.name" : "net-2",
  "latency_histo" : {
      "values" : [0.1, 0.2, 0.3, 0.4, 0.5], <1>
      "counts" : [8, 17, 8, 7, 6] <2>
   }
}

POST /metrics_index/_search?size=0
{
  "aggs" : {
    "total_latency" : { "sum" : { "field" : "latency_histo" } }
  }
}
--------------------------------------------------
For each histogram field, the `sum` aggregation multiplies each number in the `values` array <1> by its associated count
in the `counts` array <2>. Finally, it adds up the products across all histograms and returns the following result:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "total_latency": {
      "value": 28.8
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:test not setup]
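The arithmetic behind this result can be checked by hand. The Python sketch below reuses the `values` and `counts` arrays from the two documents above; the `histogram_sum` helper is a hypothetical name used only for illustration.

```python
# The two pre-aggregated histograms from the example documents.
histograms = [
    {"values": [0.1, 0.2, 0.3, 0.4, 0.5], "counts": [3, 7, 23, 12, 6]},  # net-1
    {"values": [0.1, 0.2, 0.3, 0.4, 0.5], "counts": [8, 17, 8, 7, 6]},   # net-2
]


def histogram_sum(histo):
    """Multiply each value by the count at the same position and add the products."""
    return sum(v * c for v, c in zip(histo["values"], histo["counts"]))


# Per-document sums: roughly 16.4 for net-1 and 12.4 for net-2.
per_doc = [histogram_sum(h) for h in histograms]

# Adding both gives roughly 28.8, up to floating-point rounding.
total_latency = sum(per_doc)
```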