mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-06 13:08:29 +00:00
Backports #55933 to 7.x Implements value_count and avg aggregations over Histogram fields as discussed in #53285 - value_count returns the sum of all counts array of the histograms - avg computes a weighted average of the values array of the histogram by multiplying each value with its associated element in the counts array
[[search-aggregations-metrics-sum-aggregation]]
=== Sum Aggregation

A `single-value` metrics aggregation that sums up numeric values that are extracted from the aggregated documents.
These values can either be extracted from specific numeric or <<histogram,histogram>> fields in the documents,
or be generated by a provided script.

Assuming the data consists of documents representing sales records, we can sum
the sale price of all hats with:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "match" : { "type" : "hat" }
            }
        }
    },
    "aggs" : {
        "hat_prices" : { "sum" : { "field" : "price" } }
    }
}
--------------------------------------------------
// TEST[setup:sales]

Resulting in:

[source,console-result]
--------------------------------------------------
{
    ...
    "aggregations" : {
        "hat_prices" : {
            "value" : 450.0
        }
    }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

The name of the aggregation (`hat_prices` above) also serves as the key by which the aggregation result can be retrieved from the returned response.

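As a minimal sketch of that lookup on the client side, the Python below parses a response shaped like the one above and reads the result by aggregation name. The JSON string is a hand-written stand-in for a real response (the `took` value is made up), and no particular client library is assumed.

```python
import json

# Hand-written stand-in for the search response shown above.
# The "took" value is made up for illustration.
raw_response = """
{
  "took": 3,
  "timed_out": false,
  "aggregations": {
    "hat_prices": { "value": 450.0 }
  }
}
"""

response = json.loads(raw_response)

# The aggregation name used in the request ("hat_prices") is the lookup key.
hat_total = response["aggregations"]["hat_prices"]["value"]
print(hat_total)  # 450.0
```
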
==== Script

We could also use a script to fetch the sales price:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "match" : { "type" : "hat" }
            }
        }
    },
    "aggs" : {
        "hat_prices" : {
            "sum" : {
                "script" : {
                    "source": "doc.price.value"
                }
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:sales]

This will interpret the `script` parameter as an `inline` script with the `painless` script language and no script parameters. To use a stored script use the following syntax:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "match" : { "type" : "hat" }
            }
        }
    },
    "aggs" : {
        "hat_prices" : {
            "sum" : {
                "script" : {
                    "id": "my_script",
                    "params" : {
                        "field" : "price"
                    }
                }
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:sales,stored_example_script]

===== Value Script

It is also possible to access the field value from the script using `_value`.
For example, this will sum the square of the prices for all hats:

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "match" : { "type" : "hat" }
            }
        }
    },
    "aggs" : {
        "square_hats" : {
            "sum" : {
                "field" : "price",
                "script" : {
                    "source": "_value * _value"
                }
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:sales]

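To make the arithmetic concrete, here is a rough Python equivalent of what the value script computes. It is only an illustration of the calculation, not how the aggregation is implemented: the prices are assumed sample data (chosen to add up to the `450.0` shown earlier), and the helper name `sum_with_value_script` is hypothetical.

```python
from typing import Callable, Iterable


def sum_with_value_script(values: Iterable[float],
                          script: Callable[[float], float]) -> float:
    """Apply `script` to each field value (the role of `_value`) and sum the results."""
    return sum(script(v) for v in values)


# Assumed sample hat prices, not taken from the real `sales` test data.
hat_prices = [200.0, 200.0, 50.0]

# Mirrors the Painless source "_value * _value".
square_hats = sum_with_value_script(hat_prices, lambda value: value * value)
print(square_hats)  # 82500.0
```
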
==== Missing value

The `missing` parameter defines how documents that are missing a value should
be treated. By default, documents missing the value are ignored, but it is
also possible to treat them as if they had a value. For example, this treats
all hat sales without a price as being `100`.

[source,console]
--------------------------------------------------
POST /sales/_search?size=0
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "match" : { "type" : "hat" }
            }
        }
    },
    "aggs" : {
        "hat_prices" : {
            "sum" : {
                "field" : "price",
                "missing": 100 <1>
            }
        }
    }
}
--------------------------------------------------
// TEST[setup:sales]

<1> Documents without a value in the `price` field are summed as if their price were `100`.

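The effect of `missing` can be sketched in Python as follows. This is an illustration under assumed sample documents, not the aggregation's actual implementation, and the helper name `sum_prices` is hypothetical.

```python
from typing import Iterable, Optional


def sum_prices(docs: Iterable[dict], field: str,
               missing: Optional[float] = None) -> float:
    """Sum `field` over docs; docs without a value are skipped unless `missing` is set."""
    total = 0.0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            # No value in this document: fall back to `missing` (may also be None).
            value = missing
        if value is not None:
            total += value
    return total


# Assumed sample documents; the last hat has no price.
hats = [{"price": 200}, {"price": 50}, {"type": "hat"}]

print(sum_prices(hats, "price"))               # 250.0
print(sum_prices(hats, "price", missing=100))  # 350.0
```
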
[[search-aggregations-metrics-sum-aggregation-histogram-fields]]
==== Histogram fields

When `sum` is computed on <<histogram,histogram fields>>, the result of the aggregation is the sum of each element in the `values`
array multiplied by the number in the same position in the `counts` array.

For example, for the following index that stores pre-aggregated histograms with latency metrics for different networks:

[source,console]
--------------------------------------------------
PUT metrics_index/_doc/1
{
  "network.name" : "net-1",
  "latency_histo" : {
      "values" : [0.1, 0.2, 0.3, 0.4, 0.5], <1>
      "counts" : [3, 7, 23, 12, 6] <2>
   }
}

PUT metrics_index/_doc/2
{
  "network.name" : "net-2",
  "latency_histo" : {
      "values" : [0.1, 0.2, 0.3, 0.4, 0.5], <1>
      "counts" : [8, 17, 8, 7, 6] <2>
   }
}

POST /metrics_index/_search?size=0
{
    "aggs" : {
        "total_latency" : { "sum" : { "field" : "latency_histo" } }
    }
}
--------------------------------------------------

For each histogram field, the `sum` aggregation multiplies each number in the `values` array <1> by its associated count
in the `counts` array <2>. It then adds up these products across all histograms and returns the following result:

[source,console-result]
--------------------------------------------------
{
    ...
    "aggregations" : {
        "total_latency" : {
            "value" : 28.8
        }
    }
}
--------------------------------------------------
// TESTRESPONSE[skip:test not setup]

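The `28.8` above can be reproduced by hand. The Python sketch below repeats the calculation for the two example documents; it illustrates the arithmetic only and is not the aggregation's implementation (the helper name `histogram_sum` is hypothetical).

```python
# The two histograms indexed above, copied from the example documents.
histograms = [
    {"values": [0.1, 0.2, 0.3, 0.4, 0.5], "counts": [3, 7, 23, 12, 6]},
    {"values": [0.1, 0.2, 0.3, 0.4, 0.5], "counts": [8, 17, 8, 7, 6]},
]


def histogram_sum(histo: dict) -> float:
    """Sum of value * count over matching positions of one histogram."""
    return sum(v * c for v, c in zip(histo["values"], histo["counts"]))


per_doc = [histogram_sum(h) for h in histograms]
total_latency = sum(per_doc)

# Rounded for display, since binary floating point may not print exactly 28.8.
print([round(s, 1) for s in per_doc])  # [16.4, 12.4]
print(round(total_latency, 1))         # 28.8
```

The first document contributes 0.3 + 1.4 + 6.9 + 4.8 + 3.0 = 16.4 and the second 0.8 + 3.4 + 2.4 + 2.8 + 3.0 = 12.4, which together give 28.8.
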