[[search-aggregations-metrics-weight-avg-aggregation]]
=== Weighted Avg Aggregation

A `single-value` metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents.
These values can be extracted from specific numeric fields in the documents or provided by a script.

When calculating a regular average, each datapoint has an equal "weight": it contributes equally to the final value. Weighted averages,
on the other hand, weight each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the
document, or provided by a script.

As a formula, a weighted average is `∑(value * weight) / ∑(weight)`.

A regular average can be thought of as a weighted average where every value has an implicit weight of `1`.

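As a rough illustration of the formula (not Elasticsearch's implementation), the following Python sketch computes a weighted average from `(value, weight)` pairs and shows that a regular average is the special case where every weight is `1`. The function name `weighted_avg` and the sample numbers are assumptions made for this example.

```python
def weighted_avg(pairs):
    """Return sum(value * weight) / sum(weight) for (value, weight) pairs."""
    total_weight = sum(weight for _, weight in pairs)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return sum(value * weight for value, weight in pairs) / total_weight


# Hypothetical sample data: two grades with different weights.
grades = [(90, 3), (50, 1)]

weighted = weighted_avg(grades)                      # (90*3 + 50*1) / (3 + 1)
regular = weighted_avg([(v, 1) for v, _ in grades])  # every weight set to 1

print(weighted, regular)
```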

[[weighted-avg-params]]
.`weighted_avg` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`value` | The configuration for the field or script that provides the values |Required |
|`weight` | The configuration for the field or script that provides the weights |Required |
|`format` | The numeric response formatter |Optional |
|`value_type` | A hint about the values for pure scripts or unmapped fields |Optional |
|===

The `value` and `weight` objects each have their own configuration:

[[value-params]]
.`value` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`field` | The field that values should be extracted from |Required |
|`missing` | A value to use if the field is missing entirely |Optional |
|`script` | A script which provides the values for the document. This is mutually exclusive with `field` |Optional |
|===

[[weight-params]]
.`weight` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`field` | The field that weights should be extracted from |Required |
|`missing` | A weight to use if the field is missing entirely |Optional |
|`script` | A script which provides the weights for the document. This is mutually exclusive with `field` |Optional |
|===

==== Examples

If our documents have a `"grade"` field that holds a 0-100 numeric score, and a `"weight"` field which holds an arbitrary numeric weight,
we can calculate the weighted average using:

[source,console]
--------------------------------------------------
POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade"
        },
        "weight": {
          "field": "weight"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

Which yields a response like:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "weighted_grade": {
      "value": 70.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

While multiple values per field are allowed, only one weight is allowed. If the aggregation encounters
a document that has more than one weight (e.g. the weight field is a multi-valued field), it throws an exception.
In that case, specify a `script` for the weight field and use the script
to combine the multiple values into a single weight.

This single weight will be applied independently to each value extracted from the `value` field.

This example shows how a single document with multiple values will be averaged with a single weight:

[source,console]
--------------------------------------------------
POST /exams/_doc?refresh
{
  "grade": [1, 2, 3],
  "weight": 2
}

POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade"
        },
        "weight": {
          "field": "weight"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST

The three values (`1`, `2`, and `3`) will be included as independent values, all with the weight of `2`:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "weighted_grade": {
      "value": 2.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

The aggregation returns `2.0` as the result, which matches what we would expect when calculating by hand:
`((1*2) + (2*2) + (3*2)) / (2+2+2) == 2`

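The same hand calculation can be sketched in Python. This is only a model of the documented behavior, not the aggregation's actual code: the helper `doc_pairs` is a hypothetical name, and it assumes the document carries a weight (the handling of absent weights is left out here).

```python
def doc_pairs(values, weights):
    """Pair every value of one document with that document's single weight."""
    if len(weights) > 1:
        # Models the documented rule: more than one weight is an error.
        raise ValueError("a document may only have one weight")
    # Assumes the weight is present, so weights[0] exists.
    return [(value, weights[0]) for value in values]


# The document indexed in the example: "grade": [1, 2, 3], "weight": 2
pairs = doc_pairs([1, 2, 3], [2])

result = sum(v * w for v, w in pairs) / sum(w for _, w in pairs)
print(result)  # 2.0
```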

==== Script

Both the value and the weight can be derived from a script, instead of a field. As a simple example, the following
adds one to both the grade and the weight of each document using a script:

[source,console]
--------------------------------------------------
POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "script": "doc.grade.value + 1"
        },
        "weight": {
          "script": "doc.weight.value + 1"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

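To see what the two scripts change, here is a small Python model of the request above. The documents are hypothetical (the contents of the `exams` index are not shown in this section), and `value_script` and `weight_script` are stand-in names for the two script expressions.

```python
# Hypothetical documents; not the real contents of the `exams` index.
docs = [{"grade": 89, "weight": 2}, {"grade": 49, "weight": 0}]


def value_script(doc):
    """Stand-in for the script `doc.grade.value + 1`."""
    return doc["grade"] + 1


def weight_script(doc):
    """Stand-in for the script `doc.weight.value + 1`."""
    return doc["weight"] + 1


# Each document contributes its scripted value times its scripted weight.
numerator = sum(value_script(d) * weight_script(d) for d in docs)
denominator = sum(weight_script(d) for d in docs)

scripted_avg = numerator / denominator  # (90*3 + 50*1) / (3 + 1)
print(scripted_avg)
```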
==== Missing values

The `missing` parameter defines how documents that are missing a value should be treated.
The default behavior is different for `value` and `weight`:

By default, if the `value` field is missing, the document is ignored and the aggregation moves on to the next document.
If the `weight` field is missing, the document is assumed to have a weight of `1` (like a normal average).

Both of these defaults can be overridden with the `missing` parameter:

[source,console]
--------------------------------------------------
POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade",
          "missing": 2
        },
        "weight": {
          "field": "weight",
          "missing": 3
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

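The default and overridden behaviors described above can be modeled with a short Python sketch. It is an illustration under stated assumptions rather than the aggregation's implementation: the function name `weighted_avg_with_missing`, its parameters, and the sample documents are all hypothetical, and it assumes at least one document contributes a non-zero weight.

```python
def weighted_avg_with_missing(docs, value_missing=None, weight_missing=1):
    """Weighted average over dicts that may lack "grade" or "weight"."""
    numerator = 0.0
    denominator = 0.0
    for doc in docs:
        value = doc.get("grade", value_missing)
        if value is None:
            # No value and no `missing` override: the document is skipped.
            continue
        # A missing weight falls back to 1 unless overridden.
        weight = doc.get("weight", weight_missing)
        numerator += value * weight
        denominator += weight
    return numerator / denominator


# Hypothetical documents: one complete, one without a grade, one without a weight.
docs = [{"grade": 80, "weight": 2}, {"weight": 5}, {"grade": 50}]

defaults = weighted_avg_with_missing(docs)          # (80*2 + 50*1) / (2 + 1)
overridden = weighted_avg_with_missing(docs, 2, 3)  # (80*2 + 2*5 + 50*3) / (2 + 5 + 3)
print(defaults, overridden)
```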