OpenSearch/docs/reference/aggregations/metrics/stats-aggregation.asciidoc
Adrien Grand 32e23b9100 Aggs: Make it possible to configure missing values.
Most aggregations (terms, histogram, stats, percentiles, geohash-grid) now
support a new `missing` option which defines the value to consider when a
field does not have a value. This can be handy if you eg. want a terms
aggregation to handle the same way documents that have "N/A" or no value
for a `tag` field.

This works in a very similar way to the `missing` option on the `sort`
element.

One known issue is that this option sometimes cannot make the right decision
in the unmapped case: it needs to replace all values with the `missing` value
but might not know what kind of values source should be produced (numerics,
strings, geo points?). For this reason, we might want to add an `unmapped_type`
option in the future like we did for sorting.

Related to #5324
2015-05-15 16:26:58 +02:00

[[search-aggregations-metrics-stats-aggregation]]
=== Stats Aggregation
A `multi-value` metrics aggregation that computes stats over numeric values extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents, or generated by a provided script.

The stats that are returned consist of: `min`, `max`, `sum`, `count` and `avg`.

Assuming the data consists of documents representing students' exam grades (between 0 and 100):

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "grades_stats" : { "stats" : { "field" : "grade" } }
    }
}
--------------------------------------------------
The above aggregation computes grade statistics over all documents. The aggregation type is `stats`, and the `field` setting names the numeric field the stats are computed on. The request returns the following:

[source,js]
--------------------------------------------------
{
    ...
    "aggregations": {
        "grades_stats": {
            "count": 6,
            "min": 60,
            "max": 98,
            "avg": 78.5,
            "sum": 471
        }
    }
}
--------------------------------------------------
In this response, `avg` equals `sum` divided by `count` (471 / 6 = 78.5). The name of the aggregation (`grades_stats` above) also serves as the key by which the aggregation result can be retrieved from the returned response.

==== Script

Computing the grades stats based on a script:

[source,js]
--------------------------------------------------
{
    ...,
    "aggs" : {
        "grades_stats" : { "stats" : { "script" : "doc['grade'].value" } }
    }
}
--------------------------------------------------
TIP: The `script` parameter expects an inline script. Use `script_id` for indexed scripts and `script_file` for scripts in the `config/scripts/` directory.
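
As a sketch of the file-based variant mentioned in the tip, the request below swaps `script` for `script_file`. The script name `grade_stats_script` is a hypothetical placeholder for a file assumed to exist under `config/scripts/`; it is not a script that ships with the product.

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "grades_stats" : { "stats" : { "script_file" : "grade_stats_script" } } <1>
    }
}
--------------------------------------------------
<1> `grade_stats_script` is an assumed name used only for illustration. The file it refers to would need to return the same value as the inline script above, for example `doc['grade'].value`.
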
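
===== Value Script

It turned out that the exam was well above the level of the students, and a grade correction needs to be applied. We can use a value script to get the new stats. In a value script, `_value` holds each value read from the `field`, so the script below multiplies every grade by the `correction` parameter:

[source,js]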
--------------------------------------------------
{
    "aggs" : {
        ...
        "aggs" : {
            "grades_stats" : {
                "stats" : {
                    "field" : "grade",
                    "script" : "_value * correction",
                    "params" : {
                        "correction" : 1.2
                    }
                }
            }
        }
    }
}
--------------------------------------------------
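
To make the effect of the value script concrete, the sketch below shows roughly what the response would look like if it ran over the same six grades as the first example. These figures are not taken from a real run. They are derived by hand from the earlier response: `min`, `max`, `sum` and `avg` are each multiplied by 1.2, while `count` stays at 6. Actual output may differ slightly because of floating-point rounding.

[source,js]
--------------------------------------------------
{
    ...
    "aggregations": {
        "grades_stats": {
            "count": 6,
            "min": 72.0,
            "max": 117.6,
            "avg": 94.2,
            "sum": 565.2
        }
    }
}
--------------------------------------------------

Note that the corrected maximum exceeds 100, because the script scales every value and applies no cap.
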
==== Missing value

The `missing` parameter defines how documents that are missing a value should be treated. By default they are ignored, but it is also possible to treat them as if they had a value.

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "grades_stats" : {
            "stats" : {
                "field" : "grade",
                "missing": 0 <1>
            }
        }
    }
}
--------------------------------------------------
<1> Documents without a value in the `grade` field are included in the stats as if their `grade` were `0`.
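
As a worked illustration, assume (hypothetically) that the index holds two more documents with no `grade` value, in addition to the six documents behind the first example response. With `"missing": 0`, both are counted as grades of 0, so the response would be expected to look roughly like this. The figures are derived by hand, not captured from a real cluster.

[source,js]
--------------------------------------------------
{
    ...
    "aggregations": {
        "grades_stats": {
            "count": 8, <1>
            "min": 0, <2>
            "max": 98,
            "avg": 58.875, <3>
            "sum": 471
        }
    }
}
--------------------------------------------------
<1> The six graded documents plus the two assumed documents without a `grade`.
<2> The substituted value `0` becomes the new minimum.
<3> `sum` is unchanged at 471, so `avg` becomes 471 / 8.

Without the `missing` parameter, those two documents would be ignored and the result would match the first example.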