[[search-aggregations-reducer-derivative-aggregation]]
=== Derivative Aggregation
A parent reducer aggregation that calculates the derivative of a specified metric in a parent `histogram` (or `date_histogram`)
aggregation. The specified metric must be numeric, and the enclosing histogram must have `min_doc_count` set to `0` (the default
for `histogram` aggregations).
==== Syntax
A `derivative` aggregation looks like this in isolation:
[source,js]
--------------------------------------------------
{
    "derivative": {
        "buckets_path": "the_sum"
    }
}
--------------------------------------------------
.`derivative` Parameters
|===
|Parameter Name |Description |Required |Default Value
|`buckets_path` |Path to the metric of interest (see <<bucket-path-syntax, `buckets_path` Syntax>> for more details) |Required |
|===
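
As a minimal sketch, the fragment shown in the syntax section could be assembled in Python before being embedded in a search request. The helper name `derivative_agg` is hypothetical and is not part of any client library; only the `buckets_path` parameter comes from the table above.

```python
import json


def derivative_agg(buckets_path: str) -> dict:
    """Build the body of a `derivative` aggregation (hypothetical helper)."""
    # `buckets_path` is the only parameter listed as required.
    if not buckets_path:
        raise ValueError("buckets_path is required")
    return {"derivative": {"buckets_path": buckets_path}}


# Prints the same JSON structure as the syntax example.
print(json.dumps(derivative_agg("the_sum"), indent=2))
```

The returned dictionary is meant to be placed under a named entry in the `aggs` section of the enclosing histogram.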
==== First Order Derivative
The following snippet calculates the derivative of the total monthly `sales`:
[source,js]
--------------------------------------------------
{
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_deriv": {
                    "derivative": {
                        "buckets_path": "sales" <1>
                    }
                }
            }
        }
    }
}
--------------------------------------------------
<1> `buckets_path` instructs this derivative aggregation to use the output of the `sales` aggregation for the derivative.
And the following may be the response:
[source,js]
--------------------------------------------------
{
    "aggregations": {
        "sales_per_month": {
            "buckets": [
                {
                    "key_as_string": "2015/01/01 00:00:00",
                    "key": 1420070400000,
                    "doc_count": 3,
                    "sales": {
                        "value": 550
                    } <1>
                },
                {
                    "key_as_string": "2015/02/01 00:00:00",
                    "key": 1422748800000,
                    "doc_count": 2,
                    "sales": {
                        "value": 60
                    },
                    "sales_deriv": {
                        "value": -490 <2>
                    }
                },
                {
                    "key_as_string": "2015/03/01 00:00:00",
                    "key": 1425168000000,
                    "doc_count": 2, <3>
                    "sales": {
                        "value": 375
                    },
                    "sales_deriv": {
                        "value": 315
                    }
                }
            ]
        }
    }
}
--------------------------------------------------
<1> No derivative for the first bucket, since we need at least 2 data points to calculate the derivative.
<2> Derivative value units are implicitly defined by the `sales` aggregation and the parent histogram, so in this case the units
would be $/month, assuming the `price` field has units of $.
<3> The number of documents in the bucket is represented by the `doc_count` field.
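
The values in the example response are consistent with a simple difference between consecutive buckets. The following Python sketch reproduces them under that assumption; the function name `derivative` and the use of `None` for a missing value are illustrative choices, not the aggregation's actual implementation.

```python
from typing import List, Optional


def derivative(values: List[Optional[float]]) -> List[Optional[float]]:
    """Difference between each bucket value and the previous one.

    Assumption: a bucket with no usable predecessor gets None,
    mirroring the absent `sales_deriv` key in the first bucket.
    """
    if not values:
        return []
    result: List[Optional[float]] = [None]  # first bucket has no predecessor
    for prev, curr in zip(values, values[1:]):
        if prev is None or curr is None:
            result.append(None)
        else:
            result.append(curr - prev)
    return result


# Monthly `sales` sums taken from the example response.
sales = [550, 60, 375]
sales_deriv = derivative(sales)
print(sales_deriv)  # [None, -490, 315]
```

Because the histogram interval is one month, each difference reads directly as a change per month.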
==== Second Order Derivative
A second order derivative can be calculated by chaining one derivative reducer aggregation onto the result of another derivative
reducer aggregation. The following example calculates both the first and the second order derivative of the total
monthly sales:
[source,js]
--------------------------------------------------
{
    "aggs" : {
        "sales_per_month" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "month"
            },
            "aggs": {
                "sales": {
                    "sum": {
                        "field": "price"
                    }
                },
                "sales_deriv": {
                    "derivative": {
                        "buckets_path": "sales"
                    }
                },
                "sales_2nd_deriv": {
                    "derivative": {
                        "buckets_path": "sales_deriv" <1>
                    }
                }
            }
        }
    }
}
--------------------------------------------------
<1> `buckets_path` for the second derivative points to the name of the first derivative.
And the following may be the response:
[source,js]
--------------------------------------------------
{
    "aggregations": {
        "sales_per_month": {
            "buckets": [
                {
                    "key_as_string": "2015/01/01 00:00:00",
                    "key": 1420070400000,
                    "doc_count": 3,
                    "sales": {
                        "value": 550
                    } <1>
                },
                {
                    "key_as_string": "2015/02/01 00:00:00",
                    "key": 1422748800000,
                    "doc_count": 2,
                    "sales": {
                        "value": 60
                    },
                    "sales_deriv": {
                        "value": -490
                    } <1>
                },
                {
                    "key_as_string": "2015/03/01 00:00:00",
                    "key": 1425168000000,
                    "doc_count": 2,
                    "sales": {
                        "value": 375
                    },
                    "sales_deriv": {
                        "value": 315
                    },
                    "sales_2nd_deriv": {
                        "value": 805
                    }
                }
            ]
        }
    }
}
--------------------------------------------------
<1> No second derivative for the first two buckets, since we need at least 2 data points from the first derivative to calculate the
second derivative.
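
The chaining described in this section can be sketched in Python by applying the same difference step twice. This assumes the derivative is a plain difference between consecutive buckets and that a missing input yields a missing output; the function name `derivative` is illustrative and does not reflect the aggregation's internal code.

```python
from typing import List, Optional


def derivative(values: List[Optional[float]]) -> List[Optional[float]]:
    """Difference between each bucket value and the previous one.

    Assumption: None marks a bucket with no value, and any difference
    involving a None is itself None.
    """
    if not values:
        return []
    result: List[Optional[float]] = [None]  # first bucket has no predecessor
    for prev, curr in zip(values, values[1:]):
        if prev is None or curr is None:
            result.append(None)
        else:
            result.append(curr - prev)
    return result


# Monthly `sales` sums taken from the example response.
sales = [550, 60, 375]
sales_deriv = derivative(sales)            # first order
sales_2nd_deriv = derivative(sales_deriv)  # second order, chained on the first

print(sales_deriv)      # [None, -490, 315]
print(sales_2nd_deriv)  # [None, None, 805]
```

The second derivative only appears in the third bucket because 315 - (-490) = 805 is the first difference for which both first-derivative inputs exist.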