[[search-aggregations-metrics-scripted-metric-aggregation]]
=== Scripted Metric Aggregation
A metric aggregation that executes using scripts to provide a metric output.

WARNING: Using scripts can result in slower search speeds. See
<<scripts-and-search-speed>>.

Example:

[source,console]
--------------------------------------------------
POST ledger/_search?size=0
{
  "query" : {
    "match_all" : {}
  },
  "aggs": {
    "profit": {
      "scripted_metric": {
        "init_script" : "state.transactions = []", <1>
        "map_script" : "state.transactions.add(doc.type.value == 'sale' ? doc.amount.value : -1 * doc.amount.value)",
        "combine_script" : "double profit = 0; for (t in state.transactions) { profit += t } return profit",
        "reduce_script" : "double profit = 0; for (a in states) { profit += a } return profit"
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:ledger]

<1> `init_script` is an optional parameter; all other scripts are required.

The above aggregation demonstrates how to use the scripted metric aggregation to compute the total profit from sale and cost transactions.

The response for the above aggregation:

[source,console-result]
--------------------------------------------------
{
  "took": 218,
  ...
  "aggregations": {
    "profit": {
      "value": 240.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 218/"took": $body.took/]
// TESTRESPONSE[s/\.\.\./"_shards": $body._shards, "hits": $body.hits, "timed_out": false,/]
The above example can also be specified using stored scripts as follows:

[source,console]
--------------------------------------------------
POST ledger/_search?size=0
{
  "aggs": {
    "profit": {
      "scripted_metric": {
        "init_script" : {
          "id": "my_init_script"
        },
        "map_script" : {
          "id": "my_map_script"
        },
        "combine_script" : {
          "id": "my_combine_script"
        },
        "params": {
          "field": "amount" <1>
        },
        "reduce_script" : {
          "id": "my_reduce_script"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:ledger,stored_scripted_metric_script]

<1> Script parameters for the `init`, `map` and `combine` scripts must be specified
in a global `params` object so that they can be shared between the scripts.

////
Verify this response as well but in a hidden block.

[source,console-result]
--------------------------------------------------
{
  "took": 218,
  ...
  "aggregations": {
    "profit": {
      "value": 240.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 218/"took": $body.took/]
// TESTRESPONSE[s/\.\.\./"_shards": $body._shards, "hits": $body.hits, "timed_out": false,/]
////
For more details on specifying scripts see <<modules-scripting, script documentation>>.

[[scripted-metric-aggregation-return-types]]
==== Allowed return types

Whilst any valid script object can be used within a single script, the scripts must return or store in the `state` object only the following types:

* primitive types
* String
* Map (containing only keys and values of the types listed here)
* Array (containing elements of only the types listed here)

[[scripted-metric-aggregation-scope]]
==== Scope of scripts
The scripted metric aggregation uses scripts at 4 stages of its execution:

init_script:: Executed prior to any collection of documents. Allows the aggregation to set up any initial state.
+
In the above example, the `init_script` creates an array `transactions` in the `state` object.

map_script:: Executed once per document collected. This is a required script. The resulting state
needs to be stored in the `state` object so that the `combine_script` can access it.
+
In the above example, the `map_script` checks the value of the type field. If the value is 'sale', the value of the amount field
is added to the transactions array. If the value of the type field is not 'sale', the negated value of the amount field is added
to transactions.

combine_script:: Executed once on each shard after document collection is complete. This is a required script. Allows the aggregation to
consolidate the state returned from each shard.
+
In the above example, the `combine_script` iterates through all the stored transactions, summing the values in the `profit` variable,
and finally returns `profit`.

reduce_script:: Executed once on the coordinating node after all shards have returned their results. This is a required script. The
script is provided with access to a variable `states`, which is an array of the result of the combine_script on each
shard.
+
In the above example, the `reduce_script` iterates through the `profit` returned by each shard, summing the values before returning the
final combined profit, which is returned in the response of the aggregation.

[[scripted-metric-aggregation-example]]
==== Worked example
Imagine a situation where you index the following documents into an index with 2 shards:

[source,console]
--------------------------------------------------
PUT /transactions/_bulk?refresh
{"index":{"_id":1}}
{"type": "sale","amount": 80}
{"index":{"_id":2}}
{"type": "cost","amount": 10}
{"index":{"_id":3}}
{"type": "cost","amount": 30}
{"index":{"_id":4}}
{"type": "sale","amount": 130}
--------------------------------------------------

Let's say that documents 1 and 3 end up on shard A and documents 2 and 4 end up on shard B. The following is a breakdown of what the aggregation result is
at each stage of the example above.

===== Before init_script

`state` is initialized as a new empty object.

[source,js]
--------------------------------------------------
"state" : {}
--------------------------------------------------
// NOTCONSOLE

===== After init_script

This is run once on each shard before any document collection is performed, and so we will have a copy on each shard:

Shard A::
+
[source,js]
--------------------------------------------------
"state" : {
    "transactions" : []
}
--------------------------------------------------
// NOTCONSOLE

Shard B::
+
[source,js]
--------------------------------------------------
"state" : {
    "transactions" : []
}
--------------------------------------------------
// NOTCONSOLE

===== After map_script

Each shard collects its documents and runs the map_script on each document that is collected:

Shard A::
+
[source,js]
--------------------------------------------------
"state" : {
    "transactions" : [ 80, -30 ]
}
--------------------------------------------------
// NOTCONSOLE

Shard B::
+
[source,js]
--------------------------------------------------
"state" : {
    "transactions" : [ -10, 130 ]
}
--------------------------------------------------
// NOTCONSOLE

===== After combine_script

The combine_script is executed on each shard after document collection is complete and reduces all the transactions down to a single profit figure for each
shard (by summing the values in the transactions array), which is passed back to the coordinating node:

Shard A:: 50
Shard B:: 120

===== After reduce_script

The reduce_script receives a `states` array containing the result of the combine script for each shard:

[source,js]
--------------------------------------------------
"states" : [
    50,
    120
]
--------------------------------------------------
// NOTCONSOLE

It reduces the responses for the shards down to a final overall profit figure (by summing the values) and returns this as the result of the aggregation to
produce the response:

[source,js]
--------------------------------------------------
{
    ...
    "aggregations": {
        "profit": {
            "value": 170
        }
    }
}
--------------------------------------------------
// NOTCONSOLE
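
The stages above can be traced outside Elasticsearch with a small simulation. The following is only a sketch in Python, not how Elasticsearch runs the aggregation: the four functions are hypothetical stand-ins for the Painless scripts of the example, and `run_shard` is an assumed helper that plays the role of a single shard.

```python
# Hypothetical Python stand-ins for the four Painless scripts in the example.

def init_script(state):
    # Runs once per shard, before any document is collected.
    state["transactions"] = []

def map_script(state, doc):
    # Runs once per collected document: sales count as positive, costs as negative.
    amount = doc["amount"]
    state["transactions"].append(amount if doc["type"] == "sale" else -amount)

def combine_script(state):
    # Runs once per shard, after collection: one profit figure per shard.
    return sum(state["transactions"])

def reduce_script(states):
    # Runs once on the coordinating node over the per-shard results.
    return sum(states)

def run_shard(docs):
    # Assumed helper: executes init, map and combine for one shard's documents.
    state = {}
    init_script(state)
    for doc in docs:
        map_script(state, doc)
    return combine_script(state)

# Document placement assumed above: 1 and 3 on shard A, 2 and 4 on shard B.
shard_a = [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 30}]
shard_b = [{"type": "cost", "amount": 10}, {"type": "sale", "amount": 130}]

states = [run_shard(shard_a), run_shard(shard_b)]
profit = reduce_script(states)
print(states, profit)  # [50, 120] 170
```

The per-shard values and the final figure match the breakdown above: 50 and 120 after the combine stage, and 170 after the reduce stage.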
[[scripted-metric-aggregation-parameters]]
==== Other parameters
[horizontal]
params:: Optional. An object whose contents will be passed as variables to the `init_script`, `map_script` and `combine_script`. This can be
useful to allow the user to control the behavior of the aggregation and for storing state between the scripts. If this is not specified,
the default is the equivalent of providing:
+
[source,js]
--------------------------------------------------
"params" : {}
--------------------------------------------------
// NOTCONSOLE
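
As a rough illustration of how a shared `params` object can steer the scripts, the sketch below passes the same dictionary to hypothetical Python stand-ins for the `init_script`, `map_script` and `combine_script`. The `field` key mirrors the stored-script example; the function signatures and the `total` state key are assumptions made for this sketch only.

```python
# Hypothetical stand-ins: each per-shard script receives the same params object.

def init_script(state, params):
    state["total"] = 0

def map_script(state, doc, params):
    # params["field"] selects which document field is aggregated.
    state["total"] += doc[params["field"]]

def combine_script(state, params):
    return state["total"]

params = {"field": "amount"}
docs = [{"amount": 80}, {"amount": 10}]

state = {}
init_script(state, params)
for doc in docs:
    map_script(state, doc, params)
shard_result = combine_script(state, params)
print(shard_result)  # 90
```

Changing `params["field"]` changes what is summed without touching the script bodies, which is the kind of user control the `params` description refers to.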
[[scripted-metric-aggregation-empty-buckets]]
==== Empty buckets
If a parent bucket of the scripted metric aggregation does not collect any documents, an empty aggregation response will be returned from the
shard with a `null` value. In this case the `reduce_script`'s `states` variable will contain `null` as a response from that shard.
A `reduce_script` should therefore expect and deal with `null` responses from shards.
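
One way to picture the required handling is a reduce step that skips missing shard results. This is a minimal Python sketch, with `None` standing in for the `null` that an empty shard response contributes to `states`; the function name and the choice to skip (rather than, say, fail) are assumptions for illustration.

```python
def reduce_script(states):
    # Sum per-shard results, ignoring shards that returned nothing.
    profit = 0.0
    for shard_result in states:
        if shard_result is None:
            continue  # empty bucket on this shard
        profit += shard_result
    return profit

total = reduce_script([50, None, 120])
print(total)  # 170.0
```

In a Painless `reduce_script` the same idea amounts to a null check on each element of `states` before using it.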