[[search-aggregations-metrics-percentile-rank-aggregation]]
=== Percentile Ranks Aggregation

A `multi-value` metrics aggregation that calculates one or more percentile ranks
over numeric values extracted from the aggregated documents. These values
can be extracted either from specific numeric fields in the documents, or
be generated by a provided script.

[NOTE]
==================================================
Please see <<search-aggregations-metrics-percentile-aggregation-approximation>>
and <<search-aggregations-metrics-percentile-aggregation-compression>> for advice
regarding approximation and memory use of the percentile ranks aggregation.
==================================================

Percentile ranks show the percentage of observed values that are below a certain
value. For example, if a value is greater than or equal to 95% of the observed values,
it is said to be at the 95th percentile rank.
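
As a rough illustration of this definition, the following Python sketch computes
an exact percentile rank over a small in-memory list. The function name
`percentile_rank` and the sample load times are hypothetical, and ties are assumed
to count as "at or below" the value, following the example in the previous
paragraph. The aggregation itself may approximate the result, so treat this only
as a sketch of the idea.

[source,python]
--------------------------------------------------
def percentile_rank(observed, value):
    """Return the percentage of observed values that are <= value."""
    if not observed:
        raise ValueError("observed must not be empty")
    at_or_below = sum(1 for v in observed if v <= value)
    return 100.0 * at_or_below / len(observed)


# Hypothetical load times in milliseconds (not taken from any real index).
load_times = [3, 5, 8, 9, 11, 12, 13, 14, 16, 28]

for threshold in (15, 30):
    # 8 of the 10 samples are <= 15, and all 10 are <= 30.
    print(threshold, percentile_rank(load_times, threshold))
--------------------------------------------------

With this sample, the rank of 15 is 80.0 and the rank of 30 is 100.0.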

Assume your data consists of website load times. You may have a service agreement that
95% of page loads complete within 15ms and 99% of page loads complete within 30ms.

Let's look at the percentile ranks for those two load times:

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "load_time_outlier" : {
            "percentile_ranks" : {
                "field" : "load_time", <1>
                "values" : [15, 30]
            }
        }
    }
}
--------------------------------------------------
<1> The field `load_time` must be a numeric field

The response will look like this:

[source,js]
--------------------------------------------------
{
    ...

    "aggregations": {
        "load_time_outlier": {
            "values" : {
                "15": 92,
                "30": 100
            }
        }
    }
}
--------------------------------------------------

From this information you can determine that you are hitting the 99% load time target
(100% of page loads complete within 30ms) but not quite hitting the 95% load time
target (only 92% of page loads complete within 15ms).
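
The same comparison can be scripted on the client side. The Python sketch below
checks the `values` object from the example response against the two service
agreement targets. The `unmet_targets` helper and the `targets` mapping are
hypothetical names used only for illustration, and the keys are assumed to be
formatted exactly as in the example response.

[source,python]
--------------------------------------------------
def unmet_targets(rank_values, targets):
    """Return the thresholds whose percentile rank is below the required percentage."""
    return [
        threshold
        for threshold, required in sorted(targets.items())
        # Assumption: response keys are the thresholds rendered as plain strings.
        if rank_values[str(threshold)] < required
    ]


# "values" object copied from the example response.
rank_values = {"15": 92, "30": 100}

# Service agreement from the text: 95% within 15ms, 99% within 30ms.
targets = {15: 95, 30: 99}

print(unmet_targets(rank_values, targets))
--------------------------------------------------

For the example response this prints `[15]`, because only the 15ms target is missed.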

==== Script

The percentile rank metric supports scripting. For example, if our load times
are in milliseconds but we want to specify values in seconds, we could use
a script to convert them on the fly:

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "load_time_outlier" : {
            "percentile_ranks" : {
                "values" : [3, 5],
                "script" : "doc['load_time'].value / timeUnit", <1>
                "params" : {
                    "timeUnit" : 1000 <2>
                }
            }
        }
    }
}
--------------------------------------------------
<1> The `field` parameter is replaced with a `script` parameter, which uses the
script to generate the values on which percentile ranks are calculated
<2> Scripting supports parameterized input just like any other script