CONSOLEify percentile and percentile-ranks docs

Related #18160
Zachary Tong 2017-08-02 17:47:27 -04:00
parent 268923ebdc
commit d8414ffa29
3 changed files with 136 additions and 59 deletions

View File

@ -31,8 +31,6 @@ buildRestTests.expectedUnconvertedCandidates = [
'reference/aggregations/bucket/significantterms-aggregation.asciidoc', 'reference/aggregations/bucket/significantterms-aggregation.asciidoc',
'reference/aggregations/bucket/terms-aggregation.asciidoc', 'reference/aggregations/bucket/terms-aggregation.asciidoc',
'reference/aggregations/matrix/stats-aggregation.asciidoc', 'reference/aggregations/matrix/stats-aggregation.asciidoc',
'reference/aggregations/metrics/percentile-aggregation.asciidoc',
'reference/aggregations/metrics/percentile-rank-aggregation.asciidoc',
'reference/aggregations/metrics/scripted-metric-aggregation.asciidoc', 'reference/aggregations/metrics/scripted-metric-aggregation.asciidoc',
'reference/aggregations/metrics/tophits-aggregation.asciidoc', 'reference/aggregations/metrics/tophits-aggregation.asciidoc',
'reference/cluster/allocation-explain.asciidoc', 'reference/cluster/allocation-explain.asciidoc',
@ -476,3 +474,35 @@ buildRestTests.setups['analyze_sample'] = '''
properties: properties:
obj1.field1: obj1.field1:
type: text''' type: text'''
// Used by percentile/percentile-rank aggregations
buildRestTests.setups['latency'] = '''
- do:
indices.create:
index: latency
body:
settings:
number_of_shards: 1
number_of_replicas: 1
mappings:
data:
properties:
load_time:
type: long
- do:
bulk:
index: latency
type: data
refresh: true
body: |'''
for (int i = 0; i < 100; i++) {
def value = i
if (i % 10) {
value = i*10
}
buildRestTests.setups['latency'] += """
{"index":{}}
{"load_time": "$value"}"""
}
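
Reviewer note: the loop above relies on Groovy truthiness, where `if (i % 10)` is true for any non-zero remainder. As a minimal sketch of what the `latency` setup ends up indexing, the same logic can be written out in Python. The function name `latency_values` is hypothetical and exists only for this illustration; it is not part of the build file.

```python
def latency_values(count=100):
    """Mirror the Groovy loop: multiples of 10 keep their value, all others are scaled by 10."""
    values = []
    for i in range(count):
        value = i
        if i % 10 != 0:  # Groovy's `if (i % 10)` is true for a non-zero remainder
            value = i * 10
        values.append(value)
    return values


values = latency_values()
print(len(values), min(values), max(values))  # prints: 100 0 990
```

Under this reading, the index holds 100 `load_time` values between 0 and 990, with 10 through 90 each appearing twice (once from a multiple of 10 and once from a scaled single digit).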

View File

@ -26,7 +26,9 @@ Let's look at a range of percentiles representing load time:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_outlier" : {
"percentiles" : { "percentiles" : {
@ -36,6 +38,8 @@ Let's look at a range of percentiles representing load time:
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> The field `load_time` must be a numeric field <1> The field `load_time` must be a numeric field
By default, the `percentile` metric will generate a range of By default, the `percentile` metric will generate a range of
@ -49,18 +53,19 @@ percentiles: `[ 1, 5, 25, 50, 75, 95, 99 ]`. The response will look like this:
"aggregations": { "aggregations": {
"load_time_outlier": { "load_time_outlier": {
"values" : { "values" : {
"1.0": 15, "1.0": 9.9,
"5.0": 20, "5.0": 29.500000000000004,
"25.0": 23, "25.0": 167.5,
"50.0": 25, "50.0": 445.0,
"75.0": 29, "75.0": 722.5,
"95.0": 60, "95.0": 940.5,
"99.0": 150 "99.0": 980.1000000000001
} }
} }
} }
} }
-------------------------------------------------- --------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
As you can see, the aggregation will return a calculated value for each percentile As you can see, the aggregation will return a calculated value for each percentile
in the default range. If we assume response times are in milliseconds, it is in the default range. If we assume response times are in milliseconds, it is
@ -73,7 +78,9 @@ must be a value between 0-100 inclusive):
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_outlier" : {
"percentiles" : { "percentiles" : {
@ -84,6 +91,8 @@ must be a value between 0-100 inclusive):
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> Use the `percents` parameter to specify particular percentiles to calculate <1> Use the `percents` parameter to specify particular percentiles to calculate
==== Keyed Response ==== Keyed Response
@ -92,12 +101,13 @@ By default the `keyed` flag is set to `true` which associates a unique string ke
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
POST bank/account/_search?size=0 GET latency/data/_search
{ {
"size": 0,
"aggs": { "aggs": {
"balance_outlier": { "load_time_outlier": {
"percentiles": { "percentiles": {
"field": "balance", "field": "load_time",
"keyed": false "keyed": false
} }
} }
@ -105,7 +115,7 @@ POST bank/account/_search?size=0
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE // CONSOLE
// TEST[setup:bank] // TEST[setup:latency]
Response: Response:
@ -115,35 +125,35 @@ Response:
... ...
"aggregations": { "aggregations": {
"balance_outlier": { "load_time_outlier": {
"values": [ "values": [
{ {
"key": 1.0, "key": 1.0,
"value": 1462.8400000000001 "value": 9.9
}, },
{ {
"key": 5.0, "key": 5.0,
"value": 3591.85 "value": 29.500000000000004
}, },
{ {
"key": 25.0, "key": 25.0,
"value": 13709.333333333334 "value": 167.5
}, },
{ {
"key": 50.0, "key": 50.0,
"value": 26020.11666666667 "value": 445.0
}, },
{ {
"key": 75.0, "key": 75.0,
"value": 38139.648148148146 "value": 722.5
}, },
{ {
"key": 95.0, "key": 95.0,
"value": 47551.549999999996 "value": 940.5
}, },
{ {
"key": 99.0, "key": 99.0,
"value": 49339.16 "value": 980.1000000000001
} }
] ]
} }
@ -151,13 +161,6 @@ Response:
} }
-------------------------------------------------- --------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/] // TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
// TESTRESPONSE[s/1462.8400000000001/$body.aggregations.balance_outlier.values.0.value/]
// TESTRESPONSE[s/3591.85/$body.aggregations.balance_outlier.values.1.value/]
// TESTRESPONSE[s/13709.333333333334/$body.aggregations.balance_outlier.values.2.value/]
// TESTRESPONSE[s/26020.11666666667/$body.aggregations.balance_outlier.values.3.value/]
// TESTRESPONSE[s/38139.648148148146/$body.aggregations.balance_outlier.values.4.value/]
// TESTRESPONSE[s/47551.549999999996/$body.aggregations.balance_outlier.values.5.value/]
// TESTRESPONSE[s/49339.16/$body.aggregations.balance_outlier.values.6.value/]
==== Script ==== Script
@ -167,7 +170,9 @@ a script to convert them on-the-fly:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_outlier" : {
"percentiles" : { "percentiles" : {
@ -183,6 +188,9 @@ a script to convert them on-the-fly:
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> The `field` parameter is replaced with a `script` parameter, which uses the <1> The `field` parameter is replaced with a `script` parameter, which uses the
script to generate values which percentiles are calculated on script to generate values which percentiles are calculated on
<2> Scripting supports parameterized input just like any other script <2> Scripting supports parameterized input just like any other script
@ -191,14 +199,16 @@ This will interpret the `script` parameter as an `inline` script with the `painl
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_outlier" : {
"percentiles" : { "percentiles" : {
"script" : { "script" : {
"id": "my_script", "id": "my_script",
"params" : { "params": {
"timeUnit" : 1000 "field": "load_time"
} }
} }
} }
@ -206,6 +216,8 @@ This will interpret the `script` parameter as an `inline` script with the `painl
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency,stored_example_script]
[[search-aggregations-metrics-percentile-aggregation-approximation]] [[search-aggregations-metrics-percentile-aggregation-approximation]]
==== Percentiles are (usually) approximate ==== Percentiles are (usually) approximate
@ -252,7 +264,9 @@ This balance can be controlled using a `compression` parameter:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_outlier" : {
"percentiles" : { "percentiles" : {
@ -265,6 +279,9 @@ This balance can be controlled using a `compression` parameter:
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> Compression controls memory usage and approximation error <1> Compression controls memory usage and approximation error
The TDigest algorithm uses a number of "nodes" to approximate percentiles -- the The TDigest algorithm uses a number of "nodes" to approximate percentiles -- the
@ -298,7 +315,9 @@ The HDR Histogram can be used by specifying the `method` parameter in the reques
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_outlier" : {
"percentiles" : { "percentiles" : {
@ -312,6 +331,9 @@ The HDR Histogram can be used by specifying the `method` parameter in the reques
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> `hdr` object indicates that HDR Histogram should be used to calculate the percentiles and specific settings for this algorithm can be specified inside the object <1> `hdr` object indicates that HDR Histogram should be used to calculate the percentiles and specific settings for this algorithm can be specified inside the object
<2> `number_of_significant_value_digits` specifies the resolution of values for the histogram in number of significant digits <2> `number_of_significant_value_digits` specifies the resolution of values for the histogram in number of significant digits
@ -326,7 +348,9 @@ had a value.
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"grade_percentiles" : { "grade_percentiles" : {
"percentiles" : { "percentiles" : {
@ -337,5 +361,7 @@ had a value.
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> Documents without a value in the `grade` field will fall into the same bucket as documents that have the value `10`. <1> Documents without a value in the `grade` field will fall into the same bucket as documents that have the value `10`.

View File

@ -24,17 +24,21 @@ Let's look at a range of percentiles representing load time:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_ranks" : {
"percentile_ranks" : { "percentile_ranks" : {
"field" : "load_time", <1> "field" : "load_time", <1>
"values" : [15, 30] "values" : [500, 600]
} }
} }
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> The field `load_time` must be a numeric field <1> The field `load_time` must be a numeric field
The response will look like this: The response will look like this:
@ -45,15 +49,16 @@ The response will look like this:
... ...
"aggregations": { "aggregations": {
"load_time_outlier": { "load_time_ranks": {
"values" : { "values" : {
"15": 92, "500.0": 55.00000000000001,
"30": 100 "600.0": 64.0
} }
} }
} }
} }
-------------------------------------------------- --------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
From this information you can determine you are hitting the 99% load time target but not quite From this information you can determine you are hitting the 99% load time target but not quite
hitting the 95% load time target hitting the 95% load time target
@ -64,13 +69,14 @@ By default the `keyed` flag is set to `true` associates a unique string key with
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
POST bank/account/_search?size=0 GET latency/data/_search
{ {
"size": 0,
"aggs": { "aggs": {
"balance_outlier": { "load_time_ranks": {
"percentile_ranks": { "percentile_ranks": {
"field": "balance", "field": "load_time",
"values": [25000, 50000], "values": [500, 600],
"keyed": false "keyed": false
} }
} }
@ -78,7 +84,7 @@ POST bank/account/_search?size=0
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE // CONSOLE
// TEST[setup:bank] // TEST[setup:latency]
Response: Response:
@ -88,15 +94,15 @@ Response:
... ...
"aggregations": { "aggregations": {
"balance_outlier": { "load_time_ranks": {
"values": [ "values": [
{ {
"key": 25000.0, "key": 500.0,
"value": 48.537724935732655 "value": 55.00000000000001
}, },
{ {
"key": 50000.0, "key": 600.0,
"value": 99.85567010309278 "value": 64.0
} }
] ]
} }
@ -104,8 +110,7 @@ Response:
} }
-------------------------------------------------- --------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/] // TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
// TESTRESPONSE[s/48.537724935732655/$body.aggregations.balance_outlier.values.0.value/]
// TESTRESPONSE[s/99.85567010309278/$body.aggregations.balance_outlier.values.1.value/]
==== Script ==== Script
@ -115,11 +120,13 @@ a script to convert them on-the-fly:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_ranks" : {
"percentile_ranks" : { "percentile_ranks" : {
"values" : [3, 5], "values" : [500, 600],
"script" : { "script" : {
"lang": "painless", "lang": "painless",
"source": "doc['load_time'].value / params.timeUnit", <1> "source": "doc['load_time'].value / params.timeUnit", <1>
@ -132,6 +139,8 @@ a script to convert them on-the-fly:
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> The `field` parameter is replaced with a `script` parameter, which uses the <1> The `field` parameter is replaced with a `script` parameter, which uses the
script to generate values which percentile ranks are calculated on script to generate values which percentile ranks are calculated on
<2> Scripting supports parameterized input just like any other script <2> Scripting supports parameterized input just like any other script
@ -140,15 +149,17 @@ This will interpret the `script` parameter as an `inline` script with the `painl
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_ranks" : {
"percentile_ranks" : { "percentile_ranks" : {
"values" : [3, 5], "values" : [500, 600],
"script" : { "script" : {
"id": "my_script", "id": "my_script",
"params" : { "params": {
"timeUnit" : 1000 "field": "load_time"
} }
} }
} }
@ -156,6 +167,8 @@ This will interpret the `script` parameter as an `inline` script with the `painl
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency,stored_example_script]
==== HDR Histogram ==== HDR Histogram
@ -172,12 +185,14 @@ The HDR Histogram can be used by specifying the `method` parameter in the reques
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"load_time_outlier" : { "load_time_ranks" : {
"percentile_ranks" : { "percentile_ranks" : {
"field" : "load_time", "field" : "load_time",
"values" : [15, 30], "values" : [500, 600],
"hdr": { <1> "hdr": { <1>
"number_of_significant_value_digits" : 3 <2> "number_of_significant_value_digits" : 3 <2>
} }
@ -186,6 +201,8 @@ The HDR Histogram can be used by specifying the `method` parameter in the reques
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
// TEST[setup:latency]
<1> `hdr` object indicates that HDR Histogram should be used to calculate the percentiles and specific settings for this algorithm can be specified inside the object <1> `hdr` object indicates that HDR Histogram should be used to calculate the percentiles and specific settings for this algorithm can be specified inside the object
<2> `number_of_significant_value_digits` specifies the resolution of values for the histogram in number of significant digits <2> `number_of_significant_value_digits` specifies the resolution of values for the histogram in number of significant digits
@ -200,16 +217,20 @@ had a value.
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
GET latency/data/_search
{ {
"size": 0,
"aggs" : { "aggs" : {
"grade_ranks" : { "load_time_ranks" : {
"percentile_ranks" : { "percentile_ranks" : {
"field" : "grade", "field" : "load_time",
"values" : [500, 600],
"missing": 10 <1> "missing": 10 <1>
} }
} }
} }
} }
-------------------------------------------------- --------------------------------------------------
// CONSOLE
<1> Documents without a value in the `grade` field will fall into the same bucket as documents that have the value `10`. // TEST[setup:latency]
<1> Documents without a value in the `load_time` field will fall into the same bucket as documents that have the value `10`.
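
Reviewer note: the new expected `percentile_ranks` values (`"500.0": 55.00000000000001` and `"600.0": 64.0`) can be sanity-checked against the `latency` fixture with an exact count. The sketch below is an assumption-laden cross-check, not how Elasticsearch computes ranks: the aggregation uses an approximating structure (TDigest by default), while this code simply counts values at or below each threshold. The helper name `percentile_rank` is hypothetical.

```python
def percentile_rank(values, threshold):
    """Exact percentage of values that are less than or equal to threshold."""
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


# Same data as the `latency` setup: multiples of 10 stay as-is, others are scaled by 10.
values = [i * 10 if i % 10 else i for i in range(100)]

for threshold in (500, 600):
    print(threshold, percentile_rank(values, threshold))
# prints:
# 500 55.0
# 600 64.0
```

The exact counts agree with the documented responses up to floating-point noise, which suggests the approximation is effectively exact on a dataset this small.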