[role="xpack"]
[testenv="basic"]
[[search-aggregations-metrics-ttest-aggregation]]
=== T-test aggregation
++++
<titleabbrev>T-test</titleabbrev>
++++

A `t_test` metrics aggregation that performs a statistical hypothesis test in which the test statistic follows a Student's t-distribution
under the null hypothesis. The test runs on numeric values extracted from the aggregated documents or generated by provided scripts. In
practice, it tells you whether the difference between two population means is statistically significant or could have occurred by chance alone.

==== Syntax

A `t_test` aggregation looks like this in isolation:

[source,js]
--------------------------------------------------
{
  "t_test": {
    "a": { "field": "value_before" },
    "b": { "field": "value_after" },
    "type": "paired"
  }
}
--------------------------------------------------
// NOTCONSOLE
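
The request examples in this section query an index named `node_upgrade`, with one document per node. As a rough sketch, such an index
could be created and populated as follows. The mapping types and the sample values are illustrative assumptions, not the data set behind
the responses shown in this section.

[source,js]
--------------------------------------------------
PUT node_upgrade
{
  "mappings": {
    "properties": {
      "group": { "type": "keyword" },
      "startup_time_before": { "type": "long" },
      "startup_time_after": { "type": "long" }
    }
  }
}

POST node_upgrade/_bulk?refresh
{ "index": {} }
{ "group": "A", "startup_time_before": 102, "startup_time_after": 89 }
{ "index": {} }
{ "group": "A", "startup_time_before": 99, "startup_time_after": 93 }
{ "index": {} }
{ "group": "B", "startup_time_before": 111, "startup_time_after": 72 }
{ "index": {} }
{ "group": "B", "startup_time_before": 97, "startup_time_after": 98 }
--------------------------------------------------
// NOTCONSOLE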

Assuming that we have a record of node startup times before and after an upgrade, let's use a t-test to see whether the upgrade affected
the node startup time in a meaningful way.

[source,console]
--------------------------------------------------
GET node_upgrade/_search
{
  "size": 0,
  "aggs": {
    "startup_time_ttest": {
      "t_test": {
        "a": { "field": "startup_time_before" }, <1>
        "b": { "field": "startup_time_after" }, <2>
        "type": "paired" <3>
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:node_upgrade]
<1> The field `startup_time_before` must be a numeric field.
<2> The field `startup_time_after` must be a numeric field.
<3> Since we have data from the same nodes, we are using a paired t-test.

The response returns the p-value (probability value) for the test. It is the probability of obtaining results at least as extreme as
the result processed by the aggregation, assuming that the null hypothesis is correct (which means there is no difference between
population means). A smaller p-value means the observed data is less consistent with the null hypothesis, which is stronger evidence that
the population means are indeed different.

[source,console-result]
--------------------------------------------------
{
  ...

  "aggregations": {
    "startup_time_ttest": {
      "value": 0.1914368843365979 <1>
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
<1> The p-value.

==== T-Test Types

The `t_test` aggregation supports unpaired and paired two-sample t-tests. The type of the test can be specified using the `type` parameter:

`"type": "paired"`:: performs a paired t-test
`"type": "homoscedastic"`:: performs a two-sample equal-variance test
`"type": "heteroscedastic"`:: performs a two-sample unequal-variance test (this is the default)

==== Filters

It is also possible to run an unpaired t-test on different sets of records using filters. For example, if we want to test the difference
in startup times before the upgrade between two different groups of nodes, we use the same field `startup_time_before` for both
populations and separate the groups of nodes using term filters on the group name field:

[source,console]
--------------------------------------------------
GET node_upgrade/_search
{
  "size": 0,
  "aggs": {
    "startup_time_ttest": {
      "t_test": {
        "a": {
          "field": "startup_time_before", <1>
          "filter": {
            "term": {
              "group": "A" <2>
            }
          }
        },
        "b": {
          "field": "startup_time_before", <3>
          "filter": {
            "term": {
              "group": "B" <4>
            }
          }
        },
        "type": "heteroscedastic" <5>
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:node_upgrade]
<1> The field `startup_time_before` must be a numeric field.
<2> Any query that separates two groups can be used here.
<3> The second population uses the same field as the first.
<4> A different filter selects the second group of nodes.
<5> Since we have data from different nodes, we cannot use a paired t-test.

[source,console-result]
--------------------------------------------------
{
  ...

  "aggregations": {
    "startup_time_ttest": {
      "value": 0.2981858007281437 <1>
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
<1> The p-value.

In this example, we are using the same field for both populations. However, this is not a requirement: different fields, and even a
combination of fields and scripts, can be used. Populations don't have to be in the same index either. If the data sets are located in
different indices, a term filter on the <<mapping-index-field,`_index`>> field can be used to select populations.
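
As a minimal sketch of selecting populations by index, the request below compares the same field across two indices. The index names
`node_upgrade_a` and `node_upgrade_b` are hypothetical, and both are assumed to contain a numeric `startup_time_before` field.

[source,js]
--------------------------------------------------
GET node_upgrade_a,node_upgrade_b/_search
{
  "size": 0,
  "aggs": {
    "startup_time_ttest": {
      "t_test": {
        "a": {
          "field": "startup_time_before",
          "filter": {
            "term": {
              "_index": "node_upgrade_a"
            }
          }
        },
        "b": {
          "field": "startup_time_before",
          "filter": {
            "term": {
              "_index": "node_upgrade_b"
            }
          }
        },
        "type": "heteroscedastic"
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE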

==== Script

The `t_test` metric supports scripting. For example, if we need to adjust the startup times recorded before the upgrade, we could use
a script to recalculate them on the fly:

[source,console]
--------------------------------------------------
GET node_upgrade/_search
{
  "size": 0,
  "aggs": {
    "startup_time_ttest": {
      "t_test": {
        "a": {
          "script": {
            "lang": "painless",
            "source": "doc['startup_time_before'].value - params.adjustment", <1>
            "params": {
              "adjustment": 10 <2>
            }
          }
        },
        "b": {
          "field": "startup_time_after" <3>
        },
        "type": "paired"
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:node_upgrade]

<1> The `field` parameter is replaced with a `script` parameter, which uses the
script to generate the values on which the t-test is calculated.
<2> Scripting supports parameterized input just like any other script.
<3> We can mix scripts and fields.
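
Scripts may also be useful in unpaired tests. The following sketch adjusts the before values of group A with a script and compares
them with the unadjusted values of group B. It assumes that `script` and `filter` can be combined within a single population, which
this page does not state explicitly, and the `adjustment` value is a hypothetical figure.

[source,js]
--------------------------------------------------
GET node_upgrade/_search
{
  "size": 0,
  "aggs": {
    "startup_time_ttest": {
      "t_test": {
        "a": {
          "script": {
            "lang": "painless",
            "source": "doc['startup_time_before'].value - params.adjustment",
            "params": {
              "adjustment": 10
            }
          },
          "filter": {
            "term": {
              "group": "A"
            }
          }
        },
        "b": {
          "field": "startup_time_before",
          "filter": {
            "term": {
              "group": "B"
            }
          }
        },
        "type": "heteroscedastic"
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE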