---
id: test-stats
title: "Test Stats Aggregators"
---

This Apache Druid (incubating) extension incorporates test statistics related aggregators, including z-score and p-value. Please refer to [https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/](https://www.paypal-engineering.com/2017/06/29/democratizing-experimentation-data-for-product-innovations/) for math background and details.

Make sure to include the `druid-stats` extension in order to use these aggregators.

## Z-Score for two sample ztests post aggregator

Please refer to [https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/](https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-proportions-test/) and [http://www.ucs.louisiana.edu/~jcb0773/Berry_statbook/Berry_statbook_chpt6.pdf](http://www.ucs.louisiana.edu/~jcb0773/Berry_statbook/Berry_statbook_chpt6.pdf) for more details.

z = (p1 - p2) / S.E. (assuming null hypothesis is true)

Please see below for p1 and p2. S.E. stands for standard error, where

S.E. = sqrt{ p1 * (1 - p1)/n1 + p2 * (1 - p2)/n2 }

(p1 - p2) is the observed difference between the two sample proportions, and n1 and n2 are the sizes of sample 1 and sample 2.

### zscore2sample post aggregator

* **`zscore2sample`**: calculate the z-score using a two-sample z-test while converting binary variables (***e.g.*** success or not) to continuous variables (***e.g.*** conversion rate).

```json
{
  "type": "zscore2sample",
  "name": "<output_name>",
  "successCount1": <post aggregator for the success count of sample 1>,
  "sample1Size": <post aggregator for the size of sample 1>,
  "successCount2": <post aggregator for the success count of sample 2>,
  "sample2Size" : <post aggregator for the size of sample 2>
}
```

Please note that the post aggregator converts binary variables to continuous variables for two population proportions.
Specifically:

p1 = successCount1 / sample1Size

p2 = successCount2 / sample2Size

### pvalue2tailedZtest post aggregator

* **`pvalue2tailedZtest`**: calculate the p-value of a two-sided z-test from a z-score
    - ***pvalue2tailedZtest(zscore)*** - the input is a z-score, which can be calculated using the zscore2sample post aggregator

```json
{
  "type": "pvalue2tailedZtest",
  "name": "<output_name>",
  "zScore": <zscore post aggregator>
}
```

## Example Usage

In this example, we use the zscore2sample post aggregator to calculate the z-score, and then feed the z-score to the pvalue2tailedZtest post aggregator to calculate the p-value.

A JSON query example can be as follows:

```json
{
  ...
  "postAggregations" : [
    {
      "type"   : "pvalue2tailedZtest",
      "name"   : "pvalue",
      "zScore" : {
        "type"   : "zscore2sample",
        "name"   : "zscore",
        "successCount1" : {
          "type"  : "constant",
          "name"  : "successCountFromPopulation1Sample",
          "value" : 300
        },
        "sample1Size" : {
          "type"  : "constant",
          "name"  : "sampleSizeOfPopulation1",
          "value" : 500
        },
        "successCount2" : {
          "type"  : "constant",
          "name"  : "successCountFromPopulation2Sample",
          "value" : 450
        },
        "sample2Size" : {
          "type"  : "constant",
          "name"  : "sampleSizeOfPopulation2",
          "value" : 600
        }
      }
    }
  ]
}
```
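
To sanity-check what the query above should return, the formulas on this page can be reproduced outside Druid. The following is a minimal Python sketch under stated assumptions: the function names `zscore_two_sample` and `pvalue_two_tailed` are hypothetical and not part of the extension, and the code follows the formulas as written on this page rather than the extension's source, so small numerical differences from Druid's output are possible.

```python
import math


def zscore_two_sample(success_count1, sample1_size, success_count2, sample2_size):
    """Two-sample z-score for proportions, using the S.E. formula from this page.

    Hypothetical helper for illustration; not part of the Druid extension.
    """
    if sample1_size <= 0 or sample2_size <= 0:
        raise ValueError("sample sizes must be positive")
    p1 = success_count1 / sample1_size
    p2 = success_count2 / sample2_size
    # Standard error: sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)
    se = math.sqrt(p1 * (1 - p1) / sample1_size + p2 * (1 - p2) / sample2_size)
    # se is 0 when each proportion is exactly 0 or 1; the division then
    # raises ZeroDivisionError.
    return (p1 - p2) / se


def pvalue_two_tailed(zscore):
    """Two-tailed p-value 2 * (1 - Phi(|z|)), which equals erfc(|z| / sqrt(2))."""
    return math.erfc(abs(zscore) / math.sqrt(2))


if __name__ == "__main__":
    # Same constants as the example query: 300/500 versus 450/600.
    z = zscore_two_sample(300, 500, 450, 600)
    print("zscore:", z)
    print("pvalue:", pvalue_two_tailed(z))
```

With the example inputs, p1 = 0.6 and p2 = 0.75, so the z-score is negative (about -5.33) and the corresponding two-tailed p-value is very small.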