[[search-aggregations-matrix-stats-aggregation]] === Matrix Stats The `matrix_stats` aggregation is a numeric aggregation that computes the following statistics over a set of document fields: [horizontal] `count`:: Number of per field samples included in the calculation. `mean`:: The average value for each field. `variance`:: Per field Measurement for how spread out the samples are from the mean. `skewness`:: Per field measurement quantifying the asymmetric distribution around the mean. `kurtosis`:: Per field measurement quantifying the shape of the distribution. `covariance`:: A matrix that quantitatively describes how changes in one field are associated with another. `correlation`:: The covariance matrix scaled to a range of -1 to 1, inclusive. Describes the relationship between field distributions. ////////////////////////// [source,js] -------------------------------------------------- PUT /statistics/_doc/0 {"poverty": 24.0, "income": 50000.0} PUT /statistics/_doc/1 {"poverty": 13.0, "income": 95687.0} PUT /statistics/_doc/2 {"poverty": 69.0, "income": 7890.0} POST /_refresh -------------------------------------------------- // NOTCONSOLE // TESTSETUP ////////////////////////// The following example demonstrates the use of matrix stats to describe the relationship between income and poverty. [source,console] -------------------------------------------------- GET /_search { "aggs": { "statistics": { "matrix_stats": { "fields": ["poverty", "income"] } } } } -------------------------------------------------- // TEST[s/_search/_search\?filter_path=aggregations/] The aggregation type is `matrix_stats` and the `fields` setting defines the set of fields (as an array) for computing the statistics. The above request returns the following response: [source,console-result] -------------------------------------------------- { ... "aggregations": { "statistics": { "doc_count": 50, "fields": [{ "name": "income", "count": 50, "mean": 51985.1, "variance": 7.383377037755103E7, "skewness": 0.5595114003506483, "kurtosis": 2.5692365287787124, "covariance": { "income": 7.383377037755103E7, "poverty": -21093.65836734694 }, "correlation": { "income": 1.0, "poverty": -0.8352655256272504 } }, { "name": "poverty", "count": 50, "mean": 12.732000000000001, "variance": 8.637730612244896, "skewness": 0.4516049811903419, "kurtosis": 2.8615929677997767, "covariance": { "income": -21093.65836734694, "poverty": 8.637730612244896 }, "correlation": { "income": -0.8352655256272504, "poverty": 1.0 } }] } } } -------------------------------------------------- // TESTRESPONSE[s/\.\.\.//] // TESTRESPONSE[s/: (\-)?[0-9\.E]+/: $body.$_path/] The `doc_count` field indicates the number of documents involved in the computation of the statistics. ==== Multi Value Fields The `matrix_stats` aggregation treats each document field as an independent sample. The `mode` parameter controls what array value the aggregation will use for array or multi-valued fields. This parameter can take one of the following: [horizontal] `avg`:: (default) Use the average of all values. `min`:: Pick the lowest value. `max`:: Pick the highest value. `sum`:: Use the sum of all values. `median`:: Use the median of all values. ==== Missing Values The `missing` parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value. This is done by adding a set of fieldname : value mappings to specify default values per field. [source,console] -------------------------------------------------- GET /_search { "aggs": { "matrixstats": { "matrix_stats": { "fields": ["poverty", "income"], "missing": {"income" : 50000} <1> } } } } -------------------------------------------------- <1> Documents without a value in the `income` field will have the default value `50000`. ==== Script This aggregation family does not yet support scripting.