From 2d8eb8fbf72c3c6dd8b0862702d97bb487078b8a Mon Sep 17 00:00:00 2001
From: Lisa Cawley
Date: Tue, 9 May 2017 08:58:42 -0700
Subject: [PATCH] [DOCS] Add ML metric functions (elastic/x-pack-elasticsearch#1349)

* [DOCS] Add ML metric functions

* [DOCS] Remove beta note from ML metrics function

Original commit: elastic/x-pack-elasticsearch@0bd2e359ac0a09c256b334099a63e1b7ec44474f
---
 docs/en/ml/functions/metric.asciidoc | 387 +++++++++++++++++++++++++--
 1 file changed, 361 insertions(+), 26 deletions(-)

diff --git a/docs/en/ml/functions/metric.asciidoc b/docs/en/ml/functions/metric.asciidoc
index a9cdc3d0089..4ecfc804df7 100644
--- a/docs/en/ml/functions/metric.asciidoc
+++ b/docs/en/ml/functions/metric.asciidoc
@@ -1,45 +1,380 @@
 [[ml-metric-functions]]
 === Metric Functions

+The metric functions include functions such as mean, min and max. These values
+are calculated for each bucket. Field values that cannot be converted to
+double precision floating point numbers are ignored.
+
 The {xpackml} features include the following metric functions:

-* `min`
-* `max`
-* `mean`, `high_mean`, `low_mean`
-* `metric`
-* `varp`, `high_varp`, `low_varp`
+* <>
+* <>
+* <>
+* <>
+* <>
+* <>
+* <>
+* <>
+* <>
+* <>

-The metric functions include mean, min and max. These values are calculated for each bucket.
-Field values that cannot be converted to double precision floating point numbers
-are ignored.
+[float]
+[[ml-metric-min]]
+==== Min

-////
-metric:: all of mean, min, and max
+The `min` function detects anomalies in the arithmetic minimum of a value.
+The minimum value is calculated for each bucket.

-mean:: arithmetic mean
+High- and low-sided functions are not applicable.
-high_mean::: arithmetic mean
+This function supports the following properties:

-low_mean::: arithmetic mean
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)

-median:: statistical median
-
-min:: arithmetic minimum
-
-max:: arithmetic maximum
-
-varp:: population variance
-
-high_varp::: ""
-
-low_varp::: ""
+For more information about those properties,
+see <>.

+For example, if you use the following function in a detector in your job,
+it detects where the smallest transaction is lower than previously observed.
+You can use this function to detect items for sale at unintentionally low
+prices due to data entry mistakes. It models the minimum amount for each
+product over time.
+//Detect when the minimum amount for a product is unusually low compared to its past amounts
 [source,js]
 --------------------------------------------------
-{ "function" : "min", "fieldName" : "amt", "byFieldName" : "product" }
+{
+  "function" : "min",
+  "field_name" : "amt",
+  "by_field_name" : "product"
+}
 --------------------------------------------------

+[float]
+[[ml-metric-max]]
+==== Max

-////
+The `max` function detects anomalies in the arithmetic maximum of a value.
+The maximum value is calculated for each bucket.
+
+High- and low-sided functions are not applicable.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it detects where the longest `responsetime` is longer than previously observed.
+You can use this function to detect applications that have `responsetime`
+values that are unusually lengthy.
+It models the maximum `responsetime` for
+each application over time and detects when the longest `responsetime` is
+unusually long compared to that application's previous values.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "max",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+This analysis can be performed alongside `high_mean` functions by
+application. Combining the two detectors and using the same influencer
+detects both unusually long individual response times and unusually high
+average response times for each bucket. For example:
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "max",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+},
+{
+  "function" : "high_mean",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+[float]
+[[ml-metric-median]]
+==== Median
+
+The `median` function detects anomalies in the statistical median of a value.
+The median value is calculated for each bucket.
+
+High- and low-sided functions are not supported.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the median `responsetime` for each application over time. It detects
+when the median `responsetime` is unusual compared to previous `responsetime`
+values.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "median",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+
+[float]
+[[ml-metric-mean]]
+==== Mean
+
+The `mean` function detects anomalies in the arithmetic mean of a value.
+The mean value is calculated for each bucket.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the mean `responsetime` for each application over time. It detects
+when the mean `responsetime` is unusual compared to previous `responsetime`
+values.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "mean",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+[float]
+[[ml-metric-high-mean]]
+==== High_mean
+
+The `high_mean` function detects anomalies in the arithmetic mean of a value.
+The mean value is calculated for each bucket.
+Use this function if you want to monitor unusually high average values.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the mean `responsetime` for each application over time. It detects
+when the mean `responsetime` is unusually high compared to previous
+`responsetime` values.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "high_mean",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+[float]
+[[ml-metric-low-mean]]
+==== Low_mean
+
+The `low_mean` function detects anomalies in the arithmetic mean of a value.
+The mean value is calculated for each bucket.
+Use this function if you are just interested in unusually low average values.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the mean `responsetime` for each application over time. It detects
+when the mean `responsetime` is unusually low
+compared to previous `responsetime` values.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "low_mean",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+[float]
+[[ml-metric-metric]]
+==== Metric
+
+The `metric` function combines `min`, `max`, and `mean` functions. You can use
+it as a shorthand for a combined analysis. If you do not specify a function in
+a detector, this is the default function.
+//TBD: Is that default behavior still true?
+
+High- and low-sided functions are not applicable. You cannot use this function
+when a `summary_count_field_name` is specified.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the mean, min, and max `responsetime` for each application over time.
+It detects when the mean, min, or max `responsetime` is unusual compared to
+previous `responsetime` values.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "metric",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+
+[float]
+[[ml-metric-varp]]
+==== Varp
+
+The `varp` function detects anomalies in the variance of a value, which is a
+measure of the variability and spread in the data.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the variance in values of `responsetime` for each application
+over time. It detects when the variance in `responsetime` is unusual compared
+to past application behavior.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "varp",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+[float]
+[[ml-metric-high-varp]]
+==== High_varp
+
+The `high_varp` function detects anomalies in the variance of a value, which is
+a measure of the variability and spread in the data. Use this function if you
+want to monitor unusually high variance.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the variance in values of `responsetime` for each application
+over time. It detects when the variance in `responsetime` is unusually high
+compared to past application behavior.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "high_varp",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------
+
+
+[float]
+[[ml-metric-low-varp]]
+==== Low_varp
+
+The `low_varp` function detects anomalies in the variance of a value, which is
+a measure of the variability and spread in the data. Use this function if you
+are just interested in unusually low variance.
+
+This function supports the following properties:
+
+* `field_name` (required)
+* `by_field_name` (optional)
+* `over_field_name` (optional)
+* `partition_field_name` (optional)
+* `summary_count_field_name` (optional)
+
+For more information about those properties,
+see <>.
+
+For example, if you use the following function in a detector in your job,
+it models the variance in values of `responsetime` for each application
+over time. It detects when the variance in `responsetime` is unusually low
+compared to past application behavior.
+
+[source,js]
+--------------------------------------------------
+{
+  "function" : "low_varp",
+  "field_name" : "responsetime",
+  "by_field_name" : "application"
+}
+--------------------------------------------------