[DOCS] Fixes code snippet testing for machine learning (#31189)
This commit is contained in:
parent b44e1c1978
commit 5971eb83c4
@@ -9,13 +9,6 @@ apply plugin: 'elasticsearch.docs-test'
  * only remove entries from this list. When it is empty we'll remove it
  * entirely and have a party! There will be cake and everything.... */
 buildRestTests.expectedUnconvertedCandidates = [
-  'en/ml/functions/count.asciidoc',
-  'en/ml/functions/geo.asciidoc',
-  'en/ml/functions/info.asciidoc',
-  'en/ml/functions/metric.asciidoc',
-  'en/ml/functions/rare.asciidoc',
-  'en/ml/functions/sum.asciidoc',
-  'en/ml/functions/time.asciidoc',
   'en/rest-api/watcher/put-watch.asciidoc',
   'en/security/authentication/user-cache.asciidoc',
   'en/security/authorization/field-and-document-access-control.asciidoc',
@@ -56,7 +49,6 @@ buildRestTests.expectedUnconvertedCandidates = [
   'en/watcher/troubleshooting.asciidoc',
   'en/rest-api/license/delete-license.asciidoc',
   'en/rest-api/license/update-license.asciidoc',
-  'en/ml/api-quickref.asciidoc',
   'en/rest-api/ml/delete-snapshot.asciidoc',
   'en/rest-api/ml/forecast.asciidoc',
   'en/rest-api/ml/get-bucket.asciidoc',

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring-aggregation]]
-=== Aggregating Data For Faster Performance
+=== Aggregating data for faster performance
 
 By default, {dfeeds} fetch data from {es} using search and scroll requests.
 It can be significantly more efficient, however, to aggregate data in {es}

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-api-quickref]]
-== API Quick Reference
+== API quick reference
 
 All {ml} endpoints have the following base:
 
@@ -7,6 +8,7 @@ All {ml} endpoints have the following base:
 ----
 /_xpack/ml/
 ----
+// NOTCONSOLE
 
 The main {ml} resources can be accessed with a variety of endpoints:
 

@@ -1,3 +1,4 @@
+[role="xpack"]
 [[ml-configuring-categories]]
 === Categorizing log messages
 
@@ -77,7 +78,7 @@ NOTE: To add the `categorization_examples_limit` property, you must use the
 
 [float]
 [[ml-configuring-analyzer]]
-==== Customizing the Categorization Analyzer
+==== Customizing the categorization analyzer
 
 Categorization uses English dictionary words to identify log message categories.
 By default, it also uses English tokenization rules. For this reason, if you use
@@ -213,7 +214,7 @@ API examples above.
 
 [float]
 [[ml-viewing-categories]]
-==== Viewing Categorization Results
+==== Viewing categorization results
 
 After you open the job and start the {dfeed} or supply data to the job, you can
 view the categorization results in {kib}. For example:

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring]]
-== Configuring Machine Learning
+== Configuring machine learning
 
 If you want to use {xpackml} features, there must be at least one {ml} node in
 your cluster and all master-eligible nodes must have {ml} enabled. By default,
@@ -48,7 +48,7 @@ using the {ml} APIs.
 
 [float]
 [[ml-configuring-url-strings]]
-==== String Substitution in Custom URLs
+==== String substitution in custom URLs
 
 You can use dollar sign ($) delimited tokens in a custom URL. These tokens are
 substituted for the values of the corresponding fields in the anomaly records.

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-functions]]
-== Function Reference
+== Function reference
 
 The {xpackml} features include analysis functions that provide a wide variety of
 flexible ways to analyze data for anomalies.

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-count-functions]]
-=== Count Functions
+=== Count functions
 
 Count functions detect anomalies when the number of events in a bucket is
 anomalous.
@@ -21,7 +22,7 @@ The {xpackml} features include the following count functions:
 
 [float]
 [[ml-count]]
-===== Count, High_count, Low_count
+===== Count, high_count, low_count
 
 The `count` function detects anomalies when the number of events in a bucket is
 anomalous.
@@ -44,8 +45,20 @@ see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects]
 .Example 1: Analyzing events with the count function
 [source,js]
 --------------------------------------------------
-{ "function" : "count" }
+PUT _xpack/ml/anomaly_detectors/example1
+{
+  "analysis_config": {
+    "detectors": [{
+      "function" : "count"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
+}
 --------------------------------------------------
+// CONSOLE
 
 This example is probably the simplest possible analysis. It identifies
 time buckets during which the overall count of events is higher or lower than
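Aside (editor's note, not part of the commit): the hunk above replaces a bare detector fragment with a complete `PUT` request, which is what `// CONSOLE` testing executes against a live cluster. Even offline, the difference is checkable: the new body is standalone, valid JSON. A minimal sketch in Python, using the field names taken verbatim from the diff:

```python
import json

# Body of the rewritten Example 1 snippet, as added in this commit.
example1_body = """
{
  "analysis_config": {
    "detectors": [{
      "function" : "count"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
"""

# Parses cleanly as a full job configuration.
job = json.loads(example1_body)
detector = job["analysis_config"]["detectors"][0]
assert detector["function"] == "count"
assert job["data_description"]["time_format"] == "epoch_ms"
```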
@@ -57,12 +70,22 @@ and detects when the event rate is unusual compared to its past behavior.
 .Example 2: Analyzing errors with the high_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example2
 {
-  "function" : "high_count",
-  "by_field_name" : "error_code",
-  "over_field_name": "user"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "high_count",
+      "by_field_name" : "error_code",
+      "over_field_name": "user"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE
 
 If you use this `high_count` function in a detector in your job, it
 models the event rate for each error code. It detects users that generate an
@@ -72,11 +95,21 @@ unusually high count of error codes compared to other users.
 .Example 3: Analyzing status codes with the low_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example3
 {
-  "function" : "low_count",
-  "by_field_name" : "status_code"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "low_count",
+      "by_field_name" : "status_code"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE
 
 In this example, the function detects when the count of events for a
 status code is lower than usual.
@@ -88,22 +121,30 @@ compared to its past behavior.
 .Example 4: Analyzing aggregated data with the count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example4
 {
-  "summary_count_field_name" : "events_per_min",
-  "detectors" [
-    { "function" : "count" }
-  ]
-}
+  "analysis_config": {
+    "summary_count_field_name" : "events_per_min",
+    "detectors": [{
+      "function" : "count"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
+}
 --------------------------------------------------
+// CONSOLE
 
 If you are analyzing an aggregated `events_per_min` field, do not use a sum
 function (for example, `sum(events_per_min)`). Instead, use the count function
-and the `summary_count_field_name` property.
-//TO-DO: For more information, see <<aggreggations.asciidoc>>.
+and the `summary_count_field_name` property. For more information, see
+<<ml-configuring-aggregation>>.
 
 [float]
 [[ml-nonzero-count]]
-===== Non_zero_count, High_non_zero_count, Low_non_zero_count
+===== Non_zero_count, high_non_zero_count, low_non_zero_count
 
 The `non_zero_count` function detects anomalies when the number of events in a
 bucket is anomalous, but it ignores cases where the bucket count is zero. Use
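Aside (editor's note): the Example 4 hunk above is a good illustration of why this commit matters. The pre-commit fragment contained `"detectors" [` with a missing colon, so it was never even valid JSON; an untested snippet let that slip through, while a `// CONSOLE`-tested request would fail immediately. A quick check, with the old fragment copied verbatim from the removed lines:

```python
import json

# Old Example 4 fragment from before this commit. Note that
# `"detectors" [` is missing a colon, so this is not valid JSON.
old_snippet = """
{
  "summary_count_field_name" : "events_per_min",
  "detectors" [
    { "function" : "count" }
  ]
}
"""

try:
    json.loads(old_snippet)
    valid = True
except json.JSONDecodeError:
    valid = False

# The pre-commit snippet fails to parse.
assert not valid
```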
@@ -144,11 +185,21 @@ The `non_zero_count` function models only the following data:
 .Example 5: Analyzing signatures with the high_non_zero_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example5
 {
-  "function" : "high_non_zero_count",
-  "by_field_name" : "signaturename"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "high_non_zero_count",
+      "by_field_name" : "signaturename"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE
 
 If you use this `high_non_zero_count` function in a detector in your job, it
 models the count of events for the `signaturename` field. It ignores any buckets
@@ -163,7 +214,7 @@ data is sparse, use the `count` functions, which are optimized for that scenario
 
 [float]
 [[ml-distinct-count]]
-===== Distinct_count, High_distinct_count, Low_distinct_count
+===== Distinct_count, high_distinct_count, low_distinct_count
 
 The `distinct_count` function detects anomalies where the number of distinct
 values in one field is unusual.
@@ -187,11 +238,21 @@ see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects]
 .Example 6: Analyzing users with the distinct_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example6
 {
-  "function" : "distinct_count",
-  "field_name" : "user"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "distinct_count",
+      "field_name" : "user"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE
 
 This `distinct_count` function detects when a system has an unusual number
 of logged in users. When you use this function in a detector in your job, it
@@ -201,12 +262,22 @@ users is unusual compared to the past.
 .Example 7: Analyzing ports with the high_distinct_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example7
 {
-  "function" : "high_distinct_count",
-  "field_name" : "dst_port",
-  "over_field_name": "src_ip"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "high_distinct_count",
+      "field_name" : "dst_port",
+      "over_field_name": "src_ip"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE
 
 This example detects instances of port scanning. When you use this function in a
 detector in your job, it models the distinct count of ports. It also detects the

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-geo-functions]]
-=== Geographic Functions
+=== Geographic functions
 
 The geographic functions detect anomalies in the geographic location of the
 input data.
@@ -28,12 +29,22 @@ see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects]
 .Example 1: Analyzing transactions with the lat_long function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example1
 {
-  "function" : "lat_long",
-  "field_name" : "transactionCoordinates",
-  "by_field_name" : "creditCardNumber"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "lat_long",
+      "field_name" : "transactionCoordinates",
+      "by_field_name" : "creditCardNumber"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE
 
 If you use this `lat_long` function in a detector in your job, it
 detects anomalies where the geographic location of a credit card transaction is
@@ -54,6 +65,7 @@ For example, JSON data might contain the following transaction coordinates:
   "creditCardNumber": "1234123412341234"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 In {es}, location data is likely to be stored in `geo_point` fields. For more
 information, see {ref}/geo-point.html[Geo-point datatype]. This data type is not
@@ -64,7 +76,15 @@ format. For example, the following Painless script transforms
 
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/datafeeds/datafeed-test2
 {
+  "job_id": "farequote",
+  "indices": ["farequote"],
+  "query": {
+    "match_all": {
+      "boost": 1
+    }
+  },
   "script_fields": {
     "lat-lon": {
       "script": {
@@ -75,5 +95,7 @@ format. For example, the following Painless script transforms
     }
 }
 --------------------------------------------------
+// CONSOLE
+// TEST[setup:farequote_job]
 
 For more information, see <<ml-configuring-transform>>.

@@ -40,6 +40,7 @@ For more information about those properties, see
   "over_field_name" : "highest_registered_domain"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `info_content` function in a detector in your job, it models
 information that is present in the `subdomain` string. It detects anomalies
@@ -60,6 +61,7 @@ choice.
   "over_field_name" : "src_ip"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `high_info_content` function in a detector in your job, it
 models information content that is held in the DNS query string. It detects
@@ -77,6 +79,7 @@ information content is higher than expected.
   "by_field_name" : "logfilename"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `low_info_content` function in a detector in your job, it models
 information content that is present in the message string for each

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-metric-functions]]
-=== Metric Functions
+=== Metric functions
 
 The metric functions include functions such as mean, min and max. These values
 are calculated for each bucket. Field values that cannot be converted to
@@ -42,6 +43,7 @@ For more information about those properties, see
   "by_field_name" : "product"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `min` function in a detector in your job, it detects where the
 smallest transaction is lower than previously observed. You can use this
@@ -76,6 +78,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `max` function in a detector in your job, it detects where the
 longest `responsetime` is longer than previously observed. You can use this
@@ -98,6 +101,7 @@ to previous applications.
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 The analysis in the previous example can be performed alongside `high_mean`
 functions by application. By combining detectors and using the same influencer
@@ -106,7 +110,7 @@ response times for each bucket.
 
 [float]
 [[ml-metric-median]]
-==== Median, High_median, Low_median
+==== Median, high_median, low_median
 
 The `median` function detects anomalies in the statistical median of a value.
 The median value is calculated for each bucket.
@@ -136,6 +140,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `median` function in a detector in your job, it models the
 median `responsetime` for each application over time. It detects when the median
@@ -143,7 +148,7 @@ median `responsetime` for each application over time. It detects when the median
 
 [float]
 [[ml-metric-mean]]
-==== Mean, High_mean, Low_mean
+==== Mean, high_mean, low_mean
 
 The `mean` function detects anomalies in the arithmetic mean of a value.
 The mean value is calculated for each bucket.
@@ -173,6 +178,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `mean` function in a detector in your job, it models the mean
 `responsetime` for each application over time. It detects when the mean
@@ -187,6 +193,7 @@ If you use this `mean` function in a detector in your job, it models the mean
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `high_mean` function in a detector in your job, it models the
 mean `responsetime` for each application over time. It detects when the mean
@@ -201,6 +208,7 @@ mean `responsetime` for each application over time. It detects when the mean
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `low_mean` function in a detector in your job, it models the
 mean `responsetime` for each application over time. It detects when the mean
@@ -237,6 +245,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `metric` function in a detector in your job, it models the
 mean, min, and max `responsetime` for each application over time. It detects
@@ -245,7 +254,7 @@ when the mean, min, or max `responsetime` is unusual compared to previous
 
 [float]
 [[ml-metric-varp]]
-==== Varp, High_varp, Low_varp
+==== Varp, high_varp, low_varp
 
 The `varp` function detects anomalies in the variance of a value which is a
 measure of the variability and spread in the data.
@@ -273,6 +282,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `varp` function in a detector in your job, it models the
 variance in values of `responsetime` for each application over time. It detects
@@ -288,6 +298,7 @@ behavior.
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `high_varp` function in a detector in your job, it models the
 variance in values of `responsetime` for each application over time. It detects
@@ -303,6 +314,7 @@ behavior.
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE
 
 If you use this `low_varp` function in a detector in your job, it models the
 variance in values of `responsetime` for each application over time. It detects

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-rare-functions]]
-=== Rare Functions
+=== Rare functions
 
 The rare functions detect values that occur rarely in time or rarely for a
 population.
@ -54,6 +55,7 @@ For more information about those properties, see
|
||||||
"by_field_name" : "status"
|
"by_field_name" : "status"
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
// NOTCONSOLE
|
||||||
|
|
||||||
If you use this `rare` function in a detector in your job, it detects values
|
If you use this `rare` function in a detector in your job, it detects values
|
||||||
that are rare in time. It models status codes that occur over time and detects
|
that are rare in time. It models status codes that occur over time and detects
|
||||||
|
@ -69,6 +71,7 @@ status codes in a web access log that have never (or rarely) occurred before.
|
||||||
"over_field_name" : "clientip"
|
"over_field_name" : "clientip"
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
// NOTCONSOLE
|
||||||
|
|
||||||
If you use this `rare` function in a detector in your job, it detects values
|
If you use this `rare` function in a detector in your job, it detects values
|
||||||
that are rare in a population. It models status code and client IP interactions
|
that are rare in a population. It models status code and client IP interactions
|
||||||
|
@ -111,6 +114,7 @@ For more information about those properties, see
|
||||||
"over_field_name" : "clientip"
|
"over_field_name" : "clientip"
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
// NOTCONSOLE
|
||||||
|
|
||||||
If you use this `freq_rare` function in a detector in your job, it
|
If you use this `freq_rare` function in a detector in your job, it
|
||||||
detects values that are frequently rare in a population. It models URI paths and
|
detects values that are frequently rare in a population. It models URI paths and
|
||||||
|
|
|
@@ -1,6 +1,6 @@
+[role="xpack"]
 [[ml-sum-functions]]
-=== Sum Functions
+=== Sum functions

 The sum functions detect anomalies when the sum of a field in a bucket is anomalous.

@@ -16,16 +16,9 @@ The {xpackml} features include the following sum functions:
 * xref:ml-sum[`sum`, `high_sum`, `low_sum`]
 * xref:ml-nonnull-sum[`non_null_sum`, `high_non_null_sum`, `low_non_null_sum`]

-////
-TBD: Incorporate from prelert docs?:
-Input data may contain pre-calculated fields giving the total count of some value e.g. transactions per minute.
-Ensure you are familiar with our advice on Summarization of Input Data, as this is likely to provide
-a more appropriate method to using the sum function.
-////
-
 [float]
 [[ml-sum]]
-==== Sum, High_sum, Low_sum
+==== Sum, high_sum, low_sum

 The `sum` function detects anomalies where the sum of a field in a bucket is
 anomalous.
@@ -54,6 +47,7 @@ For more information about those properties, see
   "over_field_name" : "employee"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `sum` function in a detector in your job, it
 models total expenses per employees for each cost center. For each time bucket,
@@ -69,6 +63,7 @@ to other employees.
   "over_field_name" : "cs_host"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_sum` function in a detector in your job, it
 models total `cs_bytes`. It detects `cs_hosts` that transfer unusually high
@@ -79,7 +74,7 @@ to find users that are abusing internet privileges.

 [float]
 [[ml-nonnull-sum]]
-==== Non_null_sum, High_non_null_sum, Low_non_null_sum
+==== Non_null_sum, high_non_null_sum, low_non_null_sum

 The `non_null_sum` function is useful if your data is sparse. Buckets without
 values are ignored and buckets with a zero value are analyzed.
@@ -110,6 +105,7 @@ is not applicable for this function.
   "byFieldName" : "employee"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_non_null_sum` function in a detector in your job, it
 models the total `amount_approved` for each employee. It ignores any buckets
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-time-functions]]
-=== Time Functions
+=== Time functions

 The time functions detect events that happen at unusual times, either of the day
 or of the week. These functions can be used to find unusual patterns of behavior,
@@ -60,6 +61,7 @@ For more information about those properties, see
   "by_field_name" : "process"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `time_of_day` function in a detector in your job, it
 models when events occur throughout a day for each process. It detects when an
@@ -91,6 +93,7 @@ For more information about those properties, see
   "over_field_name" : "workstation"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `time_of_week` function in a detector in your job, it
 models when events occur throughout the week for each `eventcode`. It detects
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring-pop]]
-=== Performing Population Analysis
+=== Performing population analysis

 Entities or events in your data can be considered anomalous when:

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[stopping-ml]]
-== Stopping Machine Learning
+== Stopping machine learning

 An orderly shutdown of {ml} ensures that:

@@ -24,10 +25,10 @@ request stops the `feed1` {dfeed}:

 [source,js]
 --------------------------------------------------
-POST _xpack/ml/datafeeds/feed1/_stop
+POST _xpack/ml/datafeeds/datafeed-total-requests/_stop
 --------------------------------------------------
 // CONSOLE
-// TEST[skip:todo]
+// TEST[setup:server_metrics_startdf]

 NOTE: You must have `manage_ml`, or `manage` cluster privileges to stop {dfeeds}.
 For more information, see <<security-privileges>>.
@@ -63,10 +64,10 @@ example, the following request closes the `job1` job:

 [source,js]
 --------------------------------------------------
-POST _xpack/ml/anomaly_detectors/job1/_close
+POST _xpack/ml/anomaly_detectors/total-requests/_close
 --------------------------------------------------
 // CONSOLE
-// TEST[skip:todo]
+// TEST[setup:server_metrics_openjob]

 NOTE: You must have `manage_ml`, or `manage` cluster privileges to stop {dfeeds}.
 For more information, see <<security-privileges>>.
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring-transform]]
-=== Transforming Data With Script Fields
+=== Transforming data with script fields

 If you use {dfeeds}, you can add scripts to transform your data before
 it is analyzed. {dfeeds-cap} contain an optional `script_fields` property, where
@@ -602,10 +603,3 @@ The preview {dfeed} API returns the following results, which show that
 ]
 ----------------------------------
 // TESTRESPONSE
-
-////
-==== Configuring Script Fields in {dfeeds-cap}
-
-//TO-DO: Add Kibana steps from
-//https://github.com/elastic/prelert-legacy/wiki/Transforming-data-with-script_fields#transforming-geo_point-data-to-a-workable-string-format
-////
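The diffs above apply one pattern throughout: every `[source,js]` snippet is either made a runnable REST test (`// CONSOLE`, usually with a working `// TEST[setup:...]` replacing `// TEST[skip:todo]`) or explicitly opted out with `// NOTCONSOLE`, which is what lets the files be dropped from `expectedUnconvertedCandidates` in the build file. As a rough illustrative sketch only (not the actual logic of the `elasticsearch.docs-test` Gradle plugin), a scanner that classifies snippets by these markers could look like this:

```python
def classify_snippets(text):
    """Classify AsciiDoc [source,js] snippets by the docs-test marker after
    the closing fence: CONSOLE snippets become REST tests, NOTCONSOLE
    snippets are excluded, and unmarked ones are "unconverted candidates"."""
    lines = text.splitlines()
    results = []
    i = 0
    while i < len(lines):
        if lines[i].strip() == "[source,js]":
            j = i + 1
            # Skip the fenced body: an opening and a closing run of dashes.
            if j < len(lines) and set(lines[j].strip()) == {"-"}:
                j += 1
                while j < len(lines) and set(lines[j].strip()) != {"-"}:
                    j += 1
                j += 1  # move past the closing fence
            # Collect the trailing "//" comment lines, e.g. // CONSOLE.
            markers = []
            while j < len(lines) and lines[j].startswith("//"):
                markers.append(lines[j])
                j += 1
            if any(m.startswith("// NOTCONSOLE") for m in markers):
                results.append("NOTCONSOLE")
            elif any(m.startswith("// CONSOLE") for m in markers):
                results.append("CONSOLE")
            else:
                results.append("unmarked")
            i = j
        else:
            i += 1
    return results

# Example: the stop-datafeed snippet from the stopping-ml diff above.
snippet = ("[source,js]\n----\nPOST _xpack/ml/datafeeds/feed1/_stop\n----\n"
           "// CONSOLE\n// TEST[setup:server_metrics_startdf]\n")
print(classify_snippets(snippet))  # prints ['CONSOLE']
```

A snippet classified as "unmarked" is what would keep a file on the `expectedUnconvertedCandidates` list; once every snippet in a file is marked one way or the other, its entry can be deleted, as this commit does.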