[DOCS] Fixes code snippet testing for machine learning (#31189)
commit 5971eb83c4
parent b44e1c1978
@@ -9,13 +9,6 @@ apply plugin: 'elasticsearch.docs-test'
  * only remove entries from this list. When it is empty we'll remove it
  * entirely and have a party! There will be cake and everything.... */
 buildRestTests.expectedUnconvertedCandidates = [
-  'en/ml/functions/count.asciidoc',
-  'en/ml/functions/geo.asciidoc',
-  'en/ml/functions/info.asciidoc',
-  'en/ml/functions/metric.asciidoc',
-  'en/ml/functions/rare.asciidoc',
-  'en/ml/functions/sum.asciidoc',
-  'en/ml/functions/time.asciidoc',
   'en/rest-api/watcher/put-watch.asciidoc',
   'en/security/authentication/user-cache.asciidoc',
   'en/security/authorization/field-and-document-access-control.asciidoc',
@@ -56,7 +49,6 @@
   'en/watcher/troubleshooting.asciidoc',
   'en/rest-api/license/delete-license.asciidoc',
   'en/rest-api/license/update-license.asciidoc',
-  'en/ml/api-quickref.asciidoc',
   'en/rest-api/ml/delete-snapshot.asciidoc',
   'en/rest-api/ml/forecast.asciidoc',
   'en/rest-api/ml/get-bucket.asciidoc',
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring-aggregation]]
-=== Aggregating Data For Faster Performance
+=== Aggregating data for faster performance

 By default, {dfeeds} fetch data from {es} using search and scroll requests.
 It can be significantly more efficient, however, to aggregate data in {es}
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-api-quickref]]
-== API Quick Reference
+== API quick reference

 All {ml} endpoints have the following base:

@@ -7,6 +8,7 @@ All {ml} endpoints have the following base:
 ----
 /_xpack/ml/
 ----
+// NOTCONSOLE

 The main {ml} resources can be accessed with a variety of endpoints:

@@ -1,3 +1,4 @@
+[role="xpack"]
 [[ml-configuring-categories]]
 === Categorizing log messages

@@ -77,7 +78,7 @@ NOTE: To add the `categorization_examples_limit` property, you must use the

 [float]
 [[ml-configuring-analyzer]]
-==== Customizing the Categorization Analyzer
+==== Customizing the categorization analyzer

 Categorization uses English dictionary words to identify log message categories.
 By default, it also uses English tokenization rules. For this reason, if you use
@@ -213,7 +214,7 @@ API examples above.

 [float]
 [[ml-viewing-categories]]
-==== Viewing Categorization Results
+==== Viewing categorization results

 After you open the job and start the {dfeed} or supply data to the job, you can
 view the categorization results in {kib}. For example:
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring]]
-== Configuring Machine Learning
+== Configuring machine learning

 If you want to use {xpackml} features, there must be at least one {ml} node in
 your cluster and all master-eligible nodes must have {ml} enabled. By default,
@@ -48,7 +48,7 @@ using the {ml} APIs.

 [float]
 [[ml-configuring-url-strings]]
-==== String Substitution in Custom URLs
+==== String substitution in custom URLs

 You can use dollar sign ($) delimited tokens in a custom URL. These tokens are
 substituted for the values of the corresponding fields in the anomaly records.
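
For illustration, such tokens typically sit in the job's `custom_settings.custom_urls`
property; a minimal sketch, in which the Kibana dashboard URL and the `$user$` token
are illustrative assumptions:

[source,js]
--------------------------------------------------
"custom_settings": {
  "custom_urls": [
    {
      "url_name": "User details",
      "url_value": "https://localhost:5601/app/kibana#/dashboard/users?query=user:$user$"
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
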
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-functions]]
-== Function Reference
+== Function reference

 The {xpackml} features include analysis functions that provide a wide variety of
 flexible ways to analyze data for anomalies.
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-count-functions]]
-=== Count Functions
+=== Count functions

 Count functions detect anomalies when the number of events in a bucket is
 anomalous.
@@ -21,7 +22,7 @@ The {xpackml} features include the following count functions:

 [float]
 [[ml-count]]
-===== Count, High_count, Low_count
+===== Count, high_count, low_count

 The `count` function detects anomalies when the number of events in a bucket is
 anomalous.
@@ -44,8 +45,20 @@ see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects]
 .Example 1: Analyzing events with the count function
 [source,js]
 --------------------------------------------------
-{ "function" : "count" }
+PUT _xpack/ml/anomaly_detectors/example1
+{
+  "analysis_config": {
+    "detectors": [{
+      "function" : "count"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
+}
 --------------------------------------------------
+// CONSOLE

 This example is probably the simplest possible analysis. It identifies
 time buckets during which the overall count of events is higher or lower than
@@ -57,12 +70,22 @@ and detects when the event rate is unusual compared to its past behavior.
 .Example 2: Analyzing errors with the high_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example2
 {
-  "function" : "high_count",
-  "by_field_name" : "error_code",
-  "over_field_name": "user"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "high_count",
+      "by_field_name" : "error_code",
+      "over_field_name": "user"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE

 If you use this `high_count` function in a detector in your job, it
 models the event rate for each error code. It detects users that generate an
@@ -72,11 +95,21 @@ unusually high count of error codes compared to other users.
 .Example 3: Analyzing status codes with the low_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example3
 {
-  "function" : "low_count",
-  "by_field_name" : "status_code"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "low_count",
+      "by_field_name" : "status_code"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE

 In this example, the function detects when the count of events for a
 status code is lower than usual.
@@ -88,22 +121,30 @@ compared to its past behavior.
 .Example 4: Analyzing aggregated data with the count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example4
 {
-  "summary_count_field_name" : "events_per_min",
-  "detectors" [
-    { "function" : "count" }
-  ]
-}
+  "analysis_config": {
+    "summary_count_field_name" : "events_per_min",
+    "detectors": [{
+      "function" : "count"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
+}
 --------------------------------------------------
+// CONSOLE

 If you are analyzing an aggregated `events_per_min` field, do not use a sum
 function (for example, `sum(events_per_min)`). Instead, use the count function
-and the `summary_count_field_name` property.
-//TO-DO: For more information, see <<aggreggations.asciidoc>>.
+and the `summary_count_field_name` property. For more information, see
+<<ml-configuring-aggregation>>.

 [float]
 [[ml-nonzero-count]]
-===== Non_zero_count, High_non_zero_count, Low_non_zero_count
+===== Non_zero_count, high_non_zero_count, low_non_zero_count

 The `non_zero_count` function detects anomalies when the number of events in a
 bucket is anomalous, but it ignores cases where the bucket count is zero. Use
@@ -144,11 +185,21 @@ The `non_zero_count` function models only the following data:
 .Example 5: Analyzing signatures with the high_non_zero_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example5
 {
-  "function" : "high_non_zero_count",
-  "by_field_name" : "signaturename"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "high_non_zero_count",
+      "by_field_name" : "signaturename"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE

 If you use this `high_non_zero_count` function in a detector in your job, it
 models the count of events for the `signaturename` field. It ignores any buckets
@@ -163,7 +214,7 @@ data is sparse, use the `count` functions, which are optimized for that scenario

 [float]
 [[ml-distinct-count]]
-===== Distinct_count, High_distinct_count, Low_distinct_count
+===== Distinct_count, high_distinct_count, low_distinct_count

 The `distinct_count` function detects anomalies where the number of distinct
 values in one field is unusual.
@@ -187,11 +238,21 @@ see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects]
 .Example 6: Analyzing users with the distinct_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example6
 {
-  "function" : "distinct_count",
-  "field_name" : "user"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "distinct_count",
+      "field_name" : "user"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE

 This `distinct_count` function detects when a system has an unusual number
 of logged in users. When you use this function in a detector in your job, it
@@ -201,12 +262,22 @@ users is unusual compared to the past.
 .Example 7: Analyzing ports with the high_distinct_count function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example7
 {
-  "function" : "high_distinct_count",
-  "field_name" : "dst_port",
-  "over_field_name": "src_ip"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "high_distinct_count",
+      "field_name" : "dst_port",
+      "over_field_name": "src_ip"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE

 This example detects instances of port scanning. When you use this function in a
 detector in your job, it models the distinct count of ports. It also detects the
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-geo-functions]]
-=== Geographic Functions
+=== Geographic functions

 The geographic functions detect anomalies in the geographic location of the
 input data.
@@ -28,12 +29,22 @@ see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects]
 .Example 1: Analyzing transactions with the lat_long function
 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/anomaly_detectors/example1
 {
-  "function" : "lat_long",
-  "field_name" : "transactionCoordinates",
-  "by_field_name" : "creditCardNumber"
+  "analysis_config": {
+    "detectors": [{
+      "function" : "lat_long",
+      "field_name" : "transactionCoordinates",
+      "by_field_name" : "creditCardNumber"
+    }]
+  },
+  "data_description": {
+    "time_field":"timestamp",
+    "time_format": "epoch_ms"
+  }
 }
 --------------------------------------------------
+// CONSOLE

 If you use this `lat_long` function in a detector in your job, it
 detects anomalies where the geographic location of a credit card transaction is
@@ -54,6 +65,7 @@ For example, JSON data might contain the following transaction coordinates:
   "creditCardNumber": "1234123412341234"
 }
 --------------------------------------------------
+// NOTCONSOLE

 In {es}, location data is likely to be stored in `geo_point` fields. For more
 information, see {ref}/geo-point.html[Geo-point datatype]. This data type is not
@@ -64,7 +76,15 @@ format. For example, the following Painless script transforms

 [source,js]
 --------------------------------------------------
+PUT _xpack/ml/datafeeds/datafeed-test2
 {
+  "job_id": "farequote",
+  "indices": ["farequote"],
+  "query": {
+    "match_all": {
+      "boost": 1
+    }
+  },
   "script_fields": {
     "lat-lon": {
       "script": {
@@ -75,5 +95,7 @@ format. For example, the following Painless script transforms
   }
 }
 --------------------------------------------------
+// CONSOLE
+// TEST[setup:farequote_job]

 For more information, see <<ml-configuring-transform>>.
@@ -40,6 +40,7 @@ For more information about those properties, see
   "over_field_name" : "highest_registered_domain"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `info_content` function in a detector in your job, it models
 information that is present in the `subdomain` string. It detects anomalies
@@ -60,6 +61,7 @@ choice.
   "over_field_name" : "src_ip"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_info_content` function in a detector in your job, it
 models information content that is held in the DNS query string. It detects
@@ -77,6 +79,7 @@ information content is higher than expected.
   "by_field_name" : "logfilename"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `low_info_content` function in a detector in your job, it models
 information content that is present in the message string for each
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-metric-functions]]
-=== Metric Functions
+=== Metric functions

 The metric functions include functions such as mean, min and max. These values
 are calculated for each bucket. Field values that cannot be converted to
@@ -42,6 +43,7 @@ For more information about those properties, see
   "by_field_name" : "product"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `min` function in a detector in your job, it detects where the
 smallest transaction is lower than previously observed. You can use this
@@ -76,6 +78,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `max` function in a detector in your job, it detects where the
 longest `responsetime` is longer than previously observed. You can use this
@@ -98,6 +101,7 @@ to previous applications.
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 The analysis in the previous example can be performed alongside `high_mean`
 functions by application. By combining detectors and using the same influencer
@@ -106,7 +110,7 @@ response times for each bucket.

 [float]
 [[ml-metric-median]]
-==== Median, High_median, Low_median
+==== Median, high_median, low_median

 The `median` function detects anomalies in the statistical median of a value.
 The median value is calculated for each bucket.
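
For orientation, a minimal `median` detector has the same shape as the other detector
snippets in this file; a sketch, with `responsetime` and `application` as illustrative
field names:

[source,js]
--------------------------------------------------
{
  "function" : "median",
  "field_name" : "responsetime",
  "by_field_name" : "application"
}
--------------------------------------------------
// NOTCONSOLE
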
@@ -136,6 +140,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `median` function in a detector in your job, it models the
 median `responsetime` for each application over time. It detects when the median
@@ -143,7 +148,7 @@ median `responsetime` for each application over time. It detects when the median

 [float]
 [[ml-metric-mean]]
-==== Mean, High_mean, Low_mean
+==== Mean, high_mean, low_mean

 The `mean` function detects anomalies in the arithmetic mean of a value.
 The mean value is calculated for each bucket.
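
Likewise, a sketch of a `mean` detector, again with illustrative field names:

[source,js]
--------------------------------------------------
{
  "function" : "mean",
  "field_name" : "responsetime",
  "by_field_name" : "application"
}
--------------------------------------------------
// NOTCONSOLE
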
@@ -173,6 +178,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `mean` function in a detector in your job, it models the mean
 `responsetime` for each application over time. It detects when the mean
@@ -187,6 +193,7 @@ If you use this `mean` function in a detector in your job, it models the mean
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_mean` function in a detector in your job, it models the
 mean `responsetime` for each application over time. It detects when the mean
@@ -201,6 +208,7 @@ mean `responsetime` for each application over time. It detects when the mean
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `low_mean` function in a detector in your job, it models the
 mean `responsetime` for each application over time. It detects when the mean
@@ -237,6 +245,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `metric` function in a detector in your job, it models the
 mean, min, and max `responsetime` for each application over time. It detects
@@ -245,7 +254,7 @@ when the mean, min, or max `responsetime` is unusual compared to previous

 [float]
 [[ml-metric-varp]]
-==== Varp, High_varp, Low_varp
+==== Varp, high_varp, low_varp

 The `varp` function detects anomalies in the variance of a value which is a
 measure of the variability and spread in the data.
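
A `varp` detector is declared the same way; a sketch with assumed field names:

[source,js]
--------------------------------------------------
{
  "function" : "varp",
  "field_name" : "responsetime",
  "by_field_name" : "application"
}
--------------------------------------------------
// NOTCONSOLE
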
@@ -273,6 +282,7 @@ For more information about those properties, see
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `varp` function in a detector in your job, it models the
 variance in values of `responsetime` for each application over time. It detects
@@ -288,6 +298,7 @@ behavior.
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_varp` function in a detector in your job, it models the
 variance in values of `responsetime` for each application over time. It detects
@@ -303,6 +314,7 @@ behavior.
   "by_field_name" : "application"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `low_varp` function in a detector in your job, it models the
 variance in values of `responsetime` for each application over time. It detects
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-rare-functions]]
-=== Rare Functions
+=== Rare functions

 The rare functions detect values that occur rarely in time or rarely for a
 population.
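
As a sketch, the simplest `rare` detector needs only a `by_field_name`; the `status`
field here is an assumption:

[source,js]
--------------------------------------------------
{
  "function" : "rare",
  "by_field_name" : "status"
}
--------------------------------------------------
// NOTCONSOLE
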
@@ -54,6 +55,7 @@ For more information about those properties, see
   "by_field_name" : "status"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `rare` function in a detector in your job, it detects values
 that are rare in time. It models status codes that occur over time and detects
@@ -69,6 +71,7 @@ status codes in a web access log that have never (or rarely) occurred before.
   "over_field_name" : "clientip"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `rare` function in a detector in your job, it detects values
 that are rare in a population. It models status code and client IP interactions
@@ -111,6 +114,7 @@ For more information about those properties, see
   "over_field_name" : "clientip"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `freq_rare` function in a detector in your job, it
 detects values that are frequently rare in a population. It models URI paths and
@@ -1,6 +1,6 @@
-
+[role="xpack"]
 [[ml-sum-functions]]
-=== Sum Functions
+=== Sum functions

 The sum functions detect anomalies when the sum of a field in a bucket is anomalous.

@@ -16,16 +16,9 @@ The {xpackml} features include the following sum functions:
 * xref:ml-sum[`sum`, `high_sum`, `low_sum`]
 * xref:ml-nonnull-sum[`non_null_sum`, `high_non_null_sum`, `low_non_null_sum`]

-////
-TBD: Incorporate from prelert docs?:
-Input data may contain pre-calculated fields giving the total count of some value e.g. transactions per minute.
-Ensure you are familiar with our advice on Summarization of Input Data, as this is likely to provide
-a more appropriate method to using the sum function.
-////
-
 [float]
 [[ml-sum]]
-==== Sum, High_sum, Low_sum
+==== Sum, high_sum, low_sum

 The `sum` function detects anomalies where the sum of a field in a bucket is
 anomalous.
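
A minimal `sum` detector sketch, with illustrative names for the summed field and the
by/over fields:

[source,js]
--------------------------------------------------
{
  "function" : "sum",
  "field_name" : "expenses",
  "by_field_name" : "costcenter",
  "over_field_name" : "employee"
}
--------------------------------------------------
// NOTCONSOLE
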
@@ -54,6 +47,7 @@ For more information about those properties, see
   "over_field_name" : "employee"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `sum` function in a detector in your job, it
 models total expenses per employees for each cost center. For each time bucket,
@@ -69,6 +63,7 @@ to other employees.
   "over_field_name" : "cs_host"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_sum` function in a detector in your job, it
 models total `cs_bytes`. It detects `cs_hosts` that transfer unusually high
@@ -79,7 +74,7 @@ to find users that are abusing internet privileges.

 [float]
 [[ml-nonnull-sum]]
-==== Non_null_sum, High_non_null_sum, Low_non_null_sum
+==== Non_null_sum, high_non_null_sum, low_non_null_sum

 The `non_null_sum` function is useful if your data is sparse. Buckets without
 values are ignored and buckets with a zero value are analyzed.
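
A sketch of a `non_null_sum` detector, assuming an `amount_approved` field summed per
`employee`:

[source,js]
--------------------------------------------------
{
  "function" : "non_null_sum",
  "field_name" : "amount_approved",
  "by_field_name" : "employee"
}
--------------------------------------------------
// NOTCONSOLE
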
@@ -110,6 +105,7 @@ is not applicable for this function.
   "byFieldName" : "employee"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `high_non_null_sum` function in a detector in your job, it
 models the total `amount_approved` for each employee. It ignores any buckets
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-time-functions]]
-=== Time Functions
+=== Time functions

 The time functions detect events that happen at unusual times, either of the day
 or of the week. These functions can be used to find unusual patterns of behavior,
@@ -60,6 +61,7 @@ For more information about those properties, see
   "by_field_name" : "process"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `time_of_day` function in a detector in your job, it
 models when events occur throughout a day for each process. It detects when an
@@ -91,6 +93,7 @@ For more information about those properties, see
   "over_field_name" : "workstation"
 }
 --------------------------------------------------
+// NOTCONSOLE

 If you use this `time_of_week` function in a detector in your job, it
 models when events occur throughout the week for each `eventcode`. It detects
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring-pop]]
-=== Performing Population Analysis
+=== Performing population analysis

 Entities or events in your data can be considered anomalous when:

@@ -1,5 +1,6 @@
+[role="xpack"]
 [[stopping-ml]]
-== Stopping Machine Learning
+== Stopping machine learning

 An orderly shutdown of {ml} ensures that:

@@ -24,10 +25,10 @@ request stops the `feed1` {dfeed}:

 [source,js]
 --------------------------------------------------
-POST _xpack/ml/datafeeds/feed1/_stop
+POST _xpack/ml/datafeeds/datafeed-total-requests/_stop
 --------------------------------------------------
 // CONSOLE
-// TEST[skip:todo]
+// TEST[setup:server_metrics_startdf]

 NOTE: You must have `manage_ml`, or `manage` cluster privileges to stop {dfeeds}.
 For more information, see <<security-privileges>>.
@@ -63,10 +64,10 @@ example, the following request closes the `job1` job:

 [source,js]
 --------------------------------------------------
-POST _xpack/ml/anomaly_detectors/job1/_close
+POST _xpack/ml/anomaly_detectors/total-requests/_close
 --------------------------------------------------
 // CONSOLE
-// TEST[skip:todo]
+// TEST[setup:server_metrics_openjob]

 NOTE: You must have `manage_ml`, or `manage` cluster privileges to stop {dfeeds}.
 For more information, see <<security-privileges>>.
@@ -1,5 +1,6 @@
+[role="xpack"]
 [[ml-configuring-transform]]
-=== Transforming Data With Script Fields
+=== Transforming data with script fields

 If you use {dfeeds}, you can add scripts to transform your data before
 it is analyzed. {dfeeds-cap} contain an optional `script_fields` property, where
@@ -602,10 +603,3 @@ The preview {dfeed} API returns the following results, which show that
 ]
 ----------------------------------
 // TESTRESPONSE
-
-////
-==== Configuring Script Fields in {dfeeds-cap}
-
-//TO-DO: Add Kibana steps from
-//https://github.com/elastic/prelert-legacy/wiki/Transforming-data-with-script_fields#transforming-geo_point-data-to-a-workable-string-format
-////