[DOCS] Updates anomaly detection terminology (#44888)
parent cef375f883
commit a041d1eacf

@@ -4,7 +4,7 @@

 By default, {dfeeds} fetch data from {es} using search and scroll requests.
 It can be significantly more efficient, however, to aggregate data in {es}
-and to configure your jobs to analyze aggregated data.
+and to configure your {anomaly-jobs} to analyze aggregated data.

 One of the benefits of aggregating data this way is that {es} automatically
 distributes these calculations across your cluster. You can then feed this
@@ -19,8 +19,8 @@ of the last record in the bucket. If you use a terms aggregation and the
 cardinality of a term is high, then the aggregation might not be effective and
 you might want to just use the default search and scroll behavior.

-When you create or update a job, you can include the names of aggregations, for
-example:
+When you create or update an {anomaly-job}, you can include the names of
+aggregations, for example:

 [source,js]
 ----------------------------------
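
For orientation while reading this hunk: the elided request body below it configures a {dfeed} that performs the aggregation in {es}. A minimal sketch of that shape follows; the `farequote` index, the field names, and the interval are illustrative assumptions, not the page's exact example. The nested `max` aggregation on the time field is required so that each bucket carries the timestamp the job needs.

[source,js]
----------------------------------
PUT _ml/datafeeds/datafeed-farequote
{
  "job_id": "farequote",
  "indices": ["farequote"],
  "aggregations": {
    "buckets": {
      "date_histogram": {
        "field": "time",
        "fixed_interval": "360s"
      },
      "aggregations": {
        "time": { "max": { "field": "time" } },
        "responsetime": { "avg": { "field": "responsetime" } }
      }
    }
  }
}
----------------------------------
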
@@ -68,8 +68,8 @@ we do not want the detailed SQL to be considered in the message categorization.
 This particular categorization filter removes the SQL statement from the categorization
 algorithm.

-If your data is stored in {es}, you can create an advanced job with these same
-properties:
+If your data is stored in {es}, you can create an advanced {anomaly-job} with
+these same properties:

 [role="screenshot"]
 image::images/ml-category-advanced.jpg["Advanced job configuration options related to categorization"]
@@ -209,7 +209,7 @@ letters in tokens whereas the `ml_classic` tokenizer does, although that could
 be fixed by using more complex regular expressions.

 For more information about the `categorization_analyzer` property, see
-{ref}/ml-job-resource.html#ml-categorizationanalyzer[Categorization Analyzer].
+{ref}/ml-job-resource.html#ml-categorizationanalyzer[Categorization analyzer].

 NOTE: To add the `categorization_analyzer` property in {kib}, you must use the
 **Edit JSON** tab and copy the `categorization_analyzer` object from one of the
@@ -7,8 +7,8 @@ your cluster and all master-eligible nodes must have {ml} enabled. By default,
 all nodes are {ml} nodes. For more information about these settings, see
 {ref}/modules-node.html#ml-node[{ml} nodes].

-To use the {ml-features} to analyze your data, you must create a job and
-send your data to that job.
+To use the {ml-features} to analyze your data, you can create an {anomaly-job}
+and send your data to that job.

 * If your data is stored in {es}:

@@ -2,17 +2,17 @@
 [[ml-configuring-url]]
 === Adding custom URLs to machine learning results

-When you create an advanced job or edit any job in {kib}, you can optionally
-attach one or more custom URLs.
+When you create an advanced {anomaly-job} or edit any {anomaly-jobs} in {kib},
+you can optionally attach one or more custom URLs.

 The custom URLs provide links from the anomalies table in the *Anomaly Explorer*
 or *Single Metric Viewer* window in {kib} to {kib} dashboards, the *Discover*
 page, or external websites. For example, you can define a custom URL that
 provides a way for users to drill down to the source data from the results set.

-When you edit a job in {kib}, it simplifies the creation of the custom URLs for
-{kib} dashboards and the *Discover* page and it enables you to test your URLs.
-For example:
+When you edit an {anomaly-job} in {kib}, it simplifies the creation of the
+custom URLs for {kib} dashboards and the *Discover* page and it enables you to
+test your URLs. For example:

 [role="screenshot"]
 image::images/ml-customurl-edit.jpg["Edit a job to add a custom URL"]
@@ -29,7 +29,8 @@ As in this case, the custom URL can contain
 are populated when you click the link in the anomalies table. In this example,
 the custom URL contains `$earliest$`, `$latest$`, and `$service$` tokens, which
 pass the beginning and end of the time span of the selected anomaly and the
-pertinent `service` field value to the target page. If you were interested in the following anomaly, for example:
+pertinent `service` field value to the target page. If you were interested in
+the following anomaly, for example:

 [role="screenshot"]
 image::images/ml-customurl.jpg["An example of the custom URL links in the Anomaly Explorer anomalies table"]
@@ -43,8 +44,8 @@ image::images/ml-customurl-discover.jpg["An example of the results on the Discov
 Since we specified a time range of 2 hours, the time filter restricts the
 results to the time period two hours before and after the anomaly.

-You can also specify these custom URL settings when you create or update jobs by
-using the {ml} APIs.
+You can also specify these custom URL settings when you create or update
+{anomaly-jobs} by using the APIs.

 [float]
 [[ml-configuring-url-strings]]
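
To make the API route mentioned in this hunk concrete: custom URLs live in the job's `custom_settings.custom_urls` array and can be set with the update job API. A minimal sketch follows; the `sample_job` name and the external URL are assumptions for illustration:

[source,js]
----------------------------------
POST _ml/anomaly_detectors/sample_job/_update
{
  "custom_settings": {
    "custom_urls": [
      {
        "url_name": "Service dashboard",
        "url_value": "https://example.com/dashboards?service=$service$&from=$earliest$&to=$latest$",
        "time_range": "2h"
      }
    ]
  }
}
----------------------------------
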
@@ -74,9 +75,9 @@ time as the earliest and latest times. The same is also true if the interval is
 set to `Auto` and a one hour interval was chosen. You can override this behavior
 by using the `time_range` setting.

-The `$mlcategoryregex$` and `$mlcategoryterms$` tokens pertain to jobs where you
-are categorizing field values. For more information about this type of analysis,
-see <<ml-configuring-categories>>.
+The `$mlcategoryregex$` and `$mlcategoryterms$` tokens pertain to {anomaly-jobs}
+where you are categorizing field values. For more information about this type of
+analysis, see <<ml-configuring-categories>>.

 The `$mlcategoryregex$` token passes the regular expression value of the
 category of the selected anomaly, as identified by the value of the `mlcategory`
@@ -22,8 +22,8 @@ functions are not really affected. In these situations, it all comes out okay in
 the end as the delayed data is distributed randomly. An example would be a `mean`
 metric for a field in a large collection of data. In this case, checking for
 delayed data may not provide much benefit. If data are consistently delayed,
-however, jobs with a `low_count` function may provide false positives. In this
-situation, it would be useful to see if data comes in after an anomaly is
+however, {anomaly-jobs} with a `low_count` function may provide false positives.
+In this situation, it would be useful to see if data comes in after an anomaly is
 recorded so that you can determine a next course of action.

 ==== How do we detect delayed data?
@@ -35,11 +35,11 @@ Every 15 minutes or every `check_window`, whichever is smaller, the datafeed
 triggers a document search over the configured indices. This search looks over a
 time span with a length of `check_window` ending with the latest finalized bucket.
 That time span is partitioned into buckets, whose length equals the bucket span
-of the associated job. The `doc_count` of those buckets are then compared with
-the job's finalized analysis buckets to see whether any data has arrived since
-the analysis. If there is indeed missing data due to their ingest delay, the end
-user is notified. For example, you can see annotations in {kib} for the periods
-where these delays occur.
+of the associated {anomaly-job}. The `doc_count` of those buckets are then
+compared with the job's finalized analysis buckets to see whether any data has
+arrived since the analysis. If there is indeed missing data due to their ingest
+delay, the end user is notified. For example, you can see annotations in {kib}
+for the periods where these delays occur.

 ==== What to do about delayed data?

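
The `check_window` discussed in this hunk is part of the {dfeed}'s `delayed_data_check_config`. A minimal sketch of where the setting lives; the job and index names and the `2h` window are illustrative assumptions:

[source,js]
----------------------------------
PUT _ml/datafeeds/datafeed-delayed-example
{
  "job_id": "delayed-example",
  "indices": ["server-metrics"],
  "delayed_data_check_config": {
    "enabled": true,
    "check_window": "2h"
  }
}
----------------------------------
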
@@ -16,17 +16,18 @@ Let us see how those can be configured by examples.

 ==== Specifying custom rule scope

-Let us assume we are configuring a job in order to detect DNS data exfiltration.
-Our data contain fields "subdomain" and "highest_registered_domain".
-We can use a detector that looks like `high_info_content(subdomain) over highest_registered_domain`.
-If we run such a job it is possible that we discover a lot of anomalies on
-frequently used domains that we have reasons to trust. As security analysts, we
-are not interested in such anomalies. Ideally, we could instruct the detector to
-skip results for domains that we consider safe. Using a rule with a scope allows
-us to achieve this.
+Let us assume we are configuring an {anomaly-job} in order to detect DNS data
+exfiltration. Our data contain fields "subdomain" and "highest_registered_domain".
+We can use a detector that looks like
+`high_info_content(subdomain) over highest_registered_domain`. If we run such a
+job, it is possible that we discover a lot of anomalies on frequently used
+domains that we have reasons to trust. As security analysts, we are not
+interested in such anomalies. Ideally, we could instruct the detector to skip
+results for domains that we consider safe. Using a rule with a scope allows us
+to achieve this.

 First, we need to create a list of our safe domains. Those lists are called
-_filters_ in {ml}. Filters can be shared across jobs.
+_filters_ in {ml}. Filters can be shared across {anomaly-jobs}.

 We create our filter using the {ref}/ml-put-filter.html[put filter API]:

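
The body of the `PUT _ml/filters/safe_domains` request is elided by the diff. For reference, a put filter call has roughly this shape; the example items are assumptions rather than the page's exact list:

[source,js]
----------------------------------
PUT _ml/filters/safe_domains
{
  "description": "Our list of safe domains",
  "items": ["safe.com", "trusted.com"]
}
----------------------------------
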
@@ -41,8 +42,8 @@ PUT _ml/filters/safe_domains
 // CONSOLE
 // TEST[skip:needs-licence]

-Now, we can create our job specifying a scope that uses the `safe_domains`
-filter for the `highest_registered_domain` field:
+Now, we can create our {anomaly-job} specifying a scope that uses the
+`safe_domains` filter for the `highest_registered_domain` field:

 [source,js]
 ----------------------------------
@@ -139,8 +140,8 @@ example, 0.02. Given our knowledge about how CPU utilization behaves we might
 determine that anomalies with such small actual values are not interesting for
 investigation.

-Let us now configure a job with a rule that will skip results where CPU
-utilization is less than 0.20.
+Let us now configure an {anomaly-job} with a rule that will skip results where
+CPU utilization is less than 0.20.

 [source,js]
 ----------------------------------
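
The job body is again elided. The rule described in this hunk would be expressed as a detector-level `custom_rules` entry along these lines; the job name, bucket span, function choice, and field names are illustrative assumptions:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/cpu_with_rule
{
  "analysis_config": {
    "bucket_span": "5m",
    "detectors": [
      {
        "function": "high_mean",
        "field_name": "cpu_utilization",
        "custom_rules": [
          {
            "actions": ["skip_result"],
            "conditions": [
              { "applies_to": "actual", "operator": "lt", "value": 0.20 }
            ]
          }
        ]
      }
    ]
  },
  "data_description": { "time_field": "timestamp" }
}
----------------------------------
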
@@ -214,18 +215,18 @@ PUT _ml/anomaly_detectors/rule_with_range
 ==== Custom rules in the life-cycle of a job

 Custom rules only affect results created after the rules were applied.
-Let us imagine that we have configured a job and it has been running
+Let us imagine that we have configured an {anomaly-job} and it has been running
 for some time. After observing its results we decide that we can employ
 rules in order to get rid of some uninteresting results. We can use
-the {ref}/ml-update-job.html[update job API] to do so. However, the rule we
-added will only be in effect for any results created from the moment we added
-the rule onwards. Past results will remain unaffected.
+the {ref}/ml-update-job.html[update {anomaly-job} API] to do so. However, the
+rule we added will only be in effect for any results created from the moment we
+added the rule onwards. Past results will remain unaffected.

-==== Using custom rules VS filtering data
+==== Using custom rules vs. filtering data

 It might appear like using rules is just another way of filtering the data
-that feeds into a job. For example, a rule that skips results when the
-partition field value is in a filter sounds equivalent to having a query
+that feeds into an {anomaly-job}. For example, a rule that skips results when
+the partition field value is in a filter sounds equivalent to having a query
 that filters out such documents. But it is not. There is a fundamental
 difference. When the data is filtered before reaching a job it is as if they
 never existed for the job. With rules, the data still reaches the job and
@@ -5,10 +5,10 @@
 The {ml-features} include analysis functions that provide a wide variety of
 flexible ways to analyze data for anomalies.

-When you create jobs, you specify one or more detectors, which define the type of
-analysis that needs to be done. If you are creating your job by using {ml} APIs,
-you specify the functions in
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+When you create {anomaly-jobs}, you specify one or more detectors, which define
+the type of analysis that needs to be done. If you are creating your job by
+using {ml} APIs, you specify the functions in
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].
 If you are creating your job in {kib}, you specify the functions differently
 depending on whether you are creating single metric, multi-metric, or advanced
 jobs.
@@ -24,8 +24,8 @@ You can specify a `summary_count_field_name` with any function except `metric`.
 When you use `summary_count_field_name`, the {ml} features expect the input
 data to be pre-aggregated. The value of the `summary_count_field_name` field
 must contain the count of raw events that were summarized. In {kib}, use the
-**summary_count_field_name** in advanced jobs. Analyzing aggregated input data
-provides a significant boost in performance. For more information, see
+**summary_count_field_name** in advanced {anomaly-jobs}. Analyzing aggregated
+input data provides a significant boost in performance. For more information, see
 <<ml-configuring-aggregation>>.

 If your data is sparse, there may be gaps in the data which means you might have
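
To illustrate the property this hunk revolves around: when the input is pre-aggregated, the job declares which field carries the count of raw events, typically `doc_count` when the data comes from {es} aggregations. A hedged sketch with assumed names:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/pre_aggregated_example
{
  "analysis_config": {
    "bucket_span": "10m",
    "summary_count_field_name": "doc_count",
    "detectors": [
      { "function": "mean", "field_name": "responsetime" }
    ]
  },
  "data_description": { "time_field": "time" }
}
----------------------------------
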
@@ -40,7 +40,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties,
-see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+see {ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing events with the count function
 [source,js]
@@ -65,8 +65,9 @@ This example is probably the simplest possible analysis. It identifies
 time buckets during which the overall count of events is higher or lower than
 usual.

-When you use this function in a detector in your job, it models the event rate
-and detects when the event rate is unusual compared to its past behavior.
+When you use this function in a detector in your {anomaly-job}, it models the
+event rate and detects when the event rate is unusual compared to its past
+behavior.

 .Example 2: Analyzing errors with the high_count function
 [source,js]
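
The Example 1 body sits between the elided fences above. The simplest form of the analysis this hunk describes is a detector with only a function; the job name and bucket span below are assumptions:

[source,js]
----------------------------------
PUT _ml/anomaly_detectors/event_rate_example
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "count" }
    ]
  },
  "data_description": { "time_field": "timestamp" }
}
----------------------------------
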
@@ -89,7 +90,7 @@ PUT _ml/anomaly_detectors/example2
 // CONSOLE
 // TEST[skip:needs-licence]

-If you use this `high_count` function in a detector in your job, it
+If you use this `high_count` function in a detector in your {anomaly-job}, it
 models the event rate for each error code. It detects users that generate an
 unusually high count of error codes compared to other users.

@@ -117,9 +118,9 @@ PUT _ml/anomaly_detectors/example3
 In this example, the function detects when the count of events for a
 status code is lower than usual.

-When you use this function in a detector in your job, it models the event rate
-for each status code and detects when a status code has an unusually low count
-compared to its past behavior.
+When you use this function in a detector in your {anomaly-job}, it models the
+event rate for each status code and detects when a status code has an unusually
+low count compared to its past behavior.

 .Example 4: Analyzing aggregated data with the count function
 [source,js]
@@ -168,7 +169,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties,
-see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+see {ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 For example, if you have the following number of events per bucket:

@@ -206,10 +207,10 @@ PUT _ml/anomaly_detectors/example5
 // CONSOLE
 // TEST[skip:needs-licence]

-If you use this `high_non_zero_count` function in a detector in your job, it
-models the count of events for the `signaturename` field. It ignores any buckets
-where the count is zero and detects when a `signaturename` value has an
-unusually high count of events compared to its past behavior.
+If you use this `high_non_zero_count` function in a detector in your
+{anomaly-job}, it models the count of events for the `signaturename` field. It
+ignores any buckets where the count is zero and detects when a `signaturename`
+value has an unusually high count of events compared to its past behavior.

 NOTE: Population analysis (using an `over_field_name` property value) is not
 supported for the `non_zero_count`, `high_non_zero_count`, and
@@ -238,7 +239,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties,
-see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+see {ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 6: Analyzing users with the distinct_count function
 [source,js]
@@ -261,9 +262,9 @@ PUT _ml/anomaly_detectors/example6
 // TEST[skip:needs-licence]

 This `distinct_count` function detects when a system has an unusual number
-of logged in users. When you use this function in a detector in your job, it
-models the distinct count of users. It also detects when the distinct number of
-users is unusual compared to the past.
+of logged in users. When you use this function in a detector in your
+{anomaly-job}, it models the distinct count of users. It also detects when the
+distinct number of users is unusual compared to the past.

 .Example 7: Analyzing ports with the high_distinct_count function
 [source,js]
@@ -287,6 +288,6 @@ PUT _ml/anomaly_detectors/example7
 // TEST[skip:needs-licence]

 This example detects instances of port scanning. When you use this function in a
-detector in your job, it models the distinct count of ports. It also detects the
-`src_ip` values that connect to an unusually high number of different
+detector in your {anomaly-job}, it models the distinct count of ports. It also
+detects the `src_ip` values that connect to an unusually high number of different
 `dst_ports` values compared to other `src_ip` values.
@@ -7,9 +7,9 @@ input data.

 The {ml-features} include the following geographic function: `lat_long`.

-NOTE: You cannot create forecasts for jobs that contain geographic functions.
-You also cannot add rules with conditions to detectors that use geographic
-functions.
+NOTE: You cannot create forecasts for {anomaly-jobs} that contain geographic
+functions. You also cannot add rules with conditions to detectors that use
+geographic functions.

 [float]
 [[ml-lat-long]]
@@ -26,7 +26,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties,
-see {ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+see {ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing transactions with the lat_long function
 [source,js]
@@ -49,15 +49,15 @@ PUT _ml/anomaly_detectors/example1
 // CONSOLE
 // TEST[skip:needs-licence]

-If you use this `lat_long` function in a detector in your job, it
+If you use this `lat_long` function in a detector in your {anomaly-job}, it
 detects anomalies where the geographic location of a credit card transaction is
 unusual for a particular customer’s credit card. An anomaly might indicate fraud.

 IMPORTANT: The `field_name` that you supply must be a single string that contains
 two comma-separated numbers of the form `latitude,longitude`, a `geo_point` field,
 a `geo_shape` field that contains point values, or a `geo_centroid` aggregation.
-The `latitude` and `longitude` must be in the range -180 to 180 and represent a point on the
-surface of the Earth.
+The `latitude` and `longitude` must be in the range -180 to 180 and represent a
+point on the surface of the Earth.

 For example, JSON data might contain the following transaction coordinates:

@@ -75,6 +75,6 @@ In {es}, location data is likely to be stored in `geo_point` fields. For more
 information, see {ref}/geo-point.html[Geo-point datatype]. This data type is
 supported natively in {ml-features}. Specifically, {dfeed} when pulling data from
 a `geo_point` field, will transform the data into the appropriate `lat,lon` string
-format before sending to the {ml} job.
+format before sending to the {anomaly-job}.

 For more information, see <<ml-configuring-transform>>.
@@ -29,7 +29,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing subdomain strings with the info_content function
 [source,js]
@@ -42,9 +42,9 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `info_content` function in a detector in your job, it models
-information that is present in the `subdomain` string. It detects anomalies
-where the information content is unusual compared to the other
+If you use this `info_content` function in a detector in your {anomaly-job}, it
+models information that is present in the `subdomain` string. It detects
+anomalies where the information content is unusual compared to the other
 `highest_registered_domain` values. An anomaly could indicate an abuse of the
 DNS protocol, such as malicious command and control activity.

@@ -63,8 +63,8 @@ choice.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `high_info_content` function in a detector in your job, it
-models information content that is held in the DNS query string. It detects
+If you use this `high_info_content` function in a detector in your {anomaly-job},
+it models information content that is held in the DNS query string. It detects
 `src_ip` values where the information content is unusually high compared to
 other `src_ip` values. This example is similar to the example for the
 `info_content` function, but it reports anomalies only where the amount of
@@ -81,8 +81,8 @@ information content is higher than expected.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `low_info_content` function in a detector in your job, it models
-information content that is present in the message string for each
+If you use this `low_info_content` function in a detector in your {anomaly-job},
+it models information content that is present in the message string for each
 `logfilename`. It detects anomalies where the information content is low
 compared to its past behavior. For example, this function detects unusually low
 amounts of information in a collection of rolling log files. Low information
@@ -35,7 +35,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing minimum transactions with the min function
 [source,js]
@@ -48,9 +48,9 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `min` function in a detector in your job, it detects where the
-smallest transaction is lower than previously observed. You can use this
-function to detect items for sale at unintentionally low prices due to data
+If you use this `min` function in a detector in your {anomaly-job}, it detects
+where the smallest transaction is lower than previously observed. You can use
+this function to detect items for sale at unintentionally low prices due to data
 entry mistakes. It models the minimum amount for each product over time.

 [float]
@@ -70,7 +70,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 2: Analyzing maximum response times with the max function
 [source,js]
@@ -83,9 +83,9 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `max` function in a detector in your job, it detects where the
-longest `responsetime` is longer than previously observed. You can use this
-function to detect applications that have `responsetime` values that are
+If you use this `max` function in a detector in your {anomaly-job}, it detects
+where the longest `responsetime` is longer than previously observed. You can use
+this function to detect applications that have `responsetime` values that are
 unusually lengthy. It models the maximum `responsetime` for each application
 over time and detects when the longest `responsetime` is unusually long compared
 to previous applications.
@@ -132,7 +132,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 4: Analyzing response times with the median function
 [source,js]
@@ -145,9 +145,9 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `median` function in a detector in your job, it models the
-median `responsetime` for each application over time. It detects when the median
-`responsetime` is unusual compared to previous `responsetime` values.
+If you use this `median` function in a detector in your {anomaly-job}, it models
+the median `responsetime` for each application over time. It detects when the
+median `responsetime` is unusual compared to previous `responsetime` values.

 [float]
 [[ml-metric-mean]]
@@ -170,7 +170,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 5: Analyzing response times with the mean function
 [source,js]
@@ -183,8 +183,8 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `mean` function in a detector in your job, it models the mean
-`responsetime` for each application over time. It detects when the mean
+If you use this `mean` function in a detector in your {anomaly-job}, it models
+the mean `responsetime` for each application over time. It detects when the mean
 `responsetime` is unusual compared to previous `responsetime` values.

 .Example 6: Analyzing response times with the high_mean function
@@ -198,9 +198,10 @@ If you use this `mean` function in a detector in your job, it models the mean
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `high_mean` function in a detector in your job, it models the
-mean `responsetime` for each application over time. It detects when the mean
-`responsetime` is unusually high compared to previous `responsetime` values.
+If you use this `high_mean` function in a detector in your {anomaly-job}, it
+models the mean `responsetime` for each application over time. It detects when
+the mean `responsetime` is unusually high compared to previous `responsetime`
+values.

 .Example 7: Analyzing response times with the low_mean function
 [source,js]
@@ -213,9 +214,10 @@ mean `responsetime` for each application over time. It detects when the mean
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `low_mean` function in a detector in your job, it models the
-mean `responsetime` for each application over time. It detects when the mean
-`responsetime` is unusually low compared to previous `responsetime` values.
+If you use this `low_mean` function in a detector in your {anomaly-job}, it
+models the mean `responsetime` for each application over time. It detects when
+the mean `responsetime` is unusually low compared to previous `responsetime`
+values.

 [float]
 [[ml-metric-metric]]
@@ -236,7 +238,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 8: Analyzing response times with the metric function
 [source,js]
@@ -249,8 +251,8 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `metric` function in a detector in your job, it models the
-mean, min, and max `responsetime` for each application over time. It detects
+If you use this `metric` function in a detector in your {anomaly-job}, it models
+the mean, min, and max `responsetime` for each application over time. It detects
 when the mean, min, or max `responsetime` is unusual compared to previous
 `responsetime` values.

@@ -273,7 +275,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 9: Analyzing response times with the varp function
 [source,js]
@@ -286,10 +288,10 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `varp` function in a detector in your job, it models the
-variance in values of `responsetime` for each application over time. It detects
-when the variance in `responsetime` is unusual compared to past application
-behavior.
+If you use this `varp` function in a detector in your {anomaly-job}, it models
+the variance in values of `responsetime` for each application over time. It
+detects when the variance in `responsetime` is unusual compared to past
+application behavior.

 .Example 10: Analyzing response times with the high_varp function
 [source,js]
@@ -302,10 +304,10 @@ behavior.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `high_varp` function in a detector in your job, it models the
-variance in values of `responsetime` for each application over time. It detects
-when the variance in `responsetime` is unusual compared to past application
-behavior.
+If you use this `high_varp` function in a detector in your {anomaly-job}, it
+models the variance in values of `responsetime` for each application over time.
+It detects when the variance in `responsetime` is unusual compared to past
+application behavior.

 .Example 11: Analyzing response times with the low_varp function
 [source,js]
@@ -318,7 +320,7 @@ behavior.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `low_varp` function in a detector in your job, it models the
-variance in values of `responsetime` for each application over time. It detects
-when the variance in `responsetime` is unusual compared to past application
-behavior.
+If you use this `low_varp` function in a detector in your {anomaly-job}, it
+models the variance in values of `responsetime` for each application over time.
+It detects when the variance in `responsetime` is unusual compared to past
+application behavior.
@@ -13,8 +13,8 @@ number of times (frequency) rare values occur.
 ====
 * The `rare` and `freq_rare` functions should not be used in conjunction with
 `exclude_frequent`.
-* You cannot create forecasts for jobs that contain `rare` or `freq_rare`
-functions.
+* You cannot create forecasts for {anomaly-jobs} that contain `rare` or
+`freq_rare` functions.
 * You cannot add rules with conditions to detectors that use `rare` or
 `freq_rare` functions.
 * Shorter bucket spans (less than 1 hour, for example) are recommended when
@@ -47,7 +47,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing status codes with the rare function
 [source,js]
@@ -59,10 +59,11 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `rare` function in a detector in your job, it detects values
-that are rare in time. It models status codes that occur over time and detects
-when rare status codes occur compared to the past. For example, you can detect
-status codes in a web access log that have never (or rarely) occurred before.
+If you use this `rare` function in a detector in your {anomaly-job}, it detects
+values that are rare in time. It models status codes that occur over time and
+detects when rare status codes occur compared to the past. For example, you can
+detect status codes in a web access log that have never (or rarely) occurred
+before.

 .Example 2: Analyzing status codes in a population with the rare function
 [source,js]
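
The rare-function example is elided between the fences above. A detector that models status codes over time, as this hunk describes, would look roughly like the following; the field name is an assumption:

[source,js]
----------------------------------
{
  "function": "rare",
  "by_field_name": "status"
}
----------------------------------
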
@@ -75,15 +76,15 @@ status codes in a web access log that have never (or rarely) occurred before.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `rare` function in a detector in your job, it detects values
-that are rare in a population. It models status code and client IP interactions
-that occur. It defines a rare status code as one that occurs for few client IP
-values compared to the population. It detects client IP values that experience
-one or more distinct rare status codes compared to the population. For example
-in a web access log, a `clientip` that experiences the highest number of
-different rare status codes compared to the population is regarded as highly
-anomalous. This analysis is based on the number of different status code values,
-not the count of occurrences.
+If you use this `rare` function in a detector in your {anomaly-job}, it detects
+values that are rare in a population. It models status code and client IP
+interactions that occur. It defines a rare status code as one that occurs for
+few client IP values compared to the population. It detects client IP values
+that experience one or more distinct rare status codes compared to the
+population. For example in a web access log, a `clientip` that experiences the
+highest number of different rare status codes compared to the population is
+regarded as highly anomalous. This analysis is based on the number of different
+status code values, not the count of occurrences.

 NOTE: To define a status code as rare, the {ml-features} look at the number
 of distinct status codes that occur, not the number of times the status code
@@ -105,7 +106,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 3: Analyzing URI values in a population with the freq_rare function
 [source,js]
@@ -118,7 +119,7 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `freq_rare` function in a detector in your job, it
+If you use this `freq_rare` function in a detector in your {anomaly-job}, it
 detects values that are frequently rare in a population. It models URI paths and
 client IP interactions that occur. It defines a rare URI path as one that is
 visited by few client IP values compared to the population. It detects the
@@ -2,7 +2,8 @@
 [[ml-sum-functions]]
 === Sum functions

-The sum functions detect anomalies when the sum of a field in a bucket is anomalous.
+The sum functions detect anomalies when the sum of a field in a bucket is
+anomalous.

 If you want to monitor unusually high totals, use high-sided functions.

@@ -35,7 +36,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing total expenses with the sum function
 [source,js]
@@ -49,7 +50,7 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `sum` function in a detector in your job, it
+If you use this `sum` function in a detector in your {anomaly-job}, it
 models total expenses per employee for each cost center. For each time bucket,
 it detects when an employee’s expenses are unusual for a cost center compared
 to other employees.
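
Matching the description in this hunk, a detector of that shape would look roughly like this; the field names are assumptions, and using `over_field_name` here is my reading of the phrase "compared to other employees", which implies a population comparison:

[source,js]
----------------------------------
{
  "function": "sum",
  "field_name": "expenses",
  "over_field_name": "employee",
  "partition_field_name": "cost_center"
}
----------------------------------
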
@@ -65,7 +66,7 @@ to other employees.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `high_sum` function in a detector in your job, it
+If you use this `high_sum` function in a detector in your {anomaly-job}, it
 models total `cs_bytes`. It detects `cs_hosts` that transfer unusually high
 volumes compared to other `cs_hosts`. This example looks for volumes of data
 transferred from a client to a server on the internet that are unusual compared
@@ -91,7 +92,7 @@ These functions support the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 NOTE: Population analysis (that is to say, use of the `over_field_name` property)
 is not applicable for this function.
@@ -107,9 +108,7 @@ is not applicable for this function.
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `high_non_null_sum` function in a detector in your job, it
-models the total `amount_approved` for each employee. It ignores any buckets
+If you use this `high_non_null_sum` function in a detector in your {anomaly-job},
+it models the total `amount_approved` for each employee. It ignores any buckets
 where the amount is null. It detects employees who approve unusually high
 amounts compared to their past behavior.
-//For this credit control system analysis, using non_null_sum will ignore
-//periods where the employees are not active on the system.
@@ -14,22 +14,25 @@ The {ml-features} include the following time functions:

 [NOTE]
 ====
-* NOTE: You cannot create forecasts for jobs that contain time functions.
-* The `time_of_day` function is not aware of the difference between days, for instance
-work days and weekends. When modeling different days, use the `time_of_week` function.
-In general, the `time_of_week` function is more suited to modeling the behavior of people
-rather than machines, as people vary their behavior according to the day of the week.
-* Shorter bucket spans (for example, 10 minutes) are recommended when performing a
-`time_of_day` or `time_of_week` analysis. The time of the events being modeled are not
-affected by the bucket span, but a shorter bucket span enables quicker alerting on unusual
-events.
-* Unusual events are flagged based on the previous pattern of the data, not on what we
-might think of as unusual based on human experience. So, if events typically occur
-between 3 a.m. and 5 a.m., and event occurring at 3 p.m. is be flagged as unusual.
-* When Daylight Saving Time starts or stops, regular events can be flagged as anomalous.
-This situation occurs because the actual time of the event (as measured against a UTC
-baseline) has changed. This situation is treated as a step change in behavior and the new
-times will be learned quickly.
+* NOTE: You cannot create forecasts for {anomaly-jobs} that contain time
+functions.
+* The `time_of_day` function is not aware of the difference between days, for
+instance work days and weekends. When modeling different days, use the
+`time_of_week` function. In general, the `time_of_week` function is more suited
+to modeling the behavior of people rather than machines, as people vary their
+behavior according to the day of the week.
+* Shorter bucket spans (for example, 10 minutes) are recommended when performing
+a `time_of_day` or `time_of_week` analysis. The time of the events being modeled
+are not affected by the bucket span, but a shorter bucket span enables quicker
+alerting on unusual events.
+* Unusual events are flagged based on the previous pattern of the data, not on
+what we might think of as unusual based on human experience. So, if events
+typically occur between 3 a.m. and 5 a.m., an event occurring at 3 p.m. is
+flagged as unusual.
+* When Daylight Saving Time starts or stops, regular events can be flagged as
+anomalous. This situation occurs because the actual time of the event (as
+measured against a UTC baseline) has changed. This situation is treated as a
+step change in behavior and the new times will be learned quickly.
 ====

 [float]
@@ -51,7 +54,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 1: Analyzing events with the time_of_day function
 [source,js]
@@ -63,7 +66,7 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `time_of_day` function in a detector in your job, it
+If you use this `time_of_day` function in a detector in your {anomaly-job}, it
 models when events occur throughout a day for each process. It detects when an
 event occurs for a process that is at an unusual time in the day compared to
 its past behavior.
@@ -82,7 +85,7 @@ This function supports the following properties:
 * `partition_field_name` (optional)

 For more information about those properties, see
-{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
+{ref}/ml-job-resource.html#ml-detectorconfig[Detector configuration objects].

 .Example 2: Analyzing events with the time_of_week function
 [source,js]
@@ -95,7 +98,7 @@ For more information about those properties, see
 --------------------------------------------------
 // NOTCONSOLE

-If you use this `time_of_week` function in a detector in your job, it
+If you use this `time_of_week` function in a detector in your {anomaly-job}, it
 models when events occur throughout the week for each `eventcode`. It detects
 when a workstation event occurs at an unusual time during the week for that
 `eventcode` compared to other workstations. It detects events for a
@@ -57,9 +57,9 @@ PUT _ml/anomaly_detectors/population
 in each bucket.

 If your data is stored in {es}, you can use the population job wizard in {kib}
-to create a job with these same properties. For example, if you add the sample
-web logs in {kib}, you can use the following job settings in the population job
-wizard:
+to create an {anomaly-job} with these same properties. For example, if you add
+the sample web logs in {kib}, you can use the following job settings in the
+population job wizard:

 [role="screenshot"]
 image::images/ml-population-job.jpg["Job settings in the population job wizard"]
@@ -1,22 +1,22 @@
 [role="xpack"]
 [[stopping-ml]]
-== Stopping machine learning
+== Stopping {ml} {anomaly-detect}

-An orderly shutdown of {ml} ensures that:
+An orderly shutdown ensures that:

 * {dfeeds-cap} are stopped
 * Buffers are flushed
 * Model history is pruned
 * Final results are calculated
 * Model snapshots are saved
-* Jobs are closed
+* {anomaly-jobs-cap} are closed

 This process ensures that jobs are in a consistent state in case you want to
 subsequently re-open them.

 [float]
 [[stopping-ml-datafeeds]]
-=== Stopping {dfeeds-cap}
+=== Stopping {dfeeds}

 When you stop a {dfeed}, it ceases to retrieve data from {es}. You can stop a
 {dfeed} by using {kib} or the
@@ -25,7 +25,7 @@ request stops the `feed1` {dfeed}:

 [source,js]
 --------------------------------------------------
-POST _ml/datafeeds/datafeed-total-requests/_stop
+POST _ml/datafeeds/feed1/_stop
 --------------------------------------------------
 // CONSOLE
 // TEST[skip:setup:server_metrics_startdf]
@@ -39,7 +39,7 @@ A {dfeed} can be started and stopped multiple times throughout its lifecycle.

 [float]
 [[stopping-all-ml-datafeeds]]
-==== Stopping All {dfeeds-cap}
+==== Stopping all {dfeeds}

 If you are upgrading your cluster, you can use the following request to stop all
 {dfeeds}:
@@ -53,19 +53,20 @@ POST _ml/datafeeds/_all/_stop

 [float]
 [[closing-ml-jobs]]
-=== Closing Jobs
+=== Closing {anomaly-jobs}

-When you close a job, it cannot receive data or perform analysis operations.
-If a job is associated with a {dfeed}, you must stop the {dfeed} before you can
-close the jobs. If the {dfeed} has an end date, the job closes automatically on
-that end date.
+When you close an {anomaly-job}, it cannot receive data or perform analysis
+operations. If a job is associated with a {dfeed}, you must stop the {dfeed}
+before you can close the job. If the {dfeed} has an end date, the job closes
+automatically on that end date.

-You can close a job by using the {ref}/ml-close-job.html[close job API]. For
+You can close a job by using the
+{ref}/ml-close-job.html[close {anomaly-job} API]. For
 example, the following request closes the `job1` job:

 [source,js]
 --------------------------------------------------
-POST _ml/anomaly_detectors/total-requests/_close
+POST _ml/anomaly_detectors/job1/_close
 --------------------------------------------------
 // CONSOLE
 // TEST[skip:setup:server_metrics_openjob]
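
A related call worth knowing when a job will not close cleanly: the close API accepts a `force` parameter. This is a general feature of the API rather than something shown on this page:

[source,js]
--------------------------------------------------
POST _ml/anomaly_detectors/job1/_close?force=true
--------------------------------------------------
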
@@ -73,14 +74,15 @@ POST _ml/anomaly_detectors/total-requests/_close
 NOTE: You must have `manage_ml`, or `manage` cluster privileges to stop {dfeeds}.
 For more information, see <<security-privileges>>.

-A job can be opened and closed multiple times throughout its lifecycle.
+{anomaly-jobs-cap} can be opened and closed multiple times throughout their
+lifecycle.

 [float]
 [[closing-all-ml-datafeeds]]
-==== Closing All Jobs
+==== Closing all {anomaly-jobs}

 If you are upgrading your cluster, you can use the following request to close
-all open jobs on the cluster:
+all open {anomaly-jobs} on the cluster:

 [source,js]
 ----------------------------------
@@ -7,9 +7,9 @@ it is analyzed. {dfeeds-cap} contain an optional `script_fields` property, where
 you can specify scripts that evaluate custom expressions and return script
 fields.

-If your {dfeed} defines script fields, you can use those fields in your job.
-For example, you can use the script fields in the analysis functions in one or
-more detectors.
+If your {dfeed} defines script fields, you can use those fields in your
+{anomaly-job}. For example, you can use the script fields in the analysis
+functions in one or more detectors.

 * <<ml-configuring-transform1>>
 * <<ml-configuring-transform2>>
@@ -146,12 +146,14 @@ PUT _ml/datafeeds/datafeed-test1
 within the job.
 <2> The script field is defined in the {dfeed}.

-This `test1` job contains a detector that uses a script field in a mean analysis
-function. The `datafeed-test1` {dfeed} defines the script field. It contains a
-script that adds two fields in the document to produce a "total" error count.
+This `test1` {anomaly-job} contains a detector that uses a script field in a
+mean analysis function. The `datafeed-test1` {dfeed} defines the script field.
+It contains a script that adds two fields in the document to produce a "total"
+error count.

 The syntax for the `script_fields` property is identical to that used by {es}.
-For more information, see {ref}/search-request-body.html#request-body-search-script-fields[Script Fields].
+For more information, see
+{ref}/search-request-body.html#request-body-search-script-fields[Script fields].

 You can preview the contents of the {dfeed} by using the following API:

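
The `datafeed-test1` body referenced in this hunk is elided; the script field it describes, which sums two error-count fields, would be defined along these lines. The exact index and field names are assumptions:

[source,js]
----------------------------------
PUT _ml/datafeeds/datafeed-test1
{
  "job_id": "test1",
  "indices": ["my-index"],
  "script_fields": {
    "total_error_count": {
      "script": {
        "lang": "painless",
        "source": "doc['error_count'].value + doc['aborted_count'].value"
      }
    }
  }
}
----------------------------------
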
@@ -181,15 +183,15 @@ insufficient data to generate meaningful results.
 //For a full demonstration of
 //how to create jobs with sample data, see <<ml-getting-started>>.

-You can alternatively use {kib} to create an advanced job that uses script
-fields. To add the `script_fields` property to your {dfeed}, you must use the
-**Edit JSON** tab. For example:
+You can alternatively use {kib} to create an advanced {anomaly-job} that uses
+script fields. To add the `script_fields` property to your {dfeed}, you must use
+the **Edit JSON** tab. For example:

 [role="screenshot"]
 image::images/ml-scriptfields.jpg[Adding script fields to a {dfeed} in {kib}]

 [[ml-configuring-transform-examples]]
-==== Common Script Field Examples
+==== Common script field examples

 While the possibilities are limitless, there are a number of common scenarios
 where you might use script fields in your {dfeeds}.
@@ -199,7 +201,7 @@ where you might use script fields in your {dfeeds}.
 Some of these examples use regular expressions. By default, regular
 expressions are disabled because they circumvent the protection that Painless
 provides against long running and memory hungry scripts. For more information,
-see {ref}/modules-scripting-painless.html[Painless Scripting Language].
+see {ref}/modules-scripting-painless.html[Painless scripting language].

 Machine learning analysis is case sensitive. For example, "John" is considered
 to be different than "john". This is one reason you might consider using scripts