[[xpack-ml]]
= Machine Learning in the Elastic Stack

[partintro]
--
The {xpackml} features automate the analysis of time-series data by creating
accurate baselines of normal behavior in the data and identifying anomalous
patterns in that data.

Using proprietary {ml} algorithms, the {xpackml} features detect the following
circumstances, score them, and link them with statistically significant
influencers in the data:

* Anomalies related to temporal deviations in values, counts, or frequencies
* Statistical rarity
* Unusual behavior for a member of a population

Automated periodicity detection and quick adaptation to changing data ensure
that you don’t need to specify algorithms, models, or other data science-related
configurations to get the benefits of {ml}.

[float]
[[ml-intro]]
== Integration with the Elastic Stack

Machine learning is tightly integrated with the Elastic Stack. Data is pulled
from {es} for analysis, and anomaly results are displayed in {kib} dashboards.

[float]
[[ml-concepts]]
== Basic Concepts

A few concepts are core to {ml} in {xpack}. Understanding these concepts from
the outset makes the features considerably easier to learn.

Jobs::
Machine learning jobs contain the configuration information and metadata
necessary to perform an analytics task. For a list of the properties associated
with a job, see <<ml-job-resource, Job Resources>>.

{dfeeds-cap}::
Jobs can analyze either a one-off batch of data or a continuous stream of data
in real time. {dfeeds-cap} retrieve data from {es} for analysis. Alternatively,
you can <<ml-post-data,POST data>> from any source directly to an API.

Detectors::
As part of the configuration information that is associated with a job,
detectors define the type of analysis that needs to be done. They also specify
which fields to analyze. You can have more than one detector in a job, which
is more efficient than running multiple jobs against the same data. For a list
of the properties associated with detectors, see
<<ml-detectorconfig, Detector Configuration Objects>>.

Buckets::
The {xpackml} features use the concept of a bucket to divide the time
series into batches for processing. The _bucket span_ is part of the
configuration information for a job. It defines the time interval that is used
to summarize and model the data. This is typically between 5 minutes and 1 hour,
and it depends on your data characteristics. When you set the bucket span,
take into account the granularity at which you want to analyze, the frequency
of the input data, the typical duration of the anomalies, and the frequency at
which alerting is required.

Machine learning nodes::
A {ml} node is a node that has `xpack.ml.enabled` and `node.ml` set to `true`,
which is the default behavior. If you set `node.ml` to `false`, the node can
service API requests but it cannot run jobs. If you want to use {xpackml}
features, there must be at least one {ml} node in your cluster. For more
information about these settings, see <<ml-settings>>.
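
To show how jobs, detectors, and buckets fit together, the following Python
sketch builds a job configuration with two detectors and a bucket span, then
assigns a few timestamps to buckets. It is an illustration under stated
assumptions, not the product's implementation. The field names
(`analysis_config`, `bucket_span`, `detectors`, `function`, `field_name`,
`data_description`, `time_field`) are assumed to follow the job resource format
and should be checked against <<ml-job-resource, Job Resources>>. The
`responsetime` field, the sample timestamps, and the `parse_span` and
`bucket_start` helpers are hypothetical, and aligning buckets to multiples of
the span since the epoch is an assumption of this sketch.

```python
import json
from collections import Counter

# Hypothetical job configuration. Field names are assumed to follow the
# job resource format and should be verified against the API reference.
job_config = {
    "description": "Example job (illustrative only)",
    "analysis_config": {
        "bucket_span": "15m",
        # Two detectors in one job, rather than two jobs over the same data.
        "detectors": [
            {"function": "count"},
            {"function": "mean", "field_name": "responsetime"},
        ],
    },
    "data_description": {"time_field": "timestamp"},
}

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_span(span):
    """Convert a span such as '15m' into a number of seconds."""
    return int(span[:-1]) * UNIT_SECONDS[span[-1]]


def bucket_start(epoch_seconds, span_seconds):
    """Return the start time of the bucket that contains the timestamp.

    Assumes buckets are aligned to multiples of the span since the epoch.
    """
    return epoch_seconds - (epoch_seconds % span_seconds)


span_seconds = parse_span(job_config["analysis_config"]["bucket_span"])

# Hypothetical event timestamps, in seconds since the epoch.
events = [0, 10, 899, 900, 1799, 1800, 3000]
counts = Counter(bucket_start(t, span_seconds) for t in events)

print(json.dumps(job_config, indent=2))
print(sorted(counts.items()))
```

With a 15-minute span, the seven sample events fall into four buckets. A
shorter span gives finer granularity but leaves fewer events in each bucket,
which is the trade-off to weigh when you choose a bucket span.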
--

include::getting-started.asciidoc[]
// include::ml-scenarios.asciidoc[]
include::api-quickref.asciidoc[]
//include::troubleshooting.asciidoc[] Referenced from x-pack/docs/public/xpack-troubleshooting.asciidoc