[[xpack-ml]]
= Machine Learning in the Elastic Stack

[partintro]
--
The {xpackml} features automate the analysis of time-series data by creating
accurate baselines of normal behaviors in the data and identifying anomalous
patterns in that data.

Proprietary {ml} algorithms detect the following circumstances, score them, and
link them with statistically significant influencers in the data:

* Anomalies related to temporal deviations in values, counts, or frequencies
* Statistical rarity
* Unusual behaviors for a member of a population

Automated periodicity detection and quick adaptation to changing data ensure
that you don’t need to specify algorithms, models, or other data science-related
configurations in order to get the benefits of {ml}.

[float]
[[ml-intro]]
== Integration with the Elastic Stack

Machine learning is tightly integrated with the Elastic Stack. Data is pulled
from {es} for analysis, and anomaly results are displayed in {kib} dashboards.

[float]
[[ml-concepts]]
== Basic Concepts

There are a few concepts that are core to {ml} in {xpack}. Understanding these
concepts from the outset makes {ml} much easier to learn.

Jobs::
  Machine learning jobs contain the configuration information and metadata
  necessary to perform an analytics task. For a list of the properties associated
  with a job, see <<ml-job-resource, Job Resources>>.

{dfeeds-cap}::
  Jobs can analyze either a one-off batch of data or a continuous stream of data
  in real time. {dfeeds-cap} retrieve data from {es} for analysis. Alternatively,
  you can <<ml-post-data,POST data>> from any source directly to an API.

Detectors::
  As part of the configuration information that is associated with a job,
  detectors define the type of analysis that needs to be done. They also specify
  which fields to analyze. You can have more than one detector in a job, which
  is more efficient than running multiple jobs against the same data. For a list
  of the properties associated with detectors,
  see <<ml-detectorconfig, Detector Configuration Objects>>.

Buckets::
  The {xpackml} features use the concept of a bucket to divide the time
  series into batches for processing. The _bucket span_ is part of the
  configuration information for a job. It defines the time interval that is used
  to summarize and model the data. This is typically between 5 minutes and
  1 hour, depending on your data characteristics. When you set the bucket span,
  take into account the granularity at which you want to analyze, the frequency
  of the input data, the typical duration of the anomalies, and the frequency at
  which alerting is required.

Machine learning nodes::
  A {ml} node is a node that has `xpack.ml.enabled` and `node.ml` set to `true`,
  which is the default behavior. If you set `node.ml` to `false`, the node can
  service API requests but it cannot run jobs. If you want to use {xpackml}
  features, there must be at least one {ml} node in your cluster. For more
  information about these settings, see <<ml-settings>>.
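
To make the job, detector, and bucket concepts more concrete, the following
Python sketch groups event timestamps into fixed-width buckets and builds an
illustrative job body with one bucket span and two detectors. It is a minimal
sketch under stated assumptions, not the algorithm that the {xpackml} features
use: the 15-minute span, the `responsetime` and `timestamp` field names, and the
helper functions `bucket_start` and `count_per_bucket` are all hypothetical and
chosen only for illustration. See <<ml-job-resource, Job Resources>> and
<<ml-detectorconfig, Detector Configuration Objects>> for the authoritative
property lists.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# Hypothetical choice for this sketch: 15-minute buckets.
BUCKET_SPAN_SECONDS = 15 * 60


def bucket_start(ts: datetime, span_seconds: int = BUCKET_SPAN_SECONDS) -> datetime:
    """Return the start time of the bucket that contains ts."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % span_seconds, tz=timezone.utc)


def count_per_bucket(timestamps, span_seconds: int = BUCKET_SPAN_SECONDS) -> Counter:
    """Summarize events as one count per bucket (illustration only)."""
    return Counter(bucket_start(ts, span_seconds) for ts in timestamps)


# Illustrative job body: one job, one bucket span, two detectors.
# Field names inside the detectors are assumptions about the input data.
job_body = {
    "description": "Hypothetical example job",
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
            {"function": "count"},
            {"function": "mean", "field_name": "responsetime"},
        ],
    },
    "data_description": {"time_field": "timestamp"},
}

events = [
    datetime(2017, 6, 1, 12, 1, tzinfo=timezone.utc),
    datetime(2017, 6, 1, 12, 14, tzinfo=timezone.utc),
    datetime(2017, 6, 1, 12, 16, tzinfo=timezone.utc),
]
counts = count_per_bucket(events)
for start, n in sorted(counts.items()):
    print(start.isoformat(), n)
print(json.dumps(job_body, indent=2))
```

In this sketch, the first two events fall into the bucket that starts at 12:00
and the third falls into the bucket that starts at 12:15. A shorter span gives
finer granularity but fewer events per bucket, which is the trade-off to weigh
when you set the bucket span.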


--

include::getting-started.asciidoc[]
// include::ml-scenarios.asciidoc[]
include::api-quickref.asciidoc[]
//include::troubleshooting.asciidoc[]  Referenced from x-pack/docs/public/xpack-troubleshooting.asciidoc