[[ml-concepts]]
== Overview

There are a few concepts that are core to {ml} in {xpack}. Understanding these
concepts from the outset makes the learning process much easier.

[float]
[[ml-jobs]]
=== Jobs

Machine learning jobs contain the configuration information and metadata
necessary to perform an analytics task. For a list of the properties associated
with a job, see {ref}/ml-job-resource.html[Job Resources].

[float]
[[ml-dfeeds]]
=== {dfeeds-cap}

Jobs can analyze either a one-off batch of data or a continuous stream of data
in real time. {dfeeds-cap} retrieve data from {es} for analysis. Alternatively,
you can {ref}/ml-post-data.html[POST data] from any source directly to an API.

[float]
[[ml-detectors]]
=== Detectors

As part of the configuration information that is associated with a job,
detectors define the type of analysis that needs to be done. They also specify
which fields to analyze. You can have more than one detector in a job, which
is more efficient than running multiple jobs against the same data. For a list
of the properties associated with detectors, see
{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].

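As an illustration, the following Python sketch builds a job configuration in
which two detectors analyze the same data within one job. The property names
(`analysis_config`, `bucket_span`, `detectors`, `function`, `field_name`,
`by_field_name`, `data_description`, `time_field`) follow the job resource. The
field values (`responsetime`, `airline`, `timestamp`), the 5-minute bucket span,
and the `describe_detectors` helper are hypothetical and exist only for this
example.

```python
# Hypothetical job configuration: one job, two detectors over the same data.
job_config = {
    "description": "Example job (illustrative only)",
    "analysis_config": {
        "bucket_span": "5m",  # assumed value, chosen for the example
        "detectors": [
            # Detector 1: event rate across the whole data set.
            {"function": "count"},
            # Detector 2: mean of an assumed numeric field, split by an assumed field.
            {
                "function": "mean",
                "field_name": "responsetime",
                "by_field_name": "airline",
            },
        ],
    },
    "data_description": {"time_field": "timestamp"},
}


def describe_detectors(config):
    """Return one readable summary string per detector in the job."""
    summaries = []
    for detector in config["analysis_config"]["detectors"]:
        summary = detector["function"]
        if "field_name" in detector:
            summary += "(" + detector["field_name"] + ")"
        if "by_field_name" in detector:
            summary += " by " + detector["by_field_name"]
        summaries.append(summary)
    return summaries


for line in describe_detectors(job_config):
    print(line)
```

Both detectors share the job's bucket span and time field, which is why a single
job with several detectors avoids processing the same data more than once.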
[float]
[[ml-buckets]]
=== Buckets

The {xpackml} features use the concept of a bucket to divide the time
series into batches for processing. The _bucket span_ is part of the
configuration information for a job. It defines the time interval that is used
to summarize and model the data. This is typically between 5 minutes and 1 hour,
depending on your data characteristics. When you set the bucket span,
take into account the granularity at which you want to analyze, the frequency
of the input data, the typical duration of the anomalies, and the frequency at
which alerting is required.

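To make the idea concrete, here is a minimal Python sketch of how timestamped
events can be grouped into fixed-width buckets and summarized as a count per
bucket. It only illustrates the concept and is not the {xpackml}
implementation. The function names, the 5-minute span, and the sample
timestamps are assumptions chosen for the example.

```python
from collections import Counter

BUCKET_SPAN_SECONDS = 5 * 60  # assumed 5-minute bucket span


def bucket_start(epoch_seconds, span=BUCKET_SPAN_SECONDS):
    """Return the start time (epoch seconds) of the bucket containing the timestamp."""
    return epoch_seconds - (epoch_seconds % span)


def count_per_bucket(timestamps, span=BUCKET_SPAN_SECONDS):
    """Summarize events as a count per bucket, keyed by bucket start time."""
    return Counter(bucket_start(ts, span) for ts in timestamps)


# Sample event times in epoch seconds (made up for illustration).
events = [0, 10, 299, 300, 301, 900]
counts = count_per_bucket(events)

for start in sorted(counts):
    print(start, counts[start])
```

With a longer span, such as 1 hour, all six sample events fall into a single
bucket. This is the trade-off that the bucket span controls: a shorter span
gives finer granularity, while a longer span summarizes more data per bucket.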
[float]
[[ml-nodes]]
=== Machine learning nodes

A {ml} node is a node that has `xpack.ml.enabled` and `node.ml` set to `true`,
which is the default behavior. If you set `node.ml` to `false`, the node can
service API requests but it cannot run jobs. If you want to use {xpackml}
features, there must be at least one {ml} node in your cluster. For more
information about these settings, see <<xpack-settings>>.

[float]
[[ml-function-overview]]
=== Analytical functions

See <<ml-functions>>.