[[ml-concepts]]
== Overview

There are a few concepts that are core to {ml} in {xpack}. Understanding these
concepts from the outset makes the learning process much easier.

[float]
[[ml-jobs]]
=== Jobs

Machine learning jobs contain the configuration information and metadata
necessary to perform an analytics task. For a list of the properties associated
with a job, see <<ml-job-resource, Job Resources>>.

[float]
[[ml-dfeeds]]
=== {dfeeds-cap}

Jobs can analyze data either as a one-off batch or continuously in real time.
{dfeeds-cap} retrieve data from {es} for analysis. Alternatively, you can
<<ml-post-data,POST data>> from any source directly to an API.

[float]
[[ml-detectors]]
=== Detectors

As part of the configuration information that is associated with a job,
detectors define the type of analysis that needs to be done. They also specify
which fields to analyze. You can have more than one detector in a job, which
is more efficient than running multiple jobs against the same data. For a list
of the properties associated with detectors,
see <<ml-detectorconfig, Detector Configuration Objects>>.
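As a rough illustration, the following Python sketch builds a job configuration
body that holds two detectors under a single bucket span. The property names
(`analysis_config`, `bucket_span`, `detectors`, `function`, `field_name`,
`by_field_name`, `data_description`, `time_field`) are meant to follow the job
resource, but the helper `build_job_config` and the field names `responsetime`,
`airline`, and `timestamp` are hypothetical. Check the exact properties and
value formats against <<ml-job-resource, Job Resources>> for your version.

```python
import json


def build_job_config(bucket_span="5m"):
    """Return an illustrative job configuration with two detectors.

    The field names used by the detectors are assumptions about the data,
    not values taken from the documentation.
    """
    return {
        "analysis_config": {
            # One bucket span is shared by every detector in the job.
            "bucket_span": bucket_span,
            "detectors": [
                # Detector 1: mean of a (hypothetical) numeric field, split by airline.
                {
                    "function": "mean",
                    "field_name": "responsetime",
                    "by_field_name": "airline",
                },
                # Detector 2: event count over the same data.
                {"function": "count"},
            ],
        },
        # Hypothetical name of the time field in the input data.
        "data_description": {"time_field": "timestamp"},
    }


job_config = build_job_config()
print(json.dumps(job_config, indent=2))
```

Both detectors read the same input, which is the situation the paragraph above
describes as more efficient than running two separate jobs.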

[float]
[[ml-buckets]]
=== Buckets

The {xpackml} features use the concept of a bucket to divide the time
series into batches for processing. The _bucket span_ is part of the
configuration information for a job. It defines the time interval that is used
to summarize and model the data. This is typically between 5 minutes and 1 hour
and it depends on your data characteristics. When you set the bucket span,
take into account the granularity at which you want to analyze, the frequency
of the input data, the typical duration of the anomalies, and the frequency at
which alerting is required.
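To make the idea of a bucket concrete, here is a minimal Python sketch that
groups timestamped values into fixed-width buckets and summarizes each one with
a mean. It is only a conceptual illustration, not how {ml} models data
internally. It assumes timestamps in epoch seconds and a bucket span in
seconds, and the function names `bucket_start` and `summarize_by_bucket` are
hypothetical.

```python
from collections import defaultdict


def bucket_start(timestamp, bucket_span):
    """Return the start (epoch seconds) of the bucket that contains timestamp."""
    return timestamp - (timestamp % bucket_span)


def summarize_by_bucket(records, bucket_span):
    """Group (timestamp, value) pairs by bucket and return the mean per bucket."""
    grouped = defaultdict(list)
    for timestamp, value in records:
        grouped[bucket_start(timestamp, bucket_span)].append(value)
    # Sort by bucket start so the summary reads in time order.
    return {
        start: sum(values) / len(values)
        for start, values in sorted(grouped.items())
    }


# Five sample readings summarized with a 5-minute (300 second) bucket span.
records = [(0, 10.0), (120, 20.0), (299, 30.0), (300, 40.0), (610, 50.0)]
summary = summarize_by_bucket(records, bucket_span=300)
print(summary)
```

A shorter span produces more buckets with fewer points in each, and a longer
span smooths over short-lived changes, which is the trade-off to weigh when
choosing a bucket span.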

[float]
[[ml-nodes]]
=== Machine learning nodes

A {ml} node is a node that has `xpack.ml.enabled` and `node.ml` set to `true`,
which is the default behavior. If you set `node.ml` to `false`, the node can
service API requests but it cannot run jobs. If you want to use {xpackml}
features, there must be at least one {ml} node in your cluster. For more
information about these settings, see <<ml-settings>>.

include::functions.asciidoc[]

include::functions/count.asciidoc[]
include::functions/geo.asciidoc[]
include::functions/info.asciidoc[]
include::functions/metric.asciidoc[]
include::functions/rare.asciidoc[]
include::functions/sum.asciidoc[]
include::functions/time.asciidoc[]