[DOCS] Add ML documentation to master (elastic/x-pack-elasticsearch#959)
Original commit: elastic/x-pack-elasticsearch@666a10bd23
parent: d11fbfa70c
commit: 843a0d8b3f
@ -0,0 +1,84 @@
[[ml-api-quickref]]
== API Quick Reference

All {ml} endpoints have the following base:

----
/_xpack/ml/
----

The main {ml} resources can be accessed with a variety of endpoints:

* <<ml-api-jobs,+/anomaly_detectors/+>>: Create and manage {ml} jobs.
* <<ml-api-datafeeds,+/datafeeds/+>>: Retrieve data to be analyzed.
* <<ml-api-results,+/results/+>>: Access the results of a {ml} job.
* <<ml-api-snapshots,+/model_snapshots/+>>: Manage model snapshots.
* <<ml-api-validate,+/validate/+>>: Validate subsections of job configurations.
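
For example, the following request lists all jobs. It is a minimal sketch
that assumes at least one {ml} job already exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]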

[float]
[[ml-api-jobs]]
=== /anomaly_detectors/

* <<ml-put-job,POST /anomaly_detectors>>: Create a job
* <<ml-open-job,POST /anomaly_detectors/<job_id>/_open>>: Open a job
* <<ml-post-data,POST /anomaly_detectors/<job_id>/_data>>: Send data to a job
* <<ml-get-job,GET /anomaly_detectors>>: List jobs
* <<ml-get-job,GET /anomaly_detectors/<job_id+++>+++>>: Get job details
* <<ml-get-job-stats,GET /anomaly_detectors/<job_id>/_stats>>: Get job statistics
* <<ml-update-job,POST /anomaly_detectors/<job_id>/_update>>: Update certain properties of the job configuration
* <<ml-flush-job,POST /anomaly_detectors/<job_id>/_flush>>: Force a job to analyze buffered data
* <<ml-close-job,POST /anomaly_detectors/<job_id>/_close>>: Close a job
* <<ml-delete-job,DELETE /anomaly_detectors/<job_id+++>+++>>: Delete a job

[float]
[[ml-api-datafeeds]]
=== /datafeeds/

* <<ml-put-datafeed,PUT /datafeeds/<feed_id+++>+++>>: Create a data feed
* <<ml-start-datafeed,POST /datafeeds/<feed_id>/_start>>: Start a data feed
* <<ml-get-datafeed,GET /datafeeds>>: List data feeds
* <<ml-get-datafeed,GET /datafeeds/<feed_id+++>+++>>: Get data feed details
* <<ml-get-datafeed-stats,GET /datafeeds/<feed_id>/_stats>>: Get statistical information for data feeds
* <<ml-preview-datafeed,GET /datafeeds/<feed_id>/_preview>>: Get a preview of a data feed
* <<ml-update-datafeed,POST /datafeeds/<feed_id>/_update>>: Update certain settings for a data feed
* <<ml-stop-datafeed,POST /datafeeds/<feed_id>/_stop>>: Stop a data feed
* <<ml-delete-datafeed,DELETE /datafeeds/<feed_id+++>+++>>: Delete a data feed

[float]
[[ml-api-results]]
=== /results/

* <<ml-get-bucket,GET /results/buckets>>: List the buckets in the results
* <<ml-get-bucket,GET /results/buckets/<bucket_id+++>+++>>: Get bucket details
* <<ml-get-category,GET /results/categories>>: List the categories in the results
* <<ml-get-category,GET /results/categories/<category_id+++>+++>>: Get category details
* <<ml-get-influencer,GET /results/influencers>>: Get influencer details
* <<ml-get-record,GET /results/records>>: Get records from the results

[float]
[[ml-api-snapshots]]
=== /model_snapshots/

* <<ml-get-snapshot,GET /model_snapshots>>: List model snapshots
* <<ml-get-snapshot,GET /model_snapshots/<snapshot_id+++>+++>>: Get model snapshot details
* <<ml-revert-snapshot,POST /model_snapshots/<snapshot_id>/_revert>>: Revert a model snapshot
* <<ml-update-snapshot,POST /model_snapshots/<snapshot_id>/_update>>: Update certain settings for a model snapshot
* <<ml-delete-snapshot,DELETE /model_snapshots/<snapshot_id+++>+++>>: Delete a model snapshot

[float]
[[ml-api-validate]]
=== /validate/

* <<ml-valid-detector,POST /anomaly_detectors/_validate/detector>>: Validate a detector
* <<ml-valid-job,POST /anomaly_detectors/_validate>>: Validate a job

//[float]
//== Where to Go Next

//<<ml-getting-started, Getting Started>> :: Enable machine learning and start
//discovering anomalies in your data.

//[float]
//== Have Comments, Questions, or Feedback?

//Head over to our {forum}[Graph Discussion Forum] to share your experience, questions, and
//suggestions.
@ -0,0 +1,11 @@

[[ml-getting-started]]
== Getting Started

To start exploring anomalies in your data:

. Open Kibana in your web browser and log in. If you are running Kibana
locally, go to `http://localhost:5601/`.

. Click **ML** in the side navigation ...

//image::graph-open.jpg["Accessing Graph"]
@ -0,0 +1,23 @@

[[xpack-ml]]
= Machine Learning in the Elastic Stack

[partintro]
--
Data stored in {es} contains valuable insights into the behavior and
performance of your business and systems. However, the following questions can
be difficult to answer:

* Is the response time of my website unusual?
* Is a user exfiltrating an unusual amount of data?

The good news is that the {xpack} machine learning capabilities enable you to
easily answer these types of questions.
--

include::introduction.asciidoc[]
include::getting-started.asciidoc[]
include::ml-scenarios.asciidoc[]
include::api-quickref.asciidoc[]

//include::troubleshooting.asciidoc[] Referenced from x-pack/docs/public/xpack-troubleshooting.asciidoc
//include::release-notes.asciidoc[] Referenced from x-pack/docs/public/xpack-release-notes.asciidoc
@ -0,0 +1,34 @@

[[ml-introduction]]
== Introduction

Machine learning in {xpack} automates the analysis of time-series data by
creating accurate baselines of normal behavior in the data and identifying
anomalous patterns in that data.

Driven by proprietary machine learning algorithms, anomalies related to
temporal deviations in values, counts, or frequencies, statistical rarity, and
unusual behaviors for a member of a population are detected, scored, and
linked with statistically significant influencers in the data.

Automated periodicity detection and quick adaptation to changing data ensure
that you don’t need to specify algorithms, models, or other data
science-related configurations in order to get the benefits of {ml}.

//image::graph-network.jpg["Graph network"]

=== Integration with the Elastic Stack

Machine learning is tightly integrated with the Elastic Stack.
Data is pulled from {es} for analysis and anomaly results are displayed in
{kb} dashboards.

//[float]
//== Where to Go Next

//<<ml-getting-started, Getting Started>> :: Enable machine learning and start
//discovering anomalies in your data.

//[float]
//== Have Comments, Questions, or Feedback?

//Head over to our {forum}[Graph Discussion Forum] to share your experience, questions, and
//suggestions.
@ -0,0 +1,32 @@

[[ml-limitations]]
== Machine Learning Limitations

[float]
=== Misleading High Missing Field Counts
//See x-pack-elasticsearch/#684

One of the counts associated with a {ml} job is +missing_field_count+,
which indicates the number of records that are missing a configured field.
This information is most useful when your job analyzes CSV data. In that case,
missing fields indicate that data is not being analyzed and you might receive
poor results.

If your job analyzes JSON data, the +missing_field_count+ might be misleading.
Missing fields might be expected due to the structure of the data and
therefore do not generate poor results.
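
You can inspect this count through the job statistics. For example, the
following request (a minimal sketch that assumes a job named `it-ops-kpi`
exists) returns a `data_counts` object that includes `missing_field_count`:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/_stats
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]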

//When you refer to a file script in a watch, the watch itself is not updated
//if you change the script on the filesystem.

//Currently, the only way to reload a file script in a watch is to delete
//the watch and recreate it.

//=== The _data Endpoint Requires Data to be in JSON Format

//See x-pack-elasticsearch/#777

//=== TBD

//See x-pack-elasticsearch/#601
//When you use aggregations, you must ensure +size+ is configured correctly.
//Otherwise, not all data will be analyzed.
@ -0,0 +1,100 @@

[[ml-scenarios]]
== Use Cases

Enterprises, government organizations, and cloud-based service providers
process daily volumes of machine data so massive that real-time human
analysis is impossible. Changing behaviors hidden in this data provide the
information needed to quickly resolve a massive service outage, detect
security breaches before they result in the theft of millions of credit
records, or identify the next big trend in consumer patterns. Current search
and analysis, performance management, and cyber security tools are unable to
find these anomalies without significant human work in the form of
thresholds, rules, signatures, and data models.

By using advanced anomaly detection techniques that learn the normal behavior
patterns represented by the data and identify and cross-correlate anomalies,
performance, security, and operational anomalies and their causes can be
identified as they develop, so they can be acted on before they impact the
business.

While anomaly detection is applicable to any type of data, we focus on
machine data scenarios. Enterprise application developers, cloud service
providers, and technology vendors need to harness the power of machine
learning based anomaly detection analytics to better manage complex online
services, detect the earliest signs of advanced security threats, and gain
insight into the business opportunities and risks represented by changing
behaviors hidden in their massive data sets. Here are some real-world
examples.

=== Eliminating noise generated by threshold-based alerts

Modern IT systems are highly instrumented and can generate TBs of machine
data a day. Traditional methods for analyzing data involve alerting when
metric values exceed a known value (static thresholds) or looking for simple
statistical deviations (dynamic thresholds).

Setting accurate thresholds for each metric at different times of day is
practically impossible. As a result, static thresholds generate large volumes
of false positives (threshold set too low) and false negatives (threshold set
too high).

The {ml} features in {xpack} automatically learn and calculate the
probability of a value being anomalous based on its historical behavior.
This enables accurate alerting and highlights only the subset of relevant
metrics that have changed. These alerts provide actionable insight into what
is a growing mountain of data.

=== Reducing troubleshooting times and subject matter expert (SME) involvement

It is said that 75 percent of troubleshooting time is spent mining data to
try to identify the root cause of an incident. The {ml} features in {xpack}
automatically analyze data and boil down the massive volume of information
to the few metrics or log messages that have changed behavior.
This allows the subject matter experts (SMEs) to focus on the subset of
information that is relevant to an issue, which greatly reduces triage time.

//In a major credit services provider, within a month of deployment, the company
//reported that its overall time to triage was reduced by 70 percent and the use of
//outside SMEs’ time to troubleshoot was decreased by 80 percent.

=== Finding and fixing issues before they impact the end user

Large-scale systems, such as online banking, typically require complex
infrastructures involving hundreds of different interdependent applications.
Just accessing an account summary page might involve dozens of different
databases, systems, and applications.

Because of their importance to the business, these systems are typically
highly resilient and a critical problem will not be allowed to re-occur.
If a problem happens, it is likely to be complicated and to be the result of
a causal sequence of events that spans multiple interacting resources.
Troubleshooting would require the analysis of large volumes of data with a
wide range of characteristics and data types. A variety of experts from
multiple disciplines would need to participate in time-consuming “war rooms”
to mine the data for answers.

By using {ml} in real time, large volumes of data can be analyzed to provide
alerts on early indicators of problems and to highlight the events that were
likely to have contributed to the problem.

=== Finding rare events that may be symptomatic of a security issue

With several hundred servers under management, the presence of new processes
running might indicate a security breach.

Using typical operational management techniques, each server would require a
period of baselining in order to identify which processes are considered
standard. Ideally a baseline would be created for each server (or server
group) and would be periodically updated, making this a large management
overhead.

By using the {ml} features in {xpack}, baselines are automatically built
based upon the normal behavior patterns for each host and alerts are
generated when rare events occur.

=== Finding anomalies in periodic data

For data that has periodicity, it is difficult for standard monitoring tools
to accurately tell whether a change in data is due to a service outage or is
a result of usual time schedules. Daily and weekly trends in the data, along
with peak and off-peak hours, make it difficult to identify anomalies using
standard threshold-based methods. Minimum and maximum thresholds for SMS text
activity at 2am would be very different from the thresholds that would be
effective during the day.

By using {ml}, time-related trends are automatically identified and smoothed,
leaving the residual to be analyzed for anomalies.
@ -0,0 +1,12 @@

[[ml-release-notes]]
== Machine Learning Release Notes

[[ml-change-list]]
=== Change List

[float]
==== 5.4.0

May 2017

* Introduces Machine Learning in the Elastic Stack.
@ -0,0 +1,4 @@

[[ml-troubleshooting]]
== Machine Learning Troubleshooting

TBD
@ -0,0 +1,70 @@

[[ml-apis]]
== Machine Learning APIs

Use machine learning to detect anomalies in time series data.

* <<ml-api-datafeed-endpoint,Datafeeds>>
* <<ml-api-job-endpoint,Jobs>>
* <<ml-api-snapshot-endpoint,Model Snapshots>>
* <<ml-api-result-endpoint,Results>>
* <<ml-api-definitions,Definitions>>

[[ml-api-datafeed-endpoint]]
=== Datafeeds

include::ml/put-datafeed.asciidoc[]
include::ml/delete-datafeed.asciidoc[]
include::ml/get-datafeed.asciidoc[]
include::ml/get-datafeed-stats.asciidoc[]
include::ml/preview-datafeed.asciidoc[]
include::ml/start-datafeed.asciidoc[]
include::ml/stop-datafeed.asciidoc[]
include::ml/update-datafeed.asciidoc[]

[[ml-api-job-endpoint]]
=== Jobs

include::ml/close-job.asciidoc[]
include::ml/put-job.asciidoc[]
include::ml/delete-job.asciidoc[]
include::ml/get-job.asciidoc[]
include::ml/get-job-stats.asciidoc[]
include::ml/flush-job.asciidoc[]
include::ml/open-job.asciidoc[]
include::ml/post-data.asciidoc[]
include::ml/update-job.asciidoc[]
include::ml/validate-job.asciidoc[]
include::ml/validate-detector.asciidoc[]

[[ml-api-snapshot-endpoint]]
=== Model Snapshots

include::ml/delete-snapshot.asciidoc[]
include::ml/get-snapshot.asciidoc[]
include::ml/revert-snapshot.asciidoc[]
include::ml/update-snapshot.asciidoc[]

[[ml-api-result-endpoint]]
=== Results

include::ml/get-bucket.asciidoc[]
include::ml/get-category.asciidoc[]
include::ml/get-influencer.asciidoc[]
include::ml/get-record.asciidoc[]

[[ml-api-definitions]]
=== Definitions

include::ml/datafeedresource.asciidoc[]
include::ml/jobresource.asciidoc[]
include::ml/jobcounts.asciidoc[]
include::ml/snapshotresource.asciidoc[]
include::ml/resultsresource.asciidoc[]


//* <<ml-put-job>>
//* <<ml-delete-job>>
//* <<ml-get-job>>
//* <<ml-open-close-job>>
//* <<ml-flush-job>>
//* <<ml-post-data>>
@ -1,20 +0,0 @@

[[ml-api]]
== Machine Learning APIs

Use machine learning to detect anomalies in time series data.

//=== Job Management APIs
//* <<ml-put-job>>
//* <<ml-delete-job>>
//* <<ml-get-job>>
//* <<ml-open-close-job>>
//* <<ml-flush-job>>
//* <<ml-post-data>>


//include::ml/put-job.asciidoc[]
//include::ml/delete-job.asciidoc[]
//include::ml/get-job.asciidoc[]
//include::ml/open-close-job.asciidoc[]
//include::ml/flush-job.asciidoc[]
//include::ml/post-data.asciidoc[]
@ -0,0 +1,63 @@

[[ml-close-job]]
==== Close Jobs

An anomaly detection job must be opened in order for it to be ready to
receive and analyze data. A job can be opened and closed multiple times
throughout its lifecycle.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id>/_close`

===== Description

A job can be closed once all data has been analyzed.

When you close a job, it runs housekeeping tasks such as pruning the model
history, flushing buffers, calculating final results, and persisting the
internal models. Depending upon the size of the job, it could take several
minutes to close and the equivalent time to re-open.

Once closed, the anomaly detection job has almost no overhead on the cluster
(except for maintaining its metadata). A closed job cannot receive data or
perform analysis operations; however, you can still explore and navigate its
results.

//NOTE:
//OUTDATED?: If using the {prelert} UI, the job will be automatically closed when stopping a datafeed job.

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

===== Query Parameters

`close_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has closed

////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples

The following example closes the `event_rate` job:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_close
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is closed, you receive the following results:
----
{
  "closed": true
}
----
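
If the job is large and slow to close, you can adjust the `close_timeout`
query parameter described above. The following sketch waits up to ten
minutes; the `10m` duration format is an assumption:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_close?close_timeout=10m
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]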
@ -0,0 +1,10 @@

[[ml-datafeed-resource]]
==== Data Feed Resources

A data feed resource has the following properties:

TBD
////
`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data. See <<ml-analysisconfig, analysis configuration objects>>.
////
@ -0,0 +1,56 @@

[[ml-delete-datafeed]]
==== Delete Data Feeds

The delete data feed API allows you to delete an existing data feed.

===== Request

`DELETE _xpack/ml/datafeeds/<feed_id>`

////
===== Description

All job configuration, model state and results are deleted.

IMPORTANT: Deleting a job must be done via this API only. Do not delete the
job directly from the `.ml-*` indices using the Elasticsearch
DELETE Document API. When {security} is enabled, make sure no `write`
privileges are granted to anyone over the `.ml-*` indices.

Before you can delete a job, you must delete the data feeds that are associated with it.
//See <<>>.

It is not currently possible to delete multiple jobs using wildcards or a comma separated list.
////
===== Path Parameters

`feed_id` (required)::
(+string+) Identifier for the data feed

////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////

===== Examples

The following example deletes the `datafeed-it-ops` data feed:

[source,js]
--------------------------------------------------
DELETE _xpack/ml/datafeeds/datafeed-it-ops
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the data feed is deleted, you receive the following results:
----
{
  "acknowledged": true
}
----
@ -0,0 +1,55 @@

[[ml-delete-job]]
==== Delete Jobs

The delete job API allows you to delete an existing anomaly detection job.

===== Request

`DELETE _xpack/ml/anomaly_detectors/<job_id>`

===== Description

All job configuration, model state, and results are deleted.

IMPORTANT: Deleting a job must be done via this API only. Do not delete the
job directly from the `.ml-*` indices using the Elasticsearch
DELETE Document API. When {security} is enabled, make sure no `write`
privileges are granted to anyone over the `.ml-*` indices.

Before you can delete a job, you must delete the data feeds that are
associated with it.
//See <<>>.

It is not currently possible to delete multiple jobs using wildcards or a
comma-separated list.

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job
////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////

===== Examples

The following example deletes the `event_rate` job:

[source,js]
--------------------------------------------------
DELETE _xpack/ml/anomaly_detectors/event_rate
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is deleted, you receive the following results:
----
{
  "acknowledged": true
}
----
@ -0,0 +1,60 @@

[[ml-delete-snapshot]]
==== Delete Model Snapshots

The delete model snapshot API allows you to delete an existing model snapshot.

===== Request

`DELETE _xpack/ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>`

////
===== Description

All job configuration, model state and results are deleted.

IMPORTANT: Deleting a job must be done via this API only. Do not delete the
job directly from the `.ml-*` indices using the Elasticsearch
DELETE Document API. When {security} is enabled, make sure no `write`
privileges are granted to anyone over the `.ml-*` indices.

Before you can delete a job, you must delete the data feeds that are associated with it.
//See <<>>.

It is not currently possible to delete multiple jobs using wildcards or a comma separated list.
////
===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

`snapshot_id` (required)::
(+string+) Identifier for the model snapshot
////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)


===== Examples

The following example deletes the `event_rate` job:

[source,js]
--------------------------------------------------
DELETE _xpack/ml/anomaly_detectors/event_rate
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is deleted, you receive the following results:
----
{
  "acknowledged": true
}
----
////
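
===== Examples

The following example deletes a model snapshot. It is a minimal sketch; the
job name `event_rate` and the snapshot identifier `1491007364` are
assumptions:

[source,js]
--------------------------------------------------
DELETE _xpack/ml/anomaly_detectors/event_rate/model_snapshots/1491007364
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]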
@ -0,0 +1,49 @@

[[ml-flush-job]]
==== Flush Jobs

The flush job API forces any buffered data to be processed by the {ml} job.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id>/_flush`

===== Description

The flush job API is only applicable when sending data for analysis using the
POST `_data` API. Depending on the content of the buffer, it might
additionally calculate new results.

Both flush and close operations are similar; however, flushing is more
efficient if you expect to send more data for analysis. When flushing, the
job remains open and is available to continue analyzing data. A close
operation additionally prunes and persists the model state to disk, and the
job must be opened again before analyzing further data.

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

===== Query Parameters

`calc_interim`::
(+boolean+; default: ++false++) If true, calculates interim
results for the most recent bucket or all buckets within the latency period

`start`::
(+string+; default: ++null++) When used in conjunction with `calc_interim`,
specifies the range of buckets on which to calculate interim results

`end`::
(+string+; default: ++null++) When used in conjunction with `calc_interim`,
specifies the range of buckets on which to calculate interim results

`advance_time`::
(+string+; default: ++null++) Specifies that no data prior to the date
`advance_time` is expected

////
===== Responses
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
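
===== Examples

The following example flushes a job and calculates interim results for the
most recent bucket. It is a minimal sketch that assumes a job named
`event_rate` exists and has buffered data:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_flush?calc_interim=true
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]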
@ -0,0 +1,86 @@

[[ml-get-bucket]]
==== Get Buckets

The get bucket API allows you to retrieve information about buckets in the
results from a job.

===== Request

`GET _xpack/ml/anomaly_detectors/<job_id>/results/buckets` +

`GET _xpack/ml/anomaly_detectors/<job_id>/results/buckets/<timestamp>`
////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job

`timestamp`::
(+string+) The timestamp of a single bucket result. If you do not specify
this optional parameter, the API returns information about all buckets that
you have authority to view in the job.

////
===== Results

The API returns information about the job resource. For more information, see
<<ml-job-resource,job resources>>.

===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "description": "First simple job",
      "create_time": 1491007356077,
      "finished_time": 1491007365347,
      "analysis_config": {
        "bucket_span": "5m",
        "latency": "0ms",
        "summary_count_field_name": "doc_count",
        "detectors": [
          {
            "detector_description": "low_sum(events_per_min)",
            "function": "low_sum",
            "field_name": "events_per_min",
            "detector_rules": []
          }
        ],
        "influencers": [],
        "use_per_partition_normalization": false
      },
      "data_description": {
        "time_field": "@timestamp",
        "time_format": "epoch_ms"
      },
      "model_plot_config": {
        "enabled": true
      },
      "model_snapshot_retention_days": 1,
      "model_snapshot_id": "1491007364",
      "results_index_name": "shared"
    }
  ]
}
----
////
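
===== Examples

The following example retrieves all bucket results for a job. It is a minimal
sketch that assumes a job named `it-ops-kpi` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/results/buckets
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]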
@ -0,0 +1,86 @@

[[ml-get-category]]
==== Get Categories

The get categories API allows you to retrieve information about the
categories in the results for a job.

===== Request

`GET _xpack/ml/anomaly_detectors/<job_id>/results/categories` +

`GET _xpack/ml/anomaly_detectors/<job_id>/results/categories/<category_id>`
////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job.

`category_id`::
(+string+) Identifier for the category. If you do not specify this optional
parameter, the API returns information about all categories that you have
authority to view.

////
===== Results

The API returns information about the job resource. For more information, see
<<ml-job-resource,job resources>>.

===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "description": "First simple job",
      "create_time": 1491007356077,
      "finished_time": 1491007365347,
      "analysis_config": {
        "bucket_span": "5m",
        "latency": "0ms",
        "summary_count_field_name": "doc_count",
        "detectors": [
          {
            "detector_description": "low_sum(events_per_min)",
            "function": "low_sum",
            "field_name": "events_per_min",
            "detector_rules": []
          }
        ],
        "influencers": [],
        "use_per_partition_normalization": false
      },
      "data_description": {
        "time_field": "@timestamp",
        "time_format": "epoch_ms"
      },
      "model_plot_config": {
        "enabled": true
      },
      "model_snapshot_retention_days": 1,
      "model_snapshot_id": "1491007364",
      "results_index_name": "shared"
    }
  ]
}
----
////
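
===== Examples

The following example retrieves all category results for a job. It is a
minimal sketch that assumes a job named `it-ops-kpi` exists and performs
categorization:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/results/categories
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]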
@ -0,0 +1,105 @@

[[ml-get-datafeed-stats]]
==== Get Data Feed Statistics

The get data feed statistics API allows you to retrieve usage information for
data feeds.

===== Request

`GET _xpack/ml/datafeeds/_stats` +

`GET _xpack/ml/datafeeds/<feed_id>/_stats`

////
===== Description

TBD
////
===== Path Parameters

`feed_id`::
(+string+) Identifier for the data feed. If you do not specify this optional
parameter, the API returns information about all data feeds that you have
authority to view.

////
===== Results

The API returns the following usage information:

`job_id`::
(+string+) A numerical character string that uniquely identifies the job.

`data_counts`::
(+object+) An object that describes the number of records processed and any related error counts.
See <<ml-datacounts,data counts objects>>.

`model_size_stats`::
(+object+) An object that provides information about the size and contents of the model.
See <<ml-modelsizestats,model size stats objects>>

`state`::
(+string+) The status of the job, which can be one of the following values:
running:: The job is actively receiving and processing data.
closed:: The job finished successfully with its model state persisted.
The job is still available to accept further data. NOTE: If you send data in a periodic cycle
and close the job at the end of each transaction, the job is marked as closed in the intervals
between when data is sent. For example, if data is sent every minute and it takes 1 second to process,
the job has a closed state for 59 seconds.
failed:: The job did not finish successfully due to an error. NOTE: This can occur due to invalid input data.
In this case, sending corrected data to a failed job re-opens the job and resets it to a running state.

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "data_counts": {
        "job_id": "it-ops",
        "processed_record_count": 43272,
        "processed_field_count": 86544,
        "input_bytes": 2846163,
        "input_field_count": 86544,
        "invalid_date_count": 0,
        "missing_field_count": 0,
        "out_of_order_timestamp_count": 0,
        "empty_bucket_count": 0,
        "sparse_bucket_count": 0,
        "bucket_count": 4329,
        "earliest_record_timestamp": 1454020560000,
        "latest_record_timestamp": 1455318900000,
        "last_data_time": 1491235405945,
        "input_record_count": 43272
      },
      "model_size_stats": {
        "job_id": "it-ops",
        "result_type": "model_size_stats",
        "model_bytes": 25586,
        "total_by_field_count": 3,
        "total_over_field_count": 0,
        "total_partition_field_count": 2,
        "bucket_allocation_failures_count": 0,
        "memory_status": "ok",
        "log_time": 1491235406000,
        "timestamp": 1455318600000
      },
      "state": "closed"
    }
  ]
}
----
////
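
===== Examples

The following example retrieves usage information for a data feed. It is a
minimal sketch that assumes a data feed named `datafeed-it-ops` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/datafeeds/datafeed-it-ops/_stats
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]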
@ -0,0 +1,92 @@

[[ml-get-datafeed]]
==== Get Data Feeds

The get data feeds API allows you to retrieve configuration information about
data feeds.

===== Request

`GET _xpack/ml/datafeeds/` +

`GET _xpack/ml/datafeeds/<feed_id>`
////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`feed_id`::
(+string+) Identifier for the data feed. If you do not specify this optional
parameter, the API returns information about all data feeds that you have
authority to view.

===== Results

The API returns information about the data feed resource.
//For more information, see <<ml-job-resource,job resources>>.

////
===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples
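
The following request retrieves the configuration of a single data feed. It
is a minimal sketch that assumes a data feed named `datafeed-it-ops` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/datafeeds/datafeed-it-ops
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]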

.Example results for a single data feed
----
{
  "count": 1,
  "datafeeds": [
    {
      "datafeed_id": "datafeed-it-ops",
      "job_id": "it-ops",
      "query_delay": "60s",
      "frequency": "150s",
      "indexes": [
        "it_ops_metrics"
      ],
      "types": [
        "network",
        "kpi",
        "sql"
      ],
      "query": {
        "match_all": {
          "boost": 1
        }
      },
      "aggregations": {
        "@timestamp": {
          "histogram": {
            "field": "@timestamp",
            "interval": 30000,
            "offset": 0,
            "order": {
              "_key": "asc"
            },
            "keyed": false,
            "min_doc_count": 0
          },
          "aggregations": {
            "events_per_min": {
              "sum": {
                "field": "events_per_min"
              }
            }
          }
        }
      },
      "scroll_size": 1000
    }
  ]
}
----
@ -0,0 +1,81 @@

[[ml-get-influencer]]
==== Get Influencers

The get influencers API allows you to retrieve information about the
influencers in a job.

===== Request

`GET _xpack/ml/anomaly_detectors/<job_id>/results/influencers`

////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job.

////
===== Results

The API returns information about the job resource. For more information, see
<<ml-job-resource,job resources>>.

===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "description": "First simple job",
      "create_time": 1491007356077,
      "finished_time": 1491007365347,
      "analysis_config": {
        "bucket_span": "5m",
        "latency": "0ms",
        "summary_count_field_name": "doc_count",
        "detectors": [
          {
            "detector_description": "low_sum(events_per_min)",
            "function": "low_sum",
            "field_name": "events_per_min",
            "detector_rules": []
          }
        ],
        "influencers": [],
        "use_per_partition_normalization": false
      },
      "data_description": {
        "time_field": "@timestamp",
        "time_format": "epoch_ms"
      },
      "model_plot_config": {
        "enabled": true
      },
      "model_snapshot_retention_days": 1,
      "model_snapshot_id": "1491007364",
      "results_index_name": "shared"
    }
  ]
}
----
////
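
===== Examples

The following example retrieves influencer results for a job. It is a minimal
sketch that assumes a job named `it-ops-kpi` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/results/influencers
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]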
@ -0,0 +1,103 @@

[[ml-get-job-stats]]
==== Get Job Statistics

The get job statistics API allows you to retrieve usage information for jobs.

===== Request

`GET _xpack/ml/anomaly_detectors/_stats` +

`GET _xpack/ml/anomaly_detectors/<job_id>/_stats`

////
===== Description

TBD
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job. If you do not specify this optional
parameter, the API returns information about all jobs that you have authority
to view.

===== Results

The API returns the following usage information:

`job_id`::
(+string+) A numerical character string that uniquely identifies the job.

`data_counts`::
(+object+) An object that describes the number of records processed and any
related error counts. See <<ml-datacounts,data counts objects>>.

`model_size_stats`::
(+object+) An object that provides information about the size and contents of
the model. See <<ml-modelsizestats,model size stats objects>>.

`state`::
(+string+) The status of the job, which can be one of the following values:
`running`::: The job is actively receiving and processing data.
`closed`::: The job finished successfully with its model state persisted. The
job is still available to accept further data. NOTE: If you send data in a
periodic cycle and close the job at the end of each transaction, the job is
marked as closed in the intervals between when data is sent. For example, if
data is sent every minute and it takes 1 second to process, the job has a
closed state for 59 seconds.
`failed`::: The job did not finish successfully due to an error. NOTE: This
can occur due to invalid input data. In this case, sending corrected data to
a failed job re-opens the job and resets it to a running state.

////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples
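
The following request retrieves usage information for a single job. It is a
minimal sketch that assumes a job named `it-ops-kpi` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/_stats
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]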

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "data_counts": {
        "job_id": "it-ops",
        "processed_record_count": 43272,
        "processed_field_count": 86544,
        "input_bytes": 2846163,
        "input_field_count": 86544,
        "invalid_date_count": 0,
        "missing_field_count": 0,
        "out_of_order_timestamp_count": 0,
        "empty_bucket_count": 0,
        "sparse_bucket_count": 0,
        "bucket_count": 4329,
        "earliest_record_timestamp": 1454020560000,
        "latest_record_timestamp": 1455318900000,
        "last_data_time": 1491235405945,
        "input_record_count": 43272
      },
      "model_size_stats": {
        "job_id": "it-ops",
        "result_type": "model_size_stats",
        "model_bytes": 25586,
        "total_by_field_count": 3,
        "total_over_field_count": 0,
        "total_partition_field_count": 2,
        "bucket_allocation_failures_count": 0,
        "memory_status": "ok",
        "log_time": 1491235406000,
        "timestamp": 1455318600000
      },
      "state": "closed"
    }
  ]
}
----
@ -0,0 +1,82 @@

[[ml-get-job]]
==== Get Job Details

The get jobs API allows you to retrieve configuration information about jobs.

===== Request

`GET _xpack/ml/anomaly_detectors/` +

`GET _xpack/ml/anomaly_detectors/<job_id>`
////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job. If you do not specify this optional
parameter, the API returns information about all jobs that you have authority
to view.

===== Results

The API returns information about the job resource. For more information, see
<<ml-job-resource,job resources>>.

////
===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples
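
The following request retrieves the configuration of a single job. It is a
minimal sketch that assumes a job named `it-ops-kpi` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]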

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "description": "First simple job",
      "create_time": 1491007356077,
      "finished_time": 1491007365347,
      "analysis_config": {
        "bucket_span": "5m",
        "latency": "0ms",
        "summary_count_field_name": "doc_count",
        "detectors": [
          {
            "detector_description": "low_sum(events_per_min)",
            "function": "low_sum",
            "field_name": "events_per_min",
            "detector_rules": []
          }
        ],
        "influencers": [],
        "use_per_partition_normalization": false
      },
      "data_description": {
        "time_field": "@timestamp",
        "time_format": "epoch_ms"
      },
      "model_plot_config": {
        "enabled": true
      },
      "model_snapshot_retention_days": 1,
      "model_snapshot_id": "1491007364",
      "results_index_name": "shared"
    }
  ]
}
----
@ -0,0 +1,81 @@

[[ml-get-record]]
==== Get Records

The get records API allows you to retrieve records from the results that were
generated by a job.

===== Request

`GET _xpack/ml/anomaly_detectors/<job_id>/results/records`

////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job.

////
===== Results

The API returns information about the job resource. For more information, see
<<ml-job-resource,job resources>>.

===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "description": "First simple job",
      "create_time": 1491007356077,
      "finished_time": 1491007365347,
      "analysis_config": {
        "bucket_span": "5m",
        "latency": "0ms",
        "summary_count_field_name": "doc_count",
        "detectors": [
          {
            "detector_description": "low_sum(events_per_min)",
            "function": "low_sum",
            "field_name": "events_per_min",
            "detector_rules": []
          }
        ],
        "influencers": [],
        "use_per_partition_normalization": false
      },
      "data_description": {
        "time_field": "@timestamp",
        "time_format": "epoch_ms"
      },
      "model_plot_config": {
        "enabled": true
      },
      "model_snapshot_retention_days": 1,
      "model_snapshot_id": "1491007364",
      "results_index_name": "shared"
    }
  ]
}
----
////
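
===== Examples

The following example retrieves record results for a job. It is a minimal
sketch that assumes a job named `it-ops-kpi` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/results/records
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]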
@ -0,0 +1,86 @@

[[ml-get-snapshot]]
==== Get Model Snapshots

The get model snapshots API allows you to retrieve information about model
snapshots.

===== Request

`GET _xpack/ml/anomaly_detectors/<job_id>/model_snapshots` +

`GET _xpack/ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>`
////
===== Description

OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
////
===== Path Parameters

`job_id`::
(+string+) Identifier for the job.

`snapshot_id`::
(+string+) Identifier for the model snapshot. If you do not specify this
optional parameter, the API returns information about all model snapshots
that you have authority to view.

////
===== Results

The API returns information about the job resource. For more information, see
<<ml-job-resource,job resources>>.

===== Query Parameters

`_stats`::
(+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

.Example results for a single job
----
{
  "count": 1,
  "jobs": [
    {
      "job_id": "it-ops-kpi",
      "description": "First simple job",
      "create_time": 1491007356077,
      "finished_time": 1491007365347,
      "analysis_config": {
        "bucket_span": "5m",
        "latency": "0ms",
        "summary_count_field_name": "doc_count",
        "detectors": [
          {
            "detector_description": "low_sum(events_per_min)",
            "function": "low_sum",
            "field_name": "events_per_min",
            "detector_rules": []
          }
        ],
        "influencers": [],
        "use_per_partition_normalization": false
      },
      "data_description": {
        "time_field": "@timestamp",
        "time_format": "epoch_ms"
      },
      "model_plot_config": {
        "enabled": true
      },
      "model_snapshot_retention_days": 1,
      "model_snapshot_id": "1491007364",
      "results_index_name": "shared"
    }
  ]
}
----
////
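
===== Examples

The following example lists the model snapshots for a job. It is a minimal
sketch that assumes a job named `it-ops-kpi` exists:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/it-ops-kpi/model_snapshots
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]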
@ -0,0 +1,120 @@

[[ml-jobcounts]]
==== Job Counts

The `data_counts` object provides information about the operational progress
of a job. It describes the number of records processed and any related error
counts.

NOTE: Job count values are cumulative for the lifetime of a job. If a model
snapshot is reverted or old results are deleted, the job counts are not reset.

[[ml-datacounts]]
===== Data Counts Objects

A `data_counts` object has the following properties:

`job_id`::
(+string+) A numerical character string that uniquely identifies the job.

`processed_record_count`::
(+long+) The number of records that have been processed by the job.
This value includes records with missing fields, since they are nonetheless
analyzed. The following records are not processed:
* Records not in chronological order and outside the latency window
* Records with invalid timestamps
* Records filtered by an exclude transform

`processed_field_count`::
(+long+) The total number of fields in all the records that have been
processed by the job. Only fields that are specified in the detector
configuration object contribute to this count. The timestamp is not included
in this count.

`input_bytes`::
(+long+) The number of raw bytes read by the job.

`input_field_count`::
(+long+) The total number of record fields read by the job. This count
includes fields that are not used in the analysis.

`invalid_date_count`::
(+long+) The number of records with either a missing date field or a date
that could not be parsed.

`missing_field_count`::
(+long+) The number of records that are missing a field that the job is
configured to analyze. Records with missing fields are still processed
because it is possible that not all fields are missing. The value of
`processed_record_count` includes this count.

`out_of_order_timestamp_count`::
(+long+) The number of records that are out of time sequence and outside of
the latency window. These records are discarded, since jobs require time
series data to be in ascending chronological order.

`empty_bucket_count`::
TBD

`sparse_bucket_count`::
TBD

`bucket_count`::
(+long+) The number of bucket results produced by the job.

`earliest_record_timestamp`::
(+string+) The timestamp of the earliest chronologically ordered record.
The datetime string is in ISO 8601 format.

`latest_record_timestamp`::
(+string+) The timestamp of the last chronologically ordered record.
If the records are not in strict chronological order, this value might not be
the same as the timestamp of the last record. The datetime string is in ISO
8601 format.

`last_data_time`::
TBD

`input_record_count`::
(+long+) The number of data records read by the job.
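
For reference, the following sketch shows where these properties appear in a
`data_counts` object; the values are copied from the job statistics example
elsewhere in these docs:

----
"data_counts": {
  "job_id": "it-ops",
  "processed_record_count": 43272,
  "processed_field_count": 86544,
  "input_bytes": 2846163,
  "input_field_count": 86544,
  "invalid_date_count": 0,
  "missing_field_count": 0,
  "out_of_order_timestamp_count": 0,
  "empty_bucket_count": 0,
  "sparse_bucket_count": 0,
  "bucket_count": 4329,
  "earliest_record_timestamp": 1454020560000,
  "latest_record_timestamp": 1455318900000,
  "last_data_time": 1491235405945,
  "input_record_count": 43272
}
----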
[[ml-modelsizestats]]
|
||||
===== Model Size Stats Objects
|
||||
|
||||
The `model_size_stats` object has the following properties:
|
||||
|
||||
`job_id`::
|
||||
(+string+) A numerical character string that uniquely identifies the job.
|
||||
|
||||
`result_type`::
|
||||
TBD
|
||||
|
||||
`model_bytes`::
|
||||
(+long+) The number of bytes of memory used by the models. This is the maximum value since the
|
||||
last time the model was persisted. If the job is closed, this value indicates the latest size.
|
||||
|
||||
`total_by_field_count`::
|
||||
(+long+) The number of `by` field values that were analyzed by the models.
|
||||
|
||||
NOTE: The `by` field values are counted separately for each detector and partition.
|
||||
|
||||
|
||||
`total_over_field_count`::
|
||||
(+long+) The number of `over` field values that were analyzed by the models.
|
||||
|
||||
NOTE: The `over` field values are counted separately for each detector and partition.
|
||||
|
||||
`total_partition_field_count`::
|
||||
(+long+) The number of `partition` field values that were analyzed by the models.
|
||||
|
||||
`bucket_allocation_failures_count`::
|
||||
TBD
|
||||
|
||||
`memory_status`::
|
||||
(+string+) The status of the mathematical models. This property can have one of the following values:
|
||||
"ok":: The models stayed below the configured value.
|
||||
"soft_limit":: The models used more than 60% of the configured memory limit and older unused models will
|
||||
be pruned to free up space.
|
||||
"hard_limit":: The models used more space than the configured memory limit. As a result,
|
||||
not all incoming data was processed.
|
||||
|
||||
`log_time`::
|
||||
TBD
|
||||
|
||||
`timestamp`::
|
||||
TBD
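
Similarly, a `model_size_stats` object might look like the following sketch. All values
are illustrative, and the `result_type` value shown is an assumption:

----
{
  "job_id": "it-ops-kpi",
  "result_type": "model_size_stats",
  "model_bytes": 100393,
  "total_by_field_count": 3,
  "total_over_field_count": 0,
  "total_partition_field_count": 2,
  "bucket_allocation_failures_count": 0,
  "memory_status": "ok"
}
----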
@ -0,0 +1,243 @@
[[ml-job-resource]]
==== Job Resources

A job resource has the following properties:

`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data. See <<ml-analysisconfig, analysis configuration objects>>.

`analysis_limits`::
(+object+) Defines limits on the number of field values and time buckets to be analyzed.
See <<ml-apilimits,analysis limits>>.

`create_time`::
(+string+) The time the job was created, in milliseconds since the epoch. For example, `1491007356077`.

`data_description`::
(+object+) Describes the data format and how APIs parse timestamp fields. See <<ml-datadescription,data description objects>>.

`description`::
(+string+) An optional description of the job.

`finished_time`::
(+string+) If the job closed or failed, this is the time the job finished, in milliseconds since the epoch.
Otherwise, it is `null`. For example, `1491007365347`.

`job_id`::
(+string+) A numerical character string that uniquely identifies the job.

`model_plot_config`:: TBD
`enabled`:: TBD. For example, `true`.

`model_snapshot_id`::
TBD. For example, `1491007364`.

`model_snapshot_retention_days`::
(+long+) The time in days that model snapshots are retained for the job. Older snapshots are deleted.
The default value is 1 day.

`results_index_name`::
TBD. For example, `shared`.

[[ml-analysisconfig]]
===== Analysis Configuration Objects

An analysis configuration object has the following properties:

`batch_span`::
(+unsigned integer+) The interval into which to batch seasonal data, measured in seconds.
This is an advanced option which is usually left as the default value.
////
Requires `period` to be specified
////

`bucket_span`::
(+unsigned integer+, required) The size of the interval that the analysis is aggregated into, measured in seconds.
The default value is 300 seconds (5 minutes).

`categorization_field_name`::
(+string+) If not null, the values of the specified field will be categorized.
The resulting categories can be used in a detector by setting `by_field_name`,
`over_field_name`, or `partition_field_name` to the keyword `prelertcategory`.

`categorization_filters`::
(+array of strings+) If `categorization_field_name` is specified, you can also define optional filters.
This property expects an array of regular expressions.
The expressions are used to filter out matching sequences from the categorization field values.
This functionality is useful for fine-tuning categorization by excluding sequences
that should not be taken into consideration when categories are defined.
For example, you can exclude SQL statements that appear in your log files.

`detectors`::
(+array+, required) An array of detector configuration objects,
which describe the anomaly detectors that are used in the job.
See <<ml-detectorconfig,detector configuration objects>>.

NOTE: If the `detectors` array does not contain at least one detector, no analysis can occur
and an error is returned.

`influencers`::
(+array of strings+) A comma separated list of influencer field names.
Typically these can be the by, over, or partition fields that are used in the detector configuration.
You might also want to use a field name that is not specifically named in a detector,
but is available as part of the input data. When you use multiple detectors,
the use of influencers is recommended as it aggregates results for each influencer entity.

`latency`::
(+unsigned integer+) The size of the window, in seconds, in which to expect data that is out of time order.
The default value is 0 seconds (no latency).

NOTE: Latency is only applicable when you send data by using the <<ml-post-data, Post Data to Jobs>> API.

`multivariate_by_fields`::
(+boolean+) If set to `true`, the analysis will automatically find correlations
between metrics for a given `by` field value and report anomalies when those
correlations cease to hold. For example, suppose CPU and memory usage on host A
is usually highly correlated with the same metrics on host B. Perhaps this
correlation occurs because they are running a load-balanced application.
If you enable this property, then anomalies will be reported when, for example,
CPU usage on host A is high and the value of CPU usage on host B is low.
That is to say, you'll see an anomaly when the CPU of host A is unusual given the CPU of host B.

NOTE: To use the `multivariate_by_fields` property, you must also specify `by_field_name` in your detector.

`overlapping_buckets`::
(+boolean+) If set to `true`, an additional analysis occurs that runs out of phase by half a bucket length.
This requires more system resources and enhances detection of anomalies that span bucket boundaries.

`period`::
(+unsigned integer+) The repeat interval for periodic data in multiples of `batch_span`.
If this property is not specified, daily and weekly periodicity are automatically determined.
This is an advanced option which is usually left as the default value.

`summary_count_field_name`::
(+string+) If not null, the data fed to the job is expected to be pre-summarized.
This property value is the name of the field that contains the count of raw data points that have been summarized.
The same `summary_count_field_name` applies to all detectors in the job.

NOTE: The `summary_count_field_name` property cannot be used with the `metric` function.

`use_per_partition_normalization`::
TBD
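
To make the shape concrete, the following sketch shows an analysis configuration object
that mirrors the <<ml-put-job,create jobs>> example, with a hypothetical `host` influencer
field added:

----
{
  "bucket_span": "5m",
  "latency": "0ms",
  "detectors": [
    {
      "detector_description": "low_sum(events_per_min)",
      "function": "low_sum",
      "field_name": "events_per_min"
    }
  ],
  "influencers": [ "host" ]
}
----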

[[ml-detectorconfig]]
===== Detector Configuration Objects

Detector configuration objects specify which data fields a job analyzes.
They also specify which analytical functions are used.
You can specify multiple detectors for a job.
Each detector has the following properties:

`by_field_name`::
(+string+) The field used to split the data.
In particular, this property is used for analyzing the splits with respect to their own history.
It is used for finding unusual values in the context of the split.

`detector_description`::
(+string+) A description of the detector. For example, `low_sum(events_per_min)`.

`detector_rules`::
TBD

`exclude_frequent`::
(+string+) Contains one of the following values: `all`, `none`, `by`, or `over`.
If set, frequent entities are excluded from influencing the anomaly results.
Entities can be considered frequent over time or frequent in a population.
If you are working with both over and by fields, then you can set `exclude_frequent`
to `all` for both fields, or to `by` or `over` for those specific fields.

`field_name`::
(+string+) The field that the detector uses in the function. If you use an event rate
function such as `count` or `rare`, do not specify this field.

NOTE: The `field_name` cannot contain double quotes or backslashes.

`function`::
(+string+, required) The analysis function that is used.
For example, `count`, `rare`, `mean`, `min`, `max`, or `sum`.
The default function is `metric`, which looks for anomalies in all of `min`, `max`,
and `mean`.

NOTE: You cannot use the `metric` function with pre-summarized input. If `summary_count_field_name`
is not null, you must specify a function other than `metric`.

`over_field_name`::
(+string+) The field used to split the data.
In particular, this property is used for analyzing the splits with respect to the history of all splits.
It is used for finding unusual values in the population of all splits.

`partition_field_name`::
(+string+) The field used to segment the analysis.
When you use this property, you have completely independent baselines for each value of this field.

`use_null`::
(+boolean+) Defines whether a new series is used as the null series
when there is no value for the by or partition fields. The default value is `false`.

IMPORTANT: Field names are case sensitive. For example, a field named 'Bytes' is different from one named 'bytes'.
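
For example, the following sketch shows a detector that mirrors the <<ml-put-job,create jobs>>
example, split on a hypothetical `host` field so that each host is modeled against its own history:

----
{
  "detector_description": "low_sum(events_per_min)",
  "function": "low_sum",
  "field_name": "events_per_min",
  "by_field_name": "host"
}
----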

[[ml-datadescription]]
===== Data Description Objects

The data description settings define the format of the input data.

When data is read from Elasticsearch, the datafeed must be configured.
This defines which index data will be taken from, and over what time period.

When data is received via the <<ml-post-data, Post Data to Jobs>> API,
you must specify the data format (for example, JSON or CSV). In this scenario,
the data posted is not stored in Elasticsearch. Only the results for anomaly detection are retained.

When you create a job, by default it accepts data in tab-separated-values format and expects
an epoch time value in a field named `time`. The `time` field must be measured in seconds from the epoch.
If, however, your data is not in this format, you can provide a data description object that specifies the
format of your data.

A data description object has the following properties:

`field_delimiter`::
TBD

`format`::
TBD

`time_field`::
(+string+) The name of the field that contains the timestamp.
The default value is `time`.

`time_format`::
(+string+) The time format, which can be `epoch`, `epoch_ms`, or a custom pattern.
The default value is `epoch`, which refers to UNIX or epoch time (the number of seconds
since 1 Jan 1970) and corresponds to the `time_t` type in C and C++.
The value `epoch_ms` indicates that time is measured in milliseconds since the epoch.
The `epoch` and `epoch_ms` time formats accept either integer or real values. +

NOTE: Custom patterns must conform to the Java `DateTimeFormatter` class. When you use date-time formatting patterns, it is recommended that you provide the full date, time, and time zone. For example: `yyyy-MM-ddTHH:mm:ssX`. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

`quote_character`::
TBD
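
As a sketch, the data description from the <<ml-put-job,create jobs>> example describes JSON
documents whose `@timestamp` field holds milliseconds since the epoch:

----
{
  "time_field": "@timestamp",
  "time_format": "epoch_ms"
}
----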

[[ml-apilimits]]
===== Analysis Limits

Limits can be applied to the size of the mathematical models that are held in memory.
These limits can be set per job and do not control the memory used by other processes.
If necessary, the limits can also be updated after the job is created.

The `analysis_limits` object has the following properties:

`categorization_examples_limit`::
(+long+) The maximum number of examples stored per category in memory and
in the results data store. The default value is 4. If you increase this value,
more examples are available; however, it requires that you have more storage available.
If you set this value to `0`, no examples are stored.

////
NOTE: The `categorization_examples_limit` only applies to analysis that uses categorization.
////

`model_memory_limit`::
(+long+) The maximum amount of memory, in MiB, that the mathematical models can use.
Once this limit is approached, data pruning becomes more aggressive.
Upon exceeding this limit, new entities are not modeled. The default value is 4096.
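
Putting the two properties together, an `analysis_limits` object that simply restates the
defaults looks like the following sketch:

----
{
  "categorization_examples_limit": 4,
  "model_memory_limit": 4096
}
----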
@ -0,0 +1,63 @@
[[ml-open-job]]
==== Open Jobs

An anomaly detection job must be opened in order for it to be ready to receive and analyze data.
A job may be opened and closed multiple times throughout its lifecycle.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id>/_open`

===== Description

A job must be open in order for it to accept and analyze data.

When you open a new job, it starts with an empty model.

When you open an existing job, the most recent model state is automatically loaded.
The job is ready to resume its analysis from where it left off, once new data is received.

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

===== Request Body

`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened

`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not treated as an anomaly.

////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////

===== Examples

The following example opens the `event_rate` job and sets the optional `ignore_downtime` property:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_open
{
  "ignore_downtime": false
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job opens, you receive the following results:
----
{
  "opened": true
}
----
@ -0,0 +1,56 @@
[[ml-post-data]]
==== Post Data to Jobs

The post data API allows you to send data to an anomaly detection job for analysis.
The job must have been opened prior to sending data.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id> --data-binary @{data-file.json}`

===== Description

File sizes are limited to 100 MB, so if your file is larger,
then split it into multiple files and upload each one separately in sequential time order.
When running in real time, it is generally recommended that you perform
many small uploads, rather than queueing data to upload larger files.

IMPORTANT: Data can only be accepted from a single connection.
Do not attempt to access the data endpoint from different threads at the same time.
Use a single connection synchronously to send data, close, flush, or delete a single job.
+
It is not currently possible to post data to multiple jobs by using wildcards or a comma separated list.

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

===== Request Body

`reset_start`::
(+string+; default: ++null++) Specifies the start of the bucket resetting range

`reset_end`::
(+string+; default: ++null++) Specifies the end of the bucket resetting range

////
===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////

===== Examples

The following example sends data from the file `data-file.json` to the `my_analysis` job:

[source,js]
--------------------------------------------------
$ curl -s -XPOST localhost:9200/_xpack/ml/anomaly_detectors/my_analysis --data-binary @data-file.json
--------------------------------------------------
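
A successful upload is expected to be acknowledged with the job's updated
<<ml-datacounts,data counts>>; a response might look like the following sketch,
with illustrative values:

----
{
  "job_id": "my_analysis",
  "processed_record_count": 888,
  "input_bytes": 51678,
  "input_record_count": 888
}
----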
@ -0,0 +1,84 @@
[[ml-preview-datafeed]]
==== Preview Data Feeds

The preview data feed API allows you to preview a data feed.

===== Request

`GET _xpack/ml/datafeeds/<feed_id>/_preview`

////
===== Description

Important:: Updates do not take effect until after the job is closed and new
data is sent to it.
////

===== Path Parameters

`feed_id` (required)::
(+string+) Identifier for the data feed

////
===== Request Body

The following properties can be updated after the job is created:

`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>. In particular, the following properties can be updated: `categorization_filters`, `detector_description`, TBD.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

[NOTE]
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.

`description`::
(+string+) An optional description of the job.

This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
////
////
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example updates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
PUT _xpack/ml/anomaly_detectors/it-ops-kpi/_update
{
  "description": "New description",
  "analysis_limits": {
    "model_memory_limit": 8192
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is updated, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "New description",
  ...
  "analysis_limits": {
    "model_memory_limit": 8192
  ...
}
----
////
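
===== Examples

The following sketch previews a data feed; the feed name `datafeed-it-ops-kpi` is illustrative:

[source,js]
--------------------------------------------------
GET _xpack/ml/datafeeds/datafeed-it-ops-kpi/_preview
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

The response is expected to contain a sample of the documents that the data feed
would pass to the job for analysis.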
@ -0,0 +1,109 @@
[[ml-put-datafeed]]
==== Create Data Feeds

The create data feed API allows you to instantiate a data feed.

===== Request

`PUT _xpack/ml/datafeeds/<feed_id>`

////
===== Description

TBD
////

===== Path Parameters

`feed_id` (required)::
(+string+) Identifier for the data feed

////
===== Request Body

`description`::
(+string+) An optional description of the job.

`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>.

`data_description`::
(+object+) Describes the format of the input data.
See <<ml-datadescription,data description objects>>.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example creates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
PUT _xpack/ml/anomaly_detectors/it-ops-kpi
{
  "description": "First simple job",
  "analysis_config": {
    "bucket_span": "5m",
    "latency": "0ms",
    "detectors": [
      {
        "detector_description": "low_sum(events_per_min)",
        "function": "low_sum",
        "field_name": "events_per_min"
      }
    ]
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is created, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "First simple job",
  "create_time": 1491247016391,
  "analysis_config": {
    "bucket_span": "5m",
    "latency": "0ms",
    "detectors": [
      {
        "detector_description": "low_sum(events_per_min)",
        "function": "low_sum",
        "field_name": "events_per_min",
        "detector_rules": []
      }
    ],
    "influencers": [],
    "use_per_partition_normalization": false
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  },
  "model_snapshot_retention_days": 1,
  "results_index_name": "shared"
}
----
////
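
===== Examples

The following sketch creates a data feed for the `it-ops-kpi` job. The feed ID, source index,
and document type are illustrative, and the full set of supported data feed properties is not
yet documented here:

[source,js]
--------------------------------------------------
PUT _xpack/ml/datafeeds/datafeed-it-ops-kpi
{
  "job_id": "it-ops-kpi",
  "indexes": ["it-ops-metrics"],
  "types": ["kpi"],
  "query": {
    "match_all": {}
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]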
@ -0,0 +1,109 @@
[[ml-put-job]]
==== Create Jobs

The create job API allows you to instantiate a {ml} job.

===== Request

`PUT _xpack/ml/anomaly_detectors/<job_id>`

////
===== Description

TBD
////

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

===== Request Body

`description`::
(+string+) An optional description of the job.

`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>.

`data_description`::
(+object+) Describes the format of the input data.
See <<ml-datadescription,data description objects>>.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

////
This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
////
////
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////

===== Examples

The following example creates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
PUT _xpack/ml/anomaly_detectors/it-ops-kpi
{
  "description": "First simple job",
  "analysis_config": {
    "bucket_span": "5m",
    "latency": "0ms",
    "detectors": [
      {
        "detector_description": "low_sum(events_per_min)",
        "function": "low_sum",
        "field_name": "events_per_min"
      }
    ]
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is created, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "First simple job",
  "create_time": 1491247016391,
  "analysis_config": {
    "bucket_span": "5m",
    "latency": "0ms",
    "detectors": [
      {
        "detector_description": "low_sum(events_per_min)",
        "function": "low_sum",
        "field_name": "events_per_min",
        "detector_rules": []
      }
    ],
    "influencers": [],
    "use_per_partition_normalization": false
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  },
  "model_snapshot_retention_days": 1,
  "results_index_name": "shared"
}
----
@ -0,0 +1,10 @@
[[ml-results-resource]]
==== Results Resources

A results resource has the following properties:

TBD
////
`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data. See <<ml-analysisconfig, analysis configuration objects>>.
////
@ -0,0 +1,89 @@
[[ml-revert-snapshot]]
==== Revert Model Snapshots

The revert model snapshot API allows you to revert to a specific snapshot of the model state.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>/_revert`

////
===== Description

Important:: Updates do not take effect until after the job is closed and new
data is sent to it.
////

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

`snapshot_id` (required)::
(+string+) Identifier for the model snapshot

===== Request Body

TBD

////
`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>. In particular, the following properties can be updated: `categorization_filters`, `detector_description`, TBD.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

[NOTE]
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.

`description`::
(+string+) An optional description of the job.

This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example updates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
PUT _xpack/ml/anomaly_detectors/it-ops-kpi/_update
{
  "description": "New description",
  "analysis_limits": {
    "model_memory_limit": 8192
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is updated, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "New description",
  ...
  "analysis_limits": {
    "model_memory_limit": 8192
  ...
}
----
////
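
===== Examples

The following sketch reverts the `it-ops-kpi` job to the snapshot with the ID `1491007364`;
both values are taken from earlier examples and are illustrative:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/it-ops-kpi/model_snapshots/1491007364/_revert
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]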
@ -0,0 +1,10 @@
[[ml-snapshot-resource]]
==== Model Snapshot Resources

A model snapshot resource has the following properties:

TBD
////
`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data. See <<ml-analysisconfig, analysis configuration objects>>.
////
@ -0,0 +1,64 @@
[[ml-start-datafeed]]
==== Start Data Feeds

A data feed must be started in order for it to be ready to receive and analyze data.
A data feed can be started and stopped multiple times throughout its lifecycle.

===== Request

`POST _xpack/ml/datafeeds/<feed_id>/_start`

////
===== Description

A job must be open in order for it to accept and analyze data.

When you open a new job, it starts with an empty model.

When you open an existing job, the most recent model state is automatically loaded.
The job is ready to resume its analysis from where it left off, once new data is received.
////

===== Path Parameters

`feed_id` (required)::
(+string+) Identifier for the data feed

////
===== Request Body

`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened

`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not an anomaly

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example opens the `event_rate` job:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_open
{
  "ignore_downtime": false
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job opens, you receive the following results:
----
{
  "opened": true
}
----
////
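
===== Examples

The following sketch starts a data feed; the feed name `datafeed-it-ops-kpi` is illustrative:

[source,js]
--------------------------------------------------
POST _xpack/ml/datafeeds/datafeed-it-ops-kpi/_start
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

If the data feed starts successfully, the response is expected to acknowledge it:
----
{
  "started": true
}
----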
@ -0,0 +1,64 @@
[[ml-stop-datafeed]]
==== Stop Data Feeds

A data feed can be started and stopped multiple times throughout its lifecycle.

===== Request

`POST _xpack/ml/datafeeds/<feed_id>/_stop`

////
===== Description

A job can be closed once all data has been analyzed.

When you close a job, it runs housekeeping tasks such as pruning the model history,
flushing buffers, calculating final results, and persisting the internal models.
Depending upon the size of the job, it could take several minutes to close and
the equivalent time to re-open.

Once closed, the anomaly detection job has almost no overhead on the cluster
(except for maintaining its metadata). A closed job is blocked from receiving
data and analysis operations; however, you can still explore and navigate results.

//NOTE:
//OUTDATED?: If using the {prelert} UI, the job will be automatically closed when stopping a datafeed job.
////

===== Path Parameters

`feed_id` (required)::
(+string+) Identifier for the data feed

////
===== Query Parameters

`close_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has closed

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example closes the `event_rate` job:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_close
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is closed, you receive the following results:
----
{
  "closed": true
}
----
////
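
===== Examples

The following sketch stops a data feed; the feed name `datafeed-it-ops-kpi` is illustrative:

[source,js]
--------------------------------------------------
POST _xpack/ml/datafeeds/datafeed-it-ops-kpi/_stop
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

If the data feed stops successfully, the response is expected to acknowledge it:
----
{
  "stopped": true
}
----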
@ -0,0 +1,84 @@
[[ml-update-datafeed]]
==== Update Data Feeds

The update data feed API allows you to update certain properties of a data feed.

===== Request

`POST _xpack/ml/datafeeds/<feed_id>/_update`

////
===== Description

Important:: Updates do not take effect until after the job is closed and new
data is sent to it.
////

===== Path Parameters

`feed_id` (required)::
(+string+) Identifier for the data feed

////
===== Request Body

The following properties can be updated after the job is created:

`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>. In particular, the following properties can be updated: `categorization_filters`, `detector_description`, TBD.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

[NOTE]
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.

`description`::
(+string+) An optional description of the job.

This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
////
////
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example updates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
PUT _xpack/ml/anomaly_detectors/it-ops-kpi/_update
{
  "description": "New description",
  "analysis_limits": {
    "model_memory_limit": 8192
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is updated, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "New description",
  ...
  "analysis_limits": {
    "model_memory_limit": 8192
  ...
}
----
////
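
===== Examples

The following sketch updates a data feed; the feed name is illustrative, and the assumption
that `scroll_size` is an updatable data feed property is not yet confirmed by this document:

[source,js]
--------------------------------------------------
POST _xpack/ml/datafeeds/datafeed-it-ops-kpi/_update
{
  "scroll_size": 1000
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]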
@ -0,0 +1,84 @@
[[ml-update-job]]
==== Update Jobs

The update job API allows you to update certain properties of a job.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id>/_update`

////
===== Description

Important:: Updates do not take effect until after the job is closed and new
data is sent to it.
////

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

===== Request Body

The following properties can be updated after the job is created:

`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>. In particular, the following properties can be updated: `categorization_filters`, `detector_description`, TBD.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

[NOTE]
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.

`description`::
(+string+) An optional description of the job.

////
This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
////
////
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////

===== Examples

The following example updates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/it-ops-kpi/_update
{
  "description": "New description",
  "analysis_limits": {
    "model_memory_limit": 8192
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is updated, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "New description",
  ...
  "analysis_limits": {
    "model_memory_limit": 8192
  ...
}
----
@ -0,0 +1,89 @@
[[ml-update-snapshot]]
==== Update Model Snapshots

The update model snapshot API allows you to update certain properties of a snapshot.

===== Request

`POST _xpack/ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>/_update`

////
===== Description

Important:: Updates do not take effect until after the job is closed and new
data is sent to it.
////

===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job

`snapshot_id` (required)::
(+string+) Identifier for the model snapshot

===== Request Body

The following properties can be updated after the snapshot is created:

TBD

////
`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>. In particular, the following properties can be updated: `categorization_filters`, `detector_description`, TBD.

`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.

[NOTE]
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.

`description`::
(+string+) An optional description of the job.

This expects data to be sent in JSON format using the POST `_data` API.

===== Responses

TBD
200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example updates the `it-ops-kpi` job:

[source,js]
--------------------------------------------------
PUT _xpack/ml/anomaly_detectors/it-ops-kpi/_update
{
  "description": "New description",
  "analysis_limits": {
    "model_memory_limit": 8192
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job is updated, you receive the following results:
----
{
  "job_id": "it-ops-kpi",
  "description": "New description",
  ...
  "analysis_limits": {
    "model_memory_limit": 8192
  ...
}
----
////
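
===== Examples

The following sketch updates the description of a model snapshot. The job and snapshot ID
are taken from earlier examples, and the assumption that `description` is an updatable
snapshot property is not yet confirmed by this document:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/it-ops-kpi/model_snapshots/1491007364/_update
{
  "description": "Baseline model for it-ops-kpi"
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]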
@ -0,0 +1,61 @@
[[ml-valid-detector]]
==== Validate Detectors

TBD

===== Request

`POST _xpack/ml/anomaly_detectors/_validate/detector`

===== Description

TBD

////
===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job
////

===== Request Body

TBD

////
`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened

`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not an anomaly

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example opens the `event_rate` job:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_open
{
  "ignore_downtime": false
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job opens, you receive the following results:
----
{
  "opened": true
}
----
////
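
===== Examples

As a sketch, the request body is assumed to be a single
<<ml-detectorconfig,detector configuration object>>; the detector shown mirrors the
<<ml-put-job,create jobs>> example:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/_validate/detector
{
  "function": "low_sum",
  "field_name": "events_per_min"
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]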
@ -0,0 +1,61 @@
[[ml-valid-job]]
==== Validate Jobs

TBD

===== Request

`POST _xpack/ml/anomaly_detectors/_validate`

===== Description

TBD

////
===== Path Parameters

`job_id` (required)::
(+string+) Identifier for the job
////

===== Request Body

TBD

////
`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened

`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not an anomaly

===== Responses

200
(EmptyResponse) The cluster has been successfully deleted
404
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)

===== Examples

The following example opens the `event_rate` job:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_open
{
  "ignore_downtime": false
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]

When the job opens, you receive the following results:
----
{
  "opened": true
}
----
////
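
===== Examples

As a sketch, the request body is assumed to be a job configuration, as described in
<<ml-job-resource,job resources>>; the configuration shown mirrors the
<<ml-put-job,create jobs>> example:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/_validate
{
  "description": "First simple job",
  "analysis_config": {
    "bucket_span": "5m",
    "latency": "0ms",
    "detectors": [
      {
        "detector_description": "low_sum(events_per_min)",
        "function": "low_sum",
        "field_name": "events_per_min"
      }
    ]
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]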