[DOCS] Add ML API results examples
Original commit: elastic/x-pack-elasticsearch@60a21763eb
Parent: 6f643ffba5
Commit: a1bf6247a8
@@ -50,8 +50,8 @@ The main {ml} resources can be accessed with a variety of endpoints:
 * <<ml-get-bucket,GET /results/buckets>>: List the buckets in the results
 * <<ml-get-bucket,GET /results/buckets/<bucket_id+++>+++>>: Get bucket details
-* <<ml-get-influencer,GET /results/categories>>: List the categories in the results
-* <<ml-get-influencer,GET /results/categories/<category_id+++>+++>>: Get category details
+* <<ml-get-category,GET /results/categories>>: List the categories in the results
+* <<ml-get-category,GET /results/categories/<category_id+++>+++>>: Get category details
 * <<ml-get-influencer,GET /results/influencers>>: Get influencer details
 * <<ml-get-record,GET /results/records>>: Get records from the results

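The endpoint paths listed in the hunk above all hang off a job's `results` resource; a minimal Python sketch of how a client might assemble them (the helper name and sample job ID are illustrative, only the path shape comes from the documented REST API):

```python
def results_path(job_id, kind, doc_id=None):
    """Build the REST path for an ML results endpoint of a job.

    kind is one of: "buckets", "categories", "influencers", "records".
    doc_id optionally narrows the request to a single result
    (a bucket timestamp or a category_id).
    """
    path = "_xpack/ml/anomaly_detectors/%s/results/%s" % (job_id, kind)
    if doc_id is not None:
        path += "/%s" % doc_id
    return path

print(results_path("it-ops-kpi", "buckets"))
# _xpack/ml/anomaly_detectors/it-ops-kpi/results/buckets
```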
@@ -14,7 +14,7 @@ Use machine learning to detect anomalies in time series data.
 * <<ml-put-datafeed,Create data feeds>>
 * <<ml-delete-datafeed,Delete data feeds>>
-* <<ml-get-datafeed,Get data feed details>>
+* <<ml-get-datafeed,Get data feeds>>
 * <<ml-get-datafeed-stats,Get data feed statistics>>
 * <<ml-preview-datafeed,Preview data feeds>>
 * <<ml-start-datafeed,Start data feeds>>

@@ -38,7 +38,7 @@ You can use APIs to perform the following activities:
 * <<ml-close-job,Close jobs>>
 * <<ml-put-job,Create jobs>>
 * <<ml-delete-job,Delete jobs>>
-* <<ml-get-job,Get job details>>
+* <<ml-get-job,Get jobs>>
 * <<ml-get-job-stats,Get job statistics>>
 * <<ml-flush-job,Flush jobs>>
 * <<ml-open-job,Open jobs>>

@@ -70,6 +70,11 @@ include::ml/update-snapshot.asciidoc[]
 [[ml-api-result-endpoint]]
 === Results
 
+* <<ml-get-bucket,Get buckets>>
+* <<ml-get-category,Get categories>>
+* <<ml-get-influencer,Get influencers>>
+* <<ml-get-record,Get records>>
+
 include::ml/get-bucket.asciidoc[]
 include::ml/get-category.asciidoc[]
 include::ml/get-influencer.asciidoc[]

@@ -7,7 +7,7 @@ A data feed resource has the following properties:
   (+object+) TBD
   The aggregations object describes the aggregations that are
   applied to the search query?
-  For more information, see <<{ref}search-aggregations.html,Aggregations>>.
+  For more information, see {ref}search-aggregations.html[Aggregations].
   For example:
   `{"@timestamp": {"histogram": {"field": "@timestamp",
   "interval": 30000,"offset": 0,"order": {"_key": "asc"},"keyed": false,

@@ -18,7 +18,7 @@ A data feed resource has the following properties:
   (+string+) A numerical character string that uniquely identifies the data feed.
 
 `frequency`::
-  TBD. A time For example: "150s"
+  TBD. For example: "150s"
 
 `indexes` (required)::
   (+array+) An array of index names. For example: ["it_ops_metrics"]

@@ -41,11 +41,11 @@ A data feed resource has the following properties:
 `types` (required)::
   (+array+) TBD. For example: ["network","sql","kpi"]
 
 [[ml-datafeed-counts]]
 ==== Data Feed Counts
 
 The get data feed statistics API provides information about the operational
 progress of a data feed. For example:
 
 `assignment_explanation`::
   TBD

@@ -1,38 +1,65 @@
 [[ml-get-bucket]]
 ==== Get Buckets
 
-The get bucket API allows you to retrieve information about buckets in the results from a job.
+The get bucket API allows you to retrieve information about buckets in the
+results from a job.
 
 ===== Request
 
 `GET _xpack/ml/anomaly_detectors/<job_id>/results/buckets` +
 
 `GET _xpack/ml/anomaly_detectors/<job_id>/results/buckets/<timestamp>`
 ////
 
 ===== Description
 
 OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
 ////
+This API presents a chronological view of the records, grouped by bucket.
 
 ===== Path Parameters
 
 `job_id`::
   (+string+) Identifier for the job
 
 `timestamp`::
-  (+string+) The timestamp of a single bucket result. If you do not specify this optional parameter,
-  the API returns information about all buckets that you have authority to view in the job.
+  (+string+) The timestamp of a single bucket result.
+  If you do not specify this optional parameter, the API returns information
+  about all buckets that you have authority to view in the job.
 
+===== Request Body
+
+`anomaly_score`::
+  (+double+) Returns buckets with anomaly scores higher than this value.
+
+`end`::
+  (+string+) Returns buckets with timestamps earlier than this time.
+
+`expand`::
+  (+boolean+) If true, the output includes anomaly records.
+
+`from`::
+  (+integer+) Skips the specified number of buckets.
+
+`include_interim`::
+  (+boolean+) If true, the output includes interim results.
+
+`partition_value`::
+  (+string+) If `expand` is true, the anomaly records are filtered by this
+  partition value.
+
+`size`::
+  (+integer+) Specifies the maximum number of buckets to obtain.
+
+`start`::
+  (+string+) Returns buckets with timestamps after this time.
 
 ////
 ===== Results
 
-The API returns information about the job resource. For more information, see
-<<ml-job-resource,job resources>>.
+The API returns the following information:
 
-===== Query Parameters
-
-`_stats`::
-  (+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation
+`buckets`::
+  (+array+) An array of bucket objects. For more information, see
+  <<ml-results-buckets,Buckets>>.
 
 ////
 ===== Responses
 
 200
@@ -41,46 +68,56 @@ The API returns information about the job resource. For more information, see
 (BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
 412
 (BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
 
 ////
 ===== Examples
 
 .Example results for a single job
 The following example gets bucket information for the `it-ops-kpi` job:
 
 [source,js]
 --------------------------------------------------
 GET _xpack/ml/anomaly_detectors/it-ops-kpi/results/buckets
 {
   "anomaly_score": 80,
   "start": "1454530200001"
 }
 --------------------------------------------------
 // CONSOLE
 // TEST[skip:todo]
 
 In this example, the API returns a single result that matches the specified
 score and time constraints:
 ----
 {
   "count": 1,
-  "jobs": [
-    {
+  "buckets": [
+    {
       "job_id": "it-ops-kpi",
-      "description": "First simple job",
-      "create_time": 1491007356077,
-      "finished_time": 1491007365347,
-      "analysis_config": {
-        "bucket_span": "5m",
-        "latency": "0ms",
-        "summary_count_field_name": "doc_count",
-        "detectors": [
-          {
-            "detector_description": "low_sum(events_per_min)",
-            "function": "low_sum",
-            "field_name": "events_per_min",
-            "detector_rules": []
-          }
-        ],
-        "influencers": [],
-        "use_per_partition_normalization": false
-      },
-      "data_description": {
-        "time_field": "@timestamp",
-        "time_format": "epoch_ms"
-      },
-      "model_plot_config": {
-        "enabled": true
-      },
-      "model_snapshot_retention_days": 1,
-      "model_snapshot_id": "1491007364",
-      "results_index_name": "shared"
+      "timestamp": 1454943900000,
+      "anomaly_score": 87.2526,
+      "bucket_span": 300,
+      "initial_anomaly_score": 83.3831,
+      "record_count": 1,
+      "event_count": 153,
+      "is_interim": false,
+      "bucket_influencers": [
+        {
+          "job_id": "it-ops-kpi",
+          "result_type": "bucket_influencer",
+          "influencer_field_name": "bucket_time",
+          "initial_anomaly_score": 83.3831,
+          "anomaly_score": 87.2526,
+          "raw_anomaly_score": 2.02204,
+          "probability": 0.0000109783,
+          "timestamp": 1454943900000,
+          "bucket_span": 300,
+          "sequence_num": 2,
+          "is_interim": false
+        }
+      ],
+      "processing_time_ms": 3,
+      "partition_scores": [],
+      "result_type": "bucket"
     }
   ]
 }
 ----
 ////

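The request-body filters documented for this endpoint (`anomaly_score`, `start`, `end`) can be mimicked client-side over an already-fetched response; a minimal Python sketch, assuming bucket objects shaped like the documented example (the helper name is illustrative):

```python
def filter_buckets(buckets, anomaly_score=0.0, start=None, end=None):
    """Keep buckets whose score exceeds the threshold and whose
    timestamp (epoch milliseconds) falls in the optional [start, end) window,
    mirroring the anomaly_score/start/end request-body parameters."""
    out = []
    for b in buckets:
        if b["anomaly_score"] <= anomaly_score:
            continue
        if start is not None and b["timestamp"] < start:
            continue
        if end is not None and b["timestamp"] >= end:
            continue
        out.append(b)
    return out

# Values taken from the example request and response above.
buckets = [{"timestamp": 1454943900000, "anomaly_score": 87.2526}]
print(len(filter_buckets(buckets, anomaly_score=80, start=1454530200001)))  # 1
```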
@@ -11,7 +11,6 @@ The get categories API allows you to retrieve information about the categories i
 ////
-
 ===== Description
 
 OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
 ////
 ===== Path Parameters
 

@@ -22,17 +21,24 @@ OUTDATED?: The get job API can also be applied to all jobs by using `_all` as th
   (+string+) Identifier for the category. If you do not specify this optional parameter,
   the API returns information about all categories that you have authority to view.
 
 ////
 ===== Request Body
 
+//TBD: Test these properties, since they didn't work on older build.
+
 `from`::
   (+integer+) Skips the specified number of categories.
 
 `size`::
   (+integer+) Specifies the maximum number of categories to obtain.
 
 ===== Results
 
-The API returns information about the job resource. For more information, see
-<<ml-job-resource,job resources>>.
-
-===== Query Parameters
-
-`_stats`::
-  (+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation
+The API returns the following information:
+
+`categories`::
+  (+array+) An array of category objects. For more information, see
+  <<ml-results-categories,Categories>>.
 ////
 ===== Responses
 
 200
@@ -41,46 +47,38 @@ The API returns information about the job resource. For more information, see
 (BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
 412
 (BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
 
 ////
 ===== Examples
 
 .Example results for a single job
 The following example gets category information for the `it_ops_new_logs` job:
 
 [source,js]
 --------------------------------------------------
 GET _xpack/ml/anomaly_detectors/it_ops_new_logs/results/categories
 --------------------------------------------------
 // CONSOLE
 // TEST[skip:todo]
 
 In this example, the API returns the following information for each category:
 ----
 {
-  "count": 1,
-  "jobs": [
-    {
-      "job_id": "it-ops-kpi",
-      "description": "First simple job",
-      "create_time": 1491007356077,
-      "finished_time": 1491007365347,
-      "analysis_config": {
-        "bucket_span": "5m",
-        "latency": "0ms",
-        "summary_count_field_name": "doc_count",
-        "detectors": [
-          {
-            "detector_description": "low_sum(events_per_min)",
-            "function": "low_sum",
-            "field_name": "events_per_min",
-            "detector_rules": []
-          }
-        ],
-        "influencers": [],
-        "use_per_partition_normalization": false
-      },
-      "data_description": {
-        "time_field": "@timestamp",
-        "time_format": "epoch_ms"
-      },
-      "model_plot_config": {
-        "enabled": true
-      },
-      "model_snapshot_retention_days": 1,
-      "model_snapshot_id": "1491007364",
-      "results_index_name": "shared"
-    }
+  "count": 11,
+  "categories": [
+    {
+      "job_id": "it_ops_new_logs",
+      "category_id": 1,
+      "terms": "Actual Transaction Already Voided Reversed hostname
+dbserver.acme.com physicalhost esxserver1.acme.com vmhost app1.acme.com",
+      "regex": ".*?Actual.+?Transaction.+?Already.+?Voided.+?Reversed.+?hostname.
++?dbserver.acme.com.+?physicalhost.+?esxserver1.acme.com.+?vmhost.
++?app1.acme.com.*",
+      "max_matching_length": 137,
+      "examples": [
+        "Actual Transaction Already Voided / Reversed;hostname=dbserver.acme.com;
+physicalhost=esxserver1.acme.com;vmhost=app1.acme.com"
+      ]
+    },
+  ...
   ]
 }
 ----
 ////

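The `regex` returned for a category can be applied directly to raw log messages to test membership; a minimal Python sketch using the regex and example value from the response above (with the documentation's line wrapping undone):

```python
import re

# Regex and example message taken from the category response above.
regex = (".*?Actual.+?Transaction.+?Already.+?Voided.+?Reversed.+?hostname"
         ".+?dbserver.acme.com.+?physicalhost.+?esxserver1.acme.com"
         ".+?vmhost.+?app1.acme.com.*")
example = ("Actual Transaction Already Voided / Reversed;"
           "hostname=dbserver.acme.com;physicalhost=esxserver1.acme.com;"
           "vmhost=app1.acme.com")

# The lazy wildcards between terms let the category match messages whose
# separators (spaces, '=', ';') vary, which is the point of categorization.
print(bool(re.match(regex, example)))  # True
```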
@@ -1,7 +1,8 @@
 [[ml-get-influencer]]
 ==== Get Influencers
 
-The get influencers API allows you to retrieve information about the influencers in a job.
+The get influencers API allows you to retrieve information about the influencers
+in a job.
 
 ===== Request
 

@@ -10,24 +11,50 @@ The get influencers API allows you to retrieve information about the influencers
 ////
 ===== Description
 
 OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
 
 ////
 ===== Path Parameters
 
 `job_id`::
   (+string+) Identifier for the job.
 
 ////
 ===== Request Body
 
+`desc`::
+  (+boolean+) If true, the results are sorted in descending order.
+//TBD: Using the "sort" value?
+
+`end`::
+  (+string+) Returns influencers with timestamps earlier than this time.
+
+`from`::
+  (+integer+) Skips the specified number of influencers.
+
+`include_interim`::
+  (+boolean+) If true, the output includes interim results.
+
+`influencer_score`::
+  (+double+) Returns influencers with anomaly scores higher than this value.
+
+`size`::
+  (+integer+) Specifies the maximum number of influencers to obtain.
+
+`sort`::
+  (+string+) Specifies the sort field for the requested influencers.
+//TBD: By default the results are sorted on the influencer score?
+
+`start`::
+  (+string+) Returns influencers with timestamps after this time.
 
 ===== Results
 
-The API returns information about the job resource. For more information, see
-<<ml-job-resource,job resources>>.
+The API returns the following information:
 
-===== Query Parameters
-
-`_stats`::
-  (+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation
+`influencers`::
+  (+array+) An array of influencer objects.
+  For more information, see <<ml-results-influencers,Influencers>>.
 
 ////
 ===== Responses
 
 200
@@ -36,46 +63,43 @@ The API returns information about the job resource. For more information, see
 (BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
 412
 (BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
 
 ////
 ===== Examples
 
 .Example results for a single job
 The following example gets influencer information for the `it_ops_new_kpi` job:
 
 [source,js]
 --------------------------------------------------
 GET _xpack/ml/anomaly_detectors/it_ops_new_kpi/results/influencers
 {
   "sort": "influencer_score",
   "desc": true
 }
 --------------------------------------------------
 // CONSOLE
 // TEST[skip:todo]
 
 In this example, the API returns the following information, sorted based on the
 influencer score in descending order:
 ----
 {
-  "count": 1,
-  "jobs": [
-    {
-      "job_id": "it-ops-kpi",
-      "description": "First simple job",
-      "create_time": 1491007356077,
-      "finished_time": 1491007365347,
-      "analysis_config": {
-        "bucket_span": "5m",
-        "latency": "0ms",
-        "summary_count_field_name": "doc_count",
-        "detectors": [
-          {
-            "detector_description": "low_sum(events_per_min)",
-            "function": "low_sum",
-            "field_name": "events_per_min",
-            "detector_rules": []
-          }
-        ],
-        "influencers": [],
-        "use_per_partition_normalization": false
-      },
-      "data_description": {
-        "time_field": "@timestamp",
-        "time_format": "epoch_ms"
-      },
-      "model_plot_config": {
-        "enabled": true
-      },
-      "model_snapshot_retention_days": 1,
-      "model_snapshot_id": "1491007364",
-      "results_index_name": "shared"
-    }
+  "count": 22,
+  "influencers": [
+    {
+      "job_id": "it_ops_new_kpi",
+      "result_type": "influencer",
+      "influencer_field_name": "kpi_indicator",
+      "influencer_field_value": "online_purchases",
+      "kpi_indicator": "online_purchases",
+      "influencer_score": 94.1386,
+      "initial_influencer_score": 94.1386,
+      "probability": 0.000111612,
+      "sequence_num": 2,
+      "bucket_span": 600,
+      "is_interim": false,
+      "timestamp": 1454943600000
+    },
+  ...
   ]
 }
 ----
 ////

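The `sort`/`desc`/`size` request shown above can be reproduced client-side; a minimal Python sketch over influencer objects shaped like the example response (the second influencer entry is hypothetical, added only to show the ranking):

```python
def top_influencers(influencers, size=10):
    """Sort influencers by influencer_score, descending, and truncate,
    mirroring the sort/desc/size request-body parameters."""
    ranked = sorted(influencers, key=lambda i: i["influencer_score"],
                    reverse=True)
    return ranked[:size]

# First entry uses field values from the example response above;
# the second is hypothetical.
infs = [
    {"influencer_field_value": "other_kpi", "influencer_score": 12.5},
    {"influencer_field_value": "online_purchases", "influencer_score": 94.1386},
]
print([i["influencer_field_value"] for i in top_influencers(infs, size=1)])
# ['online_purchases']
```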
@@ -39,10 +39,13 @@ The API returns the following usage information:
 `state`::
   (+string+) The status of the job, which can be one of the following values:
 
-  running:: The job is actively receiving and processing data.
+  open:: The job is actively receiving and processing data.
 
   closed:: The job finished successfully with its model state persisted.
   The job is still available to accept further data.
 
+  closing:: TBD
+
 NOTE: If you send data in a periodic cycle and close the job at the end of each transaction,
 the job is marked as closed in the intervals between when data is sent.
 For example, if data is sent every minute and it takes 1 second to process, the job has a closed state for 59 seconds.
@@ -50,7 +53,6 @@ For example, if data is sent every minute and it takes 1 second to process, the
 failed:: The job did not finish successfully due to an error. NOTE: This can occur due to invalid input data.
 In this case, sending corrected data to a failed job re-opens the job and resets it to a running state.
 
-
 ////
 ===== Responses
 

@@ -1,5 +1,5 @@
 [[ml-get-job]]
-==== Get Job Details
+==== Get Jobs
 
 The get jobs API allows you to retrieve configuration information about jobs.
 

@@ -1,7 +1,7 @@
 [[ml-get-record]]
-==== Get Job Details
+==== Get Records
 
-The get records API allows you to retrieve records from the results that were generated by a job.
+The get records API allows you to retrieve anomaly records for a job.
 
 ===== Request
 

@@ -10,72 +10,123 @@ The get records API allows you to retrieve records from the results that were ge
 ////
 ===== Description
 
 OUTDATED?: The get job API can also be applied to all jobs by using `_all` as the job name.
 
 ////
 ===== Path Parameters
 
 `job_id`::
   (+string+) Identifier for the job.
 
 ////
 ===== Request Body
 
+`desc`::
+  (+boolean+) If true, the results are sorted in descending order.
+//TBD: Using the "sort" value?
+
+`end`::
+  (+string+) Returns records with timestamps earlier than this time.
+
+`expand`::
+  (+boolean+) TBD
+//This field did not work on older build.
+
+`from`::
+  (+integer+) Skips the specified number of records.
+
+`include_interim`::
+  (+boolean+) If true, the output includes interim results.
+
+`partition_value`::
+  (+string+) If `expand` is true, the records are filtered by this
+  partition value.
+
+`record_score`::
+  (+double+) Returns records with anomaly scores higher than this value.
+
+`size`::
+  (+integer+) Specifies the maximum number of records to obtain.
+
+`sort`::
+  (+string+) Specifies the sort field for the requested records.
+  By default, the records are sorted by the `anomaly_score` value.
+
+`start`::
+  (+string+) Returns records with timestamps after this time.
 
 ===== Results
 
-The API returns information about the job resource. For more information, see
-<<ml-job-resource,job resources>>.
+The API returns the following information:
 
-===== Query Parameters
+`records`::
+  (+array+) An array of record objects. For more information, see
+  <<ml-results-records,Records>>.
 
-`_stats`::
-  (+boolean+; default: ++true++) If true (default false), will just validate the cluster definition but will not perform the creation
-
-===== Responses
-
-200
-(EmptyResponse) The cluster has been successfully deleted
-404
-(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
-412
-(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
+////
+===== Responses
+
+200
+(EmptyResponse) The cluster has been successfully deleted
+404
+(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
+412
+(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
+////
 ===== Examples
 
 .Example results for a single job
 The following example gets record information for the `it-ops-kpi` job:
 
 [source,js]
 --------------------------------------------------
 GET _xpack/ml/anomaly_detectors/it-ops-kpi/results/records
 {
   "sort": "record_score",
   "desc": true,
   "start": "1454944200000"
 }
 --------------------------------------------------
 // CONSOLE
 // TEST[skip:todo]
 
 In this example, the API returns a single result that matches the specified
 score and time constraints:
 ----
 {
-  "count": 1,
-  "jobs": [
-    {
-      "job_id": "it-ops-kpi",
-      "description": "First simple job",
-      "create_time": 1491007356077,
-      "finished_time": 1491007365347,
-      "analysis_config": {
-        "bucket_span": "5m",
-        "latency": "0ms",
-        "summary_count_field_name": "doc_count",
-        "detectors": [
-          {
-            "detector_description": "low_sum(events_per_min)",
-            "function": "low_sum",
-            "field_name": "events_per_min",
-            "detector_rules": []
-          }
-        ],
-        "influencers": [],
-        "use_per_partition_normalization": false
-      },
-      "data_description": {
-        "time_field": "@timestamp",
-        "time_format": "epoch_ms"
-      },
-      "model_plot_config": {
-        "enabled": true
-      },
-      "model_snapshot_retention_days": 1,
-      "model_snapshot_id": "1491007364",
-      "results_index_name": "shared"
-    }
+  "count": 6,
+  "records": [
+    {
+      "job_id": "it_ops_new_kpi",
+      "result_type": "record",
+      "probability": 0.000113075,
+      "record_score": 86.9677,
+      "initial_record_score": 82.8891,
+      "bucket_span": 600,
+      "detector_index": 0,
+      "sequence_num": 1,
+      "is_interim": false,
+      "timestamp": 1454944200000,
+      "partition_field_name": "kpi_indicator",
+      "partition_field_value": "online_purchases",
+      "function": "low_non_zero_count",
+      "function_description": "count",
+      "typical": [
+        3582.53
+      ],
+      "actual": [
+        575
+      ],
+      "influencers": [
+        {
+          "influencer_field_name": "kpi_indicator",
+          "influencer_field_values": [
+            "online_purchases"
+          ]
+        }
+      ],
+      "kpi_indicator": [
+        "online_purchases"
+      ]
+    },
+  ...
   ]
 }
 ----
 ////

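A record's `typical` and `actual` arrays quantify the anomaly; a small Python sketch, using the values from the example record above, computes how far the observed count fell below expectation (the helper name is illustrative):

```python
def deviation_ratio(record):
    """Ratio of actual to typical for a single-valued detector record.
    Values well below 1.0 indicate a low_* detector firing."""
    actual = record["actual"][0]
    typical = record["typical"][0]
    return actual / typical

# Values from the example record above: 575 events were seen where
# roughly 3582.53 were typical, i.e. about 16% of the expected count.
record = {"typical": [3582.53], "actual": [575]}
print(round(deviation_ratio(record), 3))
```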
@@ -1,10 +1,250 @@
 [[ml-results-resource]]
 ==== Results Resources
 
-A results resource has the following properties:
+The results of a job are organized into _records_ and _buckets_.
+The results are aggregated and normalized in order to identify the mathematically
+significant anomalies.
 
-TBD
-////
-`analysis_config`::
-  (+object+) The analysis configuration, which specifies how to analyze the data. See <<ml-analysisconfig, analysis configuration objects>>.
-////
+When categorization is specified, the results also contain category definitions.
+
+* <<ml-results-records,Records>>
+* <<ml-results-influencers,Influencers>>
+* <<ml-results-buckets,Buckets>>
+* <<ml-results-categories,Categories>>
+
+[float]
+[[ml-results-records]]
+===== Records
+
+Records contain the analytic results. They detail the anomalous activity that
+has been identified in the input data based upon the detector configuration.
+For example, if you are looking for unusually large data transfers,
+an anomaly record would identify the source IP address, the destination,
+the time window during which it occurred, the expected and actual size of the
+transfer, and the probability of this occurrence.
+Something that is highly improbable is therefore highly anomalous.
+
+There can be many anomaly records, depending upon the characteristics and size
+of the input data; in practice, there are often too many to process manually.
+The {xpack} {ml} features therefore perform a sophisticated aggregation of
+the anomaly records into buckets.
+
+A record object has the following properties:
+
+`actual`::
+  TBD. For example, [633].
+
+`bucket_span`::
+  TBD. For example, 600.
+
+`detector_index`::
+  TBD. For example, 0.
+
+`function`::
+  TBD. For example, "low_non_zero_count".
+
+`function_description`::
+  TBD. For example, "count".
+
+`influencers`::
+  TBD. For example, [{
+  "influencer_field_name": "kpi_indicator",
+  "influencer_field_values": [
+  "online_purchases"]}].
+
+`initial_record_score`::
+  TBD. For example, 94.1386.
+
+`is_interim`::
+  TBD. For example, false.
+
+`job_id`::
+  TBD. For example, "it_ops_new_kpi".
+
+`kpi_indicator`::
+  TBD. For example, ["online_purchases"]
+
+`partition_field_name`::
+  TBD. For example, "kpi_indicator".
+
+`partition_field_value`::
+  TBD. For example, "online_purchases".
+
+`probability`::
+  TBD. For example, 0.0000772031.
+
+`record_score`::
+  TBD. For example, 94.1386.
+
+`result_type`::
+  TBD. For example, "record".
+
+`sequence_num`::
+  TBD. For example, 1.
+
+`timestamp`::
+  (+date+) The start time of the bucket that contains the record,
+  specified in milliseconds since the epoch. For example, 1454020800000.
+
+`typical`::
+  TBD. For example, [3596.71].
+
+[float]
+[[ml-results-influencers]]
+===== Influencers
+
+Influencers are the entities that have contributed to, or are to blame for,
+the anomalies. Influencers are given an anomaly score, which is calculated
+based on the anomalies that have occurred in each bucket interval.
+For jobs with more than one detector, this gives a powerful view of the most
+anomalous entities.
+
+Upon identifying an influencer with a high score, you can investigate further
+by accessing the records resource for that bucket and enumerating the anomaly
+records that contain this influencer.
+
+An influencer object has the following properties:
+
+`bucket_span`::
+  TBD. For example, 300.
+
+`job_id`::
+  (+string+) A numerical character string that uniquely identifies the job.
+
+`influencer_score`::
+  TBD. For example, 94.1386.
+
+`initial_influencer_score`::
+  TBD. For example, 83.3831.
+
+`influencer_field_name`::
+  TBD. For example, "bucket_time".
+
+`influencer_field_value`::
+  TBD. For example, "online_purchases".
+
+`is_interim`::
+  TBD. For example, false.
+
+`kpi_indicator`::
+  TBD. For example, "online_purchases".
+
+`probability`::
+  TBD. For example, 0.0000109783.
+
+`result_type`::
+  TBD. For example, "influencer".
+//TBD: How is this different from the "bucket_influencer" type?
+
+`sequence_num`::
+  TBD. For example, 2.
+
+`timestamp`::
+  TBD. For example, 1454943900000.
+
+[float]
+[[ml-results-buckets]]
+===== Buckets
+
+Buckets are the grouped and time-ordered view of the job results.
+A bucket time interval is defined by `bucket_span`, which is specified in the
+job configuration.
+
+Each bucket has an `anomaly_score`, which is a statistically aggregated and
+normalized view of the combined anomalousness of the records. You can use this
+score for rate controlled alerting.
+
+//TBD: Still correct?
+//Each bucket also has a maxNormalizedProbability that is equal to the highest
+//normalizedProbability of the records with the bucket. This gives an indication
+//of the most anomalous event that has occurred within the time interval.
+//Unlike anomalyScore this does not take into account the number of correlated
+//anomalies that have happened.
+
+Upon identifying an anomalous bucket, you can investigate further by either
+expanding the bucket resource to show the records as nested objects or by
+accessing the records resource directly and filtering by date range.
+
+A bucket resource has the following properties:
+
+`anomaly_score`::
+  (+number+) The aggregated and normalized anomaly score.
+  All the anomaly records in the bucket contribute to this score.
+
+`bucket_influencers`::
+  (+array+) An array of influencer objects.
+  For more information, see <<ml-results-influencers,influencers>>.
+
+`bucket_span`::
+  (+unsigned integer+) The length of the bucket in seconds. This value is
+  equal to the `bucket_span` value in the job configuration.
+
+`event_count`::
+  (+unsigned integer+) The number of input data records processed in this bucket.
+
+`initial_anomaly_score`::
+  (+number+) The value of `anomaly_score` at the time the bucket result was
+  created. This is normalized based on data which has already been seen;
+  it is not re-normalized and therefore is not adjusted for more recent data.
+//TBD: This description is unclear.
+
+`is_interim`::
+  (+boolean+) If true, this bucket result is an interim result.
+  In other words, it is calculated based on partial input data.
+
+`job_id`::
+  (+string+) A numerical character string that uniquely identifies the job.
+
+`partition_scores`::
+  (+TBD+) TBD. For example, [].
+
+`processing_time_ms`::
+  (+unsigned integer+) The time, in milliseconds, taken to analyze the bucket
+  contents and produce results.
+
+`record_count`::
+  (+unsigned integer+) The number of anomaly records in this bucket.
+
+`result_type`::
+  (+string+) TBD. For example, "bucket".
+
+`timestamp`::
+  (+date+) The start time of the bucket, specified in milliseconds since the
+  epoch. For example, 1454020800000. This timestamp uniquely identifies the
+  bucket.
+
+NOTE: Events that occur exactly at the timestamp of the bucket are included in
+the results for the bucket.
+
+[float]
+[[ml-results-categories]]
+===== Categories
+
+When `categorization_field_name` is specified in the job configuration, it is
+possible to view the definitions of the resulting categories. A category
+definition describes the common terms matched and contains examples of matched
+values.
+
+A category resource has the following properties:
+
+`category_id`::
+  (+unsigned integer+) A unique identifier for the category.
+
+`examples`::
+  (+array+) A list of examples of actual values that matched the category.
+
+`job_id`::
+  (+string+) A numerical character string that uniquely identifies the job.
+
+`max_matching_length`::
+  (+unsigned integer+) The maximum length of the fields that matched the
+  category.
+//TBD: Still true? "The value is increased by 10% to enable matching for
+//similar fields that have not been analyzed"
+
+`regex`::
+  (+string+) A regular expression that is used to search for values that match
+  the category.
+
+`terms`::
+  (+string+) A space separated list of the common tokens that are matched in
+  values of the category.
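The bucket `anomaly_score` is designed for rate-controlled alerting, as the Buckets section above notes; a minimal Python sketch of threshold-based alerting over bucket results (the threshold of 75 and the second bucket entry are illustrative, not from the source):

```python
def buckets_to_alert(buckets, threshold=75.0):
    """Select final (non-interim) buckets whose aggregated and normalized
    anomaly_score crosses the alerting threshold, returning their timestamps."""
    return [
        b["timestamp"]
        for b in buckets
        if b["anomaly_score"] >= threshold and not b["is_interim"]
    ]

# First entry uses field values from the bucket example earlier in the
# document; the second is hypothetical.
buckets = [
    {"timestamp": 1454943900000, "anomaly_score": 87.2526, "is_interim": False},
    {"timestamp": 1454944200000, "anomaly_score": 12.0, "is_interim": False},
]
print(buckets_to_alert(buckets))  # [1454943900000]
```

Filtering on `is_interim` avoids alerting twice on a bucket that is first reported from partial data and later finalized.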