Update anomaly detection document for OpenSearch GA.

Date:   Mon May 17 16:28:16 2021 -0700

Signed-off-by: Alex <pengsun@amazon.com>
Alex 2021-05-24 16:27:14 -07:00
parent 58d5826866
commit 397ba68561
4 changed files with 54 additions and 54 deletions

View File

@@ -28,7 +28,7 @@ This command creates a detector named `http_requests` that finds anomalies based
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors
+POST _plugins/_anomaly_detection/detectors
{
"name": "test-detector",
"description": "Test detector",
@@ -143,7 +143,7 @@ To set a category field for high cardinality:
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors
+POST _plugins/_anomaly_detection/detectors
{
"name": "Host OK Rate Detector",
"description": "ok rate",
@@ -243,7 +243,7 @@ To create a historical detector:
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors
+POST _plugins/_anomaly_detection/detectors
{
"name": "test1",
"description": "test historical detector",
@@ -312,7 +312,7 @@ Passes a date range to the anomaly detector to return any anomalies within that
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors/<detectorId>/_preview
+POST _plugins/_anomaly_detection/detectors/<detectorId>/_preview
{
"period_start": 1588838250000,
"period_end": 1589443050000
@@ -452,7 +452,7 @@ Starts a real-time or historical anomaly detector job.
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors/<detectorId>/_start
+POST _plugins/_anomaly_detection/detectors/<detectorId>/_start
```
#### Sample response
@@ -476,7 +476,7 @@ Stops a real-time or historical anomaly detector job.
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors/<detectorId>/_stop
+POST _plugins/_anomaly_detection/detectors/<detectorId>/_stop
```
#### Sample response
@@ -494,8 +494,8 @@ Returns all results for a search query.
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/results/_search
-POST _opensearch/_anomaly_detection/detectors/results/_search
+GET _plugins/_anomaly_detection/detectors/results/_search
+POST _plugins/_anomaly_detection/detectors/results/_search
{
"query": {
@@ -596,7 +596,7 @@ To see an ordered set of anomaly records for an entity with an anomaly within a
#### Request
```json
-POST _opensearch/_anomaly_detection/detectors/results/_search
+POST _plugins/_anomaly_detection/detectors/results/_search
{
"query": {
"bool": {
@@ -782,7 +782,7 @@ To get the latest task:
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/<detector_id>?task=true
+GET _plugins/_anomaly_detection/detectors/<detector_id>?task=true
```
To query the anomaly results with `task_id`:
@@ -790,7 +790,7 @@ To query the anomaly results with `task_id`:
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/results/_search
+GET _plugins/_anomaly_detection/detectors/results/_search
{
"query": {
"term": {
@@ -940,7 +940,7 @@ To delete a detector, you need to first stop the detector.
#### Request
```json
-DELETE _opensearch/_anomaly_detection/detectors/<detectorId>
+DELETE _plugins/_anomaly_detection/detectors/<detectorId>
```
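As the hunk context above notes, a detector has to be stopped before it can be deleted. A minimal sketch of the two calls in sequence, using the updated `_plugins` paths introduced by this change (`<detectorId>` is the usual placeholder):

```json
POST _plugins/_anomaly_detection/detectors/<detectorId>/_stop
DELETE _plugins/_anomaly_detection/detectors/<detectorId>
```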
@@ -975,7 +975,7 @@ To update a detector, you need to first stop the detector.
#### Request
```json
-PUT _opensearch/_anomaly_detection/detectors/<detectorId>
+PUT _plugins/_anomaly_detection/detectors/<detectorId>
{
"name": "test-detector",
"description": "Test detector",
@@ -1091,7 +1091,7 @@ To update a historical detector:
#### Request
```json
-PUT _opensearch/_anomaly_detection/detectors/<detectorId>
+PUT _plugins/_anomaly_detection/detectors/<detectorId>
{
"name": "test1",
"description": "test historical detector",
@@ -1145,7 +1145,7 @@ Returns all information about a detector based on the `detector_id`.
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/<detectorId>
+GET _plugins/_anomaly_detection/detectors/<detectorId>
```
#### Sample response
@@ -1215,7 +1215,7 @@ Use `job=true` to get anomaly detection job information.
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/<detectorId>?job=true
+GET _plugins/_anomaly_detection/detectors/<detectorId>?job=true
```
#### Sample response
@@ -1304,7 +1304,7 @@ Use `task=true` to get historical detector task information.
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/<detectorId>?task=true
+GET _plugins/_anomaly_detection/detectors/<detectorId>?task=true
```
#### Sample response
@@ -1491,8 +1491,8 @@ Returns all anomaly detectors for a search query.
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/_search
-POST _opensearch/_anomaly_detection/detectors/_search
+GET _plugins/_anomaly_detection/detectors/_search
+POST _plugins/_anomaly_detection/detectors/_search
Sample Input:
{
@@ -1597,10 +1597,10 @@ Provides information about how the plugin is performing.
#### Request
```json
-GET _opensearch/_anomaly_detection/stats
-GET _opensearch/_anomaly_detection/<nodeId>/stats
-GET _opensearch/_anomaly_detection/<nodeId>/stats/<stat>
-GET _opensearch/_anomaly_detection/stats/<stat>
+GET _plugins/_anomaly_detection/stats
+GET _plugins/_anomaly_detection/<nodeId>/stats
+GET _plugins/_anomaly_detection/<nodeId>/stats/<stat>
+GET _plugins/_anomaly_detection/stats/<stat>
```
#### Sample response
@@ -1697,7 +1697,7 @@ Create a monitor to set up alerts for the detector.
#### Request
```json
-POST _opensearch/_alerting/monitors
+POST _plugins/_alerting/monitors
{
"type": "monitor",
"name": "test-monitor",
@@ -1919,23 +1919,23 @@ It also helps track the initialization percentage, the required shingles, and th
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile/
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile?_all=true
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile/<type>
-GET /_opensearch/_anomaly_detection/detectors/<detectorId>/_profile/<type1>,<type2>
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile/
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile?_all=true
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile/<type>
+GET /_plugins/_anomaly_detection/detectors/<detectorId>/_profile/<type1>,<type2>
```
#### Sample Responses
```json
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile
{
"state":"DISABLED",
"error":"Stopped detector: AD models memory usage exceeds our limit."
}
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&pretty
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&pretty
{
"state": "RUNNING",
@@ -1970,7 +1970,7 @@ GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&pre
}
}
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile/total_size_in_bytes
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile/total_size_in_bytes
{
"total_size_in_bytes" : 13369344
@@ -1984,7 +1984,7 @@ You can use this data to estimate how much memory is required for anomaly detect
#### Request
```json
-GET /_opensearch/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&pretty
+GET /_plugins/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&pretty
{
"state": "RUNNING",
@@ -2043,7 +2043,7 @@ If there are no anomaly results for an entity, either the entity doesn't have an
#### Request
```json
-GET /_opensearch/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&entity=i-00f28ec1eb8997686
+GET /_plugins/_anomaly_detection/detectors/<detectorId>/_profile?_all=true&entity=i-00f28ec1eb8997686
{
"category_field": "host",
"value": "i-00f28ec1eb8997686",
@@ -2067,8 +2067,8 @@ For a historical detector, specify `_all` or `ad_task` to see information about
#### Request
```json
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile?_all
-GET _opensearch/_anomaly_detection/detectors/<detectorId>/_profile/ad_task
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile?_all
+GET _plugins/_anomaly_detection/detectors/<detectorId>/_profile/ad_task
```
#### Sample Responses

View File

@@ -52,7 +52,7 @@ A feature is the field in your index that you want to check for anomalies. A det
For example, if you choose `min()`, the detector focuses on finding anomalies based on the minimum values of your feature. If you choose `average()`, the detector finds anomalies based on the average values of your feature.
-A multi-feature model correlates anomalies across all its features. The [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) makes it less likely for multi-feature models to identify smaller anomalies as compared to a single-feature model. Adding more features might negatively impact the [precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall) of a model. A higher proportion of noise in your data might further amplify this negative impact. Selecting the optimal feature set is usually an iterative process. We recommend experimenting with a historical detector with different feature sets and checking the precision before moving on to real-time detectors. By default, the maximum number of features for a detector is 5. You can adjust this limit with the `opendistro.anomaly_detection.max_anomaly_features` setting.
+A multi-feature model correlates anomalies across all its features. The [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) makes it less likely for multi-feature models to identify smaller anomalies as compared to a single-feature model. Adding more features might negatively impact the [precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall) of a model. A higher proportion of noise in your data might further amplify this negative impact. Selecting the optimal feature set is usually an iterative process. We recommend experimenting with a historical detector with different feature sets and checking the precision before moving on to real-time detectors. By default, the maximum number of features for a detector is 5. You can adjust this limit with the `plugins.anomaly_detection.max_anomaly_features` setting.
{: .note }
1. On the **Model configuration** page, enter the **Feature name**.
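The renamed `plugins.anomaly_detection.max_anomaly_features` setting mentioned above is adjusted through the cluster settings API, following the same pattern as the settings examples further down in this diff. A minimal sketch, where the value `10` is purely illustrative:

```json
PUT _cluster/settings
{
  "transient": {
    "plugins.anomaly_detection.max_anomaly_features": 10
  }
}
```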

View File

@@ -35,7 +35,7 @@ Next, enable the following setting:
PUT _cluster/settings
{
"transient": {
-"opendistro.anomaly_detection.filter_by_backend_roles": "true"
+"plugins.anomaly_detection.filter_by_backend_roles": "true"
}
}
```

View File

@@ -17,26 +17,26 @@ For example, to update the retention period of the result index:
PUT _cluster/settings
{
"transient": {
-"opendistro.anomaly_detection.ad_result_history_retention_period": "5m"
+"plugins.anomaly_detection.ad_result_history_retention_period": "5m"
}
}
```
Setting | Default | Description
:--- | :--- | :---
-`opendistro.anomaly_detection.enabled` | True | Whether the anomaly detection plugin is enabled or not. If disabled, all detectors immediately stop running.
-`opendistro.anomaly_detection.max_anomaly_detectors` | 1,000 | The maximum number of non-high cardinality detectors (no category field) users can create.
-`opendistro.anomaly_detection.max_multi_entity_anomaly_detectors` | 10 | The maximum number of high cardinality detectors (with category field) in a cluster.
-`opendistro.anomaly_detection.max_anomaly_features` | 5 | The maximum number of features for a detector.
-`opendistro.anomaly_detection.ad_result_history_rollover_period` | 12h | How often the rollover condition is checked. If `true`, the plugin rolls over the result index to a new index.
-`opendistro.anomaly_detection.ad_result_history_max_docs` | 250000000 | The maximum number of documents in one result index. The plugin only counts refreshed documents in the primary shards.
-`opendistro.anomaly_detection.ad_result_history_retention_period` | 30d | The maximum age of the result index. If its age exceeds the threshold, the plugin deletes the rolled over result index. If the cluster has only one result index, the plugin keeps the index even if it's older than its configured retention period.
-`opendistro.anomaly_detection.max_entities_per_query` | 1,000 | The maximum unique values per detection interval for high cardinality detectors. By default, if the category field has more than 1,000 unique values in a detector interval, the plugin selects the top 1,000 values and orders them by `doc_count`.
-`opendistro.anomaly_detection.max_entities_for_preview` | 30 | The maximum unique category field values displayed with the preview operation for high cardinality detectors. If the category field has more than 30 unique values, the plugin selects the top 30 values and orders them by `doc_count`.
-`opendistro.anomaly_detection.max_primary_shards` | 10 | The maximum number of primary shards an anomaly detection index can have.
-`opendistro.anomaly_detection.filter_by_backend_roles` | False | When you enable the security plugin and set this to `true`, the plugin filters results based on the user's backend role(s).
-`opendistro.anomaly_detection.max_cache_miss_handling_per_second` | 100 | High cardinality detectors use a cache to store active models. In the event of a cache miss, the cache gets the models from the model checkpoint index. Use this setting to limit the rate of fetching models. Because the thread pool for a GET operation has a queue of 1,000, we recommend setting this value below 1,000.
-`opendistro.anomaly_detection.max_batch_task_per_node` | 2 | Starting a historical detector triggers a batch task. This setting is the number of batch tasks that you can run per data node. You can tune this setting from 1 to 1000. If the data nodes can't support all batch tasks and you're not sure if the data nodes are capable of running more historical detectors, add more data nodes instead of changing this setting to a higher value.
-`opendistro.anomaly_detection.max_old_ad_task_docs_per_detector` | 10 | You can run the same historical detector many times. For each run, the anomaly detection plugin creates a new task. This setting is the number of previous tasks the plugin keeps. Set this value to at least 1 to track its last run. You can keep a maximum of 1,000 old tasks to avoid overwhelming the cluster.
-`opendistro.anomaly_detection.batch_task_piece_size` | 1000 | The date range for a historical task is split into smaller pieces and the anomaly detection plugin runs the task piece by piece. Each piece contains 1,000 detection intervals by default. For example, if detector interval is 1 minute and one piece is 1000 minutes, the feature data is queried every 1,000 minutes. You can change this setting from 1 to 10,000.
-`opendistro.anomaly_detection.batch_task_piece_interval_seconds` | 5 | Add a time interval between historical detector tasks. This interval prevents the task from consuming too much of the available resources and starving other operations like search and bulk index. You can change this setting from 1 to 600 seconds.
+`plugins.anomaly_detection.enabled` | True | Whether the anomaly detection plugin is enabled or not. If disabled, all detectors immediately stop running.
+`plugins.anomaly_detection.max_anomaly_detectors` | 1,000 | The maximum number of non-high cardinality detectors (no category field) users can create.
+`plugins.anomaly_detection.max_multi_entity_anomaly_detectors` | 10 | The maximum number of high cardinality detectors (with category field) in a cluster.
+`plugins.anomaly_detection.max_anomaly_features` | 5 | The maximum number of features for a detector.
+`plugins.anomaly_detection.ad_result_history_rollover_period` | 12h | How often the rollover condition is checked. If `true`, the plugin rolls over the result index to a new index.
+`plugins.anomaly_detection.ad_result_history_max_docs` | 250000000 | The maximum number of documents in one result index. The plugin only counts refreshed documents in the primary shards.
+`plugins.anomaly_detection.ad_result_history_retention_period` | 30d | The maximum age of the result index. If its age exceeds the threshold, the plugin deletes the rolled over result index. If the cluster has only one result index, the plugin keeps the index even if it's older than its configured retention period.
+`plugins.anomaly_detection.max_entities_per_query` | 1,000 | The maximum unique values per detection interval for high cardinality detectors. By default, if the category field has more than 1,000 unique values in a detector interval, the plugin selects the top 1,000 values and orders them by `doc_count`.
+`plugins.anomaly_detection.max_entities_for_preview` | 30 | The maximum unique category field values displayed with the preview operation for high cardinality detectors. If the category field has more than 30 unique values, the plugin selects the top 30 values and orders them by `doc_count`.
+`plugins.anomaly_detection.max_primary_shards` | 10 | The maximum number of primary shards an anomaly detection index can have.
+`plugins.anomaly_detection.filter_by_backend_roles` | False | When you enable the security plugin and set this to `true`, the plugin filters results based on the user's backend role(s).
+`plugins.anomaly_detection.max_cache_miss_handling_per_second` | 100 | High cardinality detectors use a cache to store active models. In the event of a cache miss, the cache gets the models from the model checkpoint index. Use this setting to limit the rate of fetching models. Because the thread pool for a GET operation has a queue of 1,000, we recommend setting this value below 1,000.
+`plugins.anomaly_detection.max_batch_task_per_node` | 2 | Starting a historical detector triggers a batch task. This setting is the number of batch tasks that you can run per data node. You can tune this setting from 1 to 1000. If the data nodes can't support all batch tasks and you're not sure if the data nodes are capable of running more historical detectors, add more data nodes instead of changing this setting to a higher value.
+`plugins.anomaly_detection.max_old_ad_task_docs_per_detector` | 10 | You can run the same historical detector many times. For each run, the anomaly detection plugin creates a new task. This setting is the number of previous tasks the plugin keeps. Set this value to at least 1 to track its last run. You can keep a maximum of 1,000 old tasks to avoid overwhelming the cluster.
+`plugins.anomaly_detection.batch_task_piece_size` | 1000 | The date range for a historical task is split into smaller pieces and the anomaly detection plugin runs the task piece by piece. Each piece contains 1,000 detection intervals by default. For example, if detector interval is 1 minute and one piece is 1000 minutes, the feature data is queried every 1,000 minutes. You can change this setting from 1 to 10,000.
+`plugins.anomaly_detection.batch_task_piece_interval_seconds` | 5 | Add a time interval between historical detector tasks. This interval prevents the task from consuming too much of the available resources and starving other operations like search and bulk index. You can change this setting from 1 to 600 seconds.