More redirects and spelling fixes (#4093)
* redirects and spelling

Signed-off-by: Heather Halter <hdhalter@amazon.com>

* Update _observing-your-data/ad/index.md

Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Signed-off-by: Heather Halter <HDHALTER@AMAZON.COM>

* Update _observing-your-data/ad/index.md

Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Signed-off-by: Heather Halter <HDHALTER@AMAZON.COM>

* Update _search-plugins/knn/index.md

Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Signed-off-by: Heather Halter <HDHALTER@AMAZON.COM>

---------

Signed-off-by: Heather Halter <hdhalter@amazon.com>
Signed-off-by: Heather Halter <HDHALTER@AMAZON.COM>
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
parent 2b627de8f8
commit ee7d1efd02
|
@ -2,6 +2,8 @@
|
|||
layout: default
|
||||
title: Scroll
|
||||
nav_order: 71
|
||||
redirect_from:
|
||||
- /opensearch/rest-api/scroll/
|
||||
---
|
||||
|
||||
# Scroll
|
||||
|
|
|
@ -9,7 +9,7 @@ nav_order: 20
|
|||
The OpenSearch Java high-level REST client will be deprecated starting with OpenSearch version 3.0.0 and will be removed in a future release. We recommend switching to the [Java client]({{site.url}}{{site.baseurl}}/clients/java/) instead.
|
||||
{: .warning}
|
||||
|
||||
The OpenSearch Java high-level REST client lets you interact with your OpenSearch clusters and indices through Java methods and data structures rather than HTTP methods and JSON.
|
||||
The OpenSearch Java high-level REST client lets you interact with your OpenSearch clusters and indexes through Java methods and data structures rather than HTTP methods and JSON.
|
||||
|
||||
## Setup
|
||||
|
||||
|
|
|
@ -29,7 +29,7 @@ A detector is an individual anomaly detection task. You can define multiple dete
|
|||
1. Add in the detector details.
|
||||
- Enter a name and brief description. Make sure the name is unique and descriptive enough to help you to identify the purpose of the detector.
|
||||
1. Specify the data source.
|
||||
- For **Data source**, choose the index you want to use as the data source. You can optionally use index patterns to choose multiple indices.
|
||||
- For **Data source**, choose the index you want to use as the data source. You can optionally use index patterns to choose multiple indexes.
|
||||
- (Optional) For **Data filter**, filter the index you chose as the data source. From the **Data filter** menu, choose **Add data filter**, and then design your filter query by selecting **Field**, **Operator**, and **Value**, or choose **Use query DSL** and add your own JSON filter query.
|
||||
1. Specify a timestamp.
|
||||
- Select the **Timestamp field** in your index.
|
||||
|
@ -55,12 +55,12 @@ A detector is an individual anomaly detection task. You can define multiple dete
|
|||
- To use the custom result index option, you need the following permissions:
|
||||
- `indices:admin/create` - If the custom index already exists, you don't need this.
|
||||
- `indices:data/write/index` - You need the `write` permission for the anomaly detection plugin to write results into the custom index for a single-entity detector.
|
||||
- `indices:data/read/search` - You need the `search` permission because the anomaly detection plugin needs to search custom result indices to show results on the anomaly detection UI.
|
||||
- `indices:data/read/search` - You need the `search` permission because the Anomaly Detection plugin needs to search custom result indexes to show results on the anomaly detection UI.
|
||||
- `indices:data/write/delete` - Because the detector might generate a large number of anomaly results, you need the `delete` permission to delete old data and save disk space.
|
||||
- `indices:data/write/bulk*` - You need the `bulk*` permission because the anomaly detection plugin uses the bulk API to write results into the custom index.
|
||||
- Managing the custom result index:
|
||||
- The anomaly detection dashboard queries all detectors’ results from all custom result indices. Having too many custom result indices might impact the performance of the anomaly detection plugin.
|
||||
- You can use [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/ism/index/) to rollover old result indices. You can also manually delete or archive any old result indices. We recommend reusing a custom result index for multiple detectors.
|
||||
- The anomaly detection dashboard queries all detectors’ results from all custom result indexes. Having too many custom result indexes might impact the performance of the Anomaly Detection plugin.
|
||||
- You can use [Index State Management]({{site.url}}{{site.baseurl}}/im-plugin/ism/index/) to rollover old result indexes. You can also manually delete or archive any old result indexes. We recommend reusing a custom result index for multiple detectors.
|
||||
1. Choose **Next**.
|
||||
|
||||
After you define the detector, the next step is to configure the model.
|
||||
|
|
|
@ -12,7 +12,7 @@ redirect_from:
|
|||
|
||||
You can use the Security plugin with anomaly detection in OpenSearch to limit non-admin users to specific actions. For example, you might want some users to only be able to create, update, or delete detectors, while others to only view detectors.
|
||||
|
||||
All anomaly detection indices are protected as system indices. Only a super admin user or an admin user with a TLS certificate can access system indices. For more information, see [System indices]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/).
|
||||
All anomaly detection indexes are protected as system indexes. Only a super admin user or an admin user with a TLS certificate can access system indexes. For more information, see [System indexes]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/).
|
||||
|
||||
|
||||
Security for anomaly detection works the same as [security for alerting]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/security/).
|
||||
|
|
|
@ -22,14 +22,14 @@ If these roles don't meet your needs, mix and match individual alerting [permiss
|
|||
|
||||
## How monitors access data
|
||||
|
||||
Monitors run with the permissions of the user who created or last modified them. For example, consider the user `jdoe`, who works at a chain of retail stores. `jdoe` has two roles. Together, these two roles allow read access to three indices: `store1-returns`, `store2-returns`, and `store3-returns`.
|
||||
Monitors run with the permissions of the user who created or last modified them. For example, consider the user `jdoe`, who works at a chain of retail stores. `jdoe` has two roles. Together, these two roles allow read access to three indexes: `store1-returns`, `store2-returns`, and `store3-returns`.
|
||||
|
||||
`jdoe` creates a monitor that sends an email to management whenever the number of returns across all three indices exceeds 40 per hour.
|
||||
`jdoe` creates a monitor that sends an email to management whenever the number of returns across all three indexes exceeds 40 per hour.
|
||||
|
||||
Later, the user `psantos` wants to edit the monitor to run every two hours, but `psantos` only has access to `store1-returns`. To make the change, `psantos` has two options:
|
||||
|
||||
- Update the monitor so that it only checks `store1-returns`.
|
||||
- Ask an administrator for read access to the other two indices.
|
||||
- Ask an administrator for read access to the other two indexes.
|
||||
|
||||
After making the change, the monitor now runs with the same permissions as `psantos`, including any [document-level security]({{site.url}}{{site.baseurl}}/security/access-control/document-level-security/) queries, [excluded fields]({{site.url}}{{site.baseurl}}/security/access-control/field-level-security/), and [masked fields]({{site.url}}{{site.baseurl}}/security/access-control/field-masking/). If you use an extraction query to define your monitor, use the **Run** button to ensure that the response includes the fields you need.
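The `jdoe`/`psantos` scenario above can be modeled as a set difference. The following is a minimal sketch, not the Security plugin's actual permission evaluation: the function name `missing_read_access` is hypothetical, and the model ignores index patterns, document-level security, and field-level security.

```python
def missing_read_access(monitor_indexes, user_readable_indexes):
    """Return the monitor's indexes that the editing user cannot read, sorted."""
    return sorted(set(monitor_indexes) - set(user_readable_indexes))


# Indexes the monitor checks, as created by jdoe.
monitor = ["store1-returns", "store2-returns", "store3-returns"]
# Indexes psantos can read.
psantos = ["store1-returns"]

missing = missing_read_access(monitor, psantos)
print(missing)  # ['store2-returns', 'store3-returns']
```

An empty result would mean the edit can proceed as is. A non-empty result corresponds to the two options listed above: narrow the monitor to the readable indexes, or request read access to the missing ones.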
|
||||
|
||||
|
|
|
@ -10,9 +10,9 @@ redirect_from:
|
|||
# Management
|
||||
|
||||
|
||||
## Alerting indices
|
||||
## Alerting indexes
|
||||
|
||||
The alerting feature creates several indices and one alias. The Security plugin demo script configures them as [system indices]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/) for an extra layer of protection. Don't delete these indices or modify their contents without using the alerting APIs.
|
||||
The alerting feature creates several indexes and one alias. The Security plugin demo script configures them as [system indexes]({{site.url}}{{site.baseurl}}/security/configuration/system-indices/) for an extra layer of protection. Don't delete these indexes or modify their contents without using the alerting APIs.
|
||||
|
||||
Index | Purpose
|
||||
:--- | :---
|
||||
|
@ -21,7 +21,7 @@ Index | Purpose
|
|||
`.opendistro-alerting-config` | Stores monitors, triggers, and destinations. [Take a snapshot]({{site.url}}{{site.baseurl}}/opensearch/snapshots/snapshot-restore) of this index to back up your alerting configuration.
|
||||
`.opendistro-alerting-alert-history-write` (alias) | Provides a consistent URI for the `.opendistro-alerting-alert-history-<date>` index.
|
||||
|
||||
All alerting indices are hidden by default. For a summary, make the following request:
|
||||
All alerting indexes are hidden by default. For a summary, make the following request:
|
||||
|
||||
```
|
||||
GET _cat/indices?expand_wildcards=open,hidden
|
||||
|
@ -44,14 +44,14 @@ Setting | Default | Description
|
|||
`plugins.alerting.bulk_timeout` | 120s | How long the monitor can write alerts to the alert index.
|
||||
`plugins.alerting.alert_backoff_count` | 3 | The number of retries for writing alerts before the operation fails.
|
||||
`plugins.alerting.alert_backoff_millis` | 50ms | The amount of time to wait between retries---increases exponentially after each failed retry.
|
||||
`plugins.alerting.alert_history_rollover_period` | 12h | How frequently to check whether the `.opendistro-alerting-alert-history-write` alias should roll over to a new history index and whether the Alerting plugin should delete any history indices.
|
||||
`plugins.alerting.alert_history_rollover_period` | 12h | How frequently to check whether the `.opendistro-alerting-alert-history-write` alias should roll over to a new history index and whether the Alerting plugin should delete any history indexes.
|
||||
`plugins.alerting.move_alerts_backoff_millis` | 250 | The amount of time to wait between retries---increases exponentially after each failed retry.
|
||||
`plugins.alerting.move_alerts_backoff_count` | 3 | The number of retries for moving alerts to a deleted state after their monitor or trigger has been deleted.
|
||||
`plugins.alerting.monitor.max_monitors` | 1000 | The maximum number of monitors users can create.
|
||||
`plugins.alerting.alert_history_max_age` | 30d | The oldest document to store in the `.opendistro-alert-history-<date>` index before creating a new index. If the number of alerts in this time period does not exceed `alert_history_max_docs`, alerting creates one history index per period (e.g. one index every 30 days).
|
||||
`plugins.alerting.alert_history_max_docs` | 1000 | The maximum number of alerts to store in the `.opendistro-alert-history-<date>` index before creating a new index.
|
||||
`plugins.alerting.alert_history_enabled` | true | Whether to create `.opendistro-alerting-alert-history-<date>` indices.
|
||||
`plugins.alerting.alert_history_retention_period` | 60d | The amount of time to keep history indices before automatically deleting them.
|
||||
`plugins.alerting.alert_history_enabled` | true | Whether to create `.opendistro-alerting-alert-history-<date>` indexes.
|
||||
`plugins.alerting.alert_history_retention_period` | 60d | The amount of time to keep history indexes before automatically deleting them.
|
||||
`plugins.alerting.destination.allow_list` | ["chime", "slack", "custom_webhook", "email", "test_action"] | The list of allowed destinations. If you don't want to allow users to a certain type of destination, you can remove it from this list, but we recommend leaving this setting as-is.
|
||||
`plugins.alerting.filter_by_backend_roles` | "false" | Restricts access to monitors by backend role. See [Alerting security]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/security/).
|
||||
`plugins.scheduled_jobs.sweeper.period` | 5m | The alerting feature uses its "job sweeper" component to periodically check for new or updated jobs. This setting is the rate at which the sweeper checks to see if any jobs (monitors) have changed and need to be rescheduled.
|
||||
|
|
|
@ -44,7 +44,7 @@ GET opensearch_dashboards_sample_data_ecommerce/_search
|
|||
}
|
||||
```
|
||||
|
||||
#### Sample Response
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
...
|
||||
|
@ -194,7 +194,7 @@ GET opensearch_dashboards_sample_data_ecommerce/_search
|
|||
}
|
||||
```
|
||||
|
||||
#### Sample Response
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
...
|
||||
|
@ -621,7 +621,7 @@ GET opensearch_dashboards_sample_data_logs/_search
|
|||
}
|
||||
```
|
||||
|
||||
#### Sample Response
|
||||
#### Example response
|
||||
|
||||
```json
|
||||
...
|
||||
|
|
|
@ -651,11 +651,11 @@ GET opensearch_dashboards_sample_data_logs/_search
|
|||
|
||||
## derivative
|
||||
|
||||
The `derivative` aggregation is a parent aggregation that calculates 1st order and 2nd order derivates of each bucket of a previous aggregation.
|
||||
The `derivative` aggregation is a parent aggregation that calculates 1st order and 2nd order derivatives of each bucket of a previous aggregation.
|
||||
|
||||
In mathematics, the derivative of a function measures its sensitivity to change. In other words, a derivative evaluates the rate of change in some function with respect to some variable. To learn more about derivates, see [Wikipedia](https://en.wikipedia.org/wiki/Derivative).
|
||||
In mathematics, the derivative of a function measures its sensitivity to change. In other words, a derivative evaluates the rate of change in some function with respect to some variable. To learn more about derivatives, see [Wikipedia](https://en.wikipedia.org/wiki/Derivative).
|
||||
|
||||
You can use derivates to calculate the rate of change of numeric values compared to its previous time periods.
|
||||
You can use derivatives to calculate the rate of change of numeric values compared to its previous time periods.
|
||||
|
||||
The 1st order derivative indicates whether a metric is increasing or decreasing, and by how much it's increasing or decreasing.
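To make the idea concrete, the following sketch computes 1st order and 2nd order derivatives as differences between consecutive bucket values. It is a plain-Python illustration of the arithmetic, not the aggregation's implementation; the function name `derivative` and the sample bucket values are hypothetical.

```python
def derivative(values):
    """Return the difference between each bucket and the previous one.

    The first bucket has no predecessor, so its derivative is None.
    A None input value propagates as None.
    """
    if not values:
        return []
    out = [None]
    for prev, cur in zip(values, values[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out


# Hypothetical metric values, one per bucket of the parent aggregation.
buckets = [100, 150, 130, 180]

first_order = derivative(buckets)
second_order = derivative(first_order)

print(first_order)   # [None, 50, -20, 50]
print(second_order)  # [None, None, -70, 70]
```

The sign of each 1st order value shows whether the metric rose or fell between buckets, and the magnitude shows by how much.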
|
||||
|
||||
|
|
|
@ -16,7 +16,7 @@ To use the analyzer when you map an index, specify the value within your query.
|
|||
"analyzer": "french"
|
||||
```
|
||||
|
||||
#### Sample Request
|
||||
#### Example request
|
||||
|
||||
The following query maps an index with the language analyzer set to `french`:
|
||||
|
||||
|
|
|
@ -23,12 +23,12 @@ GET /_plugins/_knn/nodeId1,nodeId2/stats/statName1,statName2
|
|||
Statistic | Description
|
||||
:--- | :---
|
||||
`circuit_breaker_triggered` | Indicates whether the circuit breaker is triggered. This statistic is only relevant to approximate k-NN search.
|
||||
`total_load_time` | The time in nanoseconds that k-NN has taken to load native library indices into the cache. This statistic is only relevant to approximate k-NN search.
|
||||
`eviction_count` | The number of native library indices that have been evicted from the cache due to memory constraints or idle time. This statistic is only relevant to approximate k-NN search. <br /> **Note**: Explicit evictions that occur because of index deletion aren't counted.
|
||||
`total_load_time` | The time in nanoseconds that k-NN has taken to load native library indexes into the cache. This statistic is only relevant to approximate k-NN search.
|
||||
`eviction_count` | The number of native library indexes that have been evicted from the cache due to memory constraints or idle time. This statistic is only relevant to approximate k-NN search. <br /> **Note**: Explicit evictions that occur because of index deletion aren't counted.
|
||||
`hit_count` | The number of cache hits. A cache hit occurs when a user queries a native library index that's already loaded into memory. This statistic is only relevant to approximate k-NN search.
|
||||
`miss_count` | The number of cache misses. A cache miss occurs when a user queries a native library index that isn't loaded into memory yet. This statistic is only relevant to approximate k-NN search.
|
||||
`graph_memory_usage` | The amount of native memory native library indices are using on the node in kilobytes.
|
||||
`graph_memory_usage_percentage` | The amount of native memory native library indices are using on the node as a percentage of the maximum cache capacity.
|
||||
`graph_memory_usage` | The amount of native memory native library indexes are using on the node in kilobytes.
|
||||
`graph_memory_usage_percentage` | The amount of native memory native library indexes are using on the node as a percentage of the maximum cache capacity.
|
||||
`graph_index_requests` | The number of requests to add the `knn_vector` field of a document into a native library index.
|
||||
`graph_index_errors` | The number of requests to add the `knn_vector` field of a document into a native library index that have produced an error.
|
||||
`graph_query_requests` | The number of native library index queries that have been made.
|
||||
|
@ -37,7 +37,7 @@ Statistic | Description
|
|||
`cache_capacity_reached` | Whether `knn.memory.circuit_breaker.limit` has been reached. This statistic is only relevant to approximate k-NN search.
|
||||
`load_success_count` | The number of times k-NN successfully loaded a native library index into the cache. This statistic is only relevant to approximate k-NN search.
|
||||
`load_exception_count` | The number of times an exception occurred when trying to load a native library index into the cache. This statistic is only relevant to approximate k-NN search.
|
||||
`indices_in_cache` | For each OpenSearch index with a `knn_vector` field and approximate k-NN turned on, this statistic provides the number of native library indices that OpenSearch index has and the total `graph_memory_usage` that the OpenSearch index is using, in kilobytes.
|
||||
`indices_in_cache` | For each OpenSearch index with a `knn_vector` field and approximate k-NN turned on, this statistic provides the number of native library indexes that OpenSearch index has and the total `graph_memory_usage` that the OpenSearch index is using, in kilobytes.
|
||||
`script_compilations` | The number of times the k-NN script has been compiled. This value should usually be 1 or 0, but if the cache containing the compiled scripts is filled, the k-NN script might be recompiled. This statistic is only relevant to k-NN score script search.
|
||||
`script_compilation_errors` | The number of errors during script compilation. This statistic is only relevant to k-NN score script search.
|
||||
`script_query_requests` | The total number of script queries. This statistic is only relevant to k-NN score script search.
|
||||
|
@ -129,18 +129,18 @@ GET /_plugins/_knn/HYMrXXsBSamUkcAjhjeN0w/stats/circuit_breaker_triggered,graph_
|
|||
Introduced 1.0
|
||||
{: .label .label-purple }
|
||||
|
||||
The native library indices used to perform approximate k-Nearest Neighbor (k-NN) search are stored as special files with other Apache Lucene segment files. In order for you to perform a search on these indices using the k-NN plugin, the plugin needs to load these files into native memory.
|
||||
The native library indexes used to perform approximate k-Nearest Neighbor (k-NN) search are stored as special files with other Apache Lucene segment files. In order for you to perform a search on these indexes using the k-NN plugin, the plugin needs to load these files into native memory.
|
||||
|
||||
If the plugin hasn't loaded the files into native memory, it loads them when it receives a search request. The loading time can cause high latency during initial queries. To avoid this situation, users often run random queries during a warmup period. After this warmup period, the files are loaded into native memory and their production workloads can begin. This loading process is indirect and requires extra effort.
|
||||
|
||||
As an alternative, you can avoid this latency issue by running the k-NN plugin warmup API operation on whatever indices you're interested in searching. This operation loads all the native library files for all of the shards (primaries and replicas) of all the indices specified in the request into native memory.
|
||||
As an alternative, you can avoid this latency issue by running the k-NN plugin warmup API operation on whatever indexes you're interested in searching. This operation loads all the native library files for all of the shards (primaries and replicas) of all the indexes specified in the request into native memory.
|
||||
|
||||
After the process finishes, you can start searching against the indices with no initial latency penalties. The warmup API operation is idempotent, so if a segment's native library files are already loaded into memory, this operation has no impact. It only loads files that aren't currently in memory.
|
||||
After the process finishes, you can start searching against the indexes with no initial latency penalties. The warmup API operation is idempotent, so if a segment's native library files are already loaded into memory, this operation has no impact. It only loads files that aren't currently in memory.
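The idempotent behavior described above can be modeled in a few lines. This is a toy sketch under stated assumptions: `warmup_path` and `warm_up` are hypothetical helpers, the file names are invented, and a Python set stands in for native memory. It does not call OpenSearch.

```python
def warmup_path(indexes):
    """Build the warmup request path for a list of index names."""
    if not indexes:
        raise ValueError("at least one index name is required")
    return "/_plugins/_knn/warmup/" + ",".join(indexes)


def warm_up(loaded, segment_files):
    """Load only files that are not yet in memory; return the newly loaded ones."""
    newly_loaded = [f for f in segment_files if f not in loaded]
    loaded.update(newly_loaded)
    return newly_loaded


print(warmup_path(["index1", "index2", "index3"]))
# /_plugins/_knn/warmup/index1,index2,index3

memory = set()                             # stand-in for native memory
files = ["seg_a.hnsw", "seg_b.hnsw"]       # hypothetical native library files
print(warm_up(memory, files))  # ['seg_a.hnsw', 'seg_b.hnsw']
print(warm_up(memory, files))  # [] because the second call has nothing left to load
```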
|
||||
|
||||
|
||||
### Usage
|
||||
|
||||
This request performs a warmup on three indices:
|
||||
This request performs a warmup on three indexes:
|
||||
|
||||
```json
|
||||
GET /_plugins/_knn/warmup/index1,index2,index3?pretty
|
||||
|
@ -168,11 +168,11 @@ After the operation has finished, use the [k-NN `_stats` API operation](#stats)
|
|||
|
||||
For the warmup operation to function properly, follow these best practices:
|
||||
|
||||
* Don't run merge operations on indices that you want to warm up. During merge, the k-NN plugin creates new segments, and old segments are sometimes deleted. For example, you could encounter a situation in which the warmup API operation loads native library indices A and B into native memory, but segment C is created from segments A and B being merged. The native library indices for A and B would no longer be in memory, and native library index C would also not be in memory. In this case, the initial penalty for loading native library index C is still present.
|
||||
* Don't run merge operations on indexes that you want to warm up. During merge, the k-NN plugin creates new segments, and old segments are sometimes deleted. For example, you could encounter a situation in which the warmup API operation loads native library indexes A and B into native memory, but segment C is created from segments A and B being merged. The native library indexes for A and B would no longer be in memory, and native library index C would also not be in memory. In this case, the initial penalty for loading native library index C is still present.
|
||||
|
||||
* Confirm that all native library indices you want to warm up can fit into native memory. For more information about the native memory limit, see the [knn.memory.circuit_breaker.limit statistic]({{site.url}}{{site.baseurl}}/search-plugins/knn/settings#cluster-settings). High graph memory usage causes cache thrashing, which can lead to operations constantly failing and attempting to run again.
|
||||
* Confirm that all native library indexes you want to warm up can fit into native memory. For more information about the native memory limit, see the [knn.memory.circuit_breaker.limit statistic]({{site.url}}{{site.baseurl}}/search-plugins/knn/settings#cluster-settings). High graph memory usage causes cache thrashing, which can lead to operations constantly failing and attempting to run again.
|
||||
|
||||
* Don't index any documents that you want to load into the cache. Writing new information to segments prevents the warmup API operation from loading the native library indices until they're searchable. This means that you would have to run the warmup operation again after indexing finishes.
|
||||
* Don't index any documents that you want to load into the cache. Writing new information to segments prevents the warmup API operation from loading the native library indexes until they're searchable. This means that you would have to run the warmup operation again after indexing finishes.
|
||||
|
||||
## Get Model
|
||||
Introduced 1.2
|
||||
|
@ -294,7 +294,7 @@ DELETE /_plugins/_knn/models/{model_id}
|
|||
Introduced 1.2
|
||||
{: .label .label-purple }
|
||||
|
||||
Create and train a model that can be used for initializing k-NN native library indices during indexing. This API will
|
||||
Create and train a model that can be used for initializing k-NN native library indexes during indexing. This API will
|
||||
pull training data from a `knn_vector` field in a training index and then create and train a model and then serialize it
|
||||
to the model system index. Training data must match the dimension passed into the body of the request. This request
|
||||
will return when training begins. To monitor the state of the model, use the [Get model API](#get-model).
|
||||
|
@ -312,7 +312,7 @@ Request Parameter | Description
|
|||
`max_training_vector_count` | (Optional) Maximum number of vectors from the training index to use for training. Defaults to all of the vectors in the index.
|
||||
`search_size` | (Optional) Training data is pulled from the training index with scroll queries. Defines the number of results to return per scroll query. Defaults to 10,000.
|
||||
`description` | (Optional) User provided description of the model.
|
||||
`method` | Configuration of ANN method used for search. For more information on possible methods, refer to the [method documentation]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#method-definitions). Method must require training to be valid.
|
||||
`method` | Configuration of ANN method used for search. For more information about possible methods, refer to the [method documentation]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#method-definitions). Method must require training to be valid.
|
||||
|
||||
|
||||
### Usage
|
||||
|
|
|
@ -12,7 +12,7 @@ redirect_from:
|
|||
|
||||
Short for *k-nearest neighbors*, the k-NN plugin enables users to search for the k-nearest neighbors to a query point across an index of vectors. To determine the neighbors, you can specify the space (the distance function) you want to use to measure the distance between points.
|
||||
|
||||
Use cases include recommendations (for example, an "other songs you might like" feature in a music application), image recognition, and fraud detection. For more background information on k-NN search, see [Wikipedia](https://en.wikipedia.org/wiki/Nearest_neighbor_search).
|
||||
Use cases include recommendations (for example, an "other songs you might like" feature in a music application), image recognition, and fraud detection. For more background information about k-NN search, see [Wikipedia](https://en.wikipedia.org/wiki/Nearest_neighbor_search).
|
||||
|
||||
This plugin supports three different methods for obtaining the k-nearest neighbors from an index of vectors:
|
||||
|
||||
|
@ -20,7 +20,7 @@ This plugin supports three different methods for obtaining the k-nearest neighbo
|
|||
|
||||
The first method takes an approximate nearest neighbor approach---it uses one of several algorithms to return the approximate k-nearest neighbors to a query vector. Usually, these algorithms sacrifice indexing speed and search accuracy in return for performance benefits such as lower latency, smaller memory footprints and more scalable search. To learn more about the algorithms, refer to [*nmslib*](https://github.com/nmslib/nmslib/blob/master/manual/README.md)'s and [*faiss*](https://github.com/facebookresearch/faiss/wiki)'s documentation.
|
||||
|
||||
Approximate k-NN is the best choice for searches over large indices (i.e. hundreds of thousands of vectors or more) that require low latency. You should not use approximate k-NN if you want to apply a filter on the index before the k-NN search, which greatly reduces the number of vectors to be searched. In this case, you should use either the script scoring method or painless extensions.
|
||||
Approximate k-NN is the best choice for searches over large indexes (that is, hundreds of thousands of vectors or more) that require low latency. You should not use approximate k-NN if you want to apply a filter on the index before the k-NN search, which greatly reduces the number of vectors to be searched. In this case, you should use either the script scoring method or Painless extensions.
|
||||
|
||||
For more details about this method, including recommendations for which engine to use, see [Approximate k-NN search]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/).
|
||||
|
||||
|
@ -28,7 +28,7 @@ This plugin supports three different methods for obtaining the k-nearest neighbo
|
|||
|
||||
The second method extends OpenSearch's script scoring functionality to execute a brute force, exact k-NN search over "knn_vector" fields or fields that can represent binary objects. With this approach, you can run k-NN search on a subset of vectors in your index (sometimes referred to as a pre-filter search).
|
||||
|
||||
Use this approach for searches over smaller bodies of documents or when a pre-filter is needed. Using this approach on large indices may lead to high latencies.
|
||||
Use this approach for searches over smaller bodies of documents or when a pre-filter is needed. Using this approach on large indexes may lead to high latencies.
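As a conceptual illustration of a brute force, exact k-NN search with a pre-filter, the sketch below filters the documents first and then ranks every remaining vector by distance. It is not the plugin's script scoring code: the function `exact_knn`, the document layout, and the choice of Euclidean distance are assumptions made for the example.

```python
import math


def exact_knn(docs, query, k, keep=lambda doc: True):
    """Return the k documents nearest to `query` among those passing `keep`.

    Every candidate is compared with the query, so cost grows linearly
    with the number of documents that survive the filter.
    """
    candidates = [d for d in docs if keep(d)]
    candidates.sort(key=lambda d: math.dist(d["vector"], query))
    return candidates[:k]


docs = [
    {"id": "a", "color": "red", "vector": [0.0, 0.0]},
    {"id": "b", "color": "blue", "vector": [1.0, 1.0]},
    {"id": "c", "color": "red", "vector": [3.0, 4.0]},
    {"id": "d", "color": "red", "vector": [1.0, 0.0]},
]

# Pre-filter to red documents, then take the 2 nearest to the query point.
nearest = exact_knn(docs, [1.0, 1.0], k=2, keep=lambda d: d["color"] == "red")
print([d["id"] for d in nearest])  # ['d', 'a']
```

Document `b` is the closest vector overall but is excluded by the filter, which is the behavior a pre-filter search is meant to provide. The linear scan also shows why this approach can be slow on large indexes.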
|
||||
|
||||
For more details about this method, see [Exact k-NN with scoring script]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-score-script/).
|
||||
|
||||
|
|
|
@ -229,7 +229,7 @@ At the moment, several parameters defined in the settings are in the deprecation
|
|||
|
||||
Setting | Default | Updateable | Description
|
||||
:--- | :--- | :--- | :---
|
||||
`index.knn` | false | false | Whether the index should build native library indices for the `knn_vector` fields. If set to false, the `knn_vector` fields will be stored in doc values, but Approximate k-NN search functionality will be disabled.
|
||||
`index.knn` | false | false | Whether the index should build native library indexes for the `knn_vector` fields. If set to false, the `knn_vector` fields will be stored in doc values, but Approximate k-NN search functionality will be disabled.
|
||||
`index.knn.algo_param.ef_search` | 512 | true | The size of the dynamic list used during k-NN searches. Higher values lead to more accurate but slower searches. Only available for nmslib.
|
||||
`index.knn.algo_param.ef_construction` | 512 | false | Deprecated in 1.0.0. Use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value instead.
|
||||
`index.knn.algo_param.m` | 16 | false | Deprecated in 1.0.0. Use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value instead.
|
||||
|
|
|
@ -8,7 +8,7 @@ nav_order: 45
|
|||
# Performance tuning
|
||||
|
||||
This topic provides performance tuning recommendations to improve indexing and search performance for approximate k-NN (ANN). From a high level, k-NN works according to these principles:
|
||||
* Native library indices are created per knn_vector field / (Lucene) segment pair.
|
||||
* Native library indexes are created per knn_vector field / (Lucene) segment pair.
|
||||
* Queries execute on segments sequentially inside the shard (same as any other OpenSearch query).
|
||||
* Each native library index in the segment returns <=k neighbors.
|
||||
* The coordinator node picks the final `size` neighbors from the neighbors returned by each shard.
|
||||
|
@ -35,7 +35,7 @@ Take the following steps to improve indexing performance, especially when you pl
|
|||
|
||||
* **Disable replicas (no OpenSearch replica shard)**
|
||||
|
||||
Set replicas to `0` to prevent duplicate construction of native library indices in both primary and replica shards. When you enable replicas after indexing finishes, the serialized native library indices are directly copied. If you have no replicas, losing nodes might cause data loss, so it's important that the data lives elsewhere so this initial load can be retried in case of an issue.
|
||||
Set replicas to `0` to prevent duplicate construction of native library indexes in both primary and replica shards. When you enable replicas after indexing finishes, the serialized native library indexes are directly copied. If you have no replicas, losing nodes might cause data loss, so it's important that the data lives elsewhere so this initial load can be retried in case of an issue.
|
||||
|
||||
* **Increase the number of indexing threads**
|
||||
|
||||
|
@ -57,11 +57,11 @@ Take the following steps to improve search performance:
|
|||
|
||||
* **Warm up the index**
|
||||
|
||||
Native library indices are constructed during indexing, but they're loaded into memory during the first search. In Lucene, each segment is searched sequentially (so, for k-NN, each segment returns up to k nearest neighbors of the query point), and the top 'size' number of results based on the score are returned from all the results returned by segements at a shard level (higher score = better result).
|
||||
Native library indexes are constructed during indexing, but they're loaded into memory during the first search. In Lucene, each segment is searched sequentially (so, for k-NN, each segment returns up to k nearest neighbors of the query point), and the top 'size' number of results based on the score are returned from all the results returned by segments at a shard level (higher score = better result).
|
||||
|
||||
Once a native library index is loaded (native library indices are loaded outside OpenSearch JVM), OpenSearch caches them in memory. Initial queries are expensive and take a few seconds, while subsequent queries are faster and take milliseconds (assuming the k-NN circuit breaker isn't hit).
|
||||
Once a native library index is loaded (native library indexes are loaded outside OpenSearch JVM), OpenSearch caches them in memory. Initial queries are expensive and take a few seconds, while subsequent queries are faster and take milliseconds (assuming the k-NN circuit breaker isn't hit).
|
||||
|
||||
To avoid this latency penalty during your first queries, you can use the warmup API operation on the indices you want to search:
|
||||
To avoid this latency penalty during your first queries, you can use the warmup API operation on the indexes you want to search:
|
||||
|
||||
```json
|
||||
GET /_plugins/_knn/warmup/index1,index2,index3?pretty
|
||||
|
@ -74,9 +74,9 @@ Take the following steps to improve search performance:
|
|||
}
|
||||
```
|
||||
|
||||
The warmup API operation loads all native library indices for all shards (primary and replica) for the specified indices into the cache, so there's no penalty to load native library indices during initial searches.
|
||||
The warmup API operation loads all native library indexes for all shards (primary and replica) for the specified indexes into the cache, so there's no penalty to load native library indexes during initial searches.
|
||||
|
||||
**Note**: This API operation only loads the segments of the indices it ***sees*** into the cache. If a merge or refresh operation finishes after the API runs, or if you add new documents, you need to rerun the API to load those native library indices into memory.
|
||||
**Note**: This API operation only loads the segments of the indexes it ***sees*** into the cache. If a merge or refresh operation finishes after the API runs, or if you add new documents, you need to rerun the API to load those native library indexes into memory.
|
||||
|
||||
* **Avoid reading stored fields**
|
||||
|
||||
|
|
|
@ -14,11 +14,11 @@ The k-NN plugin adds several new cluster settings.
|
|||
Setting | Default | Description
|
||||
:--- | :--- | :---
|
||||
`knn.algo_param.index_thread_qty` | 1 | The number of threads used for native library index creation. Keeping this value low reduces the CPU impact of the k-NN plugin, but also reduces indexing performance.
|
||||
`knn.cache.item.expiry.enabled` | false | Whether to remove native library indices that have not been accessed for a certain duration from memory.
|
||||
`knn.cache.item.expiry.enabled` | false | Whether to remove native library indexes that have not been accessed for a certain duration from memory.
|
||||
`knn.cache.item.expiry.minutes` | 3h | If enabled, the idle time before removing a native library index from memory.
|
||||
`knn.circuit_breaker.unset.percentage` | 75% | The native memory usage threshold for the circuit breaker. Memory usage must be below this percentage of `knn.memory.circuit_breaker.limit` for `knn.circuit_breaker.triggered` to remain false.
|
||||
`knn.circuit_breaker.triggered` | false | True when memory usage exceeds the `knn.circuit_breaker.unset.percentage` value.
|
||||
`knn.memory.circuit_breaker.limit` | 50% | The native memory limit for native library indices. At the default value, if a machine has 100 GB of memory and the JVM uses 32 GB, the k-NN plugin uses 50% of the remaining 68 GB (34 GB). If memory usage exceeds this value, k-NN removes the least recently used native library indices.
|
||||
`knn.memory.circuit_breaker.limit` | 50% | The native memory limit for native library indexes. At the default value, if a machine has 100 GB of memory and the JVM uses 32 GB, the k-NN plugin uses 50% of the remaining 68 GB (34 GB). If memory usage exceeds this value, k-NN removes the least recently used native library indexes.
|
||||
`knn.memory.circuit_breaker.enabled` | true | Whether to enable the k-NN memory circuit breaker.
|
||||
`knn.plugin.enabled`| true | Enables or disables the k-NN plugin.
|
||||
`knn.model.index.number_of_shards`| 1 | Number of shards to use for the model system index, the OpenSearch index that stores the models used for Approximate k-NN Search.
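
The arithmetic behind `knn.memory.circuit_breaker.limit` and `knn.circuit_breaker.unset.percentage` can be sketched as follows. This is only an illustration of the worked example in the table (100 GB machine, 32 GB JVM heap), not the plugin's implementation; the function names are hypothetical.

```python
def knn_native_memory_limit_gb(total_gb: float, jvm_heap_gb: float,
                               limit_pct: float = 50.0) -> float:
    """Native memory budget for native library indexes, in GB.

    Assumes the limit is a percentage of the memory left after the JVM heap,
    as described for `knn.memory.circuit_breaker.limit`.
    """
    return (total_gb - jvm_heap_gb) * limit_pct / 100.0


def circuit_breaker_unset_threshold_gb(limit_gb: float,
                                       unset_pct: float = 75.0) -> float:
    """Usage level, in GB, below which the circuit breaker stays untriggered.

    Assumes `knn.circuit_breaker.unset.percentage` is applied to the limit above.
    """
    return limit_gb * unset_pct / 100.0


limit = knn_native_memory_limit_gb(total_gb=100, jvm_heap_gb=32)
threshold = circuit_breaker_unset_threshold_gb(limit)
print(limit, threshold)
```

With the default values, the budget is 34 GB, and the unset threshold is 75% of that budget.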
|
||||
|
|
|
@ -167,7 +167,7 @@ The Delete PITs by ID API fully supports deleting cross-cluster PITs.
|
|||
|
||||
The Delete All PITs API deletes only local PITs or mixed PITs (PITs created in both local and remote clusters). It does not delete fully remote PITs.
|
||||
|
||||
#### Sample Request: Delete all PITs
|
||||
#### Example request: Delete all PITs
|
||||
|
||||
```json
|
||||
DELETE /_search/point_in_time/_all
|
||||
|
|
|
@ -264,7 +264,7 @@ GET _search/template
|
|||
|
||||
### Loops
|
||||
|
||||
You can also use the section tag to implement a foreach loop:
|
||||
You can also use the section tag to implement a for each loop:
|
||||
|
||||
```
|
||||
{% raw %}{{#var}}{{.}}}{{/var}}{% endraw %}
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Full-Text Search
|
||||
parent: SQL and PPL
|
||||
nav_order: 11
|
||||
redirect_from:
|
||||
- /search-plugins/sql/sql-full-text/
|
||||
---
|
||||
|
||||
# Full-text search
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Functions
|
||||
parent: SQL and PPL
|
||||
nav_order: 10
|
||||
redirect_from:
|
||||
- /search-plugins/sql/functions/
|
||||
---
|
||||
|
||||
# Functions
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Identifiers
|
||||
parent: SQL and PPL
|
||||
nav_order: 6
|
||||
redirect_from:
|
||||
- /search-plugins/ppl/identifiers/
|
||||
---
|
||||
|
||||
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Limitations
|
||||
parent: SQL and PPL
|
||||
nav_order: 99
|
||||
redirect_from:
|
||||
- /search-plugins/sql/limitation/
|
||||
---
|
||||
|
||||
# Limitations
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Monitoring
|
||||
parent: SQL and PPL
|
||||
nav_order: 95
|
||||
redirect_from:
|
||||
- /search-plugins/sql/monitoring/
|
||||
---
|
||||
|
||||
# Monitoring
|
||||
|
|
|
@ -52,7 +52,7 @@ search source=accounts
|
|||
| fields firstname, lastname
|
||||
```
|
||||
|
||||
#### Sample Response
|
||||
#### Example response
|
||||
|
||||
firstname | lastname |
|
||||
:--- | :--- |
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Settings
|
||||
parent: SQL and PPL
|
||||
nav_order: 77
|
||||
redirect_from:
|
||||
- /search-plugins/sql/settings/
|
||||
---
|
||||
|
||||
# Settings
|
||||
|
|
|
@ -4,6 +4,8 @@ title: Aggregate Functions
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 11
|
||||
redirect_from:
|
||||
- /search-plugins/sql/aggregations/
|
||||
---
|
||||
|
||||
# Aggregate functions
|
||||
|
|
|
@ -4,6 +4,8 @@ title: Basic Queries
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 5
|
||||
redirect_from:
|
||||
- /search-plugins/sql/basic/
|
||||
---
|
||||
|
||||
|
||||
|
@ -11,7 +13,7 @@ nav_order: 5
|
|||
|
||||
Use the `SELECT` clause, along with `FROM`, `WHERE`, `GROUP BY`, `HAVING`, `ORDER BY`, and `LIMIT` to search and aggregate data.
|
||||
|
||||
Among these clauses, `SELECT` and `FROM` are required, as they specify which fields to retrieve and which indices to retrieve them from. All other clauses are optional. Use them according to your needs.
|
||||
Among these clauses, `SELECT` and `FROM` are required, as they specify which fields to retrieve and which indexes to retrieve them from. All other clauses are optional. Use them according to your needs.
|
||||
|
||||
### Syntax
|
||||
|
||||
|
@ -164,7 +166,7 @@ FROM accounts acc
|
|||
| 13 | 28
|
||||
| 18 | 33
|
||||
|
||||
*Example 2*: Use index patterns to query indices that match a specific pattern:
|
||||
*Example 2*: Use index patterns to query indexes that match a specific pattern:
|
||||
|
||||
```sql
|
||||
SELECT account_number
|
||||
|
@ -210,7 +212,7 @@ WHERE account_number = 1
|
|||
| :---
|
||||
| 1
|
||||
|
||||
*Example 2*: OpenSearch allows for flexible schema, so documents in an index may have different fields. Use `IS NULL` or `IS NOT NULL` to retrieve only missing fields or existing fields. We do not differentiate between missing fields and fields explicitly set to `NULL`:
|
||||
*Example 2*: OpenSearch allows for flexible schema, so documents in an index may have different fields. Use `IS NULL` or `IS NOT NULL` to retrieve only missing fields or existing fields. OpenSearch does not differentiate between missing fields and fields explicitly set to `NULL`:
|
||||
|
||||
```sql
|
||||
SELECT account_number, employer
|
||||
|
@ -346,7 +348,7 @@ ORDER BY account_number LIMIT 1
|
|||
| :---
|
||||
| 1
|
||||
|
||||
*Example 2*: If you pass in two arguments, the first is mapped to the `from` parameter and the second to the `size` parameter in OpenSearch. You can use this for simple pagination for small indices, as it's inefficient for large indices.
|
||||
*Example 2*: If you pass in two arguments, the first is mapped to the `from` parameter and the second to the `size` parameter in OpenSearch. You can use this for simple pagination for small indexes, as it's inefficient for large indexes.
|
||||
Use `ORDER BY` to ensure the same order between pages:
|
||||
|
||||
```sql
|
||||
|
|
|
@ -4,6 +4,8 @@ title: Complex Queries
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 6
|
||||
redirect_from:
|
||||
- /search-plugins/sql/complex/
|
||||
---
|
||||
|
||||
# Complex queries
|
||||
|
@ -19,10 +21,10 @@ OpenSearch SQL supports inner joins, cross joins, and left outer joins.
|
|||
|
||||
Joins have a number of constraints:
|
||||
|
||||
1. You can only join two indices.
|
||||
1. You must use aliases for indices (e.g. `people p`).
|
||||
1. You can only join two indexes.
|
||||
1. You must use aliases for indexes (for example, `people p`).
|
||||
1. Within an ON clause, you can only use AND conditions.
|
||||
1. In a WHERE statement, don't combine trees that contain multiple indices. For example, the following statement works:
|
||||
1. In a WHERE statement, don't combine trees that contain multiple indexes. For example, the following statement works:
|
||||
|
||||
```
|
||||
WHERE (a.type1 > 3 OR a.type1 < 0) AND (b.type2 > 4 OR b.type2 < -1)
|
||||
|
@ -39,7 +41,7 @@ Joins have a number of constraints:
|
|||
|
||||
### Description
|
||||
|
||||
The `JOIN` clause combines columns from one or more indices using values common to each.
|
||||
The `JOIN` clause combines columns from one or more indexes using values common to each.
|
||||
|
||||
### Syntax
|
||||
|
||||
|
@ -53,7 +55,7 @@ Rule `joinPart`:
|
|||
|
||||
### Example 1: Inner join
|
||||
|
||||
Inner join creates a new result set by combining columns of two indices based on your join predicates. It iterates the two indices and compares each document to find the ones that satisfy the join predicates. You can optionally precede the `JOIN` clause with an `INNER` keyword.
|
||||
Inner join creates a new result set by combining columns of two indexes based on your join predicates. It iterates the two indexes and compares each document to find the ones that satisfy the join predicates. You can optionally precede the `JOIN` clause with an `INNER` keyword.
|
||||
|
||||
The join predicate(s) is specified by the ON clause.
|
||||
|
||||
|
@ -149,10 +151,10 @@ Result set:
|
|||
### Example 2: Cross join
|
||||
|
||||
Cross join, also known as cartesian join, combines each document from the first index with each document from the second.
|
||||
The result set is the the cartesian product of documents of both indices.
|
||||
The result set is the cartesian product of documents of both indexes.
|
||||
This operation is similar to the inner join without the `ON` clause that specifies the join condition.
|
||||
|
||||
It's risky to perform cross join on two indices of large or even medium size. It might trigger a circuit breaker that terminates the query to avoid running out of memory.
|
||||
It's risky to perform cross join on two indexes of large or even medium size. It might trigger a circuit breaker that terminates the query to avoid running out of memory.
|
||||
{: .warning }
|
||||
|
||||
SQL query:
|
||||
|
|
|
@ -4,6 +4,8 @@ title: Delete
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 12
|
||||
redirect_from:
|
||||
- /search-plugins/sql/delete/
|
||||
---
|
||||
|
||||
|
||||
|
|
|
@ -4,6 +4,8 @@ title: Functions
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 7
|
||||
redirect_from:
|
||||
- /search-plugins/sql/functions/
|
||||
---
|
||||
|
||||
# Functions
|
||||
|
|
|
@ -6,7 +6,8 @@ nav_order: 4
|
|||
has_children: true
|
||||
has_toc: false
|
||||
redirect_from:
|
||||
- /search-plugins/sql/sql
|
||||
- /search-plugins/sql/sql/index/
|
||||
|
||||
---
|
||||
|
||||
# SQL
|
||||
|
|
|
@ -4,13 +4,16 @@ title: JDBC Driver
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 71
|
||||
redirect_from:
|
||||
- /search-plugins/sql/jdbc/
|
||||
|
||||
---
|
||||
|
||||
# JDBC driver
|
||||
|
||||
The Java Database Connectivity (JDBC) driver lets you integrate OpenSearch with your favorite business intelligence (BI) applications.
|
||||
|
||||
For information on downloading and using the JAR file, see [the SQL repository on GitHub](https://github.com/opensearch-project/sql-jdbc).
|
||||
For information about downloading and using the JAR file, see [the SQL repository on GitHub](https://github.com/opensearch-project/sql-jdbc).
|
||||
|
||||
## Connecting to Tableau
|
||||
|
||||
|
|
|
@ -4,11 +4,13 @@ title: Metadata Queries
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 9
|
||||
redirect_from:
|
||||
- /search-plugins/sql/metadata/
|
||||
---
|
||||
|
||||
# Metadata queries
|
||||
|
||||
To see basic metadata about your indices, use the `SHOW` and `DESCRIBE` commands.
|
||||
To see basic metadata about your indexes, use the `SHOW` and `DESCRIBE` commands.
|
||||
|
||||
### Syntax
|
||||
|
||||
|
@ -20,10 +22,10 @@ Rule `showFilter`:
|
|||
|
||||
![showFilter]({{site.url}}{{site.baseurl}}/images/showFilter.png)
|
||||
|
||||
### Example 1: See metadata for indices
|
||||
### Example 1: See metadata for indexes
|
||||
|
||||
To see metadata for indices that match a specific pattern, use the `SHOW` command.
|
||||
Use the wildcard `%` to match all indices:
|
||||
To see metadata for indexes that match a specific pattern, use the `SHOW` command.
|
||||
Use the wildcard `%` to match all indexes:
|
||||
|
||||
```sql
|
||||
SHOW TABLES LIKE %
|
||||
|
|
|
@ -4,6 +4,8 @@ title: ODBC Driver
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 72
|
||||
redirect_from:
|
||||
- /search-plugins/sql/odbc/
|
||||
---
|
||||
|
||||
# ODBC driver
|
||||
|
@ -184,7 +186,7 @@ Pre-requisites:
|
|||
2. In the **DSN drop-down**, select the OpenSearch DSN you set up in the previous set of steps. The options you added will be automatically filled in under the **Connection Attributes**.
|
||||
|
||||
3. Select **Sign In**. After a few seconds, Tableau connects to your OpenSearch server. Once connected, you will be directed to the **Datasource** window. The **Database** will already be populated with the name of the OpenSearch cluster.
|
||||
To list all the indices, click the search icon under **Table**.
|
||||
To list all the indexes, click the search icon under **Table**.
|
||||
|
||||
4. Start experimenting with data by dragging the table to the connection area. Choose **Update Now** or **Automatically Update** to populate the table data.
|
||||
|
||||
|
|
|
@ -4,6 +4,8 @@ title: JSON Support
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 8
|
||||
redirect_from:
|
||||
- /search-plugins/sql/partiql/
|
||||
---
|
||||
|
||||
# JSON Support
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Troubleshooting
|
||||
parent: SQL and PPL
|
||||
nav_order: 88
|
||||
redirect_from:
|
||||
- /search-plugins/sql/troubleshoot/
|
||||
---
|
||||
|
||||
# Troubleshooting
|
||||
|
@ -41,7 +43,7 @@ POST _plugins/_sql/_explain
|
|||
|
||||
## Index mapping verification exception
|
||||
|
||||
If you see the following verification exception:
|
||||
If you see the following verification exception, make sure the index in your query isn't an index pattern and doesn't have multiple types:
|
||||
|
||||
```json
|
||||
{
|
||||
|
@ -54,6 +56,4 @@ If you see the following verification exception:
|
|||
}
|
||||
```
|
||||
|
||||
Make sure the index in your query is not an index pattern and is not an index pattern and doesn't have multiple types.
|
||||
|
||||
If these steps don't work, submit a GitHub issue [here](https://github.com/opensearch-project/sql/issues).
|
||||
|
|
|
@ -4,6 +4,9 @@ title: Query Workbench
|
|||
parent: SQL
|
||||
grand_parent: SQL and PPL
|
||||
nav_order: 1
|
||||
redirect_from:
|
||||
- /search-plugins/sql/workbench/
|
||||
|
||||
---
|
||||
|
||||
# Query Workbench
|
||||
|
@ -30,9 +33,9 @@ PUT accounts/_bulk?refresh
|
|||
Then return to SQL Workbench.
|
||||
|
||||
|
||||
### List indices
|
||||
### List indexes
|
||||
|
||||
To list all your indices:
|
||||
To list all your indexes:
|
||||
|
||||
```sql
|
||||
SHOW TABLES LIKE %
|
||||
|
|
|
@ -27,13 +27,13 @@ Security Analytics includes a number of tools and features elemental to its oper
|
|||
|
||||
### Detectors
|
||||
|
||||
Detectors are core components that are configured to identify a range of cybersecurity threats corresponding to an ever-growing knowldege base of adversary tactics and techniques maintained by the [MITRE ATT&CK](https://attack.mitre.org/) organization. Detectors use log data to evaluate events occuring in the system. They then apply a set of security rules specified for the detector and determine findings from these events.
|
||||
Detectors are core components that are configured to identify a range of cybersecurity threats corresponding to an ever-growing knowledge base of adversary tactics and techniques maintained by the [MITRE ATT&CK](https://attack.mitre.org/) organization. Detectors use log data to evaluate events occurring in the system. They then apply a set of security rules specified for the detector and determine findings from these events.
|
||||
|
||||
For information on configuring detectors, see [Creating detectors]({{site.url}}{{site.baseurl}}/security-analytics/sec-analytics-config/detectors-config/).
|
||||
For information about configuring detectors, see [Creating detectors]({{site.url}}{{site.baseurl}}/security-analytics/sec-analytics-config/detectors-config/).
|
||||
|
||||
### Log types
|
||||
|
||||
Log types provide the data used to evaluate events occuring in a system. OpenSearch supports several types of logs and provides out-of-the-box mappings for the most common log sources. Currently supported log sources include:
|
||||
Log types provide the data used to evaluate events occurring in a system. OpenSearch supports several types of logs and provides out-of-the-box mappings for the most common log sources. Currently supported log sources include:
|
||||
* Network events
|
||||
* DNS logs
|
||||
* Apache access logs
|
||||
|
@ -54,7 +54,7 @@ Log types are specified during the creation of detectors, including steps for ma
|
|||
|
||||
Rules, or threat detection rules, define the conditional logic applied to ingested log data that allows the system to identify an event of interest. Security Analytics uses prepackaged, open source [Sigma rules](https://github.com/SigmaHQ/sigma) as a starting point for describing relevant log events. But with their inherently flexible format and easy portability, Sigma rules provide users of Security Analytics with options for importing and customizing the rules. You can take advantage of these options using either Dashboards or the API.
|
||||
|
||||
For information on configuring rules, see [Working with rules]({{site.url}}{{site.baseurl}}/security-analytics/usage/rules/).
|
||||
For information about configuring rules, see [Working with rules]({{site.url}}{{site.baseurl}}/security-analytics/usage/rules/).
|
||||
|
||||
### Findings
|
||||
|
||||
|
@ -66,7 +66,7 @@ To learn more about findings, see [Working with findings]({{site.url}}{{site.bas
|
|||
|
||||
When defining a detector, you can specify certain conditions that will trigger an alert. When an event triggers an alert, the system sends a notification to a preferred channel, such as Amazon Chime, Slack, or email. The alert can be triggered when the detector matches one or multiple rules. Further conditions can be set by rule severity and tags. You can also create a notification message with a customized subject line and message body.
|
||||
|
||||
For information on setting up alerts, see [Step 3. Set up alerts]({{site.url}}{{site.baseurl}}/security-analytics/sec-analytics-config/detectors-config/#step-3-set-up-alerts) in detector creation documentation. For information on managing alerts on the Alerts window, see [Working with alerts]({{site.url}}{{site.baseurl}}/security-analytics/usage/alerts/).
|
||||
For information about setting up alerts, see [Step 3. Set up alerts]({{site.url}}{{site.baseurl}}/security-analytics/sec-analytics-config/detectors-config/#step-3-set-up-alerts) in detector creation documentation. For information about managing alerts on the Alerts window, see [Working with alerts]({{site.url}}{{site.baseurl}}/security-analytics/usage/alerts/).
|
||||
|
||||
### Correlation engine
|
||||
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Audit log field reference
|
||||
parent: Audit logs
|
||||
nav_order: 130
|
||||
redirect_from:
|
||||
- /security/audit-logs/field-reference/
|
||||
---
|
||||
|
||||
# Audit log field reference
|
||||
|
@ -108,7 +110,7 @@ Name | Description
|
|||
`audit_request_effective_user` | The username that failed to authenticate.
|
||||
`audit_request_initiating_user` | The user that initiated the request. Only logged if it differs from the effective user.
|
||||
`audit_transport_request_type` | The type of request (e.g. `IndexRequest`).
|
||||
`audit_request_privilege` | The required privilege of the request (e.g. `indices:data/read/search`).
|
||||
`audit_request_privilege` | The required privilege of the request (for example, `indices:data/read/search`).
|
||||
`audit_request_body` | The HTTP request body, if any (and if request body logging is enabled).
|
||||
`audit_trace_indices` | The index name(s) included in the request. Can contain wildcards, date patterns, and aliases. Only logged if `resolve_indices` is true.
|
||||
`audit_trace_resolved_indices` | The resolved index name(s) affected by the request. Only logged if `resolve_indices` is true.
|
||||
|
@ -124,7 +126,7 @@ Name | Description
|
|||
`audit_transport_headers` | The headers of the request, if any.
|
||||
`audit_request_effective_user` | The username that failed to authenticate.
|
||||
`audit_request_initiating_user` | The user that initiated the request. Only logged if it differs from the effective user.
|
||||
`audit_transport_request_type` | The type of request (e.g. `IndexRequest`).
|
||||
`audit_transport_request_type` | The type of request (for example, `IndexRequest`).
|
||||
`audit_request_privilege` | The required privilege of the request (e.g. `indices:data/read/search`).
|
||||
`audit_request_body` | The HTTP request body, if any (and if request body logging is enabled).
|
||||
`audit_trace_indices` | The index name(s) included in the request. Can contain wildcards, date patterns, and aliases. Only logged if `resolve_indices` is true.
|
||||
|
|
|
@ -5,7 +5,7 @@ nav_order: 125
|
|||
has_children: true
|
||||
has_toc: false
|
||||
redirect_from:
|
||||
- /security/audit-logs/
|
||||
- /security/audit-logs/index/
|
||||
---
|
||||
|
||||
# Audit logs
|
||||
|
@ -98,7 +98,7 @@ plugins.security.audit.log_request_body: false
|
|||
|
||||
## Log index names
|
||||
|
||||
By default, the Security plugin logs all indices affected by a request. Because index names can be aliases and contain wildcards/date patterns, the Security plugin logs the index name that the user submitted *and* the actual index name to which it resolves.
|
||||
By default, the Security plugin logs all indexes affected by a request. Because index names can be aliases and contain wildcards/date patterns, the Security plugin logs the index name that the user submitted *and* the actual index name to which it resolves.
|
||||
|
||||
For example, if you use an alias or a wildcard, the audit event might look like:
|
||||
|
||||
|
@ -168,7 +168,7 @@ By default, the Security plugin stores audit events in a daily rolling index nam
|
|||
plugins.security.audit.config.index: myauditlogindex
|
||||
```
|
||||
|
||||
Use a date pattern in the index name to configure daily, weekly, or monthly rolling indices:
|
||||
Use a date pattern in the index name to configure daily, weekly, or monthly rolling indexes:
|
||||
|
||||
```yml
|
||||
plugins.security.audit.config.index: "'auditlog-'YYYY.MM.dd"
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: Audit log storage types
|
||||
parent: Audit logs
|
||||
nav_order: 135
|
||||
redirect_from:
|
||||
- /security/audit-logs/storage-types/
|
||||
---
|
||||
|
||||
# Audit log storage types
|
||||
|
|
|
@ -3,6 +3,8 @@ layout: default
|
|||
title: OpenID Connect
|
||||
parent: Authentication backends
|
||||
nav_order: 50
|
||||
redirect_from:
|
||||
- /security-plugin/configuration/openid-connect/
|
||||
---
|
||||
|
||||
# OpenID Connect
|
||||
|
|
|
@ -182,7 +182,7 @@ requests:
|
|||
- PUT
|
||||
```
|
||||
|
||||
You can also add custom indices to the allow list. `allowlist.yml` doesn't support wildcards, so you must manually specify all of the indexes you want to add.
|
||||
You can also add custom indexes to the allow list. `allowlist.yml` doesn't support wildcards, so you must manually specify all of the indexes you want to add.
|
||||
|
||||
```yml
|
||||
requests: # Only allow GET requests to /sample-index1/_doc/1 and /sample-index2/_doc/1
|
||||
|
|
|
@ -37,7 +37,7 @@ saml:
|
|||
|
||||
## Check the SAML assertion consumer service URL
|
||||
|
||||
After a successful login, your IdP sends a SAML response using HTTP POST to OpenSearch Dashboards's "assertion consumer service URL" (ACS).
|
||||
After a successful login, your IdP sends a SAML response using HTTP POST to the OpenSearch Dashboards "assertion consumer service URL" (ACS).
|
||||
|
||||
The endpoint the OpenSearch Dashboards Security plugin provides is:
|
||||
|
||||
|
|
|
@ -257,13 +257,13 @@ The `routing.allocation.awareness.balance` setting is false by default. When it
|
|||
`routing.allocation.awareness.balance` takes effect only if `cluster.routing.allocation.awareness.attributes` and `cluster.routing.allocation.awareness.force.zone.values` are set.
|
||||
{: .note}
|
||||
|
||||
`routing.allocation.awareness.balance` applies to all operations that create or update indices. For example, let's say you're running a cluster with three nodes and three zones in a zone-aware setting. If you try to create an index with one replica or update an index's settings to one replica, the attempt will fail with a validation exception because the number of shards must be a multiple of three. Similarly, if you try to create an index template with one shard and no replicas, the attempt will fail for the same reason. However, in all of those operations, if you set the number of shards to one and the number of replicas to two, the total number of shards is three and the attempt will succeed.
|
||||
`routing.allocation.awareness.balance` applies to all operations that create or update indexes. For example, let's say you're running a cluster with three nodes and three zones in a zone-aware setting. If you try to create an index with one replica or update an index's settings to one replica, the attempt will fail with a validation exception because the number of shards must be a multiple of three. Similarly, if you try to create an index template with one shard and no replicas, the attempt will fail for the same reason. However, in all of those operations, if you set the number of shards to one and the number of replicas to two, the total number of shards is three and the attempt will succeed.
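
The shard arithmetic in the preceding paragraph can be sketched as follows. This mirrors only the examples given in the text (one primary shard, three zones) and assumes the check is that the total number of shard copies is a multiple of the zone count; the actual validation performed by OpenSearch may differ, and the function names are hypothetical.

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary shard has `replicas` additional copies."""
    return primaries * (1 + replicas)


def passes_balance_check(primaries: int, replicas: int, zones: int) -> bool:
    """Assumed rule: total shard copies must be a multiple of the zone count."""
    return total_shard_copies(primaries, replicas) % zones == 0


# The three cases from the paragraph above, with three zones:
print(passes_balance_check(primaries=1, replicas=1, zones=3))  # index with one replica
print(passes_balance_check(primaries=1, replicas=0, zones=3))  # template with one shard, no replicas
print(passes_balance_check(primaries=1, replicas=2, zones=3))  # one shard, two replicas
```

Only the last case yields a total of three shard copies, which matches the configuration the text says will succeed.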
|
||||
|
||||
## (Advanced) Step 7: Set up a hot-warm architecture
|
||||
|
||||
You can design a hot-warm architecture where you first index your data to hot nodes---fast and expensive---and after a certain period of time move them to warm nodes---slow and cheap.
|
||||
|
||||
If you analyze time series data that you rarely update and want the older data to go onto cheaper storage, this architecture can be a good fit.
|
||||
If you analyze time-series data that you rarely update and want the older data to go onto cheaper storage, this architecture can be a good fit.
|
||||
|
||||
This architecture helps save money on storage costs. Rather than increasing the number of hot nodes and using fast, expensive storage, you can add warm nodes for data that you don't access as frequently.
|
||||
|
||||
|
|