Merge pull request #50 from opensearch-project/layout-changes

Navigational improvement
Andrew Etter committed 2021-06-10 11:16:03 -07:00 (via GitHub)
commit 94a1da6c8d
194 changed files with 593 additions and 506 deletions


@@ -3,13 +3,16 @@ layout: default
title: OpenSearch CLI
nav_order: 52
has_children: false
+redirect_from:
+  - /docs/odfe-cli/
+  - /docs/cli/
---
# OpenSearch CLI
The OpenSearch command line interface (opensearch-cli) lets you manage your OpenSearch cluster from the command line and automate tasks.
-Currently, opensearch-cli supports the [Anomaly Detection](../ad/) and [k-NN](../knn/) plugins, along with arbitrary REST API paths. Among other things, you can use opensearch-cli to create and delete detectors, start and stop them, and check k-NN statistics.
+Currently, opensearch-cli supports the [Anomaly Detection]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/) and [k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/) plugins, along with arbitrary REST API paths. Among other things, you can use opensearch-cli to create and delete detectors, start and stop them, and check k-NN statistics.
Profiles let you easily access different clusters or sign requests with different credentials. opensearch-cli supports unauthenticated requests, HTTP basic signing, and IAM signing for Amazon Web Services.
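As a sketch of what a profile setup might look like — the file location, key names, and credentials below are illustrative assumptions, not taken from this commit — an opensearch-cli configuration could resemble:

```yaml
# Hypothetical ~/.opensearch-cli/config.yaml -- profile names, endpoints,
# and credentials are placeholders only.
profiles:
  - name: local
    endpoint: https://localhost:9200
    user: admin
    password: admin
  - name: aws
    endpoint: https://search-example.us-east-1.es.amazonaws.com
    aws_iam:
      profile: readonly
      service: es
```

A request could then be sent with something along the lines of `opensearch-cli curl get --path "_cluster/health" --profile local`, assuming the `curl` subcommand available in opensearch-cli.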


@@ -1,18 +1,3 @@
-# Welcome to Jekyll!
-#
-# This config file is meant for settings that affect your whole blog, values
-# which you are expected to set up once and rarely edit after that. If you find
-# yourself editing this file very often, consider using Jekyll's data files
-# feature for the data you need to update frequently.
-#
-# For technical reasons, this file is *NOT* reloaded automatically when you use
-# 'bundle exec jekyll serve'. If you change this file, please restart the server process.
-# Site settings
-# These are used to personalize your new site. If you look in the HTML files,
-# you will see them accessed via {{ site.title }}, {{ site.email }}, and so on.
-# You can create any custom variable you would like, and they will be accessible
-# in the templates via {{ site.myvariable }}.
title: OpenSearch documentation
description: >- # this means to ignore newlines until "baseurl:"
  Documentation for OpenSearch, the Apache 2.0 search, analytics, and visualization suite with advanced security, alerting, SQL support, automated index management, deep performance analysis, and more.
@@ -39,6 +24,70 @@ aux_links:
- "https://opensearch.org"
color_scheme: opensearch
+# Define Jekyll collections
+collections:
+  # Each collection's documents reside in a directory named after the collection, prefixed with an underscore (for example, _opensearch)
+  opensearch:
+    permalink: "/:collection/:path"
+    output: true
+  dashboards:
+    permalink: "/:collection/:path"
+    output: true
+  security-plugin:
+    permalink: "/:collection/:path"
+    output: true
+  search-plugins:
+    permalink: "/:collection/:path"
+    output: true
+  im-plugin:
+    permalink: "/:collection/:path"
+    output: true
+  monitoring-plugins:
+    permalink: "/:collection/:path"
+    output: true
+  clients:
+    permalink: "/:collection/:path"
+    output: true
+  troubleshoot:
+    permalink: "/:collection/:path"
+    output: true
+  external_links:
+    permalink: "/:collection/:path"
+    output: true
+just_the_docs:
+  # Define the collections used in the theme
+  collections:
+    opensearch:
+      name: OpenSearch
+      # nav_exclude: true
+      nav_fold: true
+      # search_exclude: true
+    dashboards:
+      name: OpenSearch Dashboards
+      nav_fold: true
+    security-plugin:
+      name: Security plugin
+      nav_fold: true
+    search-plugins:
+      name: Search plugins
+      nav_fold: true
+    im-plugin:
+      name: Index management plugin
+      nav_fold: true
+    monitoring-plugins:
+      name: Monitoring plugins
+      nav_fold: true
+    clients:
+      name: Clients and tools
+      nav_fold: true
+    troubleshoot:
+      name: Troubleshooting
+      nav_fold: true
+    external_links:
+      name: External links
# Enable or disable the site search
# Supports true (default) or false
search_enabled: true


@@ -1,7 +1,6 @@
---
layout: default
title: Gantt charts
-parent: OpenSearch Dashboards
nav_order: 10
---
@@ -21,6 +20,6 @@ To create a Gantt chart, perform the following steps:
1. Choose **Panel settings** to adjust axis labels, time format, and colors.
1. Choose **Update**.
-![Gantt Chart](../../images/gantt-chart.png)
+![Gantt Chart]({{site.url}}{{site.baseurl}}/images/gantt-chart.png)
This Gantt chart displays the ID of each log on the y-axis. Each bar is a unique event that spans some amount of time. Hover over a bar to see the duration of that event.


@@ -1,9 +1,12 @@
---
layout: default
-title: OpenSearch Dashboards
-nav_order: 11
-has_children: true
+title: About Dashboards
+nav_order: 1
+has_children: false
has_toc: false
+redirect_from:
+  - /docs/opensearch-dashboards/
+  - /opensearch-dashboards/
---
# OpenSearch Dashboards


@@ -2,7 +2,6 @@
layout: default
title: Docker
parent: Install OpenSearch Dashboards
-grand_parent: OpenSearch Dashboards
nav_order: 1
---
@@ -12,13 +11,13 @@ You *can* start OpenSearch Dashboards using `docker run` after [creating a Docke
1. Run `docker pull opensearchproject/opensearch-dashboards:{{site.opensearch_version}}`.
-1. Create a [`docker-compose.yml`](https://docs.docker.com/compose/compose-file/) file appropriate for your environment. A sample file that includes OpenSearch Dashboards is available on the OpenSearch [Docker installation page](../opensearch/docker/#sample-docker-compose-file).
+1. Create a [`docker-compose.yml`](https://docs.docker.com/compose/compose-file/) file appropriate for your environment. A sample file that includes OpenSearch Dashboards is available on the OpenSearch [Docker installation page]({{site.url}}{{site.baseurl}}/opensearch/install/docker/#sample-docker-compose-file).
Just like `opensearch.yml`, you can pass a custom `opensearch_dashboards.yml` to the container in the Docker Compose file.
{: .tip }
1. Run `docker-compose up`.
-Wait for the containers to start. Then see the [OpenSearch Dashboards documentation](../../../opensearch-dashboards/).
+Wait for the containers to start. Then see the [OpenSearch Dashboards documentation]({{site.url}}{{site.baseurl}}/).
1. When finished, run `docker-compose down`.
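As a rough sketch of the kind of `docker-compose.yml` the steps above refer to — the service name, image tag, and `OPENSEARCH_HOSTS` value are assumptions for illustration, not taken from this commit; use the linked sample file for a complete cluster definition:

```yaml
# Hypothetical docker-compose.yml fragment for OpenSearch Dashboards only.
version: '3'
services:
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:1.0.0  # placeholder tag
    ports:
      - 5601:5601
    environment:
      # Placeholder node address -- point this at your OpenSearch container(s)
      OPENSEARCH_HOSTS: '["https://opensearch-node1:9200"]'
```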


@@ -2,7 +2,6 @@
layout: default
title: Install OpenSearch Dashboards
nav_order: 1
-parent: OpenSearch Dashboards
has_children: true
---


@@ -2,7 +2,6 @@
layout: default
title: OpenSearch Dashboards plugins
parent: Install OpenSearch Dashboards
-grand_parent: OpenSearch Dashboards
nav_order: 50
---
@@ -66,8 +65,8 @@ traceAnalyticsDashboards 1.0.0.0-beta1
## Prerequisites
- A compatible OpenSearch cluster
-- The corresponding OpenSearch plugins [installed on that cluster](../../install/plugins)
+- The corresponding OpenSearch plugins [installed on that cluster]({{site.url}}{{site.baseurl}}/opensearch/install/plugins/)
-- The corresponding version of [OpenSearch Dashboards](../) (e.g. OpenSearch Dashboards 1.0.0 works with OpenSearch 1.0.0)
+- The corresponding version of [OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/) (e.g. OpenSearch Dashboards 1.0.0 works with OpenSearch 1.0.0)
## Install
@@ -180,6 +179,8 @@ To remove a plugin:
sudo bin/opensearch-dashboards-plugin remove <plugin-name>
```
+Then remove all associated entries from `opensearch_dashboards.yml`.
For certain plugins, you must also remove the "optimize" bundle. This is a sample command for the Anomaly Detection plugin:
```bash


@@ -2,7 +2,6 @@
layout: default
title: Tarball
parent: Install OpenSearch Dashboards
-grand_parent: OpenSearch Dashboards
nav_order: 30
---
@@ -28,4 +27,4 @@ nav_order: 30
./bin/opensearch-dashboards
```
-1. See the [OpenSearch Dashboards documentation](../../opensearch-dashboards/).
+1. See the [OpenSearch Dashboards documentation]({{site.url}}{{site.baseurl}}/opensearch-dashboards/).


@@ -1,7 +1,6 @@
---
layout: default
title: WMS map server
-parent: OpenSearch Dashboards
nav_order: 5
---


@@ -1,7 +1,6 @@
---
layout: default
title: Notebooks (experimental)
-parent: OpenSearch Dashboards
nav_order: 50
redirect_from: /docs/notebooks/
has_children: false
@@ -9,7 +8,7 @@ has_children: false
# OpenSearch Dashboards notebooks (experimental)
-Notebooks have a known issue with [tenants](../../security/access-control/multi-tenancy/). If you open a notebook and can't see its visualizations, you might be under the wrong tenant, or you might not have access to the tenant at all.
+Notebooks have a known issue with [tenants]({{site.url}}{{site.baseurl}}/security-plugin/access-control/multi-tenancy/). If you open a notebook and can't see its visualizations, you might be under the wrong tenant, or you might not have access to the tenant at all.
{: .warning }
An OpenSearch Dashboards notebook lets you easily combine live visualizations and narrative text in a single interface.
@@ -46,7 +45,7 @@ Paragraphs combine text and visualizations for describing data.
1. To add text, choose **Add markdown paragraph**.
1. Add rich text with markdown syntax.
-![Markdown paragraph](../../images/markdown-notebook.png)
+![Markdown paragraph]({{site.url}}{{site.baseurl}}/images/markdown-notebook.png)
#### Add a visualization paragraph


@@ -1,14 +1,13 @@
---
layout: default
title: Reporting
-parent: OpenSearch Dashboards
nav_order: 20
---
# Reporting
-You can use OpenSearch Dashboards to create PNG, PDF, and CSV reports. To create reports, you must have the correct permissions. For a summary of the predefined roles and the permissions they grant, see the [security plugin](../../security/access-control/users-roles/#predefined-roles).
+You can use OpenSearch Dashboards to create PNG, PDF, and CSV reports. To create reports, you must have the correct permissions. For a summary of the predefined roles and the permissions they grant, see the [security plugin]({{site.url}}{{site.baseurl}}/security-plugin/access-control/users-roles/#predefined-roles).
## Create reports from Discover, Visualize, or Dashboard
@@ -36,7 +35,7 @@ Definitions let you generate reports on a periodic schedule.
1. (Optional) Add a header or footer to the report. Headers and footers are only available for dashboard or visualization reports.
1. Under **Report trigger**, choose either **On-demand** or **Schedule**.
-For scheduled reports, select either **Recurring** or **Cron based**. You can receive reports daily or at some other time interval. Cron expressions give you even more flexiblity. See [Cron expression reference](../../alerting/cron/) for more information.
+For scheduled reports, select either **Recurring** or **Cron based**. You can receive reports daily or at some other time interval. Cron expressions give you even more flexibility. See [Cron expression reference]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/) for more information.
1. Choose **Create**.
@@ -46,7 +45,7 @@ Definitions let you generate reports on a periodic schedule.
While creating a report for dashboards or visualizations, you might see the following error:
-![OpenSearch Dashboards reporting pop-up error message](../../images/reporting-error.png)
+![OpenSearch Dashboards reporting pop-up error message]({{site.url}}{{site.baseurl}}/images/reporting-error.png)
This problem can occur for two reasons:


@@ -0,0 +1,7 @@
+---
+layout: default
+title: Javadoc
+nav_order: 1
+permalink: /javadoc/
+redirect_to: https://opensearch.org/docs/javadocs/
+---


@@ -2,13 +2,12 @@
layout: default
title: Index rollups
nav_order: 35
-parent: Index management
has_children: true
redirect_from: /docs/ism/index-rollups/
has_toc: false
---
-# Index Rollups
+# Index rollups
Time series data increases storage costs, strains cluster health, and slows down aggregations over time. Index rollup lets you periodically reduce data granularity by rolling up old data into summarized indices.
@@ -52,7 +51,7 @@ The order in which you select attributes is critical. A city followed by a demog
Specify a schedule to roll up your indices as data is being ingested. The index rollup job is enabled by default.
1. Specify if the data is continuous or not.
-3. For roll up execution frequency, select **Define by fixed interval** and specify the **Rollup interval** and the time unit or **Define by cron expression** and add in a cron expression to select the interval. To learn how to define a cron expression, see [Alerting](../alerting/cron/).
+3. For rollup execution frequency, select **Define by fixed interval** and specify the **Rollup interval** and the time unit, or select **Define by cron expression** and add a cron expression to set the interval. To learn how to define a cron expression, see [Alerting]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/).
4. Specify the number of pages per execution process. A larger number means faster execution but a higher memory cost.
5. (Optional) Add a delay to the rollup executions. This is the amount of time the job waits for data ingestion to accommodate any processing time. For example, if you set this value to 10 minutes, an index rollup that executes at 2 PM to roll up 1 PM to 2 PM of data starts at 2:10 PM.
6. Choose **Next**.
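As an illustrative example of the cron option in the steps above (not taken from this commit), a standard five-field expression that runs a rollup daily at 2 PM — matching the delay example — would look like this:

```
# Hypothetical cron schedule: run daily at 2:00 PM
# fields: minute hour day-of-month month day-of-week
0 14 * * *
```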


@@ -2,12 +2,11 @@
layout: default
title: Index rollups API
parent: Index rollups
-grand_parent: Index management
redirect_from: /docs/ism/rollup-api/
nav_order: 9
---
-# Index Rollups API
+# Index rollups API
Use the index rollup operations to programmatically work with index rollup jobs.


@@ -1,9 +1,9 @@
---
layout: default
title: Index transforms
-nav_order: 40
+nav_order: 20
-parent: Index management
has_children: true
+redirect_from: /docs/im/index-transforms/
has_toc: false
---
@@ -29,7 +29,7 @@ If you don't have any data in your cluster, you can use the sample flight data w
### Step 1: Choose indices
1. In the **Job name and description** section, specify a name and an optional description for your job.
-2. In the **Indices** section, select the source and target index. You can either select an existing target index or create a new one by entering a name for your new index. If you want to transform just a subset of your source index, choose **Add Data Filter**, and use the OpenSearch query DSL to specify a subset of your source index. For more information about the OpenSearch query DSL, see [query DSL](../../opensearch/query-dsl/).
+2. In the **Indices** section, select the source and target index. You can either select an existing target index or create a new one by entering a name for your new index. If you want to transform just a subset of your source index, choose **Add Data Filter**, and use the OpenSearch query DSL to specify a subset of your source index. For more information about the OpenSearch query DSL, see [query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/).
3. Choose **Next**.
### Step 2: Select fields to transform
@@ -42,7 +42,7 @@ On the other hand, aggregations let you perform simple calculations. For example
1. In the data table, select the fields you want to transform and expand the drop-down menu within the column header to choose the grouping or aggregation you want to use.
-Currently, transform jobs support histogram, date_histogram, and terms groupings. For more information about groupings, see [Bucket Aggregations](../../opensearch/bucket-agg/). In terms of aggregations, you can select from `sum`, `avg`, `max`, `min`, `value_count`, `percentiles`, and `scripted_metric`. For more information about aggregations, see [Metric Aggregations](../../opensearch/metric-agg/).
+Currently, transform jobs support histogram, date_histogram, and terms groupings. For more information about groupings, see [Bucket Aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg/). In terms of aggregations, you can select from `sum`, `avg`, `max`, `min`, `value_count`, `percentiles`, and `scripted_metric`. For more information about aggregations, see [Metric Aggregations]({{site.url}}{{site.baseurl}}/opensearch/metric-agg/).
2. Repeat step 1 for any other fields that you want to transform.
3. After selecting the fields that you want to transform and verifying the transformation, choose **Next**.


@@ -3,7 +3,6 @@ layout: default
title: Transforms APIs
nav_order: 45
parent: Index transforms
-grand_parent: Index management
has_toc: true
---
@@ -133,11 +132,11 @@ description | String | Describes the transform job. | No
metadata_id | String | Any metadata to be associated with the transform job. | No
source_index | String | The source index whose data to transform. | Yes
target_index | String | The target index the newly transformed data is added into. You can create a new index or update an existing one. | Yes
-data_selection_query | JSON | The query DSL to use to filter a subset of the source index for the transform job. See [query DSL](../../../opensearch/query-dsl) for more information. | Yes
+data_selection_query | JSON | The query DSL to use to filter a subset of the source index for the transform job. See [query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl) for more information. | Yes
page_size | Integer | The number of fields to transform at a time. Higher number means higher performance but requires more memory and can cause higher latency. (Default: 1) | Yes
-groups | Array | Specifies the grouping(s) to use in the transform job. Supported groups are `terms`, `histogram`, and `date_histogram`. For more information, see [Bucket Aggregations](../../../opensearch/bucket-agg). | Yes if not using aggregations
+groups | Array | Specifies the grouping(s) to use in the transform job. Supported groups are `terms`, `histogram`, and `date_histogram`. For more information, see [Bucket Aggregations]({{site.url}}{{site.baseurl}}/opensearch/bucket-agg). | Yes if not using aggregations
source_field | String | The field(s) to transform | Yes
-aggregations | JSON | The aggregations to use in the transform job. Supported aggregations are: `sum`, `max`, `min`, `value_count`, `avg`, `scripted_metric`, and `percentiles`. For more information, see [Metric Aggregations](../../../opensearch/metric-agg). | Yes if not using groups
+aggregations | JSON | The aggregations to use in the transform job. Supported aggregations are: `sum`, `max`, `min`, `value_count`, `avg`, `scripted_metric`, and `percentiles`. For more information, see [Metric Aggregations]({{site.url}}{{site.baseurl}}/opensearch/metric-agg). | Yes if not using groups
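Putting the parameters in the table above together, a request body for creating a transform job might look roughly like this — the index names, field names, and interval are illustrative placeholders, not values from this commit:

```json
{
  "transform": {
    "enabled": true,
    "source_index": "sample-flight-data",
    "target_index": "flight-data-by-destination",
    "data_selection_query": { "match_all": {} },
    "page_size": 1,
    "groups": [
      { "terms": { "source_field": "DestCountry", "target_field": "DestCountry" } }
    ],
    "aggregations": {
      "avg_ticket_price": { "avg": { "field": "AvgTicketPrice" } }
    }
  }
}
```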
## Update a transform job


@@ -1,11 +1,11 @@
---
layout: default
-title: Index management
-nav_order: 30
-has_children: true
+title: About Index Management
+nav_order: 1
+has_children: false
---
-# Index Management
+# About Index Management
OpenSearch Dashboards
{: .label .label-yellow }


@@ -2,7 +2,6 @@
layout: default
title: ISM API
parent: Index State Management
-grand_parent: Index management
redirect_from: /docs/ism/api/
nav_order: 5
---
@@ -457,7 +456,7 @@ GET _plugins/_ism/explain/index_1
}
```
The `plugins.index_state_management.policy_id` setting is deprecated starting from ODFE version 1.13.0. We retain this field in the response API for consistency.
---


@@ -2,9 +2,10 @@
layout: default
title: Index State Management
nav_order: 3
-parent: Index management
has_children: true
-redirect_from: /docs/ism/
+redirect_from:
+  - /docs/im/ism/
+  - /docs/ism/
has_toc: false
---
@@ -20,7 +21,7 @@ For example, you can define a policy that moves your index into a `read_only` st
You might want to perform an index rollover after a certain amount of time or run a `force_merge` operation on an index during off-peak hours to improve search performance during peak hours.
-To use the ISM plugin, your user role needs to be mapped to the `all_access` role that gives you full access to the cluster. To learn more, see [Users and roles](../security/access-control/users-roles/).
+To use the ISM plugin, your user role needs to be mapped to the `all_access` role that gives you full access to the cluster. To learn more, see [Users and roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/users-roles/).
{: .note }
## Get started with ISM
@@ -29,7 +30,7 @@ To get started, choose **Index Management** in OpenSearch Dashboards.
### Step 1: Set up policies
-A policy is a set of rules that describes how an index should be managed. For information about creating a policy, see [Policies](policies/).
+A policy is a set of rules that describes how an index should be managed. For information about creating a policy, see [Policies]({{site.url}}{{site.baseurl}}/im-plugin/ism/policies/).
1. Choose the **Index Policies** tab.
2. Choose **Create policy**.
@@ -55,7 +56,7 @@ PUT _plugins/_ism/policies/policy_id
}
```
-For an example ISM template policy, see [Sample policy with ISM template](policies/#sample-policy-with-ism-template).
+For an example ISM template policy, see [Sample policy with ISM template]({{site.url}}{{site.baseurl}}/im-plugin/ism/policies#sample-policy-with-ism-template).
Older versions of the plugin include the `policy_id` in an index template, so when an index is created that matches the index template pattern, the index will have the policy attached to it:
@@ -84,20 +85,20 @@ The `opendistro.index_state_management.policy_id` setting is deprecated. You can
4. From the **Policy ID** menu, choose the policy that you created.
You can see a preview of your policy.
5. If your policy includes a rollover operation, specify a rollover alias.
-Make sure that the alias that you enter already exists. For more information about the rollover operation, see [rollover](policies/#rollover).
+Make sure that the alias that you enter already exists. For more information about the rollover operation, see [rollover]({{site.url}}{{site.baseurl}}/im-plugin/ism/policies#rollover).
6. Choose **Apply**.
-After you attach a policy to an index, ISM creates a job that runs every 5 minutes by default to perform policy actions, check conditions, and transition the index into different states. To change the default time interval for this job, see [Settings](settings/).
+After you attach a policy to an index, ISM creates a job that runs every 5 minutes by default to perform policy actions, check conditions, and transition the index into different states. To change the default time interval for this job, see [Settings]({{site.url}}{{site.baseurl}}/im-plugin/ism/settings/).
-If you want to use an OpenSearch operation to create an index with a policy already attached to it, see [create index](api/#create-index).
+If you want to use an OpenSearch operation to create an index with a policy already attached to it, see [create index]({{site.url}}{{site.baseurl}}/im-plugin/ism/api#create-index).
### Step 3: Manage indices
1. Choose **Managed Indices**.
-2. To change your policy, see [Change Policy](managedindices/#change-policy).
+2. To change your policy, see [Change Policy]({{site.url}}{{site.baseurl}}/im-plugin/ism/managedindices#change-policy).
3. To attach a rollover alias to your index, select your policy and choose **Add rollover alias**.
-Make sure that the alias that you enter already exists. For more information about the rollover operation, see [rollover](policies/#rollover).
+Make sure that the alias that you enter already exists. For more information about the rollover operation, see [rollover]({{site.url}}{{site.baseurl}}/im-plugin/ism/policies#rollover).
4. To remove a policy, choose your policy, and then choose **Remove policy**.
5. To retry a policy, choose your policy, and then choose **Retry policy**.
-For information about managing your policies, see [Managed Indices](managedindices/).
+For information about managing your policies, see [Managed Indices]({{site.url}}{{site.baseurl}}/im-plugin/ism/managedindices/).
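To make the "policy as a set of rules" idea above concrete, here is a minimal sketch of what a policy body might contain — the state names, the 30-day condition, and the empty action list are illustrative assumptions, not content from this commit:

```json
{
  "policy": {
    "description": "Illustrative policy: hold new indices, then delete after 30 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [{ "delete": {} }],
        "transitions": []
      }
    ]
  }
}
```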


@@ -3,7 +3,6 @@ layout: default
title: Managed Indices
nav_order: 3
parent: Index State Management
-grand_parent: Index management
redirect_from: /docs/ism/managedindices/
has_children: false
---


@ -3,7 +3,6 @@ layout: default
title: Policies title: Policies
nav_order: 1 nav_order: 1
parent: Index State Management parent: Index State Management
grand_parent: Index management
redirect_from: /docs/ism/policies/ redirect_from: /docs/ism/policies/
has_children: false has_children: false
--- ---
@ -89,7 +88,7 @@ The following example action has a timeout period of one hour. The policy retrie
} }
``` ```
For a list of available unit types, see [Supported units](../../../opensearch/units/). For a list of available unit types, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/).
## ISM supported operations ## ISM supported operations
@ -160,7 +159,7 @@ Parameter | Description | Type | Required
} }
``` ```
For information about setting replicas, see [Primary and replica shards](../../../opensearch/#primary-and-replica-shards). For information about setting replicas, see [Primary and replica shards]({{site.url}}{{site.baseurl}}/opensearch/#primary-and-replica-shards).
 ### close
@@ -309,7 +308,7 @@ Parameter | Description | Type
 ### snapshot
-Back up your cluster's indices and state. For more information about snapshots, see [Take and restore snapshots](../../../opensearch/snapshot-restore/).
+Back up your cluster's indices and state. For more information about snapshots, see [Take and restore snapshots]({{site.url}}{{site.baseurl}}/opensearch/snapshot-restore/).
 The `snapshot` operation has the following parameters:
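The parameter table itself falls outside this hunk, but as a hedged sketch, a `snapshot` action typically names a registered repository and the snapshot to create (both identifiers below are hypothetical):

```json
{
  "snapshot": {
    "repository": "my_backup_repository",
    "snapshot": "my_index_snapshot"
  }
}
```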
@@ -436,7 +435,7 @@ Note that this condition does not execute at exactly 5:00 PM; the job still exec
 A window of an hour, which this example uses, is generally sufficient, but you might increase it to 2--3 hours to avoid missing the window and having to wait a week for the transition to occur. Alternatively, you could use a broader expression such as `* * * * SAT,SUN` to have the transition occur at any time during the weekend.
-For information on writing cron expressions, see [Cron expression reference](../../../alerting/cron/).
+For information on writing cron expressions, see [Cron expression reference]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/).
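For instance, the one-hour Saturday window discussed above could be expressed as a transition condition along these lines (a sketch only; the exact nesting of the `cron` condition should be checked against the ISM policy schema, and the timezone is an assumed IANA name):

```json
{
  "transitions": [
    {
      "state_name": "cold",
      "conditions": {
        "cron": {
          "cron": {
            "expression": "* 17 * * SAT",
            "timezone": "America/Los_Angeles"
          }
        }
      }
    }
  ]
}
```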
 ---
@@ -663,4 +662,4 @@ After 30 days, the policy moves this index into a `delete` state. The service se
 This diagram shows the `states`, `transitions`, and `actions` of the above policy as a finite-state machine. For more information about finite-state machines, see [Wikipedia](https://en.wikipedia.org/wiki/Finite-state_machine).
-![Policy State Machine](../../images/ism.png)
+![Policy State Machine]({{site.baseurl}}/images/ism.png)


@@ -2,16 +2,15 @@
 layout: default
 title: Settings
 parent: Index State Management
-grand_parent: Index management
 redirect_from: /docs/ism/settings/
 nav_order: 4
 ---
-# ISM Settings
+# ISM settings
 We don't recommend changing these settings; the defaults should work well for most use cases.
-Index State Management (ISM) stores its configuration in the `.opendistro-ism-config` index. Don't modify this index without using the [ISM API operations](../api/).
+Index State Management (ISM) stores its configuration in the `.opendistro-ism-config` index. Don't modify this index without using the [ISM API operations]({{site.url}}{{site.baseurl}}/im-plugin/ism/api/).
 All settings are available using the OpenSearch `_cluster/settings` operation. None require a restart, and all can be marked `persistent` or `transient`.
@@ -32,7 +31,7 @@ Setting | Default | Description
 ## Audit history indices
-If you don't want to disable ISM audit history or shorten the retention period, you can create an [index template](../../../opensearch/index-templates/) to reduce the shard count of the history indices:
+If you don't want to disable ISM audit history or shorten the retention period, you can create an [index template]({{site.url}}{{site.baseurl}}/opensearch/index-templates/) to reduce the shard count of the history indices:
 ```json
 PUT _index_template/ism_history_indices


@@ -1,8 +1,7 @@
 ---
 layout: default
-title: Refresh Search Analyzer
-nav_order: 50
-parent: Index management
+title: Refresh search analyzer
+nav_order: 40
 has_children: false
 redirect_from: /docs/ism/refresh-analyzer/
 has_toc: false


@@ -60,7 +60,7 @@
 {%- for node in pages_list -%}
 {%- if node.parent == nil -%}
 {%- unless node.nav_exclude -%}
-<li class="nav-list-item{% if page.url == node.url or page.parent == node.title or page.grand_parent == node.title %} active{% endif %}">
+<li class="nav-list-item{% if page.collection == include.key and page.url == node.url or page.parent == node.title or page.grand_parent == node.title %} active{% endif %}">
 {%- if node.has_children -%}
 <a href="#" class="nav-list-expander"><svg viewBox="0 0 24 24"><use xlink:href="#svg-arrow-right"></use></svg></a>
 {%- endif -%}
@@ -90,14 +90,36 @@
 </li>
 {%- endunless -%}
 {%- endfor -%}
 </ul>
 {%- endif -%}
 </li>
 {%- endunless -%}
 {%- endif -%}
 {%- endfor -%}
-<li class="nav-list-item">
-<a href="https://opensearch.org/docs/javadocs/" target="_blank" class="nav-list-link">Javadoc <svg class="external-arrow" width="16" height="16" fill="#002A3A"><use xlink:href="#external-arrow"></use></svg></a>
-</li>
 </ul>
+{%- if page.collection == include.key -%}
+{%- for node in pages_list -%}
+{%- if node.parent == nil -%}
+{%- if page.parent == node.title or page.grand_parent == node.title -%}
+{%- assign first_level_url = node.url | absolute_url -%}
+{%- endif -%}
+{%- if node.has_children -%}
+{%- assign children_list = pages_list | where: "parent", node.title -%}
+{%- for child in children_list -%}
+{%- if child.has_children -%}
+{%- if page.url == child.url or page.parent == child.title and page.grand_parent == child.parent -%}
+{%- assign second_level_url = child.url | absolute_url -%}
+{%- endif -%}
+{%- endif -%}
+{%- endfor -%}
+{%- endif -%}
+{%- endif -%}
+{%- endfor -%}
+{% if page.has_children == true and page.has_toc != false %}
+{%- assign toc_list = pages_list | where: "parent", page.title | where: "grand_parent", page.parent -%}
+{%- endif -%}
+{%- endif -%}


@@ -38,13 +38,6 @@ layout: table_wrappers
 <path d="M13 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V9z"></path><polyline points="13 2 13 9 20 9"></polyline>
 </svg>
 </symbol>
-<symbol id="external-arrow" viewBox="0 0 16 16">
-<title>External</title>
-<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" fill="currentColor" viewBox="0 0 16 16">
-<path fill-rule="evenodd" d="M8.636 3.5a.5.5 0 0 0-.5-.5H1.5A1.5 1.5 0 0 0 0 4.5v10A1.5 1.5 0 0 0 1.5 16h10a1.5 1.5 0 0 0 1.5-1.5V7.864a.5.5 0 0 0-1 0V14.5a.5.5 0 0 1-.5.5h-10a.5.5 0 0 1-.5-.5v-10a.5.5 0 0 1 .5-.5h6.636a.5.5 0 0 0 .5-.5z"/>
-<path fill-rule="evenodd" d="M16 .5a.5.5 0 0 0-.5-.5h-5a.5.5 0 0 0 0 1h3.793L6.146 9.146a.5.5 0 1 0 .708.708L15 1.707V5.5a.5.5 0 0 0 1 0v-5z"/>
-</svg>
-</symbol>
 </svg>
 <div class="side-bar">
@@ -55,6 +48,14 @@ layout: table_wrappers
 </a>
 </div>
 <nav role="navigation" aria-label="Main" id="site-nav" class="site-nav">
+{% assign pages_top_size = site.html_pages
+| where_exp:"item", "item.title != nil"
+| where_exp:"item", "item.parent == nil"
+| where_exp:"item", "item.nav_exclude != true"
+| size %}
+{% if pages_top_size > 0 %}
+{% include nav.html pages=site.html_pages key=nil %}
+{% endif %}
 {% if site.just_the_docs.collections %}
 {% assign collections_size = site.just_the_docs.collections | size %}
 {% for collection_entry in site.just_the_docs.collections %}
@@ -62,14 +63,26 @@ layout: table_wrappers
 {% assign collection_value = collection_entry[1] %}
 {% assign collection = site[collection_key] %}
 {% if collection_value.nav_exclude != true %}
-{% if collections_size > 1 %}
-<div class="nav-category">{{ collection_value.name }}</div>
+{% if collections_size > 1 or pages_top_size > 0 %}
+{% if collection_value.nav_fold == true %}
+<ul class="nav-list nav-category-list">
+<li class="nav-list-item{% if page.collection == collection_key %} active{% endif %}">
+{%- if collection.size > 0 -%}
+<a href="#" class="nav-list-expander"><svg viewBox="0 0 24 24"><use xlink:href="#svg-arrow-right"></use></svg></a>
+{%- endif -%}
+<div class="nav-category">{{ collection_value.name }}</div>
+{% include nav.html pages=collection key=collection_key %}
+</li>
+</ul>
+{% else %}
+<div class="nav-category">{{ collection_value.name }}</div>
+{% include nav.html pages=collection key=collection_key %}
+{% endif %}
+{% else %}
+{% include nav.html pages=collection key=collection_key %}
 {% endif %}
-{% include nav.html pages=collection %}
 {% endif %}
 {% endfor %}
-{% else %}
-{% include nav.html pages=site.html_pages %}
 {% endif %}
 </nav>
 <footer class="site-footer">
@@ -110,21 +123,6 @@ layout: table_wrappers
 <div id="main-content-wrap" class="main-content-wrap">
 {% unless page.url == "/" %}
 {% if page.parent %}
-{%- for node in pages_list -%}
-{%- if node.parent == nil -%}
-{%- if page.parent == node.title or page.grand_parent == node.title -%}
-{%- assign first_level_url = node.url | absolute_url -%}
-{%- endif -%}
-{%- if node.has_children -%}
-{%- assign children_list = pages_list | where: "parent", node.title -%}
-{%- for child in children_list -%}
-{%- if page.url == child.url or page.parent == child.title -%}
-{%- assign second_level_url = child.url | absolute_url -%}
-{%- endif -%}
-{%- endfor -%}
-{%- endif -%}
-{%- endif -%}
-{%- endfor -%}
 <nav aria-label="Breadcrumb" class="breadcrumb-nav">
 <ol class="breadcrumb-nav-list">
 {% if page.grand_parent %}
@@ -150,8 +148,7 @@ layout: table_wrappers
 <hr>
 <h2 class="text-delta">Table of contents</h2>
 <ul>
-{%- assign children_list = pages_list | where: "parent", page.title | where: "grand_parent", page.parent -%}
-{% for child in children_list %}
+{% for child in toc_list %}
 <li>
 <a href="{{ child.url | absolute_url }}">{{ child.title }}</a>{% if child.summary %} - {{ child.summary }}{% endif %}
 </li>


@@ -13,7 +13,7 @@ It can be challenging to discover anomalies using conventional methods such as c
 Anomaly detection automatically detects anomalies in your OpenSearch data in near real-time using the Random Cut Forest (RCF) algorithm. RCF is an unsupervised machine learning algorithm that models a sketch of your incoming data stream to compute an `anomaly grade` and `confidence score` value for each incoming data point. These values are used to differentiate an anomaly from normal variations. For more information about how RCF works, see [Random Cut Forests](https://pdfs.semanticscholar.org/8bba/52e9797f2e2cc9a823dbd12514d02f29c8b9.pdf?_ga=2.56302955.1913766445.1574109076-1059151610.1574109076).
-You can pair the anomaly detection plugin with the [alerting plugin](../alerting/) to notify you as soon as an anomaly is detected.
+You can pair the anomaly detection plugin with the [alerting plugin]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/) to notify you as soon as an anomaly is detected.
 To use the anomaly detection plugin, your computer needs to have more than one CPU core.
 {: .note }
@@ -100,11 +100,11 @@ Examine the sample preview and use it to fine-tune your feature configurations (
 Choose the **Anomaly results** tab. You need to wait for some time to see the anomaly results. If the detector interval is 10 minutes, the detector might take more than an hour to start, as it's waiting for sufficient data to generate anomalies.
 A shorter interval means the model passes the shingle process more quickly and starts to generate the anomaly results sooner.
-Use the [profile detector](./api#profile-detector) operation to make sure you have sufficient data points.
+Use the [profile detector]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/api#profile-detector) operation to make sure you have sufficient data points.
 If you see the detector pending in "initialization" for longer than a day, aggregate your existing data using the detector interval to check for any missing data points. If you find a lot of missing data points from the aggregated data, consider increasing the detector interval.
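One way to run that check is a date histogram bucketed by the detector interval; buckets with a zero document count are your missing data points. A sketch, assuming a 10-minute interval and a hypothetical index and `timestamp` field:

```json
GET my-detector-index/_search
{
  "size": 0,
  "aggregations": {
    "by_detector_interval": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "10m",
        "min_doc_count": 0
      }
    }
  }
}
```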
-![Anomaly detection results](../images/ad.png)
+![Anomaly detection results]({{site.url}}{{site.baseurl}}/images/ad.png)
 Analyze anomalies with the following visualizations:
@@ -113,7 +113,7 @@ Analyze anomalies with the following visualizations:
 - **Feature breakdown** - plots the features based on the aggregation method. You can vary the date-time range of the detector.
 - **Anomaly occurrence** - shows the `Start time`, `End time`, `Data confidence`, and `Anomaly grade` for each detected anomaly.
 `Anomaly grade` is a number between 0 and 1 that indicates how anomalous a data point is. An anomaly grade of 0 represents “not an anomaly,” and a non-zero value represents the relative severity of the anomaly.
 `Data confidence` is an estimate of the probability that the reported anomaly grade matches the expected anomaly grade. Confidence increases as the model observes more data and learns the data behavior and trends. Note that confidence is distinct from model accuracy.
@@ -124,7 +124,7 @@ Choose a filled rectangle to see a more detailed view of the anomaly.
 ### Step 4: Set up alerts
-Choose **Set up alerts** and configure a monitor to notify you when anomalies are detected. For steps to create a monitor and set up notifications based on your anomaly detector, see [Monitors](../alerting/monitors/).
+Choose **Set up alerts** and configure a monitor to notify you when anomalies are detected. For steps to create a monitor and set up notifications based on your anomaly detector, see [Monitors]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/monitors/).
 If you stop or delete a detector, make sure to delete any monitors associated with it.


@@ -10,24 +10,24 @@ has_children: false
 You can use the security plugin with anomaly detection in OpenSearch to limit non-admin users to specific actions. For example, you might want some users to be able to only create, update, or delete detectors, while others can only view detectors.
-All anomaly detection indices are protected as system indices. Only a super admin user or an admin user with a TLS certificate can access system indices. For more information, see [System indices](../../security/configuration/system-indices/).
+All anomaly detection indices are protected as system indices. Only a super admin user or an admin user with a TLS certificate can access system indices. For more information, see [System indices]({{site.url}}{{site.baseurl}}/security-plugin/configuration/system-indices/).
-Security for anomaly detection works the same as [security for alerting](../../alerting/security/).
+Security for anomaly detection works the same as [security for alerting]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/security/).
 ## Basic permissions
-As an admin user, you can use the security plugin to assign specific permissions to users based on which APIs they need access to. For a list of supported APIs, see [Anomaly detection API](../api/).
+As an admin user, you can use the security plugin to assign specific permissions to users based on which APIs they need access to. For a list of supported APIs, see [Anomaly detection API]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/api/).
-The security plugin has two built-in roles that cover most anomaly detection use cases: `anomaly_full_access` and `anomaly_read_access`. For descriptions of each, see [Predefined roles](../../security/access-control/users-roles/#predefined-roles).
+The security plugin has two built-in roles that cover most anomaly detection use cases: `anomaly_full_access` and `anomaly_read_access`. For descriptions of each, see [Predefined roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/users-roles/#predefined-roles).
-If these roles don't meet your needs, mix and match individual anomaly detection [permissions](../../security/access-control/permissions/) to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opensearch/ad/detector/delete` permission lets you delete detectors.
+If these roles don't meet your needs, mix and match individual anomaly detection [permissions]({{site.url}}{{site.baseurl}}/security-plugin/access-control/permissions/) to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opensearch/ad/detector/delete` permission lets you delete detectors.
 ## (Advanced) Limit access by backend role
 Use backend roles to configure fine-grained access to individual detectors based on roles. For example, users of different departments in an organization can view detectors owned by their own department.
-First, make sure your users have the appropriate [backend roles](../../security/access-control/). Backend roles usually come from an [LDAP server](../../security/configuration/ldap/) or [SAML provider](../../security/configuration/saml/), but if you use the internal user database, you can use the REST API to [add them manually](../../security/access-control/api/#create-user).
+First, make sure your users have the appropriate [backend roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/). Backend roles usually come from an [LDAP server]({{site.url}}{{site.baseurl}}/security-plugin/configuration/ldap/) or [SAML provider]({{site.url}}{{site.baseurl}}/security-plugin/configuration/saml/), but if you use the internal user database, you can use the REST API to [add them manually]({{site.url}}{{site.baseurl}}/security-plugin/access-control/api/#create-user).
 Next, enable the following setting:


@@ -174,7 +174,7 @@ If you use a custom webhook for your destination and need to embed JSON in the m
 }
 ```
-If you want to specify a timezone, you can do so by including a [cron expression](../cron/) with a timezone name in the `schedule` section of your request.
+If you want to specify a timezone, you can do so by including a [cron expression]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/) with a timezone name in the `schedule` section of your request.
 The following example creates a monitor that runs at 12:10 PM Pacific Time on the 1st day of every month.
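The full request body falls outside this hunk, but the `schedule` section of such a monitor can be sketched as follows (`10 12 1 * *` reads minute, hour, day-of-month, month, day-of-week; the timezone is an IANA name):

```json
{
  "schedule": {
    "cron": {
      "expression": "10 12 1 * *",
      "timezone": "America/Los_Angeles"
    }
  }
}
```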


@@ -4,6 +4,9 @@ title: Cron
 nav_order: 20
 parent: Alerting
 has_children: false
+redirect_from:
+  - /alerting/cron/
+  - /docs/alerting/cron/
 ---
 # Cron expression reference
@@ -61,4 +64,4 @@ Every three hours on the first day of every other month:
 ## API
-For an example of how to use a custom cron expression in an API call, see the [create monitor API operation](../api/#request-1).
+For an example of how to use a custom cron expression in an API call, see the [create monitor API operation]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/api/#request-1).


@@ -13,4 +13,4 @@ The alerting feature notifies you when data from one or more OpenSearch indices
 To get started, choose **Alerting** in OpenSearch Dashboards.
-![OpenSearch Dashboards side bar with link](../images/alerting.png)
+![OpenSearch Dashboards side bar with link]({{site.url}}{{site.baseurl}}/images/alerting.png)


@@ -102,7 +102,7 @@ POST _nodes/reload_secure_settings
 1. Choose **Alerting**, **Monitors**, **Create monitor**.
 1. Specify a name for the monitor.
-The anomaly detection option is for pairing with the anomaly detection plugin. See [Anomaly Detection](../../ad/).
+The anomaly detection option is for pairing with the anomaly detection plugin. See [Anomaly Detection]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/).
 For an anomaly detector, choose an appropriate schedule for the monitor based on the detector interval. Otherwise, the alerting monitor might miss reading the results.
 For example, assume you set the monitor interval and the detector interval as 5 minutes, and you start the detector at 12:00. If an anomaly is detected at 12:05, it might be available at 12:06 because of the delay between writing the anomaly and it being available for queries. The monitor reads the anomaly results between 12:00 and 12:05, so it does not get the anomaly results available at 12:06.
@@ -114,13 +114,13 @@ Whenever you update a detector's interval, make sure to update the associated
 1. Choose one or more indices. You can also use `*` as a wildcard to specify an index pattern.
-If you use the security plugin, you can only choose indices that you have permission to access. For details, see [Alerting security](../security/).
+If you use the security plugin, you can only choose indices that you have permission to access. For details, see [Alerting security]({{site.url}}{{site.baseurl}}/security-plugin/).
 1. Define the monitor in one of three ways: visually, using a query, or using an anomaly detector.
 - Visual definition works well for monitors that you can define as "some value is above or below some threshold for some amount of time."
-- Query definition gives you flexibility in terms of what you query for (using [the OpenSearch query DSL](../../opensearch/full-text)) and how you evaluate the results of that query (Painless scripting).
+- Query definition gives you flexibility in terms of what you query for (using [the OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text)) and how you evaluate the results of that query (Painless scripting).
 This example averages the `cpu_usage` field:
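The example query itself falls outside this hunk; a minimal extraction query that averages `cpu_usage` might look like this sketch (the `avg_cpu` aggregation name is illustrative):

```json
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "avg_cpu": {
      "avg": {
        "field": "cpu_usage"
      }
    }
  }
}
```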
@@ -172,12 +172,12 @@ Whenever you update a detector's interval, make sure to update the associated
 1. To define a monitor visually, choose **Define using visual graph**. Then choose an aggregation (for example, `count()` or `average()`), a set of documents, and a timeframe. Visual definition works well for most monitors.
-To use a query, choose **Define using extraction query**, add your query (using [the OpenSearch query DSL](../../opensearch/full-text/)), and test it using the **Run** button.
+To use a query, choose **Define using extraction query**, add your query (using [the OpenSearch query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/)), and test it using the **Run** button.
 The monitor makes this query to OpenSearch as often as the schedule dictates; check the **Query Performance** section and make sure you're comfortable with the performance implications.
 To use an anomaly detector, choose **Define using Anomaly detector** and select your **Detector**.
-1. Choose a frequency and timezone for your monitor. Note that you can only pick a timezone if you choose Daily, Weekly, Monthly, or [custom cron expression](../cron/) for frequency.
+1. Choose a frequency and timezone for your monitor. Note that you can only pick a timezone if you choose Daily, Weekly, Monthly, or [custom cron expression]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/cron/) for frequency.
 1. Choose **Create**.
@@ -265,7 +265,7 @@ Below are some variables you can include in your message using Mustache template
 Variable | Data Type | Description
 :--- | :--- | :---
 `ctx.monitor` | JSON | Includes `ctx.monitor.name`, `ctx.monitor.type`, `ctx.monitor.enabled`, `ctx.monitor.enabled_time`, `ctx.monitor.schedule`, `ctx.monitor.inputs`, `triggers` and `ctx.monitor.last_update_time`.
-`ctx.monitor.user` | JSON | Includes information about the user who created the monitor. Includes `ctx.monitor.user.backend_roles` and `ctx.monitor.user.roles`, which are arrays that contain the backend roles and roles assigned to the user. See [alerting security](../security/) for more information.
+`ctx.monitor.user` | JSON | Includes information about the user who created the monitor. Includes `ctx.monitor.user.backend_roles` and `ctx.monitor.user.roles`, which are arrays that contain the backend roles and roles assigned to the user. See [alerting security]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/security/) for more information.
 `ctx.monitor.enabled` | Boolean | Whether the monitor is enabled.
 `ctx.monitor.enabled_time` | Milliseconds | Unix epoch time of when the monitor was last enabled.
 `ctx.monitor.schedule` | JSON | Contains a schedule of how often or when the monitor should run.


@ -1,6 +1,6 @@
--- ---
layout: default layout: default
title: Alerting Security title: Alerting security
nav_order: 10 nav_order: 10
parent: Alerting parent: Alerting
has_children: false has_children: false
@ -13,9 +13,9 @@ If you use the security plugin alongside alerting, you might want to limit certa
## Basic permissions ## Basic permissions
The security plugin has three built-in roles that cover most alerting use cases: `alerting_read_access`, `alerting_ack_alerts`, and `alerting_full_access`. For descriptions of each, see [Predefined roles](../../security/access-control/users-roles/#predefined-roles). The security plugin has three built-in roles that cover most alerting use cases: `alerting_read_access`, `alerting_ack_alerts`, and `alerting_full_access`. For descriptions of each, see [Predefined roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/users-roles/#predefined-roles).
If these roles don't meet your needs, mix and match individual alerting [permissions](../../security/access-control/permissions/) to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opensearch/alerting/destination/delete` permission lets you delete destinations. If these roles don't meet your needs, mix and match individual alerting [permissions]({{site.url}}{{site.baseurl}}/security-plugin/access-control/permissions/) to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opensearch/alerting/destination/delete` permission lets you delete destinations.
## How monitors access data

@@ -29,14 +29,14 @@ Later, the user `psantos` wants to edit the monitor to run every two hours, but
- Update the monitor so that it only checks `store1-returns`.
- Ask an administrator for read access to the other two indices.

After making the change, the monitor now runs with the same permissions as `psantos`, including any [document-level security]({{site.url}}{{site.baseurl}}/security-plugin/access-control/document-level-security/) queries, [excluded fields]({{site.url}}{{site.baseurl}}/security-plugin/access-control/field-level-security/), and [masked fields]({{site.url}}{{site.baseurl}}/security-plugin/access-control/field-masking/). If you use an extraction query to define your monitor, use the **Run** button to ensure that the response includes the fields you need.
## (Advanced) Limit access by backend role

Out of the box, the alerting plugin has no concept of ownership. For example, if you have the `cluster:admin/opensearch/alerting/monitor/write` permission, you can edit *all* monitors, regardless of whether you created them. If a small number of trusted users manage your monitors and destinations, this lack of ownership generally isn't a problem. A larger organization might need to segment access by backend role.

First, make sure that your users have the appropriate [backend roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/). Backend roles usually come from an [LDAP server]({{site.url}}{{site.baseurl}}/security-plugin/configuration/ldap/) or [SAML provider]({{site.url}}{{site.baseurl}}/security-plugin/configuration/saml/). However, if you use the internal user database, you can use the REST API to [add them manually]({{site.url}}{{site.baseurl}}/security-plugin/access-control/api/#create-user).

Next, enable the following setting:
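The diff elides the setting itself, but per the settings reference elsewhere in this documentation, the setting that restricts access by backend role is `plugins.alerting.filter_by_backend_roles`. A minimal sketch of enabling it dynamically:

```json
PUT _cluster/settings
{
  "persistent": {
    "plugins.alerting.filter_by_backend_roles": "true"
  }
}
```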
@@ -58,7 +58,7 @@ If `jdoe` creates a monitor, `jroe` can see and modify it, but `psantos` can't.
<!-- ## (Advanced) Limit access by individual

If you only want users to be able to see and modify their own monitors and destinations, duplicate the `alerting_full_access` role and add the following [DLS query]({{site.url}}{{site.baseurl}}/security-plugin/access-control/document-level-security/) to it:

```json
{

View File

@@ -10,13 +10,13 @@ nav_order: 5
## Alerting indices

The alerting feature creates several indices and one alias. The security plugin demo script configures them as [system indices]({{site.url}}{{site.baseurl}}/security-plugin/configuration/system-indices/) for an extra layer of protection. Don't delete these indices or modify their contents without using the alerting APIs.

Index | Purpose
:--- | :---
`.opendistro-alerting-alerts` | Stores ongoing alerts.
`.opendistro-alerting-alert-history-<date>` | Stores a history of completed alerts.
`.opendistro-alerting-config` | Stores monitors, triggers, and destinations. [Take a snapshot]({{site.url}}{{site.baseurl}}/opensearch/snapshot-restore) of this index to back up your alerting configuration.
`.opendistro-alerting-alert-history-write` (alias) | Provides a consistent URI for the `.opendistro-alerting-alert-history-<date>` index.
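To back up `.opendistro-alerting-config`, a snapshot request along these lines should work, assuming you have already registered a snapshot repository. The repository and snapshot names here are placeholders:

```json
PUT _snapshot/my-repository/alerting-config-backup
{
  "indices": ".opendistro-alerting-config"
}
```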
All alerting indices are hidden by default. For a summary, make the following request:

@@ -51,7 +51,7 @@ Setting | Default | Description
`plugins.alerting.alert_history_enabled` | true | Whether to create `.opendistro-alerting-alert-history-<date>` indices.
`plugins.alerting.alert_history_retention_period` | 60d | The amount of time to keep history indices before automatically deleting them.
`plugins.alerting.destination.allow_list` | ["chime", "slack", "custom_webhook", "email", "test_action"] | The list of allowed destinations. If you don't want to allow users to use a certain type of destination, you can remove it from this list, but we recommend leaving this setting as-is.
`plugins.alerting.filter_by_backend_roles` | "false" | Restricts access to monitors by backend role. See [Alerting security]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/security/).
`plugins.scheduled_jobs.sweeper.period` | 5m | The alerting feature uses its "job sweeper" component to periodically check for new or updated jobs. This setting is the rate at which the sweeper checks to see if any jobs (monitors) have changed and need to be rescheduled.
`plugins.scheduled_jobs.sweeper.page_size` | 100 | The page size for the sweeper. You shouldn't need to change this value.
`plugins.scheduled_jobs.sweeper.backoff_millis` | 50ms | The amount of time the sweeper waits between retries---increases exponentially after each failed retry.
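These settings can be changed at runtime through the cluster settings API. The values below are purely illustrative — for example, extending history retention and trimming the destination allow list (which the table above recommends leaving as-is):

```json
PUT _cluster/settings
{
  "persistent": {
    "plugins.alerting.alert_history_retention_period": "90d",
    "plugins.alerting.destination.allow_list": ["slack", "email"]
  }
}
```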

View File

@@ -19,7 +19,7 @@ Note the use of port 9600. Provide parameters for metrics, aggregations, dimensi
?metrics=<metrics>&agg=<aggregations>&dim=<dimensions>&nodes=all"
```

For a full list of metrics, see [Metrics reference]({{site.url}}{{site.baseurl}}/monitoring-plugins/pa/reference/). Performance Analyzer updates its data every five seconds. If you create a custom client, we recommend using that same interval for calls to the API.

#### Sample request

View File

@@ -32,7 +32,7 @@ The best way to get started with building custom dashboards is to duplicate and
PerfTop positions elements within a grid. For example, consider this 12 * 12 grid.

![Dashboard grid]({{site.url}}{{site.baseurl}}/images/perftop-grid.png)

The upper-left of the grid represents row 0, column 0, so the starting positions for the three boxes are:

@@ -95,7 +95,7 @@ At this point, however, all the JSON does is define the size and position of thr
## Add queries

Queries use the same elements as the [REST API]({{site.url}}{{site.baseurl}}/monitoring-plugins/pa/api/), just in JSON form:

```json
{
@@ -108,7 +108,7 @@ Queries use the same elements as the [REST API](../api/), just in JSON form:
}
```

For details on available metrics, see [Metrics reference]({{site.url}}{{site.baseurl}}/monitoring-plugins/pa/reference/).
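As an illustration only, a query equivalent to the REST call `?metrics=CPU_Utilization&agg=avg&dim=operation&nodes=all` might look something like the following. The exact key names are assumptions, not confirmed by this page — copy them from one of the pre-built dashboards rather than from this sketch:

```json
{
  "queryParams": {
    "metrics": "CPU_Utilization",
    "aggregates": "avg",
    "dimensions": "operation",
    "nodeName": "all"
  }
}
```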
## Add options

View File

@@ -17,7 +17,7 @@ You can also install it using [npm](https://www.npmjs.com/):
npm install -g @aws/opensearch-perftop
```

![PerfTop screenshot]({{site.url}}{{site.baseurl}}/images/perftop.png)

## Get started with PerfTop

@@ -46,7 +46,7 @@ Otherwise, just specify the OpenSearch endpoint:
./opensearch-perf-top-macos --dashboard dashboards/<dashboard>.json --endpoint my-cluster.my-domain.com
```

PerfTop has four pre-built dashboards in the `dashboards` directory, but you can also [create your own]({{site.url}}{{site.baseurl}}/dashboards/).

You can also load the pre-built dashboards (ClusterOverview, ClusterNetworkMemoryAnalysis, ClusterThreadAnalysis, or NodeAnalysis) without the JSON files, such as `--dashboard ClusterThreadAnalysis`.

View File

@@ -8,4 +8,4 @@ nav_order: 3
# RCA reference

You can find a reference of available RCAs and their purposes on [GitHub](https://github.com/opensearch-project/performance-analyzer-rca/tree/main/docs).

View File

@@ -7,7 +7,7 @@ nav_order: 25
# Data Prepper configuration reference

This page lists all supported Data Prepper sources, buffers, preppers, and sinks, along with their associated options. For example configuration files, see [Data Prepper]({{site.url}}{{site.baseurl}}/monitoring-plugins/trace/data-prepper/).

## Data Prepper server options

@@ -149,7 +149,7 @@ aws_region | No | String, AWS region for the cluster (e.g. `"us-east-1"`) if you
trace_analytics_raw | No | Boolean, default false. Whether to export as trace data to the `otel-v1-apm-span-*` index pattern (alias `otel-v1-apm-span`) for use with the Trace Analytics OpenSearch Dashboards plugin.
trace_analytics_service_map | No | Boolean, default false. Whether to export as trace data to the `otel-v1-apm-service-map` index for use with the service map component of the Trace Analytics OpenSearch Dashboards plugin.
index | No | String, name of the index to export to. Only required if you don't use the `trace_analytics_raw` or `trace_analytics_service_map` presets.
template_file | No | String, the path to a JSON [index template]({{site.url}}{{site.baseurl}}/opensearch/index-templates/) file (e.g. `/your/local/template-file.json`) if you do not use the `trace_analytics_raw` or `trace_analytics_service_map` presets. See [otel-v1-apm-span-index-template.json](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/opensearch/src/main/resources/otel-v1-apm-span-index-template.json) for an example.
document_id_field | No | String, the field from the source data to use for the OpenSearch document ID (e.g. `"my-field"`) if you don't use the `trace_analytics_raw` or `trace_analytics_service_map` presets.
dlq_file | No | String, the path to your preferred dead letter queue file (e.g. `/your/local/dlq-file`). Data Prepper writes to this file when it fails to index a document on the OpenSearch cluster.
bulk_size | No | Integer (long), default 5. The maximum size (in MiB) of bulk requests to the OpenSearch cluster. Values below 0 indicate an unlimited size. If a single document exceeds the maximum bulk request size, Data Prepper sends it individually.

View File

@@ -105,7 +105,7 @@ service-map-pipeline:
      trace_analytics_service_map: true
```

To learn more, see the [Data Prepper configuration reference]({{site.url}}{{site.baseurl}}/monitoring-plugins/trace/data-prepper-reference/).

## Configure the Data Prepper server

Data Prepper itself provides administrative HTTP endpoints such as `/list` to list pipelines and `/metrics/prometheus` to provide Prometheus-compatible metrics data. The port that serves these endpoints, as well as TLS configuration, is specified by a separate YAML file. Example:

View File

@@ -12,7 +12,7 @@ OpenSearch Trace Analytics consists of two components---Data Prepper and the Tra
## Basic flow of data

![Data flow diagram from a distributed application to OpenSearch]({{site.url}}{{site.baseurl}}/images/ta.svg)

1. Trace Analytics relies on you adding instrumentation to your application and generating trace data. The [OpenTelemetry documentation](https://opentelemetry.io/docs/) contains example applications for many programming languages that can help you get started, including Java, Python, Go, and JavaScript.

@@ -20,9 +20,9 @@ OpenSearch Trace Analytics consists of two components---Data Prepper and the Tra
1. The [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/getting-started/) receives data from the application and formats it into OpenTelemetry data.
1. [Data Prepper]({{site.url}}{{site.baseurl}}/monitoring-plugins/trace/data-prepper/) processes the OpenTelemetry data, transforms it for use in OpenSearch, and indexes it on an OpenSearch cluster.
1. The [Trace Analytics OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/monitoring-plugins/trace/ta-dashboards/) displays the data in near real-time as a series of charts and tables, with an emphasis on service architecture, latency, error rate, and throughput.

## Jaeger HotROD
@@ -39,7 +39,7 @@ Download or clone the [Data Prepper repository](https://github.com/opensearch-pr
Close the file and run `docker-compose up --build`. After the containers start, navigate to `http://localhost:8080` in a web browser.

![HotROD web interface]({{site.url}}{{site.baseurl}}/images/hot-rod.png)

Click one of the buttons in the web interface to send a request to the application. Each request starts a series of operations across the services that make up the application. From the console logs, you can see that these operations share the same `trace-id`, which lets you track all of the operations in the request as a single *trace*:

@@ -80,4 +80,4 @@ curl -X GET -u 'admin:admin' -k 'https://localhost:9200/otel-v1-apm-span-000001/
Navigate to `http://localhost:5601` in a web browser and choose **Trace Analytics**. You can see the results of your single click in the Jaeger HotROD web interface: the number of traces per API and HTTP method, latency trends, a color-coded map of the service architecture, and a list of trace IDs that you can use to drill down on individual operations.

If you don't see your trace, adjust the timeframe in OpenSearch Dashboards. For more information on using the plugin, see [OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/monitoring-plugins/trace/ta-dashboards/).

View File

@@ -14,4 +14,4 @@ A single operation, such as a user clicking a button, can trigger an extended se
Trace Analytics can help you visualize this flow of events and identify performance problems.

![Detailed trace view]({{site.url}}{{site.baseurl}}/images/ta-trace.png)

View File

@@ -7,16 +7,16 @@ nav_order: 50
# Trace Analytics OpenSearch Dashboards plugin

The Trace Analytics plugin for OpenSearch Dashboards provides at-a-glance visibility into your application performance, along with the ability to drill down on individual traces. For installation instructions, see [Standalone OpenSearch Dashboards plugin install]({{site.url}}{{site.baseurl}}/dashboards/install/plugins/).

The **Dashboard** view groups traces together by HTTP method and path so that you can see the average latency, error rate, and trends associated with a particular operation. For a more focused view, try filtering by trace group name.

![Dashboard view]({{site.url}}{{site.baseurl}}/images/ta-dashboard.png)

To drill down on the traces that make up a trace group, choose the number of traces in the right-hand column. Then choose an individual trace for a detailed summary.

![Detailed trace view]({{site.url}}{{site.baseurl}}/images/ta-trace.png)

The **Services** view lists all services in the application, plus an interactive map that shows how the various services connect to each other. In contrast to the dashboard, which helps identify problems by operation, the service map helps identify problems by service. Try sorting by error rate or latency to get a sense of potential problem areas of your application.

![Service view]({{site.url}}{{site.baseurl}}/images/ta-services.png)

View File

@@ -1,7 +1,6 @@
---
layout: default
title: Aggregations
parent: OpenSearch
nav_order: 13
has_children: true
---

View File

@@ -2,12 +2,11 @@
layout: default
title: Bucket Aggregations
parent: Aggregations
grand_parent: OpenSearch
nav_order: 2
has_children: false
---

# Bucket aggregations

Bucket aggregations categorize sets of documents as buckets. The type of bucket aggregation determines whether a given document falls into a bucket or not.

View File

@@ -1,7 +1,6 @@
---
layout: default
title: CAT API
parent: OpenSearch
nav_order: 20
---

View File

@@ -1,7 +1,6 @@
---
layout: default
title: Cluster formation
parent: OpenSearch
nav_order: 7
---

@@ -15,7 +14,7 @@ To create and deploy an OpenSearch cluster according to your requirements, it
There are many ways to design a cluster. The following illustration shows a basic architecture:

![multi-node cluster architecture diagram]({{site.url}}{{site.baseurl}}/images/cluster.png)

This is a four-node cluster that has one dedicated master node, one dedicated coordinating node, and two data nodes that are master-eligible and also used for ingesting data.
@@ -37,7 +36,7 @@ This page demonstrates how to work with the different node types. It assumes tha
## Prerequisites

Before you get started, you must install and configure OpenSearch on all of your nodes. For information about the available options, see [Install and configure OpenSearch]({{site.url}}{{site.baseurl}}/opensearch/install/).

After you're done, use SSH to connect to each node, then open the `config/opensearch.yml` file. You can set all configurations for your cluster in this file.

@@ -189,7 +188,7 @@ x.x.x.x 34 38 0 0.12 0.07 0.06 md - o
x.x.x.x 23 38 0 0.12 0.07 0.06 md - opensearch-c1
```

To better understand and monitor your cluster, use the [cat API]({{site.url}}{{site.baseurl}}/opensearch/catapis/).
## (Advanced) Step 6: Configure shard allocation awareness or forced awareness

@@ -323,9 +322,9 @@ old_index 0 r UNASSIGNED
In this case, all primary shards are allocated to `opensearch-d2`. Again, all replica shards are unassigned because we only have one warm node.

A popular approach is to configure your [index templates]({{site.url}}{{site.baseurl}}/opensearch/index-templates/) to set the `index.routing.allocation.require.temp` value to `hot`. This way, OpenSearch stores your most recent data on your hot nodes.

You can then use the [Index State Management (ISM)]({{site.url}}{{site.baseurl}}/im-plugin/) plugin to periodically check the age of an index and specify actions to take on it. For example, when the index reaches a specific age, change the `index.routing.allocation.require.temp` setting to `warm` to automatically move your data from hot nodes to warm nodes.
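A sketch of such an ISM policy follows. The policy ID, the 30-day threshold, and the `temp` attribute name are illustrative; the `allocation` action and transition syntax follow the ISM policy API:

```json
PUT _plugins/_ism/policies/hot_to_warm
{
  "policy": {
    "description": "Move indices to warm nodes after 30 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "warm", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [
          { "allocation": { "require": { "temp": "warm" } } }
        ],
        "transitions": []
      }
    ]
  }
}
```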
## Next steps

@@ -333,7 +332,7 @@ You can then use the [Index State Management (ISM)](../../ism/index/) plugin to
If you are using the security plugin, the previous request to `_cat/nodes?v` might have failed with an initialization error. To initialize the plugin, run `opensearch/plugins/opensearch-security/tools/securityadmin.sh`. A sample command that uses the demo certificates might look like this:

```bash
sudo ./securityadmin.sh -cd ../securityconfig/ -icl -nhnv -cacert /etc/opensearch/root-ca.pem -cert /etc/opensearch/kirk.pem -key /etc/opensearch/kirk-key.pem -h <private-ip>
```

For full guidance around configuration options, see [Security configuration]({{site.url}}{{site.baseurl}}/security-plugin/configuration).

View File

@@ -1,7 +1,6 @@
---
layout: default
title: Common REST Parameters
parent: OpenSearch
nav_order: 93
---

View File

@@ -1,7 +1,6 @@
---
layout: default
title: Configuration
parent: OpenSearch
nav_order: 5
---

@@ -66,4 +65,4 @@ PUT /_cluster/settings
You can find `opensearch.yml` in `/usr/share/opensearch/config/opensearch.yml` (Docker) or `/etc/opensearch/opensearch.yml` (RPM and DEB) on each node.

The demo configuration includes a number of settings for the security plugin that you should modify before using OpenSearch for a production workload. To learn more, see [Security]({{site.url}}{{site.baseurl}}/security-plugin/).

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Index aliases
-parent: OpenSearch
 nav_order: 12
 ---

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Index data
-parent: OpenSearch
 nav_order: 10
 ---

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Index templates
-parent: OpenSearch
 nav_order: 14
 ---

View File

@@ -1,8 +1,8 @@
 ---
 layout: default
-title: OpenSearch
-nav_order: 10
-has_children: true
+title: About OpenSearch
+nav_order: 1
+has_children: false
 has_toc: false
 ---
@@ -10,9 +10,9 @@ has_toc: false
 OpenSearch is a distributed search and analytics engine based on [Apache Lucene](https://lucene.apache.org/). After adding your data to OpenSearch, you can perform full-text searches on it with all of the features you might expect: search by field, search multiple indices, boost fields, rank results by score, sort results by field, and aggregate results.

-Unsurprisingly, people often use OpenSearch as the backend for a search application---think [Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:FAQ/Technical#What_software_is_used_to_run_Wikipedia?) or an online store. It offers excellent performance and can scale up and down as the needs of the application grow or shrink.
+Unsurprisingly, people often use search engines like OpenSearch as the backend for a search application---think [Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:FAQ/Technical#What_software_is_used_to_run_Wikipedia?) or an online store. It offers excellent performance and can scale up and down as the needs of the application grow or shrink.

-An equally popular, but less obvious use case is log analytics, in which you take the logs from an application, feed them into OpenSearch, and use the rich search and visualization functionality to identify issues. For example, a malfunctioning web server might throw a 500 error 0.5% of the time, which can be hard to notice unless you have a real-time graph of all HTTP status codes that the server has thrown in the past four hours. You can use [OpenSearch Dashboards](../opensearch-dashboards/) to build these sorts of visualizations from data in OpenSearch.
+An equally popular, but less obvious use case is log analytics, in which you take the logs from an application, feed them into OpenSearch, and use the rich search and visualization functionality to identify issues. For example, a malfunctioning web server might throw a 500 error 0.5% of the time, which can be hard to notice unless you have a real-time graph of all HTTP status codes that the server has thrown in the past four hours. You can use [OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/) to build these sorts of visualizations from data in OpenSearch.

 ## Clusters and nodes
@@ -21,7 +21,7 @@ Its distributed design means that you interact with OpenSearch *clusters*. Each
 You can run OpenSearch locally on a laptop---its system requirements are minimal---but you can also scale a single cluster to hundreds of powerful machines in a data center.

-In a single node cluster, such as a laptop, one machine has to do everything: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might be great at indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state. For more information on setting node types, see [Cluster formation](cluster/).
+In a single node cluster, such as a laptop, one machine has to do everything: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might be great at indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state. For more information on setting node types, see [Cluster formation]({{site.url}}{{site.baseurl}}/opensearch/cluster/).

 ## Indices and documents

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Docker security configuration
 parent: Install OpenSearch
-grand_parent: OpenSearch
 nav_order: 5
 ---
@@ -109,7 +108,7 @@ networks:
 opensearch-net:
 ```
-Then make your changes to `opensearch.yml`. For a full list of settings, see [Security](../../../security/configuration/). This example adds (extremely) verbose audit logging:
+Then make your changes to `opensearch.yml`. For a full list of settings, see [Security]({{site.url}}{{site.baseurl}}/security-plugin/configuration/). This example adds (extremely) verbose audit logging:

 ```yml
 plugins.security.ssl.transport.pemcert_filepath: node.pem
@@ -134,7 +133,7 @@ plugins.security.audit.config.disabled_rest_categories: NONE
 plugins.security.audit.config.disabled_transport_categories: NONE
 ```
-Use this same override process to specify new [authentication settings](../../../security/configuration/configuration/) in `/usr/share/opensearch/plugins/opensearch-security/securityconfig/config.yml`, as well as new default [internal users, roles, mappings, action groups, and tenants](../../../security/configuration/yaml/).
+Use this same override process to specify new [authentication settings]({{site.url}}{{site.baseurl}}/security-plugin/configuration/configuration/) in `/usr/share/opensearch/plugins/opensearch-security/securityconfig/config.yml`, as well as new default [internal users, roles, mappings, action groups, and tenants]({{site.url}}{{site.baseurl}}/security-plugin/configuration/yaml/).

 To start the cluster, run `docker-compose up`.
@@ -163,7 +162,7 @@ volumes:
 - ./custom-opensearch.yml: /full/path/to/custom-opensearch.yml
 ```
-Remember that the certificates you specify in your Docker Compose file must be the same as the certificates listed in your custom `opensearch.yml` file. At a minimum, you should replace the root, admin, and node certificates with your own. For more information about adding and using certificates, see [Configure TLS certificates](../security/configuration/tls.md).
+Remember that the certificates you specify in your Docker Compose file must be the same as the certificates listed in your custom `opensearch.yml` file. At a minimum, you should replace the root, admin, and node certificates with your own. For more information about adding and using certificates, see [Configure TLS certificates]({{site.url}}{{site.baseurl}}/security-plugin/configuration/tls).

 ```yml
 plugins.security.ssl.transport.pemcert_filepath: new-node-cert.pem

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Docker
 parent: Install OpenSearch
-grand_parent: OpenSearch
 nav_order: 1
 ---
@@ -184,7 +183,7 @@ services:
 - ./custom-opensearch_dashboards.yml:/usr/share/opensearch-dashboards/config/opensearch_dashboards.yml
 ```
-You can also configure `docker-compose.yml` and `opensearch.yml` [to take your own certificates](../docker-security/) for use with the [Security](../../security/configuration/) plugin.
+You can also configure `docker-compose.yml` and `opensearch.yml` [to take your own certificates]({{site.url}}{{site.baseurl}}/opensearch/install/docker-security/) for use with the [Security]({{site.url}}{{site.baseurl}}/security-plugin/configuration/) plugin.

 ### (Optional) Set up Performance Analyzer
@@ -299,7 +298,7 @@ docker build --tag=opensearch-custom-plugin .
 docker run -p 9200:9200 -p 9600:9600 -v /usr/share/opensearch/data opensearch-custom-plugin
 ```
-You can also use a `Dockerfile` to pass your own certificates for use with the [security](../../../security/) plugin, similar to the `-v` argument in [Configure OpenSearch](#configure-opensearch):
+You can also use a `Dockerfile` to pass your own certificates for use with the [security]({{site.url}}{{site.baseurl}}/security-plugin/) plugin, similar to the `-v` argument in [Configure OpenSearch](#configure-opensearch):

 ```
 FROM opensearchproject/opensearch:{{site.opensearch_version}}

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Important settings
 parent: Install OpenSearch
-grand_parent: OpenSearch
 nav_order: 70
 ---
@@ -22,7 +21,7 @@ vm.max_map_count=262144
 Then run `sudo sysctl -p` to reload.

-The [sample docker-compose.yml](../docker/#sample-docker-compose-file) file also contains several key settings:
+The [sample docker-compose.yml]({{site.url}}{{site.baseurl}}/opensearch/install/docker/#sample-docker-compose-file) file also contains several key settings:

 - `bootstrap.memory_lock=true`

View File

@@ -1,8 +1,7 @@
 ---
 layout: default
 title: Install OpenSearch
-nav_order: 1
-parent: OpenSearch
+nav_order: 2
 redirect_from: /docs/install/
 has_children: true
 ---

View File

@@ -2,7 +2,6 @@
 layout: default
 title: OpenSearch plugins
 parent: Install OpenSearch
-grand_parent: OpenSearch
 nav_order: 90
 ---
@@ -96,9 +95,9 @@ Navigate to the OpenSearch home directory (most likely, it is `/usr/share/opense
 sudo bin/opensearch-plugin install https://d3g5vo6xdbdb9a.cloudfront.net/downloads/opensearch-plugins/opensearch-security/opensearch-security-{{site.opensearch_major_minor_version}}.1.0.zip
 ```
-After installing the security plugin, you can run `sudo sh /usr/share/opensearch/plugins/opensearch-security/tools/install_demo_configuration.sh` to quickly get started with demo certificates. Otherwise, you must configure it manually and run [securityadmin.sh](../../../security/configuration/security-admin/).
+After installing the security plugin, you can run `sudo sh /usr/share/opensearch/plugins/opensearch-security/tools/install_demo_configuration.sh` to quickly get started with demo certificates. Otherwise, you must configure it manually and run [securityadmin.sh]({{site.url}}{{site.baseurl}}/security-plugin/configuration/security-admin/).

-The security plugin has a corresponding [OpenSearch Dashboards plugin](../../../opensearch-dashboards/install/plugins) that you probably want to install as well.
+The security plugin has a corresponding [OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/opensearch-dashboards/install/plugins) that you probably want to install as well.

 ### Job scheduler
@@ -114,7 +113,7 @@ sudo bin/opensearch-plugin install https://d3g5vo6xdbdb9a.cloudfront.net/downloa
 sudo bin/opensearch-plugin install https://d3g5vo6xdbdb9a.cloudfront.net/downloads/opensearch-plugins/opensearch-alerting/opensearch-alerting-{{site.opensearch_major_minor_version}}.1.0.zip
 ```
-To install Alerting, you must first install the Job Scheduler plugin. Alerting has a corresponding [OpenSearch Dashboards plugin](../../../opensearch-dashboards/install/plugins/) that you probably want to install as well.
+To install Alerting, you must first install the Job Scheduler plugin. Alerting has a corresponding [OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/opensearch-dashboards/install/plugins/) that you probably want to install as well.

 ### SQL
@@ -137,7 +136,7 @@ sudo bin/opensearch-plugin install https://d3g5vo6xdbdb9a.cloudfront.net/downloa
 sudo bin/opensearch-plugin install https://d3g5vo6xdbdb9a.cloudfront.net/downloads/opensearch-plugins/opensearch-index-management/opensearch-index-management-{{site.opensearch_major_minor_version}}.2.0.zip
 ```
-To install Index State Management, you must first install the Job Scheduler plugin. ISM has a corresponding [OpenSearch Dashboards plugin](../../../opensearch-dashboards/install/plugins/) that you probably want to install as well.
+To install Index State Management, you must first install the Job Scheduler plugin. ISM has a corresponding [OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/opensearch-dashboards/install/plugins/) that you probably want to install as well.

 ### k-NN

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Tarball
 parent: Install OpenSearch
-grand_parent: OpenSearch
 nav_order: 50
 ---
@@ -46,7 +45,7 @@ You can modify `config/opensearch.yml` or specify environment variables as argum
 ./opensearch-tar-install.sh -Ecluster.name=opensearch-cluster -Enode.name=opensearch-node1 -Ehttp.host=0.0.0.0 -Ediscovery.type=single-node
 ```
-For other settings, see [Important settings](../important-settings/).
+For other settings, see [Important settings]({{site.url}}{{site.baseurl}}/opensearch/install/important-settings/).

 ### (Optional) Set up Performance Analyzer
@@ -143,6 +142,6 @@ In a tarball installation, Performance Analyzer collects data when it is enabled
 ### (Optional) Removing Performance Analyzer

-See [Clean up Performance Analyzer files](../plugins/#optional-clean-up-performance-analyzer-files).
+See [Clean up Performance Analyzer files]({{site.url}}{{site.baseurl}}/plugins/#optional-clean-up-performance-analyzer-files).
 {% endcomment %}

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Logs
-parent: OpenSearch
 nav_order: 60
 ---

View File

@@ -2,12 +2,11 @@
 layout: default
 title: Metric Aggregations
 parent: Aggregations
-grand_parent: OpenSearch
 nav_order: 1
 has_children: false
 ---

-# Metric Aggregations
+# Metric aggregations

 Metric aggregations let you perform simple calculations such as finding the minimum, maximum, and average values of a field.

View File

@@ -2,12 +2,11 @@
 layout: default
 title: Pipeline Aggregations
 parent: Aggregations
-grand_parent: OpenSearch
 nav_order: 4
 has_children: false
 ---

-# Pipeline Aggregations
+# Pipeline aggregations

 With pipeline aggregations, you can chain aggregations by piping the results of one aggregation as an input to another for a more nuanced output.

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Popular APIs
-parent: OpenSearch
 nav_order: 96
 ---

View File

@@ -2,8 +2,8 @@
 layout: default
 title: Boolean queries
 parent: Query DSL
-grand_parent: OpenSearch
 nav_order: 45
+redirect_from: /docs/opensearch/bool/
 ---

 # Boolean queries

View File

@@ -2,8 +2,10 @@
 layout: default
 title: Full-text queries
 parent: Query DSL
-grand_parent: OpenSearch
 nav_order: 40
+redirect_from:
+  - /docs/opensearch/full-text/
+  - /opensearch/full-text/
 ---

 # Full-text queries

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Query DSL
 nav_order: 27
-parent: OpenSearch
 has_children: true
 ---

View File

@@ -2,8 +2,8 @@
 layout: default
 title: Term-level queries
 parent: Query DSL
-grand_parent: OpenSearch
 nav_order: 30
+redirect_from: /docs/opensearch/term/
 ---

 # Term-level queries
@@ -145,7 +145,7 @@ The search query “To be, or not to be” is analyzed and tokenized into an arr
 ...
 ```
-For a list of all full-text queries, see [Full-text queries](../full-text/).
+For a list of all full-text queries, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/full-text/).

 If you want to query for an exact term like “HAMLET” in the speaker field and don't need the results to be sorted by relevance scores, a term-level query is more efficient:

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Reindex data
-parent: OpenSearch
 nav_order: 16
 ---
@@ -113,7 +112,7 @@ POST _reindex
 }
 ```
-For a list of all query operations, see [Full-text queries](../full-text/).
+For a list of all query operations, see [Full-text queries]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/).

 ## Combine one or more indices

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Bulk
 parent: REST API reference
-grand_parent: OpenSearch
 nav_order: 15
 ---

View File

@@ -2,7 +2,6 @@
 layout: default
 title: Cluster allocation explain
 parent: REST API reference
-grand_parent: OpenSearch
 nav_order: 30
 ---

View File

@@ -2,7 +2,6 @@
 layout: default
 title: REST API reference
 nav_order: 99
-parent: OpenSearch
 has_children: true
 ---

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Search templates
-parent: OpenSearch
 nav_order: 50
 ---

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Take and restore snapshots
-parent: OpenSearch
 nav_order: 65
 ---
@@ -101,7 +100,7 @@ Setting | Description
 sudo ./bin/opensearch-plugin install repository-s3
 ```
-If you're using the Docker installation, see [Customize the Docker image](../../install/docker/#customize-the-docker-image). Your `Dockerfile` should look something like this:
+If you're using the Docker installation, see [Customize the Docker image]({{site.url}}{{site.baseurl}}/opensearch/install/docker/#customize-the-docker-image). Your `Dockerfile` should look something like this:

 ```
 FROM opensearchproject/opensearch:{{site.opensearch_version}}

View File

@@ -1,9 +1,7 @@
 ---
 layout: default
 title: Tasks API
-parent: OpenSearch
 nav_order: 25
-has_math: false
 ---

 # Tasks API operation
@@ -22,7 +20,7 @@ By including a task ID, you can get information specific to a particular task. N
 GET _tasks/<task_id>
 ```
-Note that if a task finishes running, it won't be returned as part of your request. For an example of a task that takes a little longer to finish, you can run the [`_reindex`](../reindex-data) API operation on a larger document, and then run `tasks`.
+Note that if a task finishes running, it won't be returned as part of your request. For an example of a task that takes a little longer to finish, you can run the [`_reindex`]({{site.url}}{{site.baseurl}}/opensearch/reindex-data) API operation on a larger document, and then run `tasks`.

 **Sample Response**

 ```json

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Supported units
-parent: OpenSearch
 nav_order: 90
 ---
@@ -16,4 +15,4 @@ Bytes | The supported units for byte size are `b` for bytes, `kb` for kibibytes,
 Distances | The supported units for distance are `mi` for miles, `yd` for yards, `ft` for feet, `in` for inches, `km` for kilometers, `m` for meters, `cm` for centimeters, `mm` for millimeters, and `nmi` or `NM` for nautical miles. | `5mi` or `4ft`
 Quantities without units | For large values that don't have a unit, use `k` for kilo, `m` for mega, `g` for giga, `t` for tera, and `p` for peta. | `5k` for 5,000

-To convert output units to human-readable values, see [Common REST parameters](../common-parameters/).
+To convert output units to human-readable values, see [Common REST parameters]({{site.url}}{{site.baseurl}}/opensearch/common-parameters/).
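As a side note, the unitless-quantity suffixes in the table above can be expanded mechanically. The following is an illustrative sketch only (`expand_quantity` is a hypothetical helper, not part of OpenSearch or its tooling), handling the `k`/`m`/`g` cases; `t` and `p` follow the same pattern:

```shell
# Hypothetical helper: expand the unitless quantity suffixes described above.
expand_quantity() {
  local v=$1
  case $v in
    *k) echo $(( ${v%k} * 1000 )) ;;          # kilo
    *m) echo $(( ${v%m} * 1000000 )) ;;       # mega
    *g) echo $(( ${v%g} * 1000000000 )) ;;    # giga
    *)  echo "$v" ;;                          # no suffix: pass the value through
  esac
}

expand_quantity 5k   # prints 5000, matching the table's `5k` for 5,000 example
```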

View File

@@ -1,7 +1,6 @@
 ---
 layout: default
 title: Search experience
-parent: OpenSearch
 nav_order: 55
 ---
@@ -40,7 +39,7 @@ These methods are described in the following sections.
 Prefix matching finds documents that matches the last term in the query string.

 For example, assume that the user types “qui” into a search UI. To autocomplete this phrase, use the `match_phrase_prefix` query to search all `text_entry` fields that begin with the prefix "qui."

-To make the word order and relative positions flexible, specify a `slop` value. To learn about the `slop` option, see [Options](../full-text/#options).
+To make the word order and relative positions flexible, specify a `slop` value. To learn about the `slop` option, see [Options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#options).

 #### Sample Request
@@ -60,7 +59,7 @@ GET shakespeare/_search
 Prefix matching doesn’t require any special mappings. It works with your data as-is.

 However, it’s a fairly resource-intensive operation. A prefix of `a` could match hundreds of thousands of terms and not be useful to your user.

-To limit the impact of prefix expansion, set `max_expansions` to a reasonable number. To learn about the `max_expansions` option, see [Options](../full-text/#options).
+To limit the impact of prefix expansion, set `max_expansions` to a reasonable number. To learn about the `max_expansions` option, see [Options]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/full-text/#options).

 #### Sample Request

View File

@@ -3,6 +3,7 @@ layout: default
 title: Asynchronous search
 nav_order: 51
 has_children: true
+redirect_from: /docs/async/
 ---

 # Asynchronous search

View File

@@ -4,19 +4,20 @@ title: Asynchronous search security
 nav_order: 2
 parent: Asynchronous search
 has_children: false
+redirect_from: /docs/async/security/
 ---

 # Asynchronous search security

 You can use the security plugin with asynchronous searches to limit non-admin users to specific actions. For example, you might want some users to only be able to submit or delete asynchronous searches, while you might want others to only view the results.

-All asynchronous search indices are protected as system indices. Only a super admin user or an admin user with a Transport Layer Security (TLS) certificate can access system indices. For more information, see [System indices](../../security/configuration/system-indices/).
+All asynchronous search indices are protected as system indices. Only a super admin user or an admin user with a Transport Layer Security (TLS) certificate can access system indices. For more information, see [System indices]({{site.url}}{{site.baseurl}}/security-plugin/configuration/system-indices/).

 ## Basic permissions

-As an admin user, you can use the security plugin to assign specific permissions to users based on which API operations they need access to. For a list of supported APIs operations, see [Asynchronous search](../).
+As an admin user, you can use the security plugin to assign specific permissions to users based on which API operations they need access to. For a list of supported APIs operations, see [Asynchronous search]({{site.url}}{{site.baseurl}}/).

-The security plugin has two built-in roles that cover most asynchronous search use cases: `asynchronous_search_full_access` and `asynchronous_search_read_access`. For descriptions of each, see [Predefined roles](../../security/access-control/users-roles/#predefined-roles).
+The security plugin has two built-in roles that cover most asynchronous search use cases: `asynchronous_search_full_access` and `asynchronous_search_read_access`. For descriptions of each, see [Predefined roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/users-roles/#predefined-roles).

 If these roles don’t meet your needs, mix and match individual asynchronous search permissions to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opensearch/asynchronous_search/delete` permission lets you delete a previously submitted asynchronous search.
@@ -24,7 +25,7 @@ If these roles don’t meet your needs, mix and match individual asynchronous se
 Use backend roles to configure fine-grained access to asynchronous searches based on roles. For example, users of different departments in an organization can view asynchronous searches owned by their own department.

-First, make sure your users have the appropriate [backend roles](../../security/access-control/). Backend roles usually come from an [LDAP server](../../security/configuration/ldap/) or [SAML provider](../../security/configuration/saml/). However, if you use the internal user database, you can use the REST API to [add them manually](../../security/access-control/api/#create-user).
+First, make sure your users have the appropriate [backend roles]({{site.url}}{{site.baseurl}}/security-plugin/access-control/). Backend roles usually come from an [LDAP server]({{site.url}}{{site.baseurl}}/security-plugin/configuration/ldap/) or [SAML provider]({{site.url}}{{site.baseurl}}/security-plugin/configuration/saml/). However, if you use the internal user database, you can use the REST API to [add them manually]({{site.url}}{{site.baseurl}}/security-plugin/access-control/api/#create-user).

 Now when users view asynchronous search resources in OpenSearch Dashboards (or make REST API calls), they only see asynchronous searches submitted by users who have a subset of the backend role.

 For example, consider two users: `judy` and `elon`.
@@ -32,7 +33,7 @@ For example, consider two users: `judy` and `elon`.
 `judy` has an IT backend role:

 ```json
-PUT _opensearch/_security/api/internalusers/judy
+PUT _plugins/_security/api/internalusers/judy
 {
   "password": "judy",
   "backend_roles": [
@@ -45,7 +46,7 @@ PUT _opensearch/_security/api/internalusers/judy
 `elon` has an admin backend role:

 ```json
-PUT _opensearch/_security/api/internalusers/elon
+PUT _plugins/_security/api/internalusers/elon
 {
   "password": "elon",
   "backend_roles": [
@@ -58,7 +59,7 @@ PUT _opensearch/_security/api/internalusers/elon
 Both `judy` and `elon` have full access to asynchronous search:

 ```json
-PUT _opensearch/_security/api/rolesmapping/async_full_access
+PUT _plugins/_security/api/rolesmapping/async_full_access
 {
   "backend_roles": [],
   "hosts": [],

View File

@@ -3,6 +3,7 @@ layout: default
 title: Settings
 parent: Asynchronous search
 nav_order: 4
+redirect_from: /docs/async/settings/
 ---

 # Settings

View File

@@ -1,9 +1,10 @@
---
layout: default
title: k-NN API
nav_order: 5
parent: k-NN
has_children: false
redirect_from: /docs/knn/api/
---
# k-NN plugin API
@@ -22,7 +23,7 @@ Statistic | Description
:--- | :---
`circuit_breaker_triggered` | Indicates whether the circuit breaker is triggered. This statistic is only relevant to approximate k-NN search.
`total_load_time` | The time in nanoseconds that k-NN has taken to load graphs into the cache. This statistic is only relevant to approximate k-NN search.
`eviction_count` | The number of graphs that have been evicted from the cache due to memory constraints or idle time. This statistic is only relevant to approximate k-NN search. <br /> **Note**: Explicit evictions that occur because of index deletion aren't counted.
`hit_count` | The number of cache hits. A cache hit occurs when a user queries a graph that's already loaded into memory. This statistic is only relevant to approximate k-NN search.
`miss_count` | The number of cache misses. A cache miss occurs when a user queries a graph that isn't loaded into memory yet. This statistic is only relevant to approximate k-NN search.
`graph_memory_usage` | Current cache size (total size of all graphs in memory) in kilobytes. This statistic is only relevant to approximate k-NN search.
@@ -139,7 +140,7 @@ The call doesn't return results until the warmup operation finishes or the reque
GET /_tasks
```
After the operation has finished, use the [k-NN `_stats` API operation](#stats) to see what the k-NN plugin loaded into the graph.
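For reference, a warmup request followed by a stats check might look like the following sketch. The index name `my-knn-index-1` is illustrative, and the `_plugins/_knn` base path is assumed from the plugin's other endpoints:

```json
GET /_plugins/_knn/warmup/my-knn-index-1

GET /_plugins/_knn/stats
```

Multiple indices can be passed to the warmup endpoint as a comma-separated list.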
### Best practices
@@ -148,6 +149,6 @@ For the warmup operation to function properly, follow these best practices:
* Don't run merge operations on indices that you want to warm up. During merge, the k-NN plugin creates new segments, and old segments are sometimes deleted. For example, you could encounter a situation in which the warmup API operation loads graphs A and B into native memory, but segment C is created from segments A and B being merged. The graphs for A and B would no longer be in memory, and graph C would also not be in memory. In this case, the initial penalty for loading graph C is still present.
* Confirm that all graphs you want to warm up can fit into native memory. For more information about the native memory limit, see the [knn.memory.circuit_breaker.limit statistic]({{site.url}}{{site.baseurl}}/search-plugins/knn/settings#cluster-settings). High graph memory usage causes cache thrashing, which can lead to operations constantly failing and attempting to run again.
* Don't index any documents that you want to load into the cache. Writing new information to segments prevents the warmup API operation from loading the graphs until they're searchable. This means that you would have to run the warmup operation again after indexing finishes.
View File
@@ -5,13 +5,14 @@ nav_order: 2
parent: k-NN
has_children: false
has_math: true
redirect_from: /docs/knn/approximate-knn/
---
# Approximate k-NN search
The approximate k-NN method uses [nmslib's](https://github.com/nmslib/nmslib/) implementation of the Hierarchical Navigable Small World (HNSW) algorithm to power k-NN search. In this case, approximate means that for a given search, the neighbors returned are an estimate of the true k-nearest neighbors. Of the three methods, this method offers the best search scalability for large data sets. Generally speaking, once the data set gets into the hundreds of thousands of vectors, this approach is preferred.
The k-NN plugin builds an HNSW graph of the vectors for each "knn-vector field"/ "Lucene segment" pair during indexing that can be used to efficiently find the k-nearest neighbors to a query vector during search. To learn more about Lucene segments, please refer to [Apache Lucene's documentation](https://lucene.apache.org/core/8_7_0/core/org/apache/lucene/codecs/lucene87/package-summary.html#package.description). These graphs are loaded into native memory during search and managed by a cache. To learn more about pre-loading graphs into memory, refer to the [warmup API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#warmup). Additionally, you can see what graphs are already loaded in memory, which you can learn more about in the [stats API section]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#stats).
Because the graphs are constructed during indexing, it is not possible to apply a filter on an index and then use this search method. All filters are applied on the results produced by the approximate nearest neighbor search.
@@ -19,7 +20,7 @@ Because the graphs are constructed during indexing, it is not possible to apply
To use the k-NN plugin's approximate search functionality, you must first create a k-NN index with the setting `index.knn` set to `true`. This setting tells the plugin to create HNSW graphs for the index.
Additionally, if you're using the approximate k-nearest neighbor method, set `index.knn.space_type` to the space you're interested in. You can't change this setting after it's set. To see what spaces we support, see [spaces](#spaces). By default, `index.knn.space_type` is `l2`. For more information about index settings, such as algorithm parameters you can tweak to tune performance, see [Index settings]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#index-settings).
Next, you must add one or more fields of the `knn_vector` data type. This example creates an index with two `knn_vector` fields and uses cosine similarity:
@@ -44,7 +45,7 @@ PUT my-knn-index-1
        "parameters": {
          "ef_construction": 128,
          "m": 24
        }
      }
    },
    "my_vector2": {
@@ -57,7 +58,7 @@ PUT my-knn-index-1
        "parameters": {
          "ef_construction": 256,
          "m": 48
        }
      }
    }
  }
}
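Once documents with vectors are indexed, an approximate search against one of these fields follows the `knn` query shape. In this sketch, the field name comes from the mapping above, while the query vector and `k` value are illustrative; the vector's length must match the field's `dimension`:

```json
GET my-knn-index-1/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector2": {
        "vector": [2.0, 3.0, 5.0, 6.0],
        "k": 2
      }
    }
  }
}
```

Here `k` is the number of neighbors to retrieve from each graph, and `size` controls how many results the query actually returns.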
View File
@@ -4,6 +4,7 @@ title: k-NN
nav_order: 50
has_children: true
has_toc: false
redirect_from: /docs/knn/
---
# k-NN
@@ -20,7 +21,7 @@ This plugin supports three different methods for obtaining the k-nearest neighbo
Approximate k-NN is the best choice for searches over large indices (i.e. hundreds of thousands of vectors or more) that require low latency. You should not use approximate k-NN if you want to apply a filter on the index before the k-NN search, which greatly reduces the number of vectors to be searched. In this case, you should use either the script scoring method or painless extensions.
For more details about this method, see [Approximate k-NN search]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/).
2. **Script Score k-NN**
@@ -28,7 +29,7 @@ This plugin supports three different methods for obtaining the k-nearest neighbo
Use this approach for searches over smaller bodies of documents or when a pre-filter is needed. Using this approach on large indices may lead to high latencies.
For more details about this method, see [Exact k-NN with scoring script]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-score-script/).
3. **Painless extensions**
@@ -36,7 +37,7 @@ This plugin supports three different methods for obtaining the k-nearest neighbo
This approach has slightly slower query performance compared to the k-NN Script Score. If your use case requires more customization over the final score, you should use this approach over Script Score k-NN.
For more details about this method, see [Painless scripting functions]({{site.url}}{{site.baseurl}}/search-plugins/knn/painless-functions/).
Overall, for larger data sets, you should generally choose the approximate nearest neighbor method because it scales significantly better. For smaller data sets, where you may want to apply a filter, you should choose the custom scoring approach. If you have a more complex use case where you need to use a distance function as part of your scoring method, you should use the painless scripting approach.
View File
@@ -4,9 +4,11 @@ title: JNI library
nav_order: 6
parent: k-NN
has_children: false
redirect_from: /docs/knn/jni-library/
---
# JNI library
To integrate [nmslib's](https://github.com/nmslib/nmslib/) approximate k-NN functionality (implemented in C++) into the k-NN plugin (implemented in Java), we created a Java Native Interface library, which lets the k-NN plugin leverage nmslib's functionality. To see how we build the JNI library binary and learn how to get the most out of it in your production environment, see [JNI Library Artifacts](https://github.com/opensearch-project/k-NN#jni-library-artifacts).
For more information about JNI, see [Java Native Interface](https://en.wikipedia.org/wiki/Java_Native_Interface) on Wikipedia.
View File
@@ -4,11 +4,13 @@ title: k-NN Index
nav_order: 1
parent: k-NN
has_children: false
redirect_from: /docs/knn/knn-index/
---
# k-NN Index
## knn_vector data type
The k-NN plugin introduces a custom data type, the `knn_vector`, that allows users to ingest their k-NN vectors into an OpenSearch index.
@@ -23,7 +25,7 @@ into an OpenSearch index.
        "parameters": {
          "ef_construction": 128,
          "m": 24
        }
      }
    }
```
@@ -34,7 +36,7 @@ Mapping parameter | Required | Default | Updateable | Description
`dimension` | true | n/a | false | The vector dimension for the field
`method` | false | null | false | The configuration for the Approximate nearest neighbor method
`method.name` | true, if `method` is specified | n/a | false | The identifier for the nearest neighbor method. Currently, "hnsw" is the only valid method.
`method.space_type` | false | "l2" | false | The vector space used to calculate the distance between vectors. Refer to [here]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn#spaces) to see available spaces.
`method.engine` | false | "nmslib" | false | The approximate k-NN library to use for indexing and search. Currently, "nmslib" is the only valid engine.
`method.parameters` | false | null | false | The parameters used for the nearest neighbor method.
`method.parameters.ef_construction` | false | 512 | false | The size of the dynamic list used during k-NN graph creation. Higher values lead to a more accurate graph, but slower indexing speed. Only valid for "hnsw" method.
@@ -44,15 +46,15 @@ Mapping parameter | Required | Default | Updateable | Description
Additionally, the k-NN plugin introduces several index settings that can be used to configure the k-NN structure as well.
At the moment, several parameters defined in the settings are deprecated. Those parameters should be set in the mapping instead of the index settings. Parameters set in the mapping will override the parameters set in the index settings. Setting the parameters in the mapping allows an index to have multiple `knn_vector` fields with different parameters.
Setting | Default | Updateable | Description
:--- | :--- | :--- | :---
`index.knn` | false | false | Whether the index should build hnsw graphs for the `knn_vector` fields. If set to false, the `knn_vector` fields will be stored in doc values, but Approximate k-NN search functionality will be disabled.
`index.knn.algo_param.ef_search` | 512 | true | The size of the dynamic list used during k-NN searches. Higher values lead to more accurate but slower searches.
`index.knn.algo_param.ef_construction` | 512 | false | (Deprecated in 1.0.0. Use the mapping parameters to set this value instead.) Refer to mapping definition.
`index.knn.algo_param.m` | 16 | false | (Deprecated in 1.0.0. Use the mapping parameters to set this value instead.) Refer to mapping definition.
`index.knn.space_type` | "l2" | false | (Deprecated in 1.0.0. Use the mapping parameters to set this value instead.) Refer to mapping definition.
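Of the settings above, only `ef_search` is updateable on a live index; the rest must be set at index creation. A sketch of adjusting it through the settings API (the index name and value are illustrative):

```json
PUT my-knn-index-1/_settings
{
  "index": {
    "knn.algo_param.ef_search": 256
  }
}
```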
View File
@@ -5,18 +5,20 @@ nav_order: 3
parent: k-NN
has_children: false
has_math: true
redirect_from: /docs/knn/knn-score-script/
---
# Exact k-NN with scoring script
The k-NN plugin implements the OpenSearch score script plugin that you can use to find the exact k-nearest neighbors to a given query point. Using the k-NN score script, you can apply a filter on an index before executing the nearest neighbor search. This is useful for dynamic search cases where the index body may vary based on other conditions.
Because the score script approach executes a brute force search, it doesn't scale as well as the [approximate approach]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn). In some cases, it might be better to think about refactoring your workflow or index structure to use the approximate approach instead of the score script approach.
## Getting started with the score script for vectors
Similar to approximate nearest neighbor search, in order to use the score script on a body of vectors, you must first create an index with one or more `knn_vector` fields.
If you intend to just use the score script approach (and not the approximate approach), you can set `index.knn` to `false` and not set `index.knn.space_type`. You can choose the space type during search. See [spaces](#spaces) for the spaces the k-NN score script supports.
This example creates an index with two `knn_vector` fields:
@@ -101,7 +103,7 @@ All parameters are required.
- `query_value` is the point you want to find the nearest neighbors for. For the Euclidean and cosine similarity spaces, the value must be an array of floats that matches the dimension set in the field's mapping. For Hamming bit distance, this value can be either of type signed long or a base64-encoded string (for the long and binary field types, respectively).
- `space_type` corresponds to the distance function. See the [spaces section](#spaces).
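Putting these parameters together, a score script query takes roughly the following shape. The index name, field, and query vector here are illustrative; the `knn_score` source and `knn` language come from the score script convention described above:

```json
GET my-knn-index-1/_search
{
  "size": 2,
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "knn_score",
        "lang": "knn",
        "params": {
          "field": "my_vector1",
          "query_value": [9.9, 9.9],
          "space_type": "l2"
        }
      }
    }
  }
}
```

Replacing `match_all` with a narrower query is what makes pre-filtering possible with this method.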
The [post filter example in the approximate approach]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#using-approximate-k-nn-with-filters) shows a search that returns fewer than `k` results. If you want to avoid this situation, the score script method lets you essentially invert the order of events. In other words, you can filter down the set of documents over which to execute the k-nearest neighbor search.
This example shows a pre-filter approach to k-NN search with the score script approach. First, create the index:
@@ -312,8 +314,8 @@ A space corresponds to the function used to measure the distance between two poi
</tr>
<tr>
<td>innerproduct</td>
<td>\[ Distance(X, Y) = -{X &middot; Y} = -\sum_{i=1}^n X_i Y_i \]</td>
<td>if (Distance Function >= 0) 1 / (1 + Distance Function) else -Distance Function + 1</td>
</tr>
<tr>
<td>hammingbit</td>
View File
@@ -5,15 +5,16 @@ nav_order: 4
parent: k-NN
has_children: false
has_math: true
redirect_from: /docs/knn/painless-functions/
---
# k-NN Painless Scripting extensions
With the k-NN plugin's Painless Scripting extensions, you can use k-NN distance functions directly in your Painless scripts to perform operations on `knn_vector` fields. Painless has a strict list of allowed functions and classes per context to ensure its scripts are secure. The k-NN plugin adds Painless Scripting extensions to a few of the distance functions used in [k-NN score script]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-score-script), so you can use them to customize your k-NN workload.
## Get started with k-NN's Painless Scripting functions
To use k-NN's Painless Scripting functions, first create an index with `knn_vector` fields like in [k-NN score script]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-score-script#getting-started-with-the-score-script). Once the index is created and you ingest some data, you can use the painless extensions:
```json
GET my-knn-index-2/_search
@@ -48,16 +49,21 @@ GET my-knn-index-2/_search
The following table describes the available painless functions the k-NN plugin provides:
Function name | Function signature | Description
:--- | :--- | :---
l2Squared | `float l2Squared (float[] queryVector, doc['vector field'])` | This function calculates the square of the L2 distance (Euclidean distance) between a given query vector and document vectors. The shorter the distance, the more relevant the document is, so this example inverts the return value of the l2Squared function. If the document vector matches the query vector, the result is 0, so this example also adds 1 to the distance to avoid divide by zero errors.
l1Norm | `float l1Norm (float[] queryVector, doc['vector field'])` | This function calculates the L1 norm (Manhattan distance) between a given query vector and document vectors. The shorter the distance, the more relevant the document is. If the document vector matches the query vector, the result is 0, so you can likewise add 1 to the distance and invert it to avoid divide by zero errors.
cosineSimilarity | `float cosineSimilarity (float[] queryVector, doc['vector field'])` | Cosine similarity is an inner product of the query vector and document vector normalized to both have a length of 1. If the magnitude of the query vector doesn't change throughout the query, you can pass the magnitude of the query vector to improve performance, instead of calculating the magnitude every time for every filtered document:<br /> `float cosineSimilarity (float[] queryVector, doc['vector field'], float normQueryVector)` <br />In general, the range of cosine similarity is [-1, 1]. However, in the case of information retrieval, the cosine similarity of two documents ranges from 0 to 1 because the tf-idf statistic can't be negative. Therefore, the k-NN plugin adds 1.0 in order to always yield a positive cosine similarity score.
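For example, the `cosineSimilarity` function from the table can be used in a `script_score` query like the following sketch. The index name, field, and query vector are illustrative; 1.0 is added to the score to keep it positive, as described above:

```json
GET my-knn-index-2/_search
{
  "size": 2,
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "1.0 + cosineSimilarity(params.query_value, doc[params.field])",
        "params": {
          "field": "my_vector",
          "query_value": [1.5, 2.5]
        }
      }
    }
  }
}
```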
## Constraints
1. If a document's `knn_vector` field has different dimensions than the query, the function throws an `IllegalArgumentException`.
2. If a vector field doesn't have a value, the function throws an `IllegalStateException`.
You can avoid this situation by first checking if a document has a value in its field:

```
"source": "doc[params.field].size() == 0 ? 0 : 1 / (1 + l2Squared(params.query_value, doc[params.field]))",
```

Because scores can only be positive, this script ranks documents with vector fields higher than those without.
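As a sketch, the check above might appear in a full `script_score` request like the following. The index name `my-index`, field name `my_vector`, and query vector are placeholders; substitute your own values:

```json
GET my-index/_search
{
  "size": 10,
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "doc[params.field].size() == 0 ? 0 : 1 / (1 + l2Squared(params.query_value, doc[params.field]))",
        "params": {
          "field": "my_vector",
          "query_value": [1.0, 2.0]
        }
      }
    }
  }
}
```

Documents with no value in `my_vector` score 0, so they sort below every document that has a vector.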

---
layout: default
title: Performance tuning
parent: k-NN
nav_order: 8
redirect_from: /docs/knn/performance-tuning/
---
# Performance tuning
Take the following steps to improve indexing performance, especially when you plan to index a large number of vectors at once:
* **Increase the number of indexing threads**
If the hardware you choose has multiple cores, you can speed up the indexing process by allowing multiple threads for graph construction. Determine the number of threads to allot with the [knn.algo_param.index_thread_qty]({{site.url}}{{site.baseurl}}/search-plugins/knn/settings#cluster-settings) setting.
Keep an eye on CPU utilization and choose the correct number of threads. Because graph construction is costly, having multiple threads can cause additional CPU load.
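For example, the thread count can be updated dynamically with the cluster settings API. The value `2` here is only illustrative; choose a number based on your core count and observed CPU load:

```json
PUT _cluster/settings
{
  "persistent": {
    "knn.algo_param.index_thread_qty": 2
  }
}
```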
Take the following steps to improve search performance:
* **Reduce segment count**
To improve search performance, you must keep the number of segments under control. Lucene's IndexSearcher searches over all of the segments in a shard to find the `size` best results. However, because the complexity of search for the HNSW algorithm is logarithmic with respect to the number of vectors, searching over five graphs with 100 vectors each and then taking the top `size` results from 5 * k results will take longer than searching over one graph with 500 vectors and then taking the top `size` results from k results.
Ideally, having one segment per shard provides the optimal performance with respect to search latency. You can configure an index to have multiple shards to avoid giant shards and achieve more parallelism.
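One way to reduce segment count after indexing completes is to force merge down to a single segment. Force merging is expensive, so as a rule of thumb run it during off-peak hours:

```json
POST my-index/_forcemerge?max_num_segments=1
```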
Recall depends on multiple factors like number of vectors, number of dimensions, segments, and so on. Searching over a large number of small segments and aggregating the results leads to better recall than searching over a small number of large segments and aggregating results. The larger the graph, the more chances of losing recall if you're using smaller algorithm parameters. Choosing larger values for algorithm parameters should help solve this issue but sacrifices search latency and indexing time. That being said, it's important to understand your system's requirements for latency and accuracy, and then choose the number of segments you want your index to have based on experimentation.
To configure recall, adjust the algorithm parameters of the HNSW algorithm exposed through index settings. Algorithm parameters that control recall are `m`, `ef_construction`, and `ef_search`. For more information about how algorithm parameters influence indexing and search recall, see [HNSW algorithm parameters](https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md). Increasing these values can help recall and lead to better search results, but at the cost of higher memory utilization and increased indexing time.
The default recall values work on a broader set of use cases, but make sure to run your own experiments on your data sets and choose the appropriate values. For index-level settings, see [Index settings]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#index-settings).
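As a sketch, these parameters can be supplied as index settings at creation time. The values below are illustrative starting points rather than recommendations; tune them against your own recall and latency measurements:

```json
PUT my-index
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.m": 16,
      "knn.algo_param.ef_search": 512,
      "knn.algo_param.ef_construction": 512
    }
  }
}
```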
## Estimating memory usage
As an example, assume you have a million vectors with a dimension of 256 and M of 16:
```
1.1 * (4 * 256 + 8 * 16) * 1,000,000 ~= 1.26 GB
```
Having a replica doubles the total number of vectors.
{: .note }
## Approximate nearest neighbor versus score script
The standard k-NN query and custom scoring option perform differently. Test with a representative set of documents to see if the search results and latencies match your expectations.
Custom scoring works best if the initial filter reduces the number of documents to no more than 20,000. Increasing shard count can improve latency, but be sure to keep shard size within the [recommended guidelines]({{site.url}}{{site.baseurl}}/opensearch#primary-and-replica-shards).
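Shard count is fixed at index creation, so provision it up front. As an illustration (three shards is purely a placeholder, not a recommendation):

```json
PUT my-index
{
  "settings": {
    "index": {
      "number_of_shards": 3
    }
  }
}
```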

---
layout: default
title: Settings
parent: k-NN
nav_order: 7
redirect_from: /docs/knn/settings/
---

# k-NN settings

---
layout: default
title: Commands
parent: Piped processing language
nav_order: 4
redirect_from: /docs/ppl/commands/
---

---
layout: default
title: Data Types
parent: Piped processing language
nav_order: 6
redirect_from: /docs/ppl/datatypes/
---
In addition to this list, the PPL plugin also supports the `datetime` type, though it doesn't have a corresponding mapping with OpenSearch.
To use a function without a corresponding mapping, you must explicitly convert the data type to one that does.
The PPL plugin supports all SQL date and time types. To learn more, see [SQL Data Types]({{site.url}}{{site.baseurl}}/search-plugins/sql/datatypes/).

---
layout: default
title: Endpoint
parent: Piped processing language
nav_order: 1
redirect_from: /docs/ppl/endpoint/
---

# Endpoint

---
layout: default
title: Functions
parent: Piped processing language
nav_order: 10
redirect_from: /docs/ppl/functions/
---

# Functions

The PPL plugin supports all SQL functions. To learn more, see [SQL Functions]({{site.url}}{{site.baseurl}}/search-plugins/sql/functions/).

---
layout: default
title: Identifiers
parent: Piped processing language
nav_order: 7
redirect_from: /docs/ppl/identifiers/
---

Some files were not shown because too many files have changed in this diff.