---
layout: default
title: Performance tuning
parent: k-NN search
grand_parent: Search methods
nav_order: 45
---
# Performance tuning
This topic provides recommendations for improving the indexing and search performance of approximate k-NN (ANN). At a high level, k-NN works according to these principles:

* Native library indexes are created per `knn_vector` field / (Lucene) segment pair.
* Queries execute on segments sequentially inside the shard (same as any other OpenSearch query).
* Each native library index in the segment returns <=k neighbors.
* The coordinator node selects the final `size` neighbors from the neighbors returned by each shard.

This topic also provides recommendations for comparing approximate k-NN to exact k-NN with score script.
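
To make the roles of `k` and `size` in the principles above concrete, the following is a minimal sketch of an approximate k-NN query. The index name, the field name `my_vector`, and the vector values are placeholders chosen for illustration, not values taken from this topic:

```json
GET /<index_name>/_search
{
  "size" : 2,
  "query" : {
    "knn" : {
      "my_vector" : {
        "vector" : [2.0, 3.0, 5.0, 6.0],
        "k" : 2
      }
    }
  }
}
```

In this sketch, each native library index contributes up to `k` = 2 candidates, and the response contains the `size` = 2 best hits overall.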
## Indexing performance tuning
Take the following steps to improve indexing performance, especially when you plan to index a large number of vectors at once:
* **Disable the refresh interval**
Either disable the refresh interval (default = 1 sec), or set a long duration for the refresh interval to avoid creating multiple small segments:
```json
PUT /<index_name>/_settings
{
  "index" : {
    "refresh_interval" : "-1"
  }
}
```
**Note**: Make sure to reenable `refresh_interval` after indexing finishes.
* **Disable replicas (no OpenSearch replica shard)**
Set replicas to `0` to prevent duplicate construction of native library indexes in both primary and replica shards. When you enable replicas after indexing finishes, the serialized native library indexes are directly copied. Without replicas, losing a node might cause data loss, so keep the source data somewhere else so that you can retry the initial load if a problem occurs.
* **Increase the number of indexing threads**
If your hardware has multiple cores, you can speed up the indexing process by allowing multiple threads for native library index construction. Set the number of threads with the [knn.algo_param.index_thread_qty]({{site.url}}{{site.baseurl}}/search-plugins/knn/settings#cluster-settings) setting.

Monitor CPU utilization when choosing the number of threads. Because native library index construction is costly, using multiple threads can cause additional CPU load.
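
The replica and thread steps in the list above can be sketched with the following requests. The index name and the thread count of `4` are illustrative assumptions, so adjust them for your hardware. Before a bulk load, you might disable replicas and raise the thread count:

```json
PUT /<index_name>/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}

PUT /_cluster/settings
{
  "persistent" : {
    "knn.algo_param.index_thread_qty" : 4
  }
}
```

After the load finishes, a single settings update can restore the refresh interval and the replicas. The values `1s` and `1` are assumed here; use whatever your index was configured with before the load:

```json
PUT /<index_name>/_settings
{
  "index" : {
    "refresh_interval" : "1s",
    "number_of_replicas" : 1
  }
}
```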
## Search performance tuning
Take the following steps to improve search performance:
* **Reduce segment count**
To improve search performance, keep the number of segments under control. Lucene's IndexSearcher searches all of the segments in a shard to find the `size` best results.

Ideally, having one segment per shard provides the best search latency. You can configure an index to have multiple shards to avoid giant shards and achieve more parallelism.

You can limit the number of segments by choosing a larger refresh interval or, during indexing, by disabling the refresh interval so that OpenSearch creates segments less often.
* **Warm up the index**
Native library indexes are constructed during indexing, but they're loaded into memory during the first search. In Lucene, each segment is searched sequentially (so, for k-NN, each segment returns up to k nearest neighbors of the query point). At the shard level, the top `size` results by score are then selected from all of the results returned by the segments (higher score = better result).

Once a native library index is loaded (native library indexes are loaded outside the OpenSearch JVM), OpenSearch caches it in memory. Initial queries are expensive and take a few seconds, while subsequent queries are faster and take milliseconds (assuming the k-NN circuit breaker isn't hit).

To avoid this latency penalty during your first queries, you can use the warmup API operation on the indexes you want to search:
```json
GET /_plugins/_knn/warmup/index1,index2,index3?pretty

{
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  }
}
```
The warmup API operation loads all native library indexes for all shards (primary and replica) for the specified indexes into the cache, so there's no penalty to load native library indexes during initial searches.
**Note**: This API operation only loads the segments of the indexes it ***sees*** into the cache. If a merge or refresh operation finishes after the API runs, or if you add new documents, you need to rerun the API to load those native library indexes into memory.
* **Avoid reading stored fields**
If your use case is simply to read the IDs and scores of the nearest neighbors, you can disable reading stored fields, which saves time retrieving the vectors from stored fields.
* **Use `mmap` file I/O**
For the Lucene-based approximate k-NN search, there is no dedicated cache layer that speeds up read/write operations. Instead, the plugin relies on the existing caching mechanism in OpenSearch core. In versions 2.4 and earlier of the Lucene-based approximate k-NN search, read/write operations were based on Java NIO by default, which can be slow, depending on the Lucene version and number of segments per shard. Starting with version 2.5, k-NN enables [`mmap`](https://en.wikipedia.org/wiki/Mmap) file I/O by default when the store type is `hybridfs` (the default store type in OpenSearch). This leads to fast file I/O operations and improves the overall performance of both data ingestion and search. The two file extensions specific to vector values that use `mmap` are `.vec` and `.vem`. For more information about these file extensions, see [the Lucene documentation](https://lucene.apache.org/core/9_0_0/core/org/apache/lucene/codecs/lucene90/Lucene90HnswVectorsFormat.html).
The `mmap` file I/O uses the system file cache rather than memory allocated for the Java heap, so no additional allocation is required. To change the default list of extensions set by the plugin, update the `index.store.hybrid.mmap.extensions` setting at the cluster level using the [Cluster Settings API]({{site.url}}{{site.baseurl}}/api-reference/cluster-api/cluster-settings). **Note**: This is an expert-level setting that requires closing the index before updating the setting and reopening it after the update.
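
Two of the steps in the list above can be sketched as requests. Neither request comes from the recommendations themselves: the force merge call is one common way to reduce the segment count after a bulk load, and the search request shows one way to skip stored fields while still returning document IDs. The index name, the field name `my_vector`, and the vector values are placeholders:

```json
POST /<index_name>/_forcemerge?max_num_segments=1

GET /<index_name>/_search
{
  "size" : 5,
  "stored_fields" : "_none_",
  "docvalue_fields" : ["_id"],
  "query" : {
    "knn" : {
      "my_vector" : {
        "vector" : [2.0, 3.0, 5.0, 6.0],
        "k" : 5
      }
    }
  }
}
```

A force merge is itself an expensive operation, so it is usually run once indexing has finished. Because a merge creates new segments, rerun the warmup API operation afterward. Requesting `_id` through `docvalue_fields` is an assumption about how to get IDs back once stored fields are disabled, so verify the response shape on your version.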
## Improving recall
Recall depends on multiple factors, such as the number of vectors, the number of dimensions, and the number of segments. Searching over a large number of small segments and aggregating the results leads to better recall than searching over a small number of large segments and aggregating the results. The larger the native library index, the greater the chance of losing recall if you're using small algorithm parameter values. Choosing larger values for algorithm parameters should help solve this issue but sacrifices search latency and indexing time. It's therefore important to understand your system's requirements for latency and accuracy and then choose the number of segments for your index based on experimentation.

The default parameters work for a broad set of use cases, but make sure to run your own experiments on your data sets and choose the appropriate values. For index-level settings, see [Index settings]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#index-settings).
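
As a rough illustration of "larger algorithm parameters", the following sketch creates an index whose HNSW parameters are raised above typical defaults. The field name `my_vector`, the dimension, the `nmslib` engine choice, and every numeric value are assumptions made for this example and are not tuned recommendations:

```json
PUT /<index_name>
{
  "settings" : {
    "index" : {
      "knn" : true,
      "knn.algo_param.ef_search" : 256
    }
  },
  "mappings" : {
    "properties" : {
      "my_vector" : {
        "type" : "knn_vector",
        "dimension" : 4,
        "method" : {
          "name" : "hnsw",
          "space_type" : "l2",
          "engine" : "nmslib",
          "parameters" : {
            "ef_construction" : 256,
            "m" : 32
          }
        }
      }
    }
  }
}
```

Higher `ef_construction` and `m` values generally trade indexing time and memory for recall, while a higher `ef_search` trades search latency for recall, so measure each change against your own data.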
## Approximate nearest neighbor versus score script
The standard k-NN query and custom scoring option perform differently. Test with a representative set of documents to see if the search results and latencies match your expectations.
Custom scoring works best if the initial filter reduces the number of documents to no more than 20,000. Increasing shard count can improve latency, but be sure to keep shard size within the [recommended guidelines]({{site.url}}{{site.baseurl}}/about/intro/#primary-and-replica-shards).
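
For comparison with the approximate query, the following is a minimal sketch of the custom scoring (score script) approach with a prefilter. The index name, the `color` filter field, the field name `my_vector`, and the vector values are hypothetical:

```json
GET /<index_name>/_search
{
  "size" : 4,
  "query" : {
    "script_score" : {
      "query" : {
        "bool" : {
          "filter" : {
            "term" : {
              "color" : "BLUE"
            }
          }
        }
      },
      "script" : {
        "source" : "knn_score",
        "lang" : "knn",
        "params" : {
          "field" : "my_vector",
          "query_value" : [2.0, 3.0, 5.0, 6.0],
          "space_type" : "l2"
        }
      }
    }
  }
}
```

The script computes an exact distance for every document that passes the filter, which is why the size of the filtered set drives latency.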