From 2c4356202b187043df8db2556903bd4304373eda Mon Sep 17 00:00:00 2001 From: Mohit Godwani <81609427+mgodwan@users.noreply.github.com> Date: Wed, 21 Feb 2024 02:27:55 +0530 Subject: [PATCH] Add documentation for new bloom filter settings (#6449) * Add documentation for new bloom filter settings Signed-off-by: mgodwan * Address PR comments Signed-off-by: mgodwan * Update _install-and-configure/configuring-opensearch/index-settings.md Co-authored-by: Nathan Bower Signed-off-by: Melissa Vagi * Update _install-and-configure/configuring-opensearch/index-settings.md Co-authored-by: Nathan Bower Signed-off-by: Melissa Vagi * Update index-settings.md Signed-off-by: Melissa Vagi Signed-off-by: Melissa Vagi --------- Signed-off-by: mgodwan Signed-off-by: Melissa Vagi Co-authored-by: Melissa Vagi Co-authored-by: Nathan Bower --- .../configuring-opensearch/index-settings.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/_install-and-configure/configuring-opensearch/index-settings.md b/_install-and-configure/configuring-opensearch/index-settings.md index f88d0602..8f37be48 100644 --- a/_install-and-configure/configuring-opensearch/index-settings.md +++ b/_install-and-configure/configuring-opensearch/index-settings.md @@ -182,6 +182,10 @@ OpenSearch supports the following dynamic index-level index settings: - `index.final_pipeline` (String): The final ingest node pipeline for the index. If the final pipeline is set and the pipeline does not exist, then index requests fail. The pipeline name `_none` specifies that the index does not have an ingest pipeline. +- `index.optimize_doc_id_lookup.fuzzy_set.enabled` (Boolean): This setting controls whether `fuzzy_set` should be enabled in order to optimize document ID lookups in index or search calls by using an additional data structure, in this case, the Bloom filter data structure. Enabling this setting improves performance for upsert and search operations that rely on document ID by creating a new data structure (Bloom filter). The Bloom filter allows for the handling of negative cases (that is, IDs being absent in the existing index) through faster off-heap lookups. Default is `false`. This setting can only be used if the feature flag `opensearch.experimental.optimize_doc_id_lookup.fuzzy_set.enabled` is set to `true`. + +- `index.optimize_doc_id_lookup.fuzzy_set.false_positive_probability` (Double): Sets the false-positive probability for the underlying `fuzzy_set` (that is, the Bloom filter). A lower false-positive probability ensures higher throughput for `UPSERT` and `GET` operations. Allowed values range between `0.01` and `0.50`. Default is `0.20`. This setting can only be used if the feature flag `opensearch.experimental.optimize_doc_id_lookup.fuzzy_set.enabled` is set to `true`. + ### Updating a dynamic index setting You can update a dynamic index setting at any time through the API. For example, to update the refresh interval, use the following request: