Change name of normalization processor in text (#5611)
* Change name of normalization processor in text Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> * Fix link Signed-off-by: Fanit Kolchina <kolchfa@amazon.com> --------- Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
This commit is contained in:
parent
6625a06958
commit
bb269db70b
|
@ -12,7 +12,7 @@ You can use a hybrid query to combine relevance scores from multiple queries int
|
|||
|
||||
## Example
|
||||
|
||||
Before using a `hybrid` query, you must configure a search pipeline with a [`normalization_processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/) (see [this example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor#example)).
|
||||
Before using a `hybrid` query, you must configure a search pipeline with a [`normalization-processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/) (see [this example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor#example)).
|
||||
|
||||
To try out the example, follow the [Semantic search tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search#tutorial).
|
||||
|
||||
|
|
|
@ -9,7 +9,7 @@ grand_parent: Search pipelines
|
|||
|
||||
# Normalization processor
|
||||
|
||||
The `normalization_processor` is a search phase results processor that runs between the query and fetch phases of search execution. It intercepts the query phase results and then normalizes and combines the document scores from different query clauses before passing the documents to the fetch phase.
|
||||
The `normalization-processor` is a search phase results processor that runs between the query and fetch phases of search execution. It intercepts the query phase results and then normalizes and combines the document scores from different query clauses before passing the documents to the fetch phase.
|
||||
|
||||
## Score normalization and combination
|
||||
|
||||
|
@ -21,7 +21,7 @@ OpenSearch supports two search types: `query_then_fetch` and `dfs_query_then_fet
|
|||
|
||||
![Normalization processor flow diagram]({{site.url}}{{site.baseurl}}/images/normalization-processor.png)
|
||||
|
||||
When you send a search request to a node, the node becomes a _coordinating node_. During the first phase of search, the _query phase_, the coordinating node routes the search request to all shards in the index, including primary and replica shards. Each shard then runs the search query locally and returns metadata about the matching documents, which includes their document IDs and relevance scores. The `normalization_processor` then normalizes and combines scores from different query clauses. The coordinating node merges and sorts the local lists of results, compiling a global list of top documents that match the query. After that, search execution enters a _fetch phase_, in which the coordinating node requests the documents in the global list from the shards where they reside. Each shard returns the documents' `_source` to the coordinating node. Finally, the coordinating node sends a search response containing the results back to you.
|
||||
When you send a search request to a node, the node becomes a _coordinating node_. During the first phase of search, the _query phase_, the coordinating node routes the search request to all shards in the index, including primary and replica shards. Each shard then runs the search query locally and returns metadata about the matching documents, which includes their document IDs and relevance scores. The `normalization-processor` then normalizes and combines scores from different query clauses. The coordinating node merges and sorts the local lists of results, compiling a global list of top documents that match the query. After that, search execution enters a _fetch phase_, in which the coordinating node requests the documents in the global list from the shards where they reside. Each shard returns the documents' `_source` to the coordinating node. Finally, the coordinating node sends a search response containing the results back to you.
|
||||
|
||||
## Request fields
|
||||
|
||||
|
@ -38,11 +38,11 @@ Field | Data type | Description
|
|||
|
||||
## Example
|
||||
|
||||
The following example demonstrates using a search pipeline with a `normalization_processor`. To try out this example, follow the [Semantic search tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search#tutorial).
|
||||
The following example demonstrates using a search pipeline with a `normalization-processor`. To try out this example, follow the [Semantic search tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search#tutorial).
|
||||
|
||||
### Creating a search pipeline
|
||||
|
||||
The following request creates a search pipeline containing a `normalization_processor` that uses the `min_max` normalization technique and the `arithmetic_mean` combination technique:
|
||||
The following request creates a search pipeline containing a `normalization-processor` that uses the `min_max` normalization technique and the `arithmetic_mean` combination technique:
|
||||
|
||||
```json
|
||||
PUT /_search/pipeline/nlp-search-pipeline
|
||||
|
@ -114,4 +114,4 @@ For more information, see [Hybrid query]({{site.url}}{{site.baseurl}}/query-dsl/
|
|||
|
||||
To improve search relevance, we recommend increasing the sample size.
|
||||
|
||||
If the hybrid query does not return some expected results, it may be because the subqueries return too few documents. The `normalization_processor` only transforms the results returned by each subquery; it does not perform any additional sampling. During our experiments, we used [nDCG@10](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) to measure quality of information retrieval depending on the number of documents returned (the size). We have found that a size in the [100, 200] range works best for datasets of up to 10M documents. We do not recommend increasing the size beyond the recommended values because higher size values do not improve search relevance but increase search latency.
|
||||
If the hybrid query does not return some expected results, it may be because the subqueries return too few documents. The `normalization-processor` only transforms the results returned by each subquery; it does not perform any additional sampling. During our experiments, we used [nDCG@10](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) to measure quality of information retrieval depending on the number of documents returned (the size). We have found that a size in the [100, 200] range works best for datasets of up to 10M documents. We do not recommend increasing the size beyond the recommended values because higher size values do not improve search relevance but increase search latency.
|
||||
|
|
|
@ -46,7 +46,7 @@ The following table lists all supported search request processors.
|
|||
|
||||
Processor | Description | Earliest available version
|
||||
:--- | :--- | :---
|
||||
[`normalization_processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/script-processor/) | Intercepts the query phase results and normalizes and combines the document scores before passing the documents to the fetch phase. | 2.10
|
||||
[`normalization-processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/) | Intercepts the query phase results and normalizes and combines the document scores before passing the documents to the fetch phase. | 2.10
|
||||
|
||||
## Viewing available processor types
|
||||
|
||||
|
|
Loading…
Reference in New Issue