Add cross-encoder model documentation (#6357)

* Add cross-ranking model documentation
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Model id format
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Move to custom models
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Update _search-plugins/search-relevance/reranking-search-results.md
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/custom-local-models.md
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Tech review and doc review comments
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <nbower@amazon.com>
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
* Update _ml-commons-plugin/pretrained-models.md
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
2024-02-16 08:44:29 -05:00 · 2024-02-16 08:44:29 -05:00 · e76ec7c0c7
parent 37ee05d979
commit e76ec7c0c7
3 changed files with 148 additions and 3 deletions
--- a/_ml-commons-plugin/custom-local-models.md
+++ b/_ml-commons-plugin/custom-local-models.md
@@ -315,4 +315,149 @@ The response contains the tokens and weights:

 ## Step 5: Use the model for search

-To learn how to use the model for vector search, see [Set up neural search]({{site.url}}{{site.baseurl}}http://localhost:4000/docs/latest/search-plugins/neural-search/#set-up-neural-search).
+To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
+
+## Cross-encoder models
+
+Cross-encoder models support reranking of search results: the model scores each document against the query text.
+
+To register a cross-encoder model, send a request in the following format. The `model_config` object is optional. For cross-encoder models, specify the `function_name` as `TEXT_SIMILARITY`. For example, the following request registers an `ms-marco-TinyBERT-L-2-v2` model:
+
+```json
+POST /_plugins/_ml/models/_register
+{
+    "name": "ms-marco-TinyBERT-L-2-v2",
+    "version": "1.0.0",
+    "function_name": "TEXT_SIMILARITY",
+    "description": "test model",
+    "model_format": "TORCH_SCRIPT",
+    "model_group_id": "lN4AP40BKolAMNtR4KJ5",
+    "model_content_hash_value": "90e39a926101d1a4e542aade0794319404689b12acfd5d7e65c03d91c668b5cf",
+    "model_config": { 
+        "model_type": "bert",
+        "embedding_dimension": 1,
+        "framework_type": "huggingface_transformers",
+        "total_chunks": 2,
+        "all_config": "{\"total_chunks\":2}"
+    },
+    "url": "https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_similarity/TinyBERT-CE-torch_script.zip?raw=true"
+}
+```
+{% include copy-curl.html %}
+
+Then send a request to deploy the model:
+
+```json
+POST _plugins/_ml/models/<model_id>/_deploy
+```
+{% include copy-curl.html %}
+
+To test a cross-encoder model, send the following request:
+
+```json
+POST _plugins/_ml/models/<model_id>/_predict
+{
+    "query_text": "today is sunny",
+    "text_docs": [
+        "how are you",
+        "today is sunny",
+        "today is july fifth",
+        "it is winter"
+    ]
+}
+```
+{% include copy-curl.html %}
+
+The model calculates the similarity score between `query_text` and each document in `text_docs` and returns one score per document, in the order in which the documents were provided in `text_docs`:
+
+```json
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            -6.077798
+          ],
+          "byte_buffer": {
+            "array": "Un3CwA==",
+            "order": "LITTLE_ENDIAN"
+          }
+        }
+      ]
+    },
+    {
+      "output": [
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            10.223609
+          ],
+          "byte_buffer": {
+            "array": "55MjQQ==",
+            "order": "LITTLE_ENDIAN"
+          }
+        }
+      ]
+    },
+    {
+      "output": [
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            -1.3987057
+          ],
+          "byte_buffer": {
+            "array": "ygizvw==",
+            "order": "LITTLE_ENDIAN"
+          }
+        }
+      ]
+    },
+    {
+      "output": [
+        {
+          "name": "similarity",
+          "data_type": "FLOAT32",
+          "shape": [
+            1
+          ],
+          "data": [
+            -4.5923924
+          ],
+          "byte_buffer": {
+            "array": "4fSSwA==",
+            "order": "LITTLE_ENDIAN"
+          }
+        }
+      ]
+    }
+  ]
+}
+```
+
+A higher document score means higher similarity. In the preceding response, documents are scored as follows against the query text `today is sunny`:
+
+Document text | Score
+:--- | :---
+`how are you` | -6.077798
+`today is sunny` | 10.223609
+`today is july fifth` | -1.3987057
+`it is winter` | -4.5923924
+
+The document that contains the same text as the query receives the highest score, and the remaining documents are scored according to their text similarity to the query.
+
+To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
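
The way the `_predict` response above maps back to `text_docs` can be sketched in Python. This is a minimal, hedged illustration and not part of the commit: the response is an abridged copy of the example above, and the helper names `extract_scores` and `decode_byte_buffer` are hypothetical. Reading `byte_buffer.array` as the Base64-encoded FLOAT32 bytes of the same score is an inference from the sample values, not something the documentation states.

```python
import base64
import struct

# Abridged copy of the example _predict response (only the fields used here).
response = {
    "inference_results": [
        {"output": [{"data": [-6.077798],
                     "byte_buffer": {"array": "Un3CwA==", "order": "LITTLE_ENDIAN"}}]},
        {"output": [{"data": [10.223609],
                     "byte_buffer": {"array": "55MjQQ==", "order": "LITTLE_ENDIAN"}}]},
        {"output": [{"data": [-1.3987057],
                     "byte_buffer": {"array": "ygizvw==", "order": "LITTLE_ENDIAN"}}]},
        {"output": [{"data": [-4.5923924],
                     "byte_buffer": {"array": "4fSSwA==", "order": "LITTLE_ENDIAN"}}]},
    ]
}

# The documents, in the same order as they were sent in the request.
text_docs = ["how are you", "today is sunny", "today is july fifth", "it is winter"]


def extract_scores(resp):
    """Return one similarity score per document, in request order (hypothetical helper)."""
    return [result["output"][0]["data"][0] for result in resp["inference_results"]]


def decode_byte_buffer(buf):
    """Decode a byte_buffer entry, assuming it holds one Base64-encoded float32."""
    fmt = "<f" if buf["order"] == "LITTLE_ENDIAN" else ">f"
    return struct.unpack(fmt, base64.b64decode(buf["array"]))[0]


scores = extract_scores(response)

# Pair each document with its score and sort from most to least similar.
ranked = sorted(zip(text_docs, scores), key=lambda pair: pair[1], reverse=True)

for doc, score in ranked:
    print(f"{score:>12.6f}  {doc}")
```

Sorting by score in descending order puts `today is sunny` first, which matches the table above. In this sample, decoding each `byte_buffer` reproduces the corresponding `data` value to within floating-point precision.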
--- a/_ml-commons-plugin/pretrained-models.md
+++ b/_ml-commons-plugin/pretrained-models.md
@@ -296,4 +296,4 @@ The following table provides a list of sparse encoding models and artifact links
 |---|---|---|---|
 | `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
 | `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
-| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's IDF (if the IDF file is not provided, the weight defaults to 1). For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |
+| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |
--- a/_search-plugins/search-relevance/reranking-search-results.md
+++ b/_search-plugins/search-relevance/reranking-search-results.md
@@ -13,7 +13,7 @@ Introduced 2.12
 You can rerank search results using a cross-encoder reranker in order to improve search relevance. To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores provided by the cross-encoder model. 

 **PREREQUISITE**<br>
-Before using hybrid search, you must set up a cross-encoder model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
+Before configuring a reranking pipeline, you must set up a cross-encoder model. For more information, see [Cross-encoder models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#cross-encoder-models).
 {: .note}

 ## Running a search with reranking