add pretrained model description and saving disk description (#5383)
* add description and saving disk
  Signed-off-by: xinyual <xinyual@amazon.com>
* update
  Signed-off-by: xinyual <xinyual@amazon.com>
* update
  Signed-off-by: xinyual <xinyual@amazon.com>
* Update _ml-commons-plugin/pretrained-models.md
  Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
  Signed-off-by: xinyual <74362153+xinyual@users.noreply.github.com>
* Update _query-dsl/specialized/neural-sparse.md
  Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
  Signed-off-by: xinyual <74362153+xinyual@users.noreply.github.com>
* Update _search-plugins/neural-sparse-search.md
  Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
  Signed-off-by: xinyual <74362153+xinyual@users.noreply.github.com>
* Update _search-plugins/neural-sparse-search.md
  Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
  Signed-off-by: xinyual <74362153+xinyual@users.noreply.github.com>
* Update _search-plugins/neural-sparse-search.md
  Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
  Signed-off-by: xinyual <74362153+xinyual@users.noreply.github.com>
* Moved recommended model choice options outside the table
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Add a link for more info
  Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
* Apply suggestions from code review
  Co-authored-by: Nathan Bower <nbower@amazon.com>
  Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <74362153+xinyual@users.noreply.github.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
parent 49889b44f0
commit 242c35db59

_ml-commons-plugin/pretrained-models.md
@@ -80,11 +80,17 @@ The following table provides a list of sentence transformer models and artifact
Sparse encoding models transform text into a sparse vector and convert the vector into a list of `<token: weight>` pairs representing the text entry and its corresponding weight in the sparse vector. You can use these models for use cases such as clustering or sparse neural search.

We recommend the following models for optimal performance:

- Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` model during both ingestion and search.
- Use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` model during ingestion and the `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` model during search.

The following table provides a list of sparse encoding models and artifact links you can use to download them.

| Model name | Auto-truncation | TorchScript artifact | Description |
|---|---|---|---|
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.0/torch_script/opensearch-neural-sparse-encoding-v1-1.0.0-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.0/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.0/torch_script/opensearch-neural-sparse-encoding-doc-v1-1.0.0-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.0/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.0/torch_script/opensearch-neural-sparse-tokenizer-v1-1.0.0.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.0/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's IDF (if the IDF file is not provided, the weight defaults to 1). For more information, see [Uploading your own model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/#uploading-your-own-model). |
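
As a reference for using these artifacts, a pretrained model is registered and then deployed through the ML Commons model APIs. The following is a minimal sketch, assuming the `opensearch-neural-sparse-encoding-v1` model and the 1.0.0 artifact version listed above:

```json
POST /_plugins/_ml/models/_register
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT"
}
```

The register call returns a task ID. Once the task completes and provides a model ID, deploy the model with `POST /_plugins/_ml/models/<model_id>/_deploy` so that it can be referenced from ingest pipelines and queries.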

_query-dsl/specialized/neural-sparse.md
@@ -32,7 +32,7 @@ Field | Data type | Required/Optional | Description

Field | Data type | Required/Optional | Description
:--- | :--- | :--- | :---
`query_text` | String | Required | The query text from which to generate vector embeddings.
`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`max_token_score` | Float | Optional | The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`.
#### Example request
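A representative request built from the fields described above might look like the following. This is a sketch: the index name, the `passage_embedding` rank features field, the query text, and the model ID are placeholder values, and `max_token_score` is set to 2 per the recommendation for the `doc-v1` model.

```json
GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "<sparse encoding or tokenizer model ID>",
        "max_token_score": 2
      }
    }
  }
}
```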

_search-plugins/neural-sparse-search.md
@@ -83,6 +83,38 @@ PUT /my-nlp-index
To save disk space, you can exclude the embedding vector from the source as follows:
```json
PUT /my-nlp-index
{
  "settings": {
    "default_pipeline": "nlp-ingest-pipeline-sparse"
  },
  "mappings": {
    "_source": {
      "excludes": [
        "passage_embedding"
      ]
    },
    "properties": {
      "id": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "rank_features"
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}

Once the `<token, weight>` pairs are excluded from the source, they cannot be recovered. Before applying this optimization, make sure you don't need the `<token, weight>` pairs for your application.
{: .important}
## Step 3: Ingest documents into the index
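
For context, documents are ingested with standard indexing requests; the `nlp-ingest-pipeline-sparse` default pipeline generates the `passage_embedding` field from `passage_text` at ingestion time. The following is a minimal sketch with placeholder field values:

```json
PUT /my-nlp-index/_doc/1
{
  "passage_text": "Hello world",
  "id": "s1"
}
```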