Deprecate max_token_score in neural sparse search (#6554)

* deprecated max_token_score

Signed-off-by: zhichao-aws <zhichaog@amazon.com>

* Update _query-dsl/specialized/neural-sparse.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: zhichao-aws <zhichaog@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
zhichao-aws 2024-03-01 22:30:03 +08:00 committed by GitHub
parent ee2b67f354
commit 5f486abf83
2 changed files with 4 additions and 7 deletions


@@ -20,8 +20,7 @@ Include the following request fields in the `neural_sparse` query:
"neural_sparse": {
"<vector_field>": {
"query_text": "<query_text>",
-"model_id": "<model_id>",
-"max_token_score": "<max_token_score>"
+"model_id": "<model_id>"
}
}
```
@@ -32,7 +31,7 @@ Field | Data type | Required/Optional | Description
:--- | :--- | :---
`query_text` | String | Required | The query text from which to generate vector embeddings.
`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
-`max_token_score` | Float | Optional | The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`.
+`max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12.
#### Example request
@@ -43,8 +42,7 @@ GET my-nlp-index/_search
"neural_sparse": {
"passage_embedding": {
"query_text": "Hi world",
-"model_id": "aP2Q8ooBpBj3wT4HVS8a",
-"max_token_score": 2
+"model_id": "aP2Q8ooBpBj3wT4HVS8a"
}
}
}


@@ -154,8 +154,7 @@ GET my-nlp-index/_search
"neural_sparse": {
"passage_embedding": {
"query_text": "Hi world",
-"model_id": "aP2Q8ooBpBj3wT4HVS8a",
-"max_token_score": 2
+"model_id": "aP2Q8ooBpBj3wT4HVS8a"
}
}
}
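
For readers updating client code to match this change, the following Python sketch shows one way to build the request body so that the deprecated `max_token_score` field is left out unless a caller explicitly passes it. The helper name `build_neural_sparse_query` is hypothetical and not part of any OpenSearch client. The index field and model ID are the placeholder values from the example request, and the body is assumed to sit under the standard top-level `query` key of a search request.

```python
import json
from typing import Optional


def build_neural_sparse_query(
    vector_field: str,
    query_text: str,
    model_id: str,
    max_token_score: Optional[float] = None,
) -> dict:
    """Build a search body containing a neural_sparse query.

    Hypothetical helper for illustration only. max_token_score is
    deprecated as of OpenSearch 2.12, so it is emitted only when a
    caller still passes it explicitly.
    """
    clause = {"query_text": query_text, "model_id": model_id}
    if max_token_score is not None:
        # Legacy path: keep the field only for callers that still set it.
        clause["max_token_score"] = max_token_score
    return {"query": {"neural_sparse": {vector_field: clause}}}


# Values taken from the example request in the diff.
body = build_neural_sparse_query(
    "passage_embedding", "Hi world", "aP2Q8ooBpBj3wT4HVS8a"
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of `GET my-nlp-index/_search` with whichever HTTP or OpenSearch client is already in use.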