Add cross-encoder model documentation (#6357)
* Add cross-ranking model documentation
* Model id format
* Move to custom models
* Update _search-plugins/search-relevance/reranking-search-results.md
* Update _ml-commons-plugin/custom-local-models.md
* Tech review and doc review comments
* Apply suggestions from code review
* Update _ml-commons-plugin/pretrained-models.md

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
parent 37ee05d979
commit e76ec7c0c7
@@ -315,4 +315,149 @@ The response contains the tokens and weights:

## Step 5: Use the model for search

To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).

## Cross-encoder models

Cross-encoder models support query reranking. Unlike embedding models, which encode the query and each document separately, a cross-encoder takes a query and a document together as input and outputs a similarity score for the pair.

To register a cross-encoder model, send a request in the following format. The `model_config` object is optional. For cross-encoder models, specify the `function_name` as `TEXT_SIMILARITY`. For example, the following request registers an `ms-marco-TinyBERT-L-2-v2` model:

```json
POST /_plugins/_ml/models/_register
{
  "name": "ms-marco-TinyBERT-L-2-v2",
  "version": "1.0.0",
  "function_name": "TEXT_SIMILARITY",
  "description": "test model",
  "model_format": "TORCH_SCRIPT",
  "model_group_id": "lN4AP40BKolAMNtR4KJ5",
  "model_content_hash_value": "90e39a926101d1a4e542aade0794319404689b12acfd5d7e65c03d91c668b5cf",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 1,
    "framework_type": "huggingface_transformers",
    "total_chunks": 2,
    "all_config": "{\"total_chunks\":2}"
  },
  "url": "https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_similarity/TinyBERT-CE-torch_script.zip?raw=true"
}
```
{% include copy-curl.html %}

Then send a request to deploy the model:

```json
POST _plugins/_ml/models/<model_id>/_deploy
```
{% include copy-curl.html %}

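Deployment is asynchronous: the deploy request starts a task and returns a task ID. As a quick status check (a minimal sketch using the standard ML Commons model API; `<model_id>` is the ID returned at registration), you can retrieve the model and confirm that its `model_state` is `DEPLOYED` before running predictions:

```json
GET /_plugins/_ml/models/<model_id>
```
{% include copy-curl.html %}
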
To test a cross-encoder model, send the following request:

```json
POST _plugins/_ml/models/<model_id>/_predict
{
  "query_text": "today is sunny",
  "text_docs": [
    "how are you",
    "today is sunny",
    "today is july fifth",
    "it is winter"
  ]
}
```
{% include copy-curl.html %}

The model calculates the similarity score between `query_text` and each document in `text_docs` and returns a list of scores, one for each document, in the order in which the documents were provided:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -6.077798
          ],
          "byte_buffer": {
            "array": "Un3CwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            10.223609
          ],
          "byte_buffer": {
            "array": "55MjQQ==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -1.3987057
          ],
          "byte_buffer": {
            "array": "ygizvw==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -4.5923924
          ],
          "byte_buffer": {
            "array": "4fSSwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    }
  ]
}
```

A higher document score means higher similarity. In the preceding response, documents are scored as follows against the query text `today is sunny`:

Document text | Score
:--- | :---
`how are you` | -6.077798
`today is sunny` | 10.223609
`today is july fifth` | -1.3987057
`it is winter` | -4.5923924

The document that contains the same text as the query receives the highest score, and the remaining documents are scored according to their similarity to the query text.
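
Each score also appears in the response as raw bytes: the `byte_buffer.array` field is the base64-encoded little-endian byte representation of the same `FLOAT32` value found in `data`. If you consume the raw buffer, a minimal Python sketch (illustrative only, not part of the OpenSearch API) decodes it as follows:

```python
import base64
import struct

def decode_similarity(b64_array: str) -> float:
    """Decode byte_buffer.array: base64 -> 4 little-endian bytes -> float32."""
    raw = base64.b64decode(b64_array)
    return struct.unpack("<f", raw)[0]  # "<f" = little-endian float32

# First score from the preceding response:
print(decode_similarity("Un3CwA=="))  # prints approximately -6.077798
```
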
To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).

@@ -296,4 +296,4 @@ The following table provides a list of sparse encoding models and artifact links

|---|---|---|---|---|
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |
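
As a usage sketch, any model in this table can be registered by name and version using the standard pretrained-model registration request; the example below uses the first encoding model, so substitute the `name` and `version` values from the row you need:

```json
POST /_plugins/_ml/models/_register
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}
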

@@ -13,7 +13,7 @@ Introduced 2.12

You can rerank search results using a cross-encoder reranker to improve search relevance. To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores provided by the cross-encoder model.

**PREREQUISITE**<br>
Before configuring a reranking pipeline, you must set up a cross-encoder model. For more information, see [Cross-encoder models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#cross-encoder-models).
{: .note}
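
For orientation, a reranking pipeline definition looks roughly like the following sketch. The structure follows the `rerank` processor documentation linked above, but the model ID and the `passage_text` document field are placeholders for this example, so adapt them to your index and model:

```json
PUT /_search/pipeline/my_rerank_pipeline
{
  "response_processors": [
    {
      "rerank": {
        "ml_opensearch": {
          "model_id": "<model_id of the cross-encoder>"
        },
        "context": {
          "document_fields": ["passage_text"]
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

At search time, you then apply the pipeline (for example, through the `search_pipeline` query parameter) and supply the text to rerank against in the request, as described in the following section.
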
## Running a search with reranking