Add cross-encoder model documentation (#6357)
* Add cross-ranking model documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Model id format

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Move to custom models

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _search-plugins/search-relevance/reranking-search-results.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _ml-commons-plugin/custom-local-models.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Tech review and doc review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _ml-commons-plugin/pretrained-models.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
parent 37ee05d979
commit e76ec7c0c7

@@ -315,4 +315,149 @@ The response contains the tokens and weights:

## Step 5: Use the model for search

-To learn how to use the model for vector search, see [Set up neural search]({{site.url}}{{site.baseurl}}http://localhost:4000/docs/latest/search-plugins/neural-search/#set-up-neural-search).
+To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).

## Cross-encoder models

Cross-encoder models support query reranking.

To register a cross-encoder model, send a request in the following format. The `model_config` object is optional. For cross-encoder models, specify the `function_name` as `TEXT_SIMILARITY`. For example, the following request registers an `ms-marco-TinyBERT-L-2-v2` model:

```json
POST /_plugins/_ml/models/_register
{
  "name": "ms-marco-TinyBERT-L-2-v2",
  "version": "1.0.0",
  "function_name": "TEXT_SIMILARITY",
  "description": "test model",
  "model_format": "TORCH_SCRIPT",
  "model_group_id": "lN4AP40BKolAMNtR4KJ5",
  "model_content_hash_value": "90e39a926101d1a4e542aade0794319404689b12acfd5d7e65c03d91c668b5cf",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 1,
    "framework_type": "huggingface_transformers",
    "total_chunks":2,
    "all_config": "{\"total_chunks\":2}"
  },
  "url": "https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_similarity/TinyBERT-CE-torch_script.zip?raw=true"
}
```
{% include copy-curl.html %}
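
The `model_content_hash_value` in the preceding request is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. Assuming the value is the SHA-256 hash of the model zip file, a minimal sketch for computing it locally might look like the following. The function name `sha256_of_file` is illustrative and is not part of ML Commons.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex-encoded SHA-256 digest of a file.

    Assumption: the model_content_hash_value field expects the SHA-256
    hash of the model zip file. The file is read in chunks so that
    large model archives do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as model_file:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: model_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file` with the path to a downloaded model zip file returns a 64-character lowercase hexadecimal string that can be compared against the value in the register request.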

Then send a request to deploy the model:

```json
POST _plugins/_ml/models/<model_id>/_deploy
```
{% include copy-curl.html %}

To test a cross-encoder model, send the following request:

```json
POST _plugins/_ml/models/<model_id>/_predict
{
  "query_text": "today is sunny",
  "text_docs": [
    "how are you",
    "today is sunny",
    "today is july fifth",
    "it is winter"
  ]
}
```
{% include copy-curl.html %}

The model calculates the similarity score of `query_text` and each document in `text_docs` and returns a list of scores for each document in the order they were provided in `text_docs`:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -6.077798
          ],
          "byte_buffer": {
            "array": "Un3CwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            10.223609
          ],
          "byte_buffer": {
            "array": "55MjQQ==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -1.3987057
          ],
          "byte_buffer": {
            "array": "ygizvw==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -4.5923924
          ],
          "byte_buffer": {
            "array": "4fSSwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    }
  ]
}
```

A higher document score means higher similarity. In the preceding response, documents are scored as follows against the query text `today is sunny`:

Document text | Score
:--- | :---
`how are you` | -6.077798
`today is sunny` | 10.223609
`today is july fifth` | -1.3987057
`it is winter` | -4.5923924

The document that contains the same text as the query is scored the highest, and the remaining documents are scored based on the text similarity.
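
As an illustration of how a client might consume the scores, the following sketch pairs each document with its score and sorts the pairs from most to least similar. It also decodes a `byte_buffer` entry, assuming that `array` is a Base64-encoded 32-bit float in the byte order named by `order`. The helper names (`extract_scores`, `decode_byte_buffer`, `rank_documents`, `make_result`) are hypothetical and are not part of any OpenSearch client. The sample data is copied from the preceding response.

```python
import base64
import struct


def extract_scores(response):
    """Return one score per entry in inference_results, in response order."""
    return [
        result["output"][0]["data"][0]
        for result in response["inference_results"]
    ]


def decode_byte_buffer(byte_buffer):
    """Decode a byte_buffer object into a float.

    Assumption: "array" holds a Base64-encoded 32-bit float and "order"
    is either LITTLE_ENDIAN or BIG_ENDIAN.
    """
    raw = base64.b64decode(byte_buffer["array"])
    fmt = "<f" if byte_buffer["order"] == "LITTLE_ENDIAN" else ">f"
    return struct.unpack(fmt, raw)[0]


def rank_documents(text_docs, response):
    """Pair each document with its score and sort by descending score."""
    scores = extract_scores(response)
    return sorted(zip(text_docs, scores), key=lambda pair: pair[1], reverse=True)


def make_result(score, array):
    """Build one inference result shaped like the preceding response."""
    return {
        "output": [
            {
                "name": "similarity",
                "data_type": "FLOAT32",
                "shape": [1],
                "data": [score],
                "byte_buffer": {"array": array, "order": "LITTLE_ENDIAN"},
            }
        ]
    }


# Documents and scores copied from the request and response above.
text_docs = ["how are you", "today is sunny", "today is july fifth", "it is winter"]
response = {
    "inference_results": [
        make_result(-6.077798, "Un3CwA=="),
        make_result(10.223609, "55MjQQ=="),
        make_result(-1.3987057, "ygizvw=="),
        make_result(-4.5923924, "4fSSwA=="),
    ]
}

ranked = rank_documents(text_docs, response)
for doc, score in ranked:
    print(f"{score:>12.6f}  {doc}")

# The decoded byte_buffer should agree with the "data" value for the same entry.
first_buffer = response["inference_results"][0]["output"][0]["byte_buffer"]
decoded_first = decode_byte_buffer(first_buffer)
```

The sort relies on the response order matching the order of `text_docs`, as described above. In a search pipeline, the `rerank` processor performs the reordering, so a client-side sort like this is only needed when calling the Predict API directly.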

To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).

@@ -296,4 +296,4 @@ The following table provides a list of sparse encoding models and artifact links

|---|---|---|---|
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
| `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-encoding-doc-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.1/torch_script/config.json) | A neural sparse encoding model. The model transforms text into a sparse vector, identifies the indexes of non-zero elements in the vector, and then converts the vector into `<entry, weight>` pairs, where each entry corresponds to a non-zero element index. |
-| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's IDF (if the IDF file is not provided, the weight defaults to 1). For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |
+| `amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1` | 1.0.1 | Yes | - [model_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/neural-sparse_opensearch-neural-sparse-tokenizer-v1-1.0.1-torch_script.zip)<br>- [config_url](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.1/torch_script/config.json) | A neural sparse tokenizer model. The model tokenizes text into tokens and assigns each token a predefined weight, which is the token's inverse document frequency (IDF). If the IDF file is not provided, the weight defaults to 1. For more information, see [Preparing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#preparing-a-model). |

@@ -13,7 +13,7 @@ Introduced 2.12

You can rerank search results using a cross-encoder reranker in order to improve search relevance. To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores provided by the cross-encoder model.

**PREREQUISITE**<br>
-Before using hybrid search, you must set up a cross-encoder model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
+Before configuring a reranking pipeline, you must set up a cross-encoder model. For more information, see [Cross-encoder models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/#cross-encoder-models).
{: .note}

## Running a search with reranking