---
layout: default
title: Neural Sparse Search tool
has_children: false
has_toc: false
nav_order: 50
parent: Tools
grand_parent: Agents and tools
---

<!-- vale off -->
# Neural Sparse Search tool
**Introduced 2.12**
{: .label .label-purple }
<!-- vale on -->

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/1161).
{: .warning}

The `NeuralSparseSearchTool` performs sparse vector retrieval. For more information about neural sparse search, see [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).

## Step 1: Register and deploy a sparse encoding model

OpenSearch supports several pretrained sparse encoding models. You can either use one of those models or your own custom model. For a list of supported pretrained models, see [Sparse encoding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models). For more information, see [OpenSearch-provided pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) and [Custom local models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/custom-local-models/).

In this example, you'll use the `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1` pretrained model for both ingestion and search. To register and deploy the model to OpenSearch, send the following request:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}

OpenSearch responds with a task ID for the model registration and deployment task:

```json
{
  "task_id": "M_9KY40Bk4MTqirc5lP8",
  "status": "CREATED"
}
```

You can monitor the status of the task by calling the Tasks API:

```json
GET _plugins/_ml/tasks/M_9KY40Bk4MTqirc5lP8
```
{% include copy-curl.html %}

Once the model is registered and deployed, the task `state` changes to `COMPLETED` and OpenSearch returns a model ID for the model:

```json
{
  "model_id": "Nf9KY40Bk4MTqirc6FO7",
  "task_type": "REGISTER_MODEL",
  "function_name": "SPARSE_ENCODING",
  "state": "COMPLETED",
  "worker_node": [
    "UyQSTQ3nTFa3IP6IdFKoug"
  ],
  "create_time": 1706767869692,
  "last_update_time": 1706767935556,
  "is_async": true
}
```

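Because registration and deployment run asynchronously, client code typically polls the Tasks API until the `state` reaches `COMPLETED`. The following is a minimal Python sketch of such a polling loop; the `wait_for_task` and `get_task` names are illustrative, not part of any OpenSearch client, and the HTTP call is abstracted behind a caller-supplied function:

```python
import time

def wait_for_task(get_task, task_id, poll_interval=1.0, timeout=60.0):
    """Poll a task until it reaches a terminal state.

    `get_task` is any callable that takes a task ID and returns the parsed
    JSON body of `GET _plugins/_ml/tasks/<task_id>` as a dict.
    Returns the model ID on success.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        task = get_task(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]  # registration and deployment succeeded
        if state in ("FAILED", "COMPLETED_WITH_ERROR"):
            raise RuntimeError(f"task {task_id} ended in state {state}")
        time.sleep(poll_interval)  # still CREATED/RUNNING; try again
    raise TimeoutError(f"task {task_id} did not complete within {timeout}s")
```

In practice, `get_task` would wrap an HTTP GET against your cluster, for example, `requests.get(f"{host}/_plugins/_ml/tasks/{task_id}").json()`.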
## Step 2: Ingest data into an index

First, you'll set up an ingest pipeline to encode documents using the sparse encoding model set up in the previous step:

```json
PUT /_ingest/pipeline/pipeline-sparse
{
  "description": "A sparse encoding ingest pipeline",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "Nf9KY40Bk4MTqirc6FO7",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

Next, create an index specifying the pipeline as the default pipeline:

```json
PUT index_for_neural_sparse
{
  "settings": {
    "default_pipeline": "pipeline-sparse"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "rank_features"
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}

Last, ingest data into the index by sending a bulk request:

```json
POST _bulk
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "1" } }
{ "passage_text" : "company AAA has a history of 123 years" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "2" } }
{ "passage_text" : "company AAA has over 7000 employees" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "3" } }
{ "passage_text" : "Jack and Mark established company AAA" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "4" } }
{ "passage_text" : "company AAA has a net profit of 13 millions in 2022" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "5" } }
{ "passage_text" : "company AAA focus on the large language models domain" }
```
{% include copy-curl.html %}

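The bulk body above is newline-delimited JSON: each document is an action line followed by a source line, and the body must end with a trailing newline. A minimal Python sketch of assembling such a body programmatically (the `build_bulk_body` helper name is illustrative):

```python
import json

def build_bulk_body(index_name, passages):
    """Build an NDJSON _bulk request body from a list of passage strings.

    Each passage produces two lines: an action line naming the index and
    document ID, then the document source itself.
    """
    lines = []
    for doc_id, text in enumerate(passages, start=1):
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps({"passage_text": text}))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The returned string can be sent as the request body of `POST _bulk` with the `Content-Type: application/x-ndjson` header.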
## Step 3: Register a flow agent that will run the NeuralSparseSearchTool

A flow agent runs a sequence of tools in order and returns the last tool's output. To create a flow agent, send the following request, providing the model ID for the model set up in Step 1. This model will encode your queries into sparse vector embeddings:

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Neural_Sparse_Agent_For_RAG",
  "type": "flow",
  "tools": [
    {
      "type": "NeuralSparseSearchTool",
      "parameters": {
        "description": "use this tool to search data from the knowledge base of company AAA",
        "model_id": "Nf9KY40Bk4MTqirc6FO7",
        "index": "index_for_neural_sparse",
        "embedding_field": "passage_embedding",
        "source_field": ["passage_text"],
        "input": "${parameters.question}",
        "doc_size": 2
      }
    }
  ]
}
```
{% include copy-curl.html %}

For parameter descriptions, see [Register parameters](#register-parameters).

OpenSearch responds with an agent ID:

```json
{
  "agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
```

## Step 4: Run the agent

Run the agent by sending the following request:

```json
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
  "parameters": {
    "question": "how many employees does AAA have?"
  }
}
```
{% include copy-curl.html %}

OpenSearch returns the inference results:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": """{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has over 7000 employees"},"_id":"2","_score":30.586042}
{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has a history of 123 years"},"_id":"1","_score":16.088133}
"""
        }
      ]
    }
  ]
}
```

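Note that the `result` field is a single string containing one JSON object per line rather than a JSON array. A minimal Python sketch of turning it back into a list of hit dictionaries (the `parse_tool_result` name is illustrative):

```python
import json

def parse_tool_result(result):
    """Parse the newline-delimited JSON string from the tool's `result`
    field into a list of hit dictionaries, skipping blank lines."""
    return [json.loads(line) for line in result.splitlines() if line.strip()]
```

Each returned dictionary then exposes the familiar `_index`, `_id`, `_score`, and `_source` keys for downstream use, such as assembling context for a RAG prompt.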
## Register parameters

The following table lists all tool parameters that are available when registering an agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`model_id` | String | Required | The model ID of the sparse encoding model to use at search time.
`index` | String | Required | The index to search.
`embedding_field` | String | Required | When the neural sparse model encodes raw text documents, the encoding result is saved in a field. Specify this field as the `embedding_field`. Neural sparse search matches documents to the query by calculating the similarity score between the query text and the text in the document's `embedding_field`.
`source_field` | String | Required | The document field or fields to return. You can provide a list of multiple fields as an array of strings, for example, `["field1", "field2"]`.
`input` | String | Required for flow agent | Runtime input sourced from flow agent parameters. If using a large language model (LLM), this field is populated with the LLM response.
`name` | String | Optional | The tool name. Useful when an LLM needs to select an appropriate tool for a task.
`description` | String | Optional | A description of the tool. Useful when an LLM needs to select an appropriate tool for a task.
`doc_size` | Integer | Optional | The number of documents to fetch. Default is `2`.

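As a quick sanity check before sending a registration request, you can verify that a tool's `parameters` block contains the four unconditionally required fields from the table above. A minimal sketch, assuming the `validate_tool_parameters` helper name (it is not part of any OpenSearch client):

```python
# The four parameters the table marks as unconditionally required;
# `input` is required only for flow agents, so it is checked separately.
REQUIRED_REGISTER_PARAMS = {"model_id", "index", "embedding_field", "source_field"}

def validate_tool_parameters(parameters):
    """Return the set of required NeuralSparseSearchTool register
    parameters missing from a `parameters` dict (empty set if valid)."""
    return REQUIRED_REGISTER_PARAMS - set(parameters)
```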
## Execute parameters

The following table lists all tool parameters that are available when running the agent.

Parameter | Type | Required/Optional | Description
:--- | :--- | :--- | :---
`question` | String | Required | The natural language question to send to the LLM.