Add multimodal search/sparse search/pre- and post-processing function documentation (#5168)

* Add multimodal search documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Text image embedding processor

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add prerequisite

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Change query text

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added bedrock connector tutorial and renamed ML TOC

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Name changes and rewording

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Change connector link

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Change link

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Implemented tech review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Link fix and field name fix

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add default text embedding preprocessing and post-processing functions

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add sparse search documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Fix links

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Pre/post processing function tech review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Fix link

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Sparse search tech review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Implemented doc review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add actual test sparse pipeline response

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added tested examples

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added model choice for sparse search

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Remove Bedrock connector

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Implemented tech review feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add that the model must be deployed to neural search

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Link fix

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add session token to sagemaker blueprint

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Formatted bullet points the same way

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Specified both model types in neural sparse query

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added more explanation for default pre/post-processing functions

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Remove framework and extensibility references

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Minor rewording

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
kolchfa-aws 2023-10-16 10:45:35 -04:00 committed by GitHub
parent da7a701311
commit a97c719591
23 changed files with 1553 additions and 607 deletions


@ -0,0 +1,147 @@
---
layout: default
title: Sparse encoding
parent: Ingest processors
grand_parent: Ingest APIs
nav_order: 240
---
# Sparse encoding
The `sparse_encoding` processor is used to generate a sparse vector (a set of tokens and their weights) from text fields for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) using sparse retrieval.
**PREREQUISITE**<br>
Before using the `sparse_encoding` processor, you must set up a machine learning (ML) model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
{: .note}
The following is the syntax for the `sparse_encoding` processor:
```json
{
"sparse_encoding": {
"model_id": "<model_id>",
"field_map": {
"<input_field>": "<vector_field>"
}
}
}
```
{% include copy-curl.html %}
#### Configuration parameters
The following table lists the required and optional parameters for the `sparse_encoding` processor.
| Name | Data type | Required | Description |
|:---|:---|:---|:---|
`model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a `rank_features` field.
`field_map.<input_field>` | String | Required | The name of the field from which to obtain text for generating vector embeddings.
`field_map.<vector_field>` | String | Required | The name of the vector field in which to store the generated vector embeddings.
`description` | String | Optional | A brief description of the processor. |
`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
## Using the processor
Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
**Step 1: Create a pipeline.**
The following example request creates an ingest pipeline where the text from `passage_text` will be converted into sparse vector embeddings and the embeddings will be stored in `passage_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A sparse encoding ingest pipeline",
"processors": [
{
"sparse_encoding": {
"model_id": "aP2Q8ooBpBj3wT4HVS8a",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
```
{% include copy-curl.html %}
**Step 2 (Optional): Test the pipeline.**
It is recommended that you test your pipeline before you ingest documents.
{: .tip}
To test the pipeline, run the following query:
```json
POST _ingest/pipeline/nlp-ingest-pipeline/_simulate
{
"docs": [
{
"_index": "testindex1",
"_id": "1",
"_source":{
"passage_text": "hello world"
}
}
]
}
```
{% include copy-curl.html %}
#### Response
The response confirms that in addition to the `passage_text` field, the processor has generated sparse vector embeddings in the `passage_embedding` field:
```json
{
"docs" : [
{
"doc" : {
"_index" : "testindex1",
"_id" : "1",
"_source" : {
"passage_embedding" : {
"!" : 0.8708904,
"door" : 0.8587369,
"hi" : 2.3929274,
"worlds" : 2.7839446,
"yes" : 0.75845814,
"##world" : 2.5432441,
"born" : 0.2682308,
"nothing" : 0.8625516,
"goodbye" : 0.17146169,
"greeting" : 0.96817183,
"birth" : 1.2788506,
"come" : 0.1623208,
"global" : 0.4371151,
"it" : 0.42951578,
"life" : 1.5750692,
"thanks" : 0.26481047,
"world" : 4.7300377,
"tiny" : 0.5462298,
"earth" : 2.6555297,
"universe" : 2.0308156,
"worldwide" : 1.3903781,
"hello" : 6.696973,
"so" : 0.20279501,
"?" : 0.67785245
},
"passage_text" : "hello world"
},
"_ingest" : {
"timestamp" : "2023-10-11T22:35:53.654650086Z"
}
}
}
]
}
```
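Once the pipeline returns the expected token--weight pairs, you can attach it to an index so that documents are processed at ingestion time. The following is a minimal sketch, assuming an index named `my-nlp-index` and a `rank_features` mapping for the `passage_embedding` field:
```json
PUT /my-nlp-index
{
  "settings": {
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "rank_features"
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}
Documents indexed into `my-nlp-index` then pass through the `sparse_encoding` processor automatically, so the `passage_embedding` field is populated at ingestion time.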
## Next steps
- To learn how to use the `neural_sparse` query for a sparse search, see [Neural sparse query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/).
- To learn more about sparse neural search, see [Sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/).
- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- For a semantic search tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).


@ -0,0 +1,128 @@
---
layout: default
title: Text embedding
parent: Ingest processors
grand_parent: Ingest APIs
nav_order: 260
---
# Text embedding
The `text_embedding` processor is used to generate vector embeddings from text fields for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/).
**PREREQUISITE**<br>
Before using the `text_embedding` processor, you must set up a machine learning (ML) model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
{: .note}
The following is the syntax for the `text_embedding` processor:
```json
{
"text_embedding": {
"model_id": "<model_id>",
"field_map": {
"<input_field>": "<vector_field>"
}
}
}
```
{% include copy-curl.html %}
#### Configuration parameters
The following table lists the required and optional parameters for the `text_embedding` processor.
| Name | Data type | Required | Description |
|:---|:---|:---|:---|
`model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a vector field.
`field_map.<input_field>` | String | Required | The name of the field from which to obtain text for generating text embeddings.
`field_map.<vector_field>` | String | Required | The name of the vector field in which to store the generated text embeddings.
`description` | String | Optional | A brief description of the processor. |
`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
## Using the processor
Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
**Step 1: Create a pipeline.**
The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text embedding pipeline",
"processors": [
{
"text_embedding": {
"model_id": "bQ1J8ooBpBj3wT4HVUsb",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
```
{% include copy-curl.html %}
**Step 2 (Optional): Test the pipeline.**
It is recommended that you test your pipeline before you ingest documents.
{: .tip}
To test the pipeline, run the following query:
```json
POST _ingest/pipeline/nlp-ingest-pipeline/_simulate
{
"docs": [
{
"_index": "testindex1",
"_id": "1",
"_source":{
"passage_text": "hello world"
}
}
]
}
```
{% include copy-curl.html %}
#### Response
The response confirms that in addition to the `passage_text` field, the processor has generated text embeddings in the `passage_embedding` field:
```json
{
"docs": [
{
"doc": {
"_index": "testindex1",
"_id": "1",
"_source": {
"passage_embedding": [
-0.048237972,
-0.07612712,
0.3262124,
...
-0.16352308
],
"passage_text": "hello world"
},
"_ingest": {
"timestamp": "2023-10-05T15:15:19.691345393Z"
}
}
}
]
}
```
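After verifying the pipeline, you typically create a k-NN index that routes documents through it. The following is a minimal sketch; the index name `my-nlp-index` and the embedding dimension of `768` are assumptions and must match the deployed model:
```json
PUT /my-nlp-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "lucene"
        }
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}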
## Next steps
- To learn how to use the `neural` query for text search, see [Neural query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/).
- To learn more about neural text search, see [Text search]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/).
- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- For a semantic search tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).


@ -0,0 +1,138 @@
---
layout: default
title: Text/image embedding
parent: Ingest processors
grand_parent: Ingest APIs
nav_order: 270
---
# Text/image embedding
The `text_image_embedding` processor is used to generate combined vector embeddings from text and image fields for [multimodal neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-multimodal-search/).
**PREREQUISITE**<br>
Before using the `text_image_embedding` processor, you must set up a machine learning (ML) model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
{: .note}
The following is the syntax for the `text_image_embedding` processor:
```json
{
"text_image_embedding": {
"model_id": "<model_id>",
"embedding": "<vector_field>",
"field_map": {
"text": "<input_text_field>",
"image": "<input_image_field>"
}
}
}
```
{% include copy-curl.html %}
## Parameters
The following table lists the required and optional parameters for the `text_image_embedding` processor.
| Name | Data type | Required | Description |
|:---|:---|:---|:---|
`model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`embedding` | String | Required | The name of the vector field in which to store the generated embeddings. A single embedding is generated for both `text` and `image` fields.
`field_map` | Object | Required | Contains key-value pairs that specify the fields from which to generate embeddings.
`field_map.text` | String | Optional | The name of the field from which to obtain text for generating vector embeddings. You must specify at least one of `text` or `image`.
`field_map.image` | String | Optional | The name of the field from which to obtain the image for generating vector embeddings. You must specify at least one of `text` or `image`.
`description` | String | Optional | A brief description of the processor. |
`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
## Using the processor
Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
**Step 1: Create a pipeline.**
The following example request creates an ingest pipeline where the text from `image_description` and the image from `image_binary` will be converted into vector embeddings and the embeddings will be stored in `vector_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text/image embedding pipeline",
"processors": [
{
"text_image_embedding": {
"model_id": "bQ1J8ooBpBj3wT4HVUsb",
"embedding": "vector_embedding",
"field_map": {
"text": "image_description",
"image": "image_binary"
}
}
}
]
}
```
{% include copy-curl.html %}
You can set up multiple processors in one pipeline to generate embeddings for multiple fields.
{: .note}
**Step 2 (Optional): Test the pipeline.**
It is recommended that you test your pipeline before you ingest documents.
{: .tip}
To test the pipeline, run the following query:
```json
POST _ingest/pipeline/nlp-ingest-pipeline/_simulate
{
"docs": [
{
"_index": "testindex1",
"_id": "1",
"_source":{
"image_description": "Orange table",
"image_binary": "bGlkaHQtd29rfx43..."
}
}
]
}
```
{% include copy-curl.html %}
#### Response
The response confirms that in addition to the `image_description` and `image_binary` fields, the processor has generated vector embeddings in the `vector_embedding` field:
```json
{
"docs": [
{
"doc": {
"_index": "testindex1",
"_id": "1",
"_source": {
"vector_embedding": [
-0.048237972,
-0.07612712,
0.3262124,
...
-0.16352308
],
"image_description": "Orange table",
"image_binary": "bGlkaHQtd29rfx43..."
},
"_ingest": {
"timestamp": "2023-10-05T15:15:19.691345393Z"
}
}
}
]
}
```
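As with the other embedding processors, the pipeline is typically set as the default pipeline of a k-NN index. The following is a minimal sketch; the index name `my-nlp-index` and the embedding dimension of `1024` are assumptions and must match the deployed multimodal model:
```json
PUT /my-nlp-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "vector_embedding": {
        "type": "knn_vector",
        "dimension": 1024,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "lucene"
        }
      },
      "image_description": {
        "type": "text"
      },
      "image_binary": {
        "type": "binary"
      }
    }
  }
}
```
{% include copy-curl.html %}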
## Next steps
- To learn how to use the `neural` query for a multimodal search, see [Neural query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/).
- To learn more about multimodal neural search, see [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/neural-multimodal-search/).
- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- For a semantic search tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).


@ -2,10 +2,10 @@
layout: default
title: Supported Algorithms
has_children: false
nav_order: 100
nav_order: 30
---
# Supported Algorithms
# Supported algorithms
ML Commons supports various algorithms to help train and predict machine learning (ML) models or test data-driven predictions without a model. This page outlines the algorithms supported by the ML Commons plugin and the API operations they support.


@ -2,7 +2,7 @@
layout: default
title: API
has_children: false
nav_order: 99
nav_order: 130
---
# ML Commons API


@ -407,6 +407,6 @@ If your LLM includes a set token limit, set the `size` field in your OpenSearch
## Next steps
- To learn more about ML connectors, see [Creating connectors for third-party ML platforms]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
- To learn more about the OpenSearch ML framework, see [ML framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- To learn more about connecting to models on external platforms, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
- To learn more about using custom models within your OpenSearch cluster, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).


@ -1,12 +1,12 @@
---
layout: default
title: Building blueprints
title: Connector blueprints
has_children: false
nav_order: 65
parent: ML extensibility
parent: Connecting to remote models
---
# Building blueprints
# Connector blueprints
All connectors consist of a JSON blueprint created by machine learning (ML) developers. The blueprint allows administrators and data scientists to make connections between OpenSearch and an AI service or model-serving technology.
@ -41,7 +41,7 @@ POST /_plugins/_ml/connectors/_create
]
}
```
{% include copy-curl.html %}
## Example blueprints
@ -58,7 +58,7 @@ The following configuration options are **required** in order to build a connect
| `version` | Integer | The version of the connector. |
| `protocol` | String | The protocol for the connection. For AWS services such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. |
| `parameters` | JSON object | The default connector parameters, including `endpoint` and `model`. Any parameters indicated in this field can be overridden by parameters specified in a predict request. |
| `credential` | `Map<string, string>` | Defines any credential variables required to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption to encrypt your credentials. When the connection to the cluster first starts, OpenSearch creates a random 32-byte encryption key that persists in OpenSearch's system index. Therefore, you do not need to manually set the encryption key. |
| `credential` | JSON object | Defines any credential variables required in order to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption to encrypt your credentials. When the connection to the cluster first starts, OpenSearch creates a random 32-byte encryption key that persists in OpenSearch's system index. Therefore, you do not need to manually set the encryption key. |
| `actions` | JSON array | Define what actions can run within the connector. If you're an administrator making a connection, add the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/) for your desired connection. |
| `backend_roles` | JSON array | A list of OpenSearch backend roles. For more information about setting up backend roles, see [Assigning backend roles to users]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#assigning-backend-roles-to-users). |
| `access_mode` | String | Sets the access mode for the model, either `public`, `restricted`, or `private`. Default is `private`. For more information about `access_mode`, see [Model groups]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#model-groups). |
@ -73,7 +73,107 @@ The `action` parameter supports the following options.
| `url` | String | Required. Sets the connection endpoint at which the action takes place. This must match the regex expression for the connection used when [adding trusted endpoints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index#adding-trusted-endpoints). |
| `headers` | JSON object | Sets the headers used inside the request or response body. Default is `ContentType: application/json`. If your third-party ML tool requires access control, define the required `credential` parameters in the `headers` parameter. |
| `request_body` | String | Required. Sets the parameters contained inside the request body of the action. The parameters must include `\"inputText\"`, which specifies how users of the connector should construct the request payload for the `action_type`. |
| `pre_process_function` | String | Optional. A built-in or custom Painless script used to preprocess the input data. OpenSearch provides the following built-in preprocess functions that you can call directly:<br> - `connector.pre_process.cohere.embedding` for [Cohere](https://cohere.com/) embedding models<br> - `connector.pre_process.openai.embedding` for [OpenAI](https://openai.com/) embedding models <br> - `connector.pre_process.default.embedding`, which you can use to preprocess documents in neural search requests so that they are in the format that ML Commons can process with the default preprocessor (OpenSearch 2.11 or later). For more information, see [built-in functions](#built-in-pre--and-post-processing-functions). |
| `post_process_function` | String | Optional. A built-in or custom Painless script used to post-process the model output data. OpenSearch provides the following built-in post-process functions that you can call directly:<br> - `connector.post_process.cohere.embedding` for [Cohere text embedding models](https://docs.cohere.com/reference/embed)<br> - `connector.post_process.openai.embedding` for [OpenAI text embedding models](https://platform.openai.com/docs/api-reference/embeddings) <br> - `connector.post_process.default.embedding`, which you can use to post-process documents in the model response so that they are in the format that neural search expects (OpenSearch 2.11 or later). For more information, see [built-in functions](#built-in-pre--and-post-processing-functions). |
## Built-in pre- and post-processing functions
Call the built-in pre- and post-processing functions instead of writing a custom Painless script when connecting to the following text embedding models or your own text embedding models deployed on a remote server (for example, Amazon SageMaker):
- [OpenAI remote models](https://platform.openai.com/docs/api-reference/embeddings)
- [Cohere remote models](https://docs.cohere.com/reference/embed)
OpenSearch provides the following pre- and post-processing functions:
- OpenAI: `connector.pre_process.openai.embedding` and `connector.post_process.openai.embedding`
- Cohere: `connector.pre_process.cohere.embedding` and `connector.post_process.cohere.embedding`
- [Default](#default-pre--and-post-processing-functions) (for neural search): `connector.pre_process.default.embedding` and `connector.post_process.default.embedding`
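For example, rather than writing a custom Painless script, a connector action can reference these functions directly. The following is a minimal sketch of an OpenAI text embedding connector that calls the OpenAI pre- and post-processing functions; the endpoint, model name, and request body are assumptions based on the public OpenAI Embeddings API, so verify them against the connector blueprints in the ML Commons repository before use:
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI embedding connector",
  "description": "Connector for an OpenAI text embedding model",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-ada-002"
  },
  "credential": {
    "openAI_key": "<REPLACE WITH OPENAI KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    }
  ]
}
```
{% include copy-curl.html %}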
### Default pre- and post-processing functions
When you perform vector search using [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/), the neural search request is routed first to ML Commons and then to the model. If the model is one of the [pretrained models provided by OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/), it can parse the ML Commons request and return the response in the format that ML Commons expects. However, for a remote model, the expected format may be different from the ML Commons format. The default pre- and post-processing functions translate between the format that the model expects and the format that neural search expects.
#### Example request
The following example request creates a SageMaker text embedding connector and calls the default post-processing function:
```json
POST /_plugins/_ml/connectors/_create
{
"name": "Sagemaker text embedding connector",
"description": "The connector to Sagemaker",
"version": 1,
"protocol": "aws_sigv4",
"credential": {
"access_key": "<REPLACE WITH SAGEMAKER ACCESS KEY>",
"secret_key": "<REPLACE WITH SAGEMAKER SECRET KEY>",
"session_token": "<REPLACE WITH AWS SECURITY TOKEN>"
},
"parameters": {
"region": "ap-northeast-1",
"service_name": "sagemaker"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "sagemaker.ap-northeast-1.amazonaws.com/endpoints/",
"headers": {
"content-type": "application/json"
},
"post_process_function": "connector.post_process.default.embedding",
"request_body": "${parameters.input}"
}
]
}
```
{% include copy-curl.html %}
The `request_body` template must be `${parameters.input}`.
{: .important}
### Preprocessing function
The `connector.pre_process.default.embedding` default preprocessing function parses the neural search request and transforms it into the format that the model expects as input.
The ML Commons [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/#predict) provides parameters in the following format:
```json
{
"parameters": {
"input": ["hello", "world"]
}
}
```
The default preprocessing function sends the `input` field contents to the model. Thus, the model input format must be a list of strings, for example:
```json
["hello", "world"]
```
### Post-processing function
The `connector.post_process.default.embedding` default post-processing function parses the model response and transforms it into the format that neural search expects as input.
The remote text embedding model output must be a two-dimensional float array, each element of which represents an embedding of a string from the input list. For example, the following two-dimensional array corresponds to the embedding of the list `["hello", "world"]`:
```json
[
[
-0.048237994,
-0.07612697,
...
],
[
0.32621247,
0.02328475,
...
]
]
```
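After post-processing, each embedding is returned to neural search as a model tensor. The following is a sketch of the general shape of one such tensor; the dimension and values are illustrative:
```json
{
  "name": "sentence_embedding",
  "data_type": "FLOAT32",
  "shape": [768],
  "data": [
    -0.048237994,
    -0.07612697,
    ...
  ]
}
```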
## Next step
To see how system administrators and data scientists use blueprints for connectors, see [Creating connectors for third-party ML platforms]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
For examples of creating various connectors, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).


@ -1,42 +1,43 @@
---
layout: default
title: Creating connectors for third-party ML platforms
title: Connectors
has_children: false
has_toc: false
nav_order: 61
parent: ML extensibility
parent: Connecting to remote models
---
# Creating connectors for third-party ML platforms
Machine Learning (ML) connectors provide the ability to integrate OpenSearch ML capabilities with third-party ML tools and platforms. Through connectors, OpenSearch can invoke these third-party endpoints to enrich query results and data pipelines.
Connectors facilitate access to remote models hosted on third-party platforms.
You can provision connectors in two ways:
1. An [external connector](#external-connector), saved in a connector index, which can be reused and shared with multiple remote models but requires access to both the model, the connector inside of OpenSearch, and the third party being accessed by the connector, such as OpenAI or SageMaker.
1. Create a [standalone connector](#standalone-connector): A standalone connector can be reused and shared by multiple remote models but requires access to both the model and connector in OpenSearch and the third-party platform, such as OpenAI or Amazon SageMaker, that the connector is accessing. Standalone connectors are saved in a connector index.
2. A [local connector](#local-connector), saved in the model index, which can only be used with one remote model. Unlike a standalone connector, users only need access to the model itself to access an internal connector because the connection is established inside the model.
2. Create a remote model with an [internal connector](#internal-connector): An internal connector can only be used with the remote model in which it was created. To access an internal connector, you only need access to the model itself because the connection is established inside the model. Internal connectors are saved in the model index.
## Supported connectors
As of OpenSearch 2.9, connectors have been tested for the following ML services, though it is possible to create connectors for other platforms not listed here:
- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) allows you to host and manage the lifecycle of text-embedding models, powering semantic search queries in OpenSearch. When connected, Amazon SageMaker hosts your models and OpenSearch is used to query inferences. This benefits Amazon SageMaker users who value its functionality, such as model monitoring, serverless hosting, and workflow automation for continuous training and deployment.
- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) allows you to host and manage the lifecycle of text embedding models, powering semantic search queries in OpenSearch. When connected, Amazon SageMaker hosts your models and OpenSearch is used to query inferences. This benefits Amazon SageMaker users who value its functionality, such as model monitoring, serverless hosting, and workflow automation for continuous training and deployment.
- [OpenAI ChatGPT](https://openai.com/blog/chatgpt) enables you to invoke an OpenAI chat model from inside an OpenSearch cluster.
- [Cohere](https://cohere.com/) allows you to use data from OpenSearch to power Cohere's large language models.
- [Cohere](https://cohere.com/) allows you to use data from OpenSearch to power the Cohere large language models.
- The [Bedrock Titan Embeddings](https://aws.amazon.com/bedrock/titan/) model can drive semantic search and retrieval-augmented generation in OpenSearch.
All connectors consist of a JSON blueprint created by machine learning (ML) developers. The blueprint allows administrators and data scientists to make connections between OpenSearch and an AI service or model-serving technology.
You can find blueprints for each connector in the [ML Commons repository](https://github.com/opensearch-project/ml-commons/tree/2.x/docs/remote_inference_blueprints).
If you want to build your own blueprint, see [Building blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
## External connector
For more information about blueprint parameters, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
Admins are only required to enter their `credential` settings, such as `"openAI_key"`, for the service they are connecting to. All other parameters are defined within the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
{: .note}
The connector creation API, `/_plugins/_ml/connectors/_create`, creates connections that allow users to deploy and register external models through OpenSearch. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool using its specific API endpoint. For example, to connect to a ChatGPT model, you can connect using `api.openai.com`, as shown in the following example:
## Standalone connector
To create a standalone connector, send a request to the `connectors/_create` endpoint and provide all of the parameters described in [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/):
```json
POST /_plugins/_ml/connectors/_create
@ -67,33 +68,9 @@ POST /_plugins/_ml/connectors/_create
```
{% include copy-curl.html %}
If successful, the connector API responds with the `connector_id` for the connection:
## Internal connector
```json
{
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
With the returned `connector_id` we can register a model that uses that connector:
```json
POST /_plugins/_ml/models/_register
{
"name": "openAI-gpt-3.5-turbo",
"function_name": "remote",
"model_group_id": "lEFGL4kB4ubqQRzegPo2",
"description": "test model",
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
## Local connector
Admins are only required to enter their `credential` settings, such as `"openAI_key"`, for the service they are connecting to. All other parameters are defined within the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
{: .note}
To create an internal connector, add the `connector` parameter to the Register model API, as shown in the following example:
To create an internal connector, provide all of the parameters described in [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/) within the `connector` object of a request to the `models/_register` endpoint:
```json
POST /_plugins/_ml/models/_register
@ -129,192 +106,10 @@ POST /_plugins/_ml/models/_register
]
}
}
}
```
{% include copy-curl.html %}
## Registering and deploying a connected model
After a connection has been created, use the `connector_id` from the response to register and deploy a connected model.
To register a model, you have the following options:
- You can use `model_group_id` to register a model version to an existing model group.
- If you do not use `model_group_id`, ML Commons creates a model with a new model group.
If you want to create a new `model_group`, use the following example:
```json
POST /_plugins/_ml/model_groups/_register
{
"name": "remote_model_group",
"description": "This is an example description"
}
```
ML Commons returns the following response:
```json
{
"model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
"status": "CREATED"
}
```
The following example registers a model named `openAI-gpt-3.5-turbo`:
```json
POST /_plugins/_ml/models/_register
{
"name": "openAI-gpt-3.5-turbo",
"function_name": "remote",
"model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
"description": "test model",
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
ML Commons returns the `task_id` and registration status of the model:
```json
{
"task_id": "cVeMb4kBJ1eYAeTMFFgj",
"status": "CREATED"
}
```
You can use the `task_id` to find the `model_id`, as shown in the following example:
**GET task request**
```json
GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
```
**GET task response**
```json
{
"model_id": "cleMb4kBJ1eYAeTMFFg4",
"task_type": "REGISTER_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"XPcXLV7RQoi5m8NI_jEOVQ"
],
"create_time": 1689793598499,
"last_update_time": 1689793598530,
"is_async": false
}
```
Lastly, use the `model_id` to deploy the model:
**Deploy model request**
```json
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy
```
**Deploy model response**
```json
{
"task_id": "vVePb4kBJ1eYAeTM7ljG",
"status": "CREATED"
}
```
Use the `task_id` from the deploy model response to make sure the model deployment completes:
**Verify deploy completion request**
```json
GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG
```
**Verify deploy completion response**
```json
{
"model_id": "cleMb4kBJ1eYAeTMFFg4",
"task_type": "DEPLOY_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"n-72khvBTBi3bnIIR8FTTw"
],
"create_time": 1689793851077,
"last_update_time": 1689793851101,
"is_async": true
}
```
After a successful deployment, you can test the model using the Predict API set in the connector's `action` settings, as shown in the following example:
```json
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
{
"parameters": {
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}
}
```
The Predict API returns inference results for the connected model, as shown in the following example response:
```json
{
"inference_results": [
{
"output": [
{
"name": "response",
"dataAsMap": {
"id": "chatcmpl-7e6s5DYEutmM677UZokF9eH40dIY7",
"object": "chat.completion",
"created": 1689793889,
"model": "gpt-3.5-turbo-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 19,
"completion_tokens": 9,
"total_tokens": 28
}
}
}
]
}
]
}
```
## Examples
The following example connector requests show how to create a connector with supported third-party tools.
### OpenAI chat connector
## OpenAI chat connector
The following example creates a standalone OpenAI chat connector. The same options can be used for an internal connector under the `connector` parameter:
@ -350,7 +145,7 @@ POST /_plugins/_ml/connectors/_create
After creating the connector, you can retrieve the `task_id` and `connector_id` to register and deploy the model and then use the Predict API, similarly to a standalone connector.
### Amazon SageMaker
## Amazon SageMaker connector
The following example creates a standalone Amazon SageMaker connector. The same options can be used for an internal connector under the `connector` parameter:
@ -395,7 +190,7 @@ The `parameters` section requires the following options when using `aws_sigv4` a
- `region`: The AWS Region in which the AWS instance is located.
- `service_name`: The name of the AWS service for the connector.
### Cohere
## Cohere connector
The following example request creates a standalone Cohere connection:
@ -431,8 +226,5 @@ POST /_plugins/_ml/connectors/_create
## Next steps
- To learn more about using models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- To learn more about model access control and model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).


@ -1,18 +1,19 @@
---
layout: default
title: ML extensibility
title: Connecting to remote models
has_children: true
has_toc: false
nav_order: 60
---
# ML extensibility
# Connecting to remote models
Machine learning (ML) extensibility enables ML developers to create integrations with other ML services, such as Amazon SageMaker or OpenAI. These integrations provide system administrators and data scientists the ability to run ML workloads outside of their OpenSearch cluster.
To get started with ML extensibility, choose from the following options:
- If you're an ML developer wanting to integrate with your specific ML services, see [Building blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
- If you're a system administrator or data scientist wanting to create a connection to an ML service, see [Creating connectors for third-party ML platforms]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
- If you're an ML developer wanting to integrate with your specific ML services, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
- If you're a system administrator or data scientist wanting to create a connection to an ML service, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
## Prerequisites
@ -22,7 +23,7 @@ When access control is enabled on your third-party platform, you can enter your
### Adding trusted endpoints
To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings using the `plugins.ml_commons.trusted_connector_endpoints_regex` setting, which supports Java regex expressions, as shown in the following example:
To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings by using the `plugins.ml_commons.trusted_connector_endpoints_regex` setting, which supports Java regex expressions:
```json
PUT /_cluster/settings
@ -75,7 +76,7 @@ When access control is enabled, you can install the [Security plugin]({{site.url
### Node settings
Remote models based on external connectors consume fewer resources. Therefore, you can deploy any model from a standalone connector using data nodes. To make sure that your standalone connection uses data nodes, set `plugins.ml_commons.only_run_on_ml_node` to `false`, as shown in the following example:
Remote models based on external connectors consume fewer resources. Therefore, you can deploy any model from a standalone connector using data nodes. To make sure that your standalone connection uses data nodes, set `plugins.ml_commons.only_run_on_ml_node` to `false`:
```json
PUT /_cluster/settings
@ -88,9 +89,236 @@ PUT /_cluster/settings
```
{% include copy-curl.html %}
## Step 1: Register a model group
To register a model, you have the following options:
- You can use `model_group_id` to register a model version to an existing model group.
- If you do not use `model_group_id`, ML Commons creates a model with a new model group.
To register a model group, send the following request:
```json
POST /_plugins/_ml/model_groups/_register
{
"name": "remote_model_group",
"description": "A model group for remote models"
}
```
{% include copy-curl.html %}
The response contains the model group ID that you'll use to register a model to this model group:
```json
{
"model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
"status": "CREATED"
}
```
To learn more about model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).
## Step 2: Create a connector
You can create a standalone connector or an internal connector as part of a specific model. For more information about connectors and connector examples, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
The Connectors Create API, `/_plugins/_ml/connectors/_create`, creates connectors that facilitate registering and deploying external models in OpenSearch. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool by using its specific API endpoint. For example, you can connect to a ChatGPT model by using the `api.openai.com` endpoint:
```json
POST /_plugins/_ml/connectors/_create
{
"name": "OpenAI Chat Connector",
"description": "The connector to public OpenAI model service for GPT 3.5",
"version": 1,
"protocol": "http",
"parameters": {
"endpoint": "api.openai.com",
"model": "gpt-3.5-turbo"
},
"credential": {
"openAI_key": "..."
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
}
]
}
```
{% include copy-curl.html %}
The response contains the connector ID for the newly created connector:
```json
{
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
## Step 3: Register a remote model
To register a remote model to the model group created in step 1, provide the model group ID from step 1 and the connector ID from step 2 in the following request:
```json
POST /_plugins/_ml/models/_register
{
"name": "openAI-gpt-3.5-turbo",
"function_name": "remote",
"model_group_id": "1jriBYsBq7EKuKzZX131",
"description": "test model",
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
{% include copy-curl.html %}
OpenSearch returns the task ID of the register operation:
```json
{
"task_id": "cVeMb4kBJ1eYAeTMFFgj",
"status": "CREATED"
}
```
To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/#searching-for-a-task):
```bash
GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
```
{% include copy-curl.html %}
When the operation is complete, the state changes to `COMPLETED`:
```json
{
"model_id": "cleMb4kBJ1eYAeTMFFg4",
"task_type": "REGISTER_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"XPcXLV7RQoi5m8NI_jEOVQ"
],
"create_time": 1689793598499,
"last_update_time": 1689793598530,
"is_async": false
}
```
## Step 4: Deploy the remote model
To deploy the registered model, provide its model ID from step 3 in the following request:
```bash
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy
```
{% include copy-curl.html %}
The response contains the task ID that you can use to check the status of the deploy operation:
```json
{
"task_id": "vVePb4kBJ1eYAeTM7ljG",
"status": "CREATED"
}
```
As in the previous step, check the status of the operation by calling the Tasks API:
```bash
GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG
```
{% include copy-curl.html %}
When the operation is complete, the state changes to `COMPLETED`:
```json
{
"model_id": "cleMb4kBJ1eYAeTMFFg4",
"task_type": "DEPLOY_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"n-72khvBTBi3bnIIR8FTTw"
],
"create_time": 1689793851077,
"last_update_time": 1689793851101,
"is_async": true
}
```
## Step 5: Make predictions
Use the [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/#predict) to make predictions:
```json
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
{
"parameters": {
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}
}
```
{% include copy-curl.html %}
To learn more about chat functionality within OpenAI, see the [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat).
The response contains the inference results provided by the OpenAI model:
```json
{
"inference_results": [
{
"output": [
{
"name": "response",
"dataAsMap": {
"id": "chatcmpl-7e6s5DYEutmM677UZokF9eH40dIY7",
"object": "chat.completion",
"created": 1689793889,
"model": "gpt-3.5-turbo-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 19,
"completion_tokens": 9,
"total_tokens": 28
}
}
}
]
}
]
}
```
## Next steps
- For more information about managing ML models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
- For more information about connectors, including connector examples, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/).
- For more information about connector parameters, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/).
- For more information about managing ML models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
- For more information about interacting with ML models in OpenSearch, see [Managing ML models in OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-dashboard/)


@ -1,7 +1,7 @@
---
layout: default
title: GPU acceleration
parent: ML framework
parent: Using custom models within OpenSearch
nav_order: 150
---
@ -12,7 +12,7 @@ When running a natural language processing (NLP) model in your OpenSearch cluste
## Supported GPUs
Currently, ML nodes following GPU instances:
Currently, ML nodes support the following GPU instances:
- [NVIDIA instances with CUDA 11.6](https://aws.amazon.com/nvidia/)
- [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/)


@ -9,18 +9,18 @@ nav_exclude: true
# ML Commons plugin
ML Commons for OpenSearch eases the development of machine learning features by providing a set of common machine learning (ML) algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitors ML tasks to ensure uptime. This allows you to leverage existing open-source ML algorithms and reduce the effort required to develop new ML features.
ML Commons for OpenSearch simplifies the development of machine learning (ML) features by providing a set of ML algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitor ML tasks to ensure uptime. This allows you to use existing open-source ML algorithms and reduce the effort required to develop new ML features.
Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [`ad`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#ad) and [`kmeans`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#kmeans) Piped Processing Language (PPL) commands.
Models [trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#training-the-model) through the ML Commons plugin support model-based algorithms such as k-means. After you've trained a model enough so that it meets your precision requirements, you can apply the model to [predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict) new data safely.
[Models trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#training-the-model) through the ML Commons plugin support model-based algorithms, such as k-means. After you've trained a model to your precision requirements, use the model to [make predictions]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict).
Should you not want to use a model, you can use the [Train and Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-and-predict) API to test your model without having to evaluate the model's performance.
If you don't want to use a model, you can use the [Train and Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-and-predict) to test your model without having to evaluate the model's performance.
## Using ML Commons
1. Ensure that you've appropriately set the cluster settings described in [Cluster Settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/).
2. Set up model access as described in [Model Access Control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).
1. Ensure that you've appropriately set the cluster settings described in [ML Commons cluster settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/).
2. Set up model access as described in [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).
3. Start using models:
- [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) allows you to run models within OpenSearch.
- [ML Extensibility]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/) allows you to access remote models.
- [Run your custom models within an OpenSearch cluster]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- [Integrate models hosted on an external platform]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/).


@ -66,4 +66,4 @@ A list of nodes gives you a view of each node the model is running on, including
## Next steps
For more information about how to manage ML models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
For more information about how to manage ML models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).


@ -1,6 +1,6 @@
---
layout: default
title: ML framework
title: Using custom models within OpenSearch
has_children: true
nav_order: 50
redirect_from:
@ -11,7 +11,7 @@ ML Framework was taken out of experimental status and released as Generally Avai
{: .note}
# ML Framework
# Using custom models within OpenSearch
ML Commons allows you to serve custom models and use those models to make inferences through the OpenSearch machine learning (ML) framework. If you want to run a PyTorch deep learning model inside an OpenSearch cluster, you can upload and run that model by using the ML Commons REST API.


@ -1,7 +1,7 @@
---
layout: default
title: Pretrained models
parent: ML framework
parent: Using custom models within OpenSearch
nav_order: 120
---
@ -28,7 +28,7 @@ POST /_plugins/_ml/models/_upload
}
```
For more information about how to upload and use ML models, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
For more information about how to upload and use ML models, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
## Supported pretrained models


@ -112,7 +112,7 @@ For this tutorial, you'll use the [DistilBERT](https://huggingface.co/docs/trans
#### Advanced: Using a different model
Alternatively, you can choose to use one of the [pretrained language models provided by OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) or your own custom model. For information about choosing a model, see [Further reading](#further-reading). For instructions on how to set up a custom model, see [ML framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
Alternatively, you can choose to use one of the [pretrained language models provided by OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) or your own custom model. For information about choosing a model, see [Further reading](#further-reading). For instructions on how to set up a custom model, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
Take note of the dimensionality of the model because you'll need it when you set up a k-NN index.
{: .important}
@ -332,7 +332,7 @@ POST /_plugins/_ml/models/_register
}
```
For more information, see [ML framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
### Step 1(d): Deploy the model
@ -602,7 +602,7 @@ GET /my-nlp-index/_doc/1
```
{% include copy-curl.html %}
The response shows the document `_source` containing the original `text` and `id` fields and the added `passage_embeddings` field:
The response includes the document `_source` containing the original `text` and `id` fields and the added `passage_embedding` field:
```json
{


@ -14,12 +14,16 @@ OpenSearch supports the following specialized queries:
- `more_like_this`: Finds documents similar to the provided text, document, or collection of documents.
- [`neural`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/): Used for vector field search in [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/).
- [`neural_sparse`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/): Used for vector field search in [sparse neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
- `percolate`: Finds queries (stored as documents) that match the provided document.
- `rank_feature`: Calculates scores based on the values of numeric features. This query can skip non-competitive hits.
- `script`: Uses a script as a filter.
- `script_score`: Calculates a custom score for matching documents using a script.
- [`script_score`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/script-score/): Calculates a custom score for matching documents using a script.
- `wrapper`: Accepts other queries as JSON or YAML strings.


@ -0,0 +1,53 @@
---
layout: default
title: Neural sparse
parent: Specialized queries
grand_parent: Query DSL
nav_order: 55
---
# Neural sparse query
Introduced 2.11
{: .label .label-purple }
Use the `neural_sparse` query for vector field search in [sparse neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
## Request fields
Include the following request fields in the `neural_sparse` query:
```json
"neural_sparse": {
"<vector_field>": {
"query_text": "<query_text>",
"model_id": "<model_id>",
"max_token_score": "<max_token_score>"
}
}
```
The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other `neural_sparse` query fields.
Field | Data type | Required/Optional | Description
:--- | :--- | :--- | :---
`query_text` | String | Required | The query text from which to generate vector embeddings.
`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`max_token_score` | Float | Optional | The theoretical upper bound of the score for all tokens in the vocabulary, used for performance optimization.
#### Example request
```json
GET my-nlp-index/_search
{
"query": {
"neural_sparse": {
"passage_embedding": {
"query_text": "Hi world",
"model_id": "aP2Q8ooBpBj3wT4HVS8a",
"max_token_score": 2
}
}
}
}
```
{% include copy-curl.html %}

View File

@ -0,0 +1,53 @@
---
layout: default
title: Neural
parent: Specialized queries
grand_parent: Query DSL
nav_order: 50
---
# Neural query
Use the `neural` query for vector field search in [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/).
## Request fields
Include the following request fields in the `neural` query:
```json
"neural": {
"<vector_field>": {
"query_text": "<query_text>",
"query_image": "<image_binary>",
"model_id": "<model_id>",
"k": 100
}
}
```
The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other neural query fields.
Field | Data type | Required/Optional | Description
:--- | :--- | :--- | :---
`query_text` | String | Optional | The query text from which to generate vector embeddings. You must specify at least one of `query_text` or `query_image`.
`query_image` | String | Optional | A base64-encoded string that corresponds to the query image from which to generate vector embeddings. You must specify at least one of `query_text` or `query_image`.
`model_id` | String | Required if the default model ID is not set. For more information, see [Setting a default model on an index or field]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/#setting-a-default-model-on-an-index-or-field). | The ID of the model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`k` | Integer | Optional | The number of results returned by the k-NN search. Default is 10.
#### Example request
```json
GET /my-nlp-index/_search
{
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"query_image": "iVBORw0KGgoAAAAN...",
"k": 100
}
}
}
}
```
{% include copy-curl.html %}

View File

@ -0,0 +1,132 @@
---
layout: default
title: Multimodal search
nav_order: 20
has_children: false
parent: Neural search
---
# Multimodal search
Introduced 2.11
{: .label .label-purple }
Use multimodal search to search text and image data. In neural search, multimodal search is facilitated by multimodal embedding models.
**PREREQUISITE**<br>
Before using multimodal search, you must set up a multimodal embedding model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
{: .note}
## Using multimodal search
To use neural search with text and image embeddings, follow these steps:
1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
## Step 1: Create an ingest pipeline
To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`text_image_embedding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/text-image-embedding/), which will convert the text or image in a document field to vector embeddings. The processor's `field_map` determines the text and image fields from which to generate vector embeddings and the output vector field in which to store the embeddings.
The following example request creates an ingest pipeline where the text from `image_description` and the image from `image_binary` will be converted into vector embeddings and the embeddings will be stored in `vector_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text/image embedding pipeline",
"processors": [
{
"text_image_embedding": {
"model_id": "-fYQAosBQkdnhhBsK593",
"embedding": "vector_embedding",
"field_map": {
"text": "image_description",
"image": "image_binary"
}
}
}
]
}
```
{% include copy-curl.html %}
## Step 2: Create an index for ingestion
In order to use the `text_image_embedding` processor defined in your pipeline, create a k-NN index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as the correct types. Continuing with the example, the `vector_embedding` field must be mapped as a k-NN vector with a dimension that matches the model dimension. Similarly, the `image_description` field should be mapped as `text`, and the `image_binary` field should be mapped as `binary`.
The following example request creates a k-NN index that is set up with a default ingest pipeline:
```json
PUT /my-nlp-index
{
"settings": {
"index.knn": true,
"default_pipeline": "nlp-ingest-pipeline",
"number_of_shards": 2
},
"mappings": {
"properties": {
"vector_embedding": {
"type": "knn_vector",
"dimension": 1024,
"method": {
"name": "hnsw",
"engine": "lucene",
"parameters": {}
}
},
"image_description": {
"type": "text"
},
"image_binary": {
"type": "binary"
}
}
}
}
```
{% include copy-curl.html %}
For more information about creating a k-NN index and its supported methods, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/).
## Step 3: Ingest documents into the index
To ingest documents into the index created in the previous step, send the following request:
```json
PUT /my-nlp-index/_doc/1
{
"image_description": "Orange table",
"image_binary": "iVBORw0KGgoAAAANSUI..."
}
```
{% include copy-curl.html %}
Before the document is ingested into the index, the ingest pipeline runs the `text_image_embedding` processor on the document, generating vector embeddings for the `image_description` and `image_binary` fields. In addition to the original `image_description` and `image_binary` fields, the indexed document includes the `vector_embedding` field, which contains the combined vector embeddings.
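To verify that the embeddings were generated, you can retrieve the indexed document. This is an optional check; the embedding values in the response depend on the model you use:

```json
GET /my-nlp-index/_doc/1
```
{% include copy-curl.html %}

The response contains the original `image_description` and `image_binary` fields along with the generated `vector_embedding` field.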
## Step 4: Search the index using neural search
To perform vector search on your index, use the `neural` query clause either in the [k-NN plugin API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api/#search-model) or [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. You can refine the results by using a [k-NN search filter]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/). You can search by text, image, or both text and image.
The following example request uses a neural query to search for text and image:
```json
GET /my-nlp-index/_search
{
"size": 10,
"query": {
"neural": {
"vector_embedding": {
"query_text": "Orange table",
"query_image": "iVBORw0KGgoAAAANSUI...",
"model_id": "-fYQAosBQkdnhhBsK593",
"k": 5
}
}
}
}
```
{% include copy-curl.html %}
To eliminate passing the model ID with each neural query request, you can set a default model on a k-NN index or a field. To learn more, see [Setting a default model on an index or field]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/#setting-a-default-model-on-an-index-or-field).
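As a minimal sketch, the following requests create such a search pipeline with a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) request processor and set it as the default search pipeline for the index. The model ID shown is the one used in the examples above; replace it with your own deployed model ID:

```json
PUT /_search/pipeline/default_model_pipeline
{
  "request_processors": [
    {
      "neural_query_enricher": {
        "default_model_id": "-fYQAosBQkdnhhBsK593"
      }
    }
  ]
}
```
{% include copy-curl.html %}

```json
PUT /my-nlp-index/_settings
{
  "index.search.default_pipeline": "default_model_pipeline"
}
```
{% include copy-curl.html %}

After this, you can omit `model_id` from the `neural` query.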

View File

@ -2,7 +2,7 @@
layout: default
title: Neural search
nav_order: 200
has_children: false
has_children: true
has_toc: false
redirect_from:
- /neural-search-plugin/index/
@ -10,349 +10,16 @@ redirect_from:
# Neural search
Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time. During ingestion, neural search transforms document text into vector embeddings and indexes both the text and its vector embeddings in a k-NN index. When you use a neural query during search, neural search converts the query text into vector embeddings, uses vector search to compare the query and document embeddings, and returns the closest results.
Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time. During ingestion, neural search transforms document text into vector embeddings and indexes both the text and its vector embeddings in a vector index. When you use a neural query during search, neural search converts the query text into vector embeddings, uses vector search to compare the query and document embeddings, and returns the closest results.
The Neural Search plugin comes bundled with OpenSearch and is generally available as of OpenSearch 2.9. For more information, see [Managing plugins]({{site.url}}{{site.baseurl}}/opensearch/install/plugins#managing-plugins).
Neural search supports the following search types:
## Using neural search
- [Text search]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/): Uses dense retrieval based on text embedding models to search text data.
- [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/neural-multimodal-search/): Uses vision-language embedding models to search text and image data.
- [Sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): Uses sparse retrieval based on sparse embedding models to search text data.
To use neural search, follow these steps:
## Embedding models
1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
Before using neural search, you must set up a machine learning (ML) model. You can use a pretrained model provided by OpenSearch, upload your own model to OpenSearch, or connect to a foundation model hosted on an external platform. For more information about ML models, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [ML Extensibility]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/). For a step-by-step tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
## Step 1: Create an ingest pipeline
To generate vector embeddings for text fields, you need to create a neural search [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/). An ingest pipeline consists of a series of processors that manipulate documents during ingestion, allowing the documents to be vectorized.
### Path and HTTP method
The following API operation creates a neural search ingest pipeline:
```json
PUT _ingest/pipeline/<pipeline_name>
```
### Path parameter
Use `pipeline_name` to create a name for your neural search ingest pipeline.
### Request fields
In the pipeline request body, you must set up a `text_embedding` processor (the only processor supported by neural search), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings:
```json
"text_embedding": {
"model_id": "<model_id>",
"field_map": {
"<input_field>": "<vector_field>"
}
}
```
The following table lists the `text_embedding` processor request fields.
Field | Data type | Description
:--- | :--- | :---
`model_id` | String | The ID of the model that will be used to generate the embeddings. The model must be indexed in OpenSearch before it can be used in neural search. For more information, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
`field_map.<input_field>` | String | The name of the field from which to obtain text for generating text embeddings.
`field_map.<vector_field>` | String | The name of the vector field in which to store the generated text embeddings.
### Example request
The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "An NLP ingest pipeline",
"processors": [
{
"text_embedding": {
"model_id": "bQ1J8ooBpBj3wT4HVUsb",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
```
{% include copy-curl.html %}
## Step 2: Create an index for ingestion
In order to use the text embedding processor defined in your pipelines, create a k-NN index with mapping data that aligns with the maps specified in your pipeline. For example, the `<vector_field>` defined in the `field_map` of your processor must be mapped as a k-NN vector field with a dimension that matches the model dimension. Similarly, the `<input_field>` defined in your processor should be mapped as `text` in your index.
### Example request
The following example request creates a k-NN index that is set up with a default ingest pipeline:
```json
PUT /my-nlp-index
{
"settings": {
"index.knn": true,
"default_pipeline": "nlp-ingest-pipeline"
},
"mappings": {
"properties": {
"id": {
"type": "text"
},
"passage_embedding": {
"type": "knn_vector",
"dimension": 768,
"method": {
"engine": "lucene",
"space_type": "l2",
"name": "hnsw",
"parameters": {}
}
},
"passage_text": {
"type": "text"
}
}
}
}
```
{% include copy-curl.html %}
For more information about creating a k-NN index and the methods it supports, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/).
## Step 3: Ingest documents into the index
To ingest documents into the index created in the previous step, send a POST request for each document:
```json
PUT /my-nlp-index/_doc/1
{
"passage_text": "Hello world",
"id": "s1"
}
```
{% include copy-curl.html %}
```json
PUT /my-nlp-index/_doc/2
{
"passage_text": "Hi planet",
"id": "s2"
}
```
{% include copy-curl.html %}
Before the document is ingested into the index, the ingest pipeline runs the `text_embedding` processor on the document, generating text embeddings for the `passage_text` field. The indexed document contains the `passage_text` field that has the original text and the `passage_embedding` field that has the vector embeddings.
## Step 4: Search the index using neural search
To perform vector search on your index, use the `neural` query clause either in the [k-NN plugin API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api/#search-model) or [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. You can refine the results by using a [k-NN search filter]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/).
### Neural query request fields
Include the following request fields under the `neural` query clause:
```json
"neural": {
"<vector_field>": {
"query_text": "<query_text>",
"model_id": "<model_id>",
"k": 100
}
}
```
The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other neural query fields.
Field | Data type | Description
:--- | :--- | :---
`query_text` | String | The query text from which to generate text embeddings.
`model_id` | String | The ID of the model that will be used to generate text embeddings from the query text. The model must be indexed in OpenSearch before it can be used in neural search.
`k` | Integer | The number of results returned by the k-NN search.
### Example request
The following example request uses a Boolean query to combine a filter clause and two query clauses---a neural query and a `match` query. The `script_score` query assigns custom weights to the query clauses:
```json
GET /my-nlp-index/_search
{
"_source": {
"excludes": [
"passage_embedding"
]
},
"query": {
"bool": {
"filter": {
"wildcard": { "id": "*1" }
},
"should": [
{
"script_score": {
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"model_id": "bQ1J8ooBpBj3wT4HVUsb",
"k": 100
}
}
},
"script": {
"source": "_score * 1.5"
}
}
},
{
"script_score": {
"query": {
"match": {
"passage_text": "Hi world"
}
},
"script": {
"source": "_score * 1.7"
}
}
}
]
}
}
}
```
{% include copy-curl.html %}
The response contains the matching document:
```json
{
"took" : 36,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.2251667,
"hits" : [
{
"_index" : "my-nlp-index",
"_id" : "1",
"_score" : 1.2251667,
"_source" : {
"passage_text" : "Hello world",
"id" : "s1"
}
}
]
}
}
```
### Setting a default model on an index or field
To eliminate passing the model ID with each neural query request, you can set a default model on a k-NN index or a field.
First, create a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) with a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) request processor. To set a default model for an index, provide the model ID in the `default_model_id` parameter. To set a default model for a specific field, provide the field name and the corresponding model ID in the `neural_field_default_id` map. If you provide both `default_model_id` and `neural_field_default_id`, `neural_field_default_id` takes precedence:
```json
PUT /_search/pipeline/default_model_pipeline
{
"request_processors": [
{
"neural_query_enricher" : {
"default_model_id": "bQ1J8ooBpBj3wT4HVUsb",
"neural_field_default_id": {
"my_field_1": "uZj0qYoBMtvQlfhaYeud",
"my_field_2": "upj0qYoBMtvQlfhaZOuM"
}
}
}
]
}
```
{% include copy-curl.html %}
Then set the default model for your index:
```json
PUT /my-nlp-index/_settings
{
"index.search.default_pipeline" : "default_model_pipeline"
}
```
{% include copy-curl.html %}
You can now omit the model ID when searching:
```json
GET /my-nlp-index/_search
{
"_source": {
"excludes": [
"passage_embedding"
]
},
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"k": 100
}
}
}
}
```
{% include copy-curl.html %}
The response contains both documents:
```json
{
"took" : 41,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.22762,
"hits" : [
{
"_index" : "my-nlp-index",
"_id" : "2",
"_score" : 1.22762,
"_source" : {
"passage_text" : "Hi planet",
"id" : "s2"
}
},
{
"_index" : "my-nlp-index",
"_id" : "1",
"_score" : 1.2251667,
"_source" : {
"passage_text" : "Hello world",
"id" : "s1"
}
}
]
}
}
```
Before documents are ingested into an index, they are passed through the ML model, which generates vector embeddings for the document fields. When you send a search request, the query text or image is also passed through the ML model, which generates the corresponding vector embeddings. Then neural search performs a vector search on the embeddings and returns matching documents.

View File

@ -0,0 +1,207 @@
---
layout: default
title: Sparse search
nav_order: 30
has_children: false
parent: Neural search
---
# Sparse search
Introduced 2.11
{: .label .label-purple }
[Neural text search]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/) relies on dense retrieval that is based on text embedding models. However, dense methods use k-NN search, which consumes a large amount of memory and CPU resources. An alternative to neural text search, sparse neural search is implemented using an inverted index and is therefore as efficient as BM25. Sparse search is facilitated by sparse embedding models: text is encoded into a sparse vector (a list of `token: weight` key-value pairs representing a token and its weight), and the data is ingested into a rank features index.
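For example, a sparse vector generated for a short passage might look like the following (the tokens and weights shown are purely illustrative):

```json
{
  "hello": 6.69,
  "world": 4.73,
  "greeting": 0.96
}
```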
When selecting a model, choose one of the following options:
- Use a sparse encoding model at both ingestion time and search time (higher search relevance, relatively high latency).
- Use a sparse encoding model at ingestion time and a tokenizer model at search time (lower search relevance, relatively low latency).
**PREREQUISITE**<br>
Before using sparse search, you must set up a sparse embedding model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
{: .note}
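As a minimal sketch, you can register and deploy one of the OpenSearch-provided pretrained sparse encoding models using the ML Commons model APIs. The model name and version below are assumptions; check the [pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) list for the models available in your OpenSearch version:

```json
POST /_plugins/_ml/models/_register
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}

Registration returns a task ID. Once the task completes, deploy the model by calling `POST /_plugins/_ml/models/<model_id>/_deploy` and use the returned model ID in the ingest pipeline and search requests that follow.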
## Using sparse search
To use sparse search, follow these steps:
1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
## Step 1: Create an ingest pipeline
To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`sparse_encoding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings.
The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline-sparse
{
"description": "An sparse encoding ingest pipeline",
"processors": [
{
"sparse_encoding": {
"model_id": "aP2Q8ooBpBj3wT4HVS8a",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
```
{% include copy-curl.html %}
## Step 2: Create an index for ingestion
In order to use the `sparse_encoding` processor defined in your pipeline, create a rank features index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as the correct types. Continuing with the example, the `passage_embedding` field must be mapped as [`rank_features`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/rank/#rank-features). Similarly, the `passage_text` field should be mapped as `text`.
The following example request creates a rank features index that is set up with a default ingest pipeline:
```json
PUT /my-nlp-index
{
"settings": {
"default_pipeline": "nlp-ingest-pipeline-sparse"
},
"mappings": {
"properties": {
"id": {
"type": "text"
},
"passage_embedding": {
"type": "rank_features"
},
"passage_text": {
"type": "text"
}
}
}
}
```
{% include copy-curl.html %}
## Step 3: Ingest documents into the index
To ingest documents into the index created in the previous step, send the following requests:
```json
PUT /my-nlp-index/_doc/1
{
"passage_text": "Hello world",
"id": "s1"
}
```
{% include copy-curl.html %}
```json
PUT /my-nlp-index/_doc/2
{
"passage_text": "Hi planet",
"id": "s2"
}
```
{% include copy-curl.html %}
Before the document is ingested into the index, the ingest pipeline runs the `sparse_encoding` processor on the document, generating vector embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings.
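To verify the generated sparse embeddings, you can retrieve one of the indexed documents. This is an optional check; the tokens and weights in the response depend on the model:

```json
GET /my-nlp-index/_doc/1
```
{% include copy-curl.html %}

The response contains the original `passage_text` field along with the `passage_embedding` field holding the `token: weight` pairs.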
## Step 4: Search the index using neural search
To perform a sparse vector search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries.
The following example request uses a `neural_sparse` query to search for relevant documents:
```json
GET my-nlp-index/_search
{
"query": {
"neural_sparse": {
"passage_embedding": {
"query_text": "Hi world",
"model_id": "aP2Q8ooBpBj3wT4HVS8a",
"max_token_score": 2
}
}
}
}
```
{% include copy-curl.html %}
The response contains the matching documents:
```json
{
"took" : 688,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 30.0029,
"hits" : [
{
"_index" : "my-nlp-index",
"_id" : "1",
"_score" : 30.0029,
"_source" : {
"passage_text" : "Hello world",
"passage_embedding" : {
"!" : 0.8708904,
"door" : 0.8587369,
"hi" : 2.3929274,
"worlds" : 2.7839446,
"yes" : 0.75845814,
"##world" : 2.5432441,
"born" : 0.2682308,
"nothing" : 0.8625516,
"goodbye" : 0.17146169,
"greeting" : 0.96817183,
"birth" : 1.2788506,
"come" : 0.1623208,
"global" : 0.4371151,
"it" : 0.42951578,
"life" : 1.5750692,
"thanks" : 0.26481047,
"world" : 4.7300377,
"tiny" : 0.5462298,
"earth" : 2.6555297,
"universe" : 2.0308156,
"worldwide" : 1.3903781,
"hello" : 6.696973,
"so" : 0.20279501,
"?" : 0.67785245
},
"id" : "s1"
}
},
{
"_index" : "my-nlp-index",
"_id" : "2",
"_score" : 16.480486,
"_source" : {
"passage_text" : "Hi planet",
"passage_embedding" : {
"hi" : 4.338913,
"planets" : 2.7755864,
"planet" : 5.0969057,
"mars" : 1.7405145,
"earth" : 2.6087382,
"hello" : 3.3210192
},
"id" : "s2"
}
}
]
}
}
```

View File

@ -0,0 +1,297 @@
---
layout: default
title: Text search
nav_order: 10
has_children: false
parent: Neural search
---
# Text search
Use text search for text data. In neural search, text search is facilitated by text embedding models. Text search creates a dense vector (a list of floats) and ingests data into a k-NN index.
**PREREQUISITE**<br>
Before using text search, you must set up a text embedding model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
{: .note}
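As a minimal sketch, you can register and deploy one of the OpenSearch-provided pretrained text embedding models using the ML Commons model APIs. The model name and version below are assumptions; check the [pretrained models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) list for your OpenSearch version and note the model dimension, because the index mapping in the following steps assumes a 768-dimensional model:

```json
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```
{% include copy-curl.html %}

Once the registration task completes, deploy the model by calling `POST /_plugins/_ml/models/<model_id>/_deploy` and use the returned model ID in the ingest pipeline in the next step.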
## Using text search
To use text search, follow these steps:
1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
## Step 1: Create an ingest pipeline
To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`text_embedding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/text-embedding/), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings.
The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`:
```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text embedding pipeline",
"processors": [
{
"text_embedding": {
"model_id": "bQ1J8ooBpBj3wT4HVUsb",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
```
{% include copy-curl.html %}
## Step 2: Create an index for ingestion
In order to use the `text_embedding` processor defined in your pipeline, create a k-NN index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as the correct types. Continuing with the example, the `passage_embedding` field must be mapped as a k-NN vector with a dimension that matches the model dimension. Similarly, the `passage_text` field should be mapped as `text`.
The following example request creates a k-NN index that is set up with a default ingest pipeline:
```json
PUT /my-nlp-index
{
"settings": {
"index.knn": true,
"default_pipeline": "nlp-ingest-pipeline"
},
"mappings": {
"properties": {
"id": {
"type": "text"
},
"passage_embedding": {
"type": "knn_vector",
"dimension": 768,
"method": {
"engine": "lucene",
"space_type": "l2",
"name": "hnsw",
"parameters": {}
}
},
"passage_text": {
"type": "text"
}
}
}
}
```
{% include copy-curl.html %}
For more information about creating a k-NN index and its supported methods, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/).
## Step 3: Ingest documents into the index
To ingest documents into the index created in the previous step, send the following requests:
```json
PUT /my-nlp-index/_doc/1
{
"passage_text": "Hello world",
"id": "s1"
}
```
{% include copy-curl.html %}
```json
PUT /my-nlp-index/_doc/2
{
"passage_text": "Hi planet",
"id": "s2"
}
```
{% include copy-curl.html %}
Before the document is ingested into the index, the ingest pipeline runs the `text_embedding` processor on the document, generating text embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings.
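To verify that the embeddings were generated, you can retrieve one of the indexed documents. This is an optional check; the embedding values depend on the model:

```json
GET /my-nlp-index/_doc/1
```
{% include copy-curl.html %}

The response contains the original `passage_text` and `id` fields along with the `passage_embedding` field holding the vector embedding.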
## Step 4: Search the index using neural search
To perform vector search on your index, use the `neural` query clause either in the [k-NN plugin API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api/#search-model) or [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. You can refine the results by using a [k-NN search filter]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/).
The following example request uses a Boolean query to combine a filter clause and two query clauses---a neural query and a `match` query. The `script_score` query assigns custom weights to the query clauses:
```json
GET /my-nlp-index/_search
{
"_source": {
"excludes": [
"passage_embedding"
]
},
"query": {
"bool": {
"filter": {
"wildcard": { "id": "*1" }
},
"should": [
{
"script_score": {
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"model_id": "bQ1J8ooBpBj3wT4HVUsb",
"k": 100
}
}
},
"script": {
"source": "_score * 1.5"
}
}
},
{
"script_score": {
"query": {
"match": {
"passage_text": "Hi world"
}
},
"script": {
"source": "_score * 1.7"
}
}
}
]
}
}
}
```
{% include copy-curl.html %}
The response contains the matching document:
```json
{
"took" : 36,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.2251667,
"hits" : [
{
"_index" : "my-nlp-index",
"_id" : "1",
"_score" : 1.2251667,
"_source" : {
"passage_text" : "Hello world",
"id" : "s1"
}
}
]
}
}
```
## Setting a default model on an index or field
A [`neural`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/) query requires a model ID for generating vector embeddings. To eliminate passing the model ID with each neural query request, you can set a default model on a k-NN index or a field.
First, create a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) with a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) request processor. To set a default model for an index, provide the model ID in the `default_model_id` parameter. To set a default model for a specific field, provide the field name and the corresponding model ID in the `neural_field_default_id` map. If you provide both `default_model_id` and `neural_field_default_id`, `neural_field_default_id` takes precedence:
```json
PUT /_search/pipeline/default_model_pipeline
{
"request_processors": [
{
"neural_query_enricher" : {
"default_model_id": "bQ1J8ooBpBj3wT4HVUsb",
"neural_field_default_id": {
"my_field_1": "uZj0qYoBMtvQlfhaYeud",
"my_field_2": "upj0qYoBMtvQlfhaZOuM"
}
}
}
]
}
```
{% include copy-curl.html %}
Then set the default model for your index:
```json
PUT /my-nlp-index/_settings
{
"index.search.default_pipeline" : "default_model_pipeline"
}
```
{% include copy-curl.html %}
You can now omit the model ID when searching:
```json
GET /my-nlp-index/_search
{
"_source": {
"excludes": [
"passage_embedding"
]
},
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"k": 100
}
}
}
}
```
{% include copy-curl.html %}
The response contains both documents:
```json
{
"took" : 41,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.22762,
"hits" : [
{
"_index" : "my-nlp-index",
"_id" : "2",
"_score" : 1.22762,
"_source" : {
"passage_text" : "Hi planet",
"id" : "s2"
}
},
{
"_index" : "my-nlp-index",
"_id" : "1",
"_score" : 1.2251667,
"_source" : {
"passage_text" : "Hello world",
"id" : "s1"
}
}
]
}
}
```

View File

@ -9,7 +9,7 @@ grand_parent: Search pipelines
# Neural query enricher processor
The `neural_query_enricher` search request processor is designed to set a default machine learning (ML) model ID at the index or field level for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) queries. To learn more about ML models, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
The `neural_query_enricher` search request processor is designed to set a default machine learning (ML) model ID at the index or field level for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) queries. To learn more about ML models, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
## Request fields