diff --git a/_api-reference/ingest-apis/processors/sparse-encoding.md b/_api-reference/ingest-apis/processors/sparse-encoding.md new file mode 100644 index 00000000..7c1eda36 --- /dev/null +++ b/_api-reference/ingest-apis/processors/sparse-encoding.md @@ -0,0 +1,147 @@ +--- +layout: default +title: Sparse encoding +parent: Ingest processors +grand_parent: Ingest APIs +nav_order: 240 +--- + +# Sparse encoding + +The `sparse_encoding` processor is used to generate a sparse vector/token and weights from text fields for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) using sparse retrieval. + +**PREREQUISITE**
+Before using the `sparse_encoding` processor, you must set up a machine learning (ML) model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +{: .note} + +The following is the syntax for the `sparse_encoding` processor: + +```json +{ + "sparse_encoding": { + "model_id": "", + "field_map": { + "": "" + } + } +} +``` +{% include copy-curl.html %} + +#### Configuration parameters + +The following table lists the required and optional parameters for the `sparse_encoding` processor. + +| Name | Data type | Required | Description | +|:---|:---|:---|:---| +`model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +`field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a `rank_features` field. +`field_map.` | String | Required | The name of the field from which to obtain text for generating vector embeddings. +`field_map.` | String | Required | The name of the vector field in which to store the generated vector embeddings. +`description` | String | Optional | A brief description of the processor. | +`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. | + +## Using the processor + +Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). + +**Step 1: Create a pipeline.** + +The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`: + +```json +PUT /_ingest/pipeline/nlp-ingest-pipeline +{ + "description": "A sparse encoding ingest pipeline", + "processors": [ + { + "sparse_encoding": { + "model_id": "aP2Q8ooBpBj3wT4HVS8a", + "field_map": { + "passage_text": "passage_embedding" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +**Step 2 (Optional): Test the pipeline.** + +It is recommended that you test your pipeline before you ingest documents. +{: .tip} + +To test the pipeline, run the following query: + +```json +POST _ingest/pipeline/nlp-ingest-pipeline/_simulate +{ + "docs": [ + { + "_index": "testindex1", + "_id": "1", + "_source":{ + "passage_text": "hello world" + } + } + ] +} +``` +{% include copy-curl.html %} + +#### Response + +The response confirms that in addition to the `passage_text` field, the processor has generated text embeddings in the `passage_embedding` field: + +```json +{ + "docs" : [ + { + "doc" : { + "_index" : "testindex1", + "_id" : "1", + "_source" : { + "passage_embedding" : { + "!" 
: 0.8708904, + "door" : 0.8587369, + "hi" : 2.3929274, + "worlds" : 2.7839446, + "yes" : 0.75845814, + "##world" : 2.5432441, + "born" : 0.2682308, + "nothing" : 0.8625516, + "goodbye" : 0.17146169, + "greeting" : 0.96817183, + "birth" : 1.2788506, + "come" : 0.1623208, + "global" : 0.4371151, + "it" : 0.42951578, + "life" : 1.5750692, + "thanks" : 0.26481047, + "world" : 4.7300377, + "tiny" : 0.5462298, + "earth" : 2.6555297, + "universe" : 2.0308156, + "worldwide" : 1.3903781, + "hello" : 6.696973, + "so" : 0.20279501, + "?" : 0.67785245 + }, + "passage_text" : "hello world" + }, + "_ingest" : { + "timestamp" : "2023-10-11T22:35:53.654650086Z" + } + } + } + ] +} +``` + +## Next steps + +- To learn how to use the `neural_sparse` query for a sparse search, see [Neural sparse query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/). +- To learn more about sparse neural search, see [Sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/). +- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +- For a semantic search tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). diff --git a/_api-reference/ingest-apis/processors/text-embedding.md b/_api-reference/ingest-apis/processors/text-embedding.md new file mode 100644 index 00000000..e81bbb82 --- /dev/null +++ b/_api-reference/ingest-apis/processors/text-embedding.md @@ -0,0 +1,128 @@ +--- +layout: default +title: Text embedding +parent: Ingest processors +grand_parent: Ingest APIs +nav_order: 260 +--- + +# Text embedding + +The `text_embedding` processor is used to generate vector embeddings from text fields for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/). + +**PREREQUISITE**
+Before using the `text_embedding` processor, you must set up a machine learning (ML) model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +{: .note} + +The following is the syntax for the `text_embedding` processor: + +```json +{ + "text_embedding": { + "model_id": "", + "field_map": { + "": "" + } + } +} +``` +{% include copy-curl.html %} + +#### Configuration parameters + +The following table lists the required and optional parameters for the `text_embedding` processor. + +| Name | Data type | Required | Description | +|:---|:---|:---|:---| +`model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +`field_map` | Object | Required | Contains key-value pairs that specify the mapping of a text field to a vector field. +`field_map.` | String | Required | The name of the field from which to obtain text for generating text embeddings. +`field_map.` | String | Required | The name of the vector field in which to store the generated text embeddings. +`description` | String | Optional | A brief description of the processor. | +`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. | + +## Using the processor + +Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). + +**Step 1: Create a pipeline.** + +The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`: + +```json +PUT /_ingest/pipeline/nlp-ingest-pipeline +{ + "description": "A text embedding pipeline", + "processors": [ + { + "text_embedding": { + "model_id": "bQ1J8ooBpBj3wT4HVUsb", + "field_map": { + "passage_text": "passage_embedding" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +**Step 2 (Optional): Test the pipeline.** + +It is recommended that you test your pipeline before you ingest documents. +{: .tip} + +To test the pipeline, run the following query: + +```json +POST _ingest/pipeline/nlp-ingest-pipeline/_simulate +{ + "docs": [ + { + "_index": "testindex1", + "_id": "1", + "_source":{ + "passage_text": "hello world" + } + } + ] +} +``` +{% include copy-curl.html %} + +#### Response + +The response confirms that in addition to the `passage_text` field, the processor has generated text embeddings in the `passage_embedding` field: + +```json +{ + "docs": [ + { + "doc": { + "_index": "testindex1", + "_id": "1", + "_source": { + "passage_embedding": [ + -0.048237972, + -0.07612712, + 0.3262124, + ... + -0.16352308 + ], + "passage_text": "hello world" + }, + "_ingest": { + "timestamp": "2023-10-05T15:15:19.691345393Z" + } + } + } + ] +} +``` + +## Next steps + +- To learn how to use the `neural` query for text search, see [Neural query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/). 
+- To learn more about neural text search, see [Text search]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/). +- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +- For a semantic search tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). \ No newline at end of file diff --git a/_api-reference/ingest-apis/processors/text-image-embedding.md b/_api-reference/ingest-apis/processors/text-image-embedding.md new file mode 100644 index 00000000..57ce7bf3 --- /dev/null +++ b/_api-reference/ingest-apis/processors/text-image-embedding.md @@ -0,0 +1,138 @@ +--- +layout: default +title: Text/image embedding +parent: Ingest processors +grand_parent: Ingest APIs +nav_order: 270 +--- + +# Text/image embedding + +The `text_image_embedding` processor is used to generate combined vector embeddings from text and image fields for [multimodal neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-multimodal-search/). + +**PREREQUISITE**
+Before using the `text_image_embedding` processor, you must set up a machine learning (ML) model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +{: .note} + +The following is the syntax for the `text_image_embedding` processor: + +```json +{ + "text_image_embedding": { + "model_id": "", + "embedding": "", + "field_map": { + "text": "", + "image": "" + } + } +} +``` +{% include copy-curl.html %} + +## Parameters + +The following table lists the required and optional parameters for the `text_image_embedding` processor. + +| Name | Data type | Required | Description | +|:---|:---|:---|:---| +`model_id` | String | Required | The ID of the model that will be used to generate the embeddings. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +`embedding` | String | Required | The name of the vector field in which to store the generated embeddings. A single embedding is generated for both `text` and `image` fields. +`field_map` | Object | Required | Contains key-value pairs that specify the fields from which to generate embeddings. +`field_map.text` | String | Optional | The name of the field from which to obtain text for generating vector embeddings. You must specify at least one `text` or `image`. +`field_map.image` | String | Optional | The name of the field from which to obtain the image for generating vector embeddings. You must specify at least one `text` or `image`. +`description` | String | Optional | A brief description of the processor. | +`tag` | String | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. | + +## Using the processor + +Follow these steps to use the processor in a pipeline. You must provide a model ID when creating the processor. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). + +**Step 1: Create a pipeline.** + +The following example request creates an ingest pipeline where the text from `image_description` and the image from `image_binary` will be converted into vector embeddings and the embeddings will be stored in `vector_embedding`: + +```json +PUT /_ingest/pipeline/nlp-ingest-pipeline +{ + "description": "A text/image embedding pipeline", + "processors": [ + { + "text_image_embedding": { + "model_id": "bQ1J8ooBpBj3wT4HVUsb", + "embedding": "vector_embedding", + "field_map": { + "text": "image_description", + "image": "image_binary" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +You can set up multiple processors in one pipeline to generate embeddings for multiple fields. +{: .note} + +**Step 2 (Optional): Test the pipeline.** + +It is recommended that you test your pipeline before you ingest documents. +{: .tip} + +To test the pipeline, run the following query: + +```json +POST _ingest/pipeline/nlp-ingest-pipeline/_simulate +{ + "docs": [ + { + "_index": "testindex1", + "_id": "1", + "_source":{ + "image_description": "Orange table", + "image_binary": "bGlkaHQtd29rfx43..." 
+ } + } + ] +} +``` +{% include copy-curl.html %} + +#### Response + +The response confirms that in addition to the `image_description` and `image_binary` fields, the processor has generated vector embeddings in the `vector_embedding` field: + +```json +{ + "docs": [ + { + "doc": { + "_index": "testindex1", + "_id": "1", + "_source": { + "vector_embedding": [ + -0.048237972, + -0.07612712, + 0.3262124, + ... + -0.16352308 + ], + "image_description": "Orange table", + "image_binary": "bGlkaHQtd29rfx43..." + }, + "_ingest": { + "timestamp": "2023-10-05T15:15:19.691345393Z" + } + } + } + ] +} +``` + +## Next steps + +- To learn how to use the `neural` query for a multimodal search, see [Neural query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/). +- To learn more about multimodal neural search, see [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/neural-multimodal-search/). +- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +- For a semantic search tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). \ No newline at end of file diff --git a/_ml-commons-plugin/algorithms.md b/_ml-commons-plugin/algorithms.md index 1db8b432..25844b35 100644 --- a/_ml-commons-plugin/algorithms.md +++ b/_ml-commons-plugin/algorithms.md @@ -2,10 +2,10 @@ layout: default title: Supported Algorithms has_children: false -nav_order: 100 +nav_order: 30 --- -# Supported Algorithms +# Supported algorithms ML Commons supports various algorithms to help train and predict machine learning (ML) models or test data-driven predictions without a model. This page outlines the algorithms supported by the ML Commons plugin and the API operations they support. diff --git a/_ml-commons-plugin/api.md b/_ml-commons-plugin/api.md index 055b66d1..6578c7e9 100644 --- a/_ml-commons-plugin/api.md +++ b/_ml-commons-plugin/api.md @@ -2,7 +2,7 @@ layout: default title: API has_children: false -nav_order: 99 +nav_order: 130 --- # ML Commons API diff --git a/_ml-commons-plugin/conversational-search.md b/_ml-commons-plugin/conversational-search.md index 3017ffb3..676f19ac 100644 --- a/_ml-commons-plugin/conversational-search.md +++ b/_ml-commons-plugin/conversational-search.md @@ -407,6 +407,6 @@ If your LLM includes a set token limit, set the `size` field in your OpenSearch ## Next steps -- To learn more about ML connectors, see [Creating connectors for third-party ML platforms]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). -- To learn more about the OpenSearch ML framework, see [ML framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +- To learn more about connecting to models on external platforms, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). +- To learn more about using custom models within your OpenSearch cluster, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). 
diff --git a/_ml-commons-plugin/extensibility/blueprints.md b/_ml-commons-plugin/extensibility/blueprints.md index d5de6086..a1b0f19a 100644 --- a/_ml-commons-plugin/extensibility/blueprints.md +++ b/_ml-commons-plugin/extensibility/blueprints.md @@ -1,12 +1,12 @@ --- layout: default -title: Building blueprints +title: Connector blueprints has_children: false nav_order: 65 -parent: ML extensibility +parent: Connecting to remote models --- -# Building blueprints +# Connector blueprints All connectors consist of a JSON blueprint created by machine learning (ML) developers. The blueprint allows administrators and data scientists to make connections between OpenSearch and an AI service or model-serving technology. @@ -41,7 +41,7 @@ POST /_plugins/_ml/connectors/_create ] } ``` - +{% include copy-curl.html %} ## Example blueprints @@ -58,7 +58,7 @@ The following configuration options are **required** in order to build a connect | `version` | Integer | The version of the connector. | | `protocol` | String | The protocol for the connection. For AWS services such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. | | `parameters` | JSON object | The default connector parameters, including `endpoint` and `model`. Any parameters indicated in this field can be overridden by parameters specified in a predict request. | -| `credential` | `Map` | Defines any credential variables required to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption to encrypt your credentials. When the connection to the cluster first starts, OpenSearch creates a random 32-byte encryption key that persists in OpenSearch's system index. Therefore, you do not need to manually set the encryption key. | +| `credential` | JSON object | Defines any credential variables required in order to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption to encrypt your credentials. When the connection to the cluster first starts, OpenSearch creates a random 32-byte encryption key that persists in OpenSearch's system index. Therefore, you do not need to manually set the encryption key. | | `actions` | JSON array | Define what actions can run within the connector. If you're an administrator making a connection, add the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/) for your desired connection. | | `backend_roles` | JSON array | A list of OpenSearch backend roles. For more information about setting up backend roles, see [Assigning backend roles to users]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#assigning-backend-roles-to-users). | | `access_mode` | String | Sets the access mode for the model, either `public`, `restricted`, or `private`. Default is `private`. For more information about `access_mode`, see [Model groups]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#model-groups). | @@ -73,7 +73,107 @@ The `action` parameter supports the following options. | `url` | String | Required. Sets the connection endpoint at which the action takes place. This must match the regex expression for the connection used when [adding trusted endpoints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index#adding-trusted-endpoints). | | `headers` | JSON object | Sets the headers used inside the request or response body. Default is `ContentType: application/json`. 
If your third-party ML tool requires access control, define the required `credential` parameters in the `headers` parameter. | +| `request_body` | String | Required. Sets the parameters contained inside the request body of the action. The parameters must include `\"inputText\"`, which specifies how users of the connector should construct the request payload for the `action_type`. | +| `pre_process_function` | String | Optional. A built-in or custom Painless script used to preprocess the input data. OpenSearch provides the following built-in preprocess functions that you can call directly:<br>
- `connector.pre_process.cohere.embedding` for [Cohere](https://cohere.com/) embedding models
- `connector.pre_process.openai.embedding` for [OpenAI](https://openai.com/) embedding models
- `connector.pre_process.default.embedding`, which you can use to preprocess documents in neural search requests so that they are in the format that ML Commons can process with the default preprocessor (OpenSearch 2.11 or later). For more information, see [built-in functions](#built-in-pre--and-post-processing-functions). | +| `post_process_function` | String | Optional. A built-in or custom Painless script used to post-process the model output data. OpenSearch provides the following built-in post-process functions that you can call directly:
- `connector.post_process.cohere.embedding` for [Cohere text embedding models](https://docs.cohere.com/reference/embed)<br>
- `connector.post_process.openai.embedding` for [OpenAI text embedding models](https://platform.openai.com/docs/api-reference/embeddings)<br>
- `connector.post_process.default.embedding`, which you can use to post-process documents in the model response so that they are in the format that neural search expects (OpenSearch 2.11 or later). For more information, see [built-in functions](#built-in-pre--and-post-processing-functions). | + +## Built-in pre- and post-processing functions + +Call the built-in pre- and post-processing functions instead of writing a custom Painless script when connecting to the following text embedding models or your own text embedding models deployed on a remote server (for example, Amazon SageMaker): + +- [OpenAI remote models](https://platform.openai.com/docs/api-reference/embeddings) +- [Cohere remote models](https://docs.cohere.com/reference/embed) + +OpenSearch provides the following pre- and post-processing functions: + +- OpenAI: `connector.pre_process.openai.embedding` and `connector.post_process.openai.embedding` +- Cohere: `connector.pre_process.cohere.embedding` and `connector.post_process.cohere.embedding` +- [Default](#default-pre--and-post-processing-functions) (for neural search): `connector.pre_process.default.embedding` and `connector.post_process.default.embedding` + +### Default pre- and post-processing functions + +When you perform vector search using [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/), the neural search request is routed first to ML Commons and then to the model. If the model is one of the [pretrained models provided by OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/), it can parse the ML Commons request and return the response in the format that ML Commons expects. However, for a remote model, the expected format may be different from the ML Commons format. The default pre- and post-processing functions translate between the format that the model expects and the format that neural search expects. + +#### Example request + +The following example request creates a SageMaker text embedding connector and calls the default post-processing function: + +```json +POST /_plugins/_ml/connectors/_create +{ + "name": "Sagemaker text embedding connector", + "description": "The connector to Sagemaker", + "version": 1, + "protocol": "aws_sigv4", + "credential": { + "access_key": "", + "secret_key": "", + "session_token": "" + }, + "parameters": { + "region": "ap-northeast-1", + "service_name": "sagemaker" + }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "sagemaker.ap-northeast-1.amazonaws.com/endpoints/", + "headers": { + "content-type": "application/json" + }, + "post_process_function": "connector.post_process.default.embedding", + "request_body": "${parameters.input}" + } + ] +} +``` +{% include copy-curl.html %} + +The `request_body` template must be `${parameters.input}``. +{: .important} + +### Preprocessing function + +The `connector.pre_process.default.embedding` default preprocessing function parses the neural search request and transforms it into the format that the model expects as input. + +The ML Commons [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/#predict) provides parameters in the following format: + +```json +{ + "parameters": { + "input": ["hello", "world"] + } +} +``` + +The default preprocessing function sends the `input` field contents to the model. 
Thus, the model input format must be a list of strings, for example: + +```json +["hello", "world"] +``` + +### Post-processing function + +The `connector.post_process.default.embedding` default post-processing function parses the model response and transforms it into the format that neural search expects as input. + +The remote text embedding model output must be a two-dimensional float array, each element of which represents an embedding of a string from the input list. For example, the following two-dimensional array corresponds to the embedding of the list `["hello", "world"]`: + +```json +[ + [ + -0.048237994, + -0.07612697, + ... + ], + [ + 0.32621247, + 0.02328475, + ... + ] +] +``` + ## Next step -To see how system administrators and data scientists use blueprints for connectors, see [Creating connectors for third-party ML platforms]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). \ No newline at end of file +For examples of creating various connectors, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). \ No newline at end of file diff --git a/_ml-commons-plugin/extensibility/connectors.md b/_ml-commons-plugin/extensibility/connectors.md index d22f05eb..640120c1 100644 --- a/_ml-commons-plugin/extensibility/connectors.md +++ b/_ml-commons-plugin/extensibility/connectors.md @@ -1,42 +1,43 @@ --- layout: default -title: Creating connectors for third-party ML platforms +title: Connectors has_children: false +has_toc: false nav_order: 61 -parent: ML extensibility +parent: Connecting to remote models --- # Creating connectors for third-party ML platforms -Machine Learning (ML) connectors provide the ability to integrate OpenSearch ML capabilities with third-party ML tools and platforms. Through connectors, OpenSearch can invoke these third-party endpoints to enrich query results and data pipelines. +Connectors facilitate access to remote models hosted on third-party platforms. You can provision connectors in two ways: -1. An [external connector](#external-connector), saved in a connector index, which can be reused and shared with multiple remote models but requires access to both the model, the connector inside of OpenSearch, and the third party being accessed by the connector, such as OpenAI or SageMaker. +1. Create a [standalone connector](#standalone-connector): A standalone connector can be reused and shared by multiple remote models but requires access to both the model and connector in OpenSearch and the third-party platform, such as OpenAI or Amazon SageMaker, that the connector is accessing. Standalone connectors are saved in a connector index. -2. A [local connector](#local-connector), saved in the model index, which can only be used with one remote model. Unlike a standalone connector, users only need access to the model itself to access an internal connector because the connection is established inside the model. +2. Create a remote model with an [internal connector](#internal-connector): An internal connector can only be used with the remote model in which it was created. To access an internal connector, you only need access to the model itself because the connection is established inside the model. Internal connectors are saved in the model index. 
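+
+At a glance, a standalone connector is created by calling the Connectors API and is then referenced by its connector ID when you register a model, whereas an internal connector is defined inline in the model registration request. The following abbreviated sketches illustrate the difference (connector fields are elided with `...`, and the names and IDs are examples reused from this page); complete examples of both forms are provided later on this page, and the full register-and-deploy workflow is described in [Connecting to remote models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/):
+
+```json
+POST /_plugins/_ml/connectors/_create
+{
+  "name": "OpenAI Chat Connector",
+  "protocol": "http",
+  ...
+}
+
+POST /_plugins/_ml/models/_register
+{
+  "name": "openAI-gpt-3.5-turbo",
+  "function_name": "remote",
+  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
+}
+```
+
+```json
+POST /_plugins/_ml/models/_register
+{
+  "name": "openAI-gpt-3.5-turbo",
+  "function_name": "remote",
+  "connector": {
+    "name": "OpenAI Chat Connector",
+    "protocol": "http",
+    ...
+  }
+}
+```
+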
## Supported connectors As of OpenSearch 2.9, connectors have been tested for the following ML services, though it is possible to create connectors for other platforms not listed here: -- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) allows you to host and manage the lifecycle of text-embedding models, powering semantic search queries in OpenSearch. When connected, Amazon SageMaker hosts your models and OpenSearch is used to query inferences. This benefits Amazon SageMaker users who value its functionality, such as model monitoring, serverless hosting, and workflow automation for continuous training and deployment. +- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) allows you to host and manage the lifecycle of text embedding models, powering semantic search queries in OpenSearch. When connected, Amazon SageMaker hosts your models and OpenSearch is used to query inferences. This benefits Amazon SageMaker users who value its functionality, such as model monitoring, serverless hosting, and workflow automation for continuous training and deployment. - [OpenAI ChatGPT](https://openai.com/blog/chatgpt) enables you to invoke an OpenAI chat model from inside an OpenSearch cluster. -- [Cohere](https://cohere.com/) allows you to use data from OpenSearch to power Cohere's large language models. +- [Cohere](https://cohere.com/) allows you to use data from OpenSearch to power the Cohere large language models. +- The [Bedrock Titan Embeddings](https://aws.amazon.com/bedrock/titan/) model can drive semantic search and retrieval-augmented generation in OpenSearch. All connectors consist of a JSON blueprint created by machine learning (ML) developers. The blueprint allows administrators and data scientists to make connections between OpenSearch and an AI service or model-serving technology. You can find blueprints for each connector in the [ML Commons repository](https://github.com/opensearch-project/ml-commons/tree/2.x/docs/remote_inference_blueprints). -If you want to build your own blueprint, see [Building blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). - - -## External connector +For more information about blueprint parameters, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). Admins are only required to enter their `credential` settings, such as `"openAI_key"`, for the service they are connecting to. All other parameters are defined within the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). {: .note} -The connector creation API, `/_plugins/_ml/connectors/_create`, creates connections that allow users to deploy and register external models through OpenSearch. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool using its specific API endpoint. 
For example, to connect to a ChatGPT model, you can connect using `api.openai.com`, as shown in the following example: +## Standalone connector + +To create a standalone connector, send a request to the `connectors/_create` endpoint and provide all of the parameters described in [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/): ```json POST /_plugins/_ml/connectors/_create @@ -67,33 +68,9 @@ POST /_plugins/_ml/connectors/_create ``` {% include copy-curl.html %} -If successful, the connector API responds with the `connector_id` for the connection: +## Internal connector -```json -{ - "connector_id": "a1eMb4kBJ1eYAeTMAljY" -} -``` - -With the returned `connector_id` we can register a model that uses that connector: - -```json -POST /_plugins/_ml/models/_register -{ - "name": "openAI-gpt-3.5-turbo", - "function_name": "remote", - "model_group_id": "lEFGL4kB4ubqQRzegPo2", - "description": "test model", - "connector_id": "a1eMb4kBJ1eYAeTMAljY" -} -``` - -## Local connector - -Admins are only required to enter their `credential` settings, such as `"openAI_key"`, for the service they are connecting to. All other parameters are defined within the [blueprint]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). -{: .note} - -To create an internal connector, add the `connector` parameter to the Register model API, as shown in the following example: +To create an internal connector, provide all of the parameters described in [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/) within the `connector` object of a request to the `models/_register` endpoint: ```json POST /_plugins/_ml/models/_register @@ -129,192 +106,10 @@ POST /_plugins/_ml/models/_register ] } } -} ``` +{% include copy-curl.html %} -## Registering and deploying a connected model - -After a connection has been created, use the `connector_id` from the response to register and deploy a connected model. - -To register a model, you have the following options: - -- You can use `model_group_id` to register a model version to an existing model group. -- If you do not use `model_group_id`, ML Commons creates a model with a new model group. 
- -If you want to create a new `model_group`, use the following example: - -```json -POST /_plugins/_ml/model_groups/_register -{ - "name": "remote_model_group", - "description": "This is an example description" -} -``` - -ML Commons returns the following response: - -```json -{ - "model_group_id": "wlcnb4kBJ1eYAeTMHlV6", - "status": "CREATED" -} -``` - -The following example registers a model named `openAI-gpt-3.5-turbo`: - -```json -POST /_plugins/_ml/models/_register -{ - "name": "openAI-gpt-3.5-turbo", - "function_name": "remote", - "model_group_id": "wlcnb4kBJ1eYAeTMHlV6", - "description": "test model", - "connector_id": "a1eMb4kBJ1eYAeTMAljY" -} -``` - -ML Commons returns the `task_id` and registration status of the model: - -```json -{ - "task_id": "cVeMb4kBJ1eYAeTMFFgj", - "status": "CREATED" -} -``` - - -You can use the `task_id` to find the `model_id`, as shown the following example: - - -**GET task request** - -```json -GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj -``` - -**GET task response** - -```json -{ - "model_id": "cleMb4kBJ1eYAeTMFFg4", - "task_type": "REGISTER_MODEL", - "function_name": "REMOTE", - "state": "COMPLETED", - "worker_node": [ - "XPcXLV7RQoi5m8NI_jEOVQ" - ], - "create_time": 1689793598499, - "last_update_time": 1689793598530, - "is_async": false -} -``` - -Lastly, use the `model_id` to deploy the model: - -**Deploy model request** - -```json -POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy -``` - -**Deploy model response** - -```json -{ - "task_id": "vVePb4kBJ1eYAeTM7ljG", - "status": "CREATED" -} -``` - -Use the `task_id` from the deploy model response to make sure the model deployment completes: - -**Verify deploy completion request** - -```json -GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG -``` - -**Verify deploy completion response** - -```json -{ - "model_id": "cleMb4kBJ1eYAeTMFFg4", - "task_type": "DEPLOY_MODEL", - "function_name": "REMOTE", - "state": "COMPLETED", - "worker_node": [ - "n-72khvBTBi3bnIIR8FTTw" - ], - "create_time": 1689793851077, - "last_update_time": 1689793851101, - "is_async": true -} -``` - -After a successful deployment, you can test the model using the Predict API set in the connector's `action` settings, as shown in the following example: - -```json -POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict -{ - "parameters": { - "messages": [ - { - "role": "system", - "content": "You are a helpful assistant." - }, - { - "role": "user", - "content": "Hello!" - } - ] - } -} -``` - -The Predict API returns inference results for the connected model, as shown in the following example response: - -```json -{ - "inference_results": [ - { - "output": [ - { - "name": "response", - "dataAsMap": { - "id": "chatcmpl-7e6s5DYEutmM677UZokF9eH40dIY7", - "object": "chat.completion", - "created": 1689793889, - "model": "gpt-3.5-turbo-0613", - "choices": [ - { - "index": 0, - "message": { - "role": "assistant", - "content": "Hello! How can I assist you today?" - }, - "finish_reason": "stop" - } - ], - "usage": { - "prompt_tokens": 19, - "completion_tokens": 9, - "total_tokens": 28 - } - } - } - ] - } - ] -} -``` - - -## Examples - -The following example connector requests show how to create a connector with supported third-party tools. - - -### OpenAI chat connector +## OpenAI chat connector The following example creates a standalone OpenAI chat connector. 
The same options can be used for an internal connector under the `connector` parameter: @@ -350,7 +145,7 @@ POST /_plugins/_ml/connectors/_create After creating the connector, you can retrieve the `task_id` and `connector_id` to register and deploy the model and then use the Predict API, similarly to a standalone connector. -### Amazon SageMaker +## Amazon SageMaker connector The following example creates a standalone Amazon SageMaker connector. The same options can be used for an internal connector under the `connector` parameter: @@ -395,7 +190,7 @@ The `parameters` section requires the following options when using `aws_sigv4` a - `region`: The AWS Region in which the AWS instance is located. - `service_name`: The name of the AWS service for the connector. -### Cohere +## Cohere connector The following example request creates a standalone Cohere connection: @@ -431,8 +226,5 @@ POST /_plugins/_ml/connectors/_create ## Next steps -- To learn more about using models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +- To learn more about using models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). - To learn more about model access control and model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/). - - - diff --git a/_ml-commons-plugin/extensibility/index.md b/_ml-commons-plugin/extensibility/index.md index 4defaa5e..e52d8271 100644 --- a/_ml-commons-plugin/extensibility/index.md +++ b/_ml-commons-plugin/extensibility/index.md @@ -1,18 +1,19 @@ --- layout: default -title: ML extensibility +title: Connecting to remote models has_children: true +has_toc: false nav_order: 60 --- -# ML extensibility +# Connecting to remote models Machine learning (ML) extensibility enables ML developers to create integrations with other ML services, such as Amazon SageMaker or OpenAI. These integrations provide system administrators and data scientists the ability to run ML workloads outside of their OpenSearch cluster. To get started with ML extensibility, choose from the following options: -- If you're an ML developer wanting to integrate with your specific ML services, see [Building blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). -- If you're a system administrator or data scientist wanting to create a connection to an ML service, see [Creating connectors for third-party ML platforms]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). +- If you're an ML developer wanting to integrate with your specific ML services, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). +- If you're a system administrator or data scientist wanting to create a connection to an ML service, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). 
## Prerequisites @@ -22,7 +23,7 @@ When access control is enabled on your third-party platform, you can enter your ### Adding trusted endpoints -To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings using the `plugins.ml_commons.trusted_connector_endpoints_regex` setting, which supports Java regex expressions, as shown in the following example: +To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings by using the `plugins.ml_commons.trusted_connector_endpoints_regex` setting, which supports Java regex expressions: ```json PUT /_cluster/settings @@ -75,7 +76,7 @@ When access control is enabled, you can install the [Security plugin]({{site.url ### Node settings -Remote models based on external connectors consume fewer resources. Therefore, you can deploy any model from a standalone connector using data nodes. To make sure that your standalone connection uses data nodes, set `plugins.ml_commons.only_run_on_ml_node` to `false`, as shown in the following example: +Remote models based on external connectors consume fewer resources. Therefore, you can deploy any model from a standalone connector using data nodes. To make sure that your standalone connection uses data nodes, set `plugins.ml_commons.only_run_on_ml_node` to `false`: ```json PUT /_cluster/settings @@ -88,9 +89,236 @@ PUT /_cluster/settings ``` {% include copy-curl.html %} +## Step 1: Register a model group + +To register a model, you have the following options: + +- You can use `model_group_id` to register a model version to an existing model group. +- If you do not use `model_group_id`, ML Commons creates a model with a new model group. + +To register a model group, send the following request: + +```json +POST /_plugins/_ml/model_groups/_register +{ + "name": "remote_model_group", + "description": "A model group for remote models" +} +``` +{% include copy-curl.html %} + +The response contains the model group ID that you'll use to register a model to this model group: + +```json +{ + "model_group_id": "wlcnb4kBJ1eYAeTMHlV6", + "status": "CREATED" +} +``` + +To learn more about model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/). + +## Step 2: Create a connector + +You can create a standalone connector or an internal connector as part of a specific model. For more information about connectors and connector examples, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). + +The Connectors Create API, `/_plugins/_ml/connectors/_create`, creates connectors that facilitate registering and deploying external models in OpenSearch. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool by using its specific API endpoint. For example, you can connect to a ChatGPT model by using the `api.openai.com` endpoint: + +```json +POST /_plugins/_ml/connectors/_create +{ + "name": "OpenAI Chat Connector", + "description": "The connector to public OpenAI model service for GPT 3.5", + "version": 1, + "protocol": "http", + "parameters": { + "endpoint": "api.openai.com", + "model": "gpt-3.5-turbo" + }, + "credential": { + "openAI_key": "..." 
+ }, + "actions": [ + { + "action_type": "predict", + "method": "POST", + "url": "https://${parameters.endpoint}/v1/chat/completions", + "headers": { + "Authorization": "Bearer ${credential.openAI_key}" + }, + "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }" + } + ] +} +``` +{% include copy-curl.html %} + +The response contains the connector ID for the newly created connector: + +```json +{ + "connector_id": "a1eMb4kBJ1eYAeTMAljY" +} +``` + +## Step 3: Register a remote model + +To register a remote model to the model group created in step 1, provide the model group ID from step 1 and the connector ID from step 2 in the following request: + +```json +POST /_plugins/_ml/models/_register +{ + "name": "openAI-gpt-3.5-turbo", + "function_name": "remote", + "model_group_id": "1jriBYsBq7EKuKzZX131", + "description": "test model", + "connector_id": "a1eMb4kBJ1eYAeTMAljY" +} +``` +{% include copy-curl.html %} + +OpenSearch returns the task ID of the register operation: + +```json +{ + "task_id": "cVeMb4kBJ1eYAeTMFFgj", + "status": "CREATED" +} +``` + +To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/#searching-for-a-task): + +```bash +GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj +``` +{% include copy-curl.html %} + +When the operation is complete, the state changes to `COMPLETED`: + +```json +{ + "model_id": "cleMb4kBJ1eYAeTMFFg4", + "task_type": "REGISTER_MODEL", + "function_name": "REMOTE", + "state": "COMPLETED", + "worker_node": [ + "XPcXLV7RQoi5m8NI_jEOVQ" + ], + "create_time": 1689793598499, + "last_update_time": 1689793598530, + "is_async": false +} +``` + +## Step 4: Deploy the remote model + +To deploy the registered model, provide its model ID from step 3 in the following request: + +```bash +POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy +``` +{% include copy-curl.html %} + +The response contains the task ID that you can use to check the status of the deploy operation: + +```json +{ + "task_id": "vVePb4kBJ1eYAeTM7ljG", + "status": "CREATED" +} +``` + +As in the previous step, check the status of the operation by calling the Tasks API: + +```bash +GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG +``` +{% include copy-curl.html %} + +When the operation is complete, the state changes to `COMPLETED`: + +```json +{ + "model_id": "cleMb4kBJ1eYAeTMFFg4", + "task_type": "DEPLOY_MODEL", + "function_name": "REMOTE", + "state": "COMPLETED", + "worker_node": [ + "n-72khvBTBi3bnIIR8FTTw" + ], + "create_time": 1689793851077, + "last_update_time": 1689793851101, + "is_async": true +} +``` + +## Step 5: Make predictions + +Use the [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/#predict) to make predictions: + +```json +POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict +{ + "parameters": { + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "Hello!" + } + ] + } +} +``` +{% include copy-curl.html %} + +To learn more about chat functionality within OpenAI, see the [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat). 
+ +The response contains the inference results provided by the OpenAI model: + +```json +{ + "inference_results": [ + { + "output": [ + { + "name": "response", + "dataAsMap": { + "id": "chatcmpl-7e6s5DYEutmM677UZokF9eH40dIY7", + "object": "chat.completion", + "created": 1689793889, + "model": "gpt-3.5-turbo-0613", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "Hello! How can I assist you today?" + }, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 19, + "completion_tokens": 9, + "total_tokens": 28 + } + } + } + ] + } + ] +} +``` + ## Next steps -- For more information about managing ML models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/). +- For more information about connectors, including connector examples, see [Connectors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/connectors/). +- For more information about connector parameters, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/blueprints/). +- For more information about managing ML models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/). - For more information about interacting with ML models in OpenSearch, see [Managing ML models in OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-dashboard/) diff --git a/_ml-commons-plugin/gpu-acceleration.md b/_ml-commons-plugin/gpu-acceleration.md index c0532cf0..75b58ef0 100644 --- a/_ml-commons-plugin/gpu-acceleration.md +++ b/_ml-commons-plugin/gpu-acceleration.md @@ -1,7 +1,7 @@ --- layout: default title: GPU acceleration -parent: ML framework +parent: Using custom models within OpenSearch nav_order: 150 --- @@ -12,7 +12,7 @@ When running a natural language processing (NLP) model in your OpenSearch cluste ## Supported GPUs -Currently, ML nodes following GPU instances: +Currently, ML nodes support the following GPU instances: - [NVIDIA instances with CUDA 11.6](https://aws.amazon.com/nvidia/) - [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/) diff --git a/_ml-commons-plugin/index.md b/_ml-commons-plugin/index.md index e5b4923f..837be5b7 100644 --- a/_ml-commons-plugin/index.md +++ b/_ml-commons-plugin/index.md @@ -9,18 +9,18 @@ nav_exclude: true # ML Commons plugin -ML Commons for OpenSearch eases the development of machine learning features by providing a set of common machine learning (ML) algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitors ML tasks to ensure uptime. This allows you to leverage existing open-source ML algorithms and reduce the effort required to develop new ML features. +ML Commons for OpenSearch simplifies the development of machine learning (ML) features by providing a set of ML algorithms through transport and REST API calls. Those calls choose the right nodes and resources for each ML request and monitor ML tasks to ensure uptime. This allows you to use existing open-source ML algorithms and reduce the effort required to develop new ML features. Interaction with the ML Commons plugin occurs through either the [REST API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api) or [`ad`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#ad) and [`kmeans`]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/functions#kmeans) Piped Processing Language (PPL) commands. 
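+
+For example, the following request uses the REST API to train a k-means model on data from an existing index (the index name, source fields, and parameter values shown here are illustrative):
+
+```json
+POST /_plugins/_ml/_train/kmeans
+{
+  "parameters": {
+    "centroids": 3,
+    "iterations": 10,
+    "distance_type": "COSINE"
+  },
+  "input_query": {
+    "_source": ["petal_length_in_cm", "petal_width_in_cm"],
+    "size": 10000
+  },
+  "input_index": [
+    "iris_data"
+  ]
+}
+```
+{% include copy-curl.html %}
+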
-Models [trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#training-the-model) through the ML Commons plugin support model-based algorithms such as k-means. After you've trained a model enough so that it meets your precision requirements, you can apply the model to [predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict) new data safely. +[Models trained]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#training-the-model) through the ML Commons plugin support model-based algorithms, such as k-means. After you've trained a model to your precision requirements, use the model to [make predictions]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#predict). -Should you not want to use a model, you can use the [Train and Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-and-predict) API to test your model without having to evaluate the model's performance. +If you don't want to use a model, you can use the [Train and Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#train-and-predict) to test your model without having to evaluate the model's performance. ## Using ML Commons -1. Ensure that you've appropriately set the cluster settings described in [Cluster Settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/). -2. Set up model access as described in [Model Access Control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/). +1. Ensure that you've appropriately set the cluster settings described in [ML Commons cluster settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/). +2. Set up model access as described in [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/). 3. Start using models: - - [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) allows you to run models within OpenSearch. - - [ML Extensibility]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/) allows you to access remote models. \ No newline at end of file + - [Run your custom models within an OpenSearch cluster]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). + - [Integrate models hosted on an external platform]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/). \ No newline at end of file diff --git a/_ml-commons-plugin/ml-dashboard.md b/_ml-commons-plugin/ml-dashboard.md index 49a07a1d..5dd3c4ed 100644 --- a/_ml-commons-plugin/ml-dashboard.md +++ b/_ml-commons-plugin/ml-dashboard.md @@ -66,4 +66,4 @@ A list of nodes gives you a view of each node the model is running on, including ## Next steps -For more information about how to manage ML models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/). +For more information about how to manage ML models in OpenSearch, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/). 
diff --git a/_ml-commons-plugin/ml-framework.md b/_ml-commons-plugin/ml-framework.md index eddb9260..e728d916 100644 --- a/_ml-commons-plugin/ml-framework.md +++ b/_ml-commons-plugin/ml-framework.md @@ -1,6 +1,6 @@ --- layout: default -title: ML framework +title: Using custom models within OpenSearch has_children: true nav_order: 50 redirect_from: @@ -11,7 +11,7 @@ ML Framework was taken out of experimental status and released as Generally Avai {: .note} -# ML Framework +# Using custom models within OpenSearch ML Commons allows you to serve custom models and use those models to make inferences through the OpenSearch Machine Learning (ML) Framework. For those who want to run their PyTorch deep learning model inside an OpenSearch cluster, you can upload and run that model with the ML Commons REST API. diff --git a/_ml-commons-plugin/pretrained-models.md b/_ml-commons-plugin/pretrained-models.md index c4bea64a..825e2a41 100644 --- a/_ml-commons-plugin/pretrained-models.md +++ b/_ml-commons-plugin/pretrained-models.md @@ -1,7 +1,7 @@ --- layout: default title: Pretrained models -parent: ML framework +parent: Using custom models within OpenSearch nav_order: 120 --- @@ -28,7 +28,7 @@ POST /_plugins/_ml/models/_upload } ``` -For more information about how to upload and use ML models, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/). +For more information about how to upload and use ML models, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/). ## Supported pretrained models diff --git a/_ml-commons-plugin/semantic-search.md b/_ml-commons-plugin/semantic-search.md index b94c88f1..3dc8e10e 100644 --- a/_ml-commons-plugin/semantic-search.md +++ b/_ml-commons-plugin/semantic-search.md @@ -112,7 +112,7 @@ For this tutorial, you'll use the [DistilBERT](https://huggingface.co/docs/trans #### Advanced: Using a different model -Alternatively, you can choose to use one of the [pretrained language models provided by OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) or your own custom model. For information about choosing a model, see [Further reading](#further-reading). For instructions on how to set up a custom model, see [ML framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +Alternatively, you can choose to use one of the [pretrained language models provided by OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/) or your own custom model. For information about choosing a model, see [Further reading](#further-reading). For instructions on how to set up a custom model, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). Take note of the dimensionality of the model because you'll need it when you set up a k-NN index. {: .important} @@ -332,7 +332,7 @@ POST /_plugins/_ml/models/_register } ``` -For more information, see [ML framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). 
### Step 1(d): Deploy the model @@ -602,7 +602,7 @@ GET /my-nlp-index/_doc/1 ``` {% include copy-curl.html %} -The response shows the document `_source` containing the original `text` and `id` fields and the added `passage_embeddings` field: +The response includes the document `_source` containing the original `text` and `id` fields and the added `passage_embedding` field: ```json { diff --git a/_query-dsl/specialized/index.md b/_query-dsl/specialized/index.md index 888f3302..8a4cd81a 100644 --- a/_query-dsl/specialized/index.md +++ b/_query-dsl/specialized/index.md @@ -14,12 +14,16 @@ OpenSearch supports the following specialized queries: - `more_like_this`: Finds documents similar to the provided text, document, or collection of documents. +- [`neural`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/): Used for vector field search in [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/). + +- [`neural_sparse`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/): Used for vector field search in [sparse neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). + - `percolate`: Finds queries (stored as documents) that match the provided document. - `rank_feature`: Calculates scores based on the values of numeric features. This query can skip non-competitive hits. - `script`: Uses a script as a filter. -- `script_score`: Calculates a custom score for matching documents using a script. +- [`script_score`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/script-score/): Calculates a custom score for matching documents using a script. - `wrapper`: Accepts other queries as JSON or YAML strings. diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md new file mode 100644 index 00000000..18b4914f --- /dev/null +++ b/_query-dsl/specialized/neural-sparse.md @@ -0,0 +1,53 @@ +--- +layout: default +title: Neural sparse +parent: Specialized queries +grand_parent: Query DSL +nav_order: 55 +--- + +# Neural sparse query +Introduced 2.11 +{: .label .label-purple } + +Use the `neural_sparse` query for vector field search in [sparse neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). + +## Request fields + +Include the following request fields in the `neural_sparse` query: + +```json +"neural_sparse": { + "": { + "query_text": "", + "model_id": "", + "max_token_score": "" + } +} +``` + +The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other `neural_sparse` query fields. + +Field | Data type | Required/Optional | Description +:--- | :--- | :--- +`query_text` | String | Required | The query text from which to generate vector embeddings. +`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). +`max_token_score` | Float | Optional | The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). 
+
+#### Example request
+
+```json
+GET my-nlp-index/_search
+{
+  "query": {
+    "neural_sparse": {
+      "passage_embedding": {
+        "query_text": "Hi world",
+        "model_id": "aP2Q8ooBpBj3wT4HVS8a",
+        "max_token_score": 2
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
\ No newline at end of file
diff --git a/_query-dsl/specialized/neural.md b/_query-dsl/specialized/neural.md
new file mode 100644
index 00000000..6a4dc549
--- /dev/null
+++ b/_query-dsl/specialized/neural.md
@@ -0,0 +1,53 @@
+---
+layout: default
+title: Neural
+parent: Specialized queries
+grand_parent: Query DSL
+nav_order: 50
+---
+
+# Neural query
+
+Use the `neural` query for vector field search in [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/).
+
+## Request fields
+
+Include the following request fields in the `neural` query:
+
+```json
+"neural": {
+  "<vector_field>": {
+    "query_text": "<query_text>",
+    "query_image": "<query_image>",
+    "model_id": "<model_id>",
+    "k": 100
+  }
+}
+```
+
+The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other neural query fields.
+
+Field | Data type | Required/Optional | Description
+:--- | :--- | :--- | :---
+`query_text` | String | Optional | The query text from which to generate vector embeddings. You must specify at least one of `query_text` or `query_image`.
+`query_image` | String | Optional | A base-64 encoded string that corresponds to the query image from which to generate vector embeddings. You must specify at least one of `query_text` or `query_image`.
+`model_id` | String | Required if the default model ID is not set. For more information, see [Setting a default model on an index or field]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/#setting-a-default-model-on-an-index-or-field). | The ID of the model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/).
+`k` | Integer | Optional | The number of results returned by the k-NN search. Default is 10.
+
+#### Example request
+
+```json
+GET /my-nlp-index/_search
+{
+  "query": {
+    "neural": {
+      "passage_embedding": {
+        "query_text": "Hi world",
+        "query_image": "iVBORw0KGgoAAAAN...",
+        "k": 100
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
\ No newline at end of file
diff --git a/_search-plugins/neural-multimodal-search.md b/_search-plugins/neural-multimodal-search.md
new file mode 100644
index 00000000..e81af0ce
--- /dev/null
+++ b/_search-plugins/neural-multimodal-search.md
@@ -0,0 +1,132 @@
+---
+layout: default
+title: Multimodal search
+nav_order: 20
+has_children: false
+parent: Neural search
+---
+
+# Multimodal search
+Introduced 2.11
+{: .label .label-purple }
+
+Use multimodal search to search text and image data. In neural search, multimodal search is facilitated by multimodal embedding models.
+
+**PREREQUISITE**<br>
+Before using multimodal search, you must set up a multimodal embedding model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
+{: .note}
+
+## Using multimodal search
+
+To use neural search with text and image embeddings, follow these steps:
+
+1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
+1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
+1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
+1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
+
+## Step 1: Create an ingest pipeline
+
+To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`text_image_embedding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/text-image-embedding/), which will convert the text or image in a document field to vector embeddings. The processor's `field_map` determines the text and image fields from which to generate vector embeddings and the output vector field in which to store the embeddings.
+
+The following example request creates an ingest pipeline where the text from `image_description` and an image from `image_binary` will be converted into vector embeddings and the embeddings will be stored in `vector_embedding`:
+
+```json
+PUT /_ingest/pipeline/nlp-ingest-pipeline
+{
+  "description": "A text/image embedding pipeline",
+  "processors": [
+    {
+      "text_image_embedding": {
+        "model_id": "-fYQAosBQkdnhhBsK593",
+        "embedding": "vector_embedding",
+        "field_map": {
+          "text": "image_description",
+          "image": "image_binary"
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+## Step 2: Create an index for ingestion
+
+In order to use the text/image embedding processor defined in your pipeline, create a k-NN index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as the correct types. Continuing with the example, the `vector_embedding` field must be mapped as a k-NN vector with a dimension that matches the model dimension. Similarly, the `image_description` field should be mapped as `text`, and the `image_binary` field should be mapped as `binary`.
+
+The following example request creates a k-NN index that is set up with a default ingest pipeline:
+
+```json
+PUT /my-nlp-index
+{
+  "settings": {
+    "index.knn": true,
+    "default_pipeline": "nlp-ingest-pipeline",
+    "number_of_shards": 2
+  },
+  "mappings": {
+    "properties": {
+      "vector_embedding": {
+        "type": "knn_vector",
+        "dimension": 1024,
+        "method": {
+          "name": "hnsw",
+          "engine": "lucene",
+          "parameters": {}
+        }
+      },
+      "image_description": {
+        "type": "text"
+      },
+      "image_binary": {
+        "type": "binary"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+For more information about creating a k-NN index and its supported methods, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/).
+
+## Step 3: Ingest documents into the index
+
+To ingest documents into the index created in the previous step, send the following request:
+
+```json
+PUT /my-nlp-index/_doc/1
+{
+  "image_description": "Orange table",
+  "image_binary": "iVBORw0KGgoAAAANSUI..."
+} +``` +{% include copy-curl.html %} + +Before the document is ingested into the index, the ingest pipeline runs the `text_image_embedding` processor on the document, generating vector embeddings for the `image_description` and `image_binary` fields. In addition to the original `image_description` and `image_binary` fields, the indexed document includes the `vector_embedding` field, which contains the combined vector embeddings. + +## Step 4: Search the index using neural search + +To perform vector search on your index, use the `neural` query clause either in the [k-NN plugin API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api/#search-model) or [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. You can refine the results by using a [k-NN search filter]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/). You can search by text, image, or both text and image. + +The following example request uses a neural query to search for text and image: + +```json +GET /my-nlp-index/_search +{ + "size": 10, + "query": { + "neural": { + "vector_embedding": { + "query_text": "Orange table", + "query_image": "iVBORw0KGgoAAAANSUI...", + "model_id": "-fYQAosBQkdnhhBsK593", + "k": 5 + } + } + } +} +``` +{% include copy-curl.html %} + +To eliminate passing the model ID with each neural query request, you can set a default model on a k-NN index or a field. To learn more, see [Setting a default model on an index or field]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/##setting-a-default-model-on-an-index-or-field). diff --git a/_search-plugins/neural-search.md b/_search-plugins/neural-search.md index 2b2c1e1c..f1109c4c 100644 --- a/_search-plugins/neural-search.md +++ b/_search-plugins/neural-search.md @@ -2,7 +2,7 @@ layout: default title: Neural search nav_order: 200 -has_children: false +has_children: true has_toc: false redirect_from: - /neural-search-plugin/index/ @@ -10,349 +10,16 @@ redirect_from: # Neural search -Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time. During ingestion, neural search transforms document text into vector embeddings and indexes both the text and its vector embeddings in a k-NN index. When you use a neural query during search, neural search converts the query text into vector embeddings, uses vector search to compare the query and document embeddings, and returns the closest results. +Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time. During ingestion, neural search transforms document text into vector embeddings and indexes both the text and its vector embeddings in a vector index. When you use a neural query during search, neural search converts the query text into vector embeddings, uses vector search to compare the query and document embeddings, and returns the closest results. -The Neural Search plugin comes bundled with OpenSearch and is generally available as of OpenSearch 2.9. For more information, see [Managing plugins]({{site.url}}{{site.baseurl}}/opensearch/install/plugins#managing-plugins). +Neural search supports the following search types: -## Using neural search +- [Text search]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/): Uses dense retrieval based on text embedding models to search text data. 
+- [Multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/neural-multimodal-search/): Uses vision-language embedding models to search text and image data. +- [Sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): Uses sparse retrieval based on sparse embedding models to search text data. -To use neural search, follow these steps: +## Embedding models -1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline). -1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion). -1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index). -1. [Search the index using neural search](#step-4-search-the-index-using-neural-search). +Before using neural search, you must set up a machine learning (ML) model. You can either use a pretrained model provided by OpenSearch, upload your own model to OpenSearch, or connect to a foundation model hosted on an external platform. For more information about ML models, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [ML Extensibility]({{site.url}}{{site.baseurl}}/ml-commons-plugin/extensibility/index/). For a step-by-step tutorial, see [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). -## Step 1: Create an ingest pipeline - -To generate vector embeddings for text fields, you need to create a neural search [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/). An ingest pipeline consists of a series of processors that manipulate documents during ingestion, allowing the documents to be vectorized. - -### Path and HTTP method - -The following API operation creates a neural search ingest pipeline: - -```json -PUT _ingest/pipeline/ -``` - -### Path parameter - -Use `pipeline_name` to create a name for your neural search ingest pipeline. - -### Request fields - -In the pipeline request body, you must set up a `text_embedding` processor (the only processor supported by neural search), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings: - -```json -"text_embedding": { - "model_id": "", - "field_map": { - "": "" - } -} -``` - -The following table lists the `text_embedding` processor request fields. - -Field | Data type | Description -:--- | :--- | :--- -`model_id` | String | The ID of the model that will be used to generate the embeddings. The model must be indexed in OpenSearch before it can be used in neural search. For more information, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/) and [Semantic search]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search/). -`field_map.` | String | The name of the field from which to obtain text for generating text embeddings. -`field_map.` | String | The name of the vector field in which to store the generated text embeddings. 
- -### Example request - -The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`: - -```json -PUT /_ingest/pipeline/nlp-ingest-pipeline -{ - "description": "An NLP ingest pipeline", - "processors": [ - { - "text_embedding": { - "model_id": "bQ1J8ooBpBj3wT4HVUsb", - "field_map": { - "passage_text": "passage_embedding" - } - } - } - ] -} -``` -{% include copy-curl.html %} - -## Step 2: Create an index for ingestion - -In order to use the text embedding processor defined in your pipelines, create a k-NN index with mapping data that aligns with the maps specified in your pipeline. For example, the `` defined in the `field_map` of your processor must be mapped as a k-NN vector field with a dimension that matches the model dimension. Similarly, the `` defined in your processor should be mapped as `text` in your index. - -### Example request - -The following example request creates a k-NN index that is set up with a default ingest pipeline: - -```json -PUT /my-nlp-index -{ - "settings": { - "index.knn": true, - "default_pipeline": "nlp-ingest-pipeline" - }, - "mappings": { - "properties": { - "id": { - "type": "text" - }, - "passage_embedding": { - "type": "knn_vector", - "dimension": 768, - "method": { - "engine": "lucene", - "space_type": "l2", - "name": "hnsw", - "parameters": {} - } - }, - "passage_text": { - "type": "text" - } - } - } -} -``` -{% include copy-curl.html %} - -For more information about creating a k-NN index and the methods it supports, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/). - -## Step 3: Ingest documents into the index - -To ingest documents into the index created in the previous step, send a POST request for each document: - -```json -PUT /my-nlp-index/_doc/1 -{ - "passage_text": "Hello world", - "id": "s1" -} -``` -{% include copy-curl.html %} - -```json -PUT /my-nlp-index/_doc/2 -{ - "passage_text": "Hi planet", - "id": "s2" -} -``` -{% include copy-curl.html %} - -Before the document is ingested into the index, the ingest pipeline runs the `text_embedding` processor on the document, generating text embeddings for the `passage_text` field. The indexed document contains the `passage_text` field that has the original text and the `passage_embedding` field that has the vector embeddings. - -## Step 4: Search the index using neural search - -To perform vector search on your index, use the `neural` query clause either in the [k-NN plugin API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api/#search-model) or [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. You can refine the results by using a [k-NN search filter]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/). - -### Neural query request fields - -Include the following request fields under the `neural` query clause: - -```json -"neural": { - "": { - "query_text": "", - "model_id": "", - "k": 100 - } -} -``` - -The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other neural query fields. - -Field | Data type | Description -:--- | :--- | :--- -`query_text` | String | The query text from which to generate text embeddings. -`model_id` | String | The ID of the model that will be used to generate text embeddings from the query text. The model must be indexed in OpenSearch before it can be used in neural search. 
-`k` | Integer | The number of results returned by the k-NN search. - -### Example request - -The following example request uses a Boolean query to combine a filter clause and two query clauses---a neural query and a `match` query. The `script_score` query assigns custom weights to the query clauses: - -```json -GET /my-nlp-index/_search -{ - "_source": { - "excludes": [ - "passage_embedding" - ] - }, - "query": { - "bool": { - "filter": { - "wildcard": { "id": "*1" } - }, - "should": [ - { - "script_score": { - "query": { - "neural": { - "passage_embedding": { - "query_text": "Hi world", - "model_id": "bQ1J8ooBpBj3wT4HVUsb", - "k": 100 - } - } - }, - "script": { - "source": "_score * 1.5" - } - } - }, - { - "script_score": { - "query": { - "match": { - "passage_text": "Hi world" - } - }, - "script": { - "source": "_score * 1.7" - } - } - } - ] - } - } -} -``` -{% include copy-curl.html %} - -The response contains the matching document: - -```json -{ - "took" : 36, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, - "hits" : { - "total" : { - "value" : 1, - "relation" : "eq" - }, - "max_score" : 1.2251667, - "hits" : [ - { - "_index" : "my-nlp-index", - "_id" : "1", - "_score" : 1.2251667, - "_source" : { - "passage_text" : "Hello world", - "id" : "s1" - } - } - ] - } -} -``` - -### Setting a default model on an index or field - -To eliminate passing the model ID with each neural query request, you can set a default model on a k-NN index or a field. - -First, create a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) with a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) request processor. To set a default model for an index, provide the model ID in the `default_model_id` parameter. To set a default model for a specific field, provide the field name and the corresponding model ID in the `neural_field_default_id` map. 
If you provide both `default_model_id` and `neural_field_default_id`, `neural_field_default_id` takes precedence: - -```json -PUT /_search/pipeline/default_model_pipeline -{ - "request_processors": [ - { - "neural_query_enricher" : { - "default_model_id": "bQ1J8ooBpBj3wT4HVUsb", - "neural_field_default_id": { - "my_field_1": "uZj0qYoBMtvQlfhaYeud", - "my_field_2": "upj0qYoBMtvQlfhaZOuM" - } - } - } - ] -} -``` -{% include copy-curl.html %} - -Then set the default model for your index: - -```json -PUT /my-nlp-index/_settings -{ - "index.search.default_pipeline" : "default_model_pipeline" -} -``` -{% include copy-curl.html %} - -You can now omit the model ID when searching: - -```json -GET /my-nlp-index/_search -{ - "_source": { - "excludes": [ - "passage_embedding" - ] - }, - "query": { - "neural": { - "passage_embedding": { - "query_text": "Hi world", - "k": 100 - } - } - } -} -``` -{% include copy-curl.html %} - -The response contains both documents: - -```json -{ - "took" : 41, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, - "hits" : { - "total" : { - "value" : 2, - "relation" : "eq" - }, - "max_score" : 1.22762, - "hits" : [ - { - "_index" : "my-nlp-index", - "_id" : "2", - "_score" : 1.22762, - "_source" : { - "passage_text" : "Hi planet", - "id" : "s2" - } - }, - { - "_index" : "my-nlp-index", - "_id" : "1", - "_score" : 1.2251667, - "_source" : { - "passage_text" : "Hello world", - "id" : "s1" - } - } - ] - } -} -``` \ No newline at end of file +Before you ingest documents into an index, documents are passed through the ML model, which generates vector embeddings for the document fields. When you send a search request, the query text or image is also passed through the ML model, which generates the corresponding vector embeddings. Then neural search performs a vector search on the embeddings and returns matching documents. \ No newline at end of file diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md new file mode 100644 index 00000000..e9d480e9 --- /dev/null +++ b/_search-plugins/neural-sparse-search.md @@ -0,0 +1,207 @@ +--- +layout: default +title: Sparse search +nav_order: 30 +has_children: false +parent: Neural search +--- + +# Sparse search +Introduced 2.11 +{: .label .label-purple } + +[Neural text search]({{site.url}}{{site.baseurl}}/search-plugins/neural-text-search/) relies on dense retrieval that is based on text embedding models. However, dense methods use k-NN search, which consumes a large amount of memory and CPU resources. An alternative to neural text search, sparse neural search is implemented using an inverted index and thus is as efficient as BM25. Sparse search is facilitated by sparse embedding models. When you perform a sparse search, it creates a sparse vector (a list of `token: weight` key-value pairs representing an entry and its weight) and ingests data into a rank features index. + +When selecting a model, choose one of the following options: + +- Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). +- Use a sparse encoding model at ingestion time and a tokenizer model at search time (low performance, relatively low latency). + +**PREREQUISITE**
+Before using sparse search, you must set up a sparse embedding model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
+{: .note}
+
+## Using sparse search
+
+To use sparse search, follow these steps:
+
+1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
+1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
+1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
+1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
+
+## Step 1: Create an ingest pipeline
+
+To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`sparse_encoding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/sparse-encoding/), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings.
+
+The following example request creates an ingest pipeline where the text from `passage_text` will be converted into sparse vector embeddings and the embeddings will be stored in `passage_embedding`:
+
+```json
+PUT /_ingest/pipeline/nlp-ingest-pipeline-sparse
+{
+  "description": "A sparse encoding ingest pipeline",
+  "processors": [
+    {
+      "sparse_encoding": {
+        "model_id": "aP2Q8ooBpBj3wT4HVS8a",
+        "field_map": {
+          "passage_text": "passage_embedding"
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+## Step 2: Create an index for ingestion
+
+In order to use the sparse encoding processor defined in your pipeline, create a rank features index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as the correct types. Continuing with the example, the `passage_embedding` field must be mapped as [`rank_features`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/rank/#rank-features). Similarly, the `passage_text` field should be mapped as `text`.
+
+The following example request creates a rank features index that is set up with a default ingest pipeline:
+
+```json
+PUT /my-nlp-index
+{
+  "settings": {
+    "default_pipeline": "nlp-ingest-pipeline-sparse"
+  },
+  "mappings": {
+    "properties": {
+      "id": {
+        "type": "text"
+      },
+      "passage_embedding": {
+        "type": "rank_features"
+      },
+      "passage_text": {
+        "type": "text"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+
+## Step 3: Ingest documents into the index
+
+To ingest documents into the index created in the previous step, send the following requests:
+
+```json
+PUT /my-nlp-index/_doc/1
+{
+  "passage_text": "Hello world",
+  "id": "s1"
+}
+```
+{% include copy-curl.html %}
+
+```json
+PUT /my-nlp-index/_doc/2
+{
+  "passage_text": "Hi planet",
+  "id": "s2"
+}
+```
+{% include copy-curl.html %}
+
+Before the document is ingested into the index, the ingest pipeline runs the `sparse_encoding` processor on the document, generating vector embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings.
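+
+As an optional check, you can retrieve one of the documents ingested in the previous step to confirm that the `sparse_encoding` processor ran. The exact tokens and weights stored in `passage_embedding` depend on the sparse encoding model you deployed:
+
+```json
+GET /my-nlp-index/_doc/1
+```
+{% include copy-curl.html %}
+
+The document `_source` should contain the original `passage_text` field and a `passage_embedding` object consisting of `token: weight` pairs.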
+ +## Step 4: Search the index using neural search + +To perform a sparse vector search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. + +The following example request uses a `neural_sparse` query to search for relevant documents: + +```json +GET my-nlp-index/_search +{ + "query": { + "neural_sparse": { + "passage_embedding": { + "query_text": "Hi world", + "model_id": "aP2Q8ooBpBj3wT4HVS8a", + "max_token_score": 2 + } + } + } +} +``` +{% include copy-curl.html %} + +The response contains the matching documents: + +```json +{ + "took" : 688, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 30.0029, + "hits" : [ + { + "_index" : "my-nlp-index", + "_id" : "1", + "_score" : 30.0029, + "_source" : { + "passage_text" : "Hello world", + "passage_embedding" : { + "!" : 0.8708904, + "door" : 0.8587369, + "hi" : 2.3929274, + "worlds" : 2.7839446, + "yes" : 0.75845814, + "##world" : 2.5432441, + "born" : 0.2682308, + "nothing" : 0.8625516, + "goodbye" : 0.17146169, + "greeting" : 0.96817183, + "birth" : 1.2788506, + "come" : 0.1623208, + "global" : 0.4371151, + "it" : 0.42951578, + "life" : 1.5750692, + "thanks" : 0.26481047, + "world" : 4.7300377, + "tiny" : 0.5462298, + "earth" : 2.6555297, + "universe" : 2.0308156, + "worldwide" : 1.3903781, + "hello" : 6.696973, + "so" : 0.20279501, + "?" : 0.67785245 + }, + "id" : "s1" + } + }, + { + "_index" : "my-nlp-index", + "_id" : "2", + "_score" : 16.480486, + "_source" : { + "passage_text" : "Hi planet", + "passage_embedding" : { + "hi" : 4.338913, + "planets" : 2.7755864, + "planet" : 5.0969057, + "mars" : 1.7405145, + "earth" : 2.6087382, + "hello" : 3.3210192 + }, + "id" : "s2" + } + } + ] + } +} +``` diff --git a/_search-plugins/neural-text-search.md b/_search-plugins/neural-text-search.md new file mode 100644 index 00000000..9aa59079 --- /dev/null +++ b/_search-plugins/neural-text-search.md @@ -0,0 +1,297 @@ +--- +layout: default +title: Text search +nav_order: 10 +has_children: false +parent: Neural search +--- + +# Text search + +Use text search for text data. In neural search, text search is facilitated by text embedding models. Text search creates a dense vector (a list of floats) and ingests data into a k-NN index. + +**PREREQUISITE**
+Before using text search, you must set up a text embedding model. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +{: .note} + +## Using text search + +To use text search, follow these steps: + +1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline). +1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion). +1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index). +1. [Search the index using neural search](#step-4-search-the-index-using-neural-search). + +## Step 1: Create an ingest pipeline + +To generate vector embeddings, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains a [`text_embedding` processor]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/processors/text-embedding/), which will convert the text in a document field to vector embeddings. The processor's `field_map` determines the input fields from which to generate vector embeddings and the output fields in which to store the embeddings. + +The following example request creates an ingest pipeline where the text from `passage_text` will be converted into text embeddings and the embeddings will be stored in `passage_embedding`: + +```json +PUT /_ingest/pipeline/nlp-ingest-pipeline +{ + "description": "A text embedding pipeline", + "processors": [ + { + "text_embedding": { + "model_id": "bQ1J8ooBpBj3wT4HVUsb", + "field_map": { + "passage_text": "passage_embedding" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +## Step 2: Create an index for ingestion + +In order to use the text embedding processor defined in your pipeline, create a k-NN index, adding the pipeline created in the previous step as the default pipeline. Ensure that the fields defined in the `field_map` are mapped as correct types. Continuing with the example, the `passage_embedding` field must be mapped as a k-NN vector with a dimension that matches the model dimension. Similarly, the `passage_text` field should be mapped as `text`. + +The following example request creates a k-NN index that is set up with a default ingest pipeline: + +```json +PUT /my-nlp-index +{ + "settings": { + "index.knn": true, + "default_pipeline": "nlp-ingest-pipeline" + }, + "mappings": { + "properties": { + "id": { + "type": "text" + }, + "passage_embedding": { + "type": "knn_vector", + "dimension": 768, + "method": { + "engine": "lucene", + "space_type": "l2", + "name": "hnsw", + "parameters": {} + } + }, + "passage_text": { + "type": "text" + } + } + } +} +``` +{% include copy-curl.html %} + +For more information about creating a k-NN index and its supported methods, see [k-NN index]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/). + +## Step 3: Ingest documents into the index + +To ingest documents into the index created in the previous step, send the following requests: + +```json +PUT /my-nlp-index/_doc/1 +{ + "passage_text": "Hello world", + "id": "s1" +} +``` +{% include copy-curl.html %} + +```json +PUT /my-nlp-index/_doc/2 +{ + "passage_text": "Hi planet", + "id": "s2" +} +``` +{% include copy-curl.html %} + +Before the document is ingested into the index, the ingest pipeline runs the `text_embedding` processor on the document, generating text embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings. 
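+
+If you are ingesting many documents, you can send them in a single `_bulk` request instead of individual `PUT` requests; the index's default ingest pipeline is applied to documents indexed through the Bulk API as well. The following request is a sketch equivalent to the two requests above:
+
+```json
+POST /_bulk
+{ "index": { "_index": "my-nlp-index", "_id": "1" } }
+{ "passage_text": "Hello world", "id": "s1" }
+{ "index": { "_index": "my-nlp-index", "_id": "2" } }
+{ "passage_text": "Hi planet", "id": "s2" }
+```
+{% include copy-curl.html %}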
+ +## Step 4: Search the index using neural search + +To perform vector search on your index, use the `neural` query clause either in the [k-NN plugin API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api/#search-model) or [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. You can refine the results by using a [k-NN search filter]({{site.url}}{{site.baseurl}}/search-plugins/knn/filter-search-knn/). + +The following example request uses a Boolean query to combine a filter clause and two query clauses---a neural query and a `match` query. The `script_score` query assigns custom weights to the query clauses: + +```json +GET /my-nlp-index/_search +{ + "_source": { + "excludes": [ + "passage_embedding" + ] + }, + "query": { + "bool": { + "filter": { + "wildcard": { "id": "*1" } + }, + "should": [ + { + "script_score": { + "query": { + "neural": { + "passage_embedding": { + "query_text": "Hi world", + "model_id": "bQ1J8ooBpBj3wT4HVUsb", + "k": 100 + } + } + }, + "script": { + "source": "_score * 1.5" + } + } + }, + { + "script_score": { + "query": { + "match": { + "passage_text": "Hi world" + } + }, + "script": { + "source": "_score * 1.7" + } + } + } + ] + } + } +} +``` +{% include copy-curl.html %} + +The response contains the matching document: + +```json +{ + "took" : 36, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 1, + "relation" : "eq" + }, + "max_score" : 1.2251667, + "hits" : [ + { + "_index" : "my-nlp-index", + "_id" : "1", + "_score" : 1.2251667, + "_source" : { + "passage_text" : "Hello world", + "id" : "s1" + } + } + ] + } +} +``` + +## Setting a default model on an index or field + +A [`neural`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/) query requires a model ID for generating vector embeddings. To eliminate passing the model ID with each neural query request, you can set a default model on a k-NN index or a field. + +First, create a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) with a [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) request processor. To set a default model for an index, provide the model ID in the `default_model_id` parameter. To set a default model for a specific field, provide the field name and the corresponding model ID in the `neural_field_default_id` map. 
If you provide both `default_model_id` and `neural_field_default_id`, `neural_field_default_id` takes precedence: + +```json +PUT /_search/pipeline/default_model_pipeline +{ + "request_processors": [ + { + "neural_query_enricher" : { + "default_model_id": "bQ1J8ooBpBj3wT4HVUsb", + "neural_field_default_id": { + "my_field_1": "uZj0qYoBMtvQlfhaYeud", + "my_field_2": "upj0qYoBMtvQlfhaZOuM" + } + } + } + ] +} +``` +{% include copy-curl.html %} + +Then set the default model for your index: + +```json +PUT /my-nlp-index/_settings +{ + "index.search.default_pipeline" : "default_model_pipeline" +} +``` +{% include copy-curl.html %} + +You can now omit the model ID when searching: + +```json +GET /my-nlp-index/_search +{ + "_source": { + "excludes": [ + "passage_embedding" + ] + }, + "query": { + "neural": { + "passage_embedding": { + "query_text": "Hi world", + "k": 100 + } + } + } +} +``` +{% include copy-curl.html %} + +The response contains both documents: + +```json +{ + "took" : 41, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 2, + "relation" : "eq" + }, + "max_score" : 1.22762, + "hits" : [ + { + "_index" : "my-nlp-index", + "_id" : "2", + "_score" : 1.22762, + "_source" : { + "passage_text" : "Hi planet", + "id" : "s2" + } + }, + { + "_index" : "my-nlp-index", + "_id" : "1", + "_score" : 1.2251667, + "_source" : { + "passage_text" : "Hello world", + "id" : "s1" + } + } + ] + } +} +``` \ No newline at end of file diff --git a/_search-plugins/search-pipelines/neural-query-enricher.md b/_search-plugins/search-pipelines/neural-query-enricher.md index 610b0503..215fb85f 100644 --- a/_search-plugins/search-pipelines/neural-query-enricher.md +++ b/_search-plugins/search-pipelines/neural-query-enricher.md @@ -9,7 +9,7 @@ grand_parent: Search pipelines # Neural query enricher processor -The `neural_query_enricher` search request processor is designed to set a default machine learning (ML) model ID at the index or field level for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) queries. To learn more about ML models, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). +The `neural_query_enricher` search request processor is designed to set a default machine learning (ML) model ID at the index or field level for [neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/) queries. To learn more about ML models, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/). ## Request fields