---
layout: default
title: Custom models
parent: Using ML models within OpenSearch
grand_parent: Integrating ML models
nav_order: 120
---

# Custom local models
**Generally available 2.9**
{: .label .label-purple }

To use a custom model locally, you can upload it to the OpenSearch cluster.

## Model support

As of OpenSearch 2.6, OpenSearch supports local text embedding models.

As of OpenSearch 2.11, OpenSearch supports local sparse encoding models.

## Preparing a model

For both text embedding and sparse encoding models, you must provide a tokenizer JSON file within the model zip file.

For sparse encoding models, make sure your output format is `{"output":<sparse_vector>}` so that ML Commons can post-process the sparse vector.

If you fine-tune a sparse model on your own dataset, you may also want to use your own sparse tokenizer model. It is preferable to provide your own [IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) JSON file in the tokenizer model zip file because this increases query performance when you use the tokenizer model in the query. Alternatively, you can use an OpenSearch-provided generic [IDF from MSMARCO](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.0/torch_script/opensearch-neural-sparse-tokenizer-v1-1.0.0.zip). If the IDF file is not provided, the default weight of each token is set to 1, which may influence sparse neural search performance.  
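The effect of the IDF file on query token weights can be sketched in Python as follows. This is a minimal illustration, not the ML Commons implementation: the `weight_query_tokens` helper is hypothetical, and the fallback of 1.0 for a token that is missing from a provided IDF file is an assumption modeled on the default described above.

```python
def weight_query_tokens(tokens, idf=None):
    """Map each query token to a weight taken from an IDF table.

    When no IDF table is given, every token gets the default weight 1.0.
    Falling back to 1.0 for tokens absent from a provided table is an
    assumption made for this sketch.
    """
    idf = idf or {}
    return {token: idf.get(token, 1.0) for token in tokens}
```

Without an IDF table, every token contributes equally to the query. With one, rarer tokens can be given larger weights.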

### Model format

To use a model in OpenSearch, you'll need to export the model into a portable format. As of OpenSearch 2.5, OpenSearch supports only the [TorchScript](https://pytorch.org/docs/stable/jit.html) and [ONNX](https://onnx.ai/) formats.

You must save the model file as a zip file before uploading it to OpenSearch. To ensure that ML Commons can upload your model, compress your TorchScript file before uploading. For an example, download a TorchScript [model file](https://github.com/opensearch-project/ml-commons/blob/2.x/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_embedding/all-MiniLM-L6-v2_torchscript_sentence-transformer.zip).
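As a minimal sketch of the packaging step, the following Python function uses the standard `zipfile` module to place a model file and its tokenizer JSON file in one archive. The `package_model` helper and any file names you pass to it are hypothetical, and placing both files at the archive root is an assumption. Compare the result against the example model file linked above.

```python
import zipfile
from pathlib import Path


def package_model(model_path, tokenizer_path, zip_path):
    """Write the model file and the tokenizer JSON file into one zip archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store each file at the archive root under its own file name
        # (an assumption made for this sketch).
        archive.write(model_path, arcname=Path(model_path).name)
        archive.write(tokenizer_path, arcname=Path(tokenizer_path).name)
    return zip_path
```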

Additionally, you must calculate a SHA256 checksum for the model zip file that you'll need to provide when registering the model. For example, on UNIX, use the following command to obtain the checksum:

```bash
shasum -a 256 sentence-transformers_paraphrase-mpnet-base-v2-1.0.0-onnx.zip
```
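If `shasum` is not available, the same checksum can be computed with the Python standard library. The following sketch also returns the file size, because the register request in this guide includes both `model_content_hash_value` and `model_content_size_in_bytes`. The `model_file_metadata` helper name is hypothetical.

```python
import hashlib
import os


def model_file_metadata(zip_path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest and the size in bytes of a model zip file."""
    digest = hashlib.sha256()
    with open(zip_path, "rb") as f:
        # Read in blocks so that a large model file is never held in memory at once.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return {
        "model_content_hash_value": digest.hexdigest(),
        "model_content_size_in_bytes": os.path.getsize(zip_path),
    }
```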

### Model size

Most deep learning models are more than 100 MB, making it difficult to fit them into a single document. OpenSearch splits the model file into smaller chunks to be stored in a model index. When allocating ML or data nodes for your OpenSearch cluster, make sure you correctly size your ML nodes so that you have enough memory when making ML inferences.
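To get a feel for how chunking scales with model size, consider the following sketch. The 10,000,000-byte default chunk size is an illustrative assumption, not a value stated on this page, and both helper functions are hypothetical.

```python
def chunk_count(model_size_in_bytes, chunk_size_in_bytes=10_000_000):
    """Return how many chunks a model file of the given size is split into."""
    if model_size_in_bytes <= 0 or chunk_size_in_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division using integers only.
    return (model_size_in_bytes + chunk_size_in_bytes - 1) // chunk_size_in_bytes


def split_into_chunks(data, chunk_size_in_bytes):
    """Yield consecutive slices of data, each at most chunk_size_in_bytes long."""
    for start in range(0, len(data), chunk_size_in_bytes):
        yield data[start:start + chunk_size_in_bytes]
```

Under that assumed chunk size, the 266,352,827-byte model registered later in this guide would be stored as 27 chunks.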

## Prerequisites 

To upload a custom model to OpenSearch, you need to prepare it outside of your OpenSearch cluster. You can use a pretrained model, like one from [Hugging Face](https://huggingface.co/), or train a new model in accordance with your needs.

### Cluster settings

This example uses a simple setup with no dedicated ML nodes and allows running a model on a non-ML node. 

On clusters with dedicated ML nodes, specify `"only_run_on_ml_node": "true"` for improved performance. For more information, see [ML Commons cluster settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/).

To ensure that this basic local setup works, specify the following cluster settings:

```json
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "allow_registering_model_via_url": "true",
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99"
      }
    }
  }
}
```
{% include copy-curl.html %}

## Step 1: Register a model group

To register a model, you have the following options:

- You can use `model_group_id` to register a model version to an existing model group.
- If you do not use `model_group_id`, ML Commons creates a model with a new model group.

To register a model group, send the following request:

```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "local_model_group",
  "description": "A model group for local models"
}
```
{% include copy-curl.html %}

The response contains the model group ID that you'll use to register a model to this model group:

```json
{
 "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
 "status": "CREATED"
}
```

To learn more about model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).

## Step 2: Register a local model

To register a local model to the model group created in step 1, provide the model group ID from step 1 in the following request:

```json
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.1",
  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
  "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.",
  "model_task_type": "TEXT_EMBEDDING",
  "model_format": "TORCH_SCRIPT",
  "model_content_size_in_bytes": 266352827,
  "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413",
  "model_config": {
    "model_type": "distilbert",
    "embedding_dimension": 768,
    "framework_type": "sentence_transformers",
    "all_config": "{\"_name_or_path\":\"old_models/msmarco-distilbert-base-tas-b/0_Transformer\",\"activation\":\"gelu\",\"architectures\":[\"DistilBertModel\"],\"attention_dropout\":0.1,\"dim\":768,\"dropout\":0.1,\"hidden_dim\":3072,\"initializer_range\":0.02,\"max_position_embeddings\":512,\"model_type\":\"distilbert\",\"n_heads\":12,\"n_layers\":6,\"pad_token_id\":0,\"qa_dropout\":0.1,\"seq_classif_dropout\":0.2,\"sinusoidal_pos_embds\":false,\"tie_weights_\":true,\"transformers_version\":\"4.7.0\",\"vocab_size\":30522}"
  },
  "created_time": 1676073973126,
  "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/msmarco-distilbert-base-tas-b/1.0.1/torch_script/sentence-transformers_msmarco-distilbert-base-tas-b-1.0.1-torch_script.zip"
}
```
{% include copy-curl.html %}

Note that in OpenSearch Dashboards, wrapping the `all_config` field contents in triple quotes (`"""`) automatically escapes quotation marks within the field and provides better readability:

```json
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.1",
  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
  "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.",
  "model_task_type": "TEXT_EMBEDDING",
  "model_format": "TORCH_SCRIPT",
  "model_content_size_in_bytes": 266352827,
  "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413",
  "model_config": {
    "model_type": "distilbert",
    "embedding_dimension": 768,
    "framework_type": "sentence_transformers",
    "all_config": """{"_name_or_path":"old_models/msmarco-distilbert-base-tas-b/0_Transformer","activation":"gelu","architectures":["DistilBertModel"],"attention_dropout":0.1,"dim":768,"dropout":0.1,"hidden_dim":3072,"initializer_range":0.02,"max_position_embeddings":512,"model_type":"distilbert","n_heads":12,"n_layers":6,"pad_token_id":0,"qa_dropout":0.1,"seq_classif_dropout":0.2,"sinusoidal_pos_embds":false,"tie_weights_":true,"transformers_version":"4.7.0","vocab_size":30522}"""
  },
  "created_time": 1676073973126,
  "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/msmarco-distilbert-base-tas-b/1.0.1/torch_script/sentence-transformers_msmarco-distilbert-base-tas-b-1.0.1-torch_script.zip"
}
```
{% include copy.html %}
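Outside of OpenSearch Dashboards, you need the escaped form. One way to avoid escaping by hand is to serialize the inner configuration to a string before serializing the request body, as in the following Python sketch. The `build_model_config` helper is hypothetical, and the shortened `all_config` contents used with it are for illustration only.

```python
import json


def build_model_config(all_config):
    """Return a model_config object whose all_config field is a JSON string."""
    return {
        "model_type": "distilbert",
        "embedding_dimension": 768,
        "framework_type": "sentence_transformers",
        # all_config is a string that itself contains JSON, so serialize it first.
        "all_config": json.dumps(all_config, separators=(",", ":")),
    }


def build_request_body(all_config):
    """Serialize a partial register request body; inner quotes are escaped here."""
    return json.dumps({"model_config": build_model_config(all_config)}, indent=2)
```

Calling `build_request_body({"dim": 768})` produces a body in which the inner quotation marks appear as `\"`, matching the first request format shown above.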

For a description of Register API parameters, see [Register a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/).

OpenSearch returns the task ID of the register operation:

```json
{
  "task_id": "cVeMb4kBJ1eYAeTMFFgj",
  "status": "CREATED"
}
```

To check the status of the operation, provide the task ID to the [Get task API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/):

```bash
GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
```
{% include copy-curl.html %}

When the operation is complete, the state changes to `COMPLETED`:

```json
{
  "model_id": "cleMb4kBJ1eYAeTMFFg4",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "XPcXLV7RQoi5m8NI_jEOVQ"
  ],
  "create_time": 1689793598499,
  "last_update_time": 1689793598530,
  "is_async": false
}
```

Take note of the returned `model_id` because you’ll need it to deploy the model.
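Because registration is asynchronous, a script typically polls the task until it finishes. The following Python sketch shows one way to do so. Here, `get_task` is a hypothetical callable that you supply: it should send `GET /_plugins/_ml/tasks/<task_id>` with your HTTP client of choice and return the parsed JSON response. Treating `FAILED` as a terminal state with an `error` field is an assumption of this sketch.

```python
import time


def wait_for_task(get_task, task_id, interval_seconds=1.0, max_attempts=60):
    """Poll a task until it completes and return the model ID it reports."""
    for _ in range(max_attempts):
        task = get_task(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]
        if state == "FAILED":  # assumed terminal failure state
            raise RuntimeError(f"Task {task_id} failed: {task.get('error')}")
        time.sleep(interval_seconds)
    raise TimeoutError(f"Task {task_id} did not complete after {max_attempts} attempts")
```

The same helper can be reused for the deploy task in the next step because both operations report their progress through a task ID.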

## Step 3: Deploy the model

The deploy operation reads the model's chunks from the model index and then creates an instance of the model to load into memory. The bigger the model, the more chunks the model is split into and the longer it takes for the model to load into memory.

To deploy the registered model, provide its model ID from step 2 in the following request:

```bash
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy
```
{% include copy-curl.html %}

The response contains the task ID that you can use to check the status of the deploy operation:

```json
{
  "task_id": "vVePb4kBJ1eYAeTM7ljG",
  "status": "CREATED"
}
```

As in the previous step, check the status of the operation by calling the Tasks API:

```bash
GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG
```
{% include copy-curl.html %}

When the operation is complete, the state changes to `COMPLETED`:

```json
{
  "model_id": "cleMb4kBJ1eYAeTMFFg4",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "n-72khvBTBi3bnIIR8FTTw"
  ],
  "create_time": 1689793851077,
  "last_update_time": 1689793851101,
  "is_async": true
}
```

If a cluster or node is restarted, then you need to redeploy the model. To learn how to set up automatic redeployment, see [Enable auto redeploy]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/#enable-auto-redeploy).
{: .tip} 

## Step 4 (Optional): Test the model

Use the [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) to test the model.

For a text embedding model, send the following request:

```json
POST /_plugins/_ml/_predict/text_embedding/cleMb4kBJ1eYAeTMFFg4
{
  "text_docs":[ "today is sunny"],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}
```
{% include copy-curl.html %}

The response contains text embeddings for the provided sentence:

```json
{
  "inference_results" : [
    {
      "output" : [
        {
          "name" : "sentence_embedding",
          "data_type" : "FLOAT32",
          "shape" : [
            768
          ],
          "data" : [
            0.25517133,
            -0.28009856,
            0.48519906,
            ...
          ]
        }
      ]
    }
  ]
}
```

For a sparse encoding model, send the following request:

```json
POST /_plugins/_ml/_predict/sparse_encoding/cleMb4kBJ1eYAeTMFFg4
{
  "text_docs":[ "today is sunny"]
}
```
{% include copy-curl.html %}

The response contains the tokens and weights:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "output",
          "dataAsMap": {
            "response": [
              {
                "saturday": 0.48336542,
                "week": 0.1034762,
                "mood": 0.09698499,
                "sunshine": 0.5738209,
                "bright": 0.1756877,
                ...
              }
            ]
          }
        }
      ]
    }
  ]
}
```
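To inspect which tokens a sparse encoding model weights most heavily, you can sort the returned map by weight. The following Python sketch assumes that the parsed response nests the token map at `inference_results[0].output[0].dataAsMap.response[0]`, as in the preceding example. The `top_tokens` helper is hypothetical.

```python
def top_tokens(response, limit=3):
    """Return the highest-weighted (token, weight) pairs from a sparse encoding response."""
    sparse_vector = response["inference_results"][0]["output"][0]["dataAsMap"]["response"][0]
    return sorted(sparse_vector.items(), key=lambda item: item[1], reverse=True)[:limit]
```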

## Step 5: Use the model for search

To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).

## Cross-encoder models

Cross-encoder models support query reranking. 

To register a cross-encoder model, send a request in the following format. The `model_config` object is optional. For cross-encoder models, specify the `function_name` as `TEXT_SIMILARITY`. For example, the following request registers an `ms-marco-TinyBERT-L-2-v2` model:

```json
POST /_plugins/_ml/models/_register
{
    "name": "ms-marco-TinyBERT-L-2-v2",
    "version": "1.0.0",
    "function_name": "TEXT_SIMILARITY",
    "description": "test model",
    "model_format": "TORCH_SCRIPT",
    "model_group_id": "lN4AP40BKolAMNtR4KJ5",
    "model_content_hash_value": "90e39a926101d1a4e542aade0794319404689b12acfd5d7e65c03d91c668b5cf",
    "model_config": { 
        "model_type": "bert",
        "embedding_dimension": 1,
        "framework_type": "huggingface_transformers",
        "total_chunks":2,
        "all_config": "{\"total_chunks\":2}"
    },
    "url": "https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_similarity/TinyBERT-CE-torch_script.zip?raw=true"
}
```
{% include copy-curl.html %}

Then send a request to deploy the model:

```json
POST _plugins/_ml/models/<model_id>/_deploy
```
{% include copy-curl.html %}

To test a cross-encoder model, send the following request:

```json
POST _plugins/_ml/models/<model_id>/_predict
{
    "query_text": "today is sunny",
    "text_docs": [
        "how are you",
        "today is sunny",
        "today is july fifth",
        "it is winter"
    ]
}
```
{% include copy-curl.html %}

The model calculates the similarity score between `query_text` and each document in `text_docs` and returns one score per document, in the order in which the documents were provided in `text_docs`:

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -6.077798
          ],
          "byte_buffer": {
            "array": "Un3CwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            10.223609
          ],
          "byte_buffer": {
            "array": "55MjQQ==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -1.3987057
          ],
          "byte_buffer": {
            "array": "ygizvw==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            -4.5923924
          ],
          "byte_buffer": {
            "array": "4fSSwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    }
  ]
}
```

A higher document score means higher similarity. In the preceding response, documents are scored as follows against the query text `today is sunny`:

Document text | Score
:--- | :---
`how are you` | -6.077798
`today is sunny` | 10.223609
`today is july fifth` | -1.3987057
`it is winter` | -4.5923924

The document that contains the same text as the query is scored the highest, and the remaining documents are scored based on the text similarity.
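Because the scores are returned in the same order as `text_docs`, pairing them with the documents and sorting them reproduces the ranking. The following Python sketch assumes the response shape shown in the preceding example. The `rank_documents` helper is hypothetical.

```python
def rank_documents(text_docs, response):
    """Pair each document with its similarity score and sort from best to worst."""
    scores = [result["output"][0]["data"][0] for result in response["inference_results"]]
    if len(scores) != len(text_docs):
        raise ValueError("expected one score per document")
    return sorted(zip(text_docs, scores), key=lambda pair: pair[1], reverse=True)
```

For the preceding example, the first element of the returned list is the `today is sunny` document with a score of 10.223609.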

To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
-												Refactor ML section - local and remote models (#5609)

* Refactor ML section - local and remote models

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added command to calculate checksum

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add ONNX format to register API

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add sparse encoding predict example

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Refactor the API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Typo

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Implemented Vale comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add get connector API

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Reword heading

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Addressed tech review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
											
										
										
											2023-11-17 15:59:27 -05:00
+								---
 								layout: default
 								title: Custom models
 								parent: Using ML models within OpenSearch
-												Add an overview of search methods and pages for each search method (#5636)

* Restructuring TOC

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Resolve merge conflicts

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* More foundational rewrites of ML

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* TOC restructure

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Rename and rewrite search pages and add keyword search

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Small wording change

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Small wording change

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Updated response

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Small rewording

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Move neural search to top of vector search list

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Change terminology

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Reorganize search methods list

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Rename links

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* More link renames

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Implemented editorial comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
											
										
										
											2023-11-29 15:28:20 -05:00
+								grand_parent: Integrating ML models
-												Refactor ML section - local and remote models (#5609)

* Refactor ML section - local and remote models

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added command to calculate checksum

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add ONNX format to register API

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add sparse encoding predict example

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Refactor the API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Typo

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Implemented Vale comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add get connector API

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Reword heading

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Addressed tech review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
											
										
										
											2023-11-17 15:59:27 -05:00
+								nav_order: 120
 								---
 								# Custom local models
 								**Generally available 2.9**
 								{: .label .label-purple }
 								To use a custom model locally, you can upload it to the OpenSearch cluster.
 								## Model support
 								As of OpenSearch 2.6, OpenSearch supports local text embedding models.
 								As of OpenSearch 2.11, OpenSearch supports local sparse encoding models.
 								## Preparing a model
 								For both text embedding and sparse encoding models, you must provide a tokenizer JSON file within the model zip file.
 								For sparse encoding models, make sure your output format is `{"output":<sparse_vector>}` so that ML Commons can post-process the sparse vector.
 								If you fine-tune a sparse model on your own dataset, you may also want to use your own sparse tokenizer model. It is preferable to provide your own [IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) JSON file in the tokenizer model zip file because this increases query performance when you use the tokenizer model in the query. Alternatively, you can use an OpenSearch-provided generic [IDF from MSMARCO](https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.0/torch_script/opensearch-neural-sparse-tokenizer-v1-1.0.0.zip). If the IDF file is not provided, the default weight of each token is set to 1, which may influence sparse neural search performance.
 								### Model format
 								To use a model in OpenSearch, you'll need to export the model into a portable format. As of Version 2.5, OpenSearch only supports the [TorchScript](https://pytorch.org/docs/stable/jit.html) and [ONNX](https://onnx.ai/) formats.
 								You must save the model file as zip before uploading it to OpenSearch. To ensure that ML Commons can upload your model, compress your TorchScript file before uploading. For an example, download a TorchScript [model file](https://github.com/opensearch-project/ml-commons/blob/2.x/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_embedding/all-MiniLM-L6-v2_torchscript_sentence-transformer.zip).
 								Additionally, you must calculate a SHA256 checksum for the model zip file that you'll need to provide when registering the model. For example, on UNIX, use the following command to obtain the checksum:
 								```bash
 								shasum -a 256 sentence-transformers_paraphrase-mpnet-base-v2-1.0.0-onnx.zip
 								```
 								### Model size
 								Most deep learning models are more than 100 MB, making it difficult to fit them into a single document. OpenSearch splits the model file into smaller chunks to be stored in a model index. When allocating ML or data nodes for your OpenSearch cluster, make sure you correctly size your ML nodes so that you have enough memory when making ML inferences.
 								## Prerequisites
 								To upload a custom model to OpenSearch, you need to prepare it outside of your OpenSearch cluster. You can use a pretrained model, like one from [Hugging Face](https://huggingface.co/), or train a new model in accordance with your needs.
 								### Cluster settings
 								This example uses a simple setup with no dedicated ML nodes and allows running a model on a non-ML node.
 								On clusters with dedicated ML nodes, specify `"only_run_on_ml_node": "true"` for improved performance. For more information, see [ML Commons cluster settings]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/).
 								To ensure that this basic local setup works, specify the following cluster settings:
 								```json
 								PUT _cluster/settings
 								{
 								  "persistent": {
 								    "plugins": {
 								      "ml_commons": {
 								        "allow_registering_model_via_url": "true",
 								        "only_run_on_ml_node": "false",
 								        "model_access_control_enabled": "true",
 								        "native_memory_threshold": "99"
 								      }
 								    }
 								  }
 								}
 								```
 								{% include copy-curl.html %}
 								## Step 1: Register a model group
 								To register a model, you have the following options:
 								- You can use `model_group_id` to register a model version to an existing model group.
 								- If you do not use `model_group_id`, ML Commons creates a model with a new model group.
 								To register a model group, send the following request:
 								```json
 								POST /_plugins/_ml/model_groups/_register
 								{
 								  "name": "local_model_group",
 								  "description": "A model group for local models"
 								}
 								```
 								{% include copy-curl.html %}
 								The response contains the model group ID that you'll use to register a model to this model group:
 								```json
 								{
 								 "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
 								 "status": "CREATED"
 								}
 								```
 								To learn more about model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).
 								## Step 2: Register a local model
 								To register a remote model to the model group created in step 1, provide the model group ID from step 1 in the following request:
 								```json
 								POST /_plugins/_ml/models/_register
 								{
 								  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
 								  "version": "1.0.1",
 								  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
 								  "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.",
 								  "model_task_type": "TEXT_EMBEDDING",
 								  "model_format": "TORCH_SCRIPT",
 								  "model_content_size_in_bytes": 266352827,
 								  "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413",
 								  "model_config": {
 								    "model_type": "distilbert",
 								    "embedding_dimension": 768,
 								    "framework_type": "sentence_transformers",
-												Add two formats for fields with quotation marks in Dashboards (#5876)

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
											
										
										
											2023-12-14 12:01:14 -05:00
+								    "all_config": "{\"_name_or_path\":\"old_models/msmarco-distilbert-base-tas-b/0_Transformer\",\"activation\":\"gelu\",\"architectures\":[\"DistilBertModel\"],\"attention_dropout\":0.1,\"dim\":768,\"dropout\":0.1,\"hidden_dim\":3072,\"initializer_range\":0.02,\"max_position_embeddings\":512,\"model_type\":\"distilbert\",\"n_heads\":12,\"n_layers\":6,\"pad_token_id\":0,\"qa_dropout\":0.1,\"seq_classif_dropout\":0.2,\"sinusoidal_pos_embds\":false,\"tie_weights_\":true,\"transformers_version\":\"4.7.0\",\"vocab_size\":30522}"
-												Refactor ML section - local and remote models (#5609)

* Refactor ML section - local and remote models

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Added command to calculate checksum

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add ONNX format to register API

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add sparse encoding predict example

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Refactor the API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Typo

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Implemented Vale comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add get connector API

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Reword heading

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Addressed tech review comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
											
										
										
											2023-11-17 15:59:27 -05:00
+								  },
 								  "created_time": 1676073973126,
 								  "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/msmarco-distilbert-base-tas-b/1.0.1/torch_script/sentence-transformers_msmarco-distilbert-base-tas-b-1.0.1-torch_script.zip"
 								}
 								```
 								{% include copy-curl.html %}
-												Add two formats for fields with quotation marks in Dashboards (#5876)

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
											
										
										
											2023-12-14 12:01:14 -05:00
+								Note that in OpenSearch Dashboards, wrapping the `all_config` field contents in triple quotes (`"""`) automatically escapes quotation marks within the field and provides better readability:
 								```json
 								POST /_plugins/_ml/models/_register
 								{
 								  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
 								  "version": "1.0.1",
 								  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
 								  "description": "This is a port of the DistilBert TAS-B Model to sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space and is optimized for the task of semantic search.",
 								  "model_task_type": "TEXT_EMBEDDING",
 								  "model_format": "TORCH_SCRIPT",
 								  "model_content_size_in_bytes": 266352827,
 								  "model_content_hash_value": "acdc81b652b83121f914c5912ae27c0fca8fabf270e6f191ace6979a19830413",
 								  "model_config": {
 								    "model_type": "distilbert",
 								    "embedding_dimension": 768,
 								    "framework_type": "sentence_transformers",
 								    "all_config": """{"_name_or_path":"old_models/msmarco-distilbert-base-tas-b/0_Transformer","activation":"gelu","architectures":["DistilBertModel"],"attention_dropout":0.1,"dim":768,"dropout":0.1,"hidden_dim":3072,"initializer_range":0.02,"max_position_embeddings":512,"model_type":"distilbert","n_heads":12,"n_layers":6,"pad_token_id":0,"qa_dropout":0.1,"seq_classif_dropout":0.2,"sinusoidal_pos_embds":false,"tie_weights_":true,"transformers_version":"4.7.0","vocab_size":30522}"""
 								  },
 								  "created_time": 1676073973126,
 								  "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/msmarco-distilbert-base-tas-b/1.0.1/torch_script/sentence-transformers_msmarco-distilbert-base-tas-b-1.0.1-torch_script.zip"
 								}
 								```
 								{% include copy.html %}
 								For a description of Register API parameters, see [Register a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/).
 								OpenSearch returns the task ID of the register operation:
 								```json
 								{
 								  "task_id": "cVeMb4kBJ1eYAeTMFFgj",
 								  "status": "CREATED"
 								}
 								```
 								To check the status of the operation, provide the task ID to the [Get Task API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/):
 								```bash
 								GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
 								```
 								{% include copy-curl.html %}
 								When the operation is complete, the state changes to `COMPLETED`:
 								```json
 								{
 								  "model_id": "cleMb4kBJ1eYAeTMFFg4",
 								  "task_type": "REGISTER_MODEL",
 								  "function_name": "TEXT_EMBEDDING",
 								  "state": "COMPLETED",
 								  "worker_node": [
 								    "XPcXLV7RQoi5m8NI_jEOVQ"
 								  ],
 								  "create_time": 1689793598499,
 								  "last_update_time": 1689793598530,
 								  "is_async": false
 								}
 								```
 								Take note of the returned `model_id` because you’ll need it to deploy the model.
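
The polling described above can be scripted. The following is a minimal Python sketch, not an official client: `wait_for_task` is a hypothetical helper, and `get_task` is any function you supply that calls `GET /_plugins/_ml/tasks/<task_id>` and returns the parsed JSON response. Handling of a `FAILED` state and an `error` field is an assumption and is not shown in the sample responses on this page.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    get_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> str:
    """Poll a task until its state is COMPLETED and return the model ID.

    get_task is a caller-supplied function (hypothetical) that fetches
    GET /_plugins/_ml/tasks/<task_id> and returns the response as a dict.
    """
    for _ in range(max_polls):
        task = get_task(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]
        # Assumption: a failed task reports state FAILED with an error field.
        if state == "FAILED":
            raise RuntimeError(f"Task {task_id} failed: {task.get('error')}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Task {task_id} did not complete after {max_polls} polls")
```

Passing the fetch function in as an argument keeps the helper independent of any particular HTTP library, so you can reuse it for the deploy task in the next step.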
 								## Step 3: Deploy the model
 								The deploy operation reads the model's chunks from the model index and then creates an instance of the model to load into memory. The bigger the model, the more chunks it is split into and the longer it takes to load into memory.
 								To deploy the registered model, provide its model ID from step 2 in the following request:
 								```bash
 								POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy
 								```
 								{% include copy-curl.html %}
 								The response contains the task ID that you can use to check the status of the deploy operation:
 								```json
 								{
 								  "task_id": "vVePb4kBJ1eYAeTM7ljG",
 								  "status": "CREATED"
 								}
 								```
 								As in the previous step, check the status of the operation by calling the Tasks API:
 								```bash
 								GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG
 								```
 								{% include copy-curl.html %}
 								When the operation is complete, the state changes to `COMPLETED`:
 								```json
 								{
 								  "model_id": "cleMb4kBJ1eYAeTMFFg4",
 								  "task_type": "DEPLOY_MODEL",
 								  "function_name": "TEXT_EMBEDDING",
 								  "state": "COMPLETED",
 								  "worker_node": [
 								    "n-72khvBTBi3bnIIR8FTTw"
 								  ],
 								  "create_time": 1689793851077,
 								  "last_update_time": 1689793851101,
 								  "is_async": true
 								}
 								```
 								If a cluster or node is restarted, then you need to redeploy the model. To learn how to set up automatic redeployment, see [Enable auto redeploy]({{site.url}}{{site.baseurl}}/ml-commons-plugin/cluster-settings/#enable-auto-redeploy).
 								{: .tip}
 								## Step 4 (Optional): Test the model
 								Use the [Predict API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) to test the model.
 								For a text embedding model, send the following request:
 								```json
 								POST /_plugins/_ml/_predict/text_embedding/cleMb4kBJ1eYAeTMFFg4
 								{
 								  "text_docs":[ "today is sunny"],
 								  "return_number": true,
 								  "target_response": ["sentence_embedding"]
 								}
 								```
 								{% include copy-curl.html %}
 								The response contains text embeddings for the provided sentence:
 								```json
 								{
 								  "inference_results" : [
 								    {
 								      "output" : [
 								        {
 								          "name" : "sentence_embedding",
 								          "data_type" : "FLOAT32",
 								          "shape" : [
 								            768
 								          ],
 								          "data" : [
 								            0.25517133,
 								            -0.28009856,
 								            0.48519906,
 								            ...
 								          ]
 								        }
 								      ]
 								    }
 								  ]
 								}
 								```
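
To use the embeddings in application code, you need to pull the `data` arrays out of the nested response. The following Python sketch shows one way to do that, assuming the response has already been parsed into a dictionary shaped like the preceding example. The function name `extract_embeddings` is hypothetical.

```python
from typing import Any, Dict, List


def extract_embeddings(
    response: Dict[str, Any], name: str = "sentence_embedding"
) -> List[List[float]]:
    """Collect the data array of every output tensor with the given name.

    One embedding is returned per entry in inference_results, in order.
    """
    embeddings: List[List[float]] = []
    for result in response.get("inference_results", []):
        for tensor in result.get("output", []):
            if tensor.get("name") == name:
                embeddings.append(tensor["data"])
    return embeddings
```

For the model registered in this tutorial, each returned list should contain 768 values, matching the `embedding_dimension` in the model configuration.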
 								For a sparse encoding model, send the following request:
 								```json
 								POST /_plugins/_ml/_predict/sparse_encoding/cleMb4kBJ1eYAeTMFFg4
 								{
 								  "text_docs":[ "today is sunny"]
 								}
 								```
 								{% include copy-curl.html %}
 								The response contains the tokens and weights:
 								```json
 								{
 								  "inference_results": [
 								    {
 								      "output": [
 								        {
 								          "name": "output",
 								          "dataAsMap": {
 								            "response": [
 								              {
 								                "saturday": 0.48336542,
 								                "week": 0.1034762,
 								                "mood": 0.09698499,
 								                "sunshine": 0.5738209,
 								                "bright": 0.1756877,
 								                ...
 								              }
 								            ]
 								          }
 								        }
 								      ]
 								    }
 								  ]
 								}
 								```
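
When checking a sparse encoding model, it can help to see which tokens carry the most weight. The following Python sketch sorts the token weights for each input document, assuming the response has been parsed into a dictionary with the `dataAsMap.response` layout shown in the preceding example. The function name `top_tokens` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def top_tokens(response: Dict[str, Any], k: int = 3) -> List[List[Tuple[str, float]]]:
    """Return the k highest-weighted (token, weight) pairs per document."""
    ranked: List[List[Tuple[str, float]]] = []
    for result in response["inference_results"]:
        for tensor in result["output"]:
            # Each entry of "response" maps tokens to weights for one document.
            for token_weights in tensor["dataAsMap"]["response"]:
                pairs = sorted(
                    token_weights.items(), key=lambda kv: kv[1], reverse=True
                )
                ranked.append(pairs[:k])
    return ranked
```

Applied to the five tokens visible in the sample response, the three highest-weighted tokens are `sunshine`, `saturday`, and `bright`.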
 								## Step 5: Use the model for search
 								To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
 								## Cross-encoder models
 								Cross-encoder models support query reranking.
 								To register a cross-encoder model, send a request in the following format. The `model_config` object is optional. For cross-encoder models, specify the `function_name` as `TEXT_SIMILARITY`. For example, the following request registers an `ms-marco-TinyBERT-L-2-v2` model:
 								```json
 								POST /_plugins/_ml/models/_register
 								{
 								    "name": "ms-marco-TinyBERT-L-2-v2",
 								    "version": "1.0.0",
 								    "function_name": "TEXT_SIMILARITY",
 								    "description": "test model",
 								    "model_format": "TORCH_SCRIPT",
 								    "model_group_id": "lN4AP40BKolAMNtR4KJ5",
 								    "model_content_hash_value": "90e39a926101d1a4e542aade0794319404689b12acfd5d7e65c03d91c668b5cf",
 								    "model_config": {
 								        "model_type": "bert",
 								        "embedding_dimension": 1,
 								        "framework_type": "huggingface_transformers",
 								        "total_chunks":2,
 								        "all_config": "{\"total_chunks\":2}"
 								    },
 								    "url": "https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_similarity/TinyBERT-CE-torch_script.zip?raw=true"
 								}
 								```
 								{% include copy-curl.html %}
 								Then send a request to deploy the model:
 								```json
 								POST _plugins/_ml/models/<model_id>/_deploy
 								```
 								{% include copy-curl.html %}
 								To test a cross-encoder model, send the following request:
 								```json
 								POST _plugins/_ml/models/<model_id>/_predict
 								{
 								    "query_text": "today is sunny",
 								    "text_docs": [
 								        "how are you",
 								        "today is sunny",
 								        "today is july fifth",
 								        "it is winter"
 								    ]
 								}
 								```
 								{% include copy-curl.html %}
 								The model calculates the similarity score between `query_text` and each document in `text_docs` and returns one score per document, in the order in which the documents were provided in `text_docs`:
 								```json
 								{
 								  "inference_results": [
 								    {
 								      "output": [
 								        {
 								          "name": "similarity",
 								          "data_type": "FLOAT32",
 								          "shape": [
 								            1
 								          ],
 								          "data": [
 								            -6.077798
 								          ],
 								          "byte_buffer": {
 								            "array": "Un3CwA==",
 								            "order": "LITTLE_ENDIAN"
 								          }
 								        }
 								      ]
 								    },
 								    {
 								      "output": [
 								        {
 								          "name": "similarity",
 								          "data_type": "FLOAT32",
 								          "shape": [
 								            1
 								          ],
 								          "data": [
 								            10.223609
 								          ],
 								          "byte_buffer": {
 								            "array": "55MjQQ==",
 								            "order": "LITTLE_ENDIAN"
 								          }
 								        }
 								      ]
 								    },
 								    {
 								      "output": [
 								        {
 								          "name": "similarity",
 								          "data_type": "FLOAT32",
 								          "shape": [
 								            1
 								          ],
 								          "data": [
 								            -1.3987057
 								          ],
 								          "byte_buffer": {
 								            "array": "ygizvw==",
 								            "order": "LITTLE_ENDIAN"
 								          }
 								        }
 								      ]
 								    },
 								    {
 								      "output": [
 								        {
 								          "name": "similarity",
 								          "data_type": "FLOAT32",
 								          "shape": [
 								            1
 								          ],
 								          "data": [
 								            -4.5923924
 								          ],
 								          "byte_buffer": {
 								            "array": "4fSSwA==",
 								            "order": "LITTLE_ENDIAN"
 								          }
 								        }
 								      ]
 								    }
 								  ]
 								}
 								```
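
In the sample response, each `byte_buffer.array` value appears to be the same score encoded as a Base64 string of one little-endian 32-bit float. This is an inference from the sample values and the `LITTLE_ENDIAN` order field, not documented behavior. Under that assumption, the following Python sketch decodes the buffer. The function name `decode_score` is hypothetical.

```python
import base64
import struct


def decode_score(array_b64: str) -> float:
    """Decode a Base64 string holding one little-endian float32 value.

    Assumption: the buffer contains exactly 4 bytes (a single FLOAT32),
    as in the sample response, where each output holds one score.
    """
    raw = base64.b64decode(array_b64)
    if len(raw) != 4:
        raise ValueError(f"Expected 4 bytes, got {len(raw)}")
    # "<f" means little-endian, 32-bit IEEE 754 float.
    return struct.unpack("<f", raw)[0]
```

In most cases you can read the score directly from the `data` array and ignore `byte_buffer`.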
 								A higher document score means higher similarity. In the preceding response, documents are scored as follows against the query text `today is sunny`:
 								Document text | Score
 								:--- | :---
 								`how are you` | -6.077798
 								`today is sunny` | 10.223609
 								`today is july fifth` | -1.3987057
 								`it is winter` | -4.5923924
 								The document that contains the same text as the query is scored the highest, and the remaining documents are scored based on the text similarity.
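
Because scores are returned in the same order as `text_docs`, you can pair them positionally and sort to obtain a ranking. The following Python sketch illustrates this, assuming the response has been parsed into a dictionary shaped like the preceding example, with one score per `inference_results` entry. The function name `rank_documents` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def rank_documents(
    text_docs: List[str], response: Dict[str, Any]
) -> List[Tuple[str, float]]:
    """Pair each document with its similarity score, highest score first."""
    # Each inference result holds a single "similarity" output with one value.
    scores = [
        result["output"][0]["data"][0]
        for result in response["inference_results"]
    ]
    if len(scores) != len(text_docs):
        raise ValueError("Number of scores does not match number of documents")
    return sorted(zip(text_docs, scores), key=lambda pair: pair[1], reverse=True)
```

For the four documents in this example, the resulting order is `today is sunny`, `today is july fifth`, `it is winter`, and `how are you`.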
 								To learn how to use the model for reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).