kolchfa-aws a97c719591
Add multimodal search/sparse search/pre- and post-processing function documentation (#5168)
* Add multimodal search documentation
* Text image embedding processor
* Add prerequisite
* Change query text
* Added bedrock connector tutorial and renamed ML TOC
* Name changes and rewording
* Change connector link
* Change link
* Implemented tech review comments
* Link fix and field name fix
* Add default text embedding preprocessing and post-processing functions
* Add sparse search documentation
* Fix links
* Pre/post processing function tech review comments
* Fix link
* Sparse search tech review comments
* Apply suggestions from code review
* Implemented doc review comments
* Add actual test sparse pipeline response
* Added tested examples
* Added model choice for sparse search
* Remove Bedrock connector
* Implemented tech review feedback
* Add that the model must be deployed to neural search
* Apply suggestions from code review
* Link fix
* Add session token to sagemaker blueprint
* Formatted bullet points the same way
* Specified both model types in neural sparse query
* Added more explanation for default pre/post-processing functions
* Remove framework and extensibility references
* Minor rewording

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
2023-10-16 10:45:35 -04:00

layout: default
title: Connecting to remote models
has_children: true
has_toc: false
nav_order: 60

Connecting to remote models

Machine learning (ML) extensibility enables ML developers to create integrations with other ML services, such as Amazon SageMaker or OpenAI. These integrations provide system administrators and data scientists with the ability to run ML workloads outside of their OpenSearch cluster.

To get started with ML extensibility, choose from the following options:

  • If you're an ML developer wanting to integrate with your specific ML services, see Connector blueprints.
  • If you're a system administrator or data scientist wanting to create a connection to an ML service, see Connectors.

Prerequisites

If you're an admin deploying an ML connector, make sure that the target model of the connector has already been deployed on your chosen platform. Furthermore, make sure that you have permissions to send data to and receive data from the third-party API for your connector.

When access control is enabled on your third-party platform, you can enter your security settings using the authorization or credential settings inside the connector API.

Adding trusted endpoints

To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings by using the plugins.ml_commons.trusted_connector_endpoints_regex setting, which supports Java regular expressions:

PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.trusted_connector_endpoints_regex": [
          "^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
          "^https://api\\.openai\\.com/.*$",
          "^https://api\\.cohere\\.ai/.*$"
        ]
    }
}

{% include copy-curl.html %}
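Before applying the setting, it can help to check which URLs a set of patterns would admit. The following Python sketch is only an approximation: the setting uses Java regular expressions, whereas this uses Python's `re` module, and it assumes that the whole URL must match a pattern. The `is_trusted` helper is hypothetical and is not part of ML Commons. The patterns are the ones from the request above:

```python
import json
import re

# Patterns copied from the cluster settings request above. Parsing them as
# JSON turns each "\\." into the regex escape "\.".
SETTINGS = json.loads(r'''
{
  "plugins.ml_commons.trusted_connector_endpoints_regex": [
    "^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
    "^https://api\\.openai\\.com/.*$",
    "^https://api\\.cohere\\.ai/.*$"
  ]
}
''')

TRUSTED = [
    re.compile(pattern)
    for pattern in SETTINGS["plugins.ml_commons.trusted_connector_endpoints_regex"]
]


def is_trusted(url: str) -> bool:
    """Return True if the whole URL matches at least one trusted pattern.

    Hypothetical helper for local checks only; Java and Python regex
    dialects differ, although these patterns use syntax common to both.
    """
    return any(pattern.fullmatch(url) for pattern in TRUSTED)


if __name__ == "__main__":
    for candidate in (
        "https://api.openai.com/v1/chat/completions",
        "https://api.openai.com.example.org/v1/chat/completions",
    ):
        print(candidate, is_trusted(candidate))
```

The second URL in the usage example is rejected because the pattern requires a `/` immediately after `api.openai.com`.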

Setting up connector access control

If you plan on using a remote connector, make sure to use an OpenSearch cluster with the Security plugin enabled. Using the Security plugin gives you access to connector access control, which is required when using a remote connector. {: .warning}

If you require granular access control for your connectors, use the following cluster setting:

PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.connector_access_control_enabled": true
    }
}

{% include copy-curl.html %}

With the Security plugin installed and access control enabled, one of the backend_roles, add_all_backend_roles, or access_mode options is required in order to use the connector API. If the cluster settings request is successful, OpenSearch returns the following response:

{
  "acknowledged": true,
  "persistent": {
    "plugins": {
      "ml_commons": {
        "connector_access_control_enabled": "true"
      }
    }
  },
  "transient": {}
}

Node settings

Remote models run on an external platform, so they consume fewer cluster resources than models hosted within OpenSearch. Therefore, you can deploy any model from a standalone connector using data nodes. To make sure that your standalone connection uses data nodes, set plugins.ml_commons.only_run_on_ml_node to false:

PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.only_run_on_ml_node": false
    }
}

{% include copy-curl.html %}

Step 1: Register a model group

To register a model, you have the following options:

  • You can use model_group_id to register a model version to an existing model group.
  • If you do not use model_group_id, ML Commons creates a model with a new model group.

To register a model group, send the following request:

POST /_plugins/_ml/model_groups/_register
{
  "name": "remote_model_group",
  "description": "A model group for remote models"
}

{% include copy-curl.html %}

The response contains the model group ID that you'll use to register a model to this model group:

{
  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
  "status": "CREATED"
}

To learn more about model groups, see Model access control.

Step 2: Create a connector

You can create a standalone connector or an internal connector as part of a specific model. For more information about connectors and connector examples, see Connectors.

The Connectors Create API, /_plugins/_ml/connectors/_create, creates connectors that facilitate registering and deploying external models in OpenSearch. Using the endpoint parameter, you can connect ML Commons to any supported ML tool by using its specific API endpoint. For example, you can connect to a ChatGPT model by using the api.openai.com endpoint:

POST /_plugins/_ml/connectors/_create
{
    "name": "OpenAI Chat Connector",
    "description": "The connector to public OpenAI model service for GPT 3.5",
    "version": 1,
    "protocol": "http",
    "parameters": {
        "endpoint": "api.openai.com",
        "model": "gpt-3.5-turbo"
    },
    "credential": {
        "openAI_key": "..."
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://${parameters.endpoint}/v1/chat/completions",
            "headers": {
                "Authorization": "Bearer ${credential.openAI_key}"
            },
            "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
        }
    ]
}

{% include copy-curl.html %}

The response contains the connector ID for the newly created connector:

{
  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
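The connector's `url`, `headers`, and `request_body` fields contain `${parameters.*}` and `${credential.*}` placeholders that are filled in when a prediction request is sent. The Python sketch below illustrates that substitution for the connector above. It is not the ML Commons implementation: the `fill` helper and the `EXAMPLE_KEY` value are hypothetical, and serializing non-string values (such as `messages`) to JSON is an assumption made for illustration.

```python
import json
import re

# Matches ${parameters.name} and ${credential.name} placeholders.
PLACEHOLDER = re.compile(r"\$\{(parameters|credential)\.(\w+)\}")


def fill(template: str, parameters: dict, credential: dict) -> str:
    """Substitute placeholders in a connector template (illustrative only)."""
    sources = {"parameters": parameters, "credential": credential}

    def replace(match: re.Match) -> str:
        value = sources[match.group(1)][match.group(2)]
        # Strings are inserted as-is; other values (for example, the
        # messages array) are serialized to JSON. This is an assumption.
        return value if isinstance(value, str) else json.dumps(value)

    return PLACEHOLDER.sub(replace, template)


# Connector-level parameters merged with the parameters of a predict request.
parameters = {
    "endpoint": "api.openai.com",
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
credential = {"openAI_key": "EXAMPLE_KEY"}  # hypothetical placeholder value

url = fill("https://${parameters.endpoint}/v1/chat/completions", parameters, credential)
headers = {"Authorization": fill("Bearer ${credential.openAI_key}", parameters, credential)}
body = json.loads(
    fill(
        '{ "model": "${parameters.model}", "messages": ${parameters.messages} }',
        parameters,
        credential,
    )
)

if __name__ == "__main__":
    print(url)
    print(json.dumps(body, indent=2))
```

Parsing the filled-in `request_body` with `json.loads` is a quick way to confirm that a template produces valid JSON before you create the connector.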

Step 3: Register a remote model

To register a remote model to the model group created in step 1, provide the model group ID from step 1 and the connector ID from step 2 in the following request:

POST /_plugins/_ml/models/_register
{
    "name": "openAI-gpt-3.5-turbo",
    "function_name": "remote",
    "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
    "description": "test model",
    "connector_id": "a1eMb4kBJ1eYAeTMAljY"
}

{% include copy-curl.html %}

OpenSearch returns the task ID of the register operation:

{
  "task_id": "cVeMb4kBJ1eYAeTMFFgj",
  "status": "CREATED"
}

To check the status of the operation, provide the task ID to the Tasks API:

GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj

{% include copy-curl.html %}

When the operation is complete, the state changes to COMPLETED and the response contains the model ID of the registered model:

{
  "model_id": "cleMb4kBJ1eYAeTMFFg4",
  "task_type": "REGISTER_MODEL",
  "function_name": "REMOTE",
  "state": "COMPLETED",
  "worker_node": [
    "XPcXLV7RQoi5m8NI_jEOVQ"
  ],
  "create_time": 1689793598499,
  "last_update_time": 1689793598530,
  "is_async": false
}
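Because register and deploy operations report their progress through the Tasks API, a client typically polls the task until it finishes. The following Python sketch shows one way to do that. The `wait_for_task` helper is hypothetical, the `fetch_task` callable stands in for an HTTP client calling `GET /_plugins/_ml/tasks/<task_id>`, and treating `FAILED` as the error state is an assumption. The demo uses canned responses instead of a live cluster:

```python
import time
from typing import Callable, Dict


def wait_for_task(
    fetch_task: Callable[[], Dict],
    interval: float = 1.0,
    max_polls: int = 30,
) -> Dict:
    """Poll a task until its state is COMPLETED and return the final response.

    fetch_task is any callable that returns the parsed JSON body of
    GET /_plugins/_ml/tasks/<task_id>; supplying it is left to the caller.
    """
    for _ in range(max_polls):
        task = fetch_task()
        state = task.get("state")
        if state == "COMPLETED":
            return task
        if state == "FAILED":  # assumed name of the error state
            raise RuntimeError(f"task failed: {task}")
        time.sleep(interval)
    raise TimeoutError(f"task not completed after {max_polls} polls")


# Canned responses that imitate a register task progressing to completion.
responses = iter(
    [
        {"task_type": "REGISTER_MODEL", "state": "CREATED"},
        {"task_type": "REGISTER_MODEL", "state": "RUNNING"},
        {
            "model_id": "cleMb4kBJ1eYAeTMFFg4",
            "task_type": "REGISTER_MODEL",
            "state": "COMPLETED",
        },
    ]
)
task = wait_for_task(lambda: next(responses), interval=0)

if __name__ == "__main__":
    print(task["model_id"])
```

The same helper can be used for the deploy operation, because both operations report their state in the same field.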

Step 4: Deploy the remote model

To deploy the registered model, provide its model ID from step 3 in the following request:

POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy

{% include copy-curl.html %}

The response contains the task ID that you can use to check the status of the deploy operation:

{
  "task_id": "vVePb4kBJ1eYAeTM7ljG",
  "status": "CREATED"
}

As in the previous step, check the status of the operation by calling the Tasks API:

GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG

{% include copy-curl.html %}

When the operation is complete, the state changes to COMPLETED:

{
  "model_id": "cleMb4kBJ1eYAeTMFFg4",
  "task_type": "DEPLOY_MODEL",
  "function_name": "REMOTE",
  "state": "COMPLETED",
  "worker_node": [
    "n-72khvBTBi3bnIIR8FTTw"
  ],
  "create_time": 1689793851077,
  "last_update_time": 1689793851101,
  "is_async": true
}

Step 5: Make predictions

Use the Predict API to make predictions:

POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
{
  "parameters": {
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }
}

{% include copy-curl.html %}

To learn more about chat functionality within OpenAI, see the OpenAI Chat API.

The response contains the inference results provided by the OpenAI model:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "id": "chatcmpl-7e6s5DYEutmM677UZokF9eH40dIY7",
            "object": "chat.completion",
            "created": 1689793889,
            "model": "gpt-3.5-turbo-0613",
            "choices": [
              {
                "index": 0,
                "message": {
                  "role": "assistant",
                  "content": "Hello! How can I assist you today?"
                },
                "finish_reason": "stop"
              }
            ],
            "usage": {
              "prompt_tokens": 19,
              "completion_tokens": 9,
              "total_tokens": 28
            }
          }
        }
      ]
    }
  ]
}
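The model's reply is nested several levels deep in the response. As a minimal sketch, assuming the response has the shape shown above, the following Python function collects the assistant messages. The `assistant_replies` helper is hypothetical, and the sample response is a trimmed copy of the preceding example:

```python
import json


def assistant_replies(predict_response: dict) -> list:
    """Collect the message content of every choice in a Predict API response."""
    replies = []
    for result in predict_response.get("inference_results", []):
        for output in result.get("output", []):
            # For this connector, dataAsMap holds the OpenAI chat completion.
            for choice in output.get("dataAsMap", {}).get("choices", []):
                replies.append(choice["message"]["content"])
    return replies


# Trimmed copy of the response shown above.
sample = json.loads("""
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "model": "gpt-3.5-turbo-0613",
            "choices": [
              {
                "index": 0,
                "message": {
                  "role": "assistant",
                  "content": "Hello! How can I assist you today?"
                },
                "finish_reason": "stop"
              }
            ]
          }
        }
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    print(assistant_replies(sample))
```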

Next steps