Add Connectors and ML updates for 2.9 (#4554)

* Add Connectors and ML updates for 2.9

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix code block

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add Connectors and ML updates for 2.9

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix code block

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add connector settings and examples

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add GA warning

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add final experimental warning

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Address tech review. Fix typos

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix bad link. Add next steps section

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix typo

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update cluster-settings.md

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Update _ml-commons-plugin/connectors.md

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Change cluster values for boolean. Fix typo.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Fix cluster settings

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add missing config options. More technical feedback.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Adjust cluster setting description.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add updated ChatGPT example

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Add info and example for internal connector.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* One last adjustment.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Fix dead link

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix one last comment.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Change ordered list to numbered.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
Naarcha-AWS 2023-07-19 16:35:46 -07:00 committed by GitHub
parent 7aee5fc916
commit 95d117ffb0
7 changed files with 502 additions and 26 deletions


@@ -45,7 +45,7 @@ Use lowercase when referring to features, unless you are referring to a formally
* “*Remote-backed storage* is an experimental feature. Therefore, we do not recommend the use of *remote-backed storage* in a production environment.”
* “You can take and restore *snapshots* using the snapshot API.”
* “You can use the *VisBuilder* visualization type in OpenSearch Dashboards to create data visualizations by using a drag-and-drop gesture.” (You can refer to VisBuilder alone or qualify the term with “visualization type”.)
* “As of OpenSearch 2.4, the *model-serving framework* only supports text embedding models without GPU acceleration.”
* “As of OpenSearch 2.4, the *ML framework* only supports text-embedding models without GPU acceleration.”
#### Plugin names


@@ -148,7 +148,7 @@ plugins.ml_commons.allow_registering_model_via_url: false
### Values
- Default value: false
- Value range: [false, true]
- Valid values: `false`, `true`
## Register models using local files
@@ -163,7 +163,7 @@ plugins.ml_commons.allow_registering_model_via_local_file: false
### Values
- Default value: false
- Value range: [false, true]
- Valid values: `false`, `true`
## Add trusted URL
@@ -230,7 +230,7 @@ plugins.ml_commons.allow_custom_deployment_plan: false
### Values
- Default value: false
- Value range: [false, true]
- Valid values: `false`, `true`
## Enable auto redeploy
@@ -245,7 +245,7 @@ plugins.ml_commons.model_auto_redeploy.enable: false
### Values
- Default value: false
- Value range: [false, true]
- Valid values: `false`, `true`
## Set retries for auto redeploy
@@ -290,7 +290,22 @@ plugins.ml_commons.enable_inhouse_python_model: false
### Values
- Default value: false
- Value range: [false, true]
- Valid values: `false`, `true`
## Enable access control for connectors
When set to `true`, this setting allows admins to control access and permissions for the connector API using `backend_roles`.
### Setting
```
plugins.ml_commons.connector_access_control_enabled: true
```
### Values
- Default value: false
- Valid values: `false`, `true`
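
You can also enable this setting on a running cluster by using the cluster settings API, as shown in the following example:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.connector_access_control_enabled": true
  }
}
```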


@@ -0,0 +1,461 @@
---
layout: default
title: Connecting to other ML platforms
has_children: false
nav_order: 60
---
# Connecting to third-party ML platforms
Machine learning (ML) connectors provide the ability to integrate OpenSearch ML capabilities with third-party ML tools and platforms. Through connectors, OpenSearch can invoke these third-party endpoints to enrich query results and data pipelines.
## Supported connectors
As of OpenSearch 2.9, connectors have been tested for the following ML tools, though it is possible to create connectors for other tools not listed here:
- [Amazon SageMaker](https://aws.amazon.com/sagemaker/) allows you to host and manage the lifecycle of text-embedding models, powering semantic search queries in OpenSearch. When connected, Amazon SageMaker hosts your models and OpenSearch is used to query inferences. This benefits Amazon SageMaker users who value its functionality, such as model monitoring, serverless hosting, and workflow automation for continuous training and deployment.
- [ChatGPT](https://openai.com/blog/chatgpt) enables you to run OpenSearch queries while invoking the ChatGPT API, helping you build on OpenSearch faster and improving the data retrieval speed for OpenSearch search functionality.
Additional connectors will be added to this page as they are tested and verified.
## Prerequisites
If you are an admin deploying an ML connector, make sure that the target model of the connector has already been deployed on your chosen platform. Also make sure that you have permissions to send data to and receive data from the third-party API used by your connector.
When access control is enabled on your third-party platform, you can enter your security settings using the `authorization` or `credential` settings inside the connector API.
### Adding trusted endpoints
To configure connectors in OpenSearch, add the trusted endpoints to your cluster settings using the `plugins.ml_commons.trusted_connector_endpoints_regex` setting, which accepts Java regular expressions, as shown in the following example:
```json
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.trusted_connector_endpoints_regex": [
"^https://runtime\\.sagemaker\\..*\\.amazonaws\\.com/.*$",
"^https://api\\.openai\\.com/.*$",
"^https://api\\.cohere\\.ai/.*$",
"^https://bedrock\\..*\\.amazonaws.com/.*$"
]
}
}
```
{% include copy-curl.html %}
### Enabling ML nodes
Most connectors require the use of dedicated ML nodes. To make sure you have ML nodes enabled, update the following cluster settings:
```json
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.only_run_on_ml_node": true,
}
}
```
{% include copy-curl.html %}
If you are running remote inference or a local model, you can set `plugins.ml_commons.only_run_on_ml_node` to `false` and use data nodes instead, as shown in the following example.
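For example, the following request allows models to run on data nodes by disabling the ML node requirement:

```json
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": false
  }
}
```
{% include copy-curl.html %}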
### Setting up connector access control
To enable access control on the connector API, use the following cluster setting:
```json
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.connector_access_control_enabled": true
}
}
```
{% include copy-curl.html %}
When enabled, the `backend_roles`, `add_all_backend_roles`, or `access_mode` options are required in order to use the connector API. If the setting update succeeds, OpenSearch returns the following response:
```json
{
"acknowledged": true,
"persistent": {
"plugins": {
"ml_commons": {
"connector_access_control_enabled": "true"
}
}
},
"transient": {}
}
```
## Creating a connector
You can build connectors in two ways:
1. A **standalone connector**, saved in a connector index, can be reused and shared across multiple remote models but requires access to both the model and the third-party platform accessed by the connector, such as OpenAI.
2. An **internal connector**, saved in the model index, can only be used with one remote model. Unlike a standalone connector, users only need access to the model itself to use an internal connector because the connection is established inside the model.
## Configuration options
The following configuration options are used when creating a connector. These settings can be used for both standalone and internal connectors.
| Field | Data type | Description |
| :--- | :--- | :--- |
| `name` | String | The name of the connector. |
| `description` | String | A description of the connector. |
| `version` | Integer | The version of the connector. |
| `protocol` | String | The protocol for the connection. For AWS services such as Amazon SageMaker and Amazon Bedrock, use `aws_sigv4`. For all other services, use `http`. |
| `parameters` | JSON object | The default connector parameters, including `endpoint` and `model`. |
| `credential` | JSON object | Defines any credential variables required to connect to your chosen endpoint. ML Commons uses **AES/GCM/NoPadding** symmetric encryption with a key length of 32 bytes. When the cluster first starts, the key persists in OpenSearch, so you do not need to encrypt the credentials manually. |
| `actions` | JSON array | Tells the connector which actions to run after a connection to ML Commons has been established. |
| `backend_roles` | List | A list of OpenSearch backend roles. For more information about setting up backend roles, see [Assigning backend roles to users]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#assigning-backend-roles-to-users). |
| `access_mode` | String | Sets the access mode for the model: `public`, `restricted`, or `private`. Default is `private`. For more information about `access_mode`, see [Model groups]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control#model-groups). |
| `add_all_backend_roles` | Boolean | When set to `true`, adds all `backend_roles` to the access list, which only a user with admin permissions can adjust. When set to `false`, non-admins can add `backend_roles`. |
When creating a connector, the `actions` setting tells the connector which ML Commons API operation to run against the connection endpoint. You can configure actions using the following settings.
| Field | Data type | Description |
| :--- | :--- | :--- |
| `action_type` | String | Required. Sets the ML Commons API operation to use upon connection. As of OpenSearch 2.9, only `predict` is supported. |
| `method` | String | Required. Defines the HTTP method for the API call. Supports `POST` and `GET`. |
| `url` | String | Required. Sets the connection endpoint at which the action takes place. This must match the regular expression for the connection used when [adding trusted endpoints](#adding-trusted-endpoints). |
| `headers` | String | Sets the headers used inside the request or response body. Default is `application/json`. |
| `request_body` | String | Required. Sets the parameters contained inside the request body of the action. |
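When connector access control is enabled, you can include the access control options from the first table directly in the connector creation request. The following sketch restricts a connector to specific backend roles; the role names are hypothetical, and the remaining fields mirror the OpenAI example in the next section:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Chat Connector with access control",
  "description": "Connector restricted to specific backend roles",
  "version": 1,
  "protocol": "http",
  "backend_roles": ["ml_team", "connector_admins"],
  "access_mode": "restricted",
  "parameters": {
    "endpoint": "api.openai.com",
    "model": "gpt-3.5-turbo"
  },
  "credential": {
    "openAI_key": "..."
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
    }
  ]
}
```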
### Standalone connector
The connector creation API, `/_plugins/_ml/connectors/_create`, creates connections to third-party ML tools. Using the `endpoint` parameter, you can connect ML Commons to any supported ML tool using its specific API endpoint. For example, to connect to a ChatGPT model, you can connect using the `api.openai.com` endpoint, as shown in the following example:
```json
POST /_plugins/_ml/connectors/_create
{
"name": "OpenAI Chat Connector",
"description": "The connector to public OpenAI model service for GPT 3.5",
"version": 1,
"protocol": "http",
"parameters": {
"endpoint": "api.openai.com",
"model": "gpt-3.5-turbo"
},
"credential": {
"openAI_key": "..."
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
}
]
}
```
{% include copy-curl.html %}
If successful, the connector API returns the `connector_id` for the connection:
```json
{
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
After a connection has been created, use the `connector_id` from the response to register and deploy a connected model.
To register a model, you have the following options:
- You can use `model_group_id` to register a model version to an existing model group.
- If you do not use `model_group_id`, ML Commons creates a model with a new model group.
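For example, the following sketch registers a model without specifying `model_group_id`, in which case ML Commons creates a new model group automatically (the `connector_id` is the one returned when the connector was created):

```json
POST /_plugins/_ml/models/_register
{
  "name": "openAI-gpt-3.5-turbo",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```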
The following example registers a model named `openAI-gpt-3.5-turbo` to an existing model group:
```json
POST /_plugins/_ml/models/_register
{
"name": "openAI-gpt-3.5-turbo",
"function_name": "remote",
"model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
"description": "test model",
"connector_id": "a1eMb4kBJ1eYAeTMAljY"
}
```
ML Commons returns the `task_id` and registration status of the model:
```json
{
"task_id": "cVeMb4kBJ1eYAeTMFFgj",
"status": "CREATED"
}
```
You can use the `task_id` to find the `model_id`, as shown in the following example:
**GET task request**
```json
GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
```
**GET task response**
```json
{
"model_id": "cleMb4kBJ1eYAeTMFFg4",
"task_type": "REGISTER_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"XPcXLV7RQoi5m8NI_jEOVQ"
],
"create_time": 1689793598499,
"last_update_time": 1689793598530,
"is_async": false
}
```
Lastly, use the `model_id` to deploy the model:
**Deploy model request**
```json
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_deploy
```
**Deploy model response**
```json
{
"task_id": "vVePb4kBJ1eYAeTM7ljG",
"status": "CREATED"
}
```
Use the `task_id` from the deploy model response to verify that the model deployment has completed:
**Verify deploy completion request**
```json
GET /_plugins/_ml/tasks/vVePb4kBJ1eYAeTM7ljG
```
**Verify deploy completion response**
```json
{
"model_id": "cleMb4kBJ1eYAeTMFFg4",
"task_type": "DEPLOY_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"n-72khvBTBi3bnIIR8FTTw"
],
"create_time": 1689793851077,
"last_update_time": 1689793851101,
"is_async": true
}
```
After a successful deployment, you can test the model using the Predict API set in the connector's `actions` settings, as shown in the following example:
```json
POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
{
"parameters": {
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}
}
```
The Predict API returns inference results for the connected model, as shown in the following example response:
```json
{
"inference_results": [
{
"output": [
{
"name": "response",
"dataAsMap": {
"id": "chatcmpl-7e6s5DYEutmM677UZokF9eH40dIY7",
"object": "chat.completion",
"created": 1689793889,
"model": "gpt-3.5-turbo-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I assist you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 19,
"completion_tokens": 9,
"total_tokens": 28
}
}
}
]
}
]
}
```
### Internal connector
To create an internal connector, add the `connector` parameter to the Register model API, as shown in the following example:
```json
POST /_plugins/_ml/models/_register
{
"name": "openAI-GPT-3.5 completions: internal connector",
"function_name": "remote",
"model_group_id": "lEFGL4kB4ubqQRzegPo2",
"description": "test model",
"connector": {
"name": "OpenAI Connector",
"description": "The connector to public OpenAI model service for GPT 3.5",
"version": 1,
"protocol": "http",
"parameters": {
"endpoint": "api.openai.com",
"max_tokens": 7,
"temperature": 0,
"model": "text-davinci-003"
},
"credential": {
"openAI_key": "..."
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": "{ \"model\": \"${parameters.model}\", \"prompt\": \"${parameters.prompt}\", \"max_tokens\": ${parameters.max_tokens}, \"temperature\": ${parameters.temperature} }"
}
]
}
}
```
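As with a standalone connector, registering the model returns a `task_id` that you can use to find the `model_id`; you can then deploy the model and call the Predict API. For this completions connector, the Predict call supplies the `prompt` parameter referenced in the connector's `request_body` template. The following sketch uses a hypothetical model ID placeholder:

```json
POST /_plugins/_ml/models/<model_id>/_predict
{
  "parameters": {
    "prompt": "Say this is a test"
  }
}
```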
## Examples
The following example requests show how to create connectors for supported third-party tools.
### OpenAI chat connector
The following example creates a standalone OpenAI chat connector. The same options can be used for an internal connector under the `connector` parameter:
```json
POST /_plugins/_ml/connectors/_create
{
"name": "OpenAI Chat Connector",
"description": "The connector to public OpenAI model service for GPT 3.5",
"version": 1,
"protocol": "http",
"parameters": {
"endpoint": "api.openai.com",
"model": "gpt-3.5-turbo"
},
"credential": {
"openAI_key": "..."
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
}
]
}
```
After creating the connector, you can retrieve the `task_id`, deploy the model, and use the Predict API, as shown in the [standalone connector](#standalone-connector) section.
### Amazon SageMaker
The following example creates a standalone Amazon SageMaker connector. The same options can be used for an internal connector under the `connector` parameter:
```json
POST /_plugins/_ml/connectors/_create
{
"name": "sagemaker: embedding",
"description": "Test connector for Sagemaker embedding model",
"version": 1,
"protocol": "aws_sigv4",
"credential": {
"access_key": "...",
"secret_key": "...",
"session_token": "..."
},
"parameters": {
"region": "us-west-2",
"service_name": "sagemaker"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"headers": {
"content-type": "application/json"
},
"url": "https://runtime.sagemaker.${parameters.region}.amazonaws.com/endpoints/lmi-model-2023-06-24-01-35-32-275/invocations",
"request_body": "[\"${parameters.inputs}\"]"
}
]
}
```
The `credential` parameter contains the following options reserved for `aws_sigv4` authentication:
- `access_key`: Required. Provides the access key for the AWS instance.
- `secret_key`: Required. Provides the secret key for the AWS instance.
- `session_token`: Optional. Provides a temporary set of credentials for the AWS instance.
The `parameters` section requires the following options when using `aws_sigv4` authentication:
- `region`: The AWS Region in which the AWS instance is located.
- `service_name`: The name of the AWS service for the connector.
## Next steps
- To learn more about using models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-framework/).
- To learn more about model access control and model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).


@@ -1,16 +1,13 @@
---
layout: default
title: GPU acceleration
parent: Model-serving framework
parent: ML framework
nav_order: 150
---
# GPU acceleration
GPU acceleration is an experimental feature. For updates on the progress of GPU acceleration, or if you want to leave feedback that could help improve the feature, join the discussion in the [OpenSearch forum](https://forum.opensearch.org/).
{: .warning}
When running a natural language processing (NLP) model in your OpenSearch cluster with a machine learning (ML) node, you can achieve better performance on the ML node using graphics processing unit (GPU) acceleration. GPUs can work in tandem with the CPU of your cluster to speed up the model upload and training.
## Supported GPUs


@@ -6,8 +6,8 @@ redirect_from:
- /ml-commons-plugin/ml-dashbaord/
---
Released in OpenSearch 2.6, the machine learning (ML) functionality in OpenSearch Dashboards is experimental and can't be used in a production environment. For updates or to leave feedback, see the [OpenSearch Forum discussion](https://forum.opensearch.org/t/feedback-ml-commons-ml-model-health-dashboard-for-admins-experimental-release/12494).
{: .warning }
The ML dashboard was taken out of experimental status and released as Generally Available in OpenSearch 2.9.
{: .note}
Administrators of machine learning (ML) clusters can use OpenSearch Dashboards to manage and check the status of ML models running inside a cluster. This can help ML developers provision nodes to ensure their models run efficiently.
@@ -66,4 +66,4 @@ A list of nodes gives you a view of each node the model is running on, including
## Next steps
For more information about how to manage ML models in OpenSearch, see [Model-serving framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
For more information about how to manage ML models in OpenSearch, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).


@@ -1,18 +1,21 @@
---
layout: default
title: Model-serving framework
title: ML framework
has_children: true
nav_order: 50
redirect_from:
- /ml-commons-plugin/model-serving-framework/
---
# Model-serving framework
ML Framework was taken out of experimental status and released as Generally Available in OpenSearch 2.9.
{: .note}
The model-serving framework is an experimental feature. For updates on the progress of the model-serving framework, or if you want to leave feedback that could help improve the feature, join the discussion in the [Model-serving framework forum](https://forum.opensearch.org/t/feedback-machine-learning-model-serving-framework-experimental-release/11439).
{: .warning}
ML Commons allows you to serve custom models and use those models to make inferences. For those who want to run their PyTorch deep learning model inside an OpenSearch cluster, you can upload and run that model with the ML Commons REST API.
# ML Framework
This page outlines the steps required to upload a custom model and run it with the ML Commons plugin.
ML Commons allows you to serve custom models and use those models to make inferences through the OpenSearch Machine Learning (ML) Framework. For those who want to run their PyTorch deep learning model inside an OpenSearch cluster, you can upload and run that model with the ML Commons REST API.
This page outlines the steps required to upload a custom model and run it with the ML Framework.
## Prerequisites
@@ -21,7 +24,7 @@ To upload a custom model to OpenSearch, you need to prepare it outside of your O
### Model support
As of OpenSearch 2.6, the model-serving framework supports text embedding models.
As of OpenSearch 2.6, the ML Framework supports text-embedding models.
### Model format
@@ -37,7 +40,7 @@ Most deep learning models are more than 100 MB, making it difficult to fit them
## GPU acceleration
To achieve better performance within the model-serving framework, you can take advantage of GPU acceleration on your ML node. For more information, see [GPU acceleration]({{site.url}}{{site.baseurl}}/ml-commons-plugin/gpu-acceleration/).
To achieve better performance within the ML Framework, you can take advantage of GPU acceleration on your ML node. For more information, see [GPU acceleration]({{site.url}}{{site.baseurl}}/ml-commons-plugin/gpu-acceleration/).
## Upload model to OpenSearch


@@ -1,16 +1,16 @@
---
layout: default
title: Pretrained models
parent: Model-serving framework
parent: ML framework
nav_order: 120
---
The model-serving framework is an experimental feature. For updates on the progress of the model-serving framework, or if you want to leave feedback that could help improve the feature, join the discussion in the [Model-serving framework forum](https://forum.opensearch.org/t/feedback-machine-learning-model-serving-framework-experimental-release/11439).
Pretrained models were taken out of experimental status and released to General Availability in OpenSearch 2.9.
{: .warning}
# Pretrained models
The model-serving framework supports a variety of open-source pretrained models that can assist with a range of machine learning (ML) search and analytics use cases.
The ML framework supports a variety of open-source pretrained models that can assist with a range of machine learning (ML) search and analytics use cases.
## Uploading pretrained models
@@ -28,11 +28,11 @@ POST /_plugins/_ml/models/_upload
}
```
For more information on how to upload and use ML models, see [Model-serving framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework).
For more information about how to upload and use ML models, see [ML Framework]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
## Supported pretrained models
The model-serving framework supports the following models, categorized by type. All models are traced from [Hugging Face](https://huggingface.co/). Although models with the same type will have similar use cases, each model has a different model size and performs differently depending on your cluster. For a comparison of the performances of some pretrained models, see the [sbert documentation](https://www.sbert.net/docs/pretrained_models.html#model-overview).
The ML Framework supports the following models, categorized by type. All models are traced from [Hugging Face](https://huggingface.co/). Although models with the same type will have similar use cases, each model has a different model size and performs differently depending on your cluster. For a performance comparison of some pretrained models, see the [sbert documentation](https://www.sbert.net/docs/pretrained_models.html#model-overview).
### Sentence transformers