
layout	title	parent	grand_parent	nav_order
default	Deploy model	Model APIs	ML Commons APIs	20

Deploy a model

The deploy model operation reads the model's chunks from the model index and then loads an instance of the model into memory, where it is cached. This operation requires the model_id.

For information about user access for this API, see Model access control considerations.

Path and HTTP methods

POST /_plugins/_ml/models/<model_id>/_deploy

Example request: Deploying to all available ML nodes

In this example request, OpenSearch deploys the model to all available OpenSearch ML nodes:

POST /_plugins/_ml/models/WWQI44MBbzI2oUKAvNUt/_deploy

{% include copy-curl.html %}

Example request: Deploying to a specific node

To preserve memory on the other ML nodes in your cluster, you can deploy the model to one or more specific nodes by specifying their node_ids in the request body:

POST /_plugins/_ml/models/WWQI44MBbzI2oUKAvNUt/_deploy
{
    "node_ids": ["4PLK7KJWReyX0oWKnBA8nA"]
}

{% include copy-curl.html %}
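
The two requests above differ only in the optional request body. As a minimal sketch, the following Python helper builds either request using the standard library's `urllib.request`. The function name `build_deploy_request` and the base URL `https://localhost:9200` are assumptions made for illustration; authentication and TLS settings are omitted and depend on your cluster.

```python
import json
import urllib.request
from typing import List, Optional


def build_deploy_request(
    base_url: str, model_id: str, node_ids: Optional[List[str]] = None
) -> urllib.request.Request:
    """Build (but do not send) a deploy request for the given model.

    base_url is a hypothetical cluster address supplied by the caller.
    """
    url = f"{base_url.rstrip('/')}/_plugins/_ml/models/{model_id}/_deploy"
    data = None
    headers = {}
    if node_ids:
        # Only send a body when targeting specific nodes.
        data = json.dumps({"node_ids": node_ids}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Hypothetical host; the model and node IDs are the ones used in the examples above.
request = build_deploy_request(
    "https://localhost:9200",
    "WWQI44MBbzI2oUKAvNUt",
    ["4PLK7KJWReyX0oWKnBA8nA"],
)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Omitting `node_ids` produces a request with no body, matching the first example.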

Example response

{
  "task_id" : "hA8P44MBhyWuIwnfvTKP",
  "status" : "DEPLOYING"
}

Check the status of model deployment

To see the status of your model deployment and retrieve the model ID created for the new model version, pass the task_id as a path parameter to the Tasks API:

GET /_plugins/_ml/tasks/hA8P44MBhyWuIwnfvTKP

{% include copy-curl.html %}

The response contains the model ID of the model version:

{
  "model_id": "Qr1YbogBYOqeeqR7sI9L",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "N77RInqjTSq_UaLh1k0BUg"
  ],
  "create_time": 1685478486057,
  "last_update_time": 1685478491090,
  "is_async": true
}
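
Because deployment runs asynchronously, a client typically checks the task more than once. The following Python sketch polls until the task's `state` is `COMPLETED` and then returns the `model_id`. The function name `wait_for_deployment` and the `fetch_task` callable are hypothetical: `fetch_task` stands in for whatever code sends the `GET` request above and returns the parsed JSON response. The sketch treats any state other than `COMPLETED` as "not finished yet" and gives up after a fixed number of attempts.

```python
import time
from typing import Any, Callable, Dict, Optional


def wait_for_deployment(
    fetch_task: Callable[[str], Dict[str, Any]],
    task_id: str,
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> Optional[str]:
    """Poll a task until it completes and return its model_id.

    fetch_task is a caller-supplied (hypothetical) function that returns
    the parsed JSON of GET /_plugins/_ml/tasks/<task_id>.
    Returns None if the task has not completed after all attempts.
    """
    for attempt in range(attempts):
        task = fetch_task(task_id)
        if task.get("state") == "COMPLETED":
            return task.get("model_id")
        # Wait before the next check, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None
```

Injecting `fetch_task` keeps the polling logic independent of the HTTP client and makes it straightforward to exercise with canned responses.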

If a cluster or node is restarted, then you need to redeploy the model. To learn how to set up automatic redeployment, see Enable auto redeploy. {: .tip}
