2022-03-17 13:02:29 -04:00
---
layout: default
title: API
has_children: false
2022-03-17 18:18:15 -04:00
nav_order: 99
2022-03-17 13:02:29 -04:00
---
2022-03-17 18:18:15 -04:00
# ML Commons API
2022-03-18 12:45:21 -04:00
---
#### Table of contents
- TOC
{:toc}
---
2022-03-21 17:22:48 -04:00
The Machine Learning (ML) commons API lets you train ML algorithms synchronously and asynchronously, make predictions with that trained model, and train and predict with the same data set.
2022-03-17 13:02:29 -04:00
In order to train tasks through the API, three inputs are required.
2022-05-17 21:36:38 -04:00
- Algorithm name: Must be one of a [FunctionName ](https://github.com/opensearch-project/ml-commons/blob/1.3/common/src/main/java/org/opensearch/ml/common/parameter/FunctionName.java ). This determines what algorithm the ML Engine runs. To add a new function, see [How To Add a New Function ](https://github.com/opensearch-project/ml-commons/blob/main/docs/how-to-add-new-function.md ).
2022-03-19 13:37:01 -04:00
- Model hyper parameters: Adjust these parameters to make the model train better.
2022-03-21 16:28:37 -04:00
- Input data: The data input that trains the ML model, or applies the ML models to predictions. You can input data in two ways, query against your index or use data frame.
2022-03-17 13:02:29 -04:00
## Train model
2022-03-17 18:18:15 -04:00
Training can occur both synchronously and asynchronously.
### Request
The following examples use the kmeans algorithm to train index data.
**Train with kmeans synchronously**
```json
POST /_plugins/_ml/_train/kmeans
{
"parameters": {
"centroids": 3,
"iterations": 10,
"distance_type": "COSINE"
},
"input_query": {
"_source": ["petal_length_in_cm", "petal_width_in_cm"],
"size": 10000
},
"input_index": [
"iris_data"
]
}
```
**Train with kmeans asynchronously**
```json
POST /_plugins/_ml/_train/kmeans?async=true
{
"parameters": {
"centroids": 3,
"iterations": 10,
"distance_type": "COSINE"
},
"input_query": {
"_source": ["petal_length_in_cm", "petal_width_in_cm"],
"size": 10000
},
"input_index": [
"iris_data"
]
}
```
### Response
**Synchronously**
2022-03-22 11:13:32 -04:00
For synchronous responses, the API returns the model_id, which can be used to get or delete a model.
2022-03-17 18:18:15 -04:00
```json
{
"model_id" : "lblVmX8BO5w8y8RaYYvN",
"status" : "COMPLETED"
}
```
**Asynchronously**
2022-03-22 11:13:32 -04:00
For asynchronous responses, the API returns the task_id, which can be used to get or delete a task.
2022-03-17 18:18:15 -04:00
```json
{
"task_id" : "lrlamX8BO5w8y8Ra2otd",
"status" : "CREATED"
}
```
## Get model information
You can retrieve information on your model using the model_id.
```json
GET /_plugins/_ml/models/< model-id >
```
The API returns information on the model, the algorithm used, and the content found within the model.
```json
{
"name" : "KMEANS",
"algorithm" : "KMEANS",
"version" : 1,
"content" : ""
}
```
2022-03-21 19:59:35 -04:00
## Search model
Use this command to search models you're already created.
2022-03-17 18:18:15 -04:00
```json
2022-03-21 19:59:35 -04:00
POST /_plugins/_ml/models/_search
{query}
2022-03-17 18:18:15 -04:00
```
2022-03-22 13:52:10 -04:00
### Example: Query all models
2022-03-17 18:18:15 -04:00
```json
2022-03-21 19:59:35 -04:00
POST /_plugins/_ml/models/_search
2022-03-17 18:18:15 -04:00
{
2022-03-21 19:59:35 -04:00
"query": {
"match_all": {}
},
"size": 1000
}
```
2022-03-22 13:52:10 -04:00
### Example: Query models with algorithm "FIT_RCF"
2022-03-21 19:59:35 -04:00
```json
POST /_plugins/_ml/models/_search
{
"query": {
"term": {
"algorithm": {
"value": "FIT_RCF"
}
}
}
}
```
### Response
```json
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.4159138,
"hits" : [
{
"_index" : ".plugins-ml-model",
"_id" : "-QkKJX8BvytMh9aUeuLD",
"_version" : 1,
"_seq_no" : 12,
"_primary_term" : 15,
"_score" : 2.4159138,
"_source" : {
"name" : "FIT_RCF",
"version" : 1,
"content" : "xxx",
"algorithm" : "FIT_RCF"
}
},
{
"_index" : ".plugins-ml-model",
"_id" : "OxkvHn8BNJ65KnIpck8x",
"_version" : 1,
"_seq_no" : 2,
"_primary_term" : 8,
"_score" : 2.4159138,
"_source" : {
"name" : "FIT_RCF",
"version" : 1,
"content" : "xxx",
"algorithm" : "FIT_RCF"
}
}
]
}
}
```
## Delete model
Deletes a model based on the model_id
```json
DELETE /_plugins/_ml/models/< model_id >
```
The API returns the following:
```json
{
"_index" : ".plugins-ml-model",
"_id" : "MzcIJX8BA7mbufL6DOwl",
"_version" : 2,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 27,
"_primary_term" : 18
2022-03-17 18:18:15 -04:00
}
```
## Predict
2022-05-26 14:35:56 -04:00
ML Commons can predict new data with your trained model either from indexed data or a data frame. To use the Predict API, the `model_id` is required.
2022-03-17 18:18:15 -04:00
```json
POST /_plugins/_ml/_predict/< algorithm_name > /< model_id >
```
### Request
```json
2022-03-18 12:25:19 -04:00
POST /_plugins/_ml/_predict/kmeans/< model-id >
2022-03-17 18:18:15 -04:00
{
"input_query": {
"_source": ["petal_length_in_cm", "petal_width_in_cm"],
"size": 10000
},
"input_index": [
"iris_data"
]
}
```
### Response
```json
{
"status" : "COMPLETED",
"prediction_result" : {
"column_metas" : [
{
"name" : "ClusterID",
"column_type" : "INTEGER"
}
],
"rows" : [
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 1
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 1
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
}
]
}
```
2022-03-22 13:52:10 -04:00
## Train and predict
2022-03-18 12:25:19 -04:00
2022-03-21 19:52:03 -04:00
Use to train and then immediately predict against the same training data set. Can only be used with unsupervised learning models and the following algorithms:
2022-03-19 13:37:01 -04:00
- BATCH_RCF
- FIT_RCF
- kmeans
2022-03-18 12:25:19 -04:00
2022-03-22 13:52:10 -04:00
### Example: Train and predict with indexed data
2022-03-18 12:25:19 -04:00
```json
POST /_plugins/_ml/_train_predict/kmeans
{
"parameters": {
"centroids": 2,
"iterations": 10,
"distance_type": "COSINE"
},
"input_query": {
"query": {
"bool": {
"filter": [
{
"range": {
"k1": {
"gte": 0
}
}
}
]
}
},
"size": 10
},
"input_index": [
"test_data"
]
}
```
### Example: Train and predict with data directly
```json
POST /_plugins/_ml/_train_predict/kmeans
{
"parameters": {
"centroids": 2,
"iterations": 1,
"distance_type": "EUCLIDEAN"
},
"input_data": {
"column_metas": [
{
"name": "k1",
"column_type": "DOUBLE"
},
{
"name": "k2",
"column_type": "DOUBLE"
}
],
"rows": [
{
"values": [
{
"column_type": "DOUBLE",
"value": 1.00
},
{
"column_type": "DOUBLE",
"value": 2.00
}
]
},
{
"values": [
{
"column_type": "DOUBLE",
"value": 1.00
},
{
"column_type": "DOUBLE",
"value": 4.00
}
]
},
{
"values": [
{
"column_type": "DOUBLE",
"value": 1.00
},
{
"column_type": "DOUBLE",
"value": 0.00
}
]
},
{
"values": [
{
"column_type": "DOUBLE",
"value": 10.00
},
{
"column_type": "DOUBLE",
"value": 2.00
}
]
},
{
"values": [
{
"column_type": "DOUBLE",
"value": 10.00
},
{
"column_type": "DOUBLE",
"value": 4.00
}
]
},
{
"values": [
{
"column_type": "DOUBLE",
"value": 10.00
},
{
"column_type": "DOUBLE",
"value": 0.00
}
]
}
]
}
}
```
### Response
```json
{
"status" : "COMPLETED",
"prediction_result" : {
"column_metas" : [
{
"name" : "ClusterID",
"column_type" : "INTEGER"
}
],
"rows" : [
{
"values" : [
{
"column_type" : "INTEGER",
2022-03-21 16:20:31 -04:00
"value" : 1
2022-03-18 12:25:19 -04:00
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
2022-03-21 16:20:31 -04:00
"value" : 1
2022-03-18 12:25:19 -04:00
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
2022-03-21 16:20:31 -04:00
"value" : 1
2022-03-18 12:25:19 -04:00
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
},
{
"values" : [
{
"column_type" : "INTEGER",
"value" : 0
}
]
}
]
}
}
```
2022-03-21 19:59:35 -04:00
## Get task information
2022-03-17 18:18:15 -04:00
2022-03-21 19:59:35 -04:00
You can retrieve information about a task using the task_id.
2022-03-17 18:18:15 -04:00
```json
2022-03-21 19:59:35 -04:00
GET /_plugins/_ml/tasks/< task_id >
2022-03-17 18:18:15 -04:00
```
2022-03-21 19:59:35 -04:00
The response includes information about the task.
2022-03-17 18:18:15 -04:00
```json
{
2022-03-21 19:59:35 -04:00
"model_id" : "l7lamX8BO5w8y8Ra2oty",
"task_type" : "TRAINING",
"function_name" : "KMEANS",
"state" : "COMPLETED",
"input_type" : "SEARCH_QUERY",
"worker_node" : "54xOe0w8Qjyze00UuLDfdA",
"create_time" : 1647545342556,
"last_update_time" : 1647545342587,
"is_async" : true
2022-03-17 18:18:15 -04:00
}
```
## Search task
2022-03-18 12:25:19 -04:00
Search tasks based on parameters indicated in the request body.
2022-03-17 18:18:15 -04:00
```json
GET /_plugins/_ml/tasks/_search
{query body}
```
2022-03-18 12:25:19 -04:00
### Example: Search task which "function_name" is "KMEANS"
2022-03-17 18:18:15 -04:00
```json
GET /_plugins/_ml/tasks/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"function_name": "KMEANS"
}
}
]
}
}
}
```
2022-03-22 13:52:10 -04:00
### Response
2022-03-17 18:18:15 -04:00
```json
{
"took" : 12,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : ".plugins-ml-task",
"_id" : "_wnLJ38BvytMh9aUi-Ia",
"_version" : 4,
"_seq_no" : 29,
"_primary_term" : 4,
"_score" : 0.0,
"_source" : {
"last_update_time" : 1645640125267,
"create_time" : 1645640125209,
"is_async" : true,
"function_name" : "KMEANS",
"input_type" : "SEARCH_QUERY",
"worker_node" : "jjqFrlW7QWmni1tRnb_7Dg",
"state" : "COMPLETED",
"model_id" : "AAnLJ38BvytMh9aUi-M2",
"task_type" : "TRAINING"
}
},
{
"_index" : ".plugins-ml-task",
"_id" : "wwRRLX8BydmmU1x6I-AI",
"_version" : 3,
"_seq_no" : 38,
"_primary_term" : 7,
"_score" : 0.0,
"_source" : {
"last_update_time" : 1645732766656,
"create_time" : 1645732766472,
"is_async" : true,
"function_name" : "KMEANS",
"input_type" : "SEARCH_QUERY",
"worker_node" : "A_IiqoloTDK01uZvCjREaA",
"state" : "COMPLETED",
"model_id" : "xARRLX8BydmmU1x6I-CG",
"task_type" : "TRAINING"
}
}
]
}
}
```
2022-03-21 19:59:35 -04:00
## Delete task
Delete a task based on the task_id.
```json
DELETE /_plugins/_ml/tasks/{task_id}
```
The API returns the following:
```json
{
"_index" : ".plugins-ml-task",
"_id" : "xQRYLX8BydmmU1x6nuD3",
"_version" : 4,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 42,
"_primary_term" : 7
}
```
2022-03-18 12:25:19 -04:00
## Stats
Get statistics related to the number of tasks.
To receive all stats, use:
2022-03-17 18:18:15 -04:00
```json
GET /_plugins/_ml/stats
2022-03-18 12:25:19 -04:00
```
To receive stats for a specific node, use:
```json
2022-03-17 18:18:15 -04:00
GET /_plugins/_ml/< nodeId > /stats/
2022-03-18 12:25:19 -04:00
```
2022-03-21 19:52:03 -04:00
To receive stats for a specific node and return a specified stat, use:
2022-03-18 12:25:19 -04:00
```json
2022-03-17 18:18:15 -04:00
GET /_plugins/_ml/< nodeId > /stats/< stat >
2022-03-18 12:25:19 -04:00
```
To receive information on a specific stat from all nodes, use:
```json
2022-03-17 18:18:15 -04:00
GET /_plugins/_ml/stats/< stat >
```
2022-03-18 12:25:19 -04:00
### Example: Get all stats
2022-03-17 18:18:15 -04:00
```json
GET /_plugins/_ml/stats
```
### Response
```json
{
"zbduvgCCSOeu6cfbQhTpnQ" : {
"ml_executing_task_count" : 0
},
"54xOe0w8Qjyze00UuLDfdA" : {
"ml_executing_task_count" : 0
},
"UJiykI7bTKiCpR-rqLYHyw" : {
"ml_executing_task_count" : 0
},
"zj2_NgIbTP-StNlGZJlxdg" : {
"ml_executing_task_count" : 0
},
"jjqFrlW7QWmni1tRnb_7Dg" : {
"ml_executing_task_count" : 0
},
"3pSSjl5PSVqzv5-hBdFqyA" : {
"ml_executing_task_count" : 0
},
"A_IiqoloTDK01uZvCjREaA" : {
"ml_executing_task_count" : 0
}
}
```
2022-03-21 16:20:31 -04:00
2022-03-17 18:18:15 -04:00
2022-03-17 13:02:29 -04:00