diff --git a/_ml-commons-plugin/cluster-settings.md b/_ml-commons-plugin/cluster-settings.md index fc81195f..4bee7be6 100644 --- a/_ml-commons-plugin/cluster-settings.md +++ b/_ml-commons-plugin/cluster-settings.md @@ -7,37 +7,35 @@ nav_order: 10 # ML Commons cluster settings -This page provides an overview of `opensearch.yml` settings that can be configured for the ML commons plugin. +To enhance and customize your OpenSearch cluster for machine learning (ML), you can add and modify several configuration settings for the ML Commons plugin in your `opensearch.yml` file. ## Run tasks and models on ML nodes only +If `true`, ML Commons tasks and models run machine learning (ML) tasks on ML nodes only. If `false`, tasks and models run on ML nodes first. If no ML nodes exist, tasks and models run on data nodes. Don't set this value to `false` on a production cluster. + ### Setting ``` plugins.ml_commons.only_run_on_ml_node: true ``` -### Description - -If `true`, ML Commons tasks and models run machine learning (ML) tasks on ML nodes only. If `false`, tasks and models run on ML nodes first. If no ML nodes exist, tasks and models run on data nodes. Don't set as "false" on production cluster. - ### Values -- Default value: `false` +- Default value: `true` - Value range: `true` or `false` ## Dispatch tasks to ML node +`round_robin` dispatches ML tasks to ML nodes using round-robin routing. `least_load` gathers all ML nodes' runtime information, such as JVM heap memory usage and running tasks, then dispatches tasks to the ML node with the least load. + + ### Setting ``` plugins.ml_commons.task_dispatch_policy: round_robin ``` -### Description - -`round_robin` dispatches ML tasks to ML nodes using round robin routing. `least_load` gathers all an ML nodes' runtime information, like JVM heap memory usage and running tasks, then dispatches tasks to the ML node with the least load. 
### Values @@ -47,16 +45,14 @@ plugins.ml_commons.task_dispatch_policy: round_robin ## Set sync up job intervals +When returning runtime information with the [profile API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#profile), ML Commons runs a regular sync-up job to synchronize newly loaded or unloaded models on each node. When set to `0`, ML Commons immediately stops sync-up jobs. + ### Setting ``` plugins.ml_commons.sync_up_job_interval_in_seconds: 10 ``` -### Description - -When returning runtime information with the [profile API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api#profile), ML Commons will run a regular sync up job to sync up newly loaded or unloaded models on each node. When set to `0`, ML Commons immediately stops sync up jobs. - ### Values - Default value: `10` @@ -64,16 +60,14 @@ When returning runtime information with the [profile API]({{site.url}}{{site.bas ## Predict monitoring requests +Controls how many predict requests are monitored on one node. If set to `0`, OpenSearch clears all monitored predict requests from the node's cache and does not monitor predict requests from that point forward. + ### Setting ``` plugins.ml_commons.monitoring_request_count: 100 ``` -### Description - -Controls how many upload model tasks can run in parallel on one node. If set to `0`, you cannot upload models to any node. - ### Value range - Default value: `100` @@ -81,15 +75,14 @@ Controls how many upload model tasks can run in parallel on one node. If set to ## Upload model tasks per node +Controls how many upload model tasks can run in parallel on one node. If set to `0`, you cannot upload models to any node. + ### Setting ``` plugins.ml_commons.max_upload_model_tasks_per_node: 10 ``` -### Description - -Controls how many upload model tasks can run in parallel on one node. ### Values @@ -99,16 +92,14 @@ Controls how many upload model tasks can run in parallel on one node. 
If set to ## Load model tasks per node +Controls how many load model tasks can run in parallel on one node. If set to `0`, you cannot load models to any node. + ### Setting ``` plugins.ml_commons.max_load_model_tasks_per_node: 10 ``` -### Description - -Controls how many load model tasks can run in parallel on one node. If set as 0, you cannot load models to any node. - ### Values - Default value: `10` @@ -116,16 +107,15 @@ Controls how many load model tasks can run in parallel on one node. If set as 0, ## Add trusted URL +The default value allows uploading a model file from any `http`, `https`, `ftp`, or local file. You can change this value to restrict the set of trusted model URLs. + + ### Setting ``` plugins.ml_commons.trusted_url_regex: ^(https?\|ftp\|file)://[-a-zA-Z0-9+&@#/%?=~_\|!:,.;]*[-a-zA-Z0-9+&@#/%=~_\|] ``` -### Description - -The default value allows uploading a model file from any http/https/ftp/local file. You can change this value to restrict trusted model URL - ### Values - Default value: `^(https?\|ftp\|file)://[-a-zA-Z0-9+&@#/%?=~_\|!:,.;]*[-a-zA-Z0-9+&@#/%=~_\|]` diff --git a/_ml-commons-plugin/model-serving-framework.md b/_ml-commons-plugin/model-serving-framework.md index 86c8fda8..253d520c 100644 --- a/_ml-commons-plugin/model-serving-framework.md +++ b/_ml-commons-plugin/model-serving-framework.md @@ -25,10 +25,12 @@ As of OpenSearch 2.4, the model-serving framework only supports text embedding m ### Model format -To use a model in OpenSearch, you'll need to export the model into a portable format. As of Version 2.4, OpenSearch only supports the [TorchScript](https://pytorch.org/docs/stable/jit.html) format. +To use a model in OpenSearch, you'll need to export the model into a portable format. As of Version 2.5, OpenSearch only supports the [TorchScript](https://pytorch.org/docs/stable/jit.html) and [ONNX](https://onnx.ai/) formats. Furthermore, model files must be saved as zip files before upload. 
Therefore, to ensure that ML Commons can upload your model, compress your TorchScript file before uploading. You can download an example file [here](https://github.com/opensearch-project/ml-commons/blob/2.x/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_embedding/all-MiniLM-L6-v2_torchscript_sentence-transformer.zip). + + ### Model size Most deep learning models are more than 100 MB, making it difficult to fit them into a single document. OpenSearch splits the model file into smaller chunks to be stored in a model index. When allocating machine learning (ML) or data nodes for your OpenSearch cluster, make sure you correctly size your ML nodes so that you have enough memory when making ML inferences.
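As a sanity check on the `plugins.ml_commons.trusted_url_regex` default covered in the cluster-settings changes above, the following Python sketch applies that pattern to a few candidate model URLs. This assumes the `\|` sequences in the docs are markdown escapes that unwrap to plain `|` characters in the effective regex; the URLs themselves are hypothetical examples.

```python
import re

# Effective default of plugins.ml_commons.trusted_url_regex, with the
# markdown pipe escapes (\|) unwrapped to plain | characters.
TRUSTED_URL_REGEX = re.compile(
    r"^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]"
)

def is_trusted(url: str) -> bool:
    """Return True if the URL matches the default trusted-URL pattern."""
    return TRUSTED_URL_REGEX.match(url) is not None

print(is_trusted("https://example.com/model.zip"))   # True
print(is_trusted("file:///models/all-MiniLM-L6-v2_torchscript_sentence-transformer.zip"))  # True
print(is_trusted("s3://bucket/model.zip"))           # False (s3 is not a trusted scheme)
```

Note that the default only constrains the scheme and character set of the URL; tightening the regex (for example, to a single internal artifact host) is how you would restrict uploads further.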